提交 · 2ba38943ba190eb6a494262003e23187d1b40fb4 · openeuler / Kernel

06 9月, 2014 29 次提交

Merge branch 'eth_get_headlen' · 2ba38943

由 David S. Miller 提交于 9月 05, 2014

Alexander Duyck says:

====================
net: Drop get_headlen functions in favor of generic function

This series replaces the igb_get_headlen and ixgbe_get_headlen functions
with a generic function named eth_get_headlen.

I have done some performance testing on ixgbe with 258 byte frames since
the calls are only used on frames larger than 256 bytes and have seen no
significant difference in CPU utilization.

v2: renamed __skb_get_poff to skb_get_poff
    renamed ___skb_get_poff to __skb_get_poff
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2ba38943

ixgbe: use new eth_get_headlen interface · 8496e338

由 Alexander Duyck 提交于 9月 05, 2014

Update ixgbe to drop the ixgbe_get_headlen function in favor of eth_get_headlen.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Acked-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8496e338

igb: use new eth_get_headlen interface · 24cd23d3

由 Alexander Duyck 提交于 9月 05, 2014

Update igb to drop the igb_get_headlen function in favor of eth_get_headlen.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Acked-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24cd23d3

net: Add function for parsing the header length out of linear ethernet frames · 56193d1b

由 Alexander Duyck 提交于 9月 05, 2014

This patch updates some of the flow_dissector api so that it can be used to
parse the length of ethernet buffers stored in fragments. Most of the
changes needed were to __skb_get_poff as it needed to be updated to support
sending a linear buffer instead of a skb.

I have split __skb_get_poff into two functions, the first is skb_get_poff
and it retains the functionality of the original __skb_get_poff. The other
function is __skb_get_poff which now works much like __skb_flow_dissect in
relation to skb_flow_dissect in that it provides the same functionality but
works with just a data buffer and hlen instead of needing an skb.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56193d1b

Merge branch 'timestamping' · 2c048e64

由 David S. Miller 提交于 9月 05, 2014

Alexander Duyck says:

====================
This change makes it so that the core path for the phy timestamping logic
is shared between skb_tx_tstamp and skb_complete_tx_timestamp.  In addition
it provides a means of using the same skb clone type path in non phy
timestamping drivers.

The main motivation for this is to enable non-phy drivers to be able to
manipulate tx timestamp skbs for such things as putting them in lists or
setting aside buffer in the context block.

v2: Incorporated suggested changes from Willem de Bruijn and Eric Dumazet
     dropped uneeded comment
     restored order of hwtstamp vs swtstamp
     added destructor for skb
    Dropped usage of skb_complete_tx_timestamp as a kfree_skb w/ destructor

v3: Updated destructor handling and dealt with socket reference counting issues

v4: Split out combining destructors into a separate patch
====================

2c048e64

net: merge cases where sock_efree and sock_edemux are the same function · 82eabd9e

由 Alexander Duyck 提交于 9月 04, 2014

Since sock_efree and sock_demux are essentially the same code for non-TCP
sockets and the case where CONFIG_INET is not defined we can combine the
code or replace the call to sock_edemux in several spots. As a result we
can avoid a bit of unnecessary code or code duplication.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82eabd9e

net-timestamp: Make the clone operation stand-alone from phy timestamping · 62bccb8c

由 Alexander Duyck 提交于 9月 04, 2014

The phy timestamping takes a different path than the regular timestamping
does in that it will create a clone first so that the packets needing to be
timestamped can be placed in a queue, or the context block could be used.

In order to support these use cases I am pulling the core of the code out
so it can be used in other drivers beyond just phy devices.

In addition I have added a destructor named sock_efree which is meant to
provide a simple way for dropping the reference to skb exceptions that
aren't part of either the receive or send windows for the socket, and I
have removed some duplication in spots where this destructor could be used
in place of sock_edemux.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62bccb8c

net-timestamp: Merge shared code between phy and regular timestamping · 37846ef0

由 Alexander Duyck 提交于 9月 04, 2014

This change merges the shared bits that exist between skb_tx_tstamp and
skb_complete_tx_timestamp.  By doing this we can avoid the two diverging as
there were already changes pushed into skb_tx_tstamp that hadn't made it
into the other function.

In addition this resolves issues with the fact that
skb_complete_tx_timestamp was included in linux/skbuff.h even though it was
only compiled in if phy timestamping was enabled.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37846ef0

ipv4: harden fnhe_hashfun() · d546c621

由 Eric Dumazet 提交于 9月 04, 2014

Lets make this hash function a bit secure, as ICMP attacks are still
in the wild.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d546c621

net-timestamp: fix allocation error in test · 18a47e6d

由 Willem de Bruijn 提交于 9月 04, 2014

A buffer is incorrectly zeroed to the length of the pointer. If
cfg_payload_len < sizeof(void *) this can overwrites unrelated memory.
The buffer contents are never read, so no need to zero.

Fixes: 8fe2f761 ("net-timestamp: expand documentation")
Reported-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18a47e6d

hyperv: NULL dereference on error · b1c84927

由 Dan Carpenter 提交于 9月 04, 2014

We try to call free_netvsc_device(net_device) when "net_device" is NULL.
It leads to an Oops.

Fixes: f90251c8 ('hyperv: Increase the buffer length for netvsc_channel_cb()')
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1c84927

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · a77f9a28

由 David S. Miller 提交于 9月 05, 2014

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-09-04

This series contains updates to i40e, i40evf, ixgbe and ixgbevf.

Catherine adds dual speed module support to i40e.  Updates i40e to allow
the user to change link settings when the link is down.

Serey renames i40e_ndo_set_vf_spoofck() to i40e_ndo_set_vf_spookchk()
to be more consistent with what is defined in netdev and removes a
unnecessary variable assignment.

Jesse makes a malicious driver detection warning only print if extended
driver string is enabled for i40e.  Fixes a panic under traffic load when
resetting or if/whenever there was a Tx-timeout because we were enabling
the Tx queue to early.

Anjali fixes an issue when PF reset fails, where we were trying to restart
the admin queue which has not been setup at that point.  This resolves an
occasional kernel panic when PF reset fails for some reason.

Ethan Zhao replaces the use of a local i40e_vfs_are_assigned() with the
global kernel pci_vfs_assigned() for i40e.

Alex cleans up the FDB handling for ixgbe.  This change makes it so that
the behavior for FDB handling is consistent between both the SR-IOV and
non-SR-IOV cases.  The main change is that we perform bounds checking on
the number of SR-IOV addresses regardless of if SR-IOV is enabled or not
as we can only support a certain number of addresses in the hardware.

Emil extends the pending Tx work check to the VF interfaces, where the
driver initiates a reset of the interface on link loss with pending Tx
work in order to clear the rings.  Introduces a delay for 82599 VFs of
at least 500 usecs to make sure the VFLINKS value is correct, since this
bit tends to flap when a DA or SFP+ cable is disconnected.

Jacob adds code comments in ixgbe to make it more obvious that we are
resetting features based on the fact that we do not have MSI-X enabled,
and cannot use the previous settings.  Also resolves a kernel NULL
pointer dereference by limiting the combined total of MACVLAN and
SR-IOV VFs, since the hardware has a limited number of pools available
(64).  Previously, no checks were in place to limit the number of
accelerated MACVLAN devices based on the number of pools, which would
be ok since there was already a limit for these well below the number of
available pools.  However, SR-IOV uses the very same pools, therefore
we need to ensure that the total number of pools does not exceed the
number of pools available in the hardware.

v2:
 - clean up code comment in patch 5 by replacing "an" with "auto
   negotiation" based on feedback from Sergei Shtylyov
 - removed un-necessary parenthesis around function call in patch 8
   based on feedback from Sergei Shtylyov
====================

a77f9a28

net: ethernet: cpsw: improve interrupt lookup logic in cpsw_probe() · c2b32e58

由 Daniel Mack 提交于 9月 04, 2014

Simplify the interrupt resource lookup code in cpsw_probe() by the
following:

 * Only look at the first member of the resource. As the driver only
   works for DT-enabled platforms anyway, a resource of type
   IORESOURCE_IRQ will only contain one single entry
   (res->start == res->end), so there is no need for the iteration.

 * Add a bounds check to avoid overflows if we are passed more than
   ARRAY_SIZE(priv->irqs_table) resources.

 * Assign 'ret' with the return value of devm_request_irq() so that
   cpsw_probe() returns the appropriate error code.

 * If devm_request_irq() fails, report the error code in the log
   message.
Signed-off-by: NDaniel Mack <zonque@gmail.com>
Acked-by: NMugunthan V N <mugunthanvnm@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2b32e58

ipv4: fix a race in update_or_create_fnhe() · caa41527

由 Eric Dumazet 提交于 9月 03, 2014

nh_exceptions is effectively used under rcu, but lacks proper
barriers. Between kzalloc() and setting of nh->nh_exceptions(),
we need a proper memory barrier.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Fixes: 4895c771 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

caa41527

l2tp: fix missing line continuation · 29abe2fd

由 Andy Zhou 提交于 9月 03, 2014

This syntax error was covered by L2TP_REFCNT_DEBUG not being set by
default.
Signed-off-by: NAndy Zhou <azhou@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

29abe2fd

Merge branch 'amd-xgbe-next' · f35d2a5f

由 David S. Miller 提交于 9月 05, 2014

Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver updates 2014-09-03

The following series of patches includes fixes/updates to the driver.

- Query the device for the actual speed mode (KR/KX) rather than trying
  to track it
- Update parallel detection logic to support KR mode
- Fix new warnings from checkpatch in the amd-xgbe and amd-xgbe-phy
  driver

This patch series is based on net-next.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f35d2a5f

amd-xgbe-phy: Checkpatch driver fixes · b73c798b

由 Lendacky, Thomas 提交于 9月 03, 2014

This patch contains fixes identified by checkpatch when run with the
strict option.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b73c798b

amd-xgbe: Checkpatch driver fixes · a2ea14d7

由 Lendacky, Thomas 提交于 9月 03, 2014

This patch contains fixes identified by checkpatch when run with the
strict option.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2ea14d7

amd-xgbe-phy: Enhance parallel detection to support KR speed · e6f0562f

由 Lendacky, Thomas 提交于 9月 03, 2014

Add support to allow parallel detection to work in KR speed. With
both speed modes of KX and KR supported, KX must be checked first.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e6f0562f

amd-xgbe-phy: Check device for current speed mode (KR/KX) · e3eec4e7

由 Lendacky, Thomas 提交于 9月 03, 2014

Since device resets can change the current mode it's possible to think
the device is in a different mode than it actually is.  Rather than
trying to determine every place that is needed to set/save the current
mode, be safe and check the devices actual mode when needed rather than
trying to track it.
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3eec4e7

Merge branch 'r8152-next' · e4cf0b75

由 David S. Miller 提交于 9月 05, 2014

Hayes Wang says:

====================
r8152: random MAC address

If the interface has invalid MAC address, it couldn't
be used. In order to let it work normally, give a
random one.

v3:
  Remove
	ether_addr_copy(dev->perm_addr, dev->dev_addr);

v2:
  Use "%pM" format specifier for printing a MAC address.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e4cf0b75

r8152: use eth_hw_addr_random · 179bb6d7

由 hayeswang 提交于 9月 04, 2014

If the hw doesn't have a valid MAC address, give a random one and
set it to the hw.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

179bb6d7

r8152: change the location of rtl8152_set_mac_address · 8ba789ab

由 hayeswang 提交于 9月 04, 2014

Exchange the location of rtl8152_set_mac_address() and
set_ethernet_addr(). Then, the set_ethernet_addr() could
set the MAC address by calling rtl8152_set_mac_address()
later.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ba789ab

Merge branch 'rx_copybreak' · b52b7275

由 David S. Miller 提交于 9月 05, 2014

Govindarajulu Varadarajan says:

====================
enic: Add support for rx_copybreak

The following series implements rx_copybreak.

dma_map_single()/dma_unmap_single() is more expensive than alloc_skb & memcpy
for smaller packets. By doing this we can reuse the dma buff which is already
mapped. This is very useful when iommu is on. The default skb copybreak value
is 256.

When iommu is on, we can go much higher than 256. All the drivers that supports
rx_copybreak provides module parameter to change this value. Since module
parameter is the least preferred way for changing driver values, this series
adds ethtool support for setting rx_copybreak.

v4:
Validate tunable length in ethtool_get_tunable, not in driver implemented
function.

Loose tunable_ops array for each tunable type. Define one function and let the
driver use switch case for each type.

Use double underscore for data type in UAPI headers.
Use const qualifier where possible.

v3:
Add tunable namespace to ethtool. Use new ethtool cmd ETHTOOL_S/GTUNABLE to
set/get rx_copybreak from userspace.

v2:
Add new ethtool_cmd for DMA buffer parameters, instead of adding new members to
existing ethtool_ringparam.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b52b7275

enic: Add tunable_ops support for rx_copybreak · d4ad30b1

由 Govindarajulu Varadarajan 提交于 9月 03, 2014

This patch adds support for setting/getting rx_copybreak using
generic ethtool tunable.

Defines enic_get_tunable() & enic_set_tunable() to get/set rx_copybreak.
As of now, these two function supports only rx_copybreak.
Signed-off-by: NGovindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4ad30b1

ethtool: Add generic options for tunables · f0db9b07

由 Govindarajulu Varadarajan 提交于 9月 03, 2014

This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting
tunable values from driver.

Add get_tunable and set_tunable to ethtool_ops. Driver implements these
functions for getting/setting tunable value.
Signed-off-by: NGovindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f0db9b07

enic: implement rx_copybreak · a03bb56e

由 Govindarajulu Varadarajan 提交于 9月 03, 2014

Calling dma_map_single()/dma_unmap_single() is quite expensive compared
to copying a small packet. So let's copy short frames and keep the buffers
mapped.
Signed-off-by: NGovindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a03bb56e

dev_ioctl: remove dev_load() CAP_SYS_MODULE message · e020836d

由 Daniel Borkmann 提交于 9月 02, 2014

Marcel reported to see the following message when autoloading
is being triggered when adding nlmon device:

  Loading kernel module for a network device with
  CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias
  netdev-nlmon instead.

This false-positive happens despite with having correct
capabilities set, e.g. through issuing `ip link del dev nlmon`
more than once on a valid device with name nlmon, but Marcel
has also seen it on creation time when no nlmon module is
previously compiled-in or loaded as module and the device
name equals a link type name (e.g. nlmon, vxlan, team).

Stephen says:

  The netdev module alias is a hold over from the past. For
  normal devices, people used to create a alias eth0 to and
  point it to the type of network device used, that was back
  in the bad old ISA days before real discovery.

  Also, the tunnels create module alias for the control device
  and ip used to use this to autoload the tunnel device.

  The message is bogus and should just be removed, I also see
  it in a couple of other cases where tap devices are renamed
  for other usese.

As mentioned in 8909c9ad ("net: don't allow CAP_NET_ADMIN
to load non-netdev kernel modules"), we nevertheless still
might want to leave the old autoloading behaviour in place
as it could break old scripts, so for now, lets just remove
the log message as Stephen suggests.

Reference: http://thread.gmane.org/gmane.linux.kernel/1105168Reported-by: NMarcel Holtmann <marcel@holtmann.org>
Suggested-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e020836d

net: bpf: make eBPF interpreter images read-only · 60a3b225

由 Daniel Borkmann 提交于 9月 02, 2014

With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate.  This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables.  Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.
Suggested-by: NBrad Spengler <spender@grsecurity.net>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

60a3b225

05 9月, 2014 3 次提交

net: systemport: update UMAC_CMD only when link is detected · 4a804c01

由 Florian Fainelli 提交于 9月 02, 2014

When we bring the interface down, phy_stop() will schedule the PHY
state machine to call our link adjustment callback. By the time we do so,
we may have clock gated off the SYSTEMPORT hardware block, and this will
cause bus errors to happen in bcm_sysport_adj_link():

Make sure that we only touch the UMAC_CMD register when there is an
actual link. This is safe to do for two reasons:

- updating the Ethernet MAC registers only make sense when a physical
  link is present
- the PHY library state machine first set phydev->link = 0 before
  invoking phydev->adjust_link in the PHY_HALTED case

This is a similar fix to the GENET one:
c677ba8b ("net: bcmgenet: update
UMAC_CMD only when link is detected").

Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4a804c01

ipv4: implement igmp_qrv sysctl to tune igmp robustness variable · a9fe8e29

由 Hannes Frederic Sowa 提交于 9月 02, 2014

As in IPv6 people might increase the igmp query robustness variable to
make sure unsolicited state change reports aren't lost on the network. Add
and document this new knob to igmp code.

RFCs allow tuning this parameter back to first IGMP RFC, so we also use
this setting for all counters, including source specific multicast.

Also take over sysctl value when upping the interface and don't reuse
the last one seen on the interface.

Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: NFlavio Leitner <fbl@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a9fe8e29

ipv6: add sysctl_mld_qrv to configure query robustness variable · 2f711939

由 Hannes Frederic Sowa 提交于 9月 02, 2014

This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
robustness variable. It specifies how many retransmit of unsolicited mld
retransmit should happen. Admins might want to tune this on lossy links.

Also reset mld state on interface down/up, so we pick up new sysctl
settings during interface up event.

IPv6 certification requests this knob to be available.

I didn't make this knob netns specific, as it is mostly a setting in a
physical environment and should be per host.

Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: NFlavio Leitner <fbl@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f711939

04 9月, 2014 8 次提交

ixgbe: limit combined total of macvlan and SR-IOV VFs · aac2f1bf

由 Jacob Keller 提交于 8月 21, 2014

Hardware has a limited number of pools available (64). Previously, no
checks were in place to limit the number of accelerated macvlan devices
based on the number of pools. Normally this would be ok, because there
was already a limit for these well below the number of available pools.
However, SR-IOV uses the very same pools. Therefor, we need to ensure
that the total number of pools (number of VFs plus the number of non-VF
pools in use for accelerated macvlans) does not exceed the number of
pools available in hardware.

This patch resolves a kernel NULL pointer dereference caused by the following commands:

$modprobe ixgbe max_vfs=63

$ethtool -K eth2 l2-fwd-offload on

$ip link add link eth2 macvlan0 type macvlan

$ip link set dev macvlan0 up

[  992.950080] BUG: unable to handle kernel NULL pointer dereference at 0000000000000056
[  992.951109] IP: [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe]
[  992.951684] PGD 22a80e067 PUD 232e9b067 PMD 0
[  992.952389] Oops: 0000 [#1] SMP
[  992.953014] Modules linked in: nfsd lockd nfs_acl exportfs auth_rpcgss oid_registry sunrpc bridge stp llc vhost_net macvtap macvlan vhost tun kvm_intel kvm ioatdma ixgbe mdio igb dca
[  992.956042] CPU: 2 PID: 11928 Comm: ifconfig Not tainted 3.16.0-rc6-net-next-07-29-2014-FCoE+ #1
[  992.956915] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
[  992.957791] task: ffff8804341c0000 ti: ffff8801d7dc8000 task.ti: ffff8801d7dc8000
[  992.958660] RIP: 0010:[<ffffffffa003b71e>]  [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe]
[  992.959613] RSP: 0018:ffff8801d7dcbbb8  EFLAGS: 00010286
[  992.960093] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
[  992.960575] RDX: ffff880232eb7000 RSI: 0000000000000000 RDI: ffff88022dc05800
[  992.961059] RBP: ffff8801d7dcbbd8 R08: 0000000000000000 R09: 0000000000000000
[  992.961541] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88022ec20980
[  992.962023] R13: ffff880232eb7000 R14: 0000000000000001 R15: 0000000000000001
[  992.962508] FS:  00007fab264887a0(0000) GS:ffff880237640000(0000) knlGS:0000000000000000
[  992.963378] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  992.963858] CR2: 0000000000000056 CR3: 000000022a939000 CR4: 00000000001427e0
[  992.964340] Stack:
[  992.964806]  ffff88022ec28840 ffff88022ec20980 ffff88022dc05800 ffff880232eb7000
[  992.965976]  ffff8801d7dcbc28 ffffffffa003bae8 ffff8801d7dcbbe8 0000000000000400
[  992.967147]  000000000000000d ffff88022ec20980 ffff88022ec20000 ffff88022dc05800
[  992.968319] Call Trace:
[  992.968795]  [<ffffffffa003bae8>] ixgbe_fwd_ring_up+0x88/0x280 [ixgbe]
[  992.969284]  [<ffffffffa0041d83>] ixgbe_fwd_add+0x173/0x220 [ixgbe]
[  992.969767]  [<ffffffffa015056c>] macvlan_open+0x1bc/0x230 [macvlan]
[  992.970256]  [<ffffffff816b8de7>] __dev_open+0xd7/0x150
[  992.970735]  [<ffffffff816b8bd7>] __dev_change_flags+0xa7/0x170
[  992.971220]  [<ffffffff816b8ccb>] dev_change_flags+0x2b/0x70
[  992.971703]  [<ffffffff817471b2>] devinet_ioctl+0x602/0x6d0
[  992.972184]  [<ffffffff81748168>] inet_ioctl+0x78/0x90
[  992.972666]  [<ffffffff816a143b>] sock_do_ioctl+0x2b/0x70
[  992.973146]  [<ffffffff816a14ed>] sock_ioctl+0x6d/0x260
[  992.973627]  [<ffffffff811ad3b4>] do_vfs_ioctl+0x84/0x540
[  992.974109]  [<ffffffff811a4c81>] ? final_putname+0x21/0x50
[  992.974593]  [<ffffffff818725d5>] ? sysret_check+0x22/0x5d
[  992.975073]  [<ffffffff811ad901>] SyS_ioctl+0x91/0xa0
[  992.975550]  [<ffffffff818725a9>] system_call_fastpath+0x16/0x1b
[  992.976026] Code: ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 f3 4c 89 6d f8 4c 8b a7 08 02 00 00 <44> 0f b6 6e 56 44 03 af 14 02 00 00 4c 89 e7 e8 5e f2 ff ff be
[  992.982261] RIP  [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe]
[  992.983212]  RSP <ffff8801d7dcbbb8>
[  992.983681] CR2: 0000000000000056
[  992.984248] ---[ end trace 9f54802b5cc3638b ]---

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

aac2f1bf

ixgbe: add comment noting recalculation of queues · eec66731

由 Jacob Keller 提交于 8月 21, 2014

Since we previously called ixgbe_set_num_queues just prior to attempting
to set our interrupt scheme, it may be non obvious why we have to call
it again inside the function. Add a comment which helps make it more
obvious that we are resetting features based on the fact that we do not
have MSI-X enabled, and cannot use the previous settings.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

eec66731

ixgbevf: introduce delay for checking VFLINKS on 82599 · b8a2ca19

由 Emil Tantilov 提交于 8月 13, 2014

VFLINKS.LINKUP bit tends to flap when a DA or SFP+ cable is disconnected.
It can take up to 500 usecs for the LINKUP bit to be correct.

This patch resolves the issue by introducing a delay for 82599 VFs of at
least 500 usecs to make sure the VFLINKS value is correct.
Signed-off-by: NEmil Tantilov <emil.s.tantilov@intel.com>
Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

b8a2ca19

ixgbe: reset interface on link loss with pending Tx work from the VF · 07923c17

由 Emil Tantilov 提交于 8月 12, 2014

ixgbe initiates a reset of the interface on link loss with pending Tx work
in order to clear the rings.

This patch extends the pending Tx work check to the VF interfaces with the
same purpose.
Signed-off-by: NEmil Tantilov <emil.s.tantilov@intel.com>
Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

07923c17

ixgbe: Cleanup FDB handling code · bcfd3432

由 Alexander Duyck 提交于 7月 17, 2014

This change makes it so that the behavior for FDB handling is consistent
between both the SR-IOV and non-SR-IOV cases. The main change here is that we
perform bounds checking on the number of SR-IOV addresses regardless of if
SR-IOV is enabled or not as we can only support a certain number of addresses
in the hardware.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

bcfd3432

i40e: use global pci_vfs_assigned() to replace local i40e_vfs_are_assigned() · c24817b6

由 Ethan Zhao 提交于 7月 22, 2014

There is global funcion pci_vfs_assigned(), so use it instead of composing
local one.
Signed-off-by: NEthan Zhao <ethan.kernel@gmail.com>
Tested-by: NSibai Li <sibai.li@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

c24817b6

i40e/i40evf: Bump i40e/i40evf versions · e966d5c6

由 Catherine Sullivan 提交于 7月 12, 2014

Bump i40e version to 1.0.11 and i40evf version to 1.0.5.

Change-ID: I63a60fa2efe82aae87a8a3095f43218db57d46ce
Signed-off-by: NCatherine Sullivan <catherine.sullivan@intel.com>
Tested-by: NJim Young <jamesx.m.young@intel.com>

e966d5c6

i40e: fix panic due to too-early Tx queue enable · 32b5b811

由 Jesse Brandeburg 提交于 8月 12, 2014

This fixes the panic under traffic load when resetting.  This issue
could also show up if/whenever there is a Tx-timeout.

Change-ID: Ie393a1f17fd5d962e56fc3bfe784899ef25402f5
Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NJim Young <jamesx.m.young@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

32b5b811

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功