1. 09 1月, 2017 3 次提交
  2. 30 12月, 2016 1 次提交
    • M
      net: dev_weight: TX/RX orthogonality · 3d48b53f
      Matthias Tafelmeier 提交于
      Oftenly, introducing side effects on packet processing on the other half
      of the stack by adjusting one of TX/RX via sysctl is not desirable.
      There are cases of demand for asymmetric, orthogonal configurability.
      
      This holds true especially for nodes where RPS for RFS usage on top is
      configured and therefore use the 'old dev_weight'. This is quite a
      common base configuration setup nowadays, even with NICs of superior processing
      support (e.g. aRFS).
      
      A good example use case are nodes acting as noSQL data bases with a
      large number of tiny requests and rather fewer but large packets as responses.
      It's affordable to have large budget and rx dev_weights for the
      requests. But as a side effect having this large a number on TX
      processed in one run can overwhelm drivers.
      
      This patch therefore introduces an independent configurability via sysctl to
      userland.
      Signed-off-by: NMatthias Tafelmeier <matthias.tafelmeier@gmx.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3d48b53f
  3. 26 12月, 2016 1 次提交
    • T
      ktime: Get rid of the union · 2456e855
      Thomas Gleixner 提交于
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
      variant for 32bit machines. The Y2038 cleanup removed the timespec variant
      and switched everything to scalar nanoseconds. The union remained, but
      become completely pointless.
      
      Get rid of the union and just keep ktime_t as simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      2456e855
  4. 25 12月, 2016 1 次提交
  5. 09 12月, 2016 1 次提交
  6. 30 11月, 2016 1 次提交
  7. 26 11月, 2016 1 次提交
    • E
      net: properly flush delay-freed skbs · f52dffe0
      Eric Dumazet 提交于
      Typical NAPI drivers use napi_consume_skb(skb) at TX completion time.
      This put skb in a percpu special queue, napi_alloc_cache, to get bulk
      frees.
      
      It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE
      limit quite often, with skbs that were queued hundreds of usec earlier.
      I measured this can take ~6000 nsec to perform one flush.
      
      __kfree_skb_flush() can be called from two points right now :
      
      1) From net_tx_action(), but only for skbs that were queued to
      sd->completion_queue.
      
       -> Irrelevant for NAPI drivers in normal operation.
      
      2) From net_rx_action(), but only under high stress or if RPS/RFS has a
      pending action.
      
      This patch changes net_rx_action() to perform the flush in all cases and
      after more urgent operations happened (like kicking remote CPUS for
      RPS/RFS).
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f52dffe0
  8. 17 11月, 2016 3 次提交
    • E
      netpoll: more efficient locking · 89c4b442
      Eric Dumazet 提交于
      Callers of netpoll_poll_lock() own NAPI_STATE_SCHED
      
      Callers of netpoll_poll_unlock() have BH blocked between
      the NAPI_STATE_SCHED being cleared and poll_lock is released.
      
      We can avoid the spinlock which has no contention, and use cmpxchg()
      on poll_owner which we need to set anyway.
      
      This removes a possible lockdep violation after the cited commit,
      since sk_busy_loop() re-enables BH before calling busy_poll_stop()
      
      Fixes: 217f6974 ("net: busy-poll: allow preemption in sk_busy_loop()")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89c4b442
    • E
      net: busy-poll: return busypolling status to drivers · 364b6055
      Eric Dumazet 提交于
      NAPI drivers use napi_complete_done() or napi_complete() when
      they drained RX ring and right before re-enabling device interrupts.
      
      In busy polling, we can avoid interrupts being delivered since
      we are polling RX ring in a controlled loop.
      
      Drivers can chose to use napi_complete_done() return value
      to reduce interrupts overhead while busy polling is active.
      
      This is optional, legacy drivers should work fine even
      if not updated.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Adam Belay <abelay@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
      Cc: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      364b6055
    • E
      net: busy-poll: allow preemption in sk_busy_loop() · 217f6974
      Eric Dumazet 提交于
      After commit 4cd13c21 ("softirq: Let ksoftirqd do its job"),
      sk_busy_loop() needs a bit of care :
      softirqs might be delayed since we do not allow preemption yet.
      
      This patch adds preemptiom points in sk_busy_loop(),
      and makes sure no unnecessary cache line dirtying
      or atomic operations are done while looping.
      
      A new flag is added into napi->state : NAPI_STATE_IN_BUSY_POLL
      
      This prevents napi_complete_done() from clearing NAPIF_STATE_SCHED,
      so that sk_busy_loop() does not have to grab it again.
      
      Similarly, netpoll_poll_lock() is done one time.
      
      This gives about 10 to 20 % improvement in various busy polling
      tests, especially when many threads are busy polling in
      configurations with large number of NIC queues.
      
      This should allow experimenting with bigger delays without
      hurting overall latencies.
      
      Tested:
       On a 40Gb mlx4 NIC, 32 RX/TX queues.
      
       echo 70 >/proc/sys/net/core/busy_read
       for i in `seq 1 40`; do echo -n $i: ; ./super_netperf $i -H lpaa24 -t UDP_RR -- -N -n; done
      
          Before:      After:
       1:   90072   92819
       2:  157289  184007
       3:  235772  213504
       4:  344074  357513
       5:  394755  458267
       6:  461151  487819
       7:  549116  625963
       8:  544423  716219
       9:  720460  738446
      10:  794686  837612
      11:  915998  923960
      12:  937507  925107
      13: 1019677  971506
      14: 1046831 1113650
      15: 1114154 1148902
      16: 1105221 1179263
      17: 1266552 1299585
      18: 1258454 1383817
      19: 1341453 1312194
      20: 1363557 1488487
      21: 1387979 1501004
      22: 1417552 1601683
      23: 1550049 1642002
      24: 1568876 1601915
      25: 1560239 1683607
      26: 1640207 1745211
      27: 1706540 1723574
      28: 1638518 1722036
      29: 1734309 1757447
      30: 1782007 1855436
      31: 1724806 1888539
      32: 1717716 1944297
      33: 1778716 1869118
      34: 1805738 1983466
      35: 1815694 2020758
      36: 1893059 2035632
      37: 1843406 2034653
      38: 1888830 2086580
      39: 1972827 2143567
      40: 1877729 2181851
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Adam Belay <abelay@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
      Cc: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      217f6974
  9. 13 11月, 2016 1 次提交
    • M
      bpf: Fix bpf_redirect to an ipip/ip6tnl dev · 4e3264d2
      Martin KaFai Lau 提交于
      If the bpf program calls bpf_redirect(dev, 0) and dev is
      an ipip/ip6tnl, it currently includes the mac header.
      e.g. If dev is ipip, the end result is IP-EthHdr-IP instead
      of IP-IP.
      
      The fix is to pull the mac header.  At ingress, skb_postpull_rcsum()
      is not needed because the ethhdr should have been pulled once already
      and then got pushed back just before calling the bpf_prog.
      At egress, this patch calls skb_postpull_rcsum().
      
      If bpf_redirect(dev, BPF_F_INGRESS) is called,
      it also fails now because it calls dev_forward_skb() which
      eventually calls eth_type_trans(skb, dev).  The eth_type_trans()
      will set skb->type = PACKET_OTHERHOST because the mac address
      does not match the redirecting dev->dev_addr.  The PACKET_OTHERHOST
      will eventually cause the ip_rcv() errors out.  To fix this,
      ____dev_forward_skb() is added.
      
      Joint work with Daniel Borkmann.
      
      Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel")
      Fixes: 8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@fb.com>
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e3264d2
  10. 10 11月, 2016 3 次提交
  11. 08 11月, 2016 1 次提交
    • J
      net/qdisc: IFF_NO_QUEUE drivers should use consistent TX queue len · 11597084
      Jesper Dangaard Brouer 提交于
      The flag IFF_NO_QUEUE marks virtual device drivers that doesn't need a
      default qdisc attached, given they will be backed by physical device,
      that already have a qdisc attached for pushback.
      
      It is still supported to attach a qdisc to a IFF_NO_QUEUE device, as
      this can be useful for difference policy reasons (e.g. bandwidth
      limiting containers).  For this to work, the tx_queue_len need to have
      a sane value, because some qdiscs inherit/copy the tx_queue_len
      (namely, pfifo, bfifo, gred, htb, plug and sfb).
      
      Commit a813104d ("IFF_NO_QUEUE: Fix for drivers not calling
      ether_setup()") caught situations where some drivers didn't initialize
      tx_queue_len.  The problem with the commit was choosing 1 as the
      fallback value.
      
      A qdisc queue length of 1 causes more harm than good, because it
      creates hard to debug situations for userspace. It gives userspace a
      false sense of a working config after attaching a qdisc.  As low
      volume traffic (that doesn't activate the qdisc policy) works,
      like ping, while traffic that e.g. needs shaping cannot reach the
      configured policy levels, given the queue length is too small.
      
      This patch change the value to DEFAULT_TX_QUEUE_LEN, given other
      IFF_NO_QUEUE devices (that call ether_setup()) also use this value.
      
      Fixes: a813104d ("IFF_NO_QUEUE: Fix for drivers not calling ether_setup()")
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      11597084
  12. 01 11月, 2016 5 次提交
  13. 30 10月, 2016 2 次提交
    • D
      net: dev: Fix non-RCU based lower dev walker · 46b5ab1a
      David Ahern 提交于
      netdev_walk_all_lower_dev is not properly walking the lower device
      list.  Commit 1a3f060c made netdev_walk_all_lower_dev similar
      to netdev_walk_all_upper_dev_rcu and netdev_walk_all_lower_dev_rcu
      but failed to update its netdev_next_lower_dev iterator. This patch
      fixes that.
      
      Fixes: 1a3f060c ("net: Introduce new api for walking upper and
                           lower devices")
      Reported-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Tested-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      46b5ab1a
    • W
      packet: on direct_xmit, limit tso and csum to supported devices · 104ba78c
      Willem de Bruijn 提交于
      When transmitting on a packet socket with PACKET_VNET_HDR and
      PACKET_QDISC_BYPASS, validate device support for features requested
      in vnet_hdr.
      
      Drop TSO packets sent to devices that do not support TSO or have the
      feature disabled. Note that the latter currently do process those
      packets correctly, regardless of not advertising the feature.
      
      Because of SKB_GSO_DODGY, it is not sufficient to test device features
      with netif_needs_gso. Full validate_xmit_skb is needed.
      
      Switch to software checksum for non-TSO packets that request checksum
      offload if that device feature is unsupported or disabled. Note that
      similar to the TSO case, device drivers may perform checksum offload
      correctly even when not advertising it.
      
      When switching to software checksum, packets hit skb_checksum_help,
      which has two BUG_ON checksum not in linear segment. Packet sockets
      always allocate at least up to csum_start + csum_off + 2 as linear.
      
      Tested by running github.com/wdebruij/kerneltools/psock_txring_vnet.c
      
        ethtool -K eth0 tso off tx on
        psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v
        psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v -N
      
        ethtool -K eth0 tx off
        psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G
        psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G -N
      
      v2:
        - add EXPORT_SYMBOL_GPL(validate_xmit_skb_list)
      
      Fixes: d346a3fa ("packet: introduce PACKET_QDISC_BYPASS socket option")
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      104ba78c
  14. 21 10月, 2016 1 次提交
    • S
      net: add recursion limit to GRO · fcd91dd4
      Sabrina Dubroca 提交于
      Currently, GRO can do unlimited recursion through the gro_receive
      handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
      to one level with encap_mark, but both VLAN and TEB still have this
      problem.  Thus, the kernel is vulnerable to a stack overflow, if we
      receive a packet composed entirely of VLAN headers.
      
      This patch adds a recursion counter to the GRO layer to prevent stack
      overflow.  When a gro_receive function hits the recursion limit, GRO is
      aborted for this skb and it is processed normally.  This recursion
      counter is put in the GRO CB, but could be turned into a percpu counter
      if we run out of space in the CB.
      
      Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
      
      Fixes: CVE-2016-7039
      Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
      Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fcd91dd4
  15. 19 10月, 2016 1 次提交
    • I
      net: core: Correctly iterate over lower adjacency list · e4961b07
      Ido Schimmel 提交于
      Tamir reported the following trace when processing ARP requests received
      via a vlan device on top of a VLAN-aware bridge:
      
       NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
      [...]
       CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.8.0-rc7 #1
       Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
       task: ffff88017edfea40 task.stack: ffff88017ee10000
       RIP: 0010:[<ffffffff815dcc73>]  [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
      [...]
       Call Trace:
        <IRQ>
        [<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
        [<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
        [<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
        [<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
        [<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
        [<ffffffff815f2eb6>] neigh_update+0x306/0x740
        [<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
        [<ffffffff8165ea3f>] arp_process+0x66f/0x700
        [<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
        [<ffffffff8165ec29>] arp_rcv+0x139/0x1d0
        [<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
        [<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
        [<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
        [<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
        [<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
        [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
        [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
        [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
        [<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
        [<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
        [<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
        [<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
        [<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
        [<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
        [<ffffffff81374815>] ? find_next_bit+0x15/0x20
        [<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
        [<ffffffff810c9968>] ? load_balance+0x178/0x9b0
        [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
        [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
        [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
        [<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
        [<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
        [<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
        [<ffffffff81091986>] tasklet_action+0xf6/0x110
        [<ffffffff81704556>] __do_softirq+0xf6/0x280
        [<ffffffff8109213f>] irq_exit+0xdf/0xf0
        [<ffffffff817042b4>] do_IRQ+0x54/0xd0
        [<ffffffff8170214c>] common_interrupt+0x8c/0x8c
      
      The problem is that netdev_all_lower_get_next_rcu() never advances the
      iterator, thereby causing the loop over the lower adjacency list to run
      forever.
      
      Fix this by advancing the iterator and avoid the infinite loop.
      
      Fixes: 7ce856aa ("mlxsw: spectrum: Add couple of lower device helper functions")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reported-by: NTamir Winetroub <tamirw@mellanox.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e4961b07
  16. 18 10月, 2016 6 次提交
  17. 14 10月, 2016 1 次提交
  18. 13 10月, 2016 1 次提交
    • J
      net: centralize net_device min/max MTU checking · 61e84623
      Jarod Wilson 提交于
      While looking into an MTU issue with sfc, I started noticing that almost
      every NIC driver with an ndo_change_mtu function implemented almost
      exactly the same range checks, and in many cases, that was the only
      practical thing their ndo_change_mtu function was doing. Quite a few
      drivers have either 68, 64, 60 or 46 as their minimum MTU value checked,
      and then various sizes from 1500 to 65535 for their maximum MTU value. We
      can remove a whole lot of redundant code here if we simple store min_mtu
      and max_mtu in net_device, and check against those in net/core/dev.c's
      dev_set_mtu().
      
      In theory, there should be zero functional change with this patch, it just
      puts the infrastructure in place. Subsequent patches will attempt to start
      using said infrastructure, with theoretically zero change in
      functionality.
      
      CC: netdev@vger.kernel.org
      Signed-off-by: NJarod Wilson <jarod@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      61e84623
  19. 11 10月, 2016 1 次提交
    • E
      latent_entropy: Mark functions with __latent_entropy · 0766f788
      Emese Revfy 提交于
      The __latent_entropy gcc attribute can be used only on functions and
      variables.  If it is on a function then the plugin will instrument it for
      gathering control-flow entropy. If the attribute is on a variable then
      the plugin will initialize it with random contents.  The variable must
      be an integer, an integer array type or a structure with integer fields.
      
      These specific functions have been selected because they are init
      functions (to help gather boot-time entropy), are called at unpredictable
      times, or they have variable loops, each of which provide some level of
      latent entropy.
      Signed-off-by: NEmese Revfy <re.emese@gmail.com>
      [kees: expanded commit message]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      0766f788
  20. 04 10月, 2016 1 次提交
    • A
      net: Add netdev all_adj_list refcnt propagation to fix panic · 93409033
      Andrew Collins 提交于
      This is a respin of a patch to fix a relatively easily reproducible kernel
      panic related to the all_adj_list handling for netdevs in recent kernels.
      
      The following sequence of commands will reproduce the issue:
      
      ip link add link eth0 name eth0.100 type vlan id 100
      ip link add link eth0 name eth0.200 type vlan id 200
      ip link add name testbr type bridge
      ip link set eth0.100 master testbr
      ip link set eth0.200 master testbr
      ip link add link testbr mac0 type macvlan
      ip link delete dev testbr
      
      This creates an upper/lower tree of (excuse the poor ASCII art):
      
                  /---eth0.100-eth0
      mac0-testbr-
                  \---eth0.200-eth0
      
      When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted twice from
      the mac0 list. Unfortunately, during setup in __netdev_upper_dev_link, only one
      reference to eth0 is added, so this results in a panic.
      
      This change adds reference count propagation so things are handled properly.
      
      Matthias Schiffer reported a similar crash in batman-adv:
      
      https://github.com/freifunk-gluon/gluon/issues/680
      https://www.open-mesh.org/issues/247
      
      which this patch also seems to resolve.
      Signed-off-by: NAndrew Collins <acollins@cradlepoint.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93409033
  21. 25 9月, 2016 1 次提交
  22. 11 9月, 2016 1 次提交
  23. 05 9月, 2016 1 次提交
    • M
      bonding: Fix bonding crash · 24b27fc4
      Mahesh Bandewar 提交于
      Following few steps will crash kernel -
      
        (a) Create bonding master
            > modprobe bonding miimon=50
        (b) Create macvlan bridge on eth2
            > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
      	   type macvlan
        (c) Now try adding eth2 into the bond
            > echo +eth2 > /sys/class/net/bond0/bonding/slaves
            <crash>
      
      Bonding does lots of things before checking if the device enslaved is
      busy or not.
      
      In this case when the notifier call-chain sends notifications, the
      bond_netdev_event() assumes that the rx_handler /rx_handler_data is
      registered while the bond_enslave() hasn't progressed far enough to
      register rx_handler for the new slave.
      
      This patch adds a rx_handler check that can be performed right at the
      beginning of the enslave code to avoid getting into this situation.
      Signed-off-by: NMahesh Bandewar <maheshb@google.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      24b27fc4
  24. 31 8月, 2016 1 次提交