1. 09 Aug, 2019 1 commit
    • bonding: Add vlan tx offload to hw_enc_features · d595b03d
      YueHaibing committed
      As commit 30d8177e ("bonding: Always enable vlan tx offload")
      said, we should always enable bonding's vlan tx offload and pass the
      vlan packets to the slave devices with the vlan tci, letting them
      handle the vlan implementation.
      
      Now if an encapsulation protocol like VXLAN is used, skb->encapsulation
      may be set, and the packet is then passed to a vlan device that is based
      on the bonding device. However, in netif_skb_features(), the check of
      hw_enc_features:
      
      	if (skb->encapsulation)
      		features &= dev->hw_enc_features;
      
      clears NETIF_F_HW_VLAN_CTAG_TX/NETIF_F_HW_VLAN_STAG_TX. This results
      in the same issue as in commit 30d8177e:
      
      vlan_dev_hard_start_xmit
        -->dev_queue_xmit
          -->validate_xmit_skb
            -->netif_skb_features //NETIF_F_HW_VLAN_CTAG_TX is cleared
            -->validate_xmit_vlan
              -->__vlan_hwaccel_push_inside //skb->tci is cleared
      ...
       --> bond_start_xmit
         --> bond_xmit_hash //BOND_XMIT_POLICY_ENCAP34
           --> __skb_flow_dissect // nhoff point to IP header
              -->  case htons(ETH_P_8021Q)
                   // skb_vlan_tag_present is false, so
                   vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
                   //vlan point to ip header wrongly
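
A minimal sketch of the masking step quoted above, assuming made-up bit values (the kernel's real NETIF_F_* bits differ) and hypothetical helper names. It only illustrates why a device whose hw_enc_features lacks the VLAN tx bits loses VLAN tx offload for encapsulated packets, and why adding those bits to hw_enc_features keeps it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; these are not the kernel's NETIF_F_* values. */
#define F_HW_VLAN_CTAG_TX (1u << 0)
#define F_HW_VLAN_STAG_TX (1u << 1)
#define F_SG              (1u << 2)

/* Mirrors the quoted check: encapsulated packets are limited to hw_enc_features. */
static uint32_t effective_features(uint32_t features, uint32_t hw_enc_features,
                                   bool encapsulation)
{
    if (encapsulation)
        features &= hw_enc_features;
    return features;
}

/* True when at least one VLAN tx offload bit survived the masking. */
static bool vlan_tx_offload_kept(uint32_t effective)
{
    return (effective & (F_HW_VLAN_CTAG_TX | F_HW_VLAN_STAG_TX)) != 0;
}
```

With hw_enc_features set to F_SG only, an encapsulated packet loses both VLAN bits; once the VLAN bits are part of hw_enc_features, they survive.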
      
      Fixes: b2a103e6 ("bonding: convert to ndo_fix_features")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 23 Jul, 2019 1 commit
    • bonding: Force slave speed check after link state recovery for 802.3ad · 12185dfe
      Thomas Falcon committed
      The following scenario was encountered during testing of logical
      partition mobility on pseries partitions with bonded ibmvnic
      adapters in LACP mode.
      
      1. Driver receives a signal that the device has been
         swapped, and it needs to reset to initialize the new
         device.
      
      2. Driver reports loss of carrier and begins initialization.
      
      3. Bonding driver receives NETDEV_CHANGE notifier and checks
         the slave's current speed and duplex settings. Because these
         are unknown at the time, the bond sets its link state to
         BOND_LINK_FAIL and handles the speed update, clearing
         AD_PORT_LACP_ENABLE.
      
      4. Driver finishes recovery and reports that the carrier is on.
      
      5. Bond receives a new notification and checks the speed again.
         The speeds are valid but miimon has not altered the link
         state yet.  AD_PORT_LACP_ENABLE remains off.
      
      Because the slave's link state is still BOND_LINK_FAIL,
      no further port checks are made when it recovers. Though
      the slave devices are operational and have valid speed
      and duplex settings, the bond will not send LACPDUs. The
      simplest fix I can see is to force another speed check
      in bond_miimon_commit. This way the bond will update
      AD_PORT_LACP_ENABLE if needed when transitioning from
      BOND_LINK_FAIL to BOND_LINK_UP.
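
The scenario can be replayed with a toy model. This is a sketch under stated assumptions, not the driver's code: the struct, the function names, and the rule that the speed check only enables LACP while the link state is already UP are all assumptions made to reproduce steps 2 to 5 as the message describes them.

```c
#include <stdbool.h>

enum link_state { LINK_UP, LINK_FAIL, LINK_DOWN };  /* stand-ins for BOND_LINK_* */

struct slave_model {
    enum link_state link;
    bool speed_valid;    /* does the driver report valid speed/duplex? */
    bool lacp_enabled;   /* stand-in for AD_PORT_LACP_ENABLE */
};

/* Assumed rule: the speed check only enables LACP while the link is UP. */
static void speed_check(struct slave_model *s)
{
    s->lacp_enabled = (s->link == LINK_UP) && s->speed_valid;
}

/* Models the miimon commit step moving a recovered slave from FAIL to UP. */
static void miimon_commit(struct slave_model *s, bool force_speed_check)
{
    if (s->link == LINK_FAIL && s->speed_valid) {
        s->link = LINK_UP;
        if (force_speed_check)
            speed_check(s);  /* the extra check the commit message proposes */
    }
}

/* Replays steps 2-5 plus the commit; returns whether LACP ends up enabled. */
static bool replay(bool force_speed_check)
{
    struct slave_model s = { LINK_UP, true, true };

    s.speed_valid = false;  /* step 2: carrier lost, speed unknown */
    s.link = LINK_FAIL;     /* step 3: bond marks the slave as failing */
    speed_check(&s);        /* step 3: LACP gets disabled */
    s.speed_valid = true;   /* step 4: driver finishes recovery */
    speed_check(&s);        /* step 5: link still FAIL, LACP stays off */
    miimon_commit(&s, force_speed_check);
    return s.lacp_enabled;
}
```

In this model `replay(false)` leaves LACP disabled even though the link is back up, while `replay(true)` re-enables it during the commit step.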
      
      CC: Jarod Wilson <jarod@redhat.com>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 05 Jul, 2019 1 commit
    • bonding: add an option to specify a delay between peer notifications · 07a4ddec
      Vincent Bernat committed
      Currently, gratuitous ARP/ND packets are sent every `miimon'
      milliseconds. This commit allows a user to specify a custom delay
      through a new option, `peer_notif_delay'.
      
      Like for `updelay' and `downdelay', this delay should be a multiple of
      `miimon' to avoid managing an additional work queue. The configuration
      logic is copied from `updelay' and `downdelay'. However, the default
      value cannot be set using a module parameter: Netlink or sysfs should
      be used to configure this feature.
      
      When setting `miimon' to 100 and `peer_notif_delay' to 500, we can
      observe the 500 ms delay is respected:
      
          20:30:19.354693 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:19.874892 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:20.394919 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:20.914963 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
      
      In bond_mii_monitor(), I have tried to keep the lock logic readable.
      The change is due to the fact we cannot rely on a notification to
      lower the value of `bond->send_peer_notif' as `NETDEV_NOTIFY_PEERS' is
      only triggered once every N times, while we need to decrement the
      counter each time.
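
The counting scheme can be sketched as follows. This is a simplified model, not the driver's implementation: the struct and function names are hypothetical, and storing the delay as a number of `miimon' ticks (rounded down, minimum 1) is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical model: the delay is kept as a count of miimon ticks. */
struct notif_model {
    int delay_ticks;       /* peer_notif_delay / miimon, at least 1 */
    int send_peer_notif;   /* remaining ticks, decremented on every tick */
};

/* Converts a delay in ms to ticks; rounding down is this sketch's choice. */
static int delay_to_ticks(int peer_notif_delay_ms, int miimon_ms)
{
    int ticks = (miimon_ms > 0) ? peer_notif_delay_ms / miimon_ms : 0;
    return ticks > 0 ? ticks : 1;
}

/* Arms the counter for a given number of notifications. */
static void arm(struct notif_model *m, int num_notifications)
{
    m->send_peer_notif = num_notifications * m->delay_ticks;
}

/* One miimon tick: always decrement, notify only once per delay_ticks. */
static bool tick(struct notif_model *m)
{
    bool notify;

    if (m->send_peer_notif <= 0)
        return false;
    notify = (m->send_peer_notif % m->delay_ticks) == 0;
    m->send_peer_notif--;
    return notify;
}
```

With `miimon' at 100 and `peer_notif_delay' at 500, the model notifies on the first tick and then on every fifth tick after it, while the counter drops on each tick regardless.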
      
      iproute2 also needs to be updated to be able to specify this new
      attribute through `ip link'.
      Signed-off-by: Vincent Bernat <vincent@bernat.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 04 Jul, 2019 1 commit
  5. 03 Jul, 2019 1 commit
    • bonding/main: fix NULL dereference in bond_select_active_slave() · b8bd72d3
      Eric Dumazet committed
      A bonding master can be up while best_slave is NULL.
      
      [12105.636318] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [12105.638204] mlx4_en: eth1: Linkstate event 1 -> 1
      [12105.648984] IP: bond_select_active_slave+0x125/0x250
      [12105.653977] PGD 0 P4D 0
      [12105.656572] Oops: 0000 [#1] SMP PTI
      [12105.660487] gsmi: Log Shutdown Reason 0x03
      [12105.664620] Modules linked in: kvm_intel loop act_mirred uhaul vfat fat stg_standard_ftl stg_megablocks stg_idt stg_hdi stg elephant_dev_num stg_idt_eeprom w1_therm wire i2c_mux_pca954x i2c_mux mlx4_i2c i2c_usb cdc_acm ehci_pci ehci_hcd i2c_iimc mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core [last unloaded: kvm_intel]
      [12105.685686] mlx4_core 0000:03:00.0: dispatching link up event for port 2
      [12105.685700] mlx4_en: eth2: Linkstate event 2 -> 1
      [12105.685700] mlx4_en: eth2: Link Up (linkstate)
      [12105.724452] Workqueue: bond0 bond_mii_monitor
      [12105.728854] RIP: 0010:bond_select_active_slave+0x125/0x250
      [12105.734355] RSP: 0018:ffffaf146a81fd88 EFLAGS: 00010246
      [12105.739637] RAX: 0000000000000003 RBX: ffff8c62b03c6900 RCX: 0000000000000000
      [12105.746838] RDX: 0000000000000000 RSI: ffffaf146a81fd08 RDI: ffff8c62b03c6000
      [12105.754054] RBP: ffffaf146a81fdb8 R08: 0000000000000001 R09: ffff8c517d387600
      [12105.761299] R10: 00000000001075d9 R11: ffffffffaceba92f R12: 0000000000000000
      [12105.768553] R13: ffff8c8240ae4800 R14: 0000000000000000 R15: 0000000000000000
      [12105.775748] FS:  0000000000000000(0000) GS:ffff8c62bfa40000(0000) knlGS:0000000000000000
      [12105.783892] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [12105.789716] CR2: 0000000000000000 CR3: 0000000d0520e001 CR4: 00000000001626f0
      [12105.796976] Call Trace:
      [12105.799446]  [<ffffffffac31d387>] bond_mii_monitor+0x497/0x6f0
      [12105.805317]  [<ffffffffabd42643>] process_one_work+0x143/0x370
      [12105.811225]  [<ffffffffabd42c7a>] worker_thread+0x4a/0x360
      [12105.816761]  [<ffffffffabd48bc5>] kthread+0x105/0x140
      [12105.821865]  [<ffffffffabd42c30>] ? rescuer_thread+0x380/0x380
      [12105.827757]  [<ffffffffabd48ac0>] ? kthread_associate_blkcg+0xc0/0xc0
      [12105.834266]  [<ffffffffac600241>] ret_from_fork+0x51/0x60
      
      Fixes: e2a7420d ("bonding/main: convert to using slave printk macros")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: John Sperbeck <jsperbeck@google.com>
      Cc: Jarod Wilson <jarod@redhat.com>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 26 Jun, 2019 1 commit
    • bonding: Always enable vlan tx offload · 30d8177e
      YueHaibing committed
      We build a vlan on top of a bonding interface whose vlan offload
      is off; the bond mode is 802.3ad (LACP) and xmit_hash_policy is
      BOND_XMIT_POLICY_ENCAP34.
      
      Because vlan tx offload is off, the vlan tci is cleared and the vlan
      header is pushed into the skb in validate_xmit_vlan() while sending
      from vlan devices. Then in bond_xmit_hash, __skb_flow_dissect() fails
      to get information from the protocol headers encapsulated within the
      vlan, because 'nhoff' points to the IP header, so bond hashing is based
      on layer 2 info, which fails to distribute packets across slaves.
      
      This patch always enables bonding's vlan tx offload and passes the vlan
      packets to the slave devices with the vlan tci, letting them handle
      the vlan implementation.
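
To make the 'nhoff' problem concrete, here is a small sketch with a hand-written frame. The frame bytes and helper names are invented for illustration; only the on-wire layout of Ethernet, 802.1Q, and IPv4 headers is standard. It shows what a parser reads when it interprets the bytes at the IP header as if they were a VLAN header.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN  14   /* destination MAC + source MAC + EtherType */
#define VLAN_HLEN 4    /* TCI + encapsulated EtherType */

/* Reads a big-endian 16-bit value from the frame. */
static uint16_t be16_at(const uint8_t *frame, size_t off)
{
    return (uint16_t)((frame[off] << 8) | frame[off + 1]);
}

/*
 * Treats the 4 bytes at nhoff as a VLAN header (TCI, encapsulated protocol)
 * and returns the encapsulated protocol field.
 */
static uint16_t vlan_encap_proto(const uint8_t *frame, size_t nhoff)
{
    return be16_at(frame, nhoff + 2);
}

/* Made-up tagged IPv4 frame: Ethernet header, VLAN header, start of IPv4. */
static const uint8_t tagged_frame[] = {
    0x02, 0x00, 0x00, 0x00, 0x00, 0x01,  /* destination MAC */
    0x02, 0x00, 0x00, 0x00, 0x00, 0x02,  /* source MAC */
    0x81, 0x00,                          /* EtherType: 802.1Q */
    0x00, 0x64, 0x08, 0x00,              /* TCI (VLAN 100), proto IPv4 */
    0x45, 0x00, 0x00, 0x54,              /* IPv4: version/IHL, TOS, length 84 */
};
```

Reading at offset `ETH_HLEN` yields 0x0800 (IPv4), as expected. Reading at `ETH_HLEN + VLAN_HLEN`, where the IP header starts, yields 0x0054: the IPv4 total length misread as a protocol number, so no layer 3/4 fields can be extracted for hashing.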
      
      Fixes: 278339a4 ("bonding: propogate vlan_features to bonding master")
      Suggested-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 10 Jun, 2019 3 commits
    • bonding/main: convert to using slave printk macros · e2a7420d
      Jarod Wilson committed
      All of these printk instances benefit from having both master and slave
      device information included, so convert to using a standardized macro
      format and remove redundant information.
      Suggested-by: Joe Perches <joe@perches.com>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: fix error messages in bond_do_fail_over_mac · f43b6530
      Jarod Wilson committed
      Passing the bond name again to debug output when referencing slave is wrong.
      We're trying to set the bond's MAC to that of the new_active slave, so adjust
      the error message slightly and pass in the slave's name, not the bond's.
      Then we're trying to set the MAC on the old active slave, but putting the
      new active slave's name in the output. While we're at it, clarify the
      error messages so you know which one actually triggered.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: improve event debug usability · 75466dce
      Jarod Wilson committed
      Seeing bonding debug log data along the lines of "event: 5" is a bit spartan,
      and often requires a lookup table if you don't remember what every event is.
      Make use of netdev_cmd_to_name for an improved debugging experience, so for
      the prior example, you'll see: "bond_netdev_event received NETDEV_REGISTER"
      instead (both are prefixed with the device for which the event pertains).
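
The kind of lookup that replaces a bare event number can be sketched in user space. This is an illustration of the idea, not the kernel's netdev_cmd_to_name(): the table covers only the first few event numbers, the function name is hypothetical, and the "UNKNOWN" fallback string is this sketch's own choice.

```c
#include <stddef.h>

/* First few notifier event numbers; index 5 matches the example above. */
static const char *const event_names[] = {
    NULL,               /* 0 is not used as an event number here */
    "NETDEV_UP",
    "NETDEV_DOWN",
    "NETDEV_REBOOT",
    "NETDEV_CHANGE",
    "NETDEV_REGISTER",
    "NETDEV_UNREGISTER",
};

/* Maps an event number to a readable name, or "UNKNOWN" if not in the table. */
static const char *event_to_name(unsigned long event)
{
    size_t n = sizeof(event_names) / sizeof(event_names[0]);

    if (event < n && event_names[event] != NULL)
        return event_names[event];
    return "UNKNOWN";
}
```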
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 05 Jun, 2019 1 commit
    • net: bonding: Inherit MPLS features from slave devices · 2e770b50
      Ariel Levkovich committed
      When setting the bonding interface net device features,
      the kernel code doesn't address the slaves' MPLS features
      and doesn't inherit them.
      
      Therefore, HW offloads that enhance performance such as
      checksumming and TSO are disabled for MPLS tagged traffic
      flowing via the bonding interface.
      
      The patch adds inheritance of the MPLS features from the
      slave devices, using logic similar to that which sets the bonding
      device's VLAN and encapsulation features.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 27 May, 2019 1 commit
    • bonding/802.3ad: fix slave link initialization transition states · 33403121
      Jarod Wilson committed
      Once in a while, with just the right timing, 802.3ad slaves will fail to
      properly initialize, winding up in a weird state, with a partner system
      mac address of 00:00:00:00:00:00. This started happening after a fix to
      properly track link_failure_count, where an 802.3ad slave that
      reported itself as link up in the miimon code, but wasn't able to get a
      valid speed/duplex, started getting set to BOND_LINK_FAIL instead of
      BOND_LINK_DOWN. That was the proper thing to do for the general "my link
      went down" case, but has created a link initialization race that can put
      the interface in this odd state.
      
      The simple fix is to instead set the slave link to BOND_LINK_DOWN again,
      if the link has never been up (last_link_up == 0), so the link state
      doesn't bounce from BOND_LINK_DOWN to BOND_LINK_FAIL -- it hasn't failed
      in this case, it simply hasn't been up yet, and this prevents the
      unnecessary state change from DOWN to FAIL and getting stuck in an init
      failure w/o a partner mac.
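
The decision described above fits in a few lines. A minimal sketch, assuming a hypothetical function name and stand-in enum; only the rule itself (never-up links stay DOWN, previously-up links go to FAIL) comes from the message.

```c
enum link_state { LINK_UP, LINK_FAIL, LINK_DOWN };  /* stand-ins for BOND_LINK_* */

/*
 * State to assign when an 802.3ad slave reports link up but its
 * speed/duplex cannot be read. last_link_up == 0 means the link has
 * never been up, so it has not "failed"; it simply stays DOWN.
 */
static enum link_state state_on_invalid_speed(unsigned long last_link_up)
{
    return last_link_up ? LINK_FAIL : LINK_DOWN;
}
```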
      
      Fixes: ea53abfa ("bonding/802.3ad: fix link_failure_count tracking")
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      Tested-by: Heesoon Kim <Heesoon.Kim@stratus.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 16 Apr, 2019 1 commit
    • bonding: fix event handling for stacked bonds · 92480b39
      Sabrina Dubroca committed
      When a bond is enslaved to another bond, bond_netdev_event() only
      handles the event as if the bond is a master, and skips treating the
      bond as a slave.
      
      This leads to a refcount leak on the slave, since we don't remove the
      adjacency to its master and the master holds a reference on the slave.
      
      Reproducer:
        ip link add bondL type bond
        ip link add bondU type bond
        ip link set bondL master bondU
        ip link del bondL
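
The dispatch problem can be modelled with two flags. This is a simplified sketch with hypothetical names and flag values; it ignores return codes and only shows that a device which is both a master and a slave (bondL in the reproducer) must reach both handling paths.

```c
#include <stdbool.h>

#define FLAG_MASTER 0x1u  /* stand-in for IFF_MASTER */
#define FLAG_SLAVE  0x2u  /* stand-in for IFF_SLAVE */

struct handled {
    bool as_master;
    bool as_slave;
};

/* Behaviour described as broken: a master never reaches the slave path. */
static struct handled dispatch_old(unsigned int flags)
{
    struct handled h = { false, false };

    if (flags & FLAG_MASTER) {
        h.as_master = true;
        return h;
    }
    if (flags & FLAG_SLAVE)
        h.as_slave = true;
    return h;
}

/* Fixed behaviour in this model: both roles are handled when both apply. */
static struct handled dispatch_fixed(unsigned int flags)
{
    struct handled h = { false, false };

    if (flags & FLAG_MASTER)
        h.as_master = true;
    if (flags & FLAG_SLAVE)
        h.as_slave = true;
    return h;
}
```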
      
      No "Fixes:" tag, this code is older than git history.
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 21 Mar, 2019 1 commit
    • net: remove 'fallback' argument from dev->ndo_select_queue() · a350ecce
      Paolo Abeni committed
      After the previous patch, all the callers of ndo_select_queue()
      provide as a 'fallback' argument netdev_pick_tx.
      The only exceptions are nested calls to ndo_select_queue(),
      which pass down the 'fallback' available in the current scope
      - still netdev_pick_tx.
      
      We can drop this argument and replace the fallback() invocation with
      netdev_pick_tx(). This avoids an indirect call per xmit packet
      in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen)
      with device drivers implementing such ndo. It also cleans up the code
      a bit.
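
The shape of the change can be shown with a user-space analogy. The types and functions below are hypothetical stand-ins (a `struct pkt` instead of an skb, `pick_tx` instead of netdev_pick_tx) and do not reproduce the real ndo_select_queue() signature.

```c
#include <stdint.h>

#define NUM_QUEUES 4u

struct pkt {
    uint32_t hash;  /* hypothetical stand-in for an skb's flow hash */
};

/* Stand-in for the default queue picker. */
static uint16_t pick_tx(const struct pkt *p)
{
    return (uint16_t)(p->hash % NUM_QUEUES);
}

typedef uint16_t (*fallback_fn)(const struct pkt *p);

/* Before: the default picker is passed in and called through a pointer. */
static uint16_t select_queue_old(const struct pkt *p, fallback_fn fallback)
{
    return fallback(p);  /* indirect call */
}

/* After: the argument is gone and the default picker is called directly. */
static uint16_t select_queue_new(const struct pkt *p)
{
    return pick_tx(p);   /* direct call */
}
```

Since every caller passed the same function, both variants return the same queue; the second simply avoids the call through a pointer.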
      
      Tested with ixgbe and CONFIG_FCOE=m
      
      With pktgen using queue xmit:
      threads   vanilla (kpps)   patched (kpps)
      1         2334             2428
      2         4166             4278
      4         7895             8100
      
       v1 -> v2:
       - rebased after helper's name change
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 25 Feb, 2019 1 commit
  13. 22 Feb, 2019 1 commit
    • bonding: fix PACKET_ORIGDEV regression · 3c963a33
      Michal Soltys committed
      This patch fixes a subtle PACKET_ORIGDEV regression which was a side
      effect of fixes introduced by:
      
      6a9e461f bonding: pass link-local packets to bonding master also.
      
      ... to:
      
      b89f04c6 bonding: deliver link-local packets with skb->dev set to link that packets arrived on
      
      While 6a9e461f restored pre-b89f04c6 presence of link-local
      packets on bonding masters (which is required e.g. by linux bridges
      participating in spanning tree or needed for lab-like setups created
      with group_fwd_mask) it also caused the originating device
      information to be lost due to cloning.
      
      Maciej Żenczykowski proposed another solution that doesn't require
      packet cloning and retains original device information - instead of
      returning RX_HANDLER_PASS for all link-local packets it's now limited
      only to packets from inactive slaves.
      
      At the same time, packets passed to bonding masters retain correct
      information about the originating device and PACKET_ORIGDEV can be used
      to determine it.
      
      This elegantly solves all issues so far:
      
      - link-local packets that were removed from bonding masters
      - LLDP daemons being forced to explicitly bind to slave interfaces
      - PACKET_ORIGDEV having no effect on bond interfaces
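
The receive decision described above can be sketched as a small classifier. This is a simplified model, not the driver's rx handler: the enum values are stand-ins, `slave_inactive` collapses the real delivery checks into one flag, and the result names other than PASS are assumptions of the sketch. The link-local test follows the reserved 01:80:C2:00:00:0X destination range.

```c
#include <stdbool.h>
#include <stdint.h>

enum rx_result {
    RX_TO_MASTER,  /* hand the packet to the bond master, origin retained */
    RX_EXACT,      /* deliver only to exact-match listeners on the slave */
    RX_PASS,       /* let the stack see the packet on the slave itself */
};

/* True for destination addresses in the 01:80:C2:00:00:0X range. */
static bool is_link_local(const uint8_t dst[6])
{
    static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };
    int i;

    for (i = 0; i < 5; i++)
        if (dst[i] != prefix[i])
            return false;
    return (dst[5] & 0xf0) == 0;
}

/* Link-local packets get PASS only when they arrive on an inactive slave. */
static enum rx_result classify(const uint8_t dst[6], bool slave_inactive)
{
    if (slave_inactive)
        return is_link_local(dst) ? RX_PASS : RX_EXACT;
    return RX_TO_MASTER;
}
```

An LLDP frame (destination 01:80:c2:00:00:0e) on an inactive slave is therefore still visible on that slave, while the same frame on an active slave goes to the master without cloning.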
      
      Fixes: 6a9e461f (bonding: pass link-local packets to bonding master also.)
      Reported-by: Vincent Bernat <vincent@bernat.ch>
      Signed-off-by: Michal Soltys <soltys@ziu.info>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 11 Jan, 2019 1 commit
    • bonding: update nest level on unlink · 001e465f
      Willem de Bruijn committed
      A network device stack with multiple layers of bonding devices can
      trigger a false positive lockdep warning. Adding lockdep nest levels
      fixes this. Update the level on both enslave and unlink, to avoid the
      following series of events ..
      
          ip netns add test
          ip netns exec test bash
          ip link set dev lo addr 00:11:22:33:44:55
          ip link set dev lo down
      
          ip link add dev bond1 type bond
          ip link add dev bond2 type bond
      
          ip link set dev lo master bond1
          ip link set dev bond1 master bond2
      
          ip link set dev bond1 nomaster
          ip link set dev bond2 master bond1
      
      .. from still generating a splat:
      
          [  193.652127] ======================================================
          [  193.658231] WARNING: possible circular locking dependency detected
          [  193.664350] 4.20.0 #8 Not tainted
          [  193.668310] ------------------------------------------------------
          [  193.674417] ip/15577 is trying to acquire lock:
          [  193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290
          [  193.687851]
          	       but task is already holding lock:
          [  193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290
      
          [..]
      
          [  193.851092]        lock_acquire+0xa7/0x190
          [  193.855138]        _raw_spin_lock_nested+0x2d/0x40
          [  193.859878]        bond_get_stats+0x58/0x290
          [  193.864093]        dev_get_stats+0x5a/0xc0
          [  193.868140]        bond_get_stats+0x105/0x290
          [  193.872444]        dev_get_stats+0x5a/0xc0
          [  193.876493]        rtnl_fill_stats+0x40/0x130
          [  193.880797]        rtnl_fill_ifinfo+0x6c5/0xdc0
          [  193.885271]        rtmsg_ifinfo_build_skb+0x86/0xe0
          [  193.890091]        rtnetlink_event+0x5b/0xa0
          [  193.894320]        raw_notifier_call_chain+0x43/0x60
          [  193.899225]        netdev_change_features+0x50/0xa0
          [  193.904044]        bond_compute_features.isra.46+0x1ab/0x270
          [  193.909640]        bond_enslave+0x141d/0x15b0
          [  193.913946]        do_set_master+0x89/0xa0
          [  193.918016]        do_setlink+0x37c/0xda0
          [  193.921980]        __rtnl_newlink+0x499/0x890
          [  193.926281]        rtnl_newlink+0x48/0x70
          [  193.930238]        rtnetlink_rcv_msg+0x171/0x4b0
          [  193.934801]        netlink_rcv_skb+0xd1/0x110
          [  193.939103]        rtnetlink_rcv+0x15/0x20
          [  193.943151]        netlink_unicast+0x3b5/0x520
          [  193.947544]        netlink_sendmsg+0x2fd/0x3f0
          [  193.951942]        sock_sendmsg+0x38/0x50
          [  193.955899]        ___sys_sendmsg+0x2ba/0x2d0
          [  193.960205]        __x64_sys_sendmsg+0xad/0x100
          [  193.964687]        do_syscall_64+0x5a/0x460
          [  193.968823]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 7e2556e4 ("bonding: avoid lockdep confusion in bond_get_stats()")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 14 Dec, 2018 3 commits
  16. 07 Dec, 2018 1 commit
  17. 05 Nov, 2018 1 commit
    • bonding/802.3ad: fix link_failure_count tracking · ea53abfa
      Jarod Wilson committed
      Commit 4d2c0cda ("bonding: speed/duplex update at NETDEV_UP event")
      set slave->link to BOND_LINK_DOWN for 802.3ad bonds whenever invalid
      speed/duplex values were read, to fix a problem with slaves getting
      into weird states. In the process, it broke tracking of link failures:
      going straight to BOND_LINK_DOWN when a link is indeed down (cable
      pulled, switch rebooted) means we broke out of bond_miimon_inspect()'s
      BOND_LINK_DOWN case because !link_state was already true, we never
      incremented commit, and we never got a chance to call
      bond_miimon_commit(), where slave->link_failure_count would be
      incremented. I believe the simple fix here is to mark the slave as
      BOND_LINK_FAIL, and let bond_miimon_inspect() transition the link from
      _FAIL to either _UP or _DOWN; in the latter case, we now get proper
      incrementing of link_failure_count again.
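
The accounting problem can be reproduced with a heavily simplified inspect/commit model. The names are hypothetical, delays and intermediate states are ignored, and the only behaviour taken from the message is that the failure counter is incremented in the commit step, which runs only when the inspect step reports a pending change.

```c
#include <stdbool.h>

enum link_state { LINK_UP, LINK_FAIL, LINK_DOWN };  /* stand-ins for BOND_LINK_* */

struct slave_model {
    enum link_state link;
    int link_failure_count;
};

/* Inspect: returns true when a state change is pending and commit must run. */
static bool inspect(const struct slave_model *s, bool carrier_ok)
{
    switch (s->link) {
    case LINK_UP:
        return !carrier_ok;  /* link just went away */
    case LINK_FAIL:
        return true;         /* FAIL always resolves to UP or DOWN */
    case LINK_DOWN:
        return carrier_ok;   /* still down: nothing to commit */
    }
    return false;
}

/* Commit: apply the transition; a failure is counted when settling on DOWN. */
static void commit(struct slave_model *s, bool carrier_ok)
{
    if (carrier_ok) {
        s->link = LINK_UP;
        return;
    }
    s->link = LINK_DOWN;
    s->link_failure_count++;
}

/* Failure count after one monitor pass with no carrier, given the marking. */
static int failures_after_link_loss(enum link_state marked_as)
{
    struct slave_model s = { marked_as, 0 };

    if (inspect(&s, false))
        commit(&s, false);
    return s.link_failure_count;
}
```

Marking the slave DOWN directly skips the commit step and leaves the counter at 0; marking it FAIL lets the monitor pass count the failure.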
      
      Fixes: 4d2c0cda ("bonding: speed/duplex update at NETDEV_UP event")
      CC: Mahesh Bandewar <maheshb@google.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      CC: stable@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 20 Oct, 2018 1 commit
    • netpoll: allow cleanup to be synchronous · c9fbd71f
      Debabrata Banerjee committed
      This fixes a problem introduced by:
      commit 2cde6acd ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")
      
      When using netconsole on a bond, __netpoll_cleanup can asynchronously
      recurse multiple times, each __netpoll_free_async call can result in
      more __netpoll_free_async's. This means there is now a race between
      cleanup_work queues on multiple netpoll_info's on multiple devices and
      the configuration of a new netpoll. For example if a netconsole is set
      to enable 0, reconfigured, and enable 1 immediately, this netconsole
      will likely not work.
      
      The reason __netpoll_free_async exists is that cleanup can be called
      when rtnl is not locked; if it is locked, we should be able to execute
      synchronously. It appears to be locked everywhere it's called from.
      
      Generalize the design pattern from the teaming driver for current
      callers of __netpoll_free_async.
      
      CC: Neil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 03 Oct, 2018 1 commit
  20. 27 Sep, 2018 2 commits
    • bonding: avoid possible dead-lock · d4859d74
      Mahesh Bandewar committed
      Syzkaller reported this on a slightly older kernel but it's still
      applicable to the current kernel -
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.18.0-next-20180823+ #46 Not tainted
      ------------------------------------------------------
      syz-executor4/26841 is trying to acquire lock:
      00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652
      
      but task is already holding lock:
      00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline]
      00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (rtnl_mutex){+.+.}:
             __mutex_lock_common kernel/locking/mutex.c:925 [inline]
             __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
             mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
             rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77
             bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline]
             bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320
             process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153
             worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
             kthread+0x35a/0x420 kernel/kthread.c:246
             ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
      
      -> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}:
             process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129
             worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
             kthread+0x35a/0x420 kernel/kthread.c:246
             ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
      
      -> #0 ((wq_completion)bond_dev->name){+.+.}:
             lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
             flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
             drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
             destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
             __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
             bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
             register_netdevice+0x337/0x1100 net/core/dev.c:8410
             bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
             rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
             rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
             netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
             rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
             netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
             netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
             netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
             sock_sendmsg_nosec net/socket.c:622 [inline]
             sock_sendmsg+0xd5/0x120 net/socket.c:632
             ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
             __sys_sendmsg+0x11d/0x290 net/socket.c:2153
             __do_sys_sendmsg net/socket.c:2162 [inline]
             __se_sys_sendmsg net/socket.c:2160 [inline]
             __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
             do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
             entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      other info that might help us debug this:
      
      Chain exists of:
        (wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(rtnl_mutex);
                                     lock((work_completion)(&(&nnw->work)->work));
                                     lock(rtnl_mutex);
        lock((wq_completion)bond_dev->name);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor4/26841:
      
      stack backtrace:
      CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
       print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222
       check_prev_add kernel/locking/lockdep.c:1862 [inline]
       check_prevs_add kernel/locking/lockdep.c:1975 [inline]
       validate_chain kernel/locking/lockdep.c:2416 [inline]
       __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412
       lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
       flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
       drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
       destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
       __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
       bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
       register_netdevice+0x337/0x1100 net/core/dev.c:8410
       bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
       rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
       rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
       netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:632
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
       __sys_sendmsg+0x11d/0x290 net/socket.c:2153
       __do_sys_sendmsg net/socket.c:2162 [inline]
       __se_sys_sendmsg net/socket.c:2160 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457089
      Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089
      RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
      RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: pass link-local packets to bonding master also. · 6a9e461f
      Mahesh Bandewar committed
      Commit b89f04c6 ("bonding: deliver link-local packets with
      skb->dev set to link that packets arrived on") changed the behavior
      of how link-local-multicast packets are processed. The change in
      the behavior broke some legacy use cases where these packets are
      expected to arrive on bonding master device also.
      
      This patch passes the packet to the stack with the link it arrived
      on and also passes it to the bonding-master device, preserving the
      legacy use case.
      
      Fixes: b89f04c6 ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on")
      Reported-by: Michal Soltys <soltys@ziu.info>
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6a9e461f
  21. 24 Sep, 2018 1 commit
  22. 02 Aug, 2018 1 commit
    • E
      bonding: avoid lockdep confusion in bond_get_stats() · 7e2556e4
      Eric Dumazet committed
      syzbot found that the following sequence produces a LOCKDEP splat [1]
      
      ip link add bond10 type bond
      ip link add bond11 type bond
      ip link set bond11 master bond10
      
      To fix this, we can use the already provided nest_level.
      
      This patch also provides correct nesting for dev->addr_list_lock
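
      Why a nest level silences the report can be illustrated with a toy
      userspace model. This is a sketch under stated assumptions, not the
      kernel's lockdep: toy_acquire(), struct task_locks and MAX_HELD are
      hypothetical names. It only models the idea that two acquisitions
      look recursive when they share both a lock class and a subclass, so
      giving the nested bond a different subclass makes them distinct.

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      #define MAX_HELD 8	/* hypothetical cap on locks tracked per task */

      /* One entry per lock currently held: its class plus a subclass. */
      struct held_lock {
      	const void *class_key;
      	unsigned int subclass;
      };

      struct task_locks {
      	struct held_lock held[MAX_HELD];
      	size_t depth;
      };

      /*
       * Toy check: refuse (return false) when the task already holds a lock
       * with the same class and the same subclass, or when the table is full.
       * Otherwise record the acquisition and return true.
       */
      static bool toy_acquire(struct task_locks *t, const void *class_key,
      			unsigned int subclass)
      {
      	for (size_t i = 0; i < t->depth; i++) {
      		if (t->held[i].class_key == class_key &&
      		    t->held[i].subclass == subclass)
      			return false;	/* looks like recursive locking */
      	}
      	if (t->depth == MAX_HELD)
      		return false;

      	t->held[t->depth].class_key = class_key;
      	t->held[t->depth].subclass = subclass;
      	t->depth++;
      	return true;
      }
      ```

      In this model, taking the stats lock of bond10 and then of bond11 with
      the same subclass is rejected, while using each bond's nest level as
      the subclass lets both acquisitions through.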
      
      [1]
      WARNING: possible recursive locking detected
      4.18.0-rc6+ #167 Not tainted
      --------------------------------------------
      syz-executor751/4439 is trying to acquire lock:
      (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
      (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426
      
      but task is already holding lock:
      (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
      (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&bond->stats_lock)->rlock);
        lock(&(&bond->stats_lock)->rlock);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by syz-executor751/4439:
       #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77
       #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
       #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426
       #2: (____ptrval____) (rcu_read_lock){....}, at: bond_get_stats+0x0/0x560 include/linux/compiler.h:215
      
      stack backtrace:
      CPU: 0 PID: 4439 Comm: syz-executor751 Not tainted 4.18.0-rc6+ #167
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
       print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
       check_deadlock kernel/locking/lockdep.c:1809 [inline]
       validate_chain kernel/locking/lockdep.c:2405 [inline]
       __lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435
       lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
       spin_lock include/linux/spinlock.h:310 [inline]
       bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426
       dev_get_stats+0x10f/0x470 net/core/dev.c:8316
       bond_get_stats+0x232/0x560 drivers/net/bonding/bond_main.c:3432
       dev_get_stats+0x10f/0x470 net/core/dev.c:8316
       rtnl_fill_stats+0x4d/0xac0 net/core/rtnetlink.c:1169
       rtnl_fill_ifinfo+0x1aa6/0x3fb0 net/core/rtnetlink.c:1611
       rtmsg_ifinfo_build_skb+0xc8/0x190 net/core/rtnetlink.c:3268
       rtmsg_ifinfo_event.part.30+0x45/0xe0 net/core/rtnetlink.c:3300
       rtmsg_ifinfo_event net/core/rtnetlink.c:3297 [inline]
       rtnetlink_event+0x144/0x170 net/core/rtnetlink.c:4716
       notifier_call_chain+0x180/0x390 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735
       call_netdevice_notifiers net/core/dev.c:1753 [inline]
       netdev_features_change net/core/dev.c:1321 [inline]
       netdev_change_features+0xb3/0x110 net/core/dev.c:7759
       bond_compute_features.isra.47+0x585/0xa50 drivers/net/bonding/bond_main.c:1120
       bond_enslave+0x1b25/0x5da0 drivers/net/bonding/bond_main.c:1755
       bond_do_ioctl+0x7cb/0xae0 drivers/net/bonding/bond_main.c:3528
       dev_ifsioc+0x43c/0xb30 net/core/dev_ioctl.c:327
       dev_ioctl+0x1b5/0xcc0 net/core/dev_ioctl.c:493
       sock_do_ioctl+0x1d3/0x3e0 net/socket.c:992
       sock_ioctl+0x30d/0x680 net/socket.c:1093
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:500 [inline]
       do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:684
       ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
       __do_sys_ioctl fs/ioctl.c:708 [inline]
       __se_sys_ioctl fs/ioctl.c:706 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440859
      Code: e8 2c af 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffc51a92878 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440859
      RDX: 0000000020000040 RSI: 0000000000008990 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8
      R10: 00000000022d5880 R11: 0000000000000213 R12: 0000000000007390
      R13: 0000000000401db0 R14: 0000000000000000 R15: 0000000000000000
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7e2556e4
  23. 10 Jul, 2018 1 commit
  24. 13 Jun, 2018 1 commit
    • K
      treewide: kzalloc() -> kcalloc() · 6396bb22
      Kees Cook committed
      The kzalloc() function has a 2-factor argument form, kcalloc(). This
      patch replaces cases of:
      
              kzalloc(a * b, gfp)
      
      with:
              kcalloc(a, b, gfp)
      
      as well as handling cases of:
      
              kzalloc(a * b * c, gfp)
      
      with:
      
              kzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kzalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
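
      The motivation for the two-argument form is that the multiplication
      can be checked for overflow before any memory is allocated. The
      following is a minimal userspace sketch of that idea, not the kernel
      implementation: checked_calloc() and checked_size3() are hypothetical
      stand-ins for kcalloc() and array3_size(), built on the C library's
      calloc().

      ```c
      #include <stdint.h>
      #include <stdlib.h>

      /*
       * Hypothetical stand-in for kcalloc(): return zeroed memory for n
       * elements of `size` bytes, or NULL if n * size overflows size_t.
       */
      static void *checked_calloc(size_t n, size_t size)
      {
      	if (size != 0 && n > SIZE_MAX / size)
      		return NULL;	/* product does not fit in size_t */
      	return calloc(n, size);
      }

      /*
       * Hypothetical stand-in for array3_size(): compute a * b * c and
       * saturate to SIZE_MAX on overflow, so that an allocation using the
       * result fails instead of returning a buffer that is too small.
       */
      static size_t checked_size3(size_t a, size_t b, size_t c)
      {
      	if (b != 0 && a > SIZE_MAX / b)
      		return SIZE_MAX;
      	a *= b;
      	if (c != 0 && a > SIZE_MAX / c)
      		return SIZE_MAX;
      	return a * c;
      }
      ```

      With an open-coded a * b, an overflowing product silently wraps to a
      small value and the allocation succeeds with too little memory; with
      the checked forms the caller gets NULL or a size no allocator can
      satisfy.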
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kzalloc
      + kcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(sizeof(THING) * C2, ...)
      |
        kzalloc(sizeof(TYPE) * C2, ...)
      |
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(C1 * C2, ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: Kees Cook <keescook@chromium.org>
      6396bb22
  25. 25 May, 2018 1 commit
  26. 24 May, 2018 1 commit
    • W
      gso: limit udp gso to egress-only virtual devices · 8eea1ca8
      Willem de Bruijn committed
      Until the udp receive stack supports large packets (UDP GRO), GSO
      packets must not loop from the egress to the ingress path.
      
      Revert the change that added NETIF_F_GSO_UDP_L4 to various virtual
      devices through NETIF_F_GSO_ENCAP_ALL as this included devices that
      may loop packets, such as veth and macvlan.
      
      Instead add it to specific devices that forward to another device's
      egress path, bonding and team.
      
      Fixes: 83aa025f ("udp: add gso support to virtual devices")
      CC: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8eea1ca8
  27. 18 May, 2018 1 commit
  28. 17 May, 2018 3 commits
  29. 12 May, 2018 2 commits
  30. 11 May, 2018 1 commit
    • D
      bonding: send learning packets for vlans on slave · 21706ee8
      Debabrata Banerjee committed
      There was a regression at some point from the intended functionality of
      commit f60c3704 ("bonding: Fix alb mode to only use first level
      vlans.")
      
      Given the return value of vlan_get_encap_level(), we need to store the
      nest level of the bond device, and then compare the vlan's encap level
      to this. Without this, the check always fails and learning packets are
      never sent.
      
      In addition, this same commit caused a regression in the behavior of
      balance_alb, which requires learning packets to be sent for all
      interfaces using the slave's mac in order to load balance properly. For
      vlans that have not set a user mac, we can send after checking one bit.
      Otherwise we need to send the set mac, albeit defeating rx load
      balancing for that vlan.
      Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21706ee8
  31. 23 Apr, 2018 1 commit
    • X
      bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave · ddea788c
      Xin Long committed
      After Commit 8a8efa22 ("bonding: sync netpoll code with bridge"), it
      would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev
      if bond->dev->npinfo was set.
      
      However now slave_dev npinfo is set with bond->dev->npinfo before calling
      slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called
      in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup().
      As a result, the lower dev of this slave dev can't set its npinfo.
      
      One way to reproduce it:
      
        # modprobe bonding
        # brctl addbr br0
        # brctl addif br0 eth1
        # ifconfig bond0 192.168.122.1/24 up
        # ifenslave bond0 eth2
        # systemctl restart netconsole
        # ifenslave bond0 br0
        # ifconfig eth2 down
        # systemctl restart netconsole
      
      The netpoll won't really work.
      
      This patch removes that slave_dev npinfo setting in bond_enslave().
      
      Fixes: 8a8efa22 ("bonding: sync netpoll code with bridge")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ddea788c
  32. 28 Mar, 2018 1 commit