1. 23 2月, 2019 2 次提交
    • D
      r8152: Fix an error on RTL8153-BD MAC Address Passthrough support · c286909f
      David Chen 提交于
      RTL8153-BD is used in Dell DA300 type-C dongle.
      Added RTL8153-BD support to activate MAC address pass through on DA300.
      Apply correction on previously submitted patch in net.git tree.
      Signed-off-by: NDavid Chen <david.chen7@dell.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c286909f
    • D
      ipvlan: disallow userns cap_net_admin to change global mode/flags · 7cc9f700
      Daniel Borkmann 提交于
      When running Docker with userns isolation e.g. --userns-remap="default"
      and spawning up some containers with CAP_NET_ADMIN under this realm, I
      noticed that link changes on ipvlan slave device inside that container
      can affect all devices from this ipvlan group which are in other net
      namespaces where the container should have no permission to make changes
      to, such as the init netns, for example.
      
      This effectively allows to undo ipvlan private mode and switch globally to
      bridge mode where slaves can communicate directly without going through
      hostns, or it allows to switch between global operation mode (l2/l3/l3s)
      for everyone bound to the given ipvlan master device. libnetwork plugin
      here is creating an ipvlan master and ipvlan slave in hostns and a slave
      each that is moved into the container's netns upon creation event.
      
      * In hostns:
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
           link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
           ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
           inet 10.41.0.1/32 scope link cilium_host
             valid_lft forever preferred_lft forever
        [...]
      
      * Spawn container & change ipvlan mode setting inside of it:
      
        # docker run -dt --cap-add=NET_ADMIN --network cilium-net --name client -l app=test cilium/netperf
        9fff485d69dcb5ce37c9e33ca20a11ccafc236d690105aadbfb77e4f4170879c
      
        # docker exec -ti client ip -d a
        [...]
        10: cilium0@if4: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
        # docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2
      
        # docker exec -ti client ip -d a
        [...]
        10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
      * In hostns (mode switched to l2):
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.0.1/32 scope link cilium_host
               valid_lft forever preferred_lft forever
        [...]
      
      Same l3 -> l2 switch would also happen by creating another slave inside
      the container's network namespace when specifying the existing cilium0
      link to derive the actual (bond0) master:
      
        # docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2
      
        # docker exec -ti client ip -d a
        [...]
        2: cilium1@if4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
      * In hostns:
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.0.1/32 scope link cilium_host
               valid_lft forever preferred_lft forever
        [...]
      
      One way to mitigate it is to check CAP_NET_ADMIN permissions of
      the ipvlan master device's ns, and only then allow to change
      mode or flags for all devices bound to it. Above two cases are
      then disallowed after the patch.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NMahesh Bandewar <maheshb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7cc9f700
  2. 22 2月, 2019 7 次提交
    • M
      bonding: fix PACKET_ORIGDEV regression · 3c963a33
      Michal Soltys 提交于
      This patch fixes a subtle PACKET_ORIGDEV regression which was a side
      effect of fixes introduced by:
      
      6a9e461f bonding: pass link-local packets to bonding master also.
      
      ... to:
      
      b89f04c6 bonding: deliver link-local packets with skb->dev set to link that packets arrived on
      
      While 6a9e461f restored pre-b89f04c6 presence of link-local
      packets on bonding masters (which is required e.g. by linux bridges
      participating in spanning tree or needed for lab-like setups created
      with group_fwd_mask) it also caused the originating device
      information to be lost due to cloning.
      
      Maciej Żenczykowski proposed another solution that doesn't require
      packet cloning and retains original device information - instead of
      returning RX_HANDLER_PASS for all link-local packets it's now limited
      only to packets from inactive slaves.
      
      At the same time, packets passed to bonding masters retain correct
      information about the originating device and PACKET_ORIGDEV can be used
      to determine it.
      
      This elegantly solves all issues so far:
      
      - link-local packets that were removed from bonding masters
      - LLDP daemons being forced to explicitly bind to slave interfaces
      - PACKET_ORIGDEV having no effect on bond interfaces
      
      Fixes: 6a9e461f (bonding: pass link-local packets to bonding master also.)
      Reported-by: NVincent Bernat <vincent@bernat.ch>
      Signed-off-by: NMichal Soltys <soltys@ziu.info>
      Signed-off-by: NMaciej Żenczykowski <maze@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3c963a33
    • H
      net: vrf: remove MTU limits for vrf device · ad49bc63
      Hangbin Liu 提交于
      Similiar to commit e94cd811 ("net: remove MTU limits for dummy and
      ifb device"), MTU is irrelevant for VRF device. We init it as 64K while
      limit it to [68, 1500] may make users feel confused.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad49bc63
    • J
      ixgbe: don't do any AF_XDP zero-copy transmit if netif is not OK · c685c69f
      Jan Sokolowski 提交于
      An issue has been found while testing zero-copy XDP that
      causes a reset to be triggered. As it takes some time to
      turn the carrier on after setting zc, and we already
      start trying to transmit some packets, watchdog considers
      this as an erroneous state and triggers a reset.
      
      Don't do any work if netif carrier is not OK.
      
      Fixes: 8221c5eb (ixgbe: add AF_XDP zero-copy Tx support)
      Signed-off-by: NJan Sokolowski <jan.sokolowski@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      c685c69f
    • B
      i40e: fix XDP_REDIRECT/XDP xmit ring cleanup race · 59eb2a88
      Björn Töpel 提交于
      When the driver clears the XDP xmit ring due to re-configuration or
      teardown, in-progress ndo_xdp_xmit must be taken into consideration.
      
      The ndo_xdp_xmit function is typically called from a NAPI context that
      the driver does not control. Therefore, we must be careful not to
      clear the XDP ring, while the call is on-going. This patch adds a
      synchronize_rcu() to wait for napi(s) (preempt-disable regions and
      softirqs), prior clearing the queue. Further, the __I40E_CONFIG_BUSY
      flag is checked in the ndo_xdp_xmit implementation to avoid touching
      the XDP xmit queue during re-configuration.
      
      Fixes: d9314c47 ("i40e: add support for XDP_REDIRECT")
      Fixes: 123cecd4 ("i40e: added queue pair disable/enable functions")
      Reported-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      59eb2a88
    • M
      ixgbe: fix potential RX buffer starvation for AF_XDP · 4a9b32f3
      Magnus Karlsson 提交于
      When the RX rings are created they are also populated with buffers so
      that packets can be received. Usually these are kernel buffers, but
      for AF_XDP in zero-copy mode, these are user-space buffers and in this
      case the application might not have sent down any buffers to the
      driver at this point. And if no buffers are allocated at ring creation
      time, no packets can be received and no interrupts will be generated so
      the NAPI poll function that allocates buffers to the rings will never
      get executed.
      
      To rectify this, we kick the NAPI context of any queue with an
      attached AF_XDP zero-copy socket in two places in the code. Once after
      an XDP program has loaded and once after the umem is registered.  This
      take care of both cases: XDP program gets loaded first then AF_XDP
      socket is created, and the reverse, AF_XDP socket is created first,
      then XDP program is loaded.
      
      Fixes: d0bcacd0 ("ixgbe: add AF_XDP zero-copy Rx support")
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4a9b32f3
    • M
      i40e: fix potential RX buffer starvation for AF_XDP · 14ffeb52
      Magnus Karlsson 提交于
      When the RX rings are created they are also populated with buffers
      so that packets can be received. Usually these are kernel buffers,
      but for AF_XDP in zero-copy mode, these are user-space buffers and
      in this case the application might not have sent down any buffers
      to the driver at this point. And if no buffers are allocated at ring
      creation time, no packets can be received and no interrupts will be
      generated so the NAPI poll function that allocates buffers to the
      rings will never get executed.
      
      To rectify this, we kick the NAPI context of any queue with an
      attached AF_XDP zero-copy socket in two places in the code. Once
      after an XDP program has loaded and once after the umem is registered.
      This take care of both cases: XDP program gets loaded first then AF_XDP
      socket is created, and the reverse, AF_XDP socket is created first,
      then XDP program is loaded.
      
      Fixes: 0a714186 ("i40e: add AF_XDP zero-copy Rx support")
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      14ffeb52
    • J
      ixgbe: fix older devices that do not support IXGBE_MRQC_L3L4TXSWEN · 156a67a9
      Jeff Kirsher 提交于
      The enabling L3/L4 filtering for transmit switched packets for all
      devices caused unforeseen issue on older devices when trying to send UDP
      traffic in an ordered sequence.  This bit was originally intended for X550
      devices, which supported this feature, so limit the scope of this bit to
      only X550 devices.
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      156a67a9
  3. 21 2月, 2019 1 次提交
    • R
      net: marvell: mvneta: fix DMA debug warning · a8fef9ba
      Russell King 提交于
      Booting 4.20 on SolidRun Clearfog issues this warning with DMA API
      debug enabled:
      
      WARNING: CPU: 0 PID: 555 at kernel/dma/debug.c:1230 check_sync+0x514/0x5bc
      mvneta f1070000.ethernet: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000002dd7dc00] [size=240 bytes]
      Modules linked in: ahci mv88e6xxx dsa_core xhci_plat_hcd xhci_hcd devlink armada_thermal marvell_cesa des_generic ehci_orion phy_armada38x_comphy mcp3021 spi_orion evbug sfp mdio_i2c ip_tables x_tables
      CPU: 0 PID: 555 Comm: bridge-network- Not tainted 4.20.0+ #291
      Hardware name: Marvell Armada 380/385 (Device Tree)
      [<c0019638>] (unwind_backtrace) from [<c0014888>] (show_stack+0x10/0x14)
      [<c0014888>] (show_stack) from [<c07f54e0>] (dump_stack+0x9c/0xd4)
      [<c07f54e0>] (dump_stack) from [<c00312bc>] (__warn+0xf8/0x124)
      [<c00312bc>] (__warn) from [<c00313b0>] (warn_slowpath_fmt+0x38/0x48)
      [<c00313b0>] (warn_slowpath_fmt) from [<c00b0370>] (check_sync+0x514/0x5bc)
      [<c00b0370>] (check_sync) from [<c00b04f8>] (debug_dma_sync_single_range_for_cpu+0x6c/0x74)
      [<c00b04f8>] (debug_dma_sync_single_range_for_cpu) from [<c051bd14>] (mvneta_poll+0x298/0xf58)
      [<c051bd14>] (mvneta_poll) from [<c0656194>] (net_rx_action+0x128/0x424)
      [<c0656194>] (net_rx_action) from [<c000a230>] (__do_softirq+0xf0/0x540)
      [<c000a230>] (__do_softirq) from [<c00386e0>] (irq_exit+0x124/0x144)
      [<c00386e0>] (irq_exit) from [<c009b5e0>] (__handle_domain_irq+0x58/0xb0)
      [<c009b5e0>] (__handle_domain_irq) from [<c03a63c4>] (gic_handle_irq+0x48/0x98)
      [<c03a63c4>] (gic_handle_irq) from [<c0009a10>] (__irq_svc+0x70/0x98)
      ...
      
      This appears to be caused by mvneta_rx_hwbm() calling
      dma_sync_single_range_for_cpu() with the wrong struct device pointer,
      as the buffer manager device pointer is used to map and unmap the
      buffer.  Fix this.
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8fef9ba
  4. 20 2月, 2019 1 次提交
  5. 19 2月, 2019 7 次提交
  6. 18 2月, 2019 4 次提交
  7. 16 2月, 2019 6 次提交
  8. 15 2月, 2019 4 次提交
  9. 14 2月, 2019 7 次提交
    • D
      net: dsa: bcm_sf2: potential array overflow in bcm_sf2_sw_suspend() · 8d6ea932
      Dan Carpenter 提交于
      The value of ->num_ports comes from bcm_sf2_sw_probe() and it is less
      than or equal to DSA_MAX_PORTS.  The ds->ports[] array is used inside
      the dsa_is_user_port() and dsa_is_cpu_port() functions.  The ds->ports[]
      array is allocated in dsa_switch_alloc() and it has ds->num_ports
      elements so this leads to a static checker warning about a potential out
      of bounds read.
      
      Fixes: 8cfa9498 ("net: dsa: bcm_sf2: add suspend/resume callbacks")
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: NVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8d6ea932
    • J
      dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit · 7c0db24c
      John David Anglin 提交于
      The GPIO interrupt controller on the espressobin board only supports edge interrupts.
      If one enables the use of hardware interrupts in the device tree for the 88E6341, it is
      possible to miss an edge.  When this happens, the INTn pin on the Marvell switch is
      stuck low and no further interrupts occur.
      
      I found after adding debug statements to mv88e6xxx_g1_irq_thread_work() that there is
      a race in handling device interrupts (e.g. PHY link interrupts).  Some interrupts are
      directly cleared by reading the Global 1 status register.  However, the device interrupt
      flag, for example, is not cleared until all the unmasked SERDES and PHY ports are serviced.
      This is done by reading the relevant SERDES and PHY status register.
      
      The code only services interrupts whose status bit is set at the time of reading its status
      register.  If an interrupt event occurs after its status is read and before all interrupts
      are serviced, then this event will not be serviced and the INTn output pin will remain low.
      
      This is not a problem with polling or level interrupts since the handler will be called
      again to process the event.  However, it's a big problem when using level interrupts.
      
      The fix presented here is to add a loop around the code servicing switch interrupts.  If
      any pending interrupts remain after the current set has been handled, we loop and process
      the new set.  If there are no pending interrupts after servicing, we are sure that INTn has
      gone high and we will get an edge when a new event occurs.
      
      Tested on espressobin board.
      
      Fixes: dc30c35b ("net: dsa: mv88e6xxx: Implement interrupt support.")
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Tested-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c0db24c
    • H
      net: phy: fix interrupt handling in non-started states · b79555d5
      Heiner Kallweit 提交于
      phylib enables interrupts before phy_start() has been called, and if
      we receive an interrupt in a non-started state, the interrupt handler
      returns IRQ_NONE. This causes problems with at least one Marvell chip
      as reported by Andrew.
      Fix this by handling interrupts the same as in phy_mac_interrupt(),
      basically always running the phylib state machine. It knows when it
      has to do something and when not.
      This change allows to handle interrupts gracefully even if they
      occur in a non-started state.
      
      Fixes: 2b3e88ea ("net: phy: improve phy state checking")
      Reported-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b79555d5
    • S
      net/mlx5e: XDP, fix redirect resources availability check · 407e17b1
      Saeed Mahameed 提交于
      Currently mlx5 driver creates xdp redirect hw queues unconditionally on
      netdevice open, This is great until someone starts redirecting XDP traffic
      via ndo_xdp_xmit on mlx5 device and changes the device configuration at
      the same time, this might cause crashes, since the other device's napi
      is not aware of the mlx5 state change (resources un-availability).
      
      To fix this we must synchronize with other devices napi's on the system.
      Added a new flag under mlx5e_priv to determine XDP TX resources are
      available, set/clear it up when necessary and use synchronize_rcu()
      when the flag is turned off, so other napi's are in-sync with it, before
      we actually cleanup the hw resources.
      
      The flag is tested prior to committing to transmit on mlx5e_xdp_xmit, and
      it is sufficient to determine if it safe to transmit or not. The other
      two internal flags (MLX5E_STATE_OPENED and MLX5E_SQ_STATE_ENABLED) become
      unnecessary. Thus, they are removed from data path.
      
      Fixes: 58b99ee3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
      Reported-by: NToke Høiland-Jørgensen <toke@redhat.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      407e17b1
    • T
      net/mlx5: Fix a compilation warning in events.c · 5400261e
      Tariq Toukan 提交于
      Eliminate the following compilation warning:
      
      drivers/net/ethernet/mellanox/mlx5/core/events.c: warning: 'error_str'
      may be used uninitialized in this function [-Wuninitialized]:  => 238:3
      
      Fixes: c2fb3db2 ("net/mlx5: Rework handling of port module events")
      Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
      Reviewed-by: NMikhael Goikhman <migo@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      5400261e
    • H
      net/mlx5: No command allowed when command interface is not ready · 4cab346b
      Huy Nguyen 提交于
      When EEH is injected and PCI bus stalls, mlx5's pci error detect
      function is called to deactivate the command interface and tear down
      the device. The issue is that there can be a thread that already
      passed MLX5_DEVICE_STATE_INTERNAL_ERROR check, it will send the command
      and stuck in the wait_func.
      
      Solution:
      Add function mlx5_cmd_flush to disable command interface and clear all
      the pending commands. When device state is set to
      MLX5_DEVICE_STATE_INTERNAL_ERROR, call mlx5_cmd_flush to ensure all
      pending threads waiting for firmware commands completion are terminated.
      
      Fixes: c1d4d2e9 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
      Signed-off-by: NHuy Nguyen <huyn@mellanox.com>
      Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      4cab346b
    • M
      net/mlx5e: Fix NULL pointer derefernce in set channels error flow · fb35c534
      Maria Pasechnik 提交于
      New channels are applied to the priv channels only after they
      are successfully opened. Then, the indirection table should be built
      according to the new number of channels.
      Currently, such build is preformed independently of whether the
      channels opening is successful, and is not reverted on failure.
      
      The bug is caused due to removal of rss params from channels struct
      and moving it to priv struct. That change cause to independency between
      channels and rss params.
      This causes a crash on a later point, when accessing rqn of a non
      existing channel.
      
      This patch fixes it by moving the indirection table build right before
      switching the priv channels to new channels struct, after the new set of
      channels was successfully opened.
      
      Fixes: bbeb53b8 ("net/mlx5e: Move RSS params to a dedicated struct")
      Signed-off-by: NMaria Pasechnik <mariap@mellanox.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      fb35c534
  10. 13 2月, 2019 1 次提交
    • C
      team: avoid complex list operations in team_nl_cmd_options_set() · 2fdeee25
      Cong Wang 提交于
      The current opt_inst_list operations inside team_nl_cmd_options_set()
      is too complex to track:
      
          LIST_HEAD(opt_inst_list);
          nla_for_each_nested(...) {
              list_for_each_entry(opt_inst, &team->option_inst_list, list) {
                  if (__team_option_inst_tmp_find(&opt_inst_list, opt_inst))
                      continue;
                  list_add(&opt_inst->tmp_list, &opt_inst_list);
              }
          }
          team_nl_send_event_options_get(team, &opt_inst_list);
      
      as while we retrieve 'opt_inst' from team->option_inst_list, it could
      be added to the local 'opt_inst_list' for multiple times. The
      __team_option_inst_tmp_find() doesn't work, as the setter
      team_mode_option_set() still calls team->ops.exit() which uses
      ->tmp_list too in __team_options_change_check().
      
      Simplify the list operations by moving the 'opt_inst_list' and
      team_nl_send_event_options_get() into the nla_for_each_nested() loop so
      that it can be guranteed that we won't insert a same list entry for
      multiple times. Therefore, __team_option_inst_tmp_find() can be removed
      too.
      
      Fixes: 4fb0534f ("team: avoid adding twice the same option to the event list")
      Fixes: 2fcdb2c9 ("team: allow to send multiple set events in one message")
      Reported-by: syzbot+4d4af685432dc0e56c91@syzkaller.appspotmail.com
      Reported-by: syzbot+68ee510075cf64260cc4@syzkaller.appspotmail.com
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2fdeee25