1. 08 Oct, 2021 1 commit
    • ice: Move devlink port to PF/VF struct · 2ae0aa47
      Committed by Wojciech Drewek
      Keeping devlink port inside VSI data structure causes some issues.
      Since VF VSI is released during reset that means that we have to
      unregister devlink port and register it again every time reset is
      triggered. With the new changes in the devlink API, this might cause
      deadlocks: devlink_port_register/devlink_port_unregister are going
      to lock rtnl_mutex. That is a problem when a VF reset is triggered in
      netlink operation context (like setting the VF MAC address or VLAN),
      because rtnl_lock is already held by netlink. A second rtnl_lock
      call from the devlink API then results in a deadlock.
      
      By moving devlink port to PF/VF we avoid creating/destroying it
      during reset. Since this patch, devlink ports are created during
      ice_probe, destroyed during ice_remove for PF and created during
      ice_repr_add, destroyed during ice_repr_rem for VF.

      Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  2. 05 Oct, 2021 1 commit
  3. 02 Oct, 2021 1 commit
  4. 29 Sep, 2021 3 commits
  5. 27 Sep, 2021 1 commit
  6. 24 Sep, 2021 1 commit
  7. 22 Sep, 2021 1 commit
  8. 28 Aug, 2021 1 commit
    • ice: Only lock to update netdev dev_addr · b357d971
      Committed by Brett Creeley
      commit 3ba7f53f ("ice: don't remove netdev->dev_addr from uc sync
      list") introduced calls to netif_addr_lock_bh() and
      netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is
      fine since the driver is updating the netdev's dev_addr, but since this
      is a spinlock, the driver cannot sleep when the lock is held.
      Unfortunately the functions to add/delete MAC filters depend on a mutex.
      This was causing a trace with the lock debug kernel config options
      enabled when changing the mac address via iproute.
      
      [  203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
      [  203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip
      [  203.273068] Preemption disabled at:
      [  203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice]
      [  203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S      W I       5.14.0-rc4 #2
      [  203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
      [  203.273102] Call Trace:
      [  203.273107]  dump_stack_lvl+0x33/0x42
      [  203.273113]  ? ice_set_mac_address+0x8b/0x1c0 [ice]
      [  203.273124]  ___might_sleep.cold.150+0xda/0xea
      [  203.273131]  mutex_lock+0x1c/0x40
      [  203.273136]  ice_remove_mac+0xe3/0x180 [ice]
      [  203.273155]  ? ice_fltr_add_mac_list+0x20/0x20 [ice]
      [  203.273175]  ice_fltr_prepare_mac+0x43/0xa0 [ice]
      [  203.273194]  ice_set_mac_address+0xab/0x1c0 [ice]
      [  203.273206]  dev_set_mac_address+0xb8/0x120
      [  203.273210]  dev_set_mac_address_user+0x2c/0x50
      [  203.273212]  do_setlink+0x1dd/0x10e0
      [  203.273217]  ? __nla_validate_parse+0x12d/0x1a0
      [  203.273221]  __rtnl_newlink+0x530/0x910
      [  203.273224]  ? __kmalloc_node_track_caller+0x17f/0x380
      [  203.273230]  ? preempt_count_add+0x68/0xa0
      [  203.273236]  ? _raw_spin_lock_irqsave+0x1f/0x30
      [  203.273241]  ? kmem_cache_alloc_trace+0x4d/0x440
      [  203.273244]  rtnl_newlink+0x43/0x60
      [  203.273245]  rtnetlink_rcv_msg+0x13a/0x380
      [  203.273248]  ? rtnl_calcit.isra.40+0x130/0x130
      [  203.273250]  netlink_rcv_skb+0x4e/0x100
      [  203.273256]  netlink_unicast+0x1a2/0x280
      [  203.273258]  netlink_sendmsg+0x242/0x490
      [  203.273260]  sock_sendmsg+0x58/0x60
      [  203.273263]  ____sys_sendmsg+0x1ef/0x260
      [  203.273265]  ? copy_msghdr_from_user+0x5c/0x90
      [  203.273268]  ? ____sys_recvmsg+0xe6/0x170
      [  203.273270]  ___sys_sendmsg+0x7c/0xc0
      [  203.273272]  ? copy_msghdr_from_user+0x5c/0x90
      [  203.273274]  ? ___sys_recvmsg+0x89/0xc0
      [  203.273276]  ? __netlink_sendskb+0x50/0x50
      [  203.273278]  ? mod_objcg_state+0xee/0x310
      [  203.273282]  ? __dentry_kill+0x114/0x170
      [  203.273286]  ? get_max_files+0x10/0x10
      [  203.273288]  __sys_sendmsg+0x57/0xa0
      [  203.273290]  do_syscall_64+0x37/0x80
      [  203.273295]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  203.273296] RIP: 0033:0x7f8edf96e278
      [  203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
      [  203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278
      [  203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003
      [  203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0
      [  203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
      [  203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005
      
      Fix this by only locking when changing the netdev->dev_addr. Also, make
      sure to restore the old netdev->dev_addr on any failures.
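
      The locking pattern described above can be sketched in user-space C.
      This is only an illustration under stated assumptions, not the
      driver's code: struct fake_netdev, set_dev_addr(), set_mac_address()
      and the update_filters callback are hypothetical stand-ins, and an
      atomic_flag spinlock models netif_addr_lock_bh().

      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stdint.h>
      #include <string.h>

      #define ETH_ALEN 6

      /* User-space stand-in for a netdev; every name here is hypothetical. */
      struct fake_netdev {
          uint8_t dev_addr[ETH_ALEN];
          atomic_flag addr_lock;  /* models the non-sleeping address spinlock */
      };

      static void fake_netdev_init(struct fake_netdev *nd, const uint8_t *mac)
      {
          memcpy(nd->dev_addr, mac, ETH_ALEN);
          atomic_flag_clear(&nd->addr_lock);
      }

      /* Copy the address under the lock and do nothing else while holding it. */
      static void set_dev_addr(struct fake_netdev *nd, const uint8_t *mac)
      {
          while (atomic_flag_test_and_set(&nd->addr_lock))
              ;   /* spin; code that can sleep must never run in here */
          memcpy(nd->dev_addr, mac, ETH_ALEN);
          atomic_flag_clear(&nd->addr_lock);
      }

      /* Models the mutex-protected filter helpers; returns 0 on success. */
      typedef int (*update_filters_fn)(const uint8_t *old_mac,
                                       const uint8_t *new_mac);

      static int set_mac_address(struct fake_netdev *nd, const uint8_t *mac,
                                 update_filters_fn update_filters)
      {
          uint8_t old_mac[ETH_ALEN];
          int err;

          assert(nd && mac && update_filters);
          memcpy(old_mac, nd->dev_addr, ETH_ALEN);
          set_dev_addr(nd, mac);

          /* The spinlock is released by now, so the callback may sleep. */
          err = update_filters(old_mac, mac);
          if (err)
              set_dev_addr(nd, old_mac);  /* restore the old address on failure */
          return err;
      }

      /* Two trivial callbacks for trying the sketch out. */
      static int filters_ok(const uint8_t *old_mac, const uint8_t *new_mac)
      {
          (void)old_mac;
          (void)new_mac;
          return 0;
      }

      static int filters_fail(const uint8_t *old_mac, const uint8_t *new_mac)
      {
          (void)old_mac;
          (void)new_mac;
          return -1;
      }
      ```

      Calling set_mac_address() with filters_fail leaves dev_addr at its
      previous value, which is the rollback behaviour the message asks for.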
      
      Fixes: 3ba7f53f ("ice: don't remove netdev->dev_addr from uc sync list")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  9. 10 Aug, 2021 2 commits
    • ice: don't remove netdev->dev_addr from uc sync list · 3ba7f53f
      Committed by Brett Creeley
      In some circumstances, such as with bridging, it's possible that the
      stack will add the device's own MAC address to its unicast address list.
      
      If, later, the stack deletes this address, the driver will receive a
      request to remove this address.
      
      The driver stores its current MAC address as part of the VSI MAC filter
      list instead of separately. So, this causes a problem when the device's
      MAC address is deleted unexpectedly, which results in traffic failure in
      some cases.
      
      The following configuration steps will reproduce the previously
      mentioned problem:
      
      > ip link set eth0 up
      > ip link add dev br0 type bridge
      > ip link set br0 up
      > ip addr flush dev eth0
      > ip link set eth0 master br0
      > echo 1 > /sys/class/net/br0/bridge/vlan_filtering
      > modprobe -r veth
      > modprobe -r bridge
      > ip addr add 192.168.1.100/24 dev eth0
      
      The following ping command fails due to the netdev->dev_addr being
      deleted when removing the bridge module.
      > ping <link partner>
      
      Fix this by making sure to not delete the netdev->dev_addr during MAC
      address sync. After fixing this issue it was noticed that the
      netdev_warn() in .set_mac was overly verbose, so change it to
      netdev_dbg().
      
      Also, there is a possibility of a race condition between .set_mac and
      .set_rx_mode. Fix this by calling netif_addr_lock_bh() and
      netif_addr_unlock_bh() on the device's netdev when the netdev->dev_addr
      is going to be updated in .set_mac.
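
      The first part of the fix (never deleting the device's own address
      during MAC address sync) can be modelled in a few lines of C. This is
      a sketch under assumptions, not the driver's implementation: struct
      mac_filter_list and the filter_*() helpers are hypothetical, and a
      fixed-size array stands in for the VSI MAC filter list.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>
      #include <string.h>

      #define ETH_ALEN 6
      #define MAX_FILTERS 8

      /* Hypothetical model of a filter list that also holds dev_addr. */
      struct mac_filter_list {
          uint8_t dev_addr[ETH_ALEN];
          uint8_t addrs[MAX_FILTERS][ETH_ALEN];
          size_t count;
      };

      static bool filter_present(const struct mac_filter_list *fl,
                                 const uint8_t *addr)
      {
          for (size_t i = 0; i < fl->count; i++)
              if (memcmp(fl->addrs[i], addr, ETH_ALEN) == 0)
                  return true;
          return false;
      }

      static void filter_add(struct mac_filter_list *fl, const uint8_t *addr)
      {
          assert(fl->count < MAX_FILTERS);
          memcpy(fl->addrs[fl->count++], addr, ETH_ALEN);
      }

      /* Unsync request from the stack: drop addr unless it is our own MAC. */
      static void filter_unsync(struct mac_filter_list *fl, const uint8_t *addr)
      {
          if (memcmp(addr, fl->dev_addr, ETH_ALEN) == 0)
              return;     /* keep the device's own MAC filter in place */

          for (size_t i = 0; i < fl->count; i++) {
              if (memcmp(fl->addrs[i], addr, ETH_ALEN) == 0) {
                  /* fill the hole with the last entry, then shrink */
                  memmove(fl->addrs[i], fl->addrs[fl->count - 1], ETH_ALEN);
                  fl->count--;
                  return;
              }
          }
      }
      ```

      With this guard, a bridge teardown that asks to remove the device's
      own address leaves the filter untouched, while other unicast
      addresses are still removed as requested.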
      
      Fixes: e94d4478 ("ice: Implement filter sync, NDO operations and bump version")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Liang Li <liali@redhat.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Prevent probing virtual functions · 50ac7479
      Committed by Anirudh Venkataramanan
      The userspace utility "driverctl" can be used to change/override the
      system's default driver choices. This is useful in some situations
      (buggy driver, old driver missing a device ID, trying a workaround,
      etc.) where the user needs to load a different driver.
      
      However, this is also prone to user error, where a driver is mapped
      to a device it's not designed to drive. For example, if the ice driver
      is mapped to iavf devices, the ice driver crashes.
      
      Add a check to return an error if the ice driver is being used to
      probe a virtual function.
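
      A guard of this kind can be sketched as follows. The commit message
      does not say which field or error code the driver uses, so the
      is_virtfn flag, the -EINVAL return value, struct fake_pci_dev and
      fake_probe() are all assumptions made for illustration.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Hypothetical stand-in for the PCI device handed to a probe routine. */
      struct fake_pci_dev {
          const char *name;
          bool is_virtfn;     /* true when the device is a virtual function */
      };

      static int fake_probe(const struct fake_pci_dev *pdev)
      {
          assert(pdev);

          /* Refuse early, before any hardware state is touched. */
          if (pdev->is_virtfn) {
              fprintf(stderr, "%s: can't probe a virtual function\n",
                      pdev->name);
              return -EINVAL;
          }

          /* ... physical function setup would continue here ... */
          return 0;
      }
      ```

      Returning an error from probe makes the bind attempt fail cleanly
      instead of letting the driver run against hardware it does not
      understand.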
      
      Fixes: 837f08fd ("ice: Add basic driver framework for Intel(R) E800 Series")
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  10. 28 Jul, 2021 1 commit
    • dev_ioctl: split out ndo_eth_ioctl · a7605370
      Committed by Arnd Bergmann
      Most users of ndo_do_ioctl are ethernet drivers that implement
      the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware
      timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP.
      
      Separate these from the few drivers that use ndo_do_ioctl to
      implement SIOCBOND, SIOCBR and SIOCWANDEV commands.
      
      This is a purely cosmetic change intended to help readers find
      their way through the implementation.
      
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Vladimir Oltean <olteanv@gmail.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Cc: linux-rdma@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 26 Jun, 2021 1 commit
  12. 25 Jun, 2021 1 commit
  13. 18 Jun, 2021 2 commits
  14. 11 Jun, 2021 4 commits
    • ice: enable transmit timestamps for E810 devices · ea9b847c
      Committed by Jacob Keller
      Add support for enabling Tx timestamp requests for outgoing packets on
      E810 devices.
      
      The ice hardware can support multiple outstanding Tx timestamp requests.
      When sending a descriptor to hardware, a Tx timestamp request is made by
      setting a request bit, and assigning an index that represents which Tx
      timestamp index to store the timestamp in.
      
      Hardware makes no effort to synchronize the index use, so it is up to
      software to ensure that Tx timestamp indexes are not re-used before the
      timestamp is reported back.
      
      To do this, introduce a Tx timestamp tracker which will keep track of
      currently in-use indexes.
      
      In the hot path, if a packet has a timestamp request, an index will be
      requested from the tracker. Unfortunately, this does require a lock as
      the indexes are shared across all queues on a PHY. There are not enough
      indexes to reliably assign only 1 to each queue.
      
      For the E810 devices, the timestamp indexes are not shared across PHYs,
      so each port can have its own tracking.
      
      Once hardware captures a timestamp, an interrupt is fired. In this
      interrupt, trigger a new work item that will figure out which timestamp
      was completed, and report the timestamp back to the stack.
      
      This function loops through the Tx timestamp indexes and checks whether
      there is now a valid timestamp. If so, it clears the PHY timestamp
      indication in the PHY memory, locks and removes the SKB and bit in the
      tracker, then reports the timestamp to the stack.
      
      It is possible in some cases that a timestamp request will be initiated
      but never completed. This might occur if the packet is dropped by
      software or hardware before it reaches the PHY.
      
      Add a task to the periodic work function that will check whether
      a timestamp request is more than a few seconds old. If so, the timestamp
      index is cleared in the PHY, and the SKB is released.
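
      The tracker described above can be modelled in user-space C as a
      bitmap of in-use indexes guarded by a lock. This is a sketch under
      assumptions rather than the driver's data structure: the slot count,
      the timeout, struct tx_tstamp_tracker and the tracker_*() functions
      are hypothetical, time is a plain counter in seconds, and an
      atomic_flag spinlock stands in for the real lock.

      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stddef.h>
      #include <stdint.h>

      #define TX_TS_SLOTS   64    /* hypothetical number of indexes per port */
      #define TX_TS_TIMEOUT 2     /* hypothetical stale age, in seconds */

      struct tx_tstamp_tracker {
          atomic_flag lock;
          uint64_t in_use;                /* bit i set: index i is outstanding */
          uint64_t start[TX_TS_SLOTS];    /* time each request was made */
          void *skb[TX_TS_SLOTS];         /* packet waiting for its timestamp */
      };

      static void tracker_lock(struct tx_tstamp_tracker *t)
      {
          while (atomic_flag_test_and_set(&t->lock))
              ;
      }

      static void tracker_unlock(struct tx_tstamp_tracker *t)
      {
          atomic_flag_clear(&t->lock);
      }

      static void tracker_init(struct tx_tstamp_tracker *t)
      {
          t->in_use = 0;
          for (int i = 0; i < TX_TS_SLOTS; i++)
              t->skb[i] = NULL;
          atomic_flag_clear(&t->lock);
      }

      /* Hot path: claim the lowest free index, or return -1 if none is free. */
      static int tracker_request(struct tx_tstamp_tracker *t, void *skb,
                                 uint64_t now)
      {
          int idx = -1;

          tracker_lock(t);
          for (int i = 0; i < TX_TS_SLOTS; i++) {
              if (!(t->in_use & (UINT64_C(1) << i))) {
                  t->in_use |= UINT64_C(1) << i;
                  t->skb[i] = skb;
                  t->start[i] = now;
                  idx = i;
                  break;
              }
          }
          tracker_unlock(t);
          return idx;
      }

      /* Completion: release the index and hand back its packet, if any. */
      static void *tracker_complete(struct tx_tstamp_tracker *t, int idx)
      {
          void *skb = NULL;

          assert(idx >= 0 && idx < TX_TS_SLOTS);
          tracker_lock(t);
          if (t->in_use & (UINT64_C(1) << idx)) {
              t->in_use &= ~(UINT64_C(1) << idx);
              skb = t->skb[idx];
              t->skb[idx] = NULL;
          }
          tracker_unlock(t);
          return skb;
      }

      /* Periodic cleanup: drop requests older than the timeout. */
      static int tracker_flush_stale(struct tx_tstamp_tracker *t, uint64_t now)
      {
          int dropped = 0;

          tracker_lock(t);
          for (int i = 0; i < TX_TS_SLOTS; i++) {
              if ((t->in_use & (UINT64_C(1) << i)) &&
                  now - t->start[i] > TX_TS_TIMEOUT) {
                  t->in_use &= ~(UINT64_C(1) << i);
                  t->skb[i] = NULL;   /* a real driver would free the SKB */
                  dropped++;
              }
          }
          tracker_unlock(t);
          return dropped;
      }
      ```

      Because an index is only handed out while its bit is clear, it cannot
      be reused before its timestamp is reported or the request is flushed
      as stale.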
      
      Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and
      use the same overall logic for extending to 64 bits of nanoseconds.
      
      With this change, E810 devices should be able to perform basic PTP
      functionality.
      
      Future changes will extend the support to cover the E822-based devices.

      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: enable receive hardware timestamping · 77a78115
      Committed by Jacob Keller
      Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to
      requests to enable timestamping support. If the request is for enabling
      Rx timestamps, set a bit in the Rx descriptors to indicate that receive
      timestamps should be reported.
      
      Hardware captures receive timestamps in the PHY which only captures part
      of the timer, and reports only 40 bits into the Rx descriptor. The upper
      32 bits represent the contents of GLTSYN_TIME_L at the point of packet
      reception, while the lower 8 bits represent the upper 8 bits of
      GLTSYN_TIME_0.
      
      The networking and PTP stack expect 64 bit timestamps in nanoseconds. To
      support this, implement some logic to extend the timestamps by using the
      full PHC time.
      
      If the Rx timestamp was captured prior to the PHC time, then the real
      timestamp is
      
        PHC - (lower_32_bits(PHC) - timestamp)
      
      If the Rx timestamp was captured after the PHC time, then the real
      timestamp is
      
        PHC + (timestamp - lower_32_bits(PHC))
      
      These calculations are correct as long as neither the PHC timestamp nor
      the Rx timestamps are more than 2^32-1 nanoseconds old. Further, we can
      detect when the Rx timestamp is before or after the PHC as long as the
      PHC timestamp is no more than 2^31-1 nanoseconds old.
      
      In that case, we calculate the delta between the lower 32 bits of the
      PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx
      timestamp must have been captured in the past. If it's smaller, then the
      Rx timestamp must have been captured after PHC time.
      
      Add an ice_ptp_extend_32b_ts function that relies on a cached copy of
      the PHC time and implements this algorithm to calculate the proper upper
      32 bits of the Rx timestamps.
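
      The extension algorithm described above can be written out as a
      small stand-alone C sketch. It follows the two formulas in this
      message, but the function names ts40_to_ns32() and extend_32b_ts()
      are illustrative and it is not a copy of the driver's
      ice_ptp_extend_32b_ts.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /*
       * Take the upper 32 bits (nanoseconds) out of a 40-bit hardware
       * timestamp; the lower 8 bits are dropped.
       */
      static uint32_t ts40_to_ns32(uint64_t ts40)
      {
          assert((ts40 >> 40) == 0);
          return (uint32_t)(ts40 >> 8);
      }

      /* Extend a 32-bit timestamp to 64 bits using a cached copy of the PHC. */
      static uint64_t extend_32b_ts(uint64_t cached_phc_time, uint32_t in_tstamp)
      {
          uint32_t phc_lo = (uint32_t)cached_phc_time;
          uint32_t delta = in_tstamp - phc_lo;    /* wraps modulo 2^32 */

          if (delta > UINT32_MAX / 2) {
              /* Timestamp was captured before the cached PHC time. */
              delta = phc_lo - in_tstamp;
              return cached_phc_time - delta;
          }

          /* Timestamp was captured at or after the cached PHC time. */
          return cached_phc_time + delta;
      }
      ```

      For example, with a cached PHC time of 0x100000010, a timestamp of
      0x20 extends to 0x100000020, while 0xFFFFFFF0 is treated as slightly
      in the past and extends to 0xFFFFFFF0.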
      
      Cache the PHC time periodically in all of the Rx rings. This enables
      each Rx ring to simply call the extension function with a recent copy of
      the PHC time. By ensuring that the PHC time is kept up to date
      periodically, we ensure this algorithm doesn't use stale data and
      produce incorrect results.
      
      To cache the time, introduce a kworker and a kwork item to periodically
      store the Rx time. It might seem like we should use the .do_aux_work
      interface of the PTP clock. This doesn't work because all PFs must cache
      this time, but only one PF owns the PTP clock device.
      
      Thus, the ice driver will manage its own kthread instead of relying on
      the PTP do_aux_work handler.
      
      With this change, the driver can now report Rx timestamps on all
      incoming packets.

      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: register 1588 PTP clock device object for E810 devices · 06c16d89
      Committed by Jacob Keller
      Add a new ice_ptp.c file for holding the basic PTP clock interface
      functions. If the device supports PTP, call the new ice_ptp_init and
      ice_ptp_release functions where appropriate.
      
      If the function owns the hardware resource associated with the PTP
      hardware clock, register with the PTP_1588_CLOCK infrastructure to
      allocate a new clock object that represents the device hardware clock.
      
      Implement basic functionality for reading and setting the clock time,
      performing clock adjustments, and adjusting the clock frequency.
      
      Future changes will introduce functionality for handling related
      features including Tx and Rx timestamps.

      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: add support for sideband messages · 8f5ee3c4
      Committed by Jacob Keller
      In order to support certain device features, including enabling the PTP
      hardware clock, the ice driver needs to control some registers on the
      device PHY.
      
      These registers are accessed by sending sideband messages. For some
      hardware, these messages must be sent over the device admin queue, while
      other hardware has a dedicated control queue for the sideband messages.
      
      Add the neighbor device message structure for sending a message to the
      neighboring device. Where supported, initialize the sideband control
      queue and handle cleanup.
      
      Add a wrapper function for sending sideband control queue messages that
      read or write a neighboring device register.
      
      Because some devices send sideband messages over the AdminQ, also
      increase the length of the admin queue to allow more messages to be
      queued up. This is important because the sideband messages add
      additional pressure on the AQ usage.
      
      This support will be used in following patches to enable support for
      CONFIG_PTP_1588_CLOCK.

      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  15. 10 Jun, 2021 1 commit
    • ice: add ndo_bpf callback for safe mode netdev ops · ebc5399e
      Committed by Maciej Fijalkowski
      The ice driver requires a programmable pipeline firmware package in
      order to support advanced features. Otherwise, the driver falls back
      to so-called 'safe mode'. In that mode, the ndo_bpf callback is not
      exposed, and when a user tries to load an XDP program, the following
      happens:
      
      $ sudo ./xdp1 enp179s0f1
      libbpf: Kernel error message: Underlying driver does not support XDP in native mode
      link set xdp fd failed
      
      which is somewhat confusing, as native XDP is supported, just not in
      the current mode. Improve the user experience by providing an
      ndo_bpf callback dedicated to safe mode, which uses extack to tell
      the user explicitly that the DDP package is missing and that this is
      why XDP cannot currently be loaded onto the interface.
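
      The idea of a safe-mode-only callback can be sketched as below. This
      is an illustration under assumptions, not the driver's code: struct
      fake_extack, struct fake_bpf_request, safe_mode_bpf(), the message
      text and the -EOPNOTSUPP return value are all hypothetical.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stddef.h>

      /* Hypothetical stand-in for the extended ack carried by a request. */
      struct fake_extack {
          const char *msg;
      };

      /* Hypothetical stand-in for the argument of an ndo_bpf style callback. */
      struct fake_bpf_request {
          struct fake_extack *extack;     /* may be NULL */
      };

      /* Callback installed only while the driver runs in safe mode. */
      static int safe_mode_bpf(struct fake_bpf_request *req)
      {
          assert(req);

          /* Say why the request fails instead of a generic "not supported". */
          if (req->extack)
              req->extack->msg =
                  "DDP package is missing, XDP is unavailable in safe mode";
          return -EOPNOTSUPP;
      }
      ```

      The request still fails, but the caller now receives a message that
      names the missing package as the cause.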
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Fixes: efc2214b ("ice: Add support for XDP")
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  16. 07 Jun, 2021 3 commits
  17. 29 May, 2021 3 commits
  18. 23 Apr, 2021 1 commit
  19. 15 Apr, 2021 5 commits
  20. 09 Apr, 2021 1 commit
    • ice: fix memory leak of aRFS after resuming from suspend · 1831da7e
      Committed by Yongxin Liu
      In ice_suspend(), ice_clear_interrupt_scheme() is called, and then
      irq_free_descs() will be eventually called to free irq and its descriptor.
      
      In ice_resume(), ice_init_interrupt_scheme() is called to allocate new
      irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct
      cpu_rmap may not be freed if the irqs that were released in
      ice_suspend() have been reassigned to other devices, which causes the
      irq descriptor's affinity_notify to be lost.

      So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(),
      which ensures that all irq_glue and cpu_rmap objects are released
      before the corresponding irqs and descriptors are released.
      
      Fix the following memory leak.
      
      unreferenced object 0xffff95bd951afc00 (size 512):
        comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
        hex dump (first 32 bytes):
          18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff  ........p.......
          00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff  ................
        backtrace:
          [<0000000072e4b914>] __kmalloc+0x336/0x540
          [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0
          [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice]
          [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
          [<00000000d692edba>] local_pci_probe+0x47/0xa0
          [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
          [<00000000555a9e4a>] process_one_work+0x1dd/0x410
          [<000000002c4b414a>] worker_thread+0x221/0x3f0
          [<00000000bb2b556b>] kthread+0x14c/0x170
          [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
      unreferenced object 0xffff95bd81b0a2a0 (size 96):
        comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
        hex dump (first 32 bytes):
          38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00  8...............
          b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff  ................
        backtrace:
          [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0
          [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0
          [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice]
          [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
          [<00000000d692edba>] local_pci_probe+0x47/0xa0
          [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
          [<00000000555a9e4a>] process_one_work+0x1dd/0x410
          [<000000002c4b414a>] worker_thread+0x221/0x3f0
          [<00000000bb2b556b>] kthread+0x14c/0x170
          [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
      
      Fixes: 769c500d ("ice: Add advanced power mgmt for WoL")
      Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  21. 08 Apr, 2021 5 commits