1. 18 8月, 2022 7 次提交
    • V
      net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats · e780e319
      Vladimir Oltean 提交于
      Rather than reading the stats64 counters directly from the 32-bit
      hardware, it's better to rely on the output produced by the periodic
      ocelot_port_update_stats().
      
      It would be even better to call ocelot_port_update_stats() right from
      ocelot_get_stats64() to make sure we report the current values rather
      than the ones from 2 seconds ago. But we need to export
      ocelot_port_update_stats() from the switch lib towards the switchdev
      driver for that, and future work will largely undo that.
      
      There are more ocelot-based drivers waiting to be introduced, an example
      of which is the SPI-controlled VSC7512. In that driver's case, it will
      be impossible to call ocelot_port_update_stats() from ndo_get_stats64
      context, since the latter is atomic, and reading the stats over SPI is
      sleepable. So the compromise taken here, which will also hold going
      forward, is to report 64-bit counters to stats64, which are not 100% up
      to date.
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      e780e319
    • V
      net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset · d4c36765
      Vladimir Oltean 提交于
      With so many counter addresses recently discovered as being wrong, it is
      desirable to at least have a central database of information, rather
      than two: one through the SYS_COUNT_* registers (used for
      ndo_get_stats64), and the other through the offset field of struct
      ocelot_stat_layout elements (used for ethtool -S).
      
      The strategy will be to keep the SYS_COUNT_* definitions as the single
      source of truth, but for that we need to expand our current definitions
      to cover all registers. Then we need to convert the ocelot region
      creation logic, and stats worker, to the read semantics imposed by going
      through SYS_COUNT_* absolute register addresses, rather than offsets
      of 32-bit words relative to SYS_COUNT_RX_OCTETS (which should have been
      SYS_CNT, by the way).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      d4c36765
    • V
      net: mscc: ocelot: make struct ocelot_stat_layout array indexable · 91904600
      Vladimir Oltean 提交于
      The ocelot counters are 32-bit and require periodic reading, every 2
      seconds, by ocelot_port_update_stats(), so that wraparounds are
      detected.
      
      Currently, the counters reported by ocelot_get_stats64() come from the
      32-bit hardware counters directly, rather than from the 64-bit
      accumulated ocelot->stats, and this is a problem for their integrity.
      
      The strategy is to make ocelot_get_stats64() able to cherry-pick
      individual stats from ocelot->stats the way in which it currently reads
      them out from SYS_COUNT_* registers. But currently it can't, because
      ocelot->stats is an opaque u64 array that's used only to feed data into
      ethtool -S.
      
      To solve that problem, we need to make ocelot->stats indexable, and
      associate each element with an element of struct ocelot_stat_layout used
      by ethtool -S.
      
      This makes ocelot_stat_layout a fat (and possibly sparse) array, so we
      need to change the way in which we access it. We no longer need
      OCELOT_STAT_END as a sentinel, because we know the array's size
      (OCELOT_NUM_STATS). We just need to skip the array elements that were
      left unpopulated for the switch revision (ocelot, felix, seville).
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      91904600
    • V
      net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work · 18d8e67d
      Vladimir Oltean 提交于
      The 2 methods can run concurrently, and one will change the window of
      counters (SYS_STAT_CFG_STAT_VIEW) that the other sees. The fix is
      similar to what commit 7fbf6795 ("net: mscc: ocelot: fix mutex lock
      error during ethtool stats read") has done for ethtool -S.
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      18d8e67d
    • V
      net: mscc: ocelot: turn stats_lock into a spinlock · 22d842e3
      Vladimir Oltean 提交于
      ocelot_get_stats64() currently runs unlocked and therefore may collide
      with ocelot_port_update_stats() which indirectly accesses the same
      counters. However, ocelot_get_stats64() runs in atomic context, and we
      cannot simply take the sleepable ocelot->stats_lock mutex. We need to
      convert it to an atomic spinlock first. Do that as a preparatory change.
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      22d842e3
    • V
      net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter · 173ca866
      Vladimir Oltean 提交于
      This register, used as part of stats->tx_dropped in
      ocelot_get_stats64(), has a wrong address. At the address currently
      given, there is actually the c_tx_green_prio_6 counter.
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      173ca866
    • V
      net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters · 5152de7b
      Vladimir Oltean 提交于
      Reading stats using the SYS_COUNT_* register definitions is only used by
      ocelot_get_stats64() from the ocelot switchdev driver, however,
      currently the bucket definitions are incorrect.
      
      Separately, on both RX and TX, we have the following problems:
      - a 256-1023 bucket which actually tracks the 256-511 packets
      - the 1024-1526 bucket actually tracks the 512-1023 packets
      - the 1527-max bucket actually tracks the 1024-1526 packets
      
      => nobody tracks the packets from the real 1527-max bucket
      
      Additionally, the RX_PAUSE, RX_CONTROL, RX_LONGS and RX_CLASSIFIED_DROPS
      all track the wrong thing. However this doesn't seem to have any
      consequence, since ocelot_get_stats64() doesn't use these.
      
      Even though this problem only manifests itself for the switchdev driver,
      we cannot split the fix for ocelot and for DSA, since it requires fixing
      the bucket definitions from enum ocelot_reg, which makes us necessarily
      adapt the structures from felix and seville as well.
      
      Fixes: 84705fc1 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      5152de7b
  2. 16 8月, 2022 3 次提交
  3. 15 8月, 2022 4 次提交
    • A
      mlxsw: spectrum_ptp: Forbid PTP enablement only in RX or in TX · e01885c3
      Amit Cohen 提交于
      Currently mlxsw driver configures one global PTP configuration for all
      ports. The reason is that the switch behaves like a transparent clock
      between CPU port and front-panel ports. When time stamp is enabled in
      any port, the hardware is configured to update the correction field. The
      fact that the configuration of CPU port affects all the ports, makes the
      correction field update to be global for all ports. Otherwise, user will
      see odd values in the correction field, as the switch will update the
      correction field in the CPU port, but not in all the front-panel ports.
      
      The CPU port is relevant in both RX and TX, so to avoid problematic
      configuration, forbid PTP enablement only in one direction, i.e., only in
      RX or TX.
      
      Without the change:
      $ hwstamp_ctl -i swp1 -r 12 -t 0
      current settings:
      tx_type 0
      rx_filter 0
      new settings:
      tx_type 0
      rx_filter 2
      $ echo $?
      0
      
      With the change:
      $ hwstamp_ctl -i swp1 -r 12 -t 0
      current settings:
      tx_type 1
      rx_filter 2
      SIOCSHWTSTAMP failed: Invalid argument
      
      Fixes: 08ef8bc8 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
      Signed-off-by: NAmit Cohen <amcohen@nvidia.com>
      Reviewed-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e01885c3
    • A
      mlxsw: spectrum_ptp: Protect PTP configuration with a mutex · d72fdef2
      Amit Cohen 提交于
      Currently the functions mlxsw_sp2_ptp_{configure, deconfigure}_port()
      assume that they are called when RTNL is locked and they warn otherwise.
      
      The deconfigure function can be called when port is removed, for example
      as part of device reload, then there is no locked RTNL and the function
      warns [1].
      
      To avoid such case, do not assume that RTNL protects this code, add a
      dedicated mutex instead. The mutex protects 'ptp_state->config' which
      stores the existing global configuration in hardware. Use this mutex also
      to protect the code which configures the hardware. Then, there will be
      only one configuration in any time, which will be updated in 'ptp_state'
      and a race will be avoided.
      
      [1]:
      RTNL: assertion failed at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c (1600)
      WARNING: CPU: 1 PID: 1583493 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1600 mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300 [mlxsw_spectrum]
      [...]
      CPU: 1 PID: 1583493 Comm: devlink Not tainted5.19.0-rc8-custom-127022-gb371dffda095 #789
      Hardware name: Mellanox Technologies Ltd.MSN3420/VMOD0005, BIOS 5.11 01/06/2019
      RIP: 0010:mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300[mlxsw_spectrum]
      [...]
      Call Trace:
       <TASK>
       mlxsw_sp_port_remove+0x7e/0x190 [mlxsw_spectrum]
       mlxsw_sp_fini+0xd1/0x270 [mlxsw_spectrum]
       mlxsw_core_bus_device_unregister+0x55/0x280 [mlxsw_core]
       mlxsw_devlink_core_bus_device_reload_down+0x1c/0x30[mlxsw_core]
       devlink_reload+0x1ee/0x230
       devlink_nl_cmd_reload+0x4de/0x580
       genl_family_rcv_msg_doit+0xdc/0x140
       genl_rcv_msg+0xd7/0x1d0
       netlink_rcv_skb+0x49/0xf0
       genl_rcv+0x1f/0x30
       netlink_unicast+0x22f/0x350
       netlink_sendmsg+0x208/0x440
       __sys_sendto+0xf0/0x140
       __x64_sys_sendto+0x1b/0x20
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 08ef8bc8 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
      Reported-by: NIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: NAmit Cohen <amcohen@nvidia.com>
      Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d72fdef2
    • A
      mlxsw: spectrum: Clear PTP configuration after unregistering the netdevice · a159e986
      Amit Cohen 提交于
      Currently as part of removing port, PTP API is called to clear the
      existing configuration and set the 'rx_filter' and 'tx_type' to zero.
      The clearing is done before unregistering the netdevice, which means that
      there is a window of time in which the user can reconfigure PTP in the
      port, and this configuration will not be cleared.
      
      Reorder the operations, clear PTP configuration after unregistering the
      netdevice.
      
      Fixes: 87486427 ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
      Signed-off-by: NAmit Cohen <amcohen@nvidia.com>
      Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a159e986
    • A
      mlxsw: spectrum_ptp: Fix compilation warnings · 12e09138
      Amit Cohen 提交于
      In case that 'CONFIG_PTP_1588_CLOCK' is not enabled in the config file,
      there are implementations for the functions
      mlxsw_{sp,sp2}_ptp_txhdr_construct() as part of 'spectrum_ptp.h'. In this
      case, they should be defined as 'static' as they are not supposed to be
      used out of this file. Make the functions 'static', otherwise the following
      warnings are returned:
      
      "warning: no previous prototype for 'mlxsw_sp_ptp_txhdr_construct'"
      "warning: no previous prototype for 'mlxsw_sp2_ptp_txhdr_construct'"
      
      In addition, make the functions 'inline' for case that 'spectrum_ptp.h'
      will be included anywhere else and the functions would probably not be
      used, so compilation warnings about unused static will be returned.
      
      Fixes: 24157bc6 ("mlxsw: Send PTP packets as data packets to overcome a limitation")
      Reported-by: Nkernel test robot <lkp@intel.com>
      Signed-off-by: NAmit Cohen <amcohen@nvidia.com>
      Reviewed-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NPetr Machata <petrm@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12e09138
  4. 13 8月, 2022 1 次提交
  5. 12 8月, 2022 9 次提交
    • I
      iavf: Fix deadlock in initialization · cbe9e511
      Ivan Vecera 提交于
      Fix deadlock that occurs when iavf interface is a part of failover
      configuration.
      
      1. Mutex crit_lock is taken at the beginning of iavf_watchdog_task()
      2. Function iavf_init_config_adapter() is called when adapter
         state is __IAVF_INIT_CONFIG_ADAPTER
      3. iavf_init_config_adapter() calls register_netdevice() that emits
         NETDEV_REGISTER event
      4. Notifier function failover_event() then calls
         net_failover_slave_register() that calls dev_open()
      5. dev_open() calls iavf_open() that tries to take crit_lock in
         end-less loop
      
      Stack trace:
      ...
      [  790.251876]  usleep_range_state+0x5b/0x80
      [  790.252547]  iavf_open+0x37/0x1d0 [iavf]
      [  790.253139]  __dev_open+0xcd/0x160
      [  790.253699]  dev_open+0x47/0x90
      [  790.254323]  net_failover_slave_register+0x122/0x220 [net_failover]
      [  790.255213]  failover_slave_register.part.7+0xd2/0x180 [failover]
      [  790.256050]  failover_event+0x122/0x1ab [failover]
      [  790.256821]  notifier_call_chain+0x47/0x70
      [  790.257510]  register_netdevice+0x20f/0x550
      [  790.258263]  iavf_watchdog_task+0x7c8/0xea0 [iavf]
      [  790.259009]  process_one_work+0x1a7/0x360
      [  790.259705]  worker_thread+0x30/0x390
      
      To fix the situation we should check the current adapter state after
      first unsuccessful mutex_trylock() and return with -EBUSY if it is
      __IAVF_INIT_CONFIG_ADAPTER.
      
      Fixes: 226d5285 ("iavf: fix locking of critical sections")
      Signed-off-by: NIvan Vecera <ivecera@redhat.com>
      Tested-by: NKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      cbe9e511
    • P
      iavf: Fix reset error handling · 31071173
      Przemyslaw Patynowski 提交于
      Do not call iavf_close in iavf_reset_task error handling. Doing so can
      lead to double call of napi_disable, which can lead to deadlock there.
      Removing VF would lead to iavf_remove task being stuck, because it
      requires crit_lock, which is held by iavf_close.
      Call iavf_disable_vf if reset fail, so that driver will clean up
      remaining invalid resources.
      During rapid VF resets, HW can fail to setup VF mailbox. Wrong
      error handling can lead to iavf_remove being stuck with:
      [ 5218.999087] iavf 0000:82:01.0: Failed to init adminq: -53
      ...
      [ 5267.189211] INFO: task repro.sh:11219 blocked for more than 30 seconds.
      [ 5267.189520]       Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 5267.189764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 5267.190062] task:repro.sh        state:D stack:    0 pid:11219 ppid:  8162 flags:0x00000000
      [ 5267.190347] Call Trace:
      [ 5267.190647]  <TASK>
      [ 5267.190927]  __schedule+0x460/0x9f0
      [ 5267.191264]  schedule+0x44/0xb0
      [ 5267.191563]  schedule_preempt_disabled+0x14/0x20
      [ 5267.191890]  __mutex_lock.isra.12+0x6e3/0xac0
      [ 5267.192237]  ? iavf_remove+0xf9/0x6c0 [iavf]
      [ 5267.192565]  iavf_remove+0x12a/0x6c0 [iavf]
      [ 5267.192911]  ? _raw_spin_unlock_irqrestore+0x1e/0x40
      [ 5267.193285]  pci_device_remove+0x36/0xb0
      [ 5267.193619]  device_release_driver_internal+0xc1/0x150
      [ 5267.193974]  pci_stop_bus_device+0x69/0x90
      [ 5267.194361]  pci_stop_and_remove_bus_device+0xe/0x20
      [ 5267.194735]  pci_iov_remove_virtfn+0xba/0x120
      [ 5267.195130]  sriov_disable+0x2f/0xe0
      [ 5267.195506]  ice_free_vfs+0x7d/0x2f0 [ice]
      [ 5267.196056]  ? pci_get_device+0x4f/0x70
      [ 5267.196496]  ice_sriov_configure+0x78/0x1a0 [ice]
      [ 5267.196995]  sriov_numvfs_store+0xfe/0x140
      [ 5267.197466]  kernfs_fop_write_iter+0x12e/0x1c0
      [ 5267.197918]  new_sync_write+0x10c/0x190
      [ 5267.198404]  vfs_write+0x24e/0x2d0
      [ 5267.198886]  ksys_write+0x5c/0xd0
      [ 5267.199367]  do_syscall_64+0x3a/0x80
      [ 5267.199827]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 5267.200317] RIP: 0033:0x7f5b381205c8
      [ 5267.200814] RSP: 002b:00007fff8c7e8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 5267.201981] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5b381205c8
      [ 5267.202620] RDX: 0000000000000002 RSI: 00005569420ee900 RDI: 0000000000000001
      [ 5267.203426] RBP: 00005569420ee900 R08: 000000000000000a R09: 00007f5b38180820
      [ 5267.204327] R10: 000000000000000a R11: 0000000000000246 R12: 00007f5b383c06e0
      [ 5267.205193] R13: 0000000000000002 R14: 00007f5b383bb880 R15: 0000000000000002
      [ 5267.206041]  </TASK>
      [ 5267.206970] Kernel panic - not syncing: hung_task: blocked tasks
      [ 5267.207809] CPU: 48 PID: 551 Comm: khungtaskd Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 5267.208726] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
      [ 5267.209623] Call Trace:
      [ 5267.210569]  <TASK>
      [ 5267.211480]  dump_stack_lvl+0x33/0x42
      [ 5267.212472]  panic+0x107/0x294
      [ 5267.213467]  watchdog.cold.8+0xc/0xbb
      [ 5267.214413]  ? proc_dohung_task_timeout_secs+0x30/0x30
      [ 5267.215511]  kthread+0xf4/0x120
      [ 5267.216459]  ? kthread_complete_and_exit+0x20/0x20
      [ 5267.217505]  ret_from_fork+0x22/0x30
      [ 5267.218459]  </TASK>
      
      Fixes: f0db7892 ("i40evf: use netdev variable in reset task")
      Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      31071173
    • P
      iavf: Fix NULL pointer dereference in iavf_get_link_ksettings · 541a1af4
      Przemyslaw Patynowski 提交于
      Fix possible NULL pointer dereference, due to freeing of adapter->vf_res
      in iavf_init_get_resources. Previous commit introduced a regression,
      where receiving IAVF_ERR_ADMIN_QUEUE_NO_WORK from iavf_get_vf_config
      would free adapter->vf_res. However, netdev is still registered, so
      ethtool_ops can be called. Calling iavf_get_link_ksettings with no vf_res,
      will result with:
      [ 9385.242676] BUG: kernel NULL pointer dereference, address: 0000000000000008
      [ 9385.242683] #PF: supervisor read access in kernel mode
      [ 9385.242686] #PF: error_code(0x0000) - not-present page
      [ 9385.242690] PGD 0 P4D 0
      [ 9385.242696] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
      [ 9385.242701] CPU: 6 PID: 3217 Comm: pmdalinux Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 9385.242708] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
      [ 9385.242710] RIP: 0010:iavf_get_link_ksettings+0x29/0xd0 [iavf]
      [ 9385.242745] Code: 00 0f 1f 44 00 00 b8 01 ef ff ff 48 c7 46 30 00 00 00 00 48 c7 46 38 00 00 00 00 c6 46 0b 00 66 89 46 08 48 8b 87 68 0e 00 00 <f6> 40 08 80 75 50 8b 87 5c 0e 00 00 83 f8 08 74 7a 76 1d 83 f8 20
      [ 9385.242749] RSP: 0018:ffffc0560ec7fbd0 EFLAGS: 00010246
      [ 9385.242755] RAX: 0000000000000000 RBX: ffffc0560ec7fc08 RCX: 0000000000000000
      [ 9385.242759] RDX: ffffffffc0ad4550 RSI: ffffc0560ec7fc08 RDI: ffffa0fc66674000
      [ 9385.242762] RBP: 00007ffd1fb2bf50 R08: b6a2d54b892363ee R09: ffffa101dc14fb00
      [ 9385.242765] R10: 0000000000000000 R11: 0000000000000004 R12: ffffa0fc66674000
      [ 9385.242768] R13: 0000000000000000 R14: ffffa0fc66674000 R15: 00000000ffffffa1
      [ 9385.242771] FS:  00007f93711a2980(0000) GS:ffffa0fad72c0000(0000) knlGS:0000000000000000
      [ 9385.242775] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9385.242778] CR2: 0000000000000008 CR3: 0000000a8e61c003 CR4: 00000000003706e0
      [ 9385.242781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9385.242784] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9385.242787] Call Trace:
      [ 9385.242791]  <TASK>
      [ 9385.242793]  ethtool_get_settings+0x71/0x1a0
      [ 9385.242814]  __dev_ethtool+0x426/0x2f40
      [ 9385.242823]  ? slab_post_alloc_hook+0x4f/0x280
      [ 9385.242836]  ? kmem_cache_alloc_trace+0x15d/0x2f0
      [ 9385.242841]  ? dev_ethtool+0x59/0x170
      [ 9385.242848]  dev_ethtool+0xa7/0x170
      [ 9385.242856]  dev_ioctl+0xc3/0x520
      [ 9385.242866]  sock_do_ioctl+0xa0/0xe0
      [ 9385.242877]  sock_ioctl+0x22f/0x320
      [ 9385.242885]  __x64_sys_ioctl+0x84/0xc0
      [ 9385.242896]  do_syscall_64+0x3a/0x80
      [ 9385.242904]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 9385.242918] RIP: 0033:0x7f93702396db
      [ 9385.242923] Code: 73 01 c3 48 8b 0d ad 57 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 57 38 00 f7 d8 64 89 01 48
      [ 9385.242927] RSP: 002b:00007ffd1fb2bf18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 9385.242932] RAX: ffffffffffffffda RBX: 000055671b1d2fe0 RCX: 00007f93702396db
      [ 9385.242935] RDX: 00007ffd1fb2bf20 RSI: 0000000000008946 RDI: 0000000000000007
      [ 9385.242937] RBP: 00007ffd1fb2bf20 R08: 0000000000000003 R09: 0030763066307330
      [ 9385.242940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd1fb2bf80
      [ 9385.242942] R13: 0000000000000007 R14: 0000556719f6de90 R15: 00007ffd1fb2c1b0
      [ 9385.242948]  </TASK>
      [ 9385.242949] Modules linked in: iavf(E) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink vfat fat irdma ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support ice irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl i40e pcspkr intel_cstate joydev mei_me intel_uncore mxm_wmi mei ipmi_ssif lpc_ich ipmi_si acpi_power_meter xfs libcrc32c mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft crc64 syscopyarea sg sysfillrect sysimgblt fb_sys_fops drm ixgbe ahci libahci libata crc32c_intel mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
      [ 9385.243065]  [last unloaded: iavf]
      
      Dereference happens in if (ADV_LINK_SUPPORT(adapter)) statement
      
      Fixes: 209f2f9c ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
      Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      541a1af4
    • P
      iavf: Fix adminq error handling · 41983161
      Przemyslaw Patynowski 提交于
      iavf_alloc_asq_bufs/iavf_alloc_arq_bufs allocates with dma_alloc_coherent
      memory for VF mailbox.
      Free DMA regions for both ASQ and ARQ in case error happens during
      configuration of ASQ/ARQ registers.
      Without this change it is possible to see when unloading interface:
      74626.583369: dma_debug_device_change: device driver has pending DMA allocations while released from device [count=32]
      One of leaked entries details: [device address=0x0000000b27ff9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent]
      
      Fixes: d358aa9a ("i40evf: init code and hardware support")
      Signed-off-by: NPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      41983161
    • L
      net: lan966x: fix checking for return value of platform_get_irq_byname() · 40b4ac88
      Li Qiong 提交于
      The platform_get_irq_byname() returns non-zero IRQ number
      or negative error number. "if (irq)" always true, chang it
      to "if (irq > 0)"
      Signed-off-by: NLi Qiong <liqiong@nfschina.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40b4ac88
    • J
      net: cxgb3: Fix comment typo · 75d8620d
      Jason Wang 提交于
      The double `the' is duplicated in the comment, remove one.
      Signed-off-by: NJason Wang <wangborong@cdjrlc.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      75d8620d
    • J
      bnx2x: Fix comment typo · 0619d0fa
      Jason Wang 提交于
      The double `the' is duplicated in the comment, remove one.
      Signed-off-by: NJason Wang <wangborong@cdjrlc.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0619d0fa
    • C
      dpaa2-eth: trace the allocated address instead of page struct · e34f4934
      Chen Lin 提交于
      We should trace the allocated address instead of page struct.
      
      Fixes: 27c87486 ("dpaa2-eth: Use a single page per Rx buffer")
      Signed-off-by: NChen Lin <chen.lin5@zte.com.cn>
      Reviewed-by: NIoana Ciornei <ioana.ciornei@nxp.com>
      Link: https://lore.kernel.org/r/20220811151651.3327-1-chen45464546@163.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      e34f4934
    • J
      nfp: fix use-after-free in area_cache_get() · 02e1a114
      Jialiang Wang 提交于
      area_cache_get() is used to distribute cache->area and set cache->id,
       and if cache->id is not 0 and cache->area->kref refcount is 0, it will
       release the cache->area by nfp_cpp_area_release(). area_cache_get()
       set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
      
      But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is
       is already set but the refcount is not increased as expected. At this
       time, calling the nfp_cpp_area_release() will cause use-after-free.
      
      To avoid the use-after-free, set cache->id after area_init() and
       nfp_cpp_area_acquire() complete successfully.
      
      Note: This vulnerability is triggerable by providing emulated device
       equipped with specified configuration.
      
       BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
        Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1
      
       Call Trace:
        <TASK>
       nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
       area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)
      
       Allocated by task 1:
       nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303)
       nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802)
       nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230)
       nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215)
       nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)
      
       Freed by task 1:
       kfree (mm/slub.c:4562)
       area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873)
       nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973)
       nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)
      Signed-off-by: NJialiang Wang <wangjialiang0806@163.com>
      Reviewed-by: NYinjun Zhang <yinjun.zhang@corigine.com>
      Acked-by: NSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      02e1a114
  6. 11 8月, 2022 4 次提交
    • V
      mlxsw: minimal: Fix deadlock in ports creation · 4f98cb04
      Vadim Pasternak 提交于
      Drop devl_lock() / devl_unlock() from ports creation and removal flows
      since the devlink instance lock is now taken by mlxsw_core.
      
      Fixes: 72a4c8c9 ("mlxsw: convert driver to use unlocked devlink API during init/fini")
      Signed-off-by: NVadim Pasternak <vadimp@nvidia.com>
      Signed-off-by: NIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: NPetr Machata <petrm@nvidia.com>
      Reviewed-by: NJiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/f4afce5ab0318617f3866b85274be52542d59b32.1660211614.git.petrm@nvidia.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      4f98cb04
    • M
      ice: Fix call trace with null VSI during VF reset · cf90b743
      Michal Jaron 提交于
      During stress test with attaching and detaching VF from KVM and
      simultaneously changing VFs spoofcheck and trust there was a
      call trace in ice_reset_vf that VF's VSI is null.
      
      [145237.352797] WARNING: CPU: 46 PID: 840629 at drivers/net/ethernet/intel/ice/ice_vf_lib.c:508 ice_reset_vf+0x3d6/0x410 [ice]
      [145237.352851] Modules linked in: ice(E) vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio iavf dm_mod xt_CHECKSUM xt_MASQUERADE
      xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun
       bridge stp llc sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTC
      O_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ipmi_si intel_cstate ipmi_devintf joydev intel_uncore m
      ei_me ipmi_msghandler i2c_i801 pcspkr mei lpc_ich ioatdma i2c_smbus acpi_pad acpi_power_meter ip_tables xfs libcrc32c i2c_algo_bit drm_sh
      mem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft syscopyarea crc64 sysfillrect sg sysimgblt fb_sys_fops drm i40e ixgbe ahci libahci
       libata crc32c_intel mdio dca wmi fuse [last unloaded: ice]
      [145237.352917] CPU: 46 PID: 840629 Comm: kworker/46:2 Tainted: G S      W I E     5.19.0-rc6+ #24
      [145237.352921] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
      [145237.352923] Workqueue: ice ice_service_task [ice]
      [145237.352948] RIP: 0010:ice_reset_vf+0x3d6/0x410 [ice]
      [145237.352984] Code: 30 ec f3 cc e9 28 fd ff ff 0f b7 4b 50 48 c7 c2 48 19 9c c0 4c 89 ee 48 c7 c7 30 fe 9e c0 e8 d1 21 9d cc 31 c0 e9 a
      9 fe ff ff <0f> 0b b8 ea ff ff ff e9 c1 fc ff ff 0f 0b b8 fb ff ff ff e9 91 fe
      [145237.352987] RSP: 0018:ffffb453e257fdb8 EFLAGS: 00010246
      [145237.352990] RAX: ffff8bd0040181c0 RBX: ffff8be68db8f800 RCX: 0000000000000000
      [145237.352991] RDX: 000000000000ffff RSI: 0000000000000000 RDI: ffff8be68db8f800
      [145237.352993] RBP: ffff8bd0040181c0 R08: 0000000000001000 R09: ffff8bcfd520e000
      [145237.352995] R10: 0000000000000000 R11: 00008417b5ab0bc0 R12: 0000000000000005
      [145237.352996] R13: ffff8bcee061c0d0 R14: ffff8bd004019640 R15: 0000000000000000
      [145237.352998] FS:  0000000000000000(0000) GS:ffff8be5dfb00000(0000) knlGS:0000000000000000
      [145237.353000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [145237.353002] CR2: 00007fd81f651d68 CR3: 0000001a0fe10001 CR4: 00000000001726e0
      [145237.353003] Call Trace:
      [145237.353008]  <TASK>
      [145237.353011]  ice_process_vflr_event+0x8d/0xb0 [ice]
      [145237.353049]  ice_service_task+0x79f/0xef0 [ice]
      [145237.353074]  process_one_work+0x1c8/0x390
      [145237.353081]  ? process_one_work+0x390/0x390
      [145237.353084]  worker_thread+0x30/0x360
      [145237.353087]  ? process_one_work+0x390/0x390
      [145237.353090]  kthread+0xe8/0x110
      [145237.353094]  ? kthread_complete_and_exit+0x20/0x20
      [145237.353097]  ret_from_fork+0x22/0x30
      [145237.353103]  </TASK>
      
      Remove WARN_ON() from check if VSI is null in ice_reset_vf.
      Add "VF is already removed\n" in dev_dbg().
      
      This WARN_ON() is unnecessary and causes call trace, despite that
      call trace, driver still works. There is no need for this warn
      because this piece of code is responsible for disabling VF's Tx/Rx
      queues when VF is disabled, but when VF is already removed there
      is no need to do reset or disable queues.
      
      Fixes: efe41860 ("ice: Fix memory corruption in VF driver")
      Signed-off-by: NMichal Jaron <michalx.jaron@intel.com>
      Signed-off-by: NJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      cf90b743
    • B
      ice: Fix VSI rebuild WARN_ON check for VF · 7fe05e12
      Benjamin Mikailenko 提交于
      In commit b03d519d ("ice: store VF pointer instead of VF ID")
      WARN_ON checks were added to validate the vsi->vf pointer and
      catch programming errors. However, one check to vsi->vf was missed.
      This caused a call trace when resetting VFs.
      
      Fix ice_vsi_rebuild by encompassing VF pointer in WARN_ON check.
      
      Fixes: b03d519d ("ice: store VF pointer instead of VF ID")
      Signed-off-by: NBenjamin Mikailenko <benjamin.mikailenko@intel.com>
      Tested-by: NMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com>
      7fe05e12
    • M
      net/tls: Use RCU API to access tls_ctx->netdev · 94ce3b64
      Maxim Mikityanskiy 提交于
      Currently, tls_device_down synchronizes with tls_device_resync_rx using
      RCU, however, the pointer to netdev is stored using WRITE_ONCE and
      loaded using READ_ONCE.
      
      Although such approach is technically correct (rcu_dereference is
      essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store
      NULL), using special RCU helpers for pointers is more valid, as it
      includes additional checks and might change the implementation
      transparently to the callers.
      
      Mark the netdev pointer as __rcu and use the correct RCU helpers to
      access it. For non-concurrent access pass the right conditions that
      guarantee safe access (locks taken, refcount value). Also use the
      correct helper in mlx5e, where even READ_ONCE was missing.
      
      The transition to RCU exposes existing issues, fixed by this commit:
      
      1. bond_tls_device_xmit could read netdev twice, and it could become
      NULL the second time, after the NULL check passed.
      
      2. Drivers shouldn't stop processing the last packet if tls_device_down
      just set netdev to NULL, before tls_dev_del was called. This prevents a
      possible packet drop when transitioning to the fallback software mode.
      
      Fixes: 89df6a81 ("net/bonding: Implement TLS TX device offload")
      Fixes: c55dcdd4 ("net/tls: Fix use-after-free after the TLS device goes down and up")
      Signed-off-by: NMaxim Mikityanskiy <maximmi@nvidia.com>
      Link: https://lore.kernel.org/r/20220810081602.1435800-1-maximmi@nvidia.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      94ce3b64
  7. 10 8月, 2022 3 次提交
    • M
      mlx5: do not use RT_TOS for IPv6 flowlabel · bcb0da7f
      Matthias May 提交于
      According to Guillaume Nault RT_TOS should never be used for IPv6.
      
      Quote:
      RT_TOS() is an old macro used to interprete IPv4 TOS as described in
      the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
      code, although, given the current state of the code, most of the
      existing calls have no consequence.
      
      But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
      field to be interpreted the RFC 1349 way. There's no historical
      compatibility to worry about.
      
      Fixes: ce99f6b9 ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels")
      Acked-by: NGuillaume Nault <gnault@redhat.com>
      Signed-off-by: NMatthias May <matthias.may@westermo.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      bcb0da7f
    • C
      net: atlantic: fix aq_vec index out of range error · 2ba5e47f
      Chia-Lin Kao (AceLan) 提交于
      The final update statement of the for loop exceeds the array range, the
      dereference of self->aq_vec[i] is not checked and then leads to the
      index out of range error.
      Also fixed this kind of coding style in other for loop.
      
      [   97.937604] UBSAN: array-index-out-of-bounds in drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1404:48
      [   97.937607] index 8 is out of range for type 'aq_vec_s *[8]'
      [   97.937608] CPU: 38 PID: 3767 Comm: kworker/u256:18 Not tainted 5.19.0+ #2
      [   97.937610] Hardware name: Dell Inc. Precision 7865 Tower/, BIOS 1.0.0 06/12/2022
      [   97.937611] Workqueue: events_unbound async_run_entry_fn
      [   97.937616] Call Trace:
      [   97.937617]  <TASK>
      [   97.937619]  dump_stack_lvl+0x49/0x63
      [   97.937624]  dump_stack+0x10/0x16
      [   97.937626]  ubsan_epilogue+0x9/0x3f
      [   97.937627]  __ubsan_handle_out_of_bounds.cold+0x44/0x49
      [   97.937629]  ? __scm_send+0x348/0x440
      [   97.937632]  ? aq_vec_stop+0x72/0x80 [atlantic]
      [   97.937639]  aq_nic_stop+0x1b6/0x1c0 [atlantic]
      [   97.937644]  aq_suspend_common+0x88/0x90 [atlantic]
      [   97.937648]  aq_pm_suspend_poweroff+0xe/0x20 [atlantic]
      [   97.937653]  pci_pm_suspend+0x7e/0x1a0
      [   97.937655]  ? pci_pm_suspend_noirq+0x2b0/0x2b0
      [   97.937657]  dpm_run_callback+0x54/0x190
      [   97.937660]  __device_suspend+0x14c/0x4d0
      [   97.937661]  async_suspend+0x23/0x70
      [   97.937663]  async_run_entry_fn+0x33/0x120
      [   97.937664]  process_one_work+0x21f/0x3f0
      [   97.937666]  worker_thread+0x4a/0x3c0
      [   97.937668]  ? process_one_work+0x3f0/0x3f0
      [   97.937669]  kthread+0xf0/0x120
      [   97.937671]  ? kthread_complete_and_exit+0x20/0x20
      [   97.937672]  ret_from_fork+0x22/0x30
      [   97.937676]  </TASK>
      
      v2. fixed "warning: variable 'aq_vec' set but not used"
      
      v3. simplified a for loop
      
      Fixes: 97bde5c4 ("net: ethernet: aquantia: Support for NIC-specific code")
      Signed-off-by: NChia-Lin Kao (AceLan) <acelan.kao@canonical.com>
      Acked-by: NSudarsana Reddy Kalluru <skalluru@marvell.com>
      Link: https://lore.kernel.org/r/20220808081845.42005-1-acelan.kao@canonical.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      2ba5e47f
    • S
      net: bgmac: Fix a BUG triggered by wrong bytes_compl · 1b7680c6
      Sandor Bodo-Merle 提交于
      On one of our machines we got:
      
      kernel BUG at lib/dynamic_queue_limits.c:27!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
      CPU: 0 PID: 1166 Comm: irq/41-bgmac Tainted: G        W  O    4.14.275-rt132 #1
      Hardware name: BRCM XGS iProc
      task: ee3415c0 task.stack: ee32a000
      PC is at dql_completed+0x168/0x178
      LR is at bgmac_poll+0x18c/0x6d8
      pc : [<c03b9430>]    lr : [<c04b5a18>]    psr: 800a0313
      sp : ee32be14  ip : 000005ea  fp : 00000bd4
      r10: ee558500  r9 : c0116298  r8 : 00000002
      r7 : 00000000  r6 : ef128810  r5 : 01993267  r4 : 01993851
      r3 : ee558000  r2 : 000070e1  r1 : 00000bd4  r0 : ee52c180
      Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 12c5387d  Table: 8e88c04a  DAC: 00000051
      Process irq/41-bgmac (pid: 1166, stack limit = 0xee32a210)
      Stack: (0xee32be14 to 0xee32c000)
      be00:                                              ee558520 ee52c100 ef128810
      be20: 00000000 00000002 c0116298 c04b5a18 00000000 c0a0c8c4 c0951780 00000040
      be40: c0701780 ee558500 ee55d520 ef05b340 ef6f9780 ee558520 00000001 00000040
      be60: ffffe000 c0a56878 ef6fa040 c0952040 0000012c c0528744 ef6f97b0 fffcfb6a
      be80: c0a04104 2eda8000 c0a0c4ec c0a0d368 ee32bf44 c0153534 ee32be98 ee32be98
      bea0: ee32bea0 ee32bea0 ee32bea8 ee32bea8 00000000 c01462e4 ffffe000 ef6f22a8
      bec0: ffffe000 00000008 ee32bee4 c0147430 ffffe000 c094a2a8 00000003 ffffe000
      bee0: c0a54528 00208040 0000000c c0a0c8c4 c0a65980 c0124d3c 00000008 ee558520
      bf00: c094a23c c0a02080 00000000 c07a9910 ef136970 ef136970 ee30a440 ef136900
      bf20: ee30a440 00000001 ef136900 ee30a440 c016d990 00000000 c0108db0 c012500c
      bf40: ef136900 c016da14 ee30a464 ffffe000 00000001 c016dd14 00000000 c016db28
      bf60: ffffe000 ee21a080 ee30a400 00000000 ee32a000 ee30a440 c016dbfc ee25fd70
      bf80: ee21a09c c013edcc ee32a000 ee30a400 c013ec7c 00000000 00000000 00000000
      bfa0: 00000000 00000000 00000000 c0108470 00000000 00000000 00000000 00000000
      bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
      [<c03b9430>] (dql_completed) from [<c04b5a18>] (bgmac_poll+0x18c/0x6d8)
      [<c04b5a18>] (bgmac_poll) from [<c0528744>] (net_rx_action+0x1c4/0x494)
      [<c0528744>] (net_rx_action) from [<c0124d3c>] (do_current_softirqs+0x1ec/0x43c)
      [<c0124d3c>] (do_current_softirqs) from [<c012500c>] (__local_bh_enable+0x80/0x98)
      [<c012500c>] (__local_bh_enable) from [<c016da14>] (irq_forced_thread_fn+0x84/0x98)
      [<c016da14>] (irq_forced_thread_fn) from [<c016dd14>] (irq_thread+0x118/0x1c0)
      [<c016dd14>] (irq_thread) from [<c013edcc>] (kthread+0x150/0x158)
      [<c013edcc>] (kthread) from [<c0108470>] (ret_from_fork+0x14/0x24)
      Code: a83f15e0 0200001a 0630a0e1 c3ffffea (f201f0e7)
      
      The issue seems similar to commit 90b3b339 ("net: hisilicon: Fix a BUG
      trigered by wrong bytes_compl") and potentially introduced by commit
      b38c83dd ("bgmac: simplify tx ring index handling").
      
      If there is an RX interrupt between setting ring->end
      and netdev_sent_queue() we can hit the BUG_ON as bgmac_dma_tx_free()
      can miscalculate the queue size while called from bgmac_poll().
      
      The machine which triggered the BUG runs a v4.14 RT kernel - but the issue
      seems present in mainline too.
      
      Fixes: b38c83dd ("bgmac: simplify tx ring index handling")
      Signed-off-by: NSandor Bodo-Merle <sbodomerle@gmail.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220808173939.193804-1-sbodomerle@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      1b7680c6
  8. 09 8月, 2022 2 次提交
  9. 06 8月, 2022 7 次提交