1. 23 Mar, 2021 5 commits
  2. 27 Feb, 2021 2 commits
    • bnxt_en: reliably allocate IRQ table on reset to avoid crash · 20d7d1c5
      Edwin Peer committed
      The following trace excerpt corresponds with a NULL pointer dereference
      of 'bp->irq_tbl' in bnxt_setup_inta() on an Aarch64 system after many
      device resets:
      
          Unable to handle kernel NULL pointer dereference at ... 000000d
          ...
          pc : string+0x3c/0x80
          lr : vsnprintf+0x294/0x7e0
          sp : ffff00000f61ba70 pstate : 20000145
          x29: ffff00000f61ba70 x28: 000000000000000d
          x27: ffff0000009c8b5a x26: ffff00000f61bb80
          x25: ffff0000009c8b5a x24: 0000000000000012
          x23: 00000000ffffffe0 x22: ffff000008990428
          x21: ffff00000f61bb80 x20: 000000000000000d
          x19: 000000000000001f x18: 0000000000000000
          x17: 0000000000000000 x16: ffff800b6d0fb400
          x15: 0000000000000000 x14: ffff800b7fe31ae8
          x13: 00001ed16472c920 x12: ffff000008c6b1c9
          x11: ffff000008cf0580 x10: ffff00000f61bb80
          x9 : 00000000ffffffd8 x8 : 000000000000000c
          x7 : ffff800b684b8000 x6 : 0000000000000000
          x5 : 0000000000000065 x4 : 0000000000000001
          x3 : ffff0a00ffffff04 x2 : 000000000000001f
          x1 : 0000000000000000 x0 : 000000000000000d
          Call trace:
          string+0x3c/0x80
          vsnprintf+0x294/0x7e0
          snprintf+0x44/0x50
          __bnxt_open_nic+0x34c/0x928 [bnxt_en]
          bnxt_open+0xe8/0x238 [bnxt_en]
          __dev_open+0xbc/0x130
          __dev_change_flags+0x12c/0x168
          dev_change_flags+0x20/0x60
          ...
      
      Ordinarily, a call to bnxt_setup_inta() (not in trace due to inlining)
      would not be expected on a system supporting MSIX at all. However, if
      bnxt_init_int_mode() does not end up being called after the call to
      bnxt_clear_int_mode() in bnxt_fw_reset_close(), then the driver will
      think that only INTA is supported and bp->irq_tbl will be NULL,
      causing the above crash.
      
      In the error recovery scenario, we call bnxt_clear_int_mode() in
      bnxt_fw_reset_close() early in the sequence. Ordinarily, we will
      call bnxt_init_int_mode() in bnxt_hwrm_if_change() after we
      reestablish communication with the firmware after reset.  However,
      if the sequence has to abort before we call bnxt_init_int_mode() and
      if the user later attempts to re-open the device, then it will cause
      the crash above.
      
      We fix it in 2 ways:
      
      1. Check for bp->irq_tbl in bnxt_setup_int_mode(). If it is NULL, call
      bnxt_init_int_mode().
      
      2. If we need to abort in bnxt_hwrm_if_change() and cannot complete
      the error recovery sequence, set the BNXT_STATE_ABORT_ERR flag.  This
      will cause more drastic recovery at the next attempt to re-open the
      device, including a call to bnxt_init_int_mode().
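
The two fixes above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver's code: every `sketch_*` name, the struct layouts, and the `-1` error value are hypothetical stand-ins, and `abort_err` merely models the BNXT_STATE_ABORT_ERR flag. It shows the IRQ table being allocated on demand before it is used, and an aborted recovery leaving a flag that forces re-initialization on the next open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the driver's real structures. */
struct sketch_irq {
	char name[32];
};

struct sketch_dev {
	struct sketch_irq *irq_tbl; /* models bp->irq_tbl, NULL once cleared */
	int nr_irqs;
	bool abort_err;             /* models the BNXT_STATE_ABORT_ERR flag */
};

/* Models bnxt_clear_int_mode(): release the table and leave it NULL. */
static void sketch_clear_int_mode(struct sketch_dev *dev)
{
	free(dev->irq_tbl);
	dev->irq_tbl = NULL;
}

/* Models bnxt_init_int_mode(): allocate the table if it is missing. */
static int sketch_init_int_mode(struct sketch_dev *dev)
{
	if (dev->irq_tbl)
		return 0;
	if (dev->nr_irqs <= 0)
		return -1;
	dev->irq_tbl = calloc((size_t)dev->nr_irqs, sizeof(*dev->irq_tbl));
	return dev->irq_tbl ? 0 : -1;
}

/* Fix 2: if recovery cannot complete, record the abort for the next open. */
static int sketch_if_change(struct sketch_dev *dev, bool fw_ok)
{
	if (!fw_ok) {
		dev->abort_err = true;
		return -1;
	}
	return sketch_init_int_mode(dev);
}

/* Fix 1: never touch the table without making sure it exists first. */
static int sketch_setup_int_mode(struct sketch_dev *dev)
{
	if (!dev->irq_tbl) {
		int rc = sketch_init_int_mode(dev);

		if (rc)
			return rc;
	}
	assert(dev->irq_tbl != NULL);
	snprintf(dev->irq_tbl[0].name, sizeof(dev->irq_tbl[0].name),
		 "%s-%d", "sketch", 0);
	return 0;
}

/* Re-open path: an earlier abort triggers re-initialization first. */
static int sketch_open(struct sketch_dev *dev)
{
	if (dev->abort_err) {
		if (sketch_init_int_mode(dev))
			return -1;
		dev->abort_err = false;
	}
	return sketch_setup_int_mode(dev);
}
```

In this model either guard alone is enough to keep the re-open from writing through a NULL table; the commit applies both.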
      
      Fixes: 3bc7d4a3 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
      Reviewed-by: Scott Branden <scott.branden@broadcom.com>
      Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix race between firmware reset and driver remove. · d20cd745
      Vasundhara Volam committed
      The driver's error recovery reset sequence can take many seconds to
      complete and only the critical sections are protected by rtnl_lock.
      A recent change has introduced a regression in this sequence.
      
      bnxt_remove_one() may be called while the recovery is in progress.
      Normally, unregister_netdev() would cause bnxt_close_nic() to be
      called and this would cause the error recovery to safely abort
      with the BNXT_STATE_ABORT_ERR flag set in bnxt_close_nic().
      
      Recently, we added bnxt_reinit_after_abort() to allow the user to
      reopen the device after an aborted recovery.  This causes the
      regression in the scenario described above because we would
      attempt to re-open even after the netdev has been unregistered.
      
      Fix it by checking the netdev reg_state in
      bnxt_reinit_after_abort() and abort if it is unregistered.
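
The guard described above can be sketched in a simplified user-space form. The names below are hypothetical: `sketch_netdev` and the `SKETCH_*` states only model the net device's reg_state, and returning `-ENODEV` is an assumption made for illustration rather than something stated in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the registration states relevant to the check. */
enum sketch_reg_state {
	SKETCH_REGISTERED,
	SKETCH_UNREGISTERED,
};

struct sketch_netdev {
	enum sketch_reg_state reg_state;
	int reopened; /* set to 1 when the re-open path actually runs */
};

/*
 * Models bnxt_reinit_after_abort(): refuse to re-open a device whose
 * netdev has already been unregistered by the remove path.
 */
static int sketch_reinit_after_abort(struct sketch_netdev *dev)
{
	assert(dev != NULL);

	if (dev->reg_state == SKETCH_UNREGISTERED)
		return -ENODEV; /* assumed error code for this sketch */

	dev->reopened = 1;
	return 0;
}
```

The check runs before any re-open work, so a device that is being removed is left untouched.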
      
      Fixes: 6882c36c ("bnxt_en: attempt to reinitialize after aborted reset")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 15 Feb, 2021 6 commits
  4. 12 Feb, 2021 1 commit
  5. 27 Jan, 2021 1 commit
  6. 26 Jan, 2021 14 commits
  7. 08 Jan, 2021 1 commit
  8. 06 Jan, 2021 1 commit
  9. 29 Dec, 2020 2 commits
  10. 01 Dec, 2020 1 commit
  11. 21 Nov, 2020 1 commit
  12. 20 Nov, 2020 2 commits
  13. 17 Nov, 2020 2 commits
  14. 27 Oct, 2020 1 commit