1. 25 10月, 2022 1 次提交
  2. 19 10月, 2022 5 次提交
  3. 12 10月, 2022 5 次提交
    • S
      nvme-multipath: fix possible hang in live ns resize with ANA access · 72e3b888
      Sagi Grimberg 提交于
      When we revalidate paths as part of ns size change (as of commit
      e7d65803), it is possible that during the path revalidation, the
      only paths that is IO capable (i.e. optimized/non-optimized) are the
      ones that ns resize was not yet informed to the host, which will cause
      inflight requests to be requeued (as we have available paths but none
      are IO capable). These requests on the requeue list are waiting for
      someone to resubmit them at some point.
      
      The IO capable paths will eventually notify the ns resize change to the
      host, but there is nothing that will kick the requeue list to resubmit
      the queued requests.
      
      Fix this by always kicking the requeue list, and if no IO capable path
      exists, these requests will be queued again.
      
      A typical log that indicates that IOs are requeued:
      --
      nvme nvme1: creating 4 I/O queues.
      nvme nvme1: new ctrl: "testnqn1"
      nvme nvme2: creating 4 I/O queues.
      nvme nvme2: mapped 4/0/0 default/read/poll queues.
      nvme nvme2: new ctrl: NQN "testnqn1", addr 127.0.0.1:8009
      nvme nvme1: rescanning namespaces.
      nvme1n1: detected capacity change from 2097152 to 4194304
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      nvme nvme2: rescanning namespaces.
      --
      Reported-by: NYogev Cohen <yogev@lightbitslabs.com>
      Fixes: e7d65803 ("nvme-multipath: revalidate paths during rescan")
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Cc: <stable@vger.kernel.org> # v5.15+
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      72e3b888
    • X
      nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs · d5d3c100
      Xi Ruoyao 提交于
      ZHITAI TiPro5000 SSDs has the same APST sleep problem as its cousin,
      TiPro7000.  The quirk for TiPro7000 has been added in
      commit 6b961bce ("nvme-pci: avoid the deepest sleep state on
      ZHITAI TiPro7000 SSDs"), use the same quirk for TiPro5000.
      
      The ASPT data from "nvme id-ctrl /dev/nvme1":
      
      vid       : 0x1e49
      ssvid     : 0x1e49
      sn        : ZTA21T0KA2227304LM
      mn        : ZHITAI TiPlus5000 1TB
      fr        : ZTA09139
      [...]
      ps    0 : mp:6.50W operational enlat:0 exlat:0 rrt:0 rrl:0
               rwt:0 rwl:0 idle_power:- active_power:-
      ps    1 : mp:5.80W operational enlat:0 exlat:0 rrt:1 rrl:1
               rwt:1 rwl:1 idle_power:- active_power:-
      ps    2 : mp:3.60W operational enlat:0 exlat:0 rrt:2 rrl:2
               rwt:2 rwl:2 idle_power:- active_power:-
      ps    3 : mp:0.0500W non-operational enlat:5000 exlat:10000 rrt:3 rrl:3
               rwt:3 rwl:3 idle_power:- active_power:-
      ps    4 : mp:0.0025W non-operational enlat:8000 exlat:45000 rrt:4 rrl:4
               rwt:4 rwl:4 idle_power:- active_power:-
      Reported-and-tested-by: NChang Feng <flukehn@gmail.com>
      Signed-off-by: NXi Ruoyao <xry111@xry111.site>
      Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      d5d3c100
    • A
      nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760 · 80b26240
      Abhijit 提交于
      Add a quirk to fix Lexar NM760 SSD drives reporting duplicate nsids.
      Signed-off-by: NAbhijit <abhijit@abhijittomar.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      80b26240
    • S
      nvme-tcp: fix possible hang caused during ctrl deletion · c4abd875
      Sagi Grimberg 提交于
      When we delete a controller, we execute the following:
      1. nvme_stop_ctrl() - stop some work elements that may be
      	inflight or scheduled (specifically also .stop_ctrl
      	which cancels ctrl error recovery work)
      2. nvme_remove_namespaces() - which first flushes scan_work
      	to avoid competing ns addition/removal
      3. continue to teardown the controller
      
      However, if err_work was scheduled to run in (1), it is designed to
      cancel any inflight I/O, particularly I/O that is originating from ns
      scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
      prevent forward progress of (2) as ns scanning is blocking on I/O
      (that will never be cancelled).
      
      The race is:
      1. transport layer error observed -> err_work is scheduled
      2. scan_work executes, discovers ns, generate I/O to it
      3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
         - err_work never executed
      4. nvme_remove_namespaces() -> flush_work(scan_work)
      --> deadlock, because scan_work is blocked on I/O that was supposed
      to be cancelled by err_work, but was cancelled before executing (see
      stack trace [1]).
      
      Fix this by flushing err_work instead of cancelling it, to force it
      to execute and cancel all inflight I/O.
      
      [1]:
      --
      Call Trace:
       <TASK>
       __schedule+0x390/0x910
       ? scan_shadow_nodes+0x40/0x40
       schedule+0x55/0xe0
       io_schedule+0x16/0x40
       do_read_cache_page+0x55d/0x850
       ? __page_cache_alloc+0x90/0x90
       read_cache_page+0x12/0x20
       read_part_sector+0x3f/0x110
       amiga_partition+0x3d/0x3e0
       ? osf_partition+0x33/0x220
       ? put_partition+0x90/0x90
       bdev_disk_changed+0x1fe/0x4d0
       blkdev_get_whole+0x7b/0x90
       blkdev_get_by_dev+0xda/0x2d0
       device_add_disk+0x356/0x3b0
       nvme_mpath_set_live+0x13c/0x1a0 [nvme_core]
       ? nvme_parse_ana_log+0xae/0x1a0 [nvme_core]
       nvme_update_ns_ana_state+0x3a/0x40 [nvme_core]
       nvme_mpath_add_disk+0x120/0x160 [nvme_core]
       nvme_alloc_ns+0x594/0xa00 [nvme_core]
       nvme_validate_or_alloc_ns+0xb9/0x1a0 [nvme_core]
       ? __nvme_submit_sync_cmd+0x1d2/0x210 [nvme_core]
       nvme_scan_work+0x281/0x410 [nvme_core]
       process_one_work+0x1be/0x380
       worker_thread+0x37/0x3b0
       ? process_one_work+0x380/0x380
       kthread+0x12d/0x150
       ? set_kthread_struct+0x50/0x50
       ret_from_fork+0x1f/0x30
       </TASK>
      INFO: task nvme:6725 blocked for more than 491 seconds.
            Not tainted 5.15.65-f0.el7.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:nvme            state:D
       stack:    0 pid: 6725 ppid:  1761 flags:0x00004000
      Call Trace:
       <TASK>
       __schedule+0x390/0x910
       ? sched_clock+0x9/0x10
       schedule+0x55/0xe0
       schedule_timeout+0x24b/0x2e0
       ? try_to_wake_up+0x358/0x510
       ? finish_task_switch+0x88/0x2c0
       wait_for_completion+0xa5/0x110
       __flush_work+0x144/0x210
       ? worker_attach_to_pool+0xc0/0xc0
       flush_work+0x10/0x20
       nvme_remove_namespaces+0x41/0xf0 [nvme_core]
       nvme_do_delete_ctrl+0x47/0x66 [nvme_core]
       nvme_sysfs_delete.cold.96+0x8/0xd [nvme_core]
       dev_attr_store+0x14/0x30
       sysfs_kf_write+0x38/0x50
       kernfs_fop_write_iter+0x146/0x1d0
       new_sync_write+0x114/0x1b0
       ? intel_pmu_handle_irq+0xe0/0x420
       vfs_write+0x18d/0x270
       ksys_write+0x61/0xe0
       __x64_sys_write+0x1a/0x20
       do_syscall_64+0x37/0x90
       entry_SYSCALL_64_after_hwframe+0x61/0xcb
      --
      
      Fixes: 3f2304f8 ("nvme-tcp: add NVMe over TCP host driver")
      Reported-by: NJonathan Nicklin <jnicklin@blockbridge.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Tested-by: NJonathan Nicklin <jnicklin@blockbridge.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      c4abd875
    • S
      nvme-rdma: fix possible hang caused during ctrl deletion · a1ae8d4d
      Sagi Grimberg 提交于
      When we delete a controller, we execute the following:
      1. nvme_stop_ctrl() - stop some work elements that may be
              inflight or scheduled (specifically also .stop_ctrl
              which cancels ctrl error recovery work)
      2. nvme_remove_namespaces() - which first flushes scan_work
              to avoid competing ns addition/removal
      3. continue to teardown the controller
      
      However, if err_work was scheduled to run in (1), it is designed to
      cancel any inflight I/O, particularly I/O that is originating from ns
      scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
      prevent forward progress of (2) as ns scanning is blocking on I/O
      (that will never be cancelled).
      
      The race is:
      1. transport layer error observed -> err_work is scheduled
      2. scan_work executes, discovers ns, generate I/O to it
      3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
         - err_work never executed
      4. nvme_remove_namespaces() -> flush_work(scan_work)
      --> deadlock, because scan_work is blocked on I/O that was supposed
      to be cancelled by err_work, but was cancelled before executing.
      
      Fix this by flushing err_work instead of cancelling it, to force it
      to execute and cancel all inflight I/O.
      
      Fixes: b435ecea ("nvme: Add .stop_ctrl to nvme ctrl ops")
      Fixes: f6c8e432 ("nvme: flush namespace scanning work just before removing namespaces")
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      a1ae8d4d
  4. 30 9月, 2022 8 次提交
  5. 27 9月, 2022 20 次提交
  6. 22 9月, 2022 1 次提交