1. 22 Aug 2020, 1 commit
    • nvme-pci: Use u32 for nvme_dev.q_depth and nvme_queue.q_depth · 7442ddce
      Authored by John Garry
      Recently nvme_dev.q_depth was changed from an int to a u16 type.
      
      This falls over for the queue depth calculation in nvme_pci_enable(),
      where NVME_CAP_MQES(dev->ctrl.cap) + 1 may overflow as a u16, since
      NVME_CAP_MQES() is also a 16-bit field. That happens for me, and this
      is the result:
      
      root@ubuntu:/home/john# [148.272996] Unable to handle kernel NULL pointer
      dereference at virtual address 0000000000000010
      Mem abort info:
      ESR = 0x96000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      Data abort info:
      ISV = 0, ISS = 0x00000004
      CM = 0, WnR = 0
      user pgtable: 4k pages, 48-bit VAs, pgdp=00000a27bf3c9000
      [0000000000000010] pgd=0000000000000000, p4d=0000000000000000
      Internal error: Oops: 96000004 [#1] PREEMPT SMP
      Modules linked in: nvme nvme_core
      CPU: 56 PID: 256 Comm: kworker/u195:0 Not tainted
      5.8.0-next-20200812 #27
      Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 -
      V1.16.01 03/15/2019
      Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      pstate: 80c00009 (Nzcv daif +PAN +UAO BTYPE=--)
      pc : __sg_alloc_table_from_pages+0xec/0x238
      lr : __sg_alloc_table_from_pages+0xc8/0x238
      sp : ffff800013ccbad0
      x29: ffff800013ccbad0 x28: ffff0a27b3d380a8
      x27: 0000000000000000 x26: 0000000000002dc2
      x25: 0000000000000dc0 x24: 0000000000000000
      x23: 0000000000000000 x22: ffff800013ccbbe8
      x21: 0000000000000010 x20: 0000000000000000
      x19: 00000000fffff000 x18: ffffffffffffffff
      x17: 00000000000000c0 x16: fffffe289eaf6380
      x15: ffff800011b59948 x14: ffff002bc8fe98f8
      x13: ff00000000000000 x12: ffff8000114ca000
      x11: 0000000000000000 x10: ffffffffffffffff
      x9 : ffffffffffffffc0 x8 : ffff0a27b5f9b6a0
      x7 : 0000000000000000 x6 : 0000000000000001
      x5 : ffff0a27b5f9b680 x4 : 0000000000000000
      x3 : ffff0a27b5f9b680 x2 : 0000000000000000
       x1 : 0000000000000001 x0 : 0000000000000000
       Call trace:
      __sg_alloc_table_from_pages+0xec/0x238
      sg_alloc_table_from_pages+0x18/0x28
      iommu_dma_alloc+0x474/0x678
      dma_alloc_attrs+0xd8/0xf0
      nvme_alloc_queue+0x114/0x160 [nvme]
      nvme_reset_work+0xb34/0x14b4 [nvme]
      process_one_work+0x1e8/0x360
      worker_thread+0x44/0x478
      kthread+0x150/0x158
      ret_from_fork+0x10/0x34
       Code: f94002c3 6b01017f 540007c2 11000486 (f8645aa5)
      ---[ end trace 89bb2b72d59bf925 ]---
      
      Fix by making it a u32.
      
      Also use u32 for nvme_queue.q_depth, as we assign this value from
      nvme_dev.q_depth, and nvme_dev.q_depth will possibly hold 65536 - this
      avoids the same crash as above.
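
      As a standalone illustration (not the driver code itself), the sketch
      below mirrors the kernel's NVME_CAP_MQES() definition, which masks the
      low 16 bits of the CAP register; with MQES at its maximum of 0xffff,
      MQES + 1 truncates to 0 in a u16 but survives in a u32:

      #include <stdint.h>
      #include <stdio.h>

      /* Mirrors the kernel macro: MQES is bits 15:0 of the CAP register. */
      #define NVME_CAP_MQES(cap) ((cap) & 0xffff)

      int main(void)
      {
              uint64_t cap = 0xffff;   /* controller reports MQES = 65535 */

              uint16_t depth16 = NVME_CAP_MQES(cap) + 1; /* wraps to 0 */
              uint32_t depth32 = NVME_CAP_MQES(cap) + 1; /* 65536, as intended */

              printf("u16: %u, u32: %u\n", depth16, depth32);
              return 0;
      }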
      
      Fixes: 61f3b896 ("nvme-pci: use unsigned for io queue depth")
      Signed-off-by: John Garry <john.garry@huawei.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 29 Jul 2020, 4 commits
  3. 26 Jul 2020, 1 commit
  4. 08 Jul 2020, 9 commits
  5. 25 Jun 2020, 2 commits
  6. 24 Jun 2020, 1 commit
  7. 11 Jun 2020, 1 commit
  8. 28 May 2020, 1 commit
    • nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll() · 9210c075
      Authored by Dongli Zhang
      There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g.,
      when doing live reset while polling the nvme device.
      
            CPU X                        CPU Y
                                     nvme_poll()
      nvme_dev_disable()
      -> nvme_stop_queues()
      -> nvme_suspend_io_queues()
      -> nvme_suspend_queue()
                                     -> spin_lock(&nvmeq->cq_poll_lock);
      -> nvme_reap_pending_cqes()
         -> nvme_process_cq()        -> nvme_process_cq()
      
      In the above scenario, the nvme_process_cq() for the same queue may be
      running on both CPU X and CPU Y concurrently.
      
      It is much easier to reproduce the issue when CONFIG_PREEMPT is
      enabled in the kernel. When CONFIG_PREEMPT is disabled, it takes
      longer for nvme_stop_queues() -> blk_mq_quiesce_queue() to wait out
      the grace period, which narrows the race window.
      
      This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in
      nvme_reap_pending_cqes().
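
      A sketch of the resulting function, assuming the nvme-pci structures
      of that kernel (dev->queues[] entries carry a cq_poll_lock spinlock):
      taking cq_poll_lock here serializes against nvme_poll(), which holds
      the same lock around its own nvme_process_cq() call:

      static void nvme_reap_pending_cqes(struct nvme_dev *dev)
      {
              int i;

              /* Reap each IO queue under the lock nvme_poll() also takes. */
              for (i = dev->ctrl.queue_count - 1; i > 0; i--) {
                      spin_lock(&dev->queues[i].cq_poll_lock);
                      nvme_process_cq(&dev->queues[i]);
                      spin_unlock(&dev->queues[i].cq_poll_lock);
              }
      }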
      
      Fixes: fa46c6fb ("nvme/pci: move cqe check after device shutdown")
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  9. 27 May 2020, 2 commits
    • nvme: introduce max_integrity_segments ctrl attribute · 95093350
      Authored by Max Gurtovoy
      This patch doesn't change any logic; it is preparation for adding PI
      support to fabrics drivers that will use an extended LBA format for
      metadata and will support more than one integrity segment.
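
      A hedged sketch of the plumbing this enables (field placement and
      call sites are illustrative, not the exact diff): the controller
      gains a max_integrity_segments attribute, nvme-pci keeps it at 1,
      and the core hands the value to the block layer's integrity limit:

      /* New controller-wide attribute on struct nvme_ctrl. */
      struct nvme_ctrl {
              /* ... existing fields ... */
              u32 max_integrity_segments;
      };

      /* nvme-pci: PRP-based metadata still allows a single segment. */
      dev->ctrl.max_integrity_segments = 1;

      /* nvme core: propagate the limit when setting up integrity. */
      blk_queue_max_integrity_segments(disk->queue,
                                       ctrl->max_integrity_segments);
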
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: make sure write/poll_queues are less than or equal to cpu count · 9c9e76d5
      Authored by Weiping Zhang
      Check the write_queues and poll_queues module parameters before using
      them, to reject values larger than the CPU count (see the sketch
      after the trace below).
      
      Reproducer:
      
      modprobe -r nvme
      modprobe nvme write_queues=`nproc`
      echo $((`nproc`+1)) > /sys/module/nvme/parameters/write_queues
      echo 1 > /sys/block/nvme0n1/device/reset_controller
      
      [  657.069000] ------------[ cut here ]------------
      [  657.069022] WARNING: CPU: 10 PID: 1163 at kernel/irq/affinity.c:390 irq_create_affinity_masks+0x47c/0x4a0
      [  657.069056]  dm_region_hash dm_log dm_mod
      [  657.069059] CPU: 10 PID: 1163 Comm: kworker/u193:9 Kdump: loaded Tainted: G        W         5.6.0+ #8
      [  657.069060] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019
      [  657.069064] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      [  657.069066] RIP: 0010:irq_create_affinity_masks+0x47c/0x4a0
      [  657.069067] Code: fe ff ff 48 c7 c0 b0 89 14 95 48 89 46 20 e9 e9 fb ff ff 31 c0 e9 90 fc ff ff 0f 0b 48 c7 44 24 08 00 00 00 00 e9 e9 fc ff ff <0f> 0b e9 87 fe ff ff 48 8b 7c 24 28 e8 33 a0 80 00 e9 b6 fc ff ff
      [  657.069068] RSP: 0018:ffffb505ce1ffc78 EFLAGS: 00010202
      [  657.069069] RAX: 0000000000000060 RBX: ffff9b97921fe5c0 RCX: 0000000000000000
      [  657.069069] RDX: ffff9b67bad80000 RSI: 00000000ffffffa0 RDI: 0000000000000000
      [  657.069070] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9b97921fe718
      [  657.069070] R10: ffff9b97921fe710 R11: 0000000000000001 R12: 0000000000000064
      [  657.069070] R13: 0000000000000060 R14: 0000000000000000 R15: 0000000000000001
      [  657.069071] FS:  0000000000000000(0000) GS:ffff9b67c0880000(0000) knlGS:0000000000000000
      [  657.069072] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  657.069072] CR2: 0000559eac6fc238 CR3: 000000057860a002 CR4: 00000000007606e0
      [  657.069073] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  657.069073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  657.069073] PKRU: 55555554
      [  657.069074] Call Trace:
      [  657.069080]  __pci_enable_msix_range+0x233/0x5a0
      [  657.069085]  ? kernfs_put+0xec/0x190
      [  657.069086]  pci_alloc_irq_vectors_affinity+0xbb/0x130
      [  657.069089]  nvme_reset_work+0x6e6/0xeab [nvme]
      [  657.069093]  ? __switch_to_asm+0x34/0x70
      [  657.069094]  ? __switch_to_asm+0x40/0x70
      [  657.069095]  ? nvme_irq_check+0x30/0x30 [nvme]
      [  657.069098]  process_one_work+0x1a7/0x370
      [  657.069101]  worker_thread+0x1c9/0x380
      [  657.069102]  ? max_active_store+0x80/0x80
      [  657.069103]  kthread+0x112/0x130
      [  657.069104]  ? __kthread_parkme+0x70/0x70
      [  657.069105]  ret_from_fork+0x35/0x40
      [  657.069106] ---[ end trace f4f06b7d24513d06 ]---
      [  657.077110] nvme nvme0: 95/1/0 default/read/poll queues
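
      A sketch of the kind of bounds check the patch adds (helper names are
      illustrative): a kernel_param_ops .set hook that rejects values above
      num_possible_cpus(), so both the modprobe option and the sysfs write
      above fail with -EINVAL instead of feeding an oversized count to
      irq_create_affinity_masks():

      static int io_queue_count_set(const char *val,
                                    const struct kernel_param *kp)
      {
              unsigned int n;
              int ret;

              ret = kstrtouint(val, 10, &n);
              if (ret != 0 || n > num_possible_cpus())
                      return -EINVAL;

              return param_set_uint(val, kp);
      }

      static const struct kernel_param_ops io_queue_count_ops = {
              .set = io_queue_count_set,
              .get = param_get_uint,
      };

      /* Hook the validating ops up to the existing parameters. */
      module_param_cb(write_queues, &io_queue_count_ops, &write_queues, 0644);
      module_param_cb(poll_queues, &io_queue_count_ops, &poll_queues, 0644);
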
      Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  10. 13 May 2020, 1 commit
  11. 10 May 2020, 4 commits
  12. 26 Mar 2020, 8 commits
  13. 28 Feb 2020, 1 commit
  14. 19 Feb 2020, 2 commits
  15. 15 Feb 2020, 1 commit
  16. 04 Feb 2020, 1 commit