1. 22 8月, 2020 2 次提交
    • C
      nvme-pci: fix PRP pool size · c61b82c7
      Christoph Hellwig 提交于
      All operations are based on the controller, not the host page size.
      Switch the dma pool to use the controller page size as well to avoid
      massive overallocations on large page size systems.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NKeith Busch <kbusch@kernel.org>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      c61b82c7
    • J
      nvme-pci: Use u32 for nvme_dev.q_depth and nvme_queue.q_depth · 7442ddce
      John Garry 提交于
      Recently nvme_dev.q_depth was changed from an int to u16 type.
      
      This falls over for the queue depth calculation in nvme_pci_enable(),
      where NVME_CAP_MQES(dev->ctrl.cap) + 1 may overflow as a u16, as
      NVME_CAP_MQES() is a 16b number also. That happens for me, and this is the
      result:
      
      root@ubuntu:/home/john# [148.272996] Unable to handle kernel NULL pointer
      dereference at virtual address 0000000000000010
      Mem abort info:
      ESR = 0x96000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      Data abort info:
      ISV = 0, ISS = 0x00000004
      CM = 0, WnR = 0
      user pgtable: 4k pages, 48-bit VAs, pgdp=00000a27bf3c9000
      [0000000000000010] pgd=0000000000000000, p4d=0000000000000000
      Internal error: Oops: 96000004 [#1] PREEMPT SMP
      Modules linked in: nvme nvme_core
      CPU: 56 PID: 256 Comm: kworker/u195:0 Not tainted
      5.8.0-next-20200812 #27
      Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 -
      V1.16.01 03/15/2019
      Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      pstate: 80c00009 (Nzcv daif +PAN +UAO BTYPE=--)
      pc : __sg_alloc_table_from_pages+0xec/0x238
      lr : __sg_alloc_table_from_pages+0xc8/0x238
      sp : ffff800013ccbad0
      x29: ffff800013ccbad0 x28: ffff0a27b3d380a8
      x27: 0000000000000000 x26: 0000000000002dc2
      x25: 0000000000000dc0 x24: 0000000000000000
      x23: 0000000000000000 x22: ffff800013ccbbe8
      x21: 0000000000000010 x20: 0000000000000000
      x19: 00000000fffff000 x18: ffffffffffffffff
      x17: 00000000000000c0 x16: fffffe289eaf6380
      x15: ffff800011b59948 x14: ffff002bc8fe98f8
      x13: ff00000000000000 x12: ffff8000114ca000
      x11: 0000000000000000 x10: ffffffffffffffff
      x9 : ffffffffffffffc0 x8 : ffff0a27b5f9b6a0
      x7 : 0000000000000000 x6 : 0000000000000001
      x5 : ffff0a27b5f9b680 x4 : 0000000000000000
      x3 : ffff0a27b5f9b680 x2 : 0000000000000000
       x1 : 0000000000000001 x0 : 0000000000000000
       Call trace:
      __sg_alloc_table_from_pages+0xec/0x238
      sg_alloc_table_from_pages+0x18/0x28
      iommu_dma_alloc+0x474/0x678
      dma_alloc_attrs+0xd8/0xf0
      nvme_alloc_queue+0x114/0x160 [nvme]
      nvme_reset_work+0xb34/0x14b4 [nvme]
      process_one_work+0x1e8/0x360
      worker_thread+0x44/0x478
      kthread+0x150/0x158
      ret_from_fork+0x10/0x34
       Code: f94002c3 6b01017f 540007c2 11000486 (f8645aa5)
      ---[ end trace 89bb2b72d59bf925 ]---
      
      Fix by making onto a u32.
      
      Also use u32 for nvme_dev.q_depth, as we assign this value from
      nvme_dev.q_depth, and nvme_dev.q_depth will possibly hold 65536 - this
      avoids the same crash as above.
      
      Fixes: 61f3b896 ("nvme-pci: use unsigned for io queue depth")
      Signed-off-by: NJohn Garry <john.garry@huawei.com>
      Reviewed-by: NKeith Busch <kbusch@kernel.org>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      7442ddce
  2. 29 7月, 2020 4 次提交
  3. 26 7月, 2020 1 次提交
  4. 08 7月, 2020 9 次提交
  5. 25 6月, 2020 2 次提交
  6. 24 6月, 2020 1 次提交
  7. 11 6月, 2020 1 次提交
  8. 28 5月, 2020 1 次提交
    • D
      nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll() · 9210c075
      Dongli Zhang 提交于
      There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g.,
      when doing live reset while polling the nvme device.
      
            CPU X                        CPU Y
                                     nvme_poll()
      nvme_dev_disable()
      -> nvme_stop_queues()
      -> nvme_suspend_io_queues()
      -> nvme_suspend_queue()
                                     -> spin_lock(&nvmeq->cq_poll_lock);
      -> nvme_reap_pending_cqes()
         -> nvme_process_cq()        -> nvme_process_cq()
      
      In the above scenario, the nvme_process_cq() for the same queue may be
      running on both CPU X and CPU Y concurrently.
      
      It is much more easier to reproduce the issue when CONFIG_PREEMPT is
      enabled in kernel. When CONFIG_PREEMPT is disabled, it would take longer
      time for nvme_stop_queues()-->blk_mq_quiesce_queue() to wait for grace
      period.
      
      This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in
      nvme_reap_pending_cqes().
      
      Fixes: fa46c6fb ("nvme/pci: move cqe check after device shutdown")
      Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: NMing Lei <ming.lei@redhat.com>
      Reviewed-by: NKeith Busch <kbusch@kernel.org>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      9210c075
  9. 27 5月, 2020 2 次提交
    • M
      nvme: introduce max_integrity_segments ctrl attribute · 95093350
      Max Gurtovoy 提交于
      This patch doesn't change any logic, and is needed as a preparation
      for adding PI support for fabrics drivers that will use an extended
      LBA format for metadata and will support more than 1 integrity segment.
      Signed-off-by: NMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: NIsrael Rukshin <israelr@mellanox.com>
      Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: NJames Smart <james.smart@broadcom.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      95093350
    • W
      nvme-pci: make sure write/poll_queues less or equal then cpu count · 9c9e76d5
      Weiping Zhang 提交于
      Check module parameter write/poll_queues before using it to catch
      too large values.
      
      Reproducer:
      
      modprobe -r nvme
      modprobe nvme write_queues=`nproc`
      echo $((`nproc`+1)) > /sys/module/nvme/parameters/write_queues
      echo 1 > /sys/block/nvme0n1/device/reset_controller
      
      [  657.069000] ------------[ cut here ]------------
      [  657.069022] WARNING: CPU: 10 PID: 1163 at kernel/irq/affinity.c:390 irq_create_affinity_masks+0x47c/0x4a0
      [  657.069056]  dm_region_hash dm_log dm_mod
      [  657.069059] CPU: 10 PID: 1163 Comm: kworker/u193:9 Kdump: loaded Tainted: G        W         5.6.0+ #8
      [  657.069060] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019
      [  657.069064] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      [  657.069066] RIP: 0010:irq_create_affinity_masks+0x47c/0x4a0
      [  657.069067] Code: fe ff ff 48 c7 c0 b0 89 14 95 48 89 46 20 e9 e9 fb ff ff 31 c0 e9 90 fc ff ff 0f 0b 48 c7 44 24 08 00 00 00 00 e9 e9 fc ff ff <0f> 0b e9 87 fe ff ff 48 8b 7c 24 28 e8 33 a0 80 00 e9 b6 fc ff ff
      [  657.069068] RSP: 0018:ffffb505ce1ffc78 EFLAGS: 00010202
      [  657.069069] RAX: 0000000000000060 RBX: ffff9b97921fe5c0 RCX: 0000000000000000
      [  657.069069] RDX: ffff9b67bad80000 RSI: 00000000ffffffa0 RDI: 0000000000000000
      [  657.069070] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9b97921fe718
      [  657.069070] R10: ffff9b97921fe710 R11: 0000000000000001 R12: 0000000000000064
      [  657.069070] R13: 0000000000000060 R14: 0000000000000000 R15: 0000000000000001
      [  657.069071] FS:  0000000000000000(0000) GS:ffff9b67c0880000(0000) knlGS:0000000000000000
      [  657.069072] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  657.069072] CR2: 0000559eac6fc238 CR3: 000000057860a002 CR4: 00000000007606e0
      [  657.069073] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  657.069073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  657.069073] PKRU: 55555554
      [  657.069074] Call Trace:
      [  657.069080]  __pci_enable_msix_range+0x233/0x5a0
      [  657.069085]  ? kernfs_put+0xec/0x190
      [  657.069086]  pci_alloc_irq_vectors_affinity+0xbb/0x130
      [  657.069089]  nvme_reset_work+0x6e6/0xeab [nvme]
      [  657.069093]  ? __switch_to_asm+0x34/0x70
      [  657.069094]  ? __switch_to_asm+0x40/0x70
      [  657.069095]  ? nvme_irq_check+0x30/0x30 [nvme]
      [  657.069098]  process_one_work+0x1a7/0x370
      [  657.069101]  worker_thread+0x1c9/0x380
      [  657.069102]  ? max_active_store+0x80/0x80
      [  657.069103]  kthread+0x112/0x130
      [  657.069104]  ? __kthread_parkme+0x70/0x70
      [  657.069105]  ret_from_fork+0x35/0x40
      [  657.069106] ---[ end trace f4f06b7d24513d06 ]---
      [  657.077110] nvme nvme0: 95/1/0 default/read/poll queues
      Signed-off-by: NWeiping Zhang <zhangweiping@didiglobal.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      9c9e76d5
  10. 13 5月, 2020 1 次提交
  11. 10 5月, 2020 4 次提交
  12. 26 3月, 2020 8 次提交
  13. 28 2月, 2020 1 次提交
  14. 19 2月, 2020 2 次提交
  15. 15 2月, 2020 1 次提交