1. 29 July 2020, 1 commit
  2. 16 July 2020, 1 commit
  3. 02 July 2020, 1 commit
  4. 25 June 2020, 2 commits
    • nvme: fix possible deadlock when I/O is blocked · 3b4b1972
      Committed by Sagi Grimberg
      Revert fab7772b ("nvme-multipath: revalidate nvme_ns_head gendisk
      in nvme_validate_ns")
      
      When adding a new namespace to the head disk (via nvme_mpath_set_live)
      we will see a partition scan, which triggers I/O on the mpath device node.
      This process is usually triggered from scan_work, which holds the scan_lock.
      If that I/O blocks (for example after an ANA change that leaves paths
      present but none currently accessible), this can deadlock on the head
      disk's bd_mutex: the partition scan I/O takes it, and head disk revalidation
      also takes it to check for a resize (also triggered from scan_work on a
      different path). See trace [1].
      
      The mpath disk revalidation was originally added to detect online disk
      size change, but this is no longer needed since commit cb224c3a
      ("nvme: Convert to use set_capacity_revalidate_and_notify") which already
      updates resize info without unnecessarily revalidating the disk (the
      mpath disk doesn't even implement .revalidate_disk fop).
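
      As a rough illustration of the blocking pattern (a standalone userspace
      model with stand-in names, not kernel code), one task blocks on "I/O"
      that never completes while holding a lock, and a second task then waits
      forever for that same lock; this mirrors the two hung-task reports in
      trace [1]:

      #include <pthread.h>
      #include <semaphore.h>
      #include <stdio.h>

      static pthread_mutex_t bd_mutex = PTHREAD_MUTEX_INITIALIZER;
      static sem_t io_completion;          /* never posted: I/O with no usable path */

      static void *partition_scan(void *arg)
      {
          (void)arg;
          pthread_mutex_lock(&bd_mutex);   /* partition scan takes the lock */
          puts("scan: blocked in I/O while holding bd_mutex");
          sem_wait(&io_completion);        /* the I/O never completes: stuck */
          pthread_mutex_unlock(&bd_mutex);
          return NULL;
      }

      static void *head_revalidate(void *arg)
      {
          (void)arg;
          pthread_mutex_lock(&bd_mutex);   /* revalidation wants the same lock */
          puts("revalidate: unreachable while the scan is stuck");
          pthread_mutex_unlock(&bd_mutex);
          return NULL;
      }

      int main(void)
      {
          pthread_t a, b;

          sem_init(&io_completion, 0, 0);
          pthread_create(&a, NULL, partition_scan, NULL);
          pthread_create(&b, NULL, head_revalidate, NULL);
          pthread_join(a, NULL);           /* hangs, like the reports below */
          pthread_join(b, NULL);
          return 0;
      }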
      
      [1]:
      --
      kernel: INFO: task kworker/u65:9:494 blocked for more than 241 seconds.
      kernel:       Tainted: G           OE     5.3.5-050305-generic #201910071830
      kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kernel: kworker/u65:9   D    0   494      2 0x80004000
      kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
      kernel: Call Trace:
      kernel:  __schedule+0x2b9/0x6c0
      kernel:  schedule+0x42/0xb0
      kernel:  schedule_preempt_disabled+0xe/0x10
      kernel:  __mutex_lock.isra.0+0x182/0x4f0
      kernel:  __mutex_lock_slowpath+0x13/0x20
      kernel:  mutex_lock+0x2e/0x40
      kernel:  revalidate_disk+0x63/0xa0
      kernel:  __nvme_revalidate_disk+0xfe/0x110 [nvme_core]
      kernel:  nvme_revalidate_disk+0xa4/0x160 [nvme_core]
      kernel:  ? evict+0x14c/0x1b0
      kernel:  revalidate_disk+0x2b/0xa0
      kernel:  nvme_validate_ns+0x49/0x940 [nvme_core]
      kernel:  ? blk_mq_free_request+0xd2/0x100
      kernel:  ? __nvme_submit_sync_cmd+0xbe/0x1e0 [nvme_core]
      kernel:  nvme_scan_work+0x24f/0x380 [nvme_core]
      kernel:  process_one_work+0x1db/0x380
      kernel:  worker_thread+0x249/0x400
      kernel:  kthread+0x104/0x140
      kernel:  ? process_one_work+0x380/0x380
      kernel:  ? kthread_park+0x80/0x80
      kernel:  ret_from_fork+0x1f/0x40
      ...
      kernel: INFO: task kworker/u65:1:2630 blocked for more than 241 seconds.
      kernel:       Tainted: G           OE     5.3.5-050305-generic #201910071830
      kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kernel: kworker/u65:1   D    0  2630      2 0x80004000
      kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
      kernel: Call Trace:
      kernel:  __schedule+0x2b9/0x6c0
      kernel:  schedule+0x42/0xb0
      kernel:  io_schedule+0x16/0x40
      kernel:  do_read_cache_page+0x438/0x830
      kernel:  ? __switch_to_asm+0x34/0x70
      kernel:  ? file_fdatawait_range+0x30/0x30
      kernel:  read_cache_page+0x12/0x20
      kernel:  read_dev_sector+0x27/0xc0
      kernel:  read_lba+0xc1/0x220
      kernel:  ? kmem_cache_alloc_trace+0x19c/0x230
      kernel:  efi_partition+0x1e6/0x708
      kernel:  ? vsnprintf+0x39e/0x4e0
      kernel:  ? snprintf+0x49/0x60
      kernel:  check_partition+0x154/0x244
      kernel:  rescan_partitions+0xae/0x280
      kernel:  __blkdev_get+0x40f/0x560
      kernel:  blkdev_get+0x3d/0x140
      kernel:  __device_add_disk+0x388/0x480
      kernel:  device_add_disk+0x13/0x20
      kernel:  nvme_mpath_set_live+0x119/0x140 [nvme_core]
      kernel:  nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
      kernel:  nvme_set_ns_ana_state+0x1e/0x30 [nvme_core]
      kernel:  nvme_parse_ana_log+0xa1/0x180 [nvme_core]
      kernel:  ? nvme_update_ns_ana_state+0x60/0x60 [nvme_core]
      kernel:  nvme_mpath_add_disk+0x47/0x90 [nvme_core]
      kernel:  nvme_validate_ns+0x396/0x940 [nvme_core]
      kernel:  ? blk_mq_free_request+0xd2/0x100
      kernel:  nvme_scan_work+0x24f/0x380 [nvme_core]
      kernel:  process_one_work+0x1db/0x380
      kernel:  worker_thread+0x249/0x400
      kernel:  kthread+0x104/0x140
      kernel:  ? process_one_work+0x380/0x380
      kernel:  ? kthread_park+0x80/0x80
      kernel:  ret_from_fork+0x1f/0x40
      --
      
      Fixes: fab7772b ("nvme-multipath: revalidate nvme_ns_head gendisk
      in nvme_validate_ns")
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: set initial value for controller's numa node · 4fea243e
      Committed by Max Gurtovoy
      Initialize the node to the NUMA_NO_NODE value. Transports that are aware of
      NUMA node affinity can override it (e.g. the RDMA transport sets the affinity
      according to the RDMA HCA).
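
      As a minimal sketch of the idea (illustrative structure and function
      names only, not the actual kernel diff), the core picks a neutral
      default and a NUMA-aware transport may later overwrite it:

      #include <stdio.h>

      #define NUMA_NO_NODE (-1)            /* same convention as the kernel */

      struct controller {
          int numa_node;                   /* only the field of interest here */
      };

      /* Core init: start with no NUMA preference instead of a stale or
       * accidentally-zero node. */
      static void controller_init(struct controller *ctrl)
      {
          ctrl->numa_node = NUMA_NO_NODE;
      }

      /* A NUMA-aware transport (e.g. RDMA) can override the default with
       * the node closest to its device; dev_node stands in for that lookup. */
      static void transport_set_node(struct controller *ctrl, int dev_node)
      {
          if (dev_node != NUMA_NO_NODE)
              ctrl->numa_node = dev_node;
      }

      int main(void)
      {
          struct controller ctrl;

          controller_init(&ctrl);
          printf("default node: %d\n", ctrl.numa_node);
          transport_set_node(&ctrl, 1);    /* RDMA-style override */
          printf("after transport override: %d\n", ctrl.numa_node);
          return 0;
      }
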
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  5. 11 June 2020, 1 commit
  6. 30 May 2020, 1 commit
  7. 27 May 2020, 8 commits
  8. 10 May 2020, 20 commits
  9. 27 April 2020, 1 commit
    • nvme: prevent double free in nvme_alloc_ns() error handling · 132be623
      Committed by Niklas Cassel
      When jumping to the out_put_disk label, we will call put_disk(), which will
      trigger a call to disk_release(), which calls blk_put_queue().
      
      Later in the cleanup code, we do blk_cleanup_queue(), which will also call
      blk_put_queue().
      
      Putting the queue twice is incorrect and will generate a KASAN splat.
      
      Set the disk->queue pointer to NULL before calling put_disk(), so that the
      first call to blk_put_queue() will not free the queue.
      
      The second call to blk_put_queue() uses another pointer to the same queue,
      so this call will still free the queue.
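
      The pattern can be modeled in plain userspace C (illustrative types and
      helpers only, not the kernel code): clearing the back-pointer before the
      first put keeps the single reference from being dropped twice:

      #include <stdio.h>
      #include <stdlib.h>

      struct queue {
          int refcount;
      };

      struct disk {
          struct queue *queue;           /* back-pointer, like disk->queue */
      };

      static void queue_put(struct queue *q)
      {
          if (q && --q->refcount == 0) {
              puts("queue freed");
              free(q);
          }
      }

      /* Models disk_release(): drops the queue reference reachable through
       * the disk, then frees the disk itself. */
      static void disk_put(struct disk *d)
      {
          queue_put(d->queue);
          free(d);
      }

      int main(void)
      {
          struct queue *q = calloc(1, sizeof(*q));
          struct disk *d = calloc(1, sizeof(*d));

          q->refcount = 1;
          d->queue = q;

          /* The fix: without this line, disk_put() and the later explicit
           * queue_put(q) would both drop the one reference (double free). */
          d->queue = NULL;

          disk_put(d);                   /* no longer touches the queue */
          queue_put(q);                  /* cleanup frees the queue exactly once */
          return 0;
      }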
      
      Fixes: 85136c01 ("lightnvm: simplify geometry enumeration")
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  10. 02 April 2020, 1 commit
  11. 31 March 2020, 1 commit
    • nvme: fix compat address handling in several ioctls · c95b708d
      Committed by Nick Bowler
      On a 32-bit kernel, the upper bits of userspace addresses passed via
      various ioctls are silently ignored by the nvme driver.
      
      However on a 64-bit kernel running a compat task, these upper bits are
      not ignored and are in fact required to be zero for the ioctls to work.
      
      Unfortunately, this difference matters.  32-bit smartctl submits the
      NVME_IOCTL_ADMIN_CMD ioctl with garbage in these upper bits because the
      pointer value it puts into the nvme_passthru_cmd structure appears to be
      sign-extended.  This works fine on 32-bit kernels but fails on a 64-bit
      one because (at least on my setup) the addresses smartctl uses are
      consistently above 2G.  For example:
      
        # smartctl -x /dev/nvme0n1
        smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.11] (local build)
        Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
      
        Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Bad address
      
      Since changing 32-bit kernels to actually check all of the submitted
      address bits now would break existing userspace, this patch fixes the
      compat problem by explicitly zeroing the upper bits in the compat case.
      This enables 32-bit smartctl to work on a 64-bit kernel.
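
      A small standalone sketch of the effect (the helper name here is
      hypothetical, not the actual kernel change): a buffer above 2G gets
      sign-extended when a 32-bit caller stores its pointer into a 64-bit
      field, and masking off the upper 32 bits for compat tasks recovers the
      intended address:

      #include <stdint.h>
      #include <stdio.h>

      /* Hypothetical helper: for a compat (32-bit) task only the low 32
       * bits of the stored address are meaningful, so clear the rest. */
      static uint64_t fixup_compat_addr(uint64_t addr, int in_compat_task)
      {
          return in_compat_task ? (addr & 0xffffffffULL) : addr;
      }

      int main(void)
      {
          /* A 32-bit caller whose buffer lives above 2G; a sign-extending
           * store into the 64-bit command field leaves garbage up top. */
          uint32_t user_addr = 0x9abcd000u;
          uint64_t stored = (uint64_t)(int64_t)(int32_t)user_addr;

          printf("stored by 32-bit caller: 0x%016llx\n",
                 (unsigned long long)stored);
          printf("after compat fixup:      0x%016llx\n",
                 (unsigned long long)fixup_compat_addr(stored, 1));
          return 0;
      }
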
      Signed-off-by: Nick Bowler <nbowler@draconx.ca>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  12. 26 March 2020, 2 commits