1. 01 Aug 2019, 4 commits
    • nvme: fix controller removal race with scan work · 0157ec8d
      Committed by Sagi Grimberg
      With multipath enabled, nvme_scan_work() can read from the device
      (through nvme_mpath_add_disk()) and hang [1]. With fabrics, once
      ctrl->state is set to NVME_CTRL_DELETING, those reads hang (see
      nvmf_check_ready()) and the mpath stack device's make_request will
      block if head->list is not empty. However, when head->list consists
      only of DELETING/DEAD controllers, we should not block but rather
      fail immediately.
      
      In addition, before we go ahead and remove the namespaces, make sure
      to clear the current path and kick the requeue list so that requests
      fail fast upon requeuing.
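      
      A minimal sketch of the fail-fast check, assuming the helper name
      and controller states used by the multipath code at the time
      (nvme_available_path() here is illustrative, not a verbatim copy of
      the patch):
      --
      /* Fail fast once every controller on head->list is being torn down. */
      static bool nvme_available_path(struct nvme_ns_head *head)
      {
              struct nvme_ns *ns;
      
              list_for_each_entry_rcu(ns, &head->list, siblings) {
                      switch (ns->ctrl->state) {
                      case NVME_CTRL_LIVE:
                      case NVME_CTRL_RESETTING:
                      case NVME_CTRL_CONNECTING:
                              /* at least one path could (re)become usable */
                              return true;
                      default:
                              break;
                      }
              }
              /* only DELETING/DEAD controllers remain: fail the I/O */
              return false;
      }
      --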
      
      [1]:
      --
        INFO: task kworker/u4:3:166 blocked for more than 120 seconds.
              Not tainted 5.2.0-rc6-vmlocalyes-00005-g808c8c2dc0cf #316
        "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        kworker/u4:3    D    0   166      2 0x80004000
        Workqueue: nvme-wq nvme_scan_work
        Call Trace:
         __schedule+0x851/0x1400
         schedule+0x99/0x210
         io_schedule+0x21/0x70
         do_read_cache_page+0xa57/0x1330
         read_cache_page+0x4a/0x70
         read_dev_sector+0xbf/0x380
         amiga_partition+0xc4/0x1230
         check_partition+0x30f/0x630
         rescan_partitions+0x19a/0x980
         __blkdev_get+0x85a/0x12f0
         blkdev_get+0x2a5/0x790
         __device_add_disk+0xe25/0x1250
         device_add_disk+0x13/0x20
         nvme_mpath_set_live+0x172/0x2b0
         nvme_update_ns_ana_state+0x130/0x180
         nvme_set_ns_ana_state+0x9a/0xb0
         nvme_parse_ana_log+0x1c3/0x4a0
         nvme_mpath_add_disk+0x157/0x290
         nvme_validate_ns+0x1017/0x1bd0
         nvme_scan_work+0x44d/0x6a0
         process_one_work+0x7d7/0x1240
         worker_thread+0x8e/0xff0
         kthread+0x2c3/0x3b0
         ret_from_fork+0x35/0x40
      
         INFO: task kworker/u4:1:1034 blocked for more than 120 seconds.
              Not tainted 5.2.0-rc6-vmlocalyes-00005-g808c8c2dc0cf #316
        "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        kworker/u4:1    D    0  1034      2 0x80004000
        Workqueue: nvme-delete-wq nvme_delete_ctrl_work
        Call Trace:
         __schedule+0x851/0x1400
         schedule+0x99/0x210
         schedule_timeout+0x390/0x830
         wait_for_completion+0x1a7/0x310
         __flush_work+0x241/0x5d0
         flush_work+0x10/0x20
         nvme_remove_namespaces+0x85/0x3d0
         nvme_do_delete_ctrl+0xb4/0x1e0
         nvme_delete_ctrl_work+0x15/0x20
         process_one_work+0x7d7/0x1240
         worker_thread+0x8e/0xff0
         kthread+0x2c3/0x3b0
         ret_from_fork+0x35/0x40
      --
      Reported-by: Logan Gunthorpe <logang@deltatee.com>
      Tested-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      0157ec8d
    • nvme-rdma: fix possible use-after-free in connect error flow · d94211b8
      Committed by Sagi Grimberg
      When start_queue fails, we need to make sure to drain the queue CQ
      before freeing the RDMA resources, because we might still race with
      the completion path. Have the start_queue() error path safely stop
      the queue.
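      
      A hedged sketch of the reworked error path, assuming the queue setup
      helpers of the nvme-rdma driver at the time (the
      __nvme_rdma_stop_queue() call is the point of the fix; the rest is
      context, not a verbatim diff):
      --
      static int nvme_rdma_start_queue(struct nvme_rdma_ctrl *ctrl, int idx)
      {
              struct nvme_rdma_queue *queue = &ctrl->queues[idx];
              int ret;
      
              if (idx)
                      ret = nvmf_connect_io_queue(&ctrl->ctrl, idx, false);
              else
                      ret = nvmf_connect_admin_queue(&ctrl->ctrl);
      
              if (!ret)
                      set_bit(NVME_RDMA_Q_LIVE, &queue->flags);
              else
                      /* stops and drains the queue so a late completion
                       * cannot race with the resource teardown */
                      __nvme_rdma_stop_queue(queue);
              return ret;
      }
      --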
      
      --
      [30371.808111] nvme nvme1: Failed reconnect attempt 11
      [30371.808113] nvme nvme1: Reconnecting in 10 seconds...
      [...]
      [30382.069315] nvme nvme1: creating 4 I/O queues.
      [30382.257058] nvme nvme1: Connect Invalid SQE Parameter, qid 4
      [30382.257061] nvme nvme1: failed to connect queue: 4 ret=386
      [30382.305001] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [30382.305022] IP: qedr_poll_cq+0x8a3/0x1170 [qedr]
      [30382.305028] PGD 0 P4D 0
      [30382.305037] Oops: 0000 [#1] SMP PTI
      [...]
      [30382.305153] Call Trace:
      [30382.305166]  ? __switch_to_asm+0x34/0x70
      [30382.305187]  __ib_process_cq+0x56/0xd0 [ib_core]
      [30382.305201]  ib_poll_handler+0x26/0x70 [ib_core]
      [30382.305213]  irq_poll_softirq+0x88/0x110
      [30382.305223]  ? sort_range+0x20/0x20
      [30382.305232]  __do_softirq+0xde/0x2c6
      [30382.305241]  ? sort_range+0x20/0x20
      [30382.305249]  run_ksoftirqd+0x1c/0x60
      [30382.305258]  smpboot_thread_fn+0xef/0x160
      [30382.305265]  kthread+0x113/0x130
      [30382.305273]  ? kthread_create_worker_on_cpu+0x50/0x50
      [30382.305281]  ret_from_fork+0x35/0x40
      --
      Reported-by: Nicolas Morey-Chaisemartin <NMoreyChaisemartin@suse.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      d94211b8
    • nvme: fix a possible deadlock when passthru commands sent to a multipath device · b9156dae
      Committed by Sagi Grimberg
      When the user issues a command with side effects, we will end up freezing
      the namespace request queue when updating disk info (and the same for
      the corresponding mpath disk node).
      
      However, we are not freezing the mpath node request queue,
      which means that mpath I/O can still come in and block on blk_queue_enter
      (called from nvme_ns_head_make_request -> direct_make_request).
      
      This is a deadlock, because blk_queue_enter will block until the inner
      namespace request queue is unfrozen, but that process is blocked because
      the namespace revalidation is trying to update the mpath disk info
      and freeze its request queue (which will never complete because
      of the I/O that is blocked on blk_queue_enter).
      
      Fix this by freezing all the subsystem nsheads request queues before
      executing the passthru command. Given that these commands are infrequent
      we should not worry about this temporary I/O freeze to keep things sane.
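      
      A sketch of the freeze helper, assuming the subsystem/ns_head layout
      of the driver at the time (nvme_mpath_start_freeze() and the
      matching unfreeze around the passthru command follow the upstream
      fix in spirit, not verbatim):
      --
      /* Freeze every ns_head queue in the subsystem so mpath I/O cannot
       * enter blk_queue_enter() while the inner ns queues are frozen. */
      static void nvme_mpath_start_freeze(struct nvme_subsystem *subsys)
      {
              struct nvme_ns_head *h;
      
              lockdep_assert_held(&subsys->lock);
              list_for_each_entry(h, &subsys->nsheads, entry)
                      if (h->disk)
                              blk_freeze_queue_start(h->disk->queue);
      }
      --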
      
      Here are the matching hang traces:
      --
      [ 374.465002] INFO: task systemd-udevd:17994 blocked for more than 122 seconds.
      [ 374.472975] Not tainted 5.2.0-rc3-mpdebug+ #42
      [ 374.478522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 374.487274] systemd-udevd D 0 17994 1 0x00000000
      [ 374.493407] Call Trace:
      [ 374.496145] __schedule+0x2ef/0x620
      [ 374.500047] schedule+0x38/0xa0
      [ 374.503569] blk_queue_enter+0x139/0x220
      [ 374.507959] ? remove_wait_queue+0x60/0x60
      [ 374.512540] direct_make_request+0x60/0x130
      [ 374.517219] nvme_ns_head_make_request+0x11d/0x420 [nvme_core]
      [ 374.523740] ? generic_make_request_checks+0x307/0x6f0
      [ 374.529484] generic_make_request+0x10d/0x2e0
      [ 374.534356] submit_bio+0x75/0x140
      [ 374.538163] ? guard_bio_eod+0x32/0xe0
      [ 374.542361] submit_bh_wbc+0x171/0x1b0
      [ 374.546553] block_read_full_page+0x1ed/0x330
      [ 374.551426] ? check_disk_change+0x70/0x70
      [ 374.556008] ? scan_shadow_nodes+0x30/0x30
      [ 374.560588] blkdev_readpage+0x18/0x20
      [ 374.564783] do_read_cache_page+0x301/0x860
      [ 374.569463] ? blkdev_writepages+0x10/0x10
      [ 374.574037] ? prep_new_page+0x88/0x130
      [ 374.578329] ? get_page_from_freelist+0xa2f/0x1280
      [ 374.583688] ? __alloc_pages_nodemask+0x179/0x320
      [ 374.588947] read_cache_page+0x12/0x20
      [ 374.593142] read_dev_sector+0x2d/0xd0
      [ 374.597337] read_lba+0x104/0x1f0
      [ 374.601046] find_valid_gpt+0xfa/0x720
      [ 374.605243] ? string_nocheck+0x58/0x70
      [ 374.609534] ? find_valid_gpt+0x720/0x720
      [ 374.614016] efi_partition+0x89/0x430
      [ 374.618113] ? string+0x48/0x60
      [ 374.621632] ? snprintf+0x49/0x70
      [ 374.625339] ? find_valid_gpt+0x720/0x720
      [ 374.629828] check_partition+0x116/0x210
      [ 374.634214] rescan_partitions+0xb6/0x360
      [ 374.638699] __blkdev_reread_part+0x64/0x70
      [ 374.643377] blkdev_reread_part+0x23/0x40
      [ 374.647860] blkdev_ioctl+0x48c/0x990
      [ 374.651956] block_ioctl+0x41/0x50
      [ 374.655766] do_vfs_ioctl+0xa7/0x600
      [ 374.659766] ? locks_lock_inode_wait+0xb1/0x150
      [ 374.664832] ksys_ioctl+0x67/0x90
      [ 374.668539] __x64_sys_ioctl+0x1a/0x20
      [ 374.672732] do_syscall_64+0x5a/0x1c0
      [ 374.676828] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      [ 374.738474] INFO: task nvmeadm:49141 blocked for more than 123 seconds.
      [ 374.745871] Not tainted 5.2.0-rc3-mpdebug+ #42
      [ 374.751419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 374.760170] nvmeadm D 0 49141 36333 0x00004080
      [ 374.766301] Call Trace:
      [ 374.769038] __schedule+0x2ef/0x620
      [ 374.772939] schedule+0x38/0xa0
      [ 374.776452] blk_mq_freeze_queue_wait+0x59/0x100
      [ 374.781614] ? remove_wait_queue+0x60/0x60
      [ 374.786192] blk_mq_freeze_queue+0x1a/0x20
      [ 374.790773] nvme_update_disk_info.isra.57+0x5f/0x350 [nvme_core]
      [ 374.797582] ? nvme_identify_ns.isra.50+0x71/0xc0 [nvme_core]
      [ 374.804006] __nvme_revalidate_disk+0xe5/0x110 [nvme_core]
      [ 374.810139] nvme_revalidate_disk+0xa6/0x120 [nvme_core]
      [ 374.816078] ? nvme_submit_user_cmd+0x11e/0x320 [nvme_core]
      [ 374.822299] nvme_user_cmd+0x264/0x370 [nvme_core]
      [ 374.827661] nvme_dev_ioctl+0x112/0x1d0 [nvme_core]
      [ 374.833114] do_vfs_ioctl+0xa7/0x600
      [ 374.837117] ? __audit_syscall_entry+0xdd/0x130
      [ 374.842184] ksys_ioctl+0x67/0x90
      [ 374.845891] __x64_sys_ioctl+0x1a/0x20
      [ 374.850082] do_syscall_64+0x5a/0x1c0
      [ 374.854178] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      --
      Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
      Tested-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      b9156dae
    • nvme-core: Fix extra device_put() call on error path · 8c36e66f
      Committed by Logan Gunthorpe
      In the error path for nvme_init_subsystem(), nvme_put_subsystem()
      will call put_device(), but it gets called again after the
      mutex_unlock().
      
      The put_device() only needs to be called if device_add() fails.
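      
      A sketch of the corrected error handling in nvme_init_subsystem(),
      under the assumption that the later error paths unwind through
      nvme_put_subsystem() after mutex_unlock() (context lines are
      illustrative):
      --
      ret = device_add(&subsys->dev);
      if (ret) {
              dev_err(ctrl->device, "failed to register subsystem device.\n");
              /* the one place that still needs an explicit put */
              put_device(&subsys->dev);
              goto out_unlock;
      }
      --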
      
      This bug caused a KASAN use-after-free error when adding and
      removing subsystems in a loop:
      
        BUG: KASAN: use-after-free in device_del+0x8d9/0x9a0
        Read of size 8 at addr ffff8883cdaf7120 by task multipathd/329
      
        CPU: 0 PID: 329 Comm: multipathd Not tainted 5.2.0-rc6-vmlocalyes-00019-g70a2b39005fd-dirty #314
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
        Call Trace:
         dump_stack+0x7b/0xb5
         print_address_description+0x6f/0x280
         ? device_del+0x8d9/0x9a0
         __kasan_report+0x148/0x199
         ? device_del+0x8d9/0x9a0
         ? class_release+0x100/0x130
         ? device_del+0x8d9/0x9a0
         kasan_report+0x12/0x20
         __asan_report_load8_noabort+0x14/0x20
         device_del+0x8d9/0x9a0
         ? device_platform_notify+0x70/0x70
         nvme_destroy_subsystem+0xf9/0x150
         nvme_free_ctrl+0x280/0x3a0
         device_release+0x72/0x1d0
         kobject_put+0x144/0x410
         put_device+0x13/0x20
         nvme_free_ns+0xc4/0x100
         nvme_release+0xb3/0xe0
         __blkdev_put+0x549/0x6e0
         ? kasan_check_write+0x14/0x20
         ? bd_set_size+0xb0/0xb0
         ? kasan_check_write+0x14/0x20
         ? mutex_lock+0x8f/0xe0
         ? __mutex_lock_slowpath+0x20/0x20
         ? locks_remove_file+0x239/0x370
         blkdev_put+0x72/0x2c0
         blkdev_close+0x8d/0xd0
         __fput+0x256/0x770
         ? _raw_read_lock_irq+0x40/0x40
         ____fput+0xe/0x10
         task_work_run+0x10c/0x180
         ? filp_close+0xf7/0x140
         exit_to_usermode_loop+0x151/0x170
         do_syscall_64+0x240/0x2e0
         ? prepare_exit_to_usermode+0xd5/0x190
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7f5a79af05d7
        Code: 00 00 0f 05 48 3d 00 f0 ff ff 77 3f c3 66 0f 1f 44 00 00 53 89 fb 48 83 ec 10 e8 c4 fb ff ff 89 df 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8 06 fc ff ff 8b 44 24
        RSP: 002b:00007f5a7799c810 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
        RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007f5a79af05d7
        RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000008
        RBP: 00007f5a58000f98 R08: 0000000000000002 R09: 00007f5a7935ee80
        R10: 0000000000000000 R11: 0000000000000293 R12: 000055e432447240
        R13: 0000000000000000 R14: 0000000000000001 R15: 000055e4324a9cf0
      
        Allocated by task 1236:
         save_stack+0x21/0x80
         __kasan_kmalloc.constprop.6+0xab/0xe0
         kasan_kmalloc+0x9/0x10
         kmem_cache_alloc_trace+0x102/0x210
         nvme_init_identify+0x13c3/0x3820
         nvme_loop_configure_admin_queue+0x4fa/0x5e0
         nvme_loop_create_ctrl+0x469/0xf40
         nvmf_dev_write+0x19a3/0x21ab
         __vfs_write+0x66/0x120
         vfs_write+0x154/0x490
         ksys_write+0x104/0x240
         __x64_sys_write+0x73/0xb0
         do_syscall_64+0xa5/0x2e0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        Freed by task 329:
         save_stack+0x21/0x80
         __kasan_slab_free+0x129/0x190
         kasan_slab_free+0xe/0x10
         kfree+0xa7/0x200
         nvme_release_subsystem+0x49/0x60
         device_release+0x72/0x1d0
         kobject_put+0x144/0x410
         put_device+0x13/0x20
         klist_class_dev_put+0x31/0x40
         klist_put+0x8f/0xf0
         klist_del+0xe/0x10
         device_del+0x3a7/0x9a0
         nvme_destroy_subsystem+0xf9/0x150
         nvme_free_ctrl+0x280/0x3a0
         device_release+0x72/0x1d0
         kobject_put+0x144/0x410
         put_device+0x13/0x20
         nvme_free_ns+0xc4/0x100
         nvme_release+0xb3/0xe0
         __blkdev_put+0x549/0x6e0
         blkdev_put+0x72/0x2c0
         blkdev_close+0x8d/0xd0
         __fput+0x256/0x770
         ____fput+0xe/0x10
         task_work_run+0x10c/0x180
         exit_to_usermode_loop+0x151/0x170
         do_syscall_64+0x240/0x2e0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 32fd90c4 ("nvme: change locking for the per-subsystem controller list")
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      8c36e66f
  2. 30 Jul 2019, 1 commit
  3. 23 Jul 2019, 4 commits
    • Revert "nvme-pci: don't create a read hctx mapping without read queues" · 8fe34be1
      Committed by yangerkun
      This reverts commit 0298d543.
      
      With that patch applied, setting poll_queues > hard queues leads to
      nr_read_queues = 0 in nvme_calc_irq_sets. The poll_queues setting can
      then fail, since dev->tagset.nr_maps equals 2 and nvme_pci_map_queues
      will not map the poll queues.
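      
      As I read the revert, the tag-set mapping setup in nvme_dev_add()
      goes back to an unconditional read map (a sketch, not the verbatim
      diff):
      --
      dev->tagset.nr_maps = 2; /* default + read */
      if (dev->io_queues[HCTX_TYPE_POLL])
              dev->tagset.nr_maps++;
      --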
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      8fe34be1
    • nvme: fix multipath crash when ANA is deactivated · 66b20ac0
      Committed by Marta Rybczynska
      Fix a crash with multipath activated. It happens when the ANA log
      page is larger than MDTS and, because of that, ANA is disabled.
      The driver then tries to access an unallocated buffer when connecting
      to an NVMe target. The signature is as follows:
      
      [  300.433586] nvme nvme0: ANA log page size (8208) larger than MDTS (8192).
      [  300.435387] nvme nvme0: disabling ANA support.
      [  300.437835] nvme nvme0: creating 4 I/O queues.
      [  300.459132] nvme nvme0: new ctrl: NQN "nqn.0.0.0", addr 10.91.0.1:8009
      [  300.464609] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [  300.466342] #PF error: [normal kernel read fault]
      [  300.467385] PGD 0 P4D 0
      [  300.467987] Oops: 0000 [#1] SMP PTI
      [  300.468787] CPU: 3 PID: 50 Comm: kworker/u8:1 Not tainted 5.0.20kalray+ #4
      [  300.470264] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  300.471532] Workqueue: nvme-wq nvme_scan_work [nvme_core]
      [  300.472724] RIP: 0010:nvme_parse_ana_log+0x21/0x140 [nvme_core]
      [  300.474038] Code: 45 01 d2 d8 48 98 c3 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 08 48 8b af 20 0a 00 00 48 89 34 24 <66> 83 7d 08 00 0f 84 c6 00 00 00 44 8b 7d 14 49 89 d5 8b 55 10 48
      [  300.477374] RSP: 0018:ffffa50e80fd7cb8 EFLAGS: 00010296
      [  300.478334] RAX: 0000000000000001 RBX: ffff9130f1872258 RCX: 0000000000000000
      [  300.479784] RDX: ffffffffc06c4c30 RSI: ffff9130edad4280 RDI: ffff9130f1872258
      [  300.481488] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
      [  300.483203] R10: 0000000000000220 R11: 0000000000000040 R12: ffff9130f18722c0
      [  300.484928] R13: ffff9130f18722d0 R14: ffff9130edad4280 R15: ffff9130f18722c0
      [  300.486626] FS:  0000000000000000(0000) GS:ffff9130f7b80000(0000) knlGS:0000000000000000
      [  300.488538] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  300.489907] CR2: 0000000000000008 CR3: 00000002365e6000 CR4: 00000000000006e0
      [  300.491612] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  300.493303] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  300.494991] Call Trace:
      [  300.495645]  nvme_mpath_add_disk+0x5c/0xb0 [nvme_core]
      [  300.496880]  nvme_validate_ns+0x2ef/0x550 [nvme_core]
      [  300.498105]  ? nvme_identify_ctrl.isra.45+0x6a/0xb0 [nvme_core]
      [  300.499539]  nvme_scan_work+0x2b4/0x370 [nvme_core]
      [  300.500717]  ? __switch_to_asm+0x35/0x70
      [  300.501663]  process_one_work+0x171/0x380
      [  300.502340]  worker_thread+0x49/0x3f0
      [  300.503079]  kthread+0xf8/0x130
      [  300.503795]  ? max_active_store+0x80/0x80
      [  300.504690]  ? kthread_bind+0x10/0x10
      [  300.505502]  ret_from_fork+0x35/0x40
      [  300.506280] Modules linked in: nvme_tcp nvme_rdma rdma_cm iw_cm ib_cm ib_core nvme_fabrics nvme_core xt_physdev ip6table_raw ip6table_mangle ip6table_filter ip6_tables xt_comment iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_CHECKSUM iptable_mangle iptable_filter veth ebtable_filter ebtable_nat ebtables iptable_raw vxlan ip6_udp_tunnel udp_tunnel sunrpc joydev pcspkr virtio_balloon br_netfilter bridge stp llc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net virtio_console net_failover virtio_blk failover ata_piix serio_raw libata virtio_pci virtio_ring virtio
      [  300.514984] CR2: 0000000000000008
      [  300.515569] ---[ end trace faa2eefad7e7f218 ]---
      [  300.516354] RIP: 0010:nvme_parse_ana_log+0x21/0x140 [nvme_core]
      [  300.517330] Code: 45 01 d2 d8 48 98 c3 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 08 48 8b af 20 0a 00 00 48 89 34 24 <66> 83 7d 08 00 0f 84 c6 00 00 00 44 8b 7d 14 49 89 d5 8b 55 10 48
      [  300.520353] RSP: 0018:ffffa50e80fd7cb8 EFLAGS: 00010296
      [  300.521229] RAX: 0000000000000001 RBX: ffff9130f1872258 RCX: 0000000000000000
      [  300.522399] RDX: ffffffffc06c4c30 RSI: ffff9130edad4280 RDI: ffff9130f1872258
      [  300.523560] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
      [  300.524734] R10: 0000000000000220 R11: 0000000000000040 R12: ffff9130f18722c0
      [  300.525915] R13: ffff9130f18722d0 R14: ffff9130edad4280 R15: ffff9130f18722c0
      [  300.527084] FS:  0000000000000000(0000) GS:ffff9130f7b80000(0000) knlGS:0000000000000000
      [  300.528396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  300.529440] CR2: 0000000000000008 CR3: 00000002365e6000 CR4: 00000000000006e0
      [  300.530739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  300.531989] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  300.533264] Kernel panic - not syncing: Fatal exception
      [  300.534338] Kernel Offset: 0x17c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [  300.536227] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Condition check refactoring from Christoph Hellwig.
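      
      A sketch of the refactored check, assuming ANA usability is keyed
      off the log buffer actually having been allocated (field name from
      the driver of that era):
      --
      /* When the ANA log page exceeded MDTS, ana_log_buf was never
       * allocated, so every ANA code path must be skipped. */
      static inline bool nvme_ctrl_use_ana(struct nvme_ctrl *ctrl)
      {
              return ctrl->ana_log_buf != NULL;
      }
      --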
      Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
      Tested-by: Jean-Baptiste Riaux <jbriaux@kalray.eu>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      66b20ac0
    • nvme: fix memory leak caused by incorrect subsystem free · e654dfd3
      Committed by Logan Gunthorpe
      When freeing the subsystem after finding another match with
      __nvme_find_get_subsystem(), use put_device() instead of
      __nvme_release_subsystem() which calls kfree() directly.
      
      Per the documentation, put_device() should always be used after
      device_initialize() is called. Otherwise, leaks like the one below,
      which was detected by kmemleak, may occur.
      
      Once the call of __nvme_release_subsystem() is removed it no
      longer makes sense to keep the helper, so fold it back
      into nvme_release_subsystem().
      
      unreferenced object 0xffff8883d12bfbc0 (size 16):
        comm "nvme", pid 2635, jiffies 4294933602 (age 739.952s)
        hex dump (first 16 bytes):
          6e 76 6d 65 2d 73 75 62 73 79 73 32 00 88 ff ff  nvme-subsys2....
        backtrace:
          [<000000007d8fc208>] __kmalloc_track_caller+0x16d/0x2a0
          [<0000000081169e5f>] kvasprintf+0xad/0x130
          [<0000000025626f25>] kvasprintf_const+0x47/0x120
          [<00000000fa66ad36>] kobject_set_name_vargs+0x44/0x120
          [<000000004881f8b3>] dev_set_name+0x98/0xc0
          [<000000007124dae3>] nvme_init_identify+0x1995/0x38e0
          [<000000009315020a>] nvme_loop_configure_admin_queue+0x4fa/0x5e0
          [<000000001a63e766>] nvme_loop_create_ctrl+0x489/0xf80
          [<00000000a46ecc23>] nvmf_dev_write+0x1a12/0x2220
          [<000000002259b3d5>] __vfs_write+0x66/0x120
          [<000000002f6df81e>] vfs_write+0x154/0x490
          [<000000007e8cfc19>] ksys_write+0x10a/0x240
          [<00000000ff5c7b85>] __x64_sys_write+0x73/0xb0
          [<00000000fee6d692>] do_syscall_64+0xaa/0x470
          [<00000000997e1ede>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
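      
      A sketch of the corrected duplicate-subsystem path, assuming the
      surrounding code of nvme_init_subsystem() (context lines are
      illustrative, not a verbatim diff):
      --
      mutex_lock(&nvme_subsystems_lock);
      found = __nvme_find_get_subsystem(subsys->subnqn);
      if (found) {
              /* release through the driver core so the dev_set_name()
               * allocation is freed too; was __nvme_release_subsystem() */
              put_device(&subsys->dev);
              subsys = found;
      }
      --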
      
      Fixes: ab9e00cc ("nvme: track subsystems")
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      e654dfd3
    • nvme: ignore subnqn for ADATA SX6000LNP · 08b903b5
      Committed by Misha Nasledov
      The ADATA SX6000LNP NVMe SSDs all report the same subnqn, so a
      system with more than one of these SSDs will only have one usable.
      
      [ 0.942706] nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).
      [ 0.943017] nvme nvme1: Removing after probe failure status: -22
      
      02:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
      71:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
      
      There are no firmware updates available from the vendor, unfortunately.
      Applying the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk for these SSDs resolves
      the issue, and they all work after this patch:
      
      /dev/nvme0n1     2J1120050420         ADATA SX6000LNP [...]
      /dev/nvme1n1     2J1120050540         ADATA SX6000LNP [...]
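      
      A sketch of the quirk entry, assuming the usual nvme_id_table form
      in drivers/nvme/host/pci.c (vendor/device IDs taken from the lspci
      output above):
      --
      { PCI_DEVICE(0x10ec, 0x5762),   /* ADATA SX6000LNP */
              .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
      --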
      Signed-off-by: Misha Nasledov <misha@nasledov.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      08b903b5
  4. 12 Jul 2019, 1 commit
    • nvme: fix NULL deref for fabrics options · 7d30c81b
      Committed by Minwoo Im
      The git://git.infradead.org/nvme.git nvme-5.3 branch now causes the
      following NULL pointer dereference oops. Check ctrl->opts before
      dereferencing it.
      
      [   16.337581] BUG: kernel NULL pointer dereference, address: 0000000000000056
      [   16.338551] #PF: supervisor read access in kernel mode
      [   16.338551] #PF: error_code(0x0000) - not-present page
      [   16.338551] PGD 0 P4D 0
      [   16.338551] Oops: 0000 [#1] SMP PTI
      [   16.338551] CPU: 2 PID: 1035 Comm: kworker/u16:5 Not tainted 5.2.0-rc6+ #1
      [   16.338551] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
      [   16.338551] Workqueue: nvme-wq nvme_scan_work [nvme_core]
      [   16.338551] RIP: 0010:nvme_validate_ns+0xc9/0x7e0 [nvme_core]
      [   16.338551] Code: c0 49 89 c5 0f 84 00 07 00 00 48 8b 7b 58 e8 be 48 39 c1 48 3d 00 f0 ff ff 49 89 45 18 0f 87 a4 06 00 00 48 8b 93 70 0a 00 00 <80> 7a 56 00 74 0c 48 8b 40 68 83 48 3c 08 49 8b 45 18 48 89 c6 bf
      [   16.338551] RSP: 0018:ffffc900024c7d10 EFLAGS: 00010283
      [   16.338551] RAX: ffff888135a30720 RBX: ffff88813a4fd1f8 RCX: 0000000000000007
      [   16.338551] RDX: 0000000000000000 RSI: ffffffff8256dd38 RDI: ffff888135a30720
      [   16.338551] RBP: 0000000000000001 R08: 0000000000000007 R09: ffff88813aa6a840
      [   16.338551] R10: 0000000000000001 R11: 000000000002d060 R12: ffff88813a4fd1f8
      [   16.338551] R13: ffff88813a77f800 R14: ffff88813aa35180 R15: 0000000000000001
      [   16.338551] FS:  0000000000000000(0000) GS:ffff88813ba80000(0000) knlGS:0000000000000000
      [   16.338551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   16.338551] CR2: 0000000000000056 CR3: 000000000240a002 CR4: 0000000000360ee0
      [   16.338551] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   16.338551] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   16.338551] Call Trace:
      [   16.338551]  nvme_scan_work+0x2c0/0x340 [nvme_core]
      [   16.338551]  ? __switch_to_asm+0x40/0x70
      [   16.338551]  ? _raw_spin_unlock_irqrestore+0x18/0x30
      [   16.338551]  ? try_to_wake_up+0x408/0x450
      [   16.338551]  process_one_work+0x20b/0x3e0
      [   16.338551]  worker_thread+0x1f9/0x3d0
      [   16.338551]  ? cancel_delayed_work+0xa0/0xa0
      [   16.338551]  kthread+0x117/0x120
      [   16.338551]  ? kthread_stop+0xf0/0xf0
      [   16.338551]  ret_from_fork+0x3a/0x50
      [   16.338551] Modules linked in: nvme nvme_core
      [   16.338551] CR2: 0000000000000056
      [   16.338551] ---[ end trace b9bf761a93e62d84 ]---
      [   16.338551] RIP: 0010:nvme_validate_ns+0xc9/0x7e0 [nvme_core]
      [   16.338551] Code: c0 49 89 c5 0f 84 00 07 00 00 48 8b 7b 58 e8 be 48 39 c1 48 3d 00 f0 ff ff 49 89 45 18 0f 87 a4 06 00 00 48 8b 93 70 0a 00 00 <80> 7a 56 00 74 0c 48 8b 40 68 83 48 3c 08 49 8b 45 18 48 89 c6 bf
      [   16.338551] RSP: 0018:ffffc900024c7d10 EFLAGS: 00010283
      [   16.338551] RAX: ffff888135a30720 RBX: ffff88813a4fd1f8 RCX: 0000000000000007
      [   16.338551] RDX: 0000000000000000 RSI: ffffffff8256dd38 RDI: ffff888135a30720
      [   16.338551] RBP: 0000000000000001 R08: 0000000000000007 R09: ffff88813aa6a840
      [   16.338551] R10: 0000000000000001 R11: 000000000002d060 R12: ffff88813a4fd1f8
      [   16.338551] R13: ffff88813a77f800 R14: ffff88813aa35180 R15: 0000000000000001
      [   16.338551] FS:  0000000000000000(0000) GS:ffff88813ba80000(0000) knlGS:0000000000000000
      [   16.338551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   16.338551] CR2: 0000000000000056 CR3: 000000000240a002 CR4: 0000000000360ee0
      [   16.338551] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   16.338551] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
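      
      A sketch of the guard, assuming the data-digest check introduced by
      the commit named in the Fixes tag below is the offending deref:
      --
      /* ctrl->opts is only set for fabrics controllers; PCIe ones
       * leave it NULL, so check it before the deref. */
      if (ns->ctrl->opts && ns->ctrl->opts->data_digest)
              ns->queue->backing_dev_info->capabilities |=
                      BDI_CAP_STABLE_WRITES;
      --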
      
      Fixes: 958f2a0f ("nvme-tcp: set the STABLE_WRITES flag when data digests are enabled")
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7d30c81b
  5. 11 Jul 2019, 1 commit
  6. 10 Jul 2019, 14 commits
  7. 24 Jun 2019, 1 commit
  8. 21 Jun 2019, 14 commits