1. 02 9月, 2020 1 次提交
  2. 29 6月, 2020 1 次提交
    • J
      nvme: implement mq_ops->commit_rqs() hook · a2724bab
      Jens Axboe 提交于
      fix #28871358
      
      commit 04f3eafda6e05adc56afed4d3ae6e24aaa429058 upstream
      
      Split the command submission and the SQ doorbell ring, and add the
      doorbell ring as our ->commit_rqs() hook. This allows a list of
      requests to be issued, with nvme only writing the SQ update when
      it's necessary. This is more efficient if we have lists of requests
      to issue, particularly on virtualized hardware, where writing the
      SQ doorbell is more expensive than on real hardware. For those cases,
      performance increases of 2-3x have been observed.
      
      The use case for this is plugged IO, where blk-mq flushes a batch of
      requests at the time.
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      a2724bab
  3. 15 6月, 2020 2 次提交
  4. 18 3月, 2020 1 次提交
  5. 17 1月, 2020 4 次提交
  6. 15 1月, 2020 1 次提交
  7. 13 12月, 2019 1 次提交
  8. 05 12月, 2019 2 次提交
  9. 01 12月, 2019 5 次提交
  10. 24 11月, 2019 1 次提交
  11. 13 11月, 2019 1 次提交
    • A
      nvme-multipath: fix possible io hang after ctrl reconnect · 6376736d
      Anton Eidelman 提交于
      [ Upstream commit af8fd0424713a2adb812d10d55e86718152cf656 ]
      
      The following scenario results in an IO hang:
      1) ctrl completes a request with NVME_SC_ANA_TRANSITION.
         NVME_NS_ANA_PENDING bit in ns->flags is set and ana_work is triggered.
      2) ana_work: nvme_read_ana_log() tries to get the ANA log page from the ctrl.
         This fails because ctrl disconnects.
         Therefore nvme_update_ns_ana_state() is not called
         and NVME_NS_ANA_PENDING bit in ns->flags is not cleared.
      3) ctrl reconnects: nvme_mpath_init(ctrl,...) calls
         nvme_read_ana_log(ctrl, groups_only=true).
         However, nvme_update_ana_state() does not update namespaces
         because nr_nsids = 0 (due to groups_only mode).
      4) scan_work calls nvme_validate_ns() finds the ns and re-validates OK.
      
      Result:
      The ctrl is now live but NVME_NS_ANA_PENDING bit in ns->flags is still set.
      Consequently ctrl will never be considered a viable path by __nvme_find_path().
      IO will hang if ctrl is the only or the last path to the namespace.
      
      More generally, while ctrl is reconnecting, its ANA state may change.
      And because nvme_mpath_init() requests ANA log in groups_only mode,
      these changes are not propagated to the existing ctrl namespaces.
      This may result in a mal-function or an IO hang.
      
      Solution:
      nvme_mpath_init() will nvme_read_ana_log() with groups_only set to false.
      This will not harm the new ctrl case (no namespaces present),
      and will make sure the ANA state of namespaces gets updated after reconnect.
      
      Note: Another option would be for nvme_mpath_init() to invoke
      nvme_parse_ana_log(..., nvme_set_ns_ana_state) for each existing namespace.
      Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: NKeith Busch <kbusch@kernel.org>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      6376736d
  12. 29 10月, 2019 1 次提交
  13. 05 10月, 2019 2 次提交
    • A
      nvme-multipath: fix ana log nsid lookup when nsid is not found · ad58ce6c
      Anton Eidelman 提交于
      [ Upstream commit e01f91dff91c7b16a6e3faf2565017d497a73f83 ]
      
      ANA log parsing invokes nvme_update_ana_state() per ANA group desc.
      This updates the state of namespaces with nsids in desc->nsids[].
      
      Both ctrl->namespaces list and desc->nsids[] array are sorted by nsid.
      Hence nvme_update_ana_state() performs a single walk over ctrl->namespaces:
      - if current namespace matches the current desc->nsids[n],
        this namespace is updated, and n is incremented.
      - the process stops when it encounters the end of either
        ctrl->namespaces end or desc->nsids[]
      
      In case desc->nsids[n] does not match any of ctrl->namespaces,
      the remaining nsids following desc->nsids[n] will not be updated.
      Such situation was considered abnormal and generated WARN_ON_ONCE.
      
      However ANA log MAY contain nsids not (yet) found in ctrl->namespaces.
      For example, lets consider the following scenario:
      - nvme0 exposes namespaces with nsids = [2, 3] to the host
      - a new namespace nsid = 1 is added dynamically
      - also, a ANA topology change is triggered
      - NS_CHANGED aen is generated and triggers scan_work
      - before scan_work discovers nsid=1 and creates a namespace, a NOTICE_ANA
        aen was issues and ana_work receives ANA log with nsids=[1, 2, 3]
      
      Result: ana_work fails to update ANA state on existing namespaces [2, 3]
      
      Solution:
      Change the way nvme_update_ana_state() namespace list walk
      checks the current namespace against desc->nsids[n] as follows:
      a) ns->head->ns_id < desc->nsids[n]: keep walking ctrl->namespaces.
      b) ns->head->ns_id == desc->nsids[n]: match, update the namespace
      c) ns->head->ns_id >= desc->nsids[n]: skip to desc->nsids[n+1]
      
      This enables correct operation in the scenario described above.
      This also allows ANA log to contain nsids currently invisible
      to the host, i.e. inactive nsids.
      Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
      Reviewed-by: NJames Smart <james.smart@broadcom.com>
      Reviewed-by: NHannes Reinecke <hare@suse.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      ad58ce6c
    • T
      nvmet: fix data units read and written counters in SMART log · 9edc229b
      Tom Wu 提交于
      [ Upstream commit 3bec2e3754becebd4c452999adb49bc62c575ea4 ]
      
      In nvme spec 1.3 there is a definition for data write/read counters
      from SMART log, (See section 5.14.1.2):
      	This value is reported in thousands (i.e., a value of 1
      	corresponds to 1000 units of 512 bytes read) and is rounded up.
      
      However, in nvme target where value is reported with actual units,
      but not thousands of units as the spec requires.
      Signed-off-by: NTom Wu <tomwu@mellanox.com>
      Reviewed-by: NIsrael Rukshin <israelr@mellanox.com>
      Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
      Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      9edc229b
  14. 16 9月, 2019 1 次提交
  15. 10 9月, 2019 1 次提交
    • A
      nvme-multipath: fix possible I/O hang when paths are updated · 5e416b11
      Anton Eidelman 提交于
      [ Upstream commit 504db087aaccdb32af61539916409f7dca31ceb5 ]
      
      nvme_state_set_live() making a path available triggers requeue_work
      in order to resubmit requests that ended up on requeue_list when no
      paths were available.
      
      This requeue_work may race with concurrent nvme_ns_head_make_request()
      that do not observe the live path yet.
      Such concurrent requests may by made by either:
      - New IO submission.
      - Requeue_work triggered by nvme_failover_req() or another ana_work.
      
      A race may cause requeue_work capture the state of requeue_list before
      more requests get onto the list. These requests will stay on the list
      forever unless requeue_work is triggered again.
      
      In order to prevent such race, nvme_state_set_live() should
      synchronize_srcu(&head->srcu) before triggering the requeue_work and
      prevent nvme_ns_head_make_request referencing an old snapshot of the
      path list.
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      5e416b11
  16. 06 9月, 2019 4 次提交
    • K
      nvme-pci: Fix async probe remove race · 4a982919
      Keith Busch 提交于
      [ Upstream commit bd46a90634302bfe791e93ad5496f98f165f7ae0 ]
      
      Ensure the controller is not in the NEW state when nvme_probe() exits.
      This will always allow a subsequent nvme_remove() to set the state to
      DELETING, fixing a potential race between the initial asynchronous probe
      and device removal.
      Reported-by: NLi Zhong <lizhongfs@gmail.com>
      Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NKeith Busch <kbusch@kernel.org>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      4a982919
    • S
      nvme: fix a possible deadlock when passthru commands sent to a multipath device · 431f579a
      Sagi Grimberg 提交于
      [ Upstream commit b9156daeb1601d69007b7e50efcf89d69d72ec1d ]
      
      When the user issues a command with side effects, we will end up freezing
      the namespace request queue when updating disk info (and the same for
      the corresponding mpath disk node).
      
      However, we are not freezing the mpath node request queue,
      which means that mpath I/O can still come in and block on blk_queue_enter
      (called from nvme_ns_head_make_request -> direct_make_request).
      
      This is a deadlock, because blk_queue_enter will block until the inner
      namespace request queue is unfroze, but that process is blocked because
      the namespace revalidation is trying to update the mpath disk info
      and freeze its request queue (which will never complete because
      of the I/O that is blocked on blk_queue_enter).
      
      Fix this by freezing all the subsystem nsheads request queues before
      executing the passthru command. Given that these commands are infrequent
      we should not worry about this temporary I/O freeze to keep things sane.
      
      Here is the matching hang traces:
      --
      [ 374.465002] INFO: task systemd-udevd:17994 blocked for more than 122 seconds.
      [ 374.472975] Not tainted 5.2.0-rc3-mpdebug+ #42
      [ 374.478522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 374.487274] systemd-udevd D 0 17994 1 0x00000000
      [ 374.493407] Call Trace:
      [ 374.496145] __schedule+0x2ef/0x620
      [ 374.500047] schedule+0x38/0xa0
      [ 374.503569] blk_queue_enter+0x139/0x220
      [ 374.507959] ? remove_wait_queue+0x60/0x60
      [ 374.512540] direct_make_request+0x60/0x130
      [ 374.517219] nvme_ns_head_make_request+0x11d/0x420 [nvme_core]
      [ 374.523740] ? generic_make_request_checks+0x307/0x6f0
      [ 374.529484] generic_make_request+0x10d/0x2e0
      [ 374.534356] submit_bio+0x75/0x140
      [ 374.538163] ? guard_bio_eod+0x32/0xe0
      [ 374.542361] submit_bh_wbc+0x171/0x1b0
      [ 374.546553] block_read_full_page+0x1ed/0x330
      [ 374.551426] ? check_disk_change+0x70/0x70
      [ 374.556008] ? scan_shadow_nodes+0x30/0x30
      [ 374.560588] blkdev_readpage+0x18/0x20
      [ 374.564783] do_read_cache_page+0x301/0x860
      [ 374.569463] ? blkdev_writepages+0x10/0x10
      [ 374.574037] ? prep_new_page+0x88/0x130
      [ 374.578329] ? get_page_from_freelist+0xa2f/0x1280
      [ 374.583688] ? __alloc_pages_nodemask+0x179/0x320
      [ 374.588947] read_cache_page+0x12/0x20
      [ 374.593142] read_dev_sector+0x2d/0xd0
      [ 374.597337] read_lba+0x104/0x1f0
      [ 374.601046] find_valid_gpt+0xfa/0x720
      [ 374.605243] ? string_nocheck+0x58/0x70
      [ 374.609534] ? find_valid_gpt+0x720/0x720
      [ 374.614016] efi_partition+0x89/0x430
      [ 374.618113] ? string+0x48/0x60
      [ 374.621632] ? snprintf+0x49/0x70
      [ 374.625339] ? find_valid_gpt+0x720/0x720
      [ 374.629828] check_partition+0x116/0x210
      [ 374.634214] rescan_partitions+0xb6/0x360
      [ 374.638699] __blkdev_reread_part+0x64/0x70
      [ 374.643377] blkdev_reread_part+0x23/0x40
      [ 374.647860] blkdev_ioctl+0x48c/0x990
      [ 374.651956] block_ioctl+0x41/0x50
      [ 374.655766] do_vfs_ioctl+0xa7/0x600
      [ 374.659766] ? locks_lock_inode_wait+0xb1/0x150
      [ 374.664832] ksys_ioctl+0x67/0x90
      [ 374.668539] __x64_sys_ioctl+0x1a/0x20
      [ 374.672732] do_syscall_64+0x5a/0x1c0
      [ 374.676828] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      [ 374.738474] INFO: task nvmeadm:49141 blocked for more than 123 seconds.
      [ 374.745871] Not tainted 5.2.0-rc3-mpdebug+ #42
      [ 374.751419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 374.760170] nvmeadm D 0 49141 36333 0x00004080
      [ 374.766301] Call Trace:
      [ 374.769038] __schedule+0x2ef/0x620
      [ 374.772939] schedule+0x38/0xa0
      [ 374.776452] blk_mq_freeze_queue_wait+0x59/0x100
      [ 374.781614] ? remove_wait_queue+0x60/0x60
      [ 374.786192] blk_mq_freeze_queue+0x1a/0x20
      [ 374.790773] nvme_update_disk_info.isra.57+0x5f/0x350 [nvme_core]
      [ 374.797582] ? nvme_identify_ns.isra.50+0x71/0xc0 [nvme_core]
      [ 374.804006] __nvme_revalidate_disk+0xe5/0x110 [nvme_core]
      [ 374.810139] nvme_revalidate_disk+0xa6/0x120 [nvme_core]
      [ 374.816078] ? nvme_submit_user_cmd+0x11e/0x320 [nvme_core]
      [ 374.822299] nvme_user_cmd+0x264/0x370 [nvme_core]
      [ 374.827661] nvme_dev_ioctl+0x112/0x1d0 [nvme_core]
      [ 374.833114] do_vfs_ioctl+0xa7/0x600
      [ 374.837117] ? __audit_syscall_entry+0xdd/0x130
      [ 374.842184] ksys_ioctl+0x67/0x90
      [ 374.845891] __x64_sys_ioctl+0x1a/0x20
      [ 374.850082] do_syscall_64+0x5a/0x1c0
      [ 374.854178] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      --
      Reported-by: NJames Puthukattukaran <james.puthukattukaran@oracle.com>
      Tested-by: NJames Puthukattukaran <james.puthukattukaran@oracle.com>
      Reviewed-by: NKeith Busch <kbusch@kernel.org>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      431f579a
    • L
      nvmet-loop: Flush nvme_delete_wq when removing the port · 32c0b8f1
      Logan Gunthorpe 提交于
      [ Upstream commit 86b9a63e595ff03f9d0a7b92b6acc231fecefc29 ]
      
      After calling nvme_loop_delete_ctrl(), the controllers will not
      yet be deleted because nvme_delete_ctrl() only schedules work
      to do the delete.
      
      This means a race can occur if a port is removed but there
      are still active controllers trying to access that memory.
      
      To fix this, flush the nvme_delete_wq before returning from
      nvme_loop_remove_port() so that any controllers that might
      be in the process of being deleted won't access a freed port.
      Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
      Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
      Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      32c0b8f1
    • A
      nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns · 7436dc2a
      Anthony Iliopoulos 提交于
      [ Upstream commit fab7772bfbcfe8fb8e3e352a6a8fcaf044cded17 ]
      
      When CONFIG_NVME_MULTIPATH is set, only the hidden gendisk associated
      with the per-controller ns is run through revalidate_disk when a
      rescan is triggered, while the visible blockdev never gets its size
      (bdev->bd_inode->i_size) updated to reflect any capacity changes that
      may have occurred.
      
      This prevents online resizing of nvme block devices and in extension of
      any filesystems atop that will are unable to expand while mounted, as
      userspace relies on the blockdev size for obtaining the disk capacity
      (via BLKGETSIZE/64 ioctls).
      
      Fix this by explicitly revalidating the actual namespace gendisk in
      addition to the per-controller gendisk, when multipath is enabled.
      Signed-off-by: NAnthony Iliopoulos <ailiopoulos@suse.com>
      Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      7436dc2a
  17. 16 8月, 2019 1 次提交
    • M
      nvme: fix multipath crash when ANA is deactivated · bdce5621
      Marta Rybczynska 提交于
      [ Upstream commit 66b20ac0a1a10769d059d6903202f53494e3d902 ]
      
      Fix a crash with multipath activated. It happends when ANA log
      page is larger than MDTS and because of that ANA is disabled.
      The driver then tries to access unallocated buffer when connecting
      to a nvme target. The signature is as follows:
      
      [  300.433586] nvme nvme0: ANA log page size (8208) larger than MDTS (8192).
      [  300.435387] nvme nvme0: disabling ANA support.
      [  300.437835] nvme nvme0: creating 4 I/O queues.
      [  300.459132] nvme nvme0: new ctrl: NQN "nqn.0.0.0", addr 10.91.0.1:8009
      [  300.464609] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [  300.466342] #PF error: [normal kernel read fault]
      [  300.467385] PGD 0 P4D 0
      [  300.467987] Oops: 0000 [#1] SMP PTI
      [  300.468787] CPU: 3 PID: 50 Comm: kworker/u8:1 Not tainted 5.0.20kalray+ #4
      [  300.470264] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  300.471532] Workqueue: nvme-wq nvme_scan_work [nvme_core]
      [  300.472724] RIP: 0010:nvme_parse_ana_log+0x21/0x140 [nvme_core]
      [  300.474038] Code: 45 01 d2 d8 48 98 c3 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 08 48 8b af 20 0a 00 00 48 89 34 24 <66> 83 7d 08 00 0f 84 c6 00 00 00 44 8b 7d 14 49 89 d5 8b 55 10 48
      [  300.477374] RSP: 0018:ffffa50e80fd7cb8 EFLAGS: 00010296
      [  300.478334] RAX: 0000000000000001 RBX: ffff9130f1872258 RCX: 0000000000000000
      [  300.479784] RDX: ffffffffc06c4c30 RSI: ffff9130edad4280 RDI: ffff9130f1872258
      [  300.481488] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
      [  300.483203] R10: 0000000000000220 R11: 0000000000000040 R12: ffff9130f18722c0
      [  300.484928] R13: ffff9130f18722d0 R14: ffff9130edad4280 R15: ffff9130f18722c0
      [  300.486626] FS:  0000000000000000(0000) GS:ffff9130f7b80000(0000) knlGS:0000000000000000
      [  300.488538] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  300.489907] CR2: 0000000000000008 CR3: 00000002365e6000 CR4: 00000000000006e0
      [  300.491612] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  300.493303] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  300.494991] Call Trace:
      [  300.495645]  nvme_mpath_add_disk+0x5c/0xb0 [nvme_core]
      [  300.496880]  nvme_validate_ns+0x2ef/0x550 [nvme_core]
      [  300.498105]  ? nvme_identify_ctrl.isra.45+0x6a/0xb0 [nvme_core]
      [  300.499539]  nvme_scan_work+0x2b4/0x370 [nvme_core]
      [  300.500717]  ? __switch_to_asm+0x35/0x70
      [  300.501663]  process_one_work+0x171/0x380
      [  300.502340]  worker_thread+0x49/0x3f0
      [  300.503079]  kthread+0xf8/0x130
      [  300.503795]  ? max_active_store+0x80/0x80
      [  300.504690]  ? kthread_bind+0x10/0x10
      [  300.505502]  ret_from_fork+0x35/0x40
      [  300.506280] Modules linked in: nvme_tcp nvme_rdma rdma_cm iw_cm ib_cm ib_core nvme_fabrics nvme_core xt_physdev ip6table_raw ip6table_mangle ip6table_filter ip6_tables xt_comment iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_CHECKSUM iptable_mangle iptable_filter veth ebtable_filter ebtable_nat ebtables iptable_raw vxlan ip6_udp_tunnel udp_tunnel sunrpc joydev pcspkr virtio_balloon br_netfilter bridge stp llc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net virtio_console net_failover virtio_blk failover ata_piix serio_raw libata virtio_pci virtio_ring virtio
      [  300.514984] CR2: 0000000000000008
      [  300.515569] ---[ end trace faa2eefad7e7f218 ]---
      [  300.516354] RIP: 0010:nvme_parse_ana_log+0x21/0x140 [nvme_core]
      [  300.517330] Code: 45 01 d2 d8 48 98 c3 66 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 08 48 8b af 20 0a 00 00 48 89 34 24 <66> 83 7d 08 00 0f 84 c6 00 00 00 44 8b 7d 14 49 89 d5 8b 55 10 48
      [  300.520353] RSP: 0018:ffffa50e80fd7cb8 EFLAGS: 00010296
      [  300.521229] RAX: 0000000000000001 RBX: ffff9130f1872258 RCX: 0000000000000000
      [  300.522399] RDX: ffffffffc06c4c30 RSI: ffff9130edad4280 RDI: ffff9130f1872258
      [  300.523560] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
      [  300.524734] R10: 0000000000000220 R11: 0000000000000040 R12: ffff9130f18722c0
      [  300.525915] R13: ffff9130f18722d0 R14: ffff9130edad4280 R15: ffff9130f18722c0
      [  300.527084] FS:  0000000000000000(0000) GS:ffff9130f7b80000(0000) knlGS:0000000000000000
      [  300.528396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  300.529440] CR2: 0000000000000008 CR3: 00000002365e6000 CR4: 00000000000006e0
      [  300.530739] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  300.531989] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  300.533264] Kernel panic - not syncing: Fatal exception
      [  300.534338] Kernel Offset: 0x17c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [  300.536227] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Condition check refactoring from Christoph Hellwig.
      Signed-off-by: NMarta Rybczynska <marta.rybczynska@kalray.eu>
      Tested-by: NJean-Baptiste Riaux <jbriaux@kalray.eu>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      bdce5621
  18. 26 7月, 2019 3 次提交
    • C
      nvme-pci: set the errno on ctrl state change error · 762bba1b
      Chaitanya Kulkarni 提交于
      [ Upstream commit e71afda49335620e3d9adf56015676db33a3bd86 ]
      
      This patch removes the confusing assignment of the variable result at
      the time of declaration and sets the value in error cases next to the
      places where the actual error is happening.
      
      Here we also set the result value to -ENODEV when we fail at the final
      ctrl state transition in nvme_reset_work(). Without this assignment
      result will hold 0 from nvme_setup_io_queue() and on failure 0 will be
      passed to he nvme_remove_dead_ctrl() from final state transition.
      Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      762bba1b
    • M
      nvme-pci: properly report state change failure in nvme_reset_work · c876a665
      Minwoo Im 提交于
      [ Upstream commit cee6c269b016ba89c62e34d6bccb103ee2c7de4f ]
      
      If the state change to NVME_CTRL_CONNECTING fails, the dmesg is going to
      be like:
      
        [  293.689160] nvme nvme0: failed to mark controller CONNECTING
        [  293.689160] nvme nvme0: Removing after probe failure status: 0
      
      Even it prints the first line to indicate the situation, the second line
      is not proper because the status is 0 which means normally success of
      the previous operation.
      
      This patch makes it indicate the proper error value when it fails.
        [   25.932367] nvme nvme0: failed to mark controller CONNECTING
        [   25.932369] nvme nvme0: Removing after probe failure status: -16
      
      This situation is able to be easily reproduced by:
        root@target:~# rmmod nvme && modprobe nvme && rmmod nvme
      Signed-off-by: NMinwoo Im <minwoo.im.dev@gmail.com>
      Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c876a665
    • A
      nvme: fix possible io failures when removing multipathed ns · f0c83dd1
      Anton Eidelman 提交于
      [ Upstream commit 2181e455612a8db2761eabbf126640552a451e96 ]
      
      When a shared namespace is removed, we call blk_cleanup_queue()
      when the device can still be accessed as the current path and this can
      result in submission to a dying queue. Hence, direct_make_request()
      called by our mpath device may fail (propagating the failure to userspace).
      Instead, we want to failover this I/O to a different path if one exists.
      Thus, before we cleanup the request queue, we make sure that the device is
      cleared from the current path nor it can be selected again as such.
      
      Fix this by:
      - clear the ns from the head->list and synchronize rcu to make sure there is
        no concurrent path search that restores it as the current path
      - clear the mpath current path in order to trigger a subsequent path search
        and sync srcu to wait for any ongoing request submissions
      - safely continue to namespace removal and blk_cleanup_queue
      Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      f0c83dd1
  19. 25 6月, 2019 2 次提交
  20. 19 6月, 2019 5 次提交