1. 10 May 2020 (13 commits)
  2. 27 April 2020 (1 commit)
    • nvme: prevent double free in nvme_alloc_ns() error handling · 132be623
      Authored by Niklas Cassel
      When jumping to the out_put_disk label, we will call put_disk(), which will
      trigger a call to disk_release(), which calls blk_put_queue().
      
      Later in the cleanup code, we do blk_cleanup_queue(), which will also call
      blk_put_queue().
      
      Putting the queue twice is incorrect, and will generate a KASAN splat.
      
      Set the disk->queue pointer to NULL before calling put_disk(), so that
      the first call to blk_put_queue() will not free the queue.
      
      The second call to blk_put_queue() uses another pointer to the same queue,
      so this call will still free the queue.
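      
      A minimal sketch of the fixed error path, assuming the out_put_disk and
      out_free_queue labels of nvme_alloc_ns() at the time (the surrounding
      allocation code is omitted):
      
       out_put_disk:
      	/* disk_release() calls blk_put_queue(disk->queue); clearing the
      	 * pointer makes that a no-op, so the queue is only put once */
      	ns->disk->queue = NULL;
      	put_disk(ns->disk);
      	...
       out_free_queue:
      	/* drops the remaining reference, taken via ns->queue */
      	blk_cleanup_queue(ns->queue);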
      
      Fixes: 85136c01 ("lightnvm: simplify geometry enumeration")
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  3. 04 April 2020 (2 commits)
    • nvme-fc: Revert "add module to ops template to allow module references" · 8c5c6605
      Authored by James Smart
      The original patch was meant to prevent the lldd from being unloaded
      while it was being used to talk to the system's boot device. However,
      the end result of the original patch is that any driver unload is now
      prohibited while an nvme controller is live via the lldd. Given the
      module reference, the module teardown routine can't be called, so there
      is no way, other than manual action, to terminate the controllers.
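      
      For context, a minimal sketch of the reverted mechanism (the helper
      names here are illustrative, not the exact nvme-fc code): each live
      association took a reference on the lldd's module, which makes any
      unload attempt fail until the reference is dropped.
      
      #include <linux/module.h>
      
      /* Pin the lldd module for the lifetime of an association; while the
       * reference is held, deleting the module fails with -EBUSY. */
      static int example_pin_lldd(struct module *lldd_mod)
      {
      	if (!try_module_get(lldd_mod))	/* lldd is already unloading */
      		return -ENODEV;
      	return 0;
      }
      
      static void example_unpin_lldd(struct module *lldd_mod)
      {
      	module_put(lldd_mod);	/* rebalance; unload can proceed again */
      }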
      
      Fixes: 863fbae9 ("nvme_fc: add module to ops template to allow module references")
      Cc: <stable@vger.kernel.org> # v5.4+
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: fix deadlock caused by ANA update wrong locking · 657f1975
      Authored by Sagi Grimberg
      The deadlock combines 4 flows in parallel:
      - ns scanning (triggered from reconnect)
      - request timeout
      - ANA update (triggered from reconnect)
      - I/O coming into the mpath device
      
      (1) ns scanning triggers disk revalidation -> update disk info ->
          freeze queue -> but blocked, due to (2)
      
      (2) the timeout handler references the q_usage_counter -> but blocks in
          the transport .timeout() handler, due to (3)
      
      (3) the transport timeout handler (indirectly) calls nvme_stop_queues() ->
          which takes the (down_read) namespaces_rwsem -> but blocks, due to (4)
      
      (4) the ANA update takes the (down_write) namespaces_rwsem -> calls
          nvme_mpath_set_live() -> which synchronizes the ns_head srcu
          (see commit 504db087) -> but blocks, due to (5)
      
      (5) I/O comes into nvme_mpath_make_request -> takes srcu_read_lock ->
          direct_make_request -> blk_queue_enter -> but blocked, due to (1)
      
      ==> the request queue is under freeze -> deadlock.
      
      The fix is to make the ANA update take a read lock, as the namespaces
      list is not manipulated; only the ns and ns->head are updated, and that
      is protected by the ns->head lock.
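      
      A sketch of the locking change, assuming the namespace-list walk of the
      ANA log parsing in drivers/nvme/host/multipath.c at the time:
      
      	down_read(&ctrl->namespaces_rwsem);	/* was: down_write() */
      	list_for_each_entry(ns, &ctrl->namespaces, list) {
      		/* the list is only read here; the per-ns ANA state is
      		 * updated under ns->head->lock inside the helper */
      		nvme_update_ns_ana_state(desc, ns);
      	}
      	up_read(&ctrl->namespaces_rwsem);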
      
      Fixes: 0d0b660f ("nvme: add ANA support")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  4. 02 April 2020 (1 commit)
  5. 01 April 2020 (1 commit)
    • nvme-tcp: fix possible crash in recv error flow · 39d06079
      Authored by Sagi Grimberg
      If the target misbehaves and sends us unexpected payload, we need to
      make sure to fail the controller and stop processing the input stream.
      We clear the rd_enabled flag and stop the io_work, but we may still
      requeue it if we still have pending sends, and then the next invocation
      will process the input stream, as the check is only in the .data_ready
      upcall.
      
      To fix this, we need to make sure not to self-requeue io_work upon a
      recv flow error.
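      
      A sketch of the fix, following the structure of nvme_tcp_io_work()
      (other branches omitted): a negative recv result now returns from the
      work item instead of falling through to the pending-send requeue check.
      
      	result = nvme_tcp_try_recv(queue);
      	if (result > 0)
      		pending = true;
      	else if (unlikely(result < 0))
      		return;	/* recv error: do not self-requeue io_work */
      	...
      	if (pending)
      		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);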
      
      This fixes the crash:
       nvme nvme2: receive failed:  -22
       BUG: unable to handle page fault for address: ffffbeb5816c3b48
       nvme_ns_head_make_request: 29 callbacks suppressed
       block nvme0n5: no usable path - requeuing I/O
       block nvme0n5: no usable path - requeuing I/O
       block nvme0n7: no usable path - requeuing I/O
       block nvme0n7: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n7: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 1039157067 P4D 1039157067 PUD 103915a067 PMD 102719f067 PTE 0
       Oops: 0000 [#1] SMP PTI
       CPU: 8 PID: 411 Comm: kworker/8:1H Not tainted 5.3.0-40-generic #32~18.04.1-Ubuntu
       Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
       Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
       RIP: 0010:nvme_tcp_recv_skb+0x2ae/0xb50 [nvme_tcp]
       RSP: 0018:ffffbeb5806cfd10 EFLAGS: 00010246
       RAX: ffffbeb5816c3b48 RBX: 00000000000003d0 RCX: 0000000000000008
       RDX: 00000000000003d0 RSI: 0000000000000001 RDI: ffff9a3040684b40
       RBP: ffffbeb5806cfd90 R08: 0000000000000000 R09: ffffffff946e6900
       R10: ffffbeb5806cfce0 R11: 0000000000000001 R12: 0000000000000000
       R13: ffff9a2ff86501c0 R14: 00000000000003d0 R15: ffff9a30b85f2798
       FS:  0000000000000000(0000) GS:ffff9a30bf800000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffbeb5816c3b48 CR3: 000000088400a006 CR4: 00000000003626e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        tcp_read_sock+0x8c/0x290
        ? __release_sock+0x9d/0xe0
        ? nvme_tcp_write_space+0xb0/0xb0 [nvme_tcp]
        nvme_tcp_io_work+0x4b4/0x830 [nvme_tcp]
        ? finish_task_switch+0x163/0x270
        process_one_work+0x1fd/0x3f0
        worker_thread+0x34/0x410
        kthread+0x121/0x140
        ? process_one_work+0x3f0/0x3f0
        ? kthread_park+0xb0/0xb0
        ret_from_fork+0x35/0x40
      Reported-by: Roy Shterman <roys@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  6. 31 March 2020 (4 commits)
  7. 28 March 2020 (1 commit)
    • block: simplify queue allocation · 3d745ea5
      Authored by Christoph Hellwig
      Current make_request-based drivers use either blk_alloc_queue_node or
      blk_alloc_queue to allocate a queue, and then set up the make_request_fn
      function pointer and a few parameters using the blk_queue_make_request
      helper.  Simplify this by passing the make_request pointer to
      blk_alloc_queue, and while at it merge the _node variant into the main
      helper by always passing a node_id, and remove the superfluous gfp_mask
      parameter.  A lower-level __blk_alloc_queue is kept for the blk-mq case.
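      
      For a make_request based driver the conversion looks roughly like this
      (my_make_request is a placeholder for the driver's handler):
      
      	/* before: two-step setup via the _node variant */
      	q = blk_alloc_queue_node(GFP_KERNEL, node);
      	blk_queue_make_request(q, my_make_request);
      
      	/* after: a single call; pass NUMA_NO_NODE if the caller has
      	 * no preferred node */
      	q = blk_alloc_queue(my_make_request, node);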
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 26 March 2020 (17 commits)