1. 25 Jun, 2020 1 commit
  2. 11 Jun, 2020 1 commit
  3. 29 May, 2020 5 commits
  4. 27 May, 2020 1 commit
  5. 10 May, 2020 3 commits
  6. 01 Apr, 2020 1 commit
    • nvme-tcp: fix possible crash in recv error flow · 39d06079
      Sagi Grimberg committed
      If the target misbehaves and sends us unexpected payload we
      need to make sure to fail the controller and stop processing
      the input stream. We clear the rd_enabled flag and stop
      the io_work, but we may still requeue it if we still have pending
      sends and then in the next invocation we will process the input
      stream as the check is only in the .data_ready upcall.
      
      To fix this we need to make sure not to self-requeue io_work
      upon a recv flow error.
      
      This fixes the crash:
       nvme nvme2: receive failed:  -22
       BUG: unable to handle page fault for address: ffffbeb5816c3b48
       nvme_ns_head_make_request: 29 callbacks suppressed
       block nvme0n5: no usable path - requeuing I/O
       block nvme0n5: no usable path - requeuing I/O
       block nvme0n7: no usable path - requeuing I/O
       block nvme0n7: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n7: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       block nvme0n3: no usable path - requeuing I/O
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 1039157067 P4D 1039157067 PUD 103915a067 PMD 102719f067 PTE 0
       Oops: 0000 [#1] SMP PTI
       CPU: 8 PID: 411 Comm: kworker/8:1H Not tainted 5.3.0-40-generic #32~18.04.1-Ubuntu
       Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
       Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
       RIP: 0010:nvme_tcp_recv_skb+0x2ae/0xb50 [nvme_tcp]
       RSP: 0018:ffffbeb5806cfd10 EFLAGS: 00010246
       RAX: ffffbeb5816c3b48 RBX: 00000000000003d0 RCX: 0000000000000008
       RDX: 00000000000003d0 RSI: 0000000000000001 RDI: ffff9a3040684b40
       RBP: ffffbeb5806cfd90 R08: 0000000000000000 R09: ffffffff946e6900
       R10: ffffbeb5806cfce0 R11: 0000000000000001 R12: 0000000000000000
       R13: ffff9a2ff86501c0 R14: 00000000000003d0 R15: ffff9a30b85f2798
       FS:  0000000000000000(0000) GS:ffff9a30bf800000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffbeb5816c3b48 CR3: 000000088400a006 CR4: 00000000003626e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        tcp_read_sock+0x8c/0x290
        ? __release_sock+0x9d/0xe0
        ? nvme_tcp_write_space+0xb0/0xb0 [nvme_tcp]
        nvme_tcp_io_work+0x4b4/0x830 [nvme_tcp]
        ? finish_task_switch+0x163/0x270
        process_one_work+0x1fd/0x3f0
        worker_thread+0x34/0x410
        kthread+0x121/0x140
        ? process_one_work+0x3f0/0x3f0
        ? kthread_park+0xb0/0xb0
        ret_from_fork+0x35/0x40
      Reported-by: Roy Shterman <roys@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
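The requeue decision described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_queue`, `model_try_send`, `model_try_recv`, `model_io_work` and the `budget` parameter are hypothetical names invented for illustration, not the nvme-tcp driver's code. It shows one way a work loop can refuse to requeue itself once a receive error has been seen, even though sends are still pending.

```c
#include <stdbool.h>

/* Hypothetical user-space model of a queue; not the kernel's structure. */
struct model_queue {
    int  pending_sends;  /* sends still waiting to go out */
    int  recv_result;    /* value the next receive attempt will return */
    bool rd_enabled;     /* cleared once the input stream is considered bad */
    int  recv_attempts;  /* number of times the input stream was processed */
};

/* Sends one pending item; returns 1 if something was sent, 0 otherwise. */
static int model_try_send(struct model_queue *q)
{
    if (q->pending_sends == 0)
        return 0;
    q->pending_sends--;
    return 1;
}

/* Processes the input stream once; a negative result models a recv error. */
static int model_try_recv(struct model_queue *q)
{
    q->recv_attempts++;
    if (q->recv_result < 0)
        q->rd_enabled = false;
    return q->recv_result;
}

/*
 * Runs up to `budget` send/recv rounds.
 * Returns true when the work item should requeue itself.
 */
static bool model_io_work(struct model_queue *q, int budget)
{
    while (budget-- > 0) {
        bool pending = false;
        int ret;

        if (model_try_send(q) > 0)
            pending = true;

        ret = model_try_recv(q);
        if (ret > 0)
            pending = true;
        else if (ret < 0)
            return false;  /* recv error: never self-requeue */

        if (!pending)
            return false;  /* nothing left to do */
    }
    return true;  /* budget exhausted while work was still outstanding */
}
```

With five pending sends and a receive result of -22, `model_io_work` returns false after a single receive attempt, so the broken input stream is not processed again on a later invocation.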
  7. 31 Mar, 2020 2 commits
  8. 26 Mar, 2020 6 commits
  9. 05 Mar, 2020 1 commit
  10. 15 Feb, 2020 2 commits
    • nvme: prevent warning triggered by nvme_stop_keep_alive · 97b2512a
      Nigel Kirkland committed
      Delayed keep alive work is queued on system workqueue and may be cancelled
      via nvme_stop_keep_alive from nvme_reset_wq, nvme_fc_wq or nvme_wq.
      
      Check_flush_dependency detects mismatched attributes between the work-queue
      context used to cancel the keep alive work and system-wq. Specifically
      system-wq does not have the WQ_MEM_RECLAIM flag, whereas the contexts used
      to cancel keep alive work have WQ_MEM_RECLAIM flag.
      
      Example warning:
      
        workqueue: WQ_MEM_RECLAIM nvme-reset-wq:nvme_fc_reset_ctrl_work [nvme_fc]
      	is flushing !WQ_MEM_RECLAIM events:nvme_keep_alive_work [nvme_core]
      
      To avoid the flags mismatch, delayed keep alive work is queued on nvme_wq.
      
      However this creates a secondary concern where work and a request to cancel
      that work may be in the same work queue - namely err_work in the rdma and
      tcp transports, which will want to flush/cancel the keep alive work which
      will now be on nvme_wq.
      
      After reviewing the transports, it looks like err_work can be moved to
      nvme_reset_wq. In fact that aligns them better with transition into
      RESETTING and performing related reset work in nvme_reset_wq.
      
      Change nvme-rdma and nvme-tcp to perform err_work in nvme_reset_wq.
      Signed-off-by: Nigel Kirkland <nigel.kirkland@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
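The flag mismatch described in this commit message can be modeled in a few lines. This is a simplified sketch, not the kernel's workqueue implementation: `model_wq`, `MODEL_WQ_MEM_RECLAIM` (including its numeric value) and `model_flush_warns` are hypothetical names used only to show the rule the message describes, namely that a flush issued from a memory-reclaim work-queue context onto a queue without that flag is reported.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for the model; the value is an arbitrary assumption. */
enum { MODEL_WQ_MEM_RECLAIM = 1 << 0 };

/* Hypothetical stand-in for a workqueue: just a name and its flags. */
struct model_wq {
    const char  *name;
    unsigned int flags;
};

/*
 * Returns true when work running on `from` that flushes or cancels work
 * queued on `target` would trigger the mismatch warning in this model.
 * `from` may be NULL when the caller is not running on a workqueue.
 */
static bool model_flush_warns(const struct model_wq *from,
                              const struct model_wq *target)
{
    /* A reclaim-capable target is always safe to flush. */
    if (target->flags & MODEL_WQ_MEM_RECLAIM)
        return false;

    /* Otherwise the flush is a problem only from a reclaim context. */
    return from != NULL && (from->flags & MODEL_WQ_MEM_RECLAIM) != 0;
}
```

In this model, flushing keep alive work on a system queue without the flag from a reset queue with the flag returns true, while the same flush against a queue that carries the flag returns false, which mirrors the effect of moving the delayed work to nvme_wq.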
    • nvme/tcp: fix bug on double requeue when send fails · 2d570a7c
      Anton Eidelman committed
      When nvme_tcp_io_work() fails to send to socket due to
      connection close/reset, error_recovery work is triggered
      from nvme_tcp_state_change() socket callback.
      This cancels all the active requests in the tagset,
      which requeues them.
      
      The failed request, however, was ended and thus requeued
      individually as well unless send returned -EPIPE.
      Another return code to be treated the same way is -ECONNRESET.
      
      Double requeue caused BUG_ON(blk_queued_rq(rq))
      in blk_mq_requeue_request() from either the individual requeue
      of the failed request or the bulk requeue from
      blk_mq_tagset_busy_iter(, nvme_cancel_request, );
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
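The return-code rule in this commit message can be sketched as a small predicate. The helper name `model_should_fail_request` is hypothetical and this is not the driver's code; it only illustrates the decision described above: when the send fails because the connection was closed or reset, error recovery already cancels and requeues every active request, so the failed request must not also be ended individually. Treating `-EAGAIN` as a retry rather than a failure is an assumption of this sketch and is not stated in the commit message.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Returns true when a request whose send returned `send_ret` should be
 * ended (and thus requeued) individually in this model.
 */
static bool model_should_fail_request(int send_ret)
{
    if (send_ret >= 0)
        return false;  /* send made progress or had nothing to do */

    if (send_ret == -EAGAIN)
        return false;  /* assumption: socket would block, retry later */

    /*
     * Connection close/reset: bulk cancellation from error recovery
     * requeues the request, so ending it here would requeue it twice.
     */
    if (send_ret == -EPIPE || send_ret == -ECONNRESET)
        return false;

    return true;  /* any other hard error: fail this request only */
}
```

For example, `model_should_fail_request(-ECONNRESET)` is false while `model_should_fail_request(-EIO)` is true.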
  11. 05 Nov, 2019 1 commit
  12. 29 Oct, 2019 1 commit
  13. 15 Oct, 2019 1 commit
  14. 14 Oct, 2019 2 commits
  15. 26 Sep, 2019 1 commit
  16. 12 Sep, 2019 2 commits
  17. 30 Aug, 2019 8 commits
  18. 05 Aug, 2019 1 commit