1. 27 Oct 2020, 1 commit
  2. 23 Oct 2020, 4 commits
  3. 22 Oct 2020, 8 commits
    • nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru · 150dfb6c
      Authored by Chaitanya Kulkarni
      By default, we set the passthru request allocation flag such that it
      returns an error in the following code path, and we fail the I/O when
      BLK_MQ_REQ_NOWAIT is used for request allocation:
      
      nvme_alloc_request()
       blk_mq_alloc_request()
        blk_mq_queue_enter()
         if (flag & BLK_MQ_REQ_NOWAIT)
              return -EBUSY; <-- return if busy.
      
      On some controllers using BLK_MQ_REQ_NOWAIT ends up in I/O error where
      the controller is perfectly healthy and not in a degraded state.
      
      Block layer request allocation allows us to wait instead of returning
      the error immediately when the BLK_MQ_REQ_NOWAIT flag is not used.
      This has been shown to fix the I/O error problem reported under a
      heavy random write workload.
      
      Remove the BLK_MQ_REQ_NOWAIT parameter for passthru request
      allocation, which resolves this issue.
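The decision described above can be sketched with a toy userspace model (this is not kernel code; the flag value and names are illustrative):

```c
#include <errno.h>
#include <stdbool.h>

#define BLK_MQ_REQ_NOWAIT (1 << 0)	/* stand-in for the kernel flag */

/*
 * Toy model of the queue-enter decision: with BLK_MQ_REQ_NOWAIT a busy
 * queue fails immediately with -EBUSY; without it, the caller waits for
 * capacity instead of erroring out.
 */
static int queue_enter_model(unsigned int flags, bool queue_busy)
{
	if (queue_busy) {
		if (flags & BLK_MQ_REQ_NOWAIT)
			return -EBUSY;	/* immediate failure -> I/O error */
		/* no NOWAIT: (pretend to) wait until the queue has room */
	}
	return 0;
}
```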
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: cleanup nvmet_passthru_map_sg() · 5e063101
      Authored by Logan Gunthorpe
      Clean up some confusing elements of nvmet_passthru_map_sg() by returning
      early if the request is greater than the maximum bio size. This allows
      us to drop the sg_cnt variable.
      
      This should not result in any functional change but makes the code
      clearer and more understandable. The original code allocated a
      truncated bio and then returned -EINVAL once bio_add_pc_page() filled
      it; the new code simply returns -EINVAL early if that would happen.
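The early-return shape of the cleanup can be sketched in a userspace model (the function and the max_bio_bytes parameter are illustrative, not the kernel's):

```c
#include <errno.h>

/*
 * Toy model of the cleanup: reject a request up front if it would not
 * fit in a single bio, instead of allocating a truncated bio and letting
 * the page-adding loop fail later.
 */
static int map_sg_model(unsigned int req_bytes, unsigned int max_bio_bytes)
{
	if (req_bytes > max_bio_bytes)
		return -EINVAL;		/* early exit, no partial allocation */
	/* ... allocate one bio sized for req_bytes and map the pages ... */
	return 0;
}
```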
      
      Fixes: c1fef73f ("nvmet: add passthru code to process commands")
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Suggested-by: Douglas Gilbert <dgilbert@interlog.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: limit passthru MTDS by BIO_MAX_PAGES · df06047d
      Authored by Logan Gunthorpe
      nvmet_passthru_map_sg() only supports mapping a single bio, not a
      chain, so the effective maximum transfer size should also be limited
      by BIO_MAX_PAGES (presently this works out to 1MB).
      
      For PCI passthru devices, max_sectors is typically more limiting
      than BIO_MAX_PAGES, but this may not be true for all passthru
      devices.
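A minimal sketch of the resulting cap; the 4 KiB page size and the BIO_MAX_PAGES value of 256 are assumptions hard-coded for illustration:

```c
/* assumptions for illustration: 4 KiB pages, BIO_MAX_PAGES == 256 */
#define MODEL_PAGE_SIZE		4096u
#define MODEL_BIO_MAX_PAGES	256u

/* effective max transfer: the smaller of the device limit and one bio */
static unsigned int max_transfer_bytes(unsigned int dev_max_bytes)
{
	/* one bio holds at most 256 pages -> 1 MiB with 4 KiB pages */
	unsigned int bio_max = MODEL_BIO_MAX_PAGES * MODEL_PAGE_SIZE;

	return dev_max_bytes < bio_max ? dev_max_bytes : bio_max;
}
```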
      
      Fixes: c1fef73f ("nvmet: add passthru code to process commands")
      Suggested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: fix uninitialized work for zero kato · 85bd23f3
      Authored by zhenwei pi
      When connecting a controller with a zero kato value using the following
      command line
      
         nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0
      
      the warning below can be reproduced:
      
      WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90
      with trace:
        mod_delayed_work_on+0x59/0x90
        nvmet_update_cc+0xee/0x100 [nvmet]
        nvmet_execute_prop_set+0x72/0x80 [nvmet]
        nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp]
        nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp]
        ...
      
      This is caused by queuing uninitialized work. Although the
      keep-alive timer is disabled while allocating the controller (fixed
      in 0d3b6a8d), ka_work still has a chance to run (called by
      nvmet_start_ctrl).
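The fix pattern, initializing the work unconditionally at allocation time so any later queuing is safe, can be modelled in userspace (all names here are illustrative, and -EINVAL stands in for the kernel's WARNING):

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model: a delayed work item must be initialized before queuing. */
struct ka_work_model {
	bool initialized;
};

static int mod_delayed_work_model(struct ka_work_model *w)
{
	if (!w->initialized)
		return -EINVAL;	/* corresponds to the WARNING above */
	return 0;
}

/*
 * The fix: initialize the work unconditionally when the controller is
 * allocated, even for kato == 0, so paths like nvmet_start_ctrl can
 * always queue it safely.
 */
static void alloc_ctrl_model(struct ka_work_model *w, unsigned int kato)
{
	(void)kato;		/* initialization no longer depends on kato */
	w->initialized = true;
}
```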
      
      Fixes: 0d3b6a8d ("nvmet: Disable keep-alive timer when kato is cleared to 0h")
      Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: disable Write Zeroes on Sandisk Skyhawk · 02ca079c
      Authored by Kai-Heng Feng
      Like commit 5611ec2b ("nvme-pci: prevent SK hynix PC400 from using
      Write Zeroes command"), Sandisk Skyhawk has the same issue:
      [ 6305.633887] blk_update_request: operation not supported error, dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
      
      So also disable the Write Zeroes command on Sandisk Skyhawk.
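The change amounts to one more entry in the PCI quirk table in drivers/nvme/host/pci.c; a sketch of its shape, where 0x15b7 is SanDisk's PCI vendor ID and the device ID is an assumption that should be checked against the actual commit:

```c
{ PCI_DEVICE(0x15b7, 0x5011),	/* Sandisk Skyhawk; device ID assumed */
	.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
```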
      
      BugLink: https://bugs.launchpad.net/bugs/1899503
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: use queuedata for nvme_req_qid · 643c476d
      Authored by Keith Busch
      The request's rq_disk isn't set for passthrough I/O commands, so
      tracing uses qid 0 for these, which incorrectly decodes as an admin
      command. Use the request_queue's queuedata instead, since that value
      is always set for the I/O queues and never set for the admin queue.
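The qid derivation can be modelled in userspace (the struct and function names are stand-ins, not the kernel's):

```c
#include <stddef.h>

/*
 * Toy model of the qid derivation: the admin queue has no queuedata, so
 * NULL queuedata means qid 0; I/O queues report their hardware queue
 * index plus one.
 */
struct queue_model {
	void *queuedata;	/* set for I/O queues, NULL for admin */
	unsigned int hwq;	/* hardware queue index */
};

static unsigned int req_qid_model(const struct queue_model *q)
{
	if (!q->queuedata)
		return 0;	/* admin queue */
	return q->hwq + 1;	/* I/O queues are numbered from 1 */
}
```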
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-rdma: fix crash due to incorrect cqe · a87da50f
      Authored by Chao Leng
      A crash happened during error-injection testing. When a CQE carries
      an incorrect command id due to error injection, the host may look up
      a request that has already been freed. Dereferencing req->mr->rkey
      then crashes in nvme_rdma_process_nvme_rsp because the MR is already
      freed.
      
      Add a check for the mr to fix it.
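A userspace model of the added check (names are illustrative): the MR pointer is verified before its rkey is compared, so a bogus CQE can no longer cause a dereference of freed memory:

```c
#include <stdbool.h>
#include <stddef.h>

/* stand-ins for the kernel's MR and request structures */
struct mr_model { unsigned int rkey; };
struct req_model { struct mr_model *mr; };

/*
 * Toy model of the added check: treat a missing MR as invalid instead of
 * dereferencing it when matching an invalidated rkey.
 */
static bool invalidate_rkey_valid(const struct req_model *req,
				  unsigned int invalidated_rkey)
{
	return req->mr && req->mr->rkey == invalidated_rkey;
}
```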
      Signed-off-by: Chao Leng <lengchao@huawei.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-rdma: fix crash when connect rejected · 43efdb8e
      Authored by Chao Leng
      A crash can happen when a connect is rejected. The host establishes
      the connection after receiving a ConnectReply, and then continues by
      sending the fabrics Connect command. If the controller does not
      receive the ReadyToUse capsule, the host may receive a ConnectReject
      reply.
      
      nvme_rdma_destroy_queue_ib is called after the host receives the
      RDMA_CM_EVENT_REJECTED event. Then, when the fabrics Connect command
      times out, nvme_rdma_timeout calls nvme_rdma_complete_rq to fail the
      request, and a crash happens due to a use-after-free in
      nvme_rdma_complete_rq.
      
      nvme_rdma_destroy_queue_ib is redundant when handling the
      RDMA_CM_EVENT_REJECTED event, as it is already called in the
      connection-failure handler.
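The hazard can be modelled by counting teardown calls (a toy model; in the real code the second call operates on freed resources):

```c
/*
 * Toy model counting teardowns: the crash corresponds to the queue being
 * destroyed twice, once in the REJECTED handler and once in the
 * connection-failure handler; the fix leaves only the latter call.
 */
struct queue_ib_model {
	int destroy_calls;	/* >1 would be a use-after-free for real */
};

static void destroy_queue_ib_model(struct queue_ib_model *q)
{
	q->destroy_calls++;
}

static void reject_path_old(struct queue_ib_model *q)
{
	destroy_queue_ib_model(q);	/* redundant call (now removed) */
	destroy_queue_ib_model(q);	/* connection-failure handler */
}

static void reject_path_fixed(struct queue_ib_model *q)
{
	destroy_queue_ib_model(q);	/* connection-failure handler only */
}
```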
      Signed-off-by: Chao Leng <lengchao@huawei.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  4. 14 Oct 2020, 1 commit
  5. 07 Oct 2020, 23 commits
  6. 03 Oct 2020, 1 commit
    • nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() · 7d4194ab
      Authored by Coly Li
      Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
      send slab pages. But pages allocated by __get_free_pages() without
      __GFP_COMP, which also have a refcount of 0, are still sent to the
      remote end by kernel_sendpage(), which is problematic.
      
      The newly introduced helper sendpage_ok() checks both the PageSlab
      tag and the page_count counter, and returns true if the page is OK
      to be sent by kernel_sendpage().
      
      This patch fixes the page checking in nvme_tcp_try_send_data() with
      sendpage_ok(): if sendpage_ok() returns true, the page is sent by
      kernel_sendpage(), otherwise sock_no_sendpage() handles it.
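The predicate can be modelled in userspace (the struct here is a stand-in for struct page, not the kernel's definition):

```c
#include <stdbool.h>

/*
 * Userspace model of the sendpage_ok() predicate: a page is safe for
 * kernel_sendpage() only if it is not a slab page and its refcount is at
 * least one.
 */
struct page_model {
	bool page_slab;		/* models PageSlab(page) */
	int page_count;		/* models page_count(page) */
};

static bool sendpage_ok_model(const struct page_model *p)
{
	return !p->page_slab && p->page_count >= 1;
}
```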
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Vlastimil Babka <vbabka@suse.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 27 Sep 2020, 2 commits
    • nvme-pci: allocate separate interrupt for the reserved non-polled I/O queue · 21cc2f3f
      Authored by Jeffle Xu
      One queue is reserved for non-polled I/O when nvme.poll_queues is
      greater than or equal to the number of I/O queues the nvme
      controller can provide. Currently the reserved non-polled queue
      reuses the interrupt used by the admin queue in this case, i.e.
      vector 0.
      
      This works, and performance may not be an issue since the admin
      queue is used infrequently. However, this behaviour is inconsistent
      with the case where nvme.poll_queues is smaller than the number of
      available I/O queues.
      
      Thus allocate a separate interrupt for this reserved queue, making
      the behaviour consistent.
      Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
      [hch: minor cleanups, mostly to the pre-existing surrounding code]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: fix error handling in nvme_ns_report_zones · 936fab50
      Authored by Christoph Hellwig
      nvme_submit_sync_cmd can return positive NVMe status codes in
      addition to negative Linux error codes, and the positive codes are
      currently ignored. Fix this by removing __nvme_ns_report_zones and
      handling the errors from nvme_submit_sync_cmd in the caller, instead
      of multiplexing the return value and the number of zones reported
      into a single return value.
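The demultiplexing can be sketched in a toy model; mapping any positive NVMe status to -EIO here is a simplification of the real status translation, and all names are illustrative:

```c
#include <errno.h>

/*
 * Toy model of the fix: return only the status, and hand the zone count
 * back through an out-parameter instead of multiplexing both into one
 * return value.
 */
static int report_zones_model(int cmd_ret, unsigned int zones_found,
			      unsigned int *nr_zones)
{
	if (cmd_ret < 0)
		return cmd_ret;		/* negative Linux error code */
	if (cmd_ret > 0)
		return -EIO;		/* positive NVMe status: not ignored */
	*nr_zones = zones_found;
	return 0;
}
```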
      
      Fixes: 240e6ee2 ("nvme: support for zoned namespaces")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>