1. 02 Dec, 2020 19 commits
    • nvme: remove unnecessary return values · e1aaf5ca
      Committed by Javier González
      Clean up unnecessary ret values that are not checked or used in
      nvme_alloc_ns().
      Signed-off-by: Javier González <javier.gonz@samsung.com>
      Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: print a warning for when listing active namespaces fails · f781f3dd
      Committed by Minwoo Im
      During the scan_work, an Identify command is issued to figure out which
      namespaces are active.  If this command fails, the nvme driver falls back
      to scanning namespaces sequentially.  In this situation, no warning is
      printed, so it is not easy to tell whether the list-ns command failed.

      Print a warning when the Identify command fails:
      
      [    1.108399] nvme nvme0: Identify NS List failed (status=0x400b)
      [    1.109583] nvme0n1: detected capacity change from 0 to 1048576
      [    1.112186] nvme nvme0: Identify Descriptors failed (nsid=2, status=0x4002)
      [    1.113929] nvme nvme0: Identify Descriptors failed (nsid=3, status=0x4002)
      [    1.116537] nvme nvme0: Identify Descriptors failed (nsid=4, status=0x4002)
      ...
      Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: improve an error message on Identify failure · aa9d7295
      Committed by Minwoo Im
      Add the namespace ID to the error message when the Identify command used
      to retrieve the Namespace Identification Descriptor list fails.
      
      This avoids rather useless and duplicative messages like the following:
      [    1.321031] nvme nvme0: Identify Descriptors failed (16386)
      [    1.321948] nvme nvme0: Identify Descriptors failed (16386)
      [    1.322872] nvme nvme0: Identify Descriptors failed (16386)
      [    1.323775] nvme nvme0: Identify Descriptors failed (16386)
      [    1.324687] nvme nvme0: Identify Descriptors failed (16386)
      ...
      
      Also, print the nvme status code in hexadecimal rather than decimal
      format for better readability.
      Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
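A minimal userspace sketch of the message format described above, assuming a hypothetical helper `format_identify_error()` (it is not a kernel function; the driver itself uses its own logging calls). It shows that the decimal status 16386 from the old messages is 0x4002 in the new ones:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not kernel code: builds an Identify failure message
 * that carries the namespace ID and prints the status in hexadecimal.
 */
static int format_identify_error(char *buf, size_t len,
                                 unsigned int nsid, unsigned int status)
{
    return snprintf(buf, len,
                    "Identify Descriptors failed (nsid=%u, status=0x%x)",
                    nsid, status);
}
```

With the namespace ID in the message, repeated failures for different namespaces are no longer indistinguishable.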
    • nvme-fabrics: reject I/O to offline device · 8c4dfea9
      Committed by Victor Gladkov
      Commands get stuck while the host NVMe-oF controller is in the reconnect
      state.  The controller enters the reconnect state when it loses its
      connection with the target.  It tries to reconnect every 10 seconds
      (default) until a reconnect succeeds or the reconnect timeout is reached.
      The default reconnect timeout is 10 minutes.
      
      Applications are expecting commands to complete with success or error
      within a certain timeout (30 seconds by default).  The NVMe host is
      enforcing that timeout while it is connected, but during reconnect the
      timeout is not enforced and commands may get stuck for a long period or
      even forever.
      
      To fix this long delay due to the default timeout, introduce a new
      "fast_io_fail_tmo" session parameter.  The timeout is measured in seconds
      from the controller reconnect and any command beyond that timeout is
      rejected.  The new parameter value may be passed during 'connect'.
      The default value of -1 means no timeout (similar to current behavior).
      Signed-off-by: Victor Gladkov <victor.gladkov@kioxia.com>
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chao Leng <lengchao@huawei.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
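The decision described in this commit can be sketched in userspace C as follows. This is a simplified model under stated assumptions: `struct ctrl_model`, `should_fail_fast()` and the time representation are illustrative names, not the kernel's, and the real driver works with controller states and delayed work rather than a plain comparison.

```c
#include <stdbool.h>

#define FAST_IO_FAIL_OFF (-1)   /* default from the commit: no fast-fail timeout */

/* Illustrative model of the controller state relevant to the decision. */
struct ctrl_model {
    bool reconnecting;      /* controller lost its connection to the target */
    int  fast_io_fail_tmo;  /* seconds, or FAST_IO_FAIL_OFF */
    long reconnect_start;   /* time the reconnect began, in seconds */
};

/*
 * Returns true if a command arriving at time `now` should be rejected
 * instead of being held until the reconnect finishes.
 */
static bool should_fail_fast(const struct ctrl_model *c, long now)
{
    if (!c->reconnecting)
        return false;   /* connected: the normal I/O timeout applies */
    if (c->fast_io_fail_tmo == FAST_IO_FAIL_OFF)
        return false;   /* previous behavior: keep waiting */
    return now - c->reconnect_start >= c->fast_io_fail_tmo;
}
```

Keeping -1 as the default preserves the old behavior unless the parameter is passed explicitly at connect time.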
    • nvmet: fix a spelling mistake "incuding" -> "including" in Kconfig · 9f20599c
      Committed by Colin Ian King
      There is a spelling mistake in the Kconfig help text. Fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: make sure discovery change log event is protected · 0068a7b0
      Committed by Max Gurtovoy
      The generation counter is protected by nvmet_config_sem. Make sure that
      callers of functions that might change it take the semaphore properly.
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Reviewed-by: Israel Rukshin <israelr@nvidia.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: remove unused ctrl->cqs · 6d65aeab
      Committed by Amit
      Remove the unused cqs member from struct nvmet_ctrl. This reduces the
      allocated memory.
      Signed-off-by: Amit <amit.engel@dell.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: don't allocate unused I/O queues · e3aef095
      Committed by Niklas Schnelle
      Currently the NVME_QUIRK_SHARED_TAGS quirk for Apple devices is handled
      during the assignment of nr_io_queues in nvme_setup_io_queues().
      This, however, means that for these devices nvme_max_io_queues() does
      not actually return the supported maximum, which is confusing and
      unexpected.  It also means that in nvme_probe() we allocate for I/O
      queues that will never be used.
      Fix this by moving the quirk handling into nvme_max_io_queues().
      Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: drop min() from nr_io_queues assignment · ff4e5fba
      Committed by Niklas Schnelle
      In nvme_setup_io_queues() the number of I/O queues is set to either 1 in
      case of a quirky Apple device or to the min of nvme_max_io_queues() and
      dev->nr_allocated_queues - 1.
      This is unnecessarily complicated as dev->nr_allocated_queues is only
      assigned once and is nvme_max_io_queues() + 1.
      Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
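The combined effect of the two nvme-pci commits above can be sketched with a small userspace model. The struct, its fields and the function names are illustrative stand-ins (the inputs to the non-quirk maximum in particular are an assumption), not the driver's actual definitions:

```c
#include <stdbool.h>

/* Illustrative model of the device state; not the kernel's struct nvme_dev. */
struct dev_model {
    bool shared_tags_quirk;          /* stands in for NVME_QUIRK_SHARED_TAGS */
    unsigned int nr_cpus;            /* assumed inputs to the maximum */
    unsigned int nr_write_queues;
    unsigned int nr_poll_queues;
    unsigned int nr_allocated_queues;
};

/* The quirk is handled here, so the returned maximum is always accurate. */
static unsigned int max_io_queues(const struct dev_model *d)
{
    if (d->shared_tags_quirk)
        return 1;
    return d->nr_cpus + d->nr_write_queues + d->nr_poll_queues;
}

/* Probe time: assigned once, as max_io_queues() + 1. */
static void probe(struct dev_model *d)
{
    d->nr_allocated_queues = max_io_queues(d) + 1;
}

/* Setup time: no min() and no quirk check are needed any more. */
static unsigned int setup_io_queues(const struct dev_model *d)
{
    return d->nr_allocated_queues - 1;
}
```

Because the allocation count is derived from the same function, a quirky device allocates for exactly one I/O queue instead of the full maximum.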
    • nvmet: use inline bio for passthru fast path · dab3902b
      Committed by Chaitanya Kulkarni
      nvmet_passthru_execute_cmd() is a high-frequency function, and it uses
      bio_alloc(), which leads to a memory allocation from the fs pool for
      each I/O.

      For an NVMeoF nvmet_req we already have inline_bvec allocated as part of
      the request allocation.  It can be used with a preallocated bio when we
      already know the size of the request before allocating a bio with
      bio_alloc(), which we do.

      Introduce a bio member in the nvmet_req passthru anon union. In the
      fast path, check whether we can get away with the inline bvec and bio
      from nvmet_req, set up with a bio_init() call, before actually
      allocating with bio_alloc().

      This avoids any new memory allocation under high memory pressure and
      replaces the extra work of allocation (bio_alloc()) with
      initialization (bio_init()) when the transfer length is
      < NVMET_MAX_INLINE_DATA_LEN, which the user can configure at compile
      time.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
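The "initialize an embedded object instead of allocating one" pattern from this commit can be sketched in userspace C. Everything here is a hypothetical stand-in: `struct desc_model` plays the role of the bio, `MAX_INLINE_DATA_LEN` the role of NVMET_MAX_INLINE_DATA_LEN, and the exact boundary comparison is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INLINE_DATA_LEN 4096u   /* hypothetical compile-time limit */

/* Stand-in for a bio: only what the sketch needs. */
struct desc_model {
    size_t len;
    bool is_inline;     /* true if it lives inside the request */
};

/* Stand-in for the request, which embeds one descriptor. */
struct passthru_req_model {
    struct desc_model inline_desc;
};

/* Fast path: initialize the embedded descriptor. Slow path: allocate. */
static struct desc_model *get_desc(struct passthru_req_model *req, size_t len)
{
    struct desc_model *d;

    if (len <= MAX_INLINE_DATA_LEN) {
        d = &req->inline_desc;
        memset(d, 0, sizeof(*d));   /* initialization only, no allocation */
        d->is_inline = true;
    } else {
        d = calloc(1, sizeof(*d));  /* per-I/O allocation */
        if (!d)
            return NULL;
    }
    d->len = len;
    return d;
}

/* Only descriptors that were allocated are freed. */
static void put_desc(struct desc_model *d)
{
    if (d && !d->is_inline)
        free(d);
}
```

The release side has to know which path was taken, which is why the sketch records it in the descriptor itself.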
    • nvmet: use blk_rq_bio_prep instead of blk_rq_append_bio · a4fe2d3a
      Committed by Chaitanya Kulkarni
      The function blk_rq_append_bio() is a generic API written for all
      types of drivers (including those with bounce buffers) and for
      different contexts (where the request already has a bio, i.e.
      rq->bio != NULL).

      It mainly does three things: it calculates the segments, bounces the
      queue, and, if req->bio == NULL, calls blk_rq_bio_prep(); otherwise it
      handles the low-level merge case.

      The NVMe PCIe and fabrics transports currently do not use the queue
      bounce mechanism. Checking for this on each request means that
      blk_rq_append_bio() does extra work in the passthru fast path.

      When I ran I/Os with different block sizes on the passthru controller
      I found that we can reuse req->sg_cnt instead of iterating over the
      bvecs to find nr_segs in blk_rq_append_bio(). This calculation in
      blk_rq_append_bio() duplicates work, given that we already have the
      value in req->sg_cnt. (correct me here if I'm wrong).

      With the request-based NVMe passthru driver we allocate a fresh request
      each time, so on every call to blk_rq_append_bio() rq->bio will be NULL,
      i.e. we don't really need the second condition in blk_rq_append_bio()
      or the resulting error condition in its caller.

      So for the NVMeOF passthru driver, recalculating the segments, the
      bounce check and the ll_back_merge code are not needed, and we can get
      away with a minimal version of blk_rq_append_bio(), which removes the
      error check in the fast path along with an extra variable in
      nvmet_passthru_map_sg().

      This patch updates nvmet_passthru_map_sg() so that it only appends the
      bio to the request in the context of the NVMeOF passthru driver. The
      perf numbers follow:
      
      With current implementation (blk_rq_append_bio()) :-
      ----------------------------------------------------
      +    5.80%     0.02%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    5.44%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    4.88%     0.00%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    5.44%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    4.86%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    5.17%     0.00%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
      
      With this patch using blk_rq_bio_prep() :-
      ----------------------------------------------------
      +    3.14%     0.02%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    3.26%     0.01%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    5.37%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    5.18%     0.02%  kworker/0:2-eve  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    4.84%     0.02%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      +    4.87%     0.01%  kworker/0:2-mm_  [nvmet]  [k] nvmet_passthru_execute_cmd
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: remove op_flags for passthru commands · 06b3bec8
      Committed by Chaitanya Kulkarni
      For passthru commands setting op_flags has no meaning. Remove the code
      that sets the op flags in nvmet_passthru_map_sg().
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: split nvme_alloc_request() · 39dfe844
      Committed by Chaitanya Kulkarni
      Right now nvme_alloc_request() allocates a request from the block layer
      based on the value of the qid. When the qid is set to NVME_QID_ANY it
      uses blk_mq_alloc_request(), otherwise blk_mq_alloc_request_hctx().

      The function nvme_alloc_request() is called from different contexts. The
      only place where it uses a non-NVME_QID_ANY value is for fabrics connect
      commands:
      
      nvme_submit_sync_cmd()		NVME_QID_ANY
      nvme_features()			NVME_QID_ANY
      nvme_sec_submit()		NVME_QID_ANY
      nvmf_reg_read32()		NVME_QID_ANY
      nvmf_reg_read64()		NVME_QID_ANY
      nvmf_reg_write32()		NVME_QID_ANY
      nvmf_connect_admin_queue()	NVME_QID_ANY
      nvme_submit_user_cmd()		NVME_QID_ANY
      	nvme_alloc_request()
      nvme_keep_alive()		NVME_QID_ANY
      	nvme_alloc_request()
      nvme_timeout()			NVME_QID_ANY
      	nvme_alloc_request()
      nvme_delete_queue()		NVME_QID_ANY
      	nvme_alloc_request()
      nvmet_passthru_execute_cmd()	NVME_QID_ANY
      	nvme_alloc_request()
      nvmf_connect_io_queue() 	QID
      	__nvme_submit_sync_cmd()
      		nvme_alloc_request()
      
      With passthru, nvme_alloc_request() now falls into the I/O fast path,
      where blk_mq_alloc_request_hctx() never gets called, so the qid test
      adds an additional branch check in the fast path.

      Split nvme_alloc_request() into nvme_alloc_request() and
      nvme_alloc_request_qid().

      Replace each call of nvme_alloc_request() with the NVME_QID_ANY param
      with a call to the newly added nvme_alloc_request() without NVME_QID_ANY.

      Replace the call to nvme_alloc_request() with a QID param with a call to
      either the newly added nvme_alloc_request() or nvme_alloc_request_qid(),
      based on the qid value set in __nvme_submit_sync_cmd().
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
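The shape of the split can be sketched as a userspace model. The types and function names below are illustrative, and the mapping from qid to a hardware context index (`qid - 1`) is an assumption made for the sketch; the point is only that the branch on the queue ID moves out of the common allocation function and into the one caller that needs it.

```c
#define QID_ANY (-1)    /* stands in for NVME_QID_ANY */

enum alloc_path { ALLOC_ANY_QUEUE, ALLOC_SPECIFIC_HCTX };

/* Records which allocation path a request took. */
struct alloc_result {
    enum alloc_path path;
    unsigned int hctx_idx;
};

/* Common case: no queue ID parameter and no branch on it. */
static struct alloc_result alloc_request(void)
{
    struct alloc_result r = { ALLOC_ANY_QUEUE, 0 };
    return r;
}

/* Rare case (fabrics connect): allocate on a specific hardware context. */
static struct alloc_result alloc_request_qid(int qid)
{
    struct alloc_result r = { ALLOC_SPECIFIC_HCTX,
                              qid ? (unsigned int)qid - 1 : 0 };
    return r;
}

/* The only caller that still has to look at the qid. */
static struct alloc_result submit_sync_cmd(int qid)
{
    if (qid == QID_ANY)
        return alloc_request();
    return alloc_request_qid(qid);
}
```

Callers on the I/O fast path call the first function directly and never evaluate the qid comparison.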
    • nvmet: add passthru io timeout value attr · 47e9730c
      Committed by Chaitanya Kulkarni
      An NVMeOF controller in passthru mode is capable of handling a wide set
      of I/O commands, including vendor-specific passthru I/O commands.

      The vendor-specific I/O commands are used to read large drive
      logs and can take longer than default NVMe commands, i.e. for
      passthru requests the timeout value may differ from the passthru
      controller's default timeout values (nvme-core:io_timeout).

      Add a configfs attribute so that the user can set the I/O timeout values.
      If this configfs value is not set, nvme_alloc_request() will set
      the NVME_IO_TIMEOUT value when the request queuedata is not NULL.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: add passthru admin timeout value attr · a2f6a2b8
      Committed by Chaitanya Kulkarni
      An NVMeOF controller in passthru mode is capable of handling a wide set
      of admin commands, including vendor-specific passthru admin commands.

      The vendor-specific admin commands are used to read large drive
      logs and can take longer than default NVMe commands, i.e. for
      passthru requests the timeout value may differ from the passthru
      controller's default timeout values (nvme-core:admin_timeout).

      Add a configfs attribute so that the user can set the admin timeout
      values. If this configfs value is not set, nvme_alloc_request() will set
      the ADMIN_TIMEOUT value when the request queuedata is NULL.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: use consistent macro name for timeout · dc96f938
      Committed by Chaitanya Kulkarni
      This is purely a cleanup patch. Add the NVME prefix to ADMIN_TIMEOUT to
      make it consistent with NVME_IO_TIMEOUT.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: centralize setting the timeout in nvme_alloc_request · 0d2e7c84
      Committed by Chaitanya Kulkarni
      The function nvme_alloc_request() is called from different contexts
      (I/O and Admin queue), and callers do not consider the I/O timeout when
      it is called from I/O queue context.

      Update nvme_alloc_request() to set the default I/O and Admin timeout
      value based on whether the queuedata is set or not.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
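A minimal sketch of the timeout selection described in this commit and in the two passthru timeout commits above, written as a userspace model. The struct, the function names and the two default values (in seconds) are assumptions for illustration; in the driver the defaults come from module parameters.

```c
#include <stddef.h>

#define IO_TIMEOUT_S    30u  /* assumed default I/O timeout, seconds */
#define ADMIN_TIMEOUT_S 60u  /* assumed default admin timeout, seconds */

/* Stand-in for a request queue: admin queues carry no queuedata. */
struct queue_model {
    void *queuedata;
};

/* Central default: chosen from whether queuedata is set. */
static unsigned int default_timeout(const struct queue_model *q)
{
    return q->queuedata ? IO_TIMEOUT_S : ADMIN_TIMEOUT_S;
}

/*
 * Passthru override: a configured value of 0 means "not set", in which
 * case the central default stays in effect.
 */
static unsigned int effective_timeout(const struct queue_model *q,
                                      unsigned int configured)
{
    return configured ? configured : default_timeout(q);
}
```

Setting the default in one place means individual callers only need to act when they want something other than the default.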
    • nvme: simplify nvme_req_qid() · 84115d6d
      Committed by Baolin Wang
      Use the request's '->mq_hctx->queue_num' directly to simplify the
      nvme_req_qid() function.
      Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fcloop: add sysfs attribute to inject command drop · 03d99e5d
      Committed by James Smart
      Add a sysfs attribute to specify parameters for dropping a command.  The
      attribute takes a string of:

        <opcode>:<starting at what instance>:<number of times>

      The opcode is formatted so that the lower 8 bits are the opcode.  For a
      fabrics opcode, a bit above bits 7:0 will be set.

      Once set, each sqe is looked at. If the opcode matches, the running
      instance count is updated. If the instance count is in the range of where
      to drop (based on the starting instance and number of times), then the
      command is dropped by not passing it to the target layer.
      Signed-off-by: James Smart <james.smart@broadcom.com>
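The attribute format and the drop decision can be sketched in userspace C. This is a model under stated assumptions, not the fcloop code: the struct and function names are hypothetical, the opcode is assumed to be written in hexadecimal, and the range check (drop instances `start` through `start + amount - 1`, counting matches from 1) is one reading of the description above.

```c
#include <stdbool.h>
#include <stdio.h>

#define DROP_OPCODE_MASK 0x00FFu    /* bits 7:0 hold the opcode */

/* Hypothetical parsed form of "<opcode>:<start>:<amount>". */
struct drop_cfg {
    unsigned int opcode;    /* low 8 bits of the parsed value */
    bool fabrics;           /* a bit above bits 7:0 was set */
    int start;              /* first matching instance to drop */
    int amount;             /* number of matching instances to drop */
    int seen;               /* running count of matching commands */
};

/* Parses the attribute string; returns false on malformed input. */
static bool parse_drop(const char *s, struct drop_cfg *c)
{
    unsigned int raw;
    int start, amount;

    if (sscanf(s, "%x:%d:%d", &raw, &start, &amount) != 3)
        return false;
    c->opcode = raw & DROP_OPCODE_MASK;
    c->fabrics = (raw & ~DROP_OPCODE_MASK) != 0;
    c->start = start;
    c->amount = amount;
    c->seen = 0;
    return true;
}

/* Called per command: counts matches and reports whether to drop this one. */
static bool should_drop(struct drop_cfg *c, unsigned int opcode, bool is_fabrics)
{
    if (is_fabrics != c->fabrics || opcode != c->opcode)
        return false;
    c->seen++;
    return c->seen >= c->start && c->seen < c->start + c->amount;
}
```

For example, with the string `1:2:3` the sketch lets the first matching command through, drops the second, third and fourth, and passes every later one.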
  2. 10 Nov, 2020 1 commit
  3. 03 Nov, 2020 6 commits
  4. 28 Oct, 2020 1 commit
  5. 27 Oct, 2020 7 commits
    • nvmet: fix a NULL pointer dereference when tracing the flush command · 3c3751f2
      Committed by Chaitanya Kulkarni
      When the target side trace is turned on and a flush command is issued
      from the host, it results in the following Oops.
      
      [  856.789724] BUG: kernel NULL pointer dereference, address: 0000000000000068
      [  856.790686] #PF: supervisor read access in kernel mode
      [  856.791262] #PF: error_code(0x0000) - not-present page
      [  856.791863] PGD 6d7110067 P4D 6d7110067 PUD 66f0ad067 PMD 0
      [  856.792527] Oops: 0000 [#1] SMP NOPTI
      [  856.792950] CPU: 15 PID: 7034 Comm: nvme Tainted: G           OE     5.9.0nvme-5.9+ #71
      [  856.793790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214
      [  856.794956] RIP: 0010:trace_event_raw_event_nvmet_req_init+0x13e/0x170 [nvmet]
      [  856.795734] Code: 41 5c 41 5d c3 31 d2 31 f6 e8 4e 9b b8 e0 e9 0e ff ff ff 49 8b 55 00 48 8b 38 8b 0
      [  856.797740] RSP: 0018:ffffc90001be3a60 EFLAGS: 00010246
      [  856.798375] RAX: 0000000000000000 RBX: ffff8887e7d2c01c RCX: 0000000000000000
      [  856.799234] RDX: 0000000000000020 RSI: 0000000057e70ea2 RDI: ffff8887e7d2c034
      [  856.800088] RBP: ffff88869f710578 R08: ffff888807500d40 R09: 00000000fffffffe
      [  856.800951] R10: 0000000064c66670 R11: 00000000ef955201 R12: ffff8887e7d2c034
      [  856.801807] R13: ffff88869f7105c8 R14: 0000000000000040 R15: ffff88869f710440
      [  856.802667] FS:  00007f6a22bd8780(0000) GS:ffff888813a00000(0000) knlGS:0000000000000000
      [  856.803635] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  856.804367] CR2: 0000000000000068 CR3: 00000006d73e0000 CR4: 00000000003506e0
      [  856.805283] Call Trace:
      [  856.805613]  nvmet_req_init+0x27c/0x480 [nvmet]
      [  856.806200]  nvme_loop_queue_rq+0xcb/0x1d0 [nvme_loop]
      [  856.806862]  blk_mq_dispatch_rq_list+0x123/0x7b0
      [  856.807459]  ? kvm_sched_clock_read+0x14/0x30
      [  856.808025]  __blk_mq_sched_dispatch_requests+0xc7/0x170
      [  856.808708]  blk_mq_sched_dispatch_requests+0x30/0x60
      [  856.809372]  __blk_mq_run_hw_queue+0x70/0x100
      [  856.809935]  __blk_mq_delay_run_hw_queue+0x156/0x170
      [  856.810574]  blk_mq_run_hw_queue+0x86/0xe0
      [  856.811104]  blk_mq_sched_insert_request+0xef/0x160
      [  856.811733]  blk_execute_rq+0x69/0xc0
      [  856.812212]  ? blk_mq_rq_ctx_init+0xd0/0x230
      [  856.812784]  nvme_execute_passthru_rq+0x57/0x130 [nvme_core]
      [  856.813461]  nvme_submit_user_cmd+0xeb/0x300 [nvme_core]
      [  856.814099]  nvme_user_cmd.isra.82+0x11e/0x1a0 [nvme_core]
      [  856.814752]  blkdev_ioctl+0x1dc/0x2c0
      [  856.815197]  block_ioctl+0x3f/0x50
      [  856.815606]  __x64_sys_ioctl+0x84/0xc0
      [  856.816074]  do_syscall_64+0x33/0x40
      [  856.816533]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  856.817168] RIP: 0033:0x7f6a222ed107
      [  856.817617] Code: 44 00 00 48 8b 05 81 cd 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 8
      [  856.819901] RSP: 002b:00007ffca848f058 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  856.820846] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f6a222ed107
      [  856.821726] RDX: 00007ffca848f060 RSI: 00000000c0484e43 RDI: 0000000000000003
      [  856.822603] RBP: 0000000000000003 R08: 000000000000003f R09: 0000000000000005
      [  856.823478] R10: 00007ffca848ece0 R11: 0000000000000202 R12: 00007ffca84912d3
      [  856.824359] R13: 00007ffca848f4d0 R14: 0000000000000002 R15: 000000000067e900
      [  856.825236] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corel
      
      Move the nvmet_req_init() tracepoint after we parse the command in
      nvmet_req_init() so that we can get rid of the duplicate
      nvmet_find_namespace() call.
      Rename __assign_disk_name() -> __assign_req_name(). Now that we call
      the tracepoint after parsing the command, simplify the newly added
      __assign_req_name(), which fixes this bug.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fc: remove nvme_fc_terminate_io() · ac9b820e
      Committed by James Smart
      __nvme_fc_terminate_io() is now called from only one place, in
      reset_work. Consolidate and move the functionality of terminate_io into
      reset_work.

      In reset_work, rather than calling create_association directly,
      schedule the connect work element to do its thing. After scheduling,
      flush the connect work element to keep the semantic of not
      returning until connect has been attempted at least once.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery · 95ced8a2
      Committed by James Smart
      nvme_fc_error_recovery() special-cases handling when in CONNECTING state
      and calls __nvme_fc_terminate_io(). __nvme_fc_terminate_io() itself
      special-cases CONNECTING state and calls the routine to abort
      outstanding I/Os.

      Simplify the sequence by putting the call to abort outstanding I/Os
      directly in nvme_fc_error_recovery().

      Move the location of __nvme_fc_abort_outstanding_ios(), and
      nvme_fc_terminate_exchange() which is called by it, to avoid adding
      function prototypes for nvme_fc_error_recovery().
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fc: remove err_work work item · 9c2bb257
      Committed by James Smart
      err_work was created to handle errors (mainly I/O timeouts) while in
      CONNECTING state. The flag for err_work_active is also unneeded.

      Remove err_work_active and err_work.  The actions to abort I/Os are moved
      inline to nvme_error_recovery().
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fc: track error_recovery while connecting · caf1cbe3
      Committed by James Smart
      Whenever there are errors during CONNECTING, the driver recovers by
      aborting all outstanding I/Os and counts on the I/O completion to fail
      them and thus the connection/association they are on.  However, the
      connection failure depends on a failure state from the core routines.
      Not all commands that are issued by the core routine are guaranteed to
      cause a failure of the core routine. They may be treated as a failure
      status and the status is then ignored.

      As such, whenever the transport enters error_recovery while CONNECTING,
      it will set a new flag indicating an association failed. The
      create_association routine, which creates and initializes the
      controller, will monitor the state of the flag as well as the core
      routine error status and ensure the association fails if there was an
      error.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
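The flag-plus-status check described above can be sketched as a small userspace model. The struct, the function names and the choice of `-EIO` as the resulting error are assumptions for illustration, not the transport's actual code:

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative controller state. */
struct fc_ctrl_model {
    bool connecting;    /* controller is in the CONNECTING state */
    bool assoc_failed;  /* set when error recovery runs while connecting */
};

/* Error recovery entered: remember the failure if we are connecting. */
static void error_recovery(struct fc_ctrl_model *c)
{
    if (c->connecting)
        c->assoc_failed = true;
}

/*
 * End of association creation: fail if the core routines reported an
 * error, or if they reported success but the flag was set meanwhile.
 */
static int finish_association(const struct fc_ctrl_model *c, int core_status)
{
    if (core_status == 0 && c->assoc_failed)
        return -EIO;    /* assumed error code */
    return core_status;
}
```

The flag covers the case where an aborted command's failure status was ignored by the core routine and would otherwise go unnoticed.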
    • nvme-rdma: handle unexpected nvme completion data length · 25c1ca6e
      Committed by zhenwei pi
      Receiving a zero length message leads to the following warnings because
      the CQE is processed twice:
      
      refcount_t: underflow; use-after-free.
      WARNING: CPU: 0 PID: 0 at lib/refcount.c:28
      
      RIP: 0010:refcount_warn_saturate+0xd9/0xe0
      Call Trace:
       <IRQ>
       nvme_rdma_recv_done+0xf3/0x280 [nvme_rdma]
       __ib_process_cq+0x76/0x150 [ib_core]
       ...
      
      Sanity check the received data length to avoid this.

      Thanks to Chao Leng & Sagi for suggestions.
      Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
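A minimal sketch of the kind of length check this commit describes, as a userspace model. The enum and function names are hypothetical; the only fixed fact used is that an NVMe completion queue entry is 16 bytes, so anything shorter cannot be a valid completion.

```c
#include <stddef.h>

#define NVME_CQE_SIZE 16u   /* size of an NVMe completion queue entry */

enum recv_action {
    RECV_PROCESS,           /* long enough: handle as a completion */
    RECV_ERROR_RECOVERY     /* too short: do not touch the request */
};

/* Decides what to do with a received message of byte_len bytes. */
static enum recv_action classify_recv(size_t byte_len)
{
    if (byte_len < NVME_CQE_SIZE)
        return RECV_ERROR_RECOVERY;
    return RECV_PROCESS;
}
```

Rejecting short messages before looking up the request is what keeps a zero-length receive from completing the same request a second time.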
    • nvme: ignore zone validate errors on subsequent scans · 8685699c
      Committed by Keith Busch
      Revalidating nvme zoned namespaces requires IO commands, and there are
      controller states that prevent IO. For example, a sanitize in progress
      is required to fail all IO, but we don't want to remove a namespace
      we've previously added just because the controller is in such a state.
      Suppress the error in this case.
      Reported-by: Michael Nguyen <michael.nguyen@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  6. 23 Oct, 2020 4 commits
  7. 22 Oct, 2020 2 commits
    • nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru · 150dfb6c
      Committed by Chaitanya Kulkarni
      By default, we set the passthru request allocation flag such that it
      returns the error in the following code path and we fail the I/O when
      BLK_MQ_REQ_NOWAIT is used for request allocation:
      
      nvme_alloc_request()
       blk_mq_alloc_request()
        blk_mq_queue_enter()
         if (flag & BLK_MQ_REQ_NOWAIT)
              return -EBUSY; <-- return if busy.
      
      On some controllers using BLK_MQ_REQ_NOWAIT ends up in an I/O error even
      though the controller is perfectly healthy and not in a degraded state.

      Block layer request allocation does allow us to wait instead of
      immediately returning the error when the BLK_MQ_REQ_NOWAIT flag is not
      used. This has been shown to fix the I/O error problem reported under
      heavy random write workload.

      Remove the BLK_MQ_REQ_NOWAIT parameter for passthru request allocation,
      which resolves this issue.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet: cleanup nvmet_passthru_map_sg() · 5e063101
      Committed by Logan Gunthorpe
      Clean up some confusing elements of nvmet_passthru_map_sg() by returning
      early if the request is greater than the maximum bio size. This allows
      us to drop the sg_cnt variable.
      
      This should not result in any functional change but makes the code
      clearer and more understandable. The original code allocated a truncated
      bio then would return EINVAL when bio_add_pc_page() filled that bio. The
      new code just returns EINVAL early if this would happen.
      
      Fixes: c1fef73f ("nvmet: add passthru code to process commands")
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Suggested-by: Douglas Gilbert <dgilbert@interlog.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
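The early-return pattern from this cleanup can be sketched as follows. This is a userspace model: `MAX_BIO_SEGMENTS` is a hypothetical stand-in for the block layer's maximum bio size, and `map_sg_check()` is an illustrative name, not the function in the driver.

```c
#include <errno.h>

#define MAX_BIO_SEGMENTS 256u   /* hypothetical stand-in for the bio limit */

/*
 * Rejects an oversized request up front instead of building a truncated
 * bio and failing later when it fills up.
 */
static int map_sg_check(unsigned int sg_cnt)
{
    if (sg_cnt > MAX_BIO_SEGMENTS)
        return -EINVAL;
    return 0;
}
```

The result for the caller is the same error as before, but no bio is allocated on the failing path.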