  1. 17 Jun, 2021 3 commits
  2. 12 May, 2021 1 commit
    • C
      nvmet: fix inline bio check for bdev-ns · 608a9690
      Chaitanya Kulkarni committed
      When handling rw commands, the inline bio case considers only the
      transfer size. This works well when req->sg_cnt fits into
      req->inline_bvec, but it triggers the warning in __bio_add_page()
      when req->sg_cnt > NVMET_MAX_INLINE_BIOVEC.
      
      Consider an I/O of size 32768 whose first page is not aligned to the
      page boundary. The I/O is then split as follows:
      
      [ 2206.256140] nvmet: sg->length 3440 sg->offset 656
      [ 2206.256144] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256148] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256152] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256155] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256159] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256163] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256166] nvmet: sg->length 4096 sg->offset 0
      [ 2206.256170] nvmet: sg->length 656 sg->offset 0
      
      Now req->transfer_size == NVMET_MAX_INLINE_DATA_LEN, i.e. 32768, but
      req->sg_cnt is 9, which exceeds NVMET_MAX_INLINE_BIOVEC (8).
      This results in the following warning message:
      
      nvmet_bdev_execute_rw()
      	bio_add_page()
      		__bio_add_page()
      			WARN_ON_ONCE(bio_full(bio, len));
      
      This scenario is very hard to reproduce: it shows up on the nvme-loop
      transport only with rw commands issued through the passthru IOCTL
      interface from a host application whose data buffer is allocated with
      malloc() rather than posix_memalign().
      
      Fixes: 73383adf ("nvmet: don't split large I/Os unconditionally")
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      608a9690
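The arithmetic in the commit message above can be checked with a small user-space sketch. It is not the kernel code: the constants are stand-ins assuming 4 KiB pages and 8 inline bvecs, and `count_segments()` and `can_use_inline_bio()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel constants (assumed: 4 KiB pages, 8 inline bvecs). */
#define PAGE_SIZE_BYTES     4096u
#define MAX_INLINE_BIOVEC   8u
#define MAX_INLINE_DATA_LEN (MAX_INLINE_BIOVEC * PAGE_SIZE_BYTES)

/* Number of page-bounded segments covered by a buffer of `len` bytes
 * that starts `offset` bytes into its first page. */
static unsigned int count_segments(size_t offset, size_t len)
{
	assert(offset < PAGE_SIZE_BYTES);
	if (len == 0)
		return 0;
	return (unsigned int)((offset + len + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES);
}

/* Hypothetical form of the fixed check: the byte count AND the segment
 * count must both fit before the inline bio is used. */
static bool can_use_inline_bio(size_t transfer_len, unsigned int sg_cnt)
{
	return transfer_len <= MAX_INLINE_DATA_LEN &&
	       sg_cnt <= MAX_INLINE_BIOVEC;
}
```

With the values from the log, `count_segments(656, 32768)` gives 9 segments, so `can_use_inline_bio(32768, 9)` is false even though the transfer size alone would have passed.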
  3. 03 Apr, 2021 3 commits
  4. 05 Mar, 2021 1 commit
    • M
      nvmet: model_number must be immutable once set · d9f273b7
      Max Gurtovoy committed
      Once a connection to the nvmf target has been established, changing
      the model_number should not be allowed. E.g. if someone identifies
      the ctrl and gets a model_number of "my_model", and later changes the
      model_number via configfs to "my_new_model", this breaks the NVMe
      specification for "Get Log Page – Persistent Event Log", which refers
      to Model Number as: "This field contains the same value as reported in
      the Model Number field of the Identify Controller data structure, bytes
      63:24."
      
      Although the specification does not state explicitly that this field
      can't be changed, we can assume it.
      
      So allow setting this field only once: using configfs or in the first
      identify ctrl operation.
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      d9f273b7
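The set-once rule described above can be sketched in user-space C as follows. This is an illustration, not the nvmet implementation: `struct model_subsys`, `model_set()` and `model_identify()` are hypothetical names, the `-EINVAL` return for a second write is an assumption, and the "Linux" default is taken from the "make ctrl model configurable" commit further down this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_MODEL "Linux"

struct model_subsys {
	char *model_number; /* NULL until set via configfs or first identify */
};

/* configfs-style store: succeeds only while the model is still unset.
 * The error code for a second write is an assumption. */
static int model_set(struct model_subsys *s, const char *model)
{
	size_t len;

	assert(s && model);
	if (s->model_number)
		return -EINVAL;
	len = strlen(model) + 1;
	s->model_number = malloc(len);
	if (!s->model_number)
		return -ENOMEM;
	memcpy(s->model_number, model, len);
	return 0;
}

/* First identify pins the default if nothing was configured before. */
static const char *model_identify(struct model_subsys *s)
{
	if (!s->model_number)
		model_set(s, DEFAULT_MODEL);
	return s->model_number ? s->model_number : DEFAULT_MODEL;
}
```

After either `model_set()` or the first `model_identify()` has run, every later `model_set()` call fails, so the value reported to a connected host cannot change.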
  5. 10 Feb, 2021 3 commits
    • C
      nvmet: add nvmet_req_subsys() helper · 20c2c3bb
      Chaitanya Kulkarni committed
      Just like the existing helper that gets the passthru ctrl from the req,
      add a helper to get the subsystem associated with the nvmet_req instead
      of open coding the chain of structures.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      20c2c3bb
    • C
      nvmet: add helper to report invalid opcode · d81d57cf
      Chaitanya Kulkarni committed
      In the NVMeOF block device backend, file backend, and passthru backend
      we reject and report commands whose opcode is not handled.
      
      Add a helper and use it in the block device backend to keep the code
      and error message uniform.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      d81d57cf
    • C
      nvmet: make nvmet_find_namespace() req based · 3a1f7c79
      Chaitanya Kulkarni committed
      The six callers of nvmet_find_namespace() duplicate the error log page
      update and status setting code for each call on failure.
      
      All callers are nvmet request based functions, so we can pass req
      to nvmet_find_namespace() and derive ctrl from req, which allows us
      to update the error log page in nvmet_find_namespace(). Now that we
      pass the request we can also get rid of the local variable in
      nvmet_find_namespace(), use req->ns, and return the error code.
      
      Replace the ctrl parameter with nvmet_req for nvmet_find_namespace(),
      centralize the error log page update for non-allocated namespaces, and
      return a uniform error for a non-allocated namespace.
      
      nvmet_find_namespace() takes an nsid parameter which comes from NVMe
      command structures such as get_log_page, identify, rw and common. All
      these commands have the same offset for the nsid field.
      
      Derive nsid from req->cmd->common.nsid and remove the extra parameter
      from nvmet_find_namespace().
      
      Lastly, now that we associate the ns with the req parameter that we
      pass to nvmet_find_namespace(), rename nvmet_find_namespace() to
      nvmet_req_find_ns().
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      3a1f7c79
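The shape of the refactoring described above can be sketched with mock structures. Everything here is a stand-in: the `ns_*` types, the status values and the linear lookup are assumptions for illustration, and `err_log_entries` merely represents the error log page update that the commit centralizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NS_STATUS_OK = 0, NS_STATUS_INVALID_NS = 1 }; /* illustrative codes */

struct ns_entry { uint32_t nsid; };

struct ns_subsys {
	struct ns_entry *namespaces;
	size_t nr_namespaces;
};

struct ns_ctrl {
	struct ns_subsys *subsys;
	unsigned int err_log_entries; /* stand-in for the error log page */
};

struct ns_req {
	struct ns_ctrl *ctrl;
	uint32_t cmd_nsid;   /* stand-in for req->cmd->common.nsid */
	struct ns_entry *ns; /* filled in by the lookup */
};

/* Request-based lookup: the nsid and ctrl are derived from the request,
 * the result is stored in req->ns, and the failure path records the
 * error in one place instead of in every caller. */
static int req_find_ns(struct ns_req *req)
{
	struct ns_subsys *subsys;

	assert(req && req->ctrl && req->ctrl->subsys);
	subsys = req->ctrl->subsys;
	req->ns = NULL;
	for (size_t i = 0; i < subsys->nr_namespaces; i++) {
		if (subsys->namespaces[i].nsid == req->cmd_nsid) {
			req->ns = &subsys->namespaces[i];
			return NS_STATUS_OK;
		}
	}
	req->ctrl->err_log_entries++;
	return NS_STATUS_INVALID_NS;
}
```

A caller then only needs to check the return value; it no longer passes ctrl or nsid and no longer repeats the error bookkeeping.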
  6. 02 Feb, 2021 1 commit
  7. 02 Dec, 2020 4 commits
    • A
      nvmet: remove unused ctrl->cqs · 6d65aeab
      Amit committed
      Remove the unused cqs from the nvmet_ctrl struct.
      This reduces the allocated memory.
      Signed-off-by: Amit <amit.engel@dell.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      6d65aeab
    • C
      nvmet: use inline bio for passthru fast path · dab3902b
      Chaitanya Kulkarni committed
      nvmet_passthru_execute_cmd() is a high frequency function that uses
      bio_alloc(), which leads to a memory allocation from the fs pool
      for each I/O.
      
      For an NVMeoF nvmet_req we already have inline_bvec allocated as a part
      of request allocation. It can be used with a preallocated bio when we
      already know the size of the request before bio allocation with
      bio_alloc(), which we already do.
      
      Introduce a bio member for the nvmet_req passthru anon union. In the
      fast path, check if we can get away with the inline bvec and bio from
      nvmet_req with a bio_init() call before actually allocating with
      bio_alloc().
      
      This is useful to avoid any new memory allocation under high
      memory pressure and to avoid the extra work of allocation
      (bio_alloc()) compared with initialization (bio_init()) when the
      transfer len is < NVMET_MAX_INLINE_DATA_LEN, which the user can
      configure at compile time.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      dab3902b
    • C
      nvmet: add passthru io timeout value attr · 47e9730c
      Chaitanya Kulkarni committed
      An NVMeOF controller in passthru mode is capable of handling a wide set
      of I/O commands, including vendor specific passthru I/O commands.
      
      The vendor specific I/O commands are used to read large drive
      logs and can take longer than default NVMe commands, i.e. for
      passthru requests the timeout value may differ from the passthru
      controller's default timeout values (nvme-core:io_timeout).
      
      Add a configfs attribute so that the user can set the I/O timeout values.
      If this configfs value is not set, nvme_alloc_request() will set
      the NVME_IO_TIMEOUT value when the request queuedata is NULL.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      47e9730c
    • C
      nvmet: add passthru admin timeout value attr · a2f6a2b8
      Chaitanya Kulkarni committed
      An NVMeOF controller in passthru mode is capable of handling a wide set
      of admin commands, including vendor specific passthru admin commands.
      
      The vendor specific admin commands are used to read large drive
      logs and can take longer than default NVMe commands, i.e. for
      passthru requests the timeout value may differ from the passthru
      controller's default timeout values (nvme-core:admin_timeout).
      
      Add a configfs attribute so that the user can set the admin timeout values.
      If this configfs value is not set, nvme_alloc_request() will set
      the ADMIN_TIMEOUT value when the request queuedata is NULL.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      a2f6a2b8
  8. 27 Sep, 2020 1 commit
  9. 29 Jul, 2020 4 commits
    • L
      nvmet: introduce the passthru configfs interface · cae5b01a
      Logan Gunthorpe committed
      When CONFIG_NVME_TARGET_PASSTHRU is enabled, a 'passthru' directory will
      be added to each subsystem. The directory is similar to a namespace
      and has two attributes: device_path and enable. The user must set the
      path to the nvme controller's char device and write '1' to enable the
      subsystem to use passthru.
      
      Any given subsystem is prevented from enabling both a regular namespace
      and the passthru device. If one is enabled, enabling the other will
      produce an error.
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      cae5b01a
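The mutual exclusion between regular namespaces and the passthru device can be sketched as two guarded enable paths. The structure, function names and error codes below are assumptions for illustration and do not reproduce the configfs handlers themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct pt_subsys {
	unsigned int nr_enabled_ns; /* regular namespaces currently enabled */
	bool passthru_enabled;
	const char *device_path;    /* set through the device_path attribute */
};

/* Writing '1' to the passthru enable attribute (sketch). */
static int pt_enable_passthru(struct pt_subsys *s)
{
	assert(s);
	if (!s->device_path)
		return -EINVAL; /* no controller char device configured yet */
	if (s->nr_enabled_ns > 0)
		return -EBUSY;  /* a regular namespace is already enabled */
	s->passthru_enabled = true;
	return 0;
}

/* Enabling a regular namespace (sketch). */
static int pt_enable_ns(struct pt_subsys *s)
{
	assert(s);
	if (s->passthru_enabled)
		return -EBUSY;  /* the passthru device is already enabled */
	s->nr_enabled_ns++;
	return 0;
}
```

Whichever of the two is enabled first wins; the other path then reports an error until the first one is disabled again.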
    • L
      nvmet: Add passthru enable/disable helpers · ba76af67
      Logan Gunthorpe committed
      This patch adds helper functions which are used in the NVMeOF configfs
      when the user is configuring the passthru subsystem. Here we ensure
      that only one subsys is assigned to each nvme_ctrl by using an xarray
      on the cntlid.
      
      The subsystem's version number is overridden by the passed through
      controller's version. However, if that version is less than 1.2.1,
      then we bump the advertised version to that and print a warning
      in dmesg.
      Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      ba76af67
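The version override described above amounts to clamping the advertised version to a floor of 1.2.1. A minimal sketch, assuming the usual packing of an NVMe version as major, minor and tertiary fields in one 32-bit value; `PT_VS()` and `pt_advertised_version()` are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Pack major.minor.tertiary into one comparable 32-bit value. */
#define PT_VS(major, minor, tertiary) \
	(((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(tertiary))

/* Returns the version the subsystem advertises for a passed-through
 * controller; *bumped is set when the caller should log a warning. */
static uint32_t pt_advertised_version(uint32_t ctrl_vs, int *bumped)
{
	assert(bumped);
	*bumped = ctrl_vs < PT_VS(1, 2, 1);
	return *bumped ? PT_VS(1, 2, 1) : ctrl_vs;
}
```

Because the fields are packed most significant first, a plain integer comparison orders versions correctly: a 1.0.0 controller is advertised as 1.2.1, while a 1.4.0 controller keeps its own version.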
    • L
      nvmet: add passthru code to process commands · c1fef73f
      Logan Gunthorpe committed
      Add passthru command handling capability for the NVMeOF target and
      export passthru APIs which are used to integrate passthru
      code with nvmet-core.
      
      The new file passthru.c handles passthru cmd parsing and execution.
      In the passthru mode, we create a block layer request from the nvmet
      request and map the data on to the block layer request.
      
      Admin commands and features are on an allow list as there are a number
      of each that don't make too much sense with passthrough. We use an
      allow list such that new commands can be considered before being blindly
      passed through. In both cases, vendor specific commands are always
      allowed.
      
      We also reject reservation IO commands as the underlying device cannot
      differentiate between multiple hosts behind a fabric.
      Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      c1fef73f
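An allow list of this kind can be sketched as a switch that defaults to rejection. The opcode values are those of the NVMe base specification, but the particular set of admin commands allowed here is an assumption for illustration and is not copied from passthru.c; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
	ADMIN_GET_LOG_PAGE = 0x02,
	ADMIN_IDENTIFY     = 0x06,
	ADMIN_SET_FEATURES = 0x09,
	ADMIN_GET_FEATURES = 0x0a,
	ADMIN_ASYNC_EVENT  = 0x0c,
	ADMIN_KEEP_ALIVE   = 0x18,
	ADMIN_FORMAT_NVM   = 0x80,
	ADMIN_VENDOR_START = 0xc0, /* vendor specific admin opcodes start here */
};

enum {
	IO_RESV_REGISTER = 0x0d,
	IO_RESV_REPORT   = 0x0e,
	IO_RESV_ACQUIRE  = 0x11,
	IO_RESV_RELEASE  = 0x15,
};

static_assert(ADMIN_VENDOR_START > ADMIN_FORMAT_NVM,
	      "vendor range must sit above the standard opcodes listed here");

/* Allow list: anything not named is rejected, so a newly defined
 * command is never passed through without being considered first. */
static bool admin_opcode_allowed(uint8_t opcode)
{
	if (opcode >= ADMIN_VENDOR_START)
		return true; /* vendor specific commands are always allowed */
	switch (opcode) {
	case ADMIN_GET_LOG_PAGE:
	case ADMIN_IDENTIFY:
	case ADMIN_SET_FEATURES:
	case ADMIN_GET_FEATURES:
	case ADMIN_ASYNC_EVENT:
	case ADMIN_KEEP_ALIVE:
		return true;
	default:
		return false;
	}
}

/* I/O commands pass through except reservations, which the underlying
 * device cannot attribute to individual hosts behind the fabric. */
static bool io_opcode_allowed(uint8_t opcode)
{
	switch (opcode) {
	case IO_RESV_REGISTER:
	case IO_RESV_REPORT:
	case IO_RESV_ACQUIRE:
	case IO_RESV_RELEASE:
		return false;
	default:
		return true;
	}
}
```

The default branch is the point of the design: an opcode that nobody has reviewed falls through to rejection rather than reaching the device.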
    • C
      nvmet: use xarray for ctrl ns storing · 7774e77e
      Chaitanya Kulkarni committed
      This patch changes the ctrl->namespaces tracking from a linked list to
      an xarray and improves the performance when accessing one namespace:
      
      XArray vs Default:-
      
      IOPS and BW (higher is better), BW increase (~1.8%):
      ----------------------------------------------------
      
       XArray :-
        read:  IOPS=160k,  BW=626MiB/s  (656MB/s)(18.3GiB/30001msec)
        read:  IOPS=160k,  BW=626MiB/s  (656MB/s)(18.3GiB/30001msec)
        read:  IOPS=162k,  BW=631MiB/s  (662MB/s)(18.5GiB/30001msec)
      
       Default:-
        read:  IOPS=156k,  BW=609MiB/s  (639MB/s)(17.8GiB/30001msec)
        read:  IOPS=157k,  BW=613MiB/s  (643MB/s)(17.0GiB/30001msec)
        read:  IOPS=160k,  BW=626MiB/s  (656MB/s)(18.3GiB/30001msec)
      
      Submission latency (lower is better), decrease (~8.3%):
      -------------------------------------------------------
      
       XArray:-
        slat  (usec):  min=7,  max=8386,  avg=11.19,  stdev=5.96
        slat  (usec):  min=7,  max=441,   avg=11.09,  stdev=4.48
        slat  (usec):  min=7,  max=1088,  avg=11.21,  stdev=4.54
      
       Default :-
        slat  (usec):  min=8,   max=2826.5k,  avg=23.96,  stdev=3911.50
        slat  (usec):  min=8,   max=503,      avg=12.52,  stdev=5.07
        slat  (usec):  min=8,   max=2384,     avg=12.50,  stdev=5.28
      
      CPU usage (lower is better), decrease (~5.2%):
      ----------------------------------------------
      
       XArray:-
        cpu  :  usr=1.84%,  sys=18.61%,  ctx=949471,  majf=0,  minf=250
        cpu  :  usr=1.83%,  sys=18.41%,  ctx=950262,  majf=0,  minf=237
        cpu  :  usr=1.82%,  sys=18.82%,  ctx=957224,  majf=0,  minf=234
      
       Default:-
        cpu  :  usr=1.70%,  sys=19.21%,  ctx=858196,  majf=0,  minf=251
        cpu  :  usr=1.82%,  sys=19.98%,  ctx=929720,  majf=0,  minf=227
        cpu  :  usr=1.83%,  sys=20.33%,  ctx=947208,  majf=0,  minf=235
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      7774e77e
  10. 08 Jul, 2020 1 commit
  11. 27 May, 2020 6 commits
  12. 10 May, 2020 1 commit
  13. 26 Mar, 2020 1 commit
  14. 05 Mar, 2020 2 commits
    • M
      nvmet: make ctrl model configurable · 013b7ebe
      Mark Ruijter committed
      This patch adds a new target subsys attribute which allows the user to
      optionally specify a model name, which is then used in
      nvmet_execute_identify_ctrl() to fill in the nvme_id_ctrl structure.
      
      The default value for the model is set to "Linux" for backward
      compatibility.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Mark Ruijter <MRuijter@onestopsystems.com>
      [chaitanya.kulkarni@wdc.com
       *Use macro for default model, coding style fixes.
       *Use RCU for accessing model in configfs and in
        nvmet_execute_identify_ctrl().
      ]
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      013b7ebe
    • C
      nvmet: make ctrl-id configurable · 94a39d61
      Chaitanya Kulkarni committed
      This patch adds a new target subsys attribute which allows the user to
      optionally specify target controller IDs, which are then used in
      nvmet_execute_identify_ctrl() to fill in the nvme_id_ctrl structure.
      
      For example, in a cluster setup with two nodes and a dual-ported
      NVMe drive exported from both nodes, the connection to the host
      fails due to the same controller ID and results in the following
      error message:
      
      "nvme nvmeX: Duplicate cntlid XXX with nvmeX, rejecting"
      
      With this patch the user can partition the controller IDs for each
      subsystem by setting cntlid_min and cntlid_max. These values
      are used at the time of controller ID creation. Partitioning
      the ctrl-ids for each subsystem results in a unique ctrl-id space,
      which avoids the collision.
      
      When the new attribute is not specified, the target falls back to the
      original cntlid calculation method.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      94a39d61
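The effect of partitioning controller IDs can be shown with a toy allocator. This is a sketch only: a linear scan over a boolean table stands in for whatever ID allocator the target actually uses, and `struct cntlid_pool` and `cntlid_alloc()` are hypothetical names; only cntlid_min and cntlid_max come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CNTLID_SPACE 65536u /* controller IDs are 16-bit values */

struct cntlid_pool {
	uint16_t min, max;       /* inclusive range: cntlid_min..cntlid_max */
	bool used[CNTLID_SPACE]; /* per-subsystem allocation state */
};

/* Hand out the lowest free ID inside [min, max], or -1 when exhausted. */
static int cntlid_alloc(struct cntlid_pool *p)
{
	assert(p && p->min <= p->max);
	for (uint32_t id = p->min; id <= p->max; id++) {
		if (!p->used[id]) {
			p->used[id] = true;
			return (int)id;
		}
	}
	return -1;
}
```

If the subsystem on the first node is configured with the range 1..2 and the one on the second node with 3..4, the IDs handed out on the two nodes can never coincide, which is what prevents the "Duplicate cntlid" rejection on the host.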
  15. 04 Feb, 2020 1 commit
  16. 05 Nov, 2019 3 commits
  17. 01 Aug, 2019 1 commit
  18. 10 Jul, 2019 1 commit
  19. 21 Jun, 2019 1 commit
  20. 25 Apr, 2019 1 commit