1. 03 Jun, 2021 1 commit
  2. 19 May, 2021 1 commit
  3. 03 Apr, 2021 4 commits
  4. 18 Mar, 2021 1 commit
  5. 05 Mar, 2021 1 commit
    • M
      nvmet: model_number must be immutable once set · d9f273b7
      Max Gurtovoy committed
      Once a connection to the nvmf target has been established, changing
      the model_number must not be allowed. E.g. if a host issues Identify
      Controller and gets the model_number "my_model", and the
      model_number is later changed via configfs to "my_new_model", this
      breaks the NVMe specification for "Get Log Page – Persistent Event
      Log", which refers to Model Number as: "This field contains the same
      value as reported in the Model Number field of the Identify
      Controller data structure, bytes 63:24."

      Although the specification doesn't state explicitly that this field
      can't be changed, we can assume it.

      So allow setting this field only once: via configfs or in the first
      identify ctrl operation.
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      d9f273b7
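The "set only once" rule described above can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct subsys_sketch`, `model_store()` and `model_get()` are hypothetical names, and the `-EINVAL` return value is an assumption. The "Linux" default is the one mentioned in commit 013b7ebe further down this page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX 40 /* Model Number is bytes 63:24 of Identify Controller */

/* Hypothetical stand-in for the subsystem state; not the kernel struct. */
struct subsys_sketch {
	char model[MODEL_MAX + 1];
	bool model_set;
};

/* Configfs-style store: succeeds once, then refuses any further change. */
static int model_store(struct subsys_sketch *s, const char *val)
{
	size_t len;

	assert(s != NULL && val != NULL);
	len = strlen(val);
	if (s->model_set)
		return -EINVAL; /* assumed error code */
	if (len == 0 || len > MODEL_MAX)
		return -EINVAL;
	memcpy(s->model, val, len + 1);
	s->model_set = true;
	return 0;
}

/* Identify-style read: pins the default if nothing was configured yet. */
static const char *model_get(struct subsys_sketch *s)
{
	if (!s->model_set)
		(void)model_store(s, "Linux");
	return s->model;
}
```

With this shape, a second `model_store()` fails whether the first value came from configfs or from the first identify-style read.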
  6. 10 Feb, 2021 4 commits
  7. 02 Dec, 2020 1 commit
  8. 27 Oct, 2020 1 commit
    • C
      nvmet: fix a NULL pointer dereference when tracing the flush command · 3c3751f2
      Chaitanya Kulkarni committed
      When target-side tracing is turned on and a flush command is issued
      from the host, it results in the following Oops.
      
      [  856.789724] BUG: kernel NULL pointer dereference, address: 0000000000000068
      [  856.790686] #PF: supervisor read access in kernel mode
      [  856.791262] #PF: error_code(0x0000) - not-present page
      [  856.791863] PGD 6d7110067 P4D 6d7110067 PUD 66f0ad067 PMD 0
      [  856.792527] Oops: 0000 [#1] SMP NOPTI
      [  856.792950] CPU: 15 PID: 7034 Comm: nvme Tainted: G           OE     5.9.0nvme-5.9+ #71
      [  856.793790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214
      [  856.794956] RIP: 0010:trace_event_raw_event_nvmet_req_init+0x13e/0x170 [nvmet]
      [  856.795734] Code: 41 5c 41 5d c3 31 d2 31 f6 e8 4e 9b b8 e0 e9 0e ff ff ff 49 8b 55 00 48 8b 38 8b 0
      [  856.797740] RSP: 0018:ffffc90001be3a60 EFLAGS: 00010246
      [  856.798375] RAX: 0000000000000000 RBX: ffff8887e7d2c01c RCX: 0000000000000000
      [  856.799234] RDX: 0000000000000020 RSI: 0000000057e70ea2 RDI: ffff8887e7d2c034
      [  856.800088] RBP: ffff88869f710578 R08: ffff888807500d40 R09: 00000000fffffffe
      [  856.800951] R10: 0000000064c66670 R11: 00000000ef955201 R12: ffff8887e7d2c034
      [  856.801807] R13: ffff88869f7105c8 R14: 0000000000000040 R15: ffff88869f710440
      [  856.802667] FS:  00007f6a22bd8780(0000) GS:ffff888813a00000(0000) knlGS:0000000000000000
      [  856.803635] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  856.804367] CR2: 0000000000000068 CR3: 00000006d73e0000 CR4: 00000000003506e0
      [  856.805283] Call Trace:
      [  856.805613]  nvmet_req_init+0x27c/0x480 [nvmet]
      [  856.806200]  nvme_loop_queue_rq+0xcb/0x1d0 [nvme_loop]
      [  856.806862]  blk_mq_dispatch_rq_list+0x123/0x7b0
      [  856.807459]  ? kvm_sched_clock_read+0x14/0x30
      [  856.808025]  __blk_mq_sched_dispatch_requests+0xc7/0x170
      [  856.808708]  blk_mq_sched_dispatch_requests+0x30/0x60
      [  856.809372]  __blk_mq_run_hw_queue+0x70/0x100
      [  856.809935]  __blk_mq_delay_run_hw_queue+0x156/0x170
      [  856.810574]  blk_mq_run_hw_queue+0x86/0xe0
      [  856.811104]  blk_mq_sched_insert_request+0xef/0x160
      [  856.811733]  blk_execute_rq+0x69/0xc0
      [  856.812212]  ? blk_mq_rq_ctx_init+0xd0/0x230
      [  856.812784]  nvme_execute_passthru_rq+0x57/0x130 [nvme_core]
      [  856.813461]  nvme_submit_user_cmd+0xeb/0x300 [nvme_core]
      [  856.814099]  nvme_user_cmd.isra.82+0x11e/0x1a0 [nvme_core]
      [  856.814752]  blkdev_ioctl+0x1dc/0x2c0
      [  856.815197]  block_ioctl+0x3f/0x50
      [  856.815606]  __x64_sys_ioctl+0x84/0xc0
      [  856.816074]  do_syscall_64+0x33/0x40
      [  856.816533]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  856.817168] RIP: 0033:0x7f6a222ed107
      [  856.817617] Code: 44 00 00 48 8b 05 81 cd 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 8
      [  856.819901] RSP: 002b:00007ffca848f058 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  856.820846] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f6a222ed107
      [  856.821726] RDX: 00007ffca848f060 RSI: 00000000c0484e43 RDI: 0000000000000003
      [  856.822603] RBP: 0000000000000003 R08: 000000000000003f R09: 0000000000000005
      [  856.823478] R10: 00007ffca848ece0 R11: 0000000000000202 R12: 00007ffca84912d3
      [  856.824359] R13: 00007ffca848f4d0 R14: 0000000000000002 R15: 000000000067e900
      [  856.825236] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corel
      
      Move the nvmet_req_init() tracepoint after we parse the command in
      nvmet_req_init() so that we can get rid of the duplicate
      nvmet_find_namespace() call.
      Rename __assign_disk_name() -> __assign_req_name(). Now that we call
      the tracepoint after parsing the command, simplify the newly added
      __assign_req_name(), which fixes this bug.
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      3c3751f2
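A name-assignment helper that tolerates a request without a namespace might look roughly like the sketch below. The types `ns_sketch` and `req_sketch`, the `NAME_LEN` value and the function body are assumptions for illustration; only the idea of a helper named after `__assign_req_name()` comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 32 /* assumed buffer size for the traced name */

/* Hypothetical stand-ins for the namespace and request structures. */
struct ns_sketch {
	const char *device_path;
};

struct req_sketch {
	struct ns_sketch *ns; /* may be NULL, e.g. no namespace resolved */
};

/* Copy the device path into name, or leave name empty if there is none. */
static void assign_req_name(char name[NAME_LEN], const struct req_sketch *req)
{
	memset(name, 0, NAME_LEN);
	if (req->ns == NULL || req->ns->device_path == NULL)
		return; /* nothing to dereference, keep the empty string */
	strncpy(name, req->ns->device_path, NAME_LEN - 1);
}
```

The NULL check runs before any field of `req->ns` is read, which is the access pattern that a tracepoint placed after command parsing can rely on.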
  9. 22 Oct, 2020 1 commit
    • Z
      nvmet: fix uninitialized work for zero kato · 85bd23f3
      zhenwei pi committed
      When connecting a controller with a zero kato value using the following
      command line
      
         nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0
      
      the warning below can be reproduced:
      
      WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90
      with trace:
        mod_delayed_work_on+0x59/0x90
        nvmet_update_cc+0xee/0x100 [nvmet]
        nvmet_execute_prop_set+0x72/0x80 [nvmet]
        nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp]
        nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp]
        ...
      
      This is caused by queuing up an uninitialized work.  Although the
      keep-alive timer is disabled while allocating the controller (fixed in
      0d3b6a8d), ka_work still has a chance to run (called by
      nvmet_start_ctrl).

      Fixes: 0d3b6a8d ("nvmet: Disable keep-alive timer when kato is cleared to 0h")
      Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      85bd23f3
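One way to avoid this class of bug is to initialise the work item unconditionally and skip queuing when kato is zero. The user-space sketch below models that idea under stated assumptions: `work_sketch`, `ctrl_sketch` and all function names are hypothetical and do not reproduce the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a delayed work item. */
struct work_sketch {
	void (*fn)(struct work_sketch *w);
	bool queued;
};

struct ctrl_sketch {
	unsigned int kato; /* keep-alive timeout; 0 means disabled */
	struct work_sketch ka_work;
};

static void ka_timeout(struct work_sketch *w)
{
	w->queued = false;
}

/* Returns false when asked to queue work that was never initialised. */
static bool queue_work_sketch(struct work_sketch *w)
{
	if (w->fn == NULL)
		return false; /* the kernel workqueue code would WARN here */
	w->queued = true;
	return true;
}

static void ctrl_init(struct ctrl_sketch *c, unsigned int kato)
{
	c->kato = kato;
	/* Initialise unconditionally, even when kato is zero. */
	c->ka_work.fn = ka_timeout;
	c->ka_work.queued = false;
}

static bool ctrl_start_keep_alive(struct ctrl_sketch *c)
{
	if (c->kato == 0)
		return true; /* timer disabled: nothing to queue */
	return queue_work_sketch(&c->ka_work);
}
```

Either guard alone would silence the warning in this model; having both keeps the work item valid for any later code path that touches it.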
  10. 27 Sep, 2020 1 commit
  11. 24 Aug, 2020 1 commit
  12. 22 Aug, 2020 1 commit
  13. 29 Jul, 2020 3 commits
    • L
      nvmet: Add passthru enable/disable helpers · ba76af67
      Logan Gunthorpe committed
      This patch adds helper functions which are used in the NVMeOF configfs
      when the user is configuring the passthru subsystem. Here we ensure
      that only one subsys is assigned to each nvme_ctrl by using an xarray
      on the cntlid.
      
      The subsystem's version number is overridden by the passed through
      controller's version. However, if that version is less than 1.2.1,
      then we bump the advertised version to 1.2.1 and print a warning
      in dmesg.
      Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      ba76af67
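The version rule above comes down to one comparison on the packed version value. In the sketch below the `VS()` macro uses the same major/minor/tertiary packing as the kernel's `NVME_VS()`; `advertised_version()` and `MIN_PASSTHRU_VS` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the kernel's NVME_VS(): major, minor, tertiary. */
#define VS(major, minor, tertiary) \
	(((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(tertiary))

#define MIN_PASSTHRU_VS VS(1, 2, 1)

/* Returns the version to advertise; *bumped says whether it was raised. */
static uint32_t advertised_version(uint32_t ctrl_vs, int *bumped)
{
	*bumped = ctrl_vs < MIN_PASSTHRU_VS;
	return *bumped ? MIN_PASSTHRU_VS : ctrl_vs;
}
```

A caller would print its warning when `*bumped` is non-zero. Because the fields are packed most significant first, a plain integer comparison orders versions correctly.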
    • L
      nvmet: add passthru code to process commands · c1fef73f
      Logan Gunthorpe committed
      Add passthru command handling capability for the NVMeOF target and
      export passthru APIs which are used to integrate passthru
      code with nvmet-core.
      
      The new file passthru.c handles passthru cmd parsing and execution.
      In the passthru mode, we create a block layer request from the nvmet
      request and map the data on to the block layer request.
      
      Admin commands and features are on an allow list as there are a number
      of each that don't make too much sense with passthrough. We use an
      allow list such that new commands can be considered before being blindly
      passed through. In both cases, vendor specific commands are always
      allowed.
      
      We also reject reservation IO commands as the underlying device cannot
      differentiate between multiple hosts behind a fabric.
      Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      c1fef73f
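The allow-list approach can be sketched as two small opcode filters. The opcode values below follow the NVMe base specification, but the particular set of allowed admin commands is illustrative only and is not claimed to match passthru.c. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { /* opcode values from the NVMe base specification */
	ADMIN_GET_LOG_PAGE = 0x02,
	ADMIN_IDENTIFY     = 0x06,
	ADMIN_FORMAT_NVM   = 0x80,
	ADMIN_VENDOR_START = 0xC0,
	IO_READ            = 0x02,
	IO_RESV_REGISTER   = 0x0D,
	IO_RESV_REPORT     = 0x0E,
	IO_RESV_ACQUIRE    = 0x11,
	IO_RESV_RELEASE    = 0x15,
};

/* Admin path: deny by default, allow listed and vendor-specific opcodes. */
static bool admin_allowed(uint8_t opcode)
{
	if (opcode >= ADMIN_VENDOR_START)
		return true; /* vendor-specific commands are always allowed */
	switch (opcode) {
	case ADMIN_GET_LOG_PAGE: /* illustrative selection */
	case ADMIN_IDENTIFY:
		return true;
	default:
		return false; /* new commands need an explicit decision */
	}
}

/* I/O path: pass through everything except reservation commands. */
static bool io_allowed(uint8_t opcode)
{
	switch (opcode) {
	case IO_RESV_REGISTER:
	case IO_RESV_REPORT:
	case IO_RESV_ACQUIRE:
	case IO_RESV_RELEASE:
		return false;
	default:
		return true;
	}
}
```

The default-deny branch in `admin_allowed()` is the point of an allow list: an opcode added to the specification later is rejected until someone reviews it.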
    • C
      nvmet: use xarray for ctrl ns storing · 7774e77e
      Chaitanya Kulkarni committed
      This patch replaces the ctrl->namespaces tracking from a linked list
      to an xarray and improves the performance when accessing one namespace :-
      
      XArray vs Default:-
      
      IOPS and BW (more the better) increase BW (~1.8%):-
      ---------------------------------------------------
      
       XArray :-
        read:  IOPS=160k,  BW=626MiB/s  (656MB/s)(18.3GiB/30001msec)
        read:  IOPS=160k,  BW=626MiB/s  (656MB/s)(18.3GiB/30001msec)
        read:  IOPS=162k,  BW=631MiB/s  (662MB/s)(18.5GiB/30001msec)
      
       Default:-
        read:  IOPS=156k,  BW=609MiB/s  (639MB/s)(17.8GiB/30001msec)
        read:  IOPS=157k,  BW=613MiB/s  (643MB/s)(17.0GiB/30001msec)
        read:  IOPS=160k,  BW=626MiB/s  (656MB/s)(18.3GiB/30001msec)
      
      Submission latency (less the better) decrease (~8.3%):-
      -------------------------------------------------------
      
       XArray:-
        slat  (usec):  min=7,  max=8386,  avg=11.19,  stdev=5.96
        slat  (usec):  min=7,  max=441,   avg=11.09,  stdev=4.48
        slat  (usec):  min=7,  max=1088,  avg=11.21,  stdev=4.54
      
       Default :-
        slat  (usec):  min=8,   max=2826.5k,  avg=23.96,  stdev=3911.50
        slat  (usec):  min=8,   max=503,      avg=12.52,  stdev=5.07
        slat  (usec):  min=8,   max=2384,     avg=12.50,  stdev=5.28
      
      CPU Usage (less the better) decrease (~5.2%):-
      ----------------------------------------------
      
       XArray:-
        cpu  :  usr=1.84%,  sys=18.61%,  ctx=949471,  majf=0,  minf=250
        cpu  :  usr=1.83%,  sys=18.41%,  ctx=950262,  majf=0,  minf=237
        cpu  :  usr=1.82%,  sys=18.82%,  ctx=957224,  majf=0,  minf=234
      
       Default:-
        cpu  :  usr=1.70%,  sys=19.21%,  ctx=858196,  majf=0,  minf=251
        cpu  :  usr=1.82%,  sys=19.98%,  ctx=929720,  majf=0,  minf=227
        cpu  :  usr=1.83%,  sys=20.33%,  ctx=947208,  majf=0,  minf=235
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      7774e77e
  14. 08 Jul, 2020 1 commit
  15. 01 Jul, 2020 1 commit
  16. 11 Jun, 2020 1 commit
    • C
      nvmet: fail outstanding host posted AEN req · 819f7b88
      Chaitanya Kulkarni committed
      In function nvmet_async_events_process() we only process AENs iff
      there is an open slot in ctrl->async_event_cmds[] && the AEN
      event list posted by the target is not empty. This keeps a
      host-posted AEN outstanding if the target-generated AEN list is empty.
      We do clean up the target-generated entries from the AEN list in
      nvmet_ctrl_free() -> nvmet_async_events_free(), but we don't
      process AENs posted by the host. This leads to the following problem :-

      At the time of nvmet_sq_destroy() the admin sq holds an extra percpu
      reference (atomic value = 1), so in the following code path, after
      switching to atomic rcu, the release function (nvmet_sq_free()) is
      not called, which blocks sq->free_done in nvmet_sq_destroy() :-
      
      nvmet_sq_destroy()
       percpu_ref_kill_and_confirm()
       - __percpu_ref_switch_mode()
       --  __percpu_ref_switch_to_atomic()
       ---   call_rcu() -> percpu_ref_switch_to_atomic_rcu()
       ----     /* calls switch callback */
       - percpu_ref_put()
       -- percpu_ref_put_many(ref, 1)
       --- else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
       ----   ref->release(ref); <---- Not called.
      
      This results in an indefinite hang :-
      
        void nvmet_sq_destroy(struct nvmet_sq *sq)
      ...
                if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) {
                        nvmet_async_events_process(ctrl, status);
                        percpu_ref_put(&sq->ref);
                }
                percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq);
                wait_for_completion(&sq->confirm_done);
                wait_for_completion(&sq->free_done); <-- Hang here
      
      This breaks the rest of the disconnect sequence. The problem seems to
      have been introduced by commit 64f5e9cd ("nvmet: fix memory leak when
      removing namespaces and controllers concurrently").

      This patch processes ctrl->async_event_cmds[] in the admin sq destroy()
      context irrespective of aen_list. Also we get rid of the controller's
      aen_list processing in the nvmet_sq_destroy() context and just ignore
      ctrl->aen_list.
      
      This results in nvmet_async_events_process() being called from workqueue
      context so we adjust the code accordingly.
      
      Fixes: 64f5e9cd ("nvmet: fix memory leak when removing namespaces and controllers concurrently")
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      819f7b88
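The hang can be modelled in user space with a plain counter in place of the percpu reference. This is only a sketch of the reference accounting, assuming that each outstanding host-posted AEN command holds one reference on the queue. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Plain counter standing in for the percpu reference. */
struct ref_sketch {
	long count;
	bool released; /* set when the release callback would run */
};

static void ref_init(struct ref_sketch *r)
{
	r->count = 1; /* initial reference, dropped at kill time */
	r->released = false;
}

static void ref_get(struct ref_sketch *r)
{
	r->count++;
}

static void ref_put(struct ref_sketch *r)
{
	if (--r->count == 0)
		r->released = true;
}

/* Queue with host-posted AEN commands, each pinning one reference. */
struct sq_sketch {
	struct ref_sketch ref;
	int outstanding_aen;
};

static void post_aen(struct sq_sketch *sq)
{
	ref_get(&sq->ref);
	sq->outstanding_aen++;
}

/* Complete every outstanding AEN command, dropping its reference. */
static void fail_outstanding_aen(struct sq_sketch *sq)
{
	while (sq->outstanding_aen > 0) {
		sq->outstanding_aen--;
		ref_put(&sq->ref);
	}
}

/* Returns true if teardown completes, false if it would wait forever. */
static bool sq_destroy(struct sq_sketch *sq, bool fail_aens)
{
	if (fail_aens)
		fail_outstanding_aen(sq);
	ref_put(&sq->ref); /* drop the initial reference */
	return sq->ref.released;
}
```

When the outstanding AEN is not failed first, the count stops at 1 and the release never runs, which corresponds to waiting on `free_done` indefinitely.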
  17. 27 May, 2020 8 commits
  18. 05 Mar, 2020 2 commits
    • M
      nvmet: make ctrl model configurable · 013b7ebe
      Mark Ruijter committed
      This patch adds a new target subsys attribute which allows the user to
      optionally specify a model name, which is then used in
      nvmet_execute_identify_ctrl() to fill in the nvme_id_ctrl structure.

      The default value for the model is set to "Linux" for backward
      compatibility.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Mark Ruijter <MRuijter@onestopsystems.com>
      [chaitanya.kulkarni@wdc.com
       *Use macro for default model, coding style fixes.
       *Use RCU for accessing model in configfs and in
        nvmet_execute_identify_ctrl().
      ]
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      013b7ebe
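Filling the Model Number field from an optional user-supplied string could look like the sketch below. The 40-byte, space-padded layout follows the NVMe Identify Controller definition (bytes 63:24); `fill_model_number()` is a hypothetical helper, and truncating over-long names is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MN_LEN 40             /* Model Number: bytes 63:24 */
#define DEFAULT_MODEL "Linux" /* default kept for backward compatibility */

/* Write model into mn, space padded and without a NUL terminator. */
static void fill_model_number(char mn[MN_LEN], const char *model)
{
	size_t len;

	if (model == NULL || model[0] == '\0')
		model = DEFAULT_MODEL; /* attribute not set by the user */
	len = strlen(model);
	if (len > MN_LEN)
		len = MN_LEN; /* assumed behaviour: truncate long names */
	memset(mn, ' ', MN_LEN);
	memcpy(mn, model, len);
}
```

Hosts that never set the attribute keep seeing "Linux" followed by spaces, so existing setups observe no change.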
    • C
      nvmet: make ctrl-id configurable · 94a39d61
      Chaitanya Kulkarni committed
      This patch adds a new target subsys attribute which allows the user to
      optionally specify target controller IDs, which are then used in
      nvmet_execute_identify_ctrl() to fill in the nvme_id_ctrl structure.

      For example, in a cluster setup with two nodes and a dual-ported NVMe
      drive that is exported from both nodes, the connection to the host
      fails because both nodes use the same controller ID, and results in
      the following error message :-

      "nvme nvmeX: Duplicate cntlid XXX with nvmeX, rejecting"

      With this patch the user can partition the controller IDs for each
      subsystem by setting cntlid_min and cntlid_max. These values are used
      when the controller ID is created. Partitioning the ctrl-ids for each
      subsystem results in a unique ctrl-id space, which avoids the
      collision.

      When the new attributes are not specified, the target falls back to
      the original cntlid calculation method.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      94a39d61
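The effect of partitioning can be sketched with a tiny per-node allocator. This is a simplified model: `cntlid_range`, `range_init()` and `cntlid_alloc()` are hypothetical names, IDs are handed out sequentially and never freed, and the ranges in the usage note are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Per-node allocator state for one subsystem (hypothetical). */
struct cntlid_range {
	uint16_t min;
	uint16_t max;
	uint32_t next; /* next candidate ID; starts at min */
};

static void range_init(struct cntlid_range *r, uint16_t min, uint16_t max)
{
	r->min = min;
	r->max = max;
	r->next = min;
}

/* Returns the next controller ID in [min, max], or -1 when exhausted. */
static int cntlid_alloc(struct cntlid_range *r)
{
	if (r->next > r->max)
		return -1;
	return (int)r->next++;
}
```

Two nodes that each use the range 1..100 both hand out ID 1 first, which is the duplicate the host rejects. Giving one node 1..100 and the other 101..200 makes the first IDs 1 and 101.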
  19. 05 Feb, 2020 1 commit
    • D
      nvmet: update AEN list and array at one place · 0f5be6a4
      Daniel Wagner committed
      All async events are enqueued via nvmet_add_async_event(), which
      updates the ctrl->async_event_cmds[] array and additionally adds a
      struct nvmet_async_event to the ctrl->async_events list.

      Under normal operation, nvmet_async_event_work() updates
      ctrl->async_event_cmds[] again and removes the corresponding struct
      nvmet_async_event from the list. However, nvmet_sq_destroy() could
      be called, which calls nvmet_async_events_free(), which only updates
      the ctrl->async_event_cmds[] array.

      Add new functions nvmet_async_events_process() and
      nvmet_async_events_free() to process async events and update both the
      array and the list.

      When we destroy the submission queue, after clearing the AENs present
      on the ctrl->async_events list we also loop over
      ctrl->async_event_cmds[] for any requests posted by the host for which
      we don't have an AEN in the ctrl->async_events list, by calling
      nvmet_async_events_process() and nvmet_async_events_free().
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      [chaitanya.kulkarni@wdc.com
       * Loop over and clear out outstanding requests
       * Update changelog
      ]
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      0f5be6a4
  20. 04 Feb, 2020 2 commits
  21. 05 Nov, 2019 3 commits