1. 26 3月, 2018 4 次提交
    • M
      nvme: make nvme_get_log_ext non-static · d558fb51
      Matias Bjørling 提交于
      Enable the lightnvm integration to use the nvme_get_log_ext()
      function.
      Signed-off-by: NMatias Bjørling <mb@lightnvm.io>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      d558fb51
    • N
      nvme: Add .stop_ctrl to nvme ctrl ops · b435ecea
      Nitzan Carmi 提交于
      For consistancy reasons, any fabric-specific works
      (e.g error recovery/reconnect) should be canceled in
      nvme_stop_ctrl, as for all other NVMe pending works
      (e.g. scan, keep alive).
      
      The patch aims to simplify the logic of the code, as
      we now only rely on a vague demand from any fabric
      to flush its private workqueues at the beginning of
      .delete_ctrl op.
      Signed-off-by: NNitzan Carmi <nitzanc@mellanox.com>
      Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      b435ecea
    • J
      nvme: change namespaces_mutext to namespaces_rwsem · 765cc031
      Jianchao Wang 提交于
      namespaces_mutext is used to synchronize the operations on ctrl
      namespaces list. Most of the time, it is a read operation.
      
      On the other hand, there are many interfaces in nvme core that
      need this lock, such as nvme_wait_freeze, and even more interfaces
      will be added. If we use mutex here, circular dependency could be
      introduced easily. For example:
      context A                  context B
      nvme_xxx                   nvme_xxx
      hold namespaces_mutext     require namespaces_mutext
      sync context B
      
      So it is better to change it from mutex to rwsem.
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NJianchao Wang <jianchao.w.wang@oracle.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      765cc031
    • T
      nvme: Add fault injection feature · b9e03857
      Thomas Tai 提交于
      Linux's fault injection framework provides a systematic way to support
      error injection via debugfs in the /sys/kernel/debug directory. This
      patch uses the framework to add error injection to NVMe driver. The
      fault injection source code is stored in a separate file and only linked
      if CONFIG_FAULT_INJECTION_DEBUG_FS kernel config is selected.
      
      Once the error injection is enabled, NVME_SC_INVALID_OPCODE with no
      retry will be injected into the nvme_end_request. Users can change
      the default status code and no retry flag via debufs. Following example
      shows how to enable and inject an error. For more examples, refer to
      Documentation/fault-injection/nvme-fault-injection.txt
      
      How to enable nvme fault injection:
      
      First, enable CONFIG_FAULT_INJECTION_DEBUG_FS kernel config,
      recompile the kernel. After booting up the kernel, do the
      following.
      
      How to inject an error:
      
      mount /dev/nvme0n1 /mnt
      echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
      echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
      cp a.file /mnt
      
      Expected Result:
      
      cp: cannot stat ‘/mnt/a.file’: Input/output error
      
      Message from dmesg:
      
      FAULT_INJECTION: forcing a failure.
      name fault_inject, interval 1, probability 100, space 0, times 1
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc8+ #2
      Hardware name: innotek GmbH VirtualBox/VirtualBox,
      BIOS VirtualBox 12/01/2006
      Call Trace:
        <IRQ>
        dump_stack+0x5c/0x7d
        should_fail+0x148/0x170
        nvme_should_fail+0x2f/0x50 [nvme_core]
        nvme_process_cq+0xe7/0x1d0 [nvme]
        nvme_irq+0x1e/0x40 [nvme]
        __handle_irq_event_percpu+0x3a/0x190
        handle_irq_event_percpu+0x30/0x70
        handle_irq_event+0x36/0x60
        handle_fasteoi_irq+0x78/0x120
        handle_irq+0xa7/0x130
        ? tick_irq_enter+0xa8/0xc0
        do_IRQ+0x43/0xc0
        common_interrupt+0xa2/0xa2
        </IRQ>
      RIP: 0010:native_safe_halt+0x2/0x10
      RSP: 0018:ffffffff82003e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
      RAX: ffffffff817a10c0 RBX: ffffffff82012480 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000000000000 R08: 000000008e38ce64 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82012480
      R13: ffffffff82012480 R14: 0000000000000000 R15: 0000000000000000
        ? __sched_text_end+0x4/0x4
        default_idle+0x18/0xf0
        do_idle+0x150/0x1d0
        cpu_startup_entry+0x6f/0x80
        start_kernel+0x4c4/0x4e4
        ? set_init_arg+0x55/0x55
        secondary_startup_64+0xa5/0xb0
        print_req_error: I/O error, dev nvme0n1, sector 9240
      EXT4-fs error (device nvme0n1): ext4_find_entry:1436:
      inode #2: comm cp: reading directory lblock 0
      Signed-off-by: NThomas Tai <thomas.tai@oracle.com>
      Reviewed-by: NEric Saint-Etienne <eric.saint.etienne@oracle.com>
      Signed-off-by: NKarl Volz <karl.volz@oracle.com>
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      b9e03857
  2. 13 2月, 2018 1 次提交
    • R
      nvme: Don't use a stack buffer for keep-alive command · 0a34e466
      Roland Dreier 提交于
      In nvme_keep_alive() we pass a request with a pointer to an NVMe command on
      the stack into blk_execute_rq_nowait().  However, the block layer doesn't
      guarantee that the request is fully queued before blk_execute_rq_nowait()
      returns.  If not, and the request is queued after nvme_keep_alive() returns,
      then we'll end up using stack memory that might have been overwritten to
      form the NVMe command we pass to hardware.
      
      Fix this by keeping a special command struct in the nvme_ctrl struct right
      next to the delayed work struct used for keep-alives.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      0a34e466
  3. 09 2月, 2018 1 次提交
  4. 16 1月, 2018 1 次提交
    • R
      nvme: host delete_work and reset_work on separate workqueues · b227c59b
      Roy Shterman 提交于
      We need to ensure that delete_work will be hosted on a different
      workqueue than all the works we flush or cancel from it.
      Otherwise we may hit a circular dependency warning [1].
      
      Also, given that delete_work flushes reset_work, host reset_work
      on nvme_reset_wq and delete_work on nvme_delete_wq. In addition,
      fix the flushing in the individual drivers to flush nvme_delete_wq
      when draining queued deletes.
      
      [1]:
      [  178.491942] =============================================
      [  178.492718] [ INFO: possible recursive locking detected ]
      [  178.493495] 4.9.0-rc4-c844263313a8-lb #3 Tainted: G           OE
      [  178.494382] ---------------------------------------------
      [  178.495160] kworker/5:1/135 is trying to acquire lock:
      [  178.495894]  (
      [  178.496120] "nvme-wq"
      [  178.496471] ){++++.+}
      [  178.496599] , at:
      [  178.496921] [<ffffffffa70ac206>] flush_work+0x1a6/0x2d0
      [  178.497670]
                     but task is already holding lock:
      [  178.498499]  (
      [  178.498724] "nvme-wq"
      [  178.499074] ){++++.+}
      [  178.499202] , at:
      [  178.499520] [<ffffffffa70ad6c2>] process_one_work+0x162/0x6a0
      [  178.500343]
                     other info that might help us debug this:
      [  178.501269]  Possible unsafe locking scenario:
      
      [  178.502113]        CPU0
      [  178.502472]        ----
      [  178.502829]   lock(
      [  178.503115] "nvme-wq"
      [  178.503467] );
      [  178.503716]   lock(
      [  178.504001] "nvme-wq"
      [  178.504353] );
      [  178.504601]
                      *** DEADLOCK ***
      
      [  178.505441]  May be due to missing lock nesting notation
      
      [  178.506453] 2 locks held by kworker/5:1/135:
      [  178.507068]  #0:
      [  178.507330]  (
      [  178.507598] "nvme-wq"
      [  178.507726] ){++++.+}
      [  178.508079] , at:
      [  178.508173] [<ffffffffa70ad6c2>] process_one_work+0x162/0x6a0
      [  178.509004]  #1:
      [  178.509265]  (
      [  178.509532] (&ctrl->delete_work)
      [  178.509795] ){+.+.+.}
      [  178.510145] , at:
      [  178.510239] [<ffffffffa70ad6c2>] process_one_work+0x162/0x6a0
      [  178.511070]
                     stack backtrace:
      :
      [  178.511693] CPU: 5 PID: 135 Comm: kworker/5:1 Tainted: G           OE   4.9.0-rc4-c844263313a8-lb #3
      [  178.512974] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
      [  178.514247] Workqueue: nvme-wq nvme_del_ctrl_work [nvme_tcp]
      [  178.515071]  ffffc2668175bae0 ffffffffa7450823 ffffffffa88abd80 ffffffffa88abd80
      [  178.516195]  ffffc2668175bb98 ffffffffa70eb012 ffffffffa8d8d90d ffff9c472e9ea700
      [  178.517318]  ffff9c472e9ea700 ffff9c4700000000 ffff9c4700007200 ab83be61bec0d50e
      [  178.518443] Call Trace:
      [  178.518807]  [<ffffffffa7450823>] dump_stack+0x85/0xc2
      [  178.519542]  [<ffffffffa70eb012>] __lock_acquire+0x17d2/0x18f0
      [  178.520377]  [<ffffffffa75839a7>] ? serial8250_console_putchar+0x27/0x30
      [  178.521330]  [<ffffffffa7583980>] ? wait_for_xmitr+0xa0/0xa0
      [  178.522174]  [<ffffffffa70ac1eb>] ? flush_work+0x18b/0x2d0
      [  178.522975]  [<ffffffffa70eb7cb>] lock_acquire+0x11b/0x220
      [  178.523753]  [<ffffffffa70ac206>] ? flush_work+0x1a6/0x2d0
      [  178.524535]  [<ffffffffa70ac229>] flush_work+0x1c9/0x2d0
      [  178.525291]  [<ffffffffa70ac206>] ? flush_work+0x1a6/0x2d0
      [  178.526077]  [<ffffffffa70a9cf0>] ? flush_workqueue_prep_pwqs+0x220/0x220
      [  178.527040]  [<ffffffffa70ae7cf>] __cancel_work_timer+0x10f/0x1d0
      [  178.527907]  [<ffffffffa70fecb9>] ? vprintk_default+0x29/0x40
      [  178.528726]  [<ffffffffa71cb507>] ? printk+0x48/0x50
      [  178.529434]  [<ffffffffa70ae8c3>] cancel_delayed_work_sync+0x13/0x20
      [  178.530381]  [<ffffffffc042100b>] nvme_stop_ctrl+0x5b/0x70 [nvme_core]
      [  178.531314]  [<ffffffffc0403dcc>] nvme_del_ctrl_work+0x2c/0x50 [nvme_tcp]
      [  178.532271]  [<ffffffffa70ad741>] process_one_work+0x1e1/0x6a0
      [  178.533101]  [<ffffffffa70ad6c2>] ? process_one_work+0x162/0x6a0
      [  178.533954]  [<ffffffffa70adc4e>] worker_thread+0x4e/0x490
      [  178.534735]  [<ffffffffa70adc00>] ? process_one_work+0x6a0/0x6a0
      [  178.535588]  [<ffffffffa70adc00>] ? process_one_work+0x6a0/0x6a0
      [  178.536441]  [<ffffffffa70b48cf>] kthread+0xff/0x120
      [  178.537149]  [<ffffffffa70b47d0>] ? kthread_park+0x60/0x60
      [  178.538094]  [<ffffffffa70b47d0>] ? kthread_park+0x60/0x60
      [  178.538900]  [<ffffffffa78e332a>] ret_from_fork+0x2a/0x40
      Signed-off-by: NRoy Shterman <roys@lightbitslabs.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      b227c59b
  5. 15 1月, 2018 1 次提交
  6. 11 1月, 2018 1 次提交
  7. 08 1月, 2018 1 次提交
  8. 29 12月, 2017 1 次提交
  9. 23 11月, 2017 1 次提交
  10. 11 11月, 2017 12 次提交
  11. 01 11月, 2017 1 次提交
  12. 27 10月, 2017 2 次提交
    • C
      nvme: get rid of nvme_ctrl_list · a6a5149b
      Christoph Hellwig 提交于
      Use the core chrdev code to set up the link between the character device
      and the nvme controller.  This allows us to get rid of the global list
      of all controllers, and also ensures that we have both a reference to
      the controller and the transport module before the open method of the
      character device is called.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NSagi Grimberg <sgi@grimberg.me>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NHannes Reinecke <hare@suse.com>
      a6a5149b
    • C
      nvme: switch controller refcounting to use struct device · d22524a4
      Christoph Hellwig 提交于
      Instead of allocating a separate struct device for the character device
      handle embedd it into struct nvme_ctrl and use it for the main controller
      refcounting.  This removes double refcounting and gets us an automatic
      reference for the character device operations.  We keep ctrl->device as a
      pointer for now to avoid chaning printks all over, but in the future we
      could look into message printing helpers that take a controller structure
      similar to what other subsystems do.
      
      Note the delete_ctrl operation always already has a reference (either
      through sysfs due this change, or because every open file on the
      /dev/nvme-fabrics node has a refernece) when it is entered now, so we
      don't need to do the unless_zero variant there.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NHannes Reinecke <hare@suse.com>
      d22524a4
  13. 19 10月, 2017 1 次提交
  14. 04 10月, 2017 1 次提交
  15. 12 9月, 2017 2 次提交
  16. 30 8月, 2017 1 次提交
  17. 29 8月, 2017 4 次提交
  18. 06 7月, 2017 1 次提交
    • S
      nvme: split nvme_uninit_ctrl into stop and uninit · d09f2b45
      Sagi Grimberg 提交于
      Usually before we teardown the controller we want to:
      1. complete/cancel any ctrl inflight works
      2. remove ctrl namespaces (only for removal though, resets
         shouldn't remove any namespaces).
      
      but we do not want to destroy the controller device as
      we might use it for logging during the teardown stage.
      
      This patch adds nvme_start_ctrl() which queues inflight
      controller works (aen, ns scan, queue start and keep-alive
      if kato is set) and nvme_stop_ctrl() which cancels the works
      namespace removal is left to the callers to handle.
      
      Move nvme_uninit_ctrl after we are done with the
      controller device.
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      d09f2b45
  19. 02 7月, 2017 2 次提交
  20. 28 6月, 2017 1 次提交