  1. 21 May, 2017 1 commit
  2. 02 May, 2017 1 commit
  3. 21 April, 2017 5 commits
    • nvme/pci: Poll CQ on timeout · 7776db1c
      Committed by Keith Busch
      If an IO timeout occurs, it's helpful to know if the controller did not
      post a completion or the driver missed an interrupt. While we never expect
      the latter, this patch will make it possible to tell the difference so
      we don't have to guess.
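
The distinction described above can be illustrated with a minimal sketch. It assumes a simplified completion queue in which bit 0 of each entry's status field is the phase tag; the type and function names (`cqe_sketch`, `cq_sketch`, `poll_cq_for`, `classify_timeout`) are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified completion queue entry and queue state. */
struct cqe_sketch {
    uint16_t command_id;
    uint16_t status;          /* bit 0 carries the phase tag */
};

struct cq_sketch {
    struct cqe_sketch *entries;
    uint16_t depth;
    uint16_t head;
    uint8_t phase;            /* phase value that marks a newly posted entry */
};

enum timeout_cause { NO_COMPLETION_POSTED, MISSED_INTERRUPT };

/* Consume every posted entry; report whether command_id was among them. */
bool poll_cq_for(struct cq_sketch *cq, uint16_t command_id)
{
    bool found = false;

    while ((cq->entries[cq->head].status & 1) == cq->phase) {
        if (cq->entries[cq->head].command_id == command_id)
            found = true;
        if (++cq->head == cq->depth) {   /* wrap and flip the expected phase */
            cq->head = 0;
            cq->phase = !cq->phase;
        }
    }
    return found;
}

/* Called from a (hypothetical) timeout handler before escalating. */
enum timeout_cause classify_timeout(struct cq_sketch *cq, uint16_t command_id)
{
    return poll_cq_for(cq, command_id) ? MISSED_INTERRUPT
                                       : NO_COMPLETION_POSTED;
}
```

If the poll finds the timed-out command, the controller did post a completion and the interrupt was missed; otherwise the controller never completed the command.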
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    • nvme: improve performance for virtual NVMe devices · f9f38e33
      Committed by Helen Koike
      This change provides a mechanism to reduce the number of MMIO doorbell
      writes for the NVMe driver. When running in a virtualized environment
      like QEMU, the cost of an MMIO is quite hefty. The main idea of
      the patch is to provide the device with two memory locations:
       1) to store the doorbell values so they can be looked up without the
          doorbell MMIO write
       2) to store an event index.
      I believe the doorbell value is obvious, the event index not so much.
      Similar to the virtio specification, the virtual device can tell the
      driver (guest OS) not to write MMIO unless it is writing past this
      value.
      
      FYI: doorbell values are written by the nvme driver (guest OS) and the
      event index is written by the virtual device (host OS).
      
      The patch implements a new admin command that will communicate where
      these two memory locations reside. If the command fails, the nvme
      driver will work as before without any optimizations.
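
As a rough illustration of the event-index idea, the sketch below uses the wrapping 16-bit comparison known from virtio's "need event" test. The `shadow_slot` layout and the function names are assumptions made for this example; the actual memory layout is whatever the new admin command communicates, and a real driver would also need memory barriers between the two accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when moving the doorbell from old_idx to new_idx steps
 * past event_idx, i.e. the device asked to be notified for this update.
 * The subtraction wraps at 16 bits, as in the virtio "need event" test.
 */
bool need_mmio_doorbell(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Hypothetical shadow slot shared by the guest driver and the virtual device. */
struct shadow_slot {
    uint32_t doorbell;    /* written by the driver (guest OS) */
    uint32_t event_idx;   /* written by the virtual device (host OS) */
};

/* Publish a new tail value; returns true if an MMIO write is still needed. */
bool publish_tail(struct shadow_slot *slot, uint16_t new_tail)
{
    uint16_t old_tail = (uint16_t)slot->doorbell;

    slot->doorbell = new_tail;
    return need_mmio_doorbell((uint16_t)slot->event_idx, new_tail, old_tail);
}
```

With an event index of 5, publishing tails 3, 6 and 8 in turn would require an MMIO write only for the update to 6, the one that moves past the index.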
      
      Contributions:
        Eric Northup <digitaleric@google.com>
        Frank Swiderski <fes@google.com>
        Ted Tso <tytso@mit.edu>
        Keith Busch <keith.busch@intel.com>
      
      Just to give an idea of the performance boost with the vendor
      extension: running fio [1] with a stock NVMe driver I get about 200K
      read IOPS; with my vendor patch I get about 1000K read IOPS. This was
      running with a null device, i.e. the backing device simply returned
      success on every read IO request.
      
      [1] Running on a 4 core machine:
        fio --time_based --name=benchmark --runtime=30
        --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32
        --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4
        --rw=randread --blocksize=4k --randrepeat=false
      Signed-off-by: Rob Nelson <rlnelson@google.com>
      [mlin: port for upstream]
      Signed-off-by: Ming Lin <mlin@kernel.org>
      [koike: updated for upstream]
      Signed-off-by: Helen Koike <helen.koike@collabora.co.uk>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
    • nvme/pci: Don't set reserved SQ create flags · 81c1cd98
      Committed by Keith Busch
      The QPRIO field is only valid if weighted round robin arbitration is used,
      and this driver doesn't enable that controller configuration option.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme: Adjust the Samsung APST quirk · ff5350a8
      Committed by Andy Lutomirski
      I got a couple more reports: the Samsung APST issue appears to
      affect multiple 950-series devices in Dell XPS 15 9550 and Precision
      5510 laptops.  Change the quirk: rather than blacklisting the
      firmware on the first problematic SSD that was reported, disable
      APST on all 144d:a802 devices if they're installed in the two
      affected Dell models.  While we're at it, disable only the deepest
      sleep state instead of all of them -- the reporters say that this is
      sufficient to fix the problem.
      
      (I have a device that appears to be entirely identical to one of the
      affected devices, but I have a different Dell laptop, so it's not
      the case that all Samsung devices with firmware BXW75D0Q are broken
      under all circumstances.)
      
      Samsung engineers have an affected system, and hopefully they'll
      give us a better workaround some time soon.  In the meantime, this
      should minimize regressions.
      
      See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184
      
      Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • nvme: split nvme status from block req->errors · 27fa9bc5
      Committed by Christoph Hellwig
      We want our own clearly defined error field for NVMe passthrough commands,
      and the request errors field is going away in its current form.
      
      Just store the status and result field in the nvme_request field from
      hardirq completion context (using a new helper) and then generate a
      Linux errno for the block layer only when we actually need it.
      
      Because we can't overload the status value with a negative error code
      for a cancelled command, we now have a flags field in struct nvme_request
      that contains a bit for this condition.
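
The split between a stored NVMe status and a lazily generated errno might look roughly like the sketch below. The structure, the flag name, and the particular errno choices (`-EINTR` for a cancelled command, `-ENOSPC` for Capacity Exceeded, `-EIO` otherwise) are assumptions made for illustration and are not quoted from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types; all names are illustrative. */
enum { REQ_FLAG_CANCELLED = 1 << 0 };

/* Two generic command status values from the NVMe specification. */
enum {
    SC_SUCCESS      = 0x00,
    SC_CAP_EXCEEDED = 0x81,
};

struct req_status {
    uint16_t status;   /* NVMe status as reported by the controller */
    uint32_t result;   /* command-specific result */
    uint8_t  flags;    /* REQ_FLAG_CANCELLED: cancelled by the driver */
};

/* Completion-time helper: only record what the controller reported. */
void record_completion(struct req_status *rq, uint16_t status, uint32_t result)
{
    rq->status = status;
    rq->result = result;
}

/* Computed only when the block layer needs a Linux errno. */
int status_to_errno(const struct req_status *rq)
{
    if (rq->flags & REQ_FLAG_CANCELLED)
        return -EINTR;             /* assumed errno for a cancelled command */

    switch (rq->status & 0x7ff) {
    case SC_SUCCESS:
        return 0;
    case SC_CAP_EXCEEDED:
        return -ENOSPC;
    default:
        return -EIO;
    }
}
```

Keeping the cancellation in a separate flag leaves the stored status untouched, so a passthrough caller can still read the raw NVMe value.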
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  4. 09 April, 2017 1 commit
  5. 06 April, 2017 1 commit
  6. 04 April, 2017 1 commit
  7. 31 March, 2017 1 commit
  8. 02 March, 2017 2 commits
    • nvme: Complete all stuck requests · 302ad8cc
      Committed by Keith Busch
      If the nvme driver is shutting down its controller, the driver will not
      start the queues up again, preventing blk-mq's hot CPU notifier from
      making forward progress.
      
      To fix that, this patch starts a request_queue freeze when the driver
      resets a controller so no new requests may enter. The driver will wait
      for frozen after IO queues are restarted to ensure the queue reference
      can be reinitialized when nvme requests to unfreeze the queues.
      
      If the driver is doing a safe shutdown, the driver will wait for the
      controller to successfully complete all inflight requests so that we
      don't unnecessarily fail them. Once the controller has been disabled,
      the queues will be restarted to force remaining entered requests to end
      in failure so that blk-mq's hot cpu notifier may progress.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • nvme: allocate nvme_queue in correct node · d3af3ecd
      Committed by Shaohua Li
      nvme_queue is (mostly) a per-CPU queue. Allocate it on the node where
      blk-mq will use it.
      Signed-off-by: Shaohua Li <shli@fb.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  9. 24 February, 2017 1 commit
  10. 23 February, 2017 3 commits
  11. 18 February, 2017 2 commits
  12. 07 February, 2017 1 commit
  13. 01 February, 2017 1 commit
  14. 31 January, 2017 1 commit
  15. 18 January, 2017 1 commit
  16. 14 January, 2017 1 commit
  17. 21 December, 2016 2 commits
  18. 19 December, 2016 1 commit
  19. 14 December, 2016 1 commit
  20. 09 December, 2016 1 commit
    • block: improve handling of the magic discard payload · f9d03f96
      Committed by Christoph Hellwig
      Instead of allocating a single unused biovec for discard requests, send
      them down without any payload.  The driver may then add a "special"
      payload using a biovec embedded into struct request (unioned over
      other fields never used while in the driver), overloading the number
      of segments for this case.
      
      This has a couple of advantages:
      
       - we don't have to allocate the bio_vec
       - the amount of special casing for discard requests in the block
         layer is significantly reduced
       - using this same scheme for other request types is trivial,
         which will be important for implementing the new WRITE_ZEROES
         op on devices where it actually requires a payload (e.g. SCSI)
       - we can get rid of playing games with the request length, as
         we'll never touch it and completions will work just fine
       - it will allow us to support ranged discard operations in the
         future by merging non-contiguous discard bios into a single
         request
       - last but not least it removes a lot of code
      
      This patch is the common base for my WIP series for ranged discards and to
      remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES,
      so it would be good to get it in quickly.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  21. 06 December, 2016 2 commits
  22. 17 November, 2016 1 commit
  23. 16 November, 2016 1 commit
  24. 11 November, 2016 2 commits
    • nvme: don't pass the full CQE to nvme_complete_async_event · 7bf58533
      Committed by Christoph Hellwig
      We only need the status and result fields, and passing them explicitly
      makes life a lot easier for the Fibre Channel transport which doesn't
      have a full CQE for the fast path case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • nvme: introduce struct nvme_request · d49187e9
      Committed by Christoph Hellwig
      This adds a shared per-request structure for all NVMe I/O.  This structure
      is embedded as the first member in every NVMe transport driver's request
      private data and allows common functionality to be implemented across
      the drivers.
      
      The first use is to replace the current abuse of the SCSI command
      passthrough fields in struct request for the NVMe command passthrough,
      but it will grow a few more fields to allow implementing things
      like common abort handlers in the future.
      
      The passthrough commands are handled by having a pointer to the SQE
      (struct nvme_command) in struct nvme_request, and the union of the
      possible result fields, which had to be turned from an anonymous
      into a named union for that purpose.  This avoids having to pass
      a reference to a full CQE around and thus makes checking the result
      a lot more lightweight.
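
The "first member" layout can be sketched as follows. Every type and field name here (`shared_req`, `pci_iod_sketch`, `to_shared_req`, and so on) is hypothetical; the sketch only relies on the C guarantee that a pointer to a structure can be converted to a pointer to its first member.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down submission queue entry. */
struct sqe_sketch {
    uint8_t  opcode;
    uint16_t command_id;
};

/* Named union of possible result fields. */
union result_sketch {
    uint16_t u16;
    uint32_t u32;
    uint64_t u64;
};

/* Shared per-request data: a pointer to the SQE plus the result. */
struct shared_req {
    struct sqe_sketch   *cmd;
    union result_sketch  result;
};

/* Hypothetical transport private data with the shared part first. */
struct pci_iod_sketch {
    struct shared_req req;   /* must remain the first member */
    int nents;               /* transport-specific state follows */
};

/* Common code can reach the shared part without knowing the transport. */
struct shared_req *to_shared_req(void *pdu)
{
    return (struct shared_req *)pdu;
}
```

Because the shared part sits at offset zero, common code needs no per-transport offset arithmetic, and results can be checked without passing a full CQE around.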
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  25. 28 October, 2016 1 commit
  26. 20 October, 2016 1 commit
  27. 13 October, 2016 1 commit
  28. 12 October, 2016 2 commits
    • nvme: don't schedule multiple resets · c5f6ce97
      Committed by Keith Busch
      The queue_work only fails if the work is pending, but not yet running. If
      the work is running, the work item would get requeued, triggering a
      double reset. If the first reset fails for any reason, the second
      reset triggers:
      
      	WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING)
      
      Hitting that schedules controller deletion for a second time, which
      potentially takes a reference on the device that is being deleted.
      If the reset occurs at the same time as a hot removal event, this causes
      a double-free.
      
      This patch has the reset helper function check if the work is busy
      prior to queueing, and changes all places that schedule resets to use
      this function. Since most users don't want to sync with that work, the
      "flush_work" is moved to the only caller that wants to sync.
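
The difference between "pending" and "running" can be modelled in plain C. The sketch below does not use the kernel workqueue API; `work_sketch`, `queue_work_sketch` and `schedule_reset` are hypothetical names, and the `-EBUSY` return value is an assumption for this example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a work item: queued (pending) and/or executing. */
struct work_sketch {
    bool pending;
    bool running;
};

/* Models the behaviour described above: refuses only when already pending. */
bool queue_work_sketch(struct work_sketch *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    return true;
}

/* Reset helper: refuse when the work is pending or still running. */
int schedule_reset(struct work_sketch *w)
{
    if (w->pending || w->running)
        return -EBUSY;
    if (!queue_work_sketch(w))
        return -EBUSY;
    return 0;
}
```

In this model, calling `queue_work_sketch` directly while the work is running would succeed and requeue it, which is the double reset the patch avoids; going through `schedule_reset` rejects the request instead.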
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • nvme: Delete created IO queues on reset · 70659060
      Committed by Keith Busch
      The driver was decrementing the online_queues prior to attempting to
      delete those IO queues, so the driver ended up not requesting the
      controller delete any. This patch saves the online_queues prior to
      suspending them, and adds that parameter for deleting io queues.
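
The ordering problem can be shown with a small model. The structure and function names below are hypothetical, and the counters stand in for real controller commands; the point is only that the queue count is captured before suspension decrements it.

```c
/* Hypothetical model of the device state; field names are illustrative. */
struct dev_sketch {
    int online_queues;    /* admin queue plus IO queues currently online */
    int delete_requests;  /* delete-queue commands sent to the controller */
};

/* Suspending a queue takes it offline. */
void suspend_queue(struct dev_sketch *dev)
{
    dev->online_queues--;
}

/* Ask the controller to delete io_queues IO queues. */
void delete_io_queues(struct dev_sketch *dev, int io_queues)
{
    for (int i = 0; i < io_queues; i++)
        dev->delete_requests++;
}

void disable_device(struct dev_sketch *dev)
{
    int total = dev->online_queues;
    int io_queues = total - 1;         /* saved first; excludes the admin queue */

    for (int i = 1; i < total; i++)    /* suspend the IO queues only */
        suspend_queue(dev);

    delete_io_queues(dev, io_queues);  /* uses the saved count */
}
```

Had `delete_io_queues` derived its count from `online_queues` after the suspend loop, it would have seen no IO queues left and requested no deletions.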
      
      Fixes: c21377f8 ("nvme: Suspend all queues before deletion")
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>