1. 23 2月, 2017 2 次提交
    • A
      nvme: Enable autonomous power state transitions · c5552fde
      Andy Lutomirski 提交于
      NVMe devices can advertise multiple power states.  These states can
      be either "operational" (the device is fully functional but possibly
      slow) or "non-operational" (the device is asleep until woken up).
      Some devices can automatically enter a non-operational state when
      idle for a specified amount of time and then automatically wake back
      up when needed.
      
      The hardware configuration is a table.  For each state, an entry in
      the table indicates the next deeper non-operational state, if any,
      to autonomously transition to and the idle time required before
      transitioning.
      
      This patch teaches the driver to program APST so that each successive
      non-operational state will be entered after an idle time equal to 100%
      of the total latency (entry plus exit) associated with that state.
      The maximum acceptable latency is controlled using dev_pm_qos
      (e.g. power/pm_qos_latency_tolerance_us in sysfs); non-operational
      states with total latency greater than this value will not be used.
      As a special case, setting the latency tolerance to 0 will disable
      APST entirely.  On hardware without APST support, the sysfs file will
      not be exposed.
      
      The latency tolerance for newly-probed devices is set by the module
      parameter nvme_core.default_ps_max_latency_us.
      
      In theory, the device can expose "default" APST table, but this
      doesn't seem to function correctly on my device (Samsung 950), nor
      does it seem particularly useful.  There is also an optional
      mechanism by which a configuration can be "saved" so it will be
      automatically loaded on reset.  This can be configured from
      userspace, but it doesn't seem useful to support in the driver.
      
      On my laptop, enabling APST seems to save nearly 1W.
      
      The hardware tables can be decoded in userspace with nvme-cli.
      'nvme id-ctrl /dev/nvmeN' will show the power state table and
      'nvme get-feature -f 0x0c -H /dev/nvme0' will show the current APST
      configuration.
      
      This feature is quirked off on a known-buggy Samsung device.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      c5552fde
    • P
      nvme: Use CNS as 8-bit field and avoid endianness conversion · 986994a2
      Parav Pandit 提交于
      This patch defines CNS field as 8-bit field and avoids cpu_to/from_le
      conversions.
      Also initialize nvme_command cns value explicitly to NVME_ID_CNS_NS
      for readability (don't rely on the fact that NVME_ID_CNS_NS = 0).
      Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      986994a2
  2. 18 2月, 2017 1 次提交
  3. 09 2月, 2017 1 次提交
  4. 06 12月, 2016 1 次提交
  5. 01 12月, 2016 1 次提交
  6. 11 11月, 2016 1 次提交
    • C
      nvme: introduce struct nvme_request · d49187e9
      Christoph Hellwig 提交于
      This adds a shared per-request structure for all NVMe I/O.  This structure
      is embedded as the first member in all NVMe transport drivers request
      private data and allows to implement common functionality between the
      drivers.
      
      The first use is to replace the current abuse of the SCSI command
      passthrough fields in struct request for the NVMe command passthrough,
      but it will grow a field more fields to allow implementing things
      like common abort handlers in the future.
      
      The passthrough commands are handled by having a pointer to the SQE
      (struct nvme_command) in struct nvme_request, and the union of the
      possible result fields, which had to be turned from an anonymous
      into a named union for that purpose.  This avoids having to pass
      a reference to a full CQE around and thus makes checking the result
      a lot more lightweight.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      d49187e9
  7. 20 10月, 2016 4 次提交
  8. 19 8月, 2016 1 次提交
  9. 06 7月, 2016 2 次提交
  10. 12 6月, 2016 6 次提交
  11. 02 5月, 2016 1 次提交
  12. 02 12月, 2015 1 次提交
  13. 10 10月, 2015 6 次提交
  14. 19 8月, 2015 2 次提交
  15. 21 7月, 2015 1 次提交
  16. 06 6月, 2015 1 次提交
    • K
      NVMe: Automatic namespace rescan · a5768aa8
      Keith Busch 提交于
      Namespaces may be dynamically allocated and deleted or attached and
      detached. This has the driver rescan the device for namespace changes
      after each device reset or namespace change asynchronous event.
      
      There could potentially be many detached namespaces that we don't want
      polluting /dev/ with unusable block handles, so this will delete disks
      if the namespace is not active as indicated by the response from identify
      namespace. This also skips adding the disk if no capacity is provisioned
      to the namespace in the first place.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      a5768aa8
  17. 22 5月, 2015 3 次提交
  18. 08 4月, 2015 1 次提交
  19. 20 2月, 2015 4 次提交
    • K
      NVMe: Fix potential corruption during shutdown · 07836e65
      Keith Busch 提交于
      The driver has to end unreturned commands at some point even if the
      controller has not provided a completion. The driver tried to be safe by
      deleting IO queues prior to ending all unreturned commands. That should
      cause the controller to internally abort inflight commands, but IO queue
      deletion request does not have to be successful, so all bets are off. We
      still have to make progress, so to be extra safe, this patch doesn't
      clear a queue to release the dma mapping for a command until after the
      pci device has been disabled.
      
      This patch removes the special handling during device initialization
      so controller recovery can be done all the time. This is possible since
      initialization is not inlined with pci probe anymore.
      Reported-by: NNilish Choudhury <nilesh.choudhury@oracle.com>
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      07836e65
    • K
      NVMe: Asynchronous controller probe · 2e1d8448
      Keith Busch 提交于
      This performs the longest parts of nvme device probe in scheduled work.
      This speeds up probe significantly when multiple devices are in use.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      2e1d8448
    • K
      NVMe: Register management handle under nvme class · b3fffdef
      Keith Busch 提交于
      This creates a new class type for nvme devices to register their
      management character devices with. This is so we do not rely on miscdev
      to provide enough minors for as many nvme devices some people plan to
      use. The previous limit was approximately 60 NVMe controllers, depending
      on the platform and kernel. Now the limit is 1M, which ought to be enough
      for anybody.
      
      Since we have a new device class, it makes sense to attach the block
      devices under this as well, so part of this patch moves the management
      handle initialization prior to the namespaces discovery.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      b3fffdef
    • K
      NVMe: Update SCSI Inquiry VPD 83h translation · 4f1982b4
      Keith Busch 提交于
      The original translation created collisions on Inquiry VPD 83 for many
      existing devices. Newer specifications provide other ways to translate
      based on the device's version can be used to create unique identifiers.
      
      Version 1.1 provides an EUI64 field that uniquely identifies each
      namespace, and 1.2 added the longer NGUID field for the same reason.
      Both follow the IEEE EUI format and readily translate to the SCSI device
      identification EUI designator type 2h. For devices implementing either,
      the translation will use this type, defaulting to the EUI64 8-byte type if
      implemented then NGUID's 16 byte version if not. If neither are provided,
      the 1.0 translation is used, and is updated to use the SCSI String format
      to guarantee a unique identifier.
      
      Knowing when to use the new fields depends on the nvme controller's
      revision. The NVME_VS macro was not decoding this correctly, so that is
      fixed in this patch and moved to a more appropriate place.
      
      Since the Identify Namespace structure required an update for the NGUID
      field, this patch adds the remaining new 1.2 fields to the structure.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      4f1982b4