1. 09 6月, 2017 2 次提交
    • C
      blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653
      Christoph Hellwig 提交于
      Use the same values for use for request completion errors as the return
      value from ->queue_rq.  BLK_STS_RESOURCE is special cased to cause
      a requeue, and all the others are completed as-is.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      fc17b653
    • C
      block: introduce new block status code type · 2a842aca
      Christoph Hellwig 提交于
      Currently we use nornal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new  blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have a
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return somethings that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low hanging
      fruite to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      2a842aca
  2. 26 5月, 2017 2 次提交
  3. 23 5月, 2017 2 次提交
    • M
      nvme: avoid to use blk_mq_abort_requeue_list() · 986f75c8
      Ming Lei 提交于
      NVMe may add request into requeue list simply and not kick off the
      requeue if hw queues are stopped. Then blk_mq_abort_requeue_list()
      is called in both nvme_kill_queues() and nvme_ns_remove() for
      dealing with this issue.
      
      Unfortunately blk_mq_abort_requeue_list() is absolutely a
      race maker, for example, one request may be requeued during
      the aborting. So this patch just calls blk_mq_kick_requeue_list() in
      nvme_kill_queues() to handle this issue like what nvme_start_queues()
      does. Now all requests in requeue list when queues are stopped will be
      handled by blk_mq_kick_requeue_list() when queues are restarted, either
      in nvme_start_queues() or in nvme_kill_queues().
      
      Cc: stable@vger.kernel.org
      Reported-by: NZhang Yi <yizhan@redhat.com>
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      986f75c8
    • M
      nvme: use blk_mq_start_hw_queues() in nvme_kill_queues() · 806f026f
      Ming Lei 提交于
      Inside nvme_kill_queues(), we have to start hw queues for
      draining requests in sw queues, .dispatch list and requeue list,
      so use blk_mq_start_hw_queues() instead of blk_mq_start_stopped_hw_queues()
      which only run queues if queues are stopped, but the queues may have
      been started already, for example nvme_start_queues() is called in reset work
      function.
      
      blk_mq_start_hw_queues() run hw queues in current context, instead
      of running asynchronously like before. Given nvme_kill_queues() is
      run from either remove context or reset worker context, both are fine
      to run hw queue directly. And the mutex of namespaces_mutex isn't a
      problem too becasue nvme_start_freeze() runs hw queue in this way
      already.
      
      Cc: stable@vger.kernel.org
      Reported-by: NZhang Yi <yizhan@redhat.com>
      Reviewed-by: NKeith Busch <keith.busch@intel.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      806f026f
  4. 25 4月, 2017 3 次提交
  5. 21 4月, 2017 6 次提交
  6. 09 4月, 2017 1 次提交
  7. 06 4月, 2017 4 次提交
  8. 04 4月, 2017 1 次提交
  9. 02 4月, 2017 1 次提交
  10. 29 3月, 2017 1 次提交
  11. 02 3月, 2017 1 次提交
    • K
      nvme: Complete all stuck requests · 302ad8cc
      Keith Busch 提交于
      If the nvme driver is shutting down its controller, the drievr will not
      start the queues up again, preventing blk-mq's hot CPU notifier from
      making forward progress.
      
      To fix that, this patch starts a request_queue freeze when the driver
      resets a controller so no new requests may enter. The driver will wait
      for frozen after IO queues are restarted to ensure the queue reference
      can be reinitialized when nvme requests to unfreeze the queues.
      
      If the driver is doing a safe shutdown, the driver will wait for the
      controller to successfully complete all inflight requests so that we
      don't unnecessarily fail them. Once the controller has been disabled,
      the queues will be restarted to force remaining entered requests to end
      in failure so that blk-mq's hot cpu notifier may progress.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      302ad8cc
  12. 23 2月, 2017 6 次提交
  13. 18 2月, 2017 2 次提交
  14. 15 2月, 2017 1 次提交
  15. 09 2月, 2017 1 次提交
  16. 07 2月, 2017 1 次提交
  17. 01 2月, 2017 1 次提交
    • C
      block: fold cmd_type into the REQ_OP_ space · aebf526b
      Christoph Hellwig 提交于
      Instead of keeping two levels of indirection for requests types, fold it
      all into the operations.  The little caveat here is that previously
      cmd_type only applied to struct request, while the request and bio op
      fields were set to plain REQ_OP_READ/WRITE even for passthrough
      operations.
      
      Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
      private requests, althought it has to add two for each so that we
      can communicate the data in/out nature of the request.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      aebf526b
  18. 31 1月, 2017 1 次提交
  19. 12 1月, 2017 1 次提交
    • G
      nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too · b5a10c5f
      Guilherme G. Piccoli 提交于
      Commit 54adc010 ("nvme/quirk: Add a delay before checking for adapter
      readiness") introduced a quirk to adapters that cannot read the bit
      NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
      need a delay or else the action of reading the bit NVME_CSTS_RDY could
      somehow corrupt adapter's registers state and it never recovers.
      
      When this quirk was added, we checked ctrl->tagset in order to avoid
      quirking in probe time, supposing we would never require such delay
      during probe. Well, it was too optimistic; we in fact need this quirk
      at probe time in some cases, like after a kexec.
      
      In some experiments, after abnormal shutdown of machine (aka power cord
      unplug), we booted into our bootloader in Power, which is a Linux kernel,
      and kexec'ed into another distro. If this kexec is too quick, we end up
      reaching the probe of NVMe adapter in that distro when adapter is in
      bad state (not fully initialized on our bootloader). What happens next
      is that nvme_wait_ready() is unable to complete, except if the quirk is
      enabled.
      
      So, this patch removes the original ctrl->tagset verification in order
      to enable the quirk even on probe time.
      
      Fixes: 54adc010 ("nvme/quirk: Add a delay before checking for adapter readiness")
      Reported-by: NAndrew Byrne <byrneadw@ie.ibm.com>
      Reported-by: NJaime A. H. Gomez <jahgomez@mx1.ibm.com>
      Reported-by: NZachary D. Myers <zdmyers@us.ibm.com>
      Signed-off-by: NGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Acked-by: NJeffrey Lien <Jeff.Lien@wdc.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      b5a10c5f
  20. 21 12月, 2016 1 次提交
    • K
      nvme: simplify stripe quirk · e6282aef
      Keith Busch 提交于
      Some OEMs believe they own the Identify Controller vendor specific
      region and will repurpose it with their own values. While not common,
      we can't rely on the PCI VID:DID to tell use how to decode the field
      we reserved for this as the stripe size so we need to do something else
      for the list of devices using this quirk.
      
      The field was supposed to allow flexibility on the device's back-end
      striping, but it turned out that never materialized; the chunk is always
      the same as MDTS in the products subscribing to this quirk, so this
      patch removes the stripe_size field and sets the chunk to the max hw
      transfer size for the devices using this quirk.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      e6282aef
  21. 14 12月, 2016 1 次提交
    • L
      Revert "nvme: add support for the Write Zeroes command" · cdb98c26
      Linus Torvalds 提交于
      This reverts commit 6d31e3ba.
      
      This causes bootup problems for me both on my laptop and my desktop.
      What they have in common is that they have NVMe disks with dm-crypt, but
      it's not the same controller, so it's not controller-specific.
      
      Jens does not see it on his machine (also NVMe), so it's presumably
      something that triggers just on bootup.  Possibly related to dm-crypt
      and the fact that I mark my luks volume with "allow-discards" in
      /etc/crypttab.
      
      It's 100% repeatable for me, which made it fairly straightforward to
      bisect the problem to this commit. Small mercies.
      
      So we don't know what the reason is yet, but the revert is needed to get
      things going again.
      Acked-by: NJens Axboe <axboe@fb.com>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@hgst.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cdb98c26