1. 25 May, 2018 1 commit
  2. 09 Mar, 2018 1 commit
  3. 08 Mar, 2018 1 commit
  4. 31 Jan, 2018 1 commit
    • blk-mq: introduce BLK_STS_DEV_RESOURCE · 86ff7c2a
      Committed by Ming Lei
      This status is returned from driver to block layer if device related
      resource is unavailable, but driver can guarantee that IO dispatch
      will be triggered in future when the resource is available.
      
      Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
      returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
      a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
      3 ms because both scsi-mq and nvmefc are using that magic value.
      
      If a driver can make sure there is in-flight IO, it is safe to return
      BLK_STS_DEV_RESOURCE because:
      
      1) If all in-flight IOs complete before examining SCHED_RESTART in
      blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
      is run immediately in this case by blk_mq_dispatch_rq_list();
      
      2) if there is any in-flight IO after/when examining SCHED_RESTART
      in blk_mq_dispatch_rq_list():
      - if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
      - otherwise, this request will be dispatched after any in-flight IO is
        completed via blk_mq_sched_restart()
      
      3) if SCHED_RESTART is set concurrently in context because of
      BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
      cases and make sure IO hang can be avoided.
      
      One invariant is that queue will be rerun if SCHED_RESTART is set.
      Suggested-by: Jens Axboe <axboe@kernel.dk>
      Tested-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
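The rerun rules in the commit message above can be sketched as a small decision function. This is a userspace model only: `enum busy_status`, `enum rerun`, and `choose_rerun()` are hypothetical names, not kernel symbols, and the sketch assumes the driver has already reported that it could not accept the request.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two "busy" statuses a driver may return. */
enum busy_status { STS_RESOURCE, STS_DEV_RESOURCE };

/* How the block layer would get the queue running again. */
enum rerun {
    RERUN_NOW,           /* run the queue immediately */
    RERUN_DELAYED,       /* run the queue after the short delay (3 ms per the message) */
    RERUN_ON_COMPLETION  /* rely on an in-flight IO completing and restarting the queue */
};

static enum rerun choose_rerun(enum busy_status ret, bool sched_restart_set)
{
    /* SCHED_RESTART not set: nothing else will restart us, so run right away. */
    if (!sched_restart_set)
        return RERUN_NOW;

    /* Plain resource shortage: no completion is guaranteed, so use a delayed rerun. */
    if (ret == STS_RESOURCE)
        return RERUN_DELAYED;

    /* Device resource shortage: the driver guarantees a later dispatch trigger. */
    return RERUN_ON_COMPLETION;
}
```

In all three branches the queue is eventually rerun, which is the invariant the message states for the SCHED_RESTART case.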
  5. 18 Aug, 2017 1 commit
  6. 15 Aug, 2017 1 commit
    • xen-blkfront: use a right index when checking requests · b15bd8cb
      Committed by Munehisa Kamata
      Since commit d05d7f40 ("Merge branch 'for-4.8/core' of
      git://git.kernel.dk/linux-block") and 3fc9d690 ("Merge branch
      'for-4.8/drivers' of git://git.kernel.dk/linux-block"), blkfront_resume()
      has been using an index for iterating ring_info to check request when
      iterating blk_shadow in an inner loop. This seems to have been
      accidentally introduced during the massive rewrite of the block layer
      macros in the commits.
      
      This may cause crash like this:
      
      [11798.057074] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
      [11798.058832] IP: [<ffffffff814411fa>] blkfront_resume+0x10a/0x610
      ....
      [11798.061063] Call Trace:
      [11798.061063]  [<ffffffff8139ce93>] xenbus_dev_resume+0x53/0x140
      [11798.061063]  [<ffffffff8139ce40>] ? xenbus_dev_probe+0x150/0x150
      [11798.061063]  [<ffffffff813f359e>] dpm_run_callback+0x3e/0x110
      [11798.061063]  [<ffffffff813f3a08>] device_resume+0x88/0x190
      [11798.061063]  [<ffffffff813f4cc0>] dpm_resume+0x100/0x2d0
      [11798.061063]  [<ffffffff813f5221>] dpm_resume_end+0x11/0x20
      [11798.061063]  [<ffffffff813950a8>] do_suspend+0xe8/0x1a0
      [11798.061063]  [<ffffffff813954bd>] shutdown_handler+0xfd/0x130
      [11798.061063]  [<ffffffff8139aba0>] ? split+0x110/0x110
      [11798.061063]  [<ffffffff8139ac26>] xenwatch_thread+0x86/0x120
      [11798.061063]  [<ffffffff810b4570>] ? prepare_to_wait_event+0x110/0x110
      [11798.061063]  [<ffffffff8108fe57>] kthread+0xd7/0xf0
      [11798.061063]  [<ffffffff811da811>] ? kfree+0x121/0x170
      [11798.061063]  [<ffffffff8108fd80>] ? kthread_park+0x60/0x60
      [11798.061063]  [<ffffffff810863b0>] ?  call_usermodehelper_exec_work+0xb0/0xb0
      [11798.061063]  [<ffffffff810864ea>] ?  call_usermodehelper_exec_async+0x13a/0x140
      [11798.061063]  [<ffffffff81534a45>] ret_from_fork+0x25/0x30
      
      Use the right index in the inner loop.
      
      Fixes: d05d7f40 ("Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block")
      Fixes: 3fc9d690 ("Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block")
      Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
      Reviewed-by: Thomas Friebel <friebelt@amazon.de>
      Reviewed-by: Eduardo Valentin <eduval@amazon.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
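The index mix-up described above is easiest to see in a reduced model of the nested loop. The sketch below is not the driver code: the struct layouts, `RING_SIZE`, and `count_pending()` are simplified assumptions that only keep the outer ring loop and the inner shadow loop.

```c
#define RING_SIZE 4  /* illustrative size, not the real ring size */

/* Simplified stand-ins for the driver's per-ring bookkeeping. */
struct shadow { int request; };  /* non-zero means a request is outstanding */
struct ring_info { struct shadow shadow[RING_SIZE]; };

/* Count outstanding requests across all rings. */
static int count_pending(const struct ring_info *rings, int nr_rings)
{
    int i, j, pending = 0;

    for (i = 0; i < nr_rings; i++) {
        const struct ring_info *rinfo = &rings[i];

        for (j = 0; j < RING_SIZE; j++) {
            /*
             * The inner loop must index the shadow array with j.
             * Using the outer index i here would test the same slot
             * RING_SIZE times and miss requests in the other slots.
             */
            if (rinfo->shadow[j].request)
                pending++;
        }
    }
    return pending;
}
```

With one outstanding request in slot 3 of ring 0 and one in slot 2 of ring 1, a loop that indexed with `i` would look only at slots 0 and 1 and find neither.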
  7. 25 Jul, 2017 2 commits
  8. 24 Jul, 2017 1 commit
    • xen-blkfront: Fix handling of non-supported operations · 31c4ccc3
      Committed by Bart Van Assche
      This patch fixes the following sparse warnings:
      
      drivers/block/xen-blkfront.c:916:45: warning: incorrect type in argument 2 (different base types)
      drivers/block/xen-blkfront.c:916:45:    expected restricted blk_status_t [usertype] error
      drivers/block/xen-blkfront.c:916:45:    got int [signed] error
      drivers/block/xen-blkfront.c:1599:47: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1599:47:    expected int [signed] error
      drivers/block/xen-blkfront.c:1599:47:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1607:55: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1607:55:    expected int [signed] error
      drivers/block/xen-blkfront.c:1607:55:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1625:55: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1625:55:    expected int [signed] error
      drivers/block/xen-blkfront.c:1625:55:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1628:62: warning: restricted blk_status_t degrades to integer
      
      Compile-tested only.
      
      Fixes: commit 2a842aca ("block: introduce new block status code type")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Roger Pau Monné <roger.pau@citrix.com>
      Cc: <xen-devel@lists.xenproject.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  9. 28 Jun, 2017 1 commit
  10. 19 Jun, 2017 1 commit
  11. 09 Jun, 2017 3 commits
    • block: switch bios to blk_status_t · 4e4cbee9
      Committed by Christoph Hellwig
      Replace bi_error with a new bi_status to allow for a clear conversion.
      Note that device mapper overloaded bi_error with a private value, which
      we'll have to keep around at least for now and thus propagate to a
      proper blk_status_t value.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653
      Committed by Christoph Hellwig
      Use the same values that are used for request completion errors as the return
      value from ->queue_rq.  BLK_STS_RESOURCE is special cased to cause
      a requeue, and all the others are completed as-is.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
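The special-casing described above can be modelled as a three-way switch on the value a driver returns. This is a sketch under assumptions: `enum q_status`, `enum q_action`, and `handle_queue_rq_result()` are hypothetical names, and only three status values are modelled.

```c
/* Hypothetical stand-ins for a few status values. */
enum q_status { Q_STS_OK, Q_STS_RESOURCE, Q_STS_IOERR };

/* What the caller does with the request afterwards. */
enum q_action {
    Q_ACT_STARTED,   /* the driver accepted the request */
    Q_ACT_REQUEUE,   /* put the request back and try again later */
    Q_ACT_COMPLETE   /* complete the request with the returned status as-is */
};

static enum q_action handle_queue_rq_result(enum q_status ret)
{
    switch (ret) {
    case Q_STS_OK:
        return Q_ACT_STARTED;
    case Q_STS_RESOURCE:
        /* The one special case: a resource shortage causes a requeue. */
        return Q_ACT_REQUEUE;
    default:
        /* Every other status doubles as the completion error. */
        return Q_ACT_COMPLETE;
    }
}
```

Because the return value and the completion status share one value space, the default branch needs no translation step.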
    • block: introduce new block status code type · 2a842aca
      Committed by Christoph Hellwig
      Currently we use normal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have a
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return something that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low hanging
      fruit to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
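A minimal userspace sketch of the idea above: a dedicated status type that sparse can type-check, plus helpers that convert to and from errno values. The names `status_t`, `STS_*`, `status_to_errno()`, and `errno_to_status()` are hypothetical, the numeric values are illustrative, and only two error cases are modelled.

```c
#include <errno.h>

/*
 * Sparse defines __CHECKER__ and understands these attributes; a normal
 * compiler sees empty macros and treats status_t as a plain integer type.
 */
#ifdef __CHECKER__
#define STS_BITWISE __attribute__((bitwise))
#define STS_FORCE   __attribute__((force))
#else
#define STS_BITWISE
#define STS_FORCE
#endif

typedef unsigned char STS_BITWISE status_t;

/* Illustrative values only. */
#define STS_OK    ((STS_FORCE status_t)0)
#define STS_NOSPC ((STS_FORCE status_t)3)
#define STS_IOERR ((STS_FORCE status_t)10)

/* Map a status code to a negative errno for callers that still expect one. */
static int status_to_errno(status_t s)
{
    if (s == STS_OK)
        return 0;
    if (s == STS_NOSPC)
        return -ENOSPC;
    return -EIO;  /* anything unrecognised degrades to a generic IO error */
}

/* Map a negative errno from a lower layer to a status code. */
static status_t errno_to_status(int err)
{
    if (err == 0)
        return STS_OK;
    if (err == -ENOSPC)
        return STS_NOSPC;
    return STS_IOERR;
}
```

The round trip is lossy by design: an errno with no dedicated status code comes back as `-EIO`, which mirrors the message's point that the initial set of codes is deliberately small.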
  12. 21 Apr, 2017 2 commits
  13. 18 Apr, 2017 1 commit
  14. 31 Mar, 2017 1 commit
  15. 01 Feb, 2017 1 commit
    • block: fold cmd_type into the REQ_OP_ space · aebf526b
      Committed by Christoph Hellwig
      Instead of keeping two levels of indirection for request types, fold it
      all into the operations.  The little caveat here is that previously
      cmd_type only applied to struct request, while the request and bio op
      fields were set to plain REQ_OP_READ/WRITE even for passthrough
      operations.
      
      Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
      private requests, although it has to add two for each so that we
      can communicate the data in/out nature of the request.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  16. 24 Jan, 2017 2 commits
  17. 07 Nov, 2016 1 commit
  18. 03 Nov, 2016 2 commits
    • blk-mq: Add a kick_requeue_list argument to blk_mq_requeue_request() · 2b053aca
      Committed by Bart Van Assche
      Most blk_mq_requeue_request() and blk_mq_add_to_requeue_list() calls
      are followed by kicking the requeue list. Hence add an argument to
      these two functions that allows kicking the requeue list. This was
      proposed by Christoph Hellwig.
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
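The shape of this API change can be illustrated with a toy queue. Everything here is a hypothetical model (`struct toy_queue`, `toy_kick_requeue_list()`, `toy_requeue_request()`); it only shows how folding the follow-up call into a boolean argument removes the repeated two-call pattern at call sites.

```c
#include <stdbool.h>

/* Toy queue that just counts what happened to it. */
struct toy_queue {
    int requeued;  /* requests placed on the requeue list */
    int kicks;     /* times the requeue list was kicked */
};

static void toy_kick_requeue_list(struct toy_queue *q)
{
    q->kicks++;
}

/* Requeue one request and, if asked, kick the requeue list in the same call. */
static void toy_requeue_request(struct toy_queue *q, bool kick_requeue_list)
{
    q->requeued++;
    if (kick_requeue_list)
        toy_kick_requeue_list(q);
}
```

A caller that batches several requeues can pass `false` for all but the last one, so the list is kicked once instead of once per request.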
    • blk-mq: Avoid that requeueing starts stopped queues · 52d7f1b5
      Committed by Bart Van Assche
      Since blk_mq_requeue_work() starts stopped queues and since
      execution of this function can be scheduled after a queue has
      been stopped it is not possible to stop queues without using
      an additional state variable to track whether or not the queue
      has been stopped. Hence modify blk_mq_requeue_work() such that it
      does not start stopped queues. My conclusion after a review of
      the blk_mq_stop_hw_queues() and blk_mq_{delay_,}kick_requeue_list()
      callers is as follows:
      * In the dm driver starting and stopping queues should only happen
        if __dm_suspend() or __dm_resume() is called and not if the
        requeue list is processed.
      * In the SCSI core queue stopping and starting should only be
        performed by the scsi_internal_device_block() and
        scsi_internal_device_unblock() functions but not by any other
        function. Although the blk_mq_stop_hw_queue() call in
        scsi_queue_rq() may help to reduce CPU load if a LLD queue is
        full, figuring out whether or not a queue should be restarted
        when requeueing a command would require to introduce additional
        locking in scsi_mq_requeue_cmd() to avoid a race with
        scsi_internal_device_block(). Avoid this complexity by removing
        the blk_mq_stop_hw_queue() call from scsi_queue_rq().
      * In the NVMe core only the functions that call
        blk_mq_start_stopped_hw_queues() explicitly should start stopped
        queues.
      * A blk_mq_start_stopped_hw_queues() call must be added in the
        xen-blkfront driver in its blkif_recover() function.
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Roger Pau Monné <roger.pau@citrix.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: James Bottomley <jejb@linux.vnet.ibm.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  19. 15 Sep, 2016 1 commit
  20. 20 Aug, 2016 3 commits
  21. 22 Jul, 2016 1 commit
  22. 30 Jun, 2016 1 commit
    • xen-blkfront: save uncompleted reqs in blkfront_resume() · 7b427a59
      Committed by Bob Liu
      Uncompleted reqs used to be 'saved and resubmitted' in blkfront_recover() during
      migration, but that's too late after multi-queue was introduced.
      
      After a migration to another host (which may not have multiqueue support), the
      number of rings (block hardware queues) may change, and the ring and shadow
      structures are then reallocated.
      
      blkfront_recover() then can't 'save and resubmit' the real
      uncompleted reqs because the shadow structures have been reallocated.
      
      This patch fixes this issue by moving the 'save' logic out of
      blkfront_recover() to an earlier place, in blkfront_resume().
      
      The 'resubmit' logic is unchanged and stays in blkfront_recover().
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: stable@vger.kernel.org
  23. 28 Jun, 2016 1 commit
    • block: convert to device_add_disk() · 0d52c756
      Committed by Dan Williams
      For block drivers that specify a parent device, convert them to use
      device_add_disk().
      
      This conversion was done with the following semantic patch:
      
          @@
          struct gendisk *disk;
          expression E;
          @@
      
          - disk->driverfs_dev = E;
          ...
          - add_disk(disk);
          + device_add_disk(E, disk);
      
          @@
          struct gendisk *disk;
          expression E1, E2;
          @@
      
          - disk->driverfs_dev = E1;
          ...
          E2 = disk;
          ...
          - add_disk(E2);
          + device_add_disk(E1, E2);
      
      ...plus some manual fixups for a few missed conversions.
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  24. 09 Jun, 2016 3 commits
    • block: add a separate operation type for secure erase · 288dab8a
      Committed by Christoph Hellwig
      Instead of overloading the discard support with the REQ_SECURE flag.
      Use the opportunity to rename the queue flag as well, and remove the
      dead checks for this flag in the RAID 1 and RAID 10 drivers that don't
      claim support for secure erase.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • xen-blkfront: fix resume issues after a migration · 2a6f71ad
      Committed by Bob Liu
      After a migration to another host (which may not have multiqueue
      support), the number of rings (block hardware queues)
      may change, and the ring info structure is then reallocated.
      
      This patch fixes two related bugs:
       * Call blk_mq_update_nr_hw_queues() to let blk-core know that the number
         of hardware queues has changed.
       * Don't store the rinfo pointer in hctx->driver_data, because rinfo may be
         reallocated; use hctx->queue_num to get the rinfo structure instead.
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
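The second fix above, looking the ring up by index instead of caching a pointer, can be sketched in userspace as follows. The structs and the helpers `realloc_rings()` and `get_rinfo()` are simplified assumptions for illustration, not the driver's definitions.

```c
#include <stdlib.h>

/* Simplified stand-ins for the driver's structures. */
struct ring_info { unsigned int id; };
struct front_info {
    struct ring_info *rinfo;  /* array of rings, replaced on resume */
    unsigned int nr_rings;
};
struct hw_ctx { unsigned int queue_num; };  /* stable hardware queue index */

/* Replace the ring array, as happens when the ring count changes. */
static int realloc_rings(struct front_info *info, unsigned int nr)
{
    struct ring_info *fresh = calloc(nr, sizeof(*fresh));
    unsigned int i;

    if (!fresh)
        return -1;
    for (i = 0; i < nr; i++)
        fresh[i].id = i;
    free(info->rinfo);  /* any pointer cached into the old array is now stale */
    info->rinfo = fresh;
    info->nr_rings = nr;
    return 0;
}

/* Resolve the ring at use time from the queue index, never from a cached pointer. */
static struct ring_info *get_rinfo(const struct front_info *info,
                                   const struct hw_ctx *hctx)
{
    if (hctx->queue_num >= info->nr_rings)
        return NULL;
    return &info->rinfo[hctx->queue_num];
}
```

The bounds check in `get_rinfo()` is an addition for this sketch; it makes the shrinking-ring-count case visible instead of reading past the new array.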
    • xen-blkfront: don't call talk_to_blkback when already connected to blkback · efd15352
      Committed by Bob Liu
      Sometimes blkfront may receive the blkback_changed() notification
      (XenbusStateConnected) twice after migration, which causes
      talk_to_blkback() to be called twice too and confuses xen-blkback.
      
      The flow is as follows:
         blkfront                                        blkback
      blkfront_resume()
       > talk_to_blkback()
        > Set blkfront to XenbusStateInitialised
                                                      front changed()
                                                       > Connect()
                                                        > Set blkback to XenbusStateConnected
      
      blkback_changed()
       > Skip talk_to_blkback()
         because frontstate == XenbusStateInitialised
       > blkfront_connect()
        > Set blkfront to XenbusStateConnected
      
      -----
      And here we get another XenbusStateConnected notification leading
      to:
      -----
      blkback_changed()
       > because now frontstate != XenbusStateInitialised
         talk_to_blkback() is also called again
        > blkfront state changed from
        XenbusStateConnected to XenbusStateInitialised
          (Which is not correct!)
      
                                                      front_changed():
                                                       > Do nothing because blkback
                                                         already in XenbusStateConnected
      
      Now blkback is in XenbusStateConnected but blkfront is still
      in XenbusStateInitialised - leading to no disks.
      
      Poking of the XenbusStateConnected state is allowed (to deal with
      block disk change) and has to be dealt with. The most likely
      cause of this bug is custom udev scripts hooking up the disks
      and then validating the size.
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
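The guard implied by the flow above can be written as a single predicate over the frontend state. This is a sketch, not the driver's code: `enum front_state` and `should_talk_to_backend()` are hypothetical names, and the exact condition used in the driver is not shown in the message.

```c
#include <stdbool.h>

/* Hypothetical subset of frontend states relevant to the flow above. */
enum front_state {
    FRONT_INITIALISING,
    FRONT_INITIALISED,
    FRONT_CONNECTED
};

/*
 * Decide whether a "backend is Connected" notification should trigger a
 * fresh negotiation. Skipping it when the frontend is already Initialised
 * or Connected keeps a repeated notification from resetting the frontend.
 */
static bool should_talk_to_backend(enum front_state front)
{
    return front != FRONT_INITIALISED && front != FRONT_CONNECTED;
}
```

With only the `FRONT_INITIALISED` check, the second notification in the flow would pass the guard, which is the double negotiation the commit title refers to.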
  25. 08 Jun, 2016 4 commits
  26. 13 Apr, 2016 1 commit
  27. 04 Mar, 2016 1 commit