1. 26 Sep, 2014 10 commits
  2. 23 Sep, 2014 15 commits
  3. 10 Sep, 2014 2 commits
    • blk-mq: scale depth and rq map appropriate if low on memory · a5164405
      Committed by Jens Axboe
      If we are running in a kdump environment, resources are scarce.
      For some SCSI setups with a huge set of shared tags, we run out
      of memory allocating what the driver is asking for. So implement
      scale-back logic to reduce the tag depth for those cases, allowing
      the driver to successfully load.
      
      We should extend this to detect low memory situations, and implement
      a sane fallback for those (1 queue, 64 tags, or something like that).
      Tested-by: Robert Elliott <elliott@hp.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
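The scale-back idea in the blk-mq commit message above can be illustrated with a small userspace sketch. This is not the kernel code: `sketch_alloc_rq_maps()`, `sketch_alloc_with_scaleback()`, `SKETCH_TAG_MIN`, and the "budget" that makes allocations fail are all hypothetical names invented for the example. It only shows the pattern of retrying with half the depth until the allocation fits or a minimum is reached.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical floor for the usable tag depth; not taken from the commit. */
#define SKETCH_TAG_MIN 1u

/*
 * Hypothetical stand-in for allocating the request maps of all hardware
 * queues. It "fails" whenever the requested depth exceeds a memory budget.
 */
static int sketch_alloc_rq_maps(unsigned int depth, unsigned int budget)
{
    return depth <= budget ? 0 : -1;
}

/*
 * Try the requested depth first and halve it after each failed allocation.
 * On success *depth holds the depth that fit; on failure it is unchanged.
 */
static int sketch_alloc_with_scaleback(unsigned int *depth,
                                       unsigned int reserved,
                                       unsigned int budget)
{
    unsigned int d;

    assert(depth != NULL);
    d = *depth;

    while (d >= reserved + SKETCH_TAG_MIN) {
        if (sketch_alloc_rq_maps(d, budget) == 0) {
            *depth = d;
            return 0;
        }
        d >>= 1; /* scale back and retry */
    }
    return -1; /* even the smallest usable depth did not fit */
}
```

With a requested depth of 1024 and a budget of 100, the sketch tries 1024, 512, 256, and 128 before settling on 64. With a budget of 0 it gives up and leaves the requested depth untouched.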
    • Block: fix unbalanced bypass-disable in blk_register_queue · df35c7c9
      Committed by Alan Stern
      When a queue is registered, the block layer turns off the bypass
      setting (because bypass is enabled when the queue is created).  This
      doesn't work well for queues that are unregistered and then registered
      again; we get a WARNING because of the unbalanced calls to
      blk_queue_bypass_end().
      
      This patch fixes the problem by making blk_register_queue() call
      blk_queue_bypass_end() only the first time the queue is registered.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Tejun Heo <tj@kernel.org>
      CC: James Bottomley <James.Bottomley@HansenPartnership.com>
      CC: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@fb.com>
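A minimal sketch of the "only the first time" rule from the commit message above, written as a userspace model rather than kernel code. The struct, the `init_done` flag, and the `warnings` counter are assumptions made for the illustration; the real patch may track first registration differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queue; field names are invented for this sketch. */
struct sketch_queue {
    int bypass_depth; /* > 0 while the queue is in bypass mode */
    bool init_done;   /* set once the first registration has run */
    int warnings;     /* counts unbalanced bypass-end calls */
};

static void sketch_queue_init(struct sketch_queue *q)
{
    q->bypass_depth = 1; /* bypass is enabled when the queue is created */
    q->init_done = false;
    q->warnings = 0;
}

static void sketch_bypass_end(struct sketch_queue *q)
{
    if (q->bypass_depth <= 0) {
        q->warnings++; /* models the WARNING about unbalanced calls */
        return;
    }
    q->bypass_depth--;
}

/* Ends bypass only on the first registration; later calls leave it alone. */
static void sketch_register_queue(struct sketch_queue *q)
{
    assert(q != NULL);
    if (!q->init_done) {
        q->init_done = true;
        sketch_bypass_end(q);
    }
}
```

Registering the same queue twice in this model leaves `bypass_depth` at 0 and `warnings` at 0, whereas calling `sketch_bypass_end()` unconditionally on each registration would bump the warning counter on the second call.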
  4. 09 Sep, 2014 1 commit
    • block, bdi: an active gendisk always has a request_queue associated with it · ff9ea323
      Committed by Tejun Heo
      bdev_get_queue() returns the request_queue associated with the
      specified block_device.  blk_get_backing_dev_info() makes use of
      bdev_get_queue() to determine the associated bdi given a block_device.
      
      All the callers of bdev_get_queue() including
      blk_get_backing_dev_info() assume that bdev_get_queue() may return
      NULL and implement NULL handling; however, bdev_get_queue() requires
      that the passed-in block_device be opened and attached to its gendisk.
      Because an active gendisk always has a valid request_queue associated
      with it, bdev_get_queue() can never return NULL and neither can
      blk_get_backing_dev_info().
      
      Make it clear that neither of the two functions can return NULL and
      remove NULL handling from all the callers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  5. 08 Sep, 2014 1 commit
    • blkcg: remove blkcg->id · f4da8072
      Committed by Tejun Heo
      blkcg->id is a unique id given to each blkcg; however, the
      cgroup_subsys_state which each blkcg embeds already has ->serial_nr
      which can be used for the same purpose.  Drop blkcg->id and replace
      its uses with blkcg->css.serial_nr.  Rename cfq_cgroup->blkcg_id to
      ->blkcg_serial_nr and @id in check_blkcg_changed() to @serial_nr for
      consistency.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  6. 04 Sep, 2014 2 commits
    • block: Fix dev_t minor allocation lifetime · 2da78092
      Committed by Keith Busch
      Releases the dev_t minor when all references are closed to prevent
      another device from acquiring the same major/minor.
      
      Since the partition's release may be invoked from call_rcu's soft-irq
      context, the ext_dev_idr's mutex had to be replaced with a spinlock so
      as not to sleep.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq: cleanup after blk_mq_init_rq_map failures · 5676e7b6
      Committed by Robert Elliott
      In blk-mq.c blk_mq_alloc_tag_set, if:
      	set->tags = kmalloc_node()
      succeeds, but one of the blk_mq_init_rq_map() calls fails,
      	goto out_unwind;
      needs to free set->tags so the caller is not obligated
      to do so.  None of the current callers (null_blk,
      virtio_blk, or the forthcoming scsi-mq)
      do so.
      
      set->tags needs to be set to NULL after doing so,
      so other tag cleanup logic doesn't try to free
      a stale pointer later.  Also set it to NULL
      in blk_mq_free_tag_set.
      
      Tested with error injection on the forthcoming
      scsi-mq + hpsa combination.
      Signed-off-by: Robert Elliott <elliott@hp.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
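The unwind-and-reset pattern described in the commit message above might look roughly like the following userspace sketch. It uses `calloc()`/`free()` in place of the kernel allocators, and `sketch_tag_set`, `sketch_init_rq_map()`, and the `fail_at` error-injection parameter are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a tag set; not the kernel structure. */
struct sketch_tag_set {
    unsigned int nr_queues;
    void **tags; /* one entry per hardware queue, or NULL when unallocated */
};

/* Hypothetical per-queue map allocation with simple error injection. */
static void *sketch_init_rq_map(unsigned int idx, unsigned int fail_at)
{
    return idx == fail_at ? NULL : malloc(16);
}

static int sketch_alloc_tag_set(struct sketch_tag_set *set,
                                unsigned int fail_at)
{
    unsigned int i;

    assert(set != NULL);
    set->tags = calloc(set->nr_queues, sizeof(*set->tags));
    if (!set->tags)
        return -1;

    for (i = 0; i < set->nr_queues; i++) {
        set->tags[i] = sketch_init_rq_map(i, fail_at);
        if (!set->tags[i])
            goto out_unwind;
    }
    return 0;

out_unwind:
    while (i--)
        free(set->tags[i]); /* free only the maps that were set up */
    free(set->tags);        /* the caller is not obligated to do this */
    set->tags = NULL;       /* no stale pointer for later cleanup */
    return -1;
}

static void sketch_free_tag_set(struct sketch_tag_set *set)
{
    unsigned int i;

    if (!set->tags)
        return;
    for (i = 0; i < set->nr_queues; i++)
        free(set->tags[i]);
    free(set->tags);
    set->tags = NULL; /* also reset here, as the commit message notes */
}
```

Because `set->tags` is reset to NULL on both paths, calling `sketch_free_tag_set()` after a failed `sketch_alloc_tag_set()` is a harmless no-op instead of a double free.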
  7. 03 Sep, 2014 1 commit
  8. 29 Aug, 2014 2 commits
  9. 28 Aug, 2014 1 commit
  10. 27 Aug, 2014 2 commits
    • block,scsi: verify return pointer from blk_get_request · eb571eea
      Committed by Joe Lawrence
      The blk-core dead queue checks introduce an error scenario to
      blk_get_request that returns NULL if the request queue has been
      shut down. This affects the behavior for __GFP_WAIT callers, who should
      verify the return value before dereferencing.
      Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
      Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
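The caller-side check that the commit message above asks for can be sketched in userspace as follows. The types, `demo_get_request()`, `demo_send_command()`, and the choice of `-ENODEV` as the error code are assumptions made for the illustration; the actual callers touched by the patch may report the failure differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a request queue and a request. */
struct demo_queue {
    bool dying; /* true once the queue has been shut down */
};

struct demo_request {
    struct demo_queue *q;
};

/* Returns NULL when the queue is dead, even for callers willing to wait. */
static struct demo_request *demo_get_request(struct demo_queue *q)
{
    struct demo_request *rq;

    if (q->dying)
        return NULL;
    rq = malloc(sizeof(*rq));
    if (rq)
        rq->q = q;
    return rq;
}

/* A caller that verifies the pointer before dereferencing it. */
static int demo_send_command(struct demo_queue *q)
{
    struct demo_request *rq;

    assert(q != NULL);
    rq = demo_get_request(q);
    if (!rq)
        return -ENODEV; /* queue is gone (or allocation failed) */

    /* ... fill in and issue the request here ... */
    free(rq);
    return 0;
}
```

In this model `demo_send_command()` returns 0 on a live queue and `-ENODEV` on a queue marked as dying, without ever touching a NULL request.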
    • cfq-iosched: Fix wrong children_weight calculation · e15693ef
      Committed by Toshiaki Makita
      cfq_group_service_tree_add() is applying new_weight at the beginning of
      the function via cfq_update_group_weight().
      This actually allows weight to change between adding it to and subtracting
      it from children_weight, and triggers WARN_ON_ONCE() in
      cfq_group_service_tree_del(), or even causes oops by divide error during
      vfr calculation in cfq_group_service_tree_add().
      
      The detailed scenario is as follows:
      1. Create blkio cgroup X, and cgroup Y as a child of X.
         Set X's weight to 500 and perform some I/O to apply new_weight.
         X's I/O completes before Y's I/O starts.
      2. Y starts I/O and cfq_group_service_tree_add() is called with Y.
      3. cfq_group_service_tree_add() walks up the tree during children_weight
         calculation and adds parent X's weight (500) to children_weight of root.
         children_weight becomes 500.
      4. Set X's weight to 1000.
      5. X starts I/O and cfq_group_service_tree_add() is called with X.
      6. cfq_group_service_tree_add() applies its new_weight (1000).
      7. I/O of Y completes and cfq_group_service_tree_del() is called with Y.
      8. I/O of X completes and cfq_group_service_tree_del() is called with X.
      9. cfq_group_service_tree_del() subtracts X's weight (1000) from
         children_weight of root. children_weight becomes -500.
         This triggers WARN_ON_ONCE().
      10. Set X's weight to 500.
      11. X starts I/O and cfq_group_service_tree_add() is called with X.
      12. cfq_group_service_tree_add() applies its new_weight (500) and adds it
          to children_weight of root. children_weight becomes 0. Calculation of
          vfr triggers oops by divide error.
      
      weight should be updated right before adding it to children_weight.
      Reported-by: Ruki Sekiya <sekiya.ruki@lab.ntt.co.jp>
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
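The twelve-step scenario in the commit message above can be replayed with a heavily simplified userspace model. This is a sketch, not the CFQ code: `struct wgroup` and the `wgroup_*` functions are hypothetical, leaf weights and the vfr calculation are omitted, and `children_weight` is a signed int so the negative value is visible. The `fixed` flag switches between applying the pending weight at the top of the add path (the old order) and applying it right before it is added to the parent's `children_weight`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a CFQ group. */
struct wgroup {
    struct wgroup *parent;
    int weight;          /* weight currently in effect */
    int new_weight;      /* pending weight set by the user */
    int nr_active;       /* active children, plus one if the group is active */
    int children_weight; /* sum of the weights of active children */
};

static void wgroup_add(struct wgroup *g, bool fixed)
{
    struct wgroup *pos = g;
    bool propagate;

    if (!fixed)
        g->weight = g->new_weight; /* old order: applied unconditionally */

    propagate = (pos->nr_active++ == 0);
    while (pos->parent) {
        if (propagate) {
            if (fixed)
                pos->weight = pos->new_weight; /* applied right before use */
            propagate = (pos->parent->nr_active++ == 0);
            pos->parent->children_weight += pos->weight;
        }
        pos = pos->parent;
    }
}

static void wgroup_del(struct wgroup *g)
{
    struct wgroup *pos = g;
    bool propagate = (--pos->nr_active == 0);

    while (propagate && pos->parent) {
        propagate = (--pos->parent->nr_active == 0);
        pos->parent->children_weight -= pos->weight;
        pos = pos->parent;
    }
}

/* Replays steps 1-12; reports root's children_weight after steps 9 and 12. */
static void wgroup_replay(bool fixed, int *after_del, int *after_readd)
{
    struct wgroup root = { NULL, 0, 0, 0, 0 };
    struct wgroup x = { &root, 500, 500, 0, 0 }; /* step 1: X at 500 */
    struct wgroup y = { &x, 500, 500, 0, 0 };

    assert(after_del != NULL && after_readd != NULL);
    wgroup_add(&y, fixed);  /* steps 2-3 */
    x.new_weight = 1000;    /* step 4 */
    wgroup_add(&x, fixed);  /* steps 5-6 */
    wgroup_del(&y);         /* step 7 */
    wgroup_del(&x);         /* steps 8-9 */
    *after_del = root.children_weight;
    x.new_weight = 500;     /* step 10 */
    wgroup_add(&x, fixed);  /* steps 11-12 */
    *after_readd = root.children_weight;
}
```

With `fixed` set to false the replay reports -500 after step 9 and 0 after step 12, matching the values in the scenario. With `fixed` set to true it reports 0 and 500, so the root's `children_weight` never goes negative or hits zero while X is active.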
  11. 26 Aug, 2014 1 commit
  12. 23 Aug, 2014 2 commits