1. 23 Dec 2015, 1 commit
  2. 09 Dec 2015, 1 commit
  3. 08 Dec 2015, 1 commit
  4. 02 Dec 2015, 3 commits
    • null_blk: change type of completion_nsec to unsigned long · dbac1175
      Committed by Arianna Avanzini
      This commit at least doubles the maximum value for
      completion_nsec. This helps in special cases where one wants/needs to
      emulate an extremely slow I/O (for example to spot bugs).
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Signed-off-by: Arianna Avanzini <avanzini@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      dbac1175
    • null_blk: guarantee device restart in all irq modes · cf8ecc5a
      Committed by Arianna Avanzini
      In single-queue (block layer) mode, the function null_rq_prep_fn stops
      the device if alloc_cmd fails. Then, once stopped, the device must be
      restarted on the next command completion, so that the request(s) for
      which alloc_cmd failed can be requeued. Otherwise the device hangs.
      
      Unfortunately, device restart is currently performed only for delayed
      completions, i.e., in irqmode==2. This fact causes hangs, for the
      above reasons, with the other irqmodes in combination with single-queue
      block layer.
      
      This commit addresses this issue by making sure that, if stopped, the
      device is properly restarted for all irqmodes on completions.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Signed-off-by: Arianna Avanzini <avanzini@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      cf8ecc5a
    • null_blk: set a separate timer for each command · 3c395a96
      Committed by Paolo Valente
      For the Timer IRQ mode (i.e., when command completions are delayed),
      there is one timer for each CPU. Each of these timers:
      • has a completion queue associated with it, containing all the
        command completions to be executed when the timer fires;
      • is re-armed, and a new completion-to-execute is inserted into its
        completion queue, every time the dispatch code for a new command
        happens to run on the CPU associated with the timer.
      
      This implies that, if the dispatch of a new command happens to be
      executed on a CPU whose timer has already been set, but has not yet
      fired, then the timer is set again, to the completion time of the
      newly arrived command. When the timer eventually fires, all its queued
      completions are executed.
      
      This way of handling delayed command completions entails the following
      problem: if more than one command completion is inserted into the
      queue of a timer before the timer fires, then the expiration time for
      the timer is moved forward every time each of these completions is
      enqueued. As a consequence, only the last completion enqueued enjoys a
      correct execution time, while all previous completions are unjustly
      delayed until the last completion is executed (and at that time they
      are executed all together).
      
      Specifically, if all the above completions are enqueued almost at the
      same time, then the problem is negligible. On the opposite end, if
      every completion is enqueued a while after the previous completion was
      enqueued (in the extreme case, it is enqueued only right before the
      timer would have expired), then every enqueued completion, except for
      the last one, experiences an inflated delay, proportional to the number
      of completions enqueued after it. In the end, commands, and thus I/O
      requests, may be completed at an arbitrarily lower rate than the
      desired one.
      
      This commit addresses this issue by replacing per-CPU timers with
      per-command timers, i.e., by associating an individual timer with each
      command.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Signed-off-by: Arianna Avanzini <avanzini@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      3c395a96
  5. 20 Nov 2015, 3 commits
  6. 17 Nov 2015, 1 commit
    • null_blk: register as a LightNVM device · b2b7e001
      Committed by Matias Bjørling
      Add support for registering as a LightNVM device. This allows us to
      evaluate the performance of the LightNVM subsystem.
      
      In /drivers/Makefile, LightNVM is moved above block device drivers
      to make sure that the LightNVM media managers have been initialized
      before drivers under /drivers/block are initialized.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Fix by Jens Axboe to remove unneeded slab cache and the following
      memory leak.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b2b7e001
  7. 08 Nov 2015, 1 commit
  8. 01 Oct 2015, 1 commit
  9. 03 Sep 2015, 2 commits
  10. 29 Jul 2015, 1 commit
    • block: add a bi_error field to struct bio · 4246a0b6
      Committed by Christoph Hellwig
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and of not being passed along from child to
      parent bio in the ever more popular chaining scenario.  Having both
      mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      4246a0b6
  11. 23 Jul 2015, 1 commit
  12. 02 Jun 2015, 2 commits
    • null_blk: restart request processing on completion handler · 8b70f45e
      Committed by Akinobu Mita
      When irqmode=2 (IRQ completion handler is timer) and queue_mode=1
      (Block interface to use is rq), the completion handler should restart
      request handling for any pending requests on a queue, because request
      processing stops once the number of queued commands exceeds
      hw_queue_depth (null_rq_prep_fn returns BLKPREP_DEFER).
      
      Without this change, the following command cannot finish.
      
      	# modprobe null_blk irqmode=2 queue_mode=1 hw_queue_depth=1
      	# fio --name=t --rw=read --size=1g --direct=1 \
      	  --ioengine=libaio --iodepth=64 --filename=/dev/nullb0
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      8b70f45e
    • null_blk: prevent timer handler running on a different CPU where started · 419c21a3
      Committed by Akinobu Mita
      When irqmode=2 (IRQ completion handler is timer), the timer handler
      should be called on the same CPU where the timer was started.
      
      Since completion_queues are per-CPU and the completion handler only
      touches the completion_queue of the local CPU, we need to prevent the
      handler from running on a CPU other than the one where the timer was
      started. Otherwise, the I/O cannot be completed until another
      completion handler is executed on that CPU.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      419c21a3
  13. 28 May 2015, 1 commit
    • kernel/params: constify struct kernel_param_ops uses · 9c27847d
      Committed by Luis R. Rodriguez
      Most code already uses consts for the struct kernel_param_ops, so
      sweep the kernel for the last offending stragglers. Other than
      include/linux/moduleparam.h and kernel/params.c all other changes
      were generated with the following Coccinelle SmPL patch. Merge
      conflicts between trees can be handled with Coccinelle.
      
      In the future git could get Coccinelle merge support to deal with
      patch --> fail --> grammar --> Coccinelle --> new patch conflicts
      automatically for us on patches where the grammar is available and
      the patch is of high confidence. Consider this a feature request.
      
      Test compiled on x86_64 against:
      
      	* allnoconfig
      	* allmodconfig
      	* allyesconfig
      
      @ const_found @
      identifier ops;
      @@
      
      const struct kernel_param_ops ops = {
      };
      
      @ const_not_found depends on !const_found @
      identifier ops;
      @@
      
      -struct kernel_param_ops ops = {
      +const struct kernel_param_ops ops = {
      };
      
      Generated-by: Coccinelle SmPL
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Junio C Hamano <gitster@pobox.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: cocci@systeme.lip6.fr
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      9c27847d
  14. 17 Jan 2015, 1 commit
    • null_blk: suppress invalid partition info · 227290b4
      Committed by Jens Axboe
      null_blk is partitionable, but it doesn't store any of the info. When
      it is loaded, you would normally see:
      
      [1226739.343608]  nullb0: unknown partition table
      [1226739.343746]  nullb1: unknown partition table
      
      which can confuse some people. Add the appropriate gendisk flag
      to suppress this info.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      227290b4
  15. 03 Jan 2015, 1 commit
  16. 27 Nov 2014, 1 commit
  17. 30 Oct 2014, 1 commit
    • blk-mq: add a 'list' parameter to ->queue_rq() · 74c45052
      Committed by Jens Axboe
      Since we have the notion of a 'last' request in a chain, we can use
      this to have the hardware optimize the issuing of requests. Add
      a list_head parameter to queue_rq that the driver can use to
      temporarily store hw commands for issue when 'last' is true. If we
      are doing a chain of requests, pass in a NULL list for the first
      request to force issue of that immediately, then batch the remainder
      for deferred issue until the last request has been sent.
      
      Instead of adding yet another argument to the hot ->queue_rq path,
      encapsulate the passed arguments in a blk_mq_queue_data structure.
      This is passed as a constant, and has been tested as faster than
      passing 4 (or even 3) args through ->queue_rq. Update drivers for
      the new ->queue_rq() prototype. There are no functional changes
      in this patch for drivers - if they don't use the passed in list,
      then they will just queue requests individually like before.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      74c45052
  18. 22 Oct 2014, 1 commit
  19. 05 Oct 2014, 1 commit
    • block: disable entropy contributions for nonrot devices · b277da0a
      Committed by Mike Snitzer
      Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set
      QUEUE_FLAG_NONROT.
      
      Historically, all block devices have automatically made entropy
      contributions.  But as previously stated in commit e2e1a148 ("block: add
      sysfs knob for turning off disk entropy contributions"):
          - On SSD disks, the completion times aren't as random as they
            are for rotational drives. So it's questionable whether they
            should contribute to the random pool in the first place.
          - Calling add_disk_randomness() has a lot of overhead.
      
      There are more reliable sources for randomness than non-rotational block
      devices.  From a security perspective it is better to err on the side of
      caution than to allow entropy contributions from unreliable "random"
      sources.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b277da0a
  20. 23 Sep 2014, 3 commits
  21. 03 Sep 2014, 1 commit
    • blk-mq: pass along blk_mq_alloc_tag_set return values · dc501dc0
      Committed by Robert Elliott
      Two of the blk-mq based drivers do not pass back the return value
      from blk_mq_alloc_tag_set, instead just returning -ENOMEM.
      
      blk_mq_alloc_tag_set returns -EINVAL if the number of queues or
      queue depth is bad.  -ENOMEM implies that retrying after freeing some
      memory might be more successful, but that won't ever change
      in the -EINVAL cases.
      
      Change the null_blk and mtip32xx drivers to pass along
      the return value.
      Signed-off-by: Robert Elliott <elliott@hp.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      dc501dc0
  22. 17 Jun 2014, 1 commit
  23. 12 Jun 2014, 1 commit
  24. 29 May 2014, 1 commit
  25. 28 May 2014, 1 commit
  26. 01 May 2014, 1 commit
  27. 16 Apr 2014, 2 commits
  28. 11 Feb 2014, 1 commit
  29. 08 Feb 2014, 1 commit
    • block/null_blk: Fix completion processing from LIFO to FIFO · d7790b92
      Committed by Shlomo Pongratz
      The completion queue is implemented using lockless list.
      
      llist_add adds events at the list head, which is a push operation.
      The completion elements are processed by disconnecting all the
      pushed elements and iterating over the disconnected list. The problem
      is that the processing happens in reverse order w.r.t. the order of
      insertion, i.e., LIFO processing. Reversing the disconnected list,
      which takes linear time, achieves the desired FIFO processing.
      Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      d7790b92
  30. 22 Jan 2014, 1 commit
  31. 12 Jan 2014, 1 commit