1. 09 Jun, 2017 2 commits
    • blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653
      Committed by Christoph Hellwig
      Use the same values for request completion errors as for the return
      value from ->queue_rq.  BLK_STS_RESOURCE is special-cased to cause
      a requeue, and all the others are completed as-is.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
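
The dispatch rule described in this commit can be sketched in userspace C. This is a hedged illustration, not the blk-mq code: every `sk_` name is hypothetical, and the sketch assumes that a successful return leaves the request in flight for the driver to complete later, while only error statuses are completed immediately with the returned value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for blk_status_t values; not the kernel's definitions. */
typedef enum { SK_STS_OK = 0, SK_STS_IOERR, SK_STS_RESOURCE } sk_status;

enum sk_state { SK_NEW, SK_IN_FLIGHT, SK_REQUEUED, SK_COMPLETED };

struct sk_request {
    enum sk_state state;
    sk_status result;           /* meaningful once state == SK_COMPLETED */
};

typedef sk_status (*sk_queue_rq_fn)(struct sk_request *rq);

/* Hand one request to the driver and act on the status it returns. */
static void sk_dispatch(struct sk_request *rq, sk_queue_rq_fn queue_rq)
{
    assert(rq != NULL && queue_rq != NULL);

    sk_status ret = queue_rq(rq);
    switch (ret) {
    case SK_STS_OK:
        rq->state = SK_IN_FLIGHT;   /* assumption: driver completes it later */
        break;
    case SK_STS_RESOURCE:
        rq->state = SK_REQUEUED;    /* special case: try again later */
        break;
    default:
        rq->state = SK_COMPLETED;   /* error status is passed through as-is */
        rq->result = ret;
        break;
    }
}

/* Three toy drivers, one per outcome. */
static sk_status sk_driver_ok(struct sk_request *rq)   { (void)rq; return SK_STS_OK; }
static sk_status sk_driver_busy(struct sk_request *rq) { (void)rq; return SK_STS_RESOURCE; }
static sk_status sk_driver_fail(struct sk_request *rq) { (void)rq; return SK_STS_IOERR; }
```

Using one status type for both the return value and the completion status means the error branch needs no translation step.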
    • block: introduce new block status code type · 2a842aca
      Committed by Christoph Hellwig
      Currently we use normal Linux errno values in the block layer, and while
      we accept any error, a few have overloaded magic meanings.  This patch
      instead introduces a new blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have an
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return something that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.

      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low-hanging
      fruit to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
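
A minimal sketch of the conversion helpers mentioned above, written as userspace C. The `demo_` names, the chosen set of codes, and the errno pairings are assumptions made for illustration; the real definitions live in the kernel sources. A plain enum also does not give the sparse `__bitwise` checking the commit refers to.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical block-layer status codes (illustrative subset). */
typedef enum {
    DEMO_STS_OK = 0,
    DEMO_STS_NOTSUPP,
    DEMO_STS_TIMEOUT,
    DEMO_STS_NOSPC,
    DEMO_STS_IOERR
} demo_status;

/* Assumed pairing between status codes and errno values. */
static const struct { demo_status sts; int err; } demo_map[] = {
    { DEMO_STS_OK,      0 },
    { DEMO_STS_NOTSUPP, -EOPNOTSUPP },
    { DEMO_STS_TIMEOUT, -ETIMEDOUT },
    { DEMO_STS_NOSPC,   -ENOSPC },
    { DEMO_STS_IOERR,   -EIO },
};
#define DEMO_MAP_LEN (sizeof(demo_map) / sizeof(demo_map[0]))

/* Status code -> errno, e.g. for reporting to userspace. */
static int demo_status_to_errno(demo_status sts)
{
    for (size_t i = 0; i < DEMO_MAP_LEN; i++)
        if (demo_map[i].sts == sts)
            return demo_map[i].err;
    return -EIO;
}

/* errno -> status code; errnos without a pairing collapse to a generic I/O error. */
static demo_status demo_errno_to_status(int err)
{
    for (size_t i = 0; i < DEMO_MAP_LEN; i++)
        if (demo_map[i].err == err)
            return demo_map[i].sts;
    return DEMO_STS_IOERR;
}
```

The fallback in `demo_errno_to_status` shows the lossy direction: an errno coming from, say, a network driver has no block-layer meaning and ends up as a generic I/O error.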
  2. 21 Apr, 2017 2 commits
  3. 20 Apr, 2017 1 commit
  4. 31 Mar, 2017 2 commits
  5. 01 Feb, 2017 1 commit
    • block: fold cmd_type into the REQ_OP_ space · aebf526b
      Committed by Christoph Hellwig
      Instead of keeping two levels of indirection for request types, fold it
      all into the operations.  The little caveat here is that previously
      cmd_type only applied to struct request, while the request and bio op
      fields were set to plain REQ_OP_READ/WRITE even for passthrough
      operations.
      
      Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
      private requests, although it has to add two for each so that we
      can communicate the data in/out nature of the request.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
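
One way to read "two for each" is that the operation code itself carries the data direction. The sketch below assumes an encoding in which the lowest bit marks data-out operations; the `OPX_` names and numeric values are illustrative and should not be taken as the kernel's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical operation codes: odd values move data to the device. */
enum opx {
    OPX_READ     = 0,
    OPX_WRITE    = 1,
    OPX_SCSI_IN  = 32,  /* SCSI passthrough, data in  */
    OPX_SCSI_OUT = 33,  /* SCSI passthrough, data out */
    OPX_DRV_IN   = 34,  /* driver private, data in    */
    OPX_DRV_OUT  = 35   /* driver private, data out   */
};

/* Direction is recoverable from the op alone, with no separate cmd_type. */
static bool opx_is_data_out(enum opx op)
{
    return (op & 1) != 0;
}

/* Passthrough ops occupy their own range in this sketch. */
static bool opx_is_passthrough(enum opx op)
{
    return op >= OPX_SCSI_IN;
}
```

With a single field, code that only cares about direction can treat a passthrough write like any other write.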
  6. 31 Jan, 2017 2 commits
  7. 26 Dec, 2016 1 commit
    • ktime: Cleanup ktime_set() usage · 8b0e1953
      Committed by Thomas Gleixner
      ktime_set(S,N) was required for the timespec storage type and is still
      useful for situations where a Seconds and Nanoseconds part of a time value
      needs to be converted. For anything where the Seconds argument is 0, this
      is pointless and can be replaced with a simple assignment.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
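
Why the call is pointless when the seconds argument is 0 can be seen from a simplified model. The sketch assumes the time type is a plain signed 64-bit nanosecond count and leaves out any overflow handling; `kt_set` and `kt_time` are hypothetical names, not the kernel API.

```c
#include <stdint.h>

typedef int64_t kt_time;                 /* assumed: scalar nanoseconds */
#define KT_NSEC_PER_SEC 1000000000LL

/* Combine a seconds part and a nanoseconds part into one value. */
static kt_time kt_set(int64_t secs, unsigned long nsecs)
{
    return secs * KT_NSEC_PER_SEC + (int64_t)nsecs;
}
```

Under that assumption `kt_set(0, n)` is just `n`, so a direct assignment expresses the same thing with less noise.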
  8. 16 Nov, 2016 1 commit
  9. 21 Sep, 2016 2 commits
    • lightnvm: control life of nvm_dev in driver · b0b4e09c
      Committed by Matias Bjørling
      LightNVM compatible device drivers do not have a method to expose
      LightNVM specific sysfs entries.
      
      To enable LightNVM sysfs entries to be exposed, lightnvm device
      drivers require a struct device to attach them to. To allow both the
      actual device driver and lightnvm sysfs entries to coexist, the device
      driver tracks the lifetime of the nvm_dev structure.
      
      This patch refactors NVMe and null_blk to handle the lifetime of struct
      nvm_dev, which eliminates the need for struct gendisk when a lightnvm
      compatible device is provided.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • null_blk: refactor to support non-gendisk devices · 9ae2d0aa
      Committed by Matias Bjørling
      With LightNVM enabled devices, the gendisk structure is not exposed
      to the user. This hides the device driver specific sysfs entries, and
      prevents binding of LightNVM geometry information to the device.
      
      Refactor the device registration process, so that gendisk and
      non-gendisk devices are easily managed.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  10. 15 Sep, 2016 1 commit
  11. 21 Jul, 2016 1 commit
  12. 19 Mar, 2016 1 commit
  13. 11 Feb, 2016 1 commit
    • null_blk: oops when initializing without lightnvm · a514379b
      Committed by Matias Bjørling
      If the LightNVM subsystem is not compiled into the kernel and the
      null_blk device driver requests lightnvm to be initialized, the call to
      nvm_register fails and the null_add_dev function cleans up the
      initialization. However, at this point the null block device has
      already been added to the nullb_list, and thus a second cleanup will
      occur when the function has returned, which leads to a double call to
      blk_cleanup_queue.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  14. 05 Feb, 2016 1 commit
    • lightnvm: allow to force mm initialization · bf643185
      Committed by Matias Bjørling
      System block allows the device to initialize with its configured media
      manager. The system block is written to disk, and read again when the media
      manager is determined. For this to work, the backend must store the
      data. Device drivers such as null_blk do not have any backend
      storage. This patch allows the media manager to be initialized without a
      storage backend.

      It also fixes the incorrect configuration of capabilities in null_blk, as it
      does not support the get/set bad block interface.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  15. 14 Jan, 2016 1 commit
    • null_blk: use sector_div instead of do_div · e93d12ae
      Committed by Arnd Bergmann
      Dividing a sector_t number should be done using sector_div rather than do_div
      to optimize the 32-bit sector_t case, and with the latest do_div optimizations,
      we now get a compile-time warning for this:
      
      arch/arm/include/asm/div64.h:32:95: note: expected 'uint64_t * {aka long long unsigned int *}' but argument is of type 'sector_t * {aka long unsigned int *}'
      drivers/block/null_blk.c:521:81: warning: comparison of distinct pointer types lacks a cast
      
      This changes the newly added code to use sector_div. It is a simplified version
      of the original patch, as Linus Torvalds pointed out that we should not be using
      an expensive division function in the first place.
      
      This version was suggested by Matias Bjorling.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Matias Bjorling <m@bjorling.me>
      Fixes: b2b7e001 ("null_blk: register as a LightNVM device")
      Signed-off-by: Jens Axboe <axboe@fb.com>
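
For readers unfamiliar with these helpers, the calling convention they share is "divide in place, return the remainder". The sketch below models only that convention in userspace C; `sd_sector` and `sd_sector_div` are hypothetical names, the sector type is assumed to be 64 bits wide, and the kernel helper is a macro that takes the variable itself where this function takes a pointer.

```c
#include <stdint.h>

typedef uint64_t sd_sector;  /* assumed width; the kernel's sector_t can be narrower */

/* Divide *n by base in place and return the remainder. */
static uint32_t sd_sector_div(sd_sector *n, uint32_t base)
{
    uint32_t rem = (uint32_t)(*n % base);

    *n /= base;
    return rem;
}
```

The warning quoted in the commit message comes from passing a pointer of the narrower sector type to a helper that expects a 64-bit operand; a helper typed for sectors avoids that mismatch.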
  16. 12 Jan, 2016 1 commit
    • lightnvm: refactor end_io functions for sync · 91276162
      Committed by Matias Bjørling
      To implement sync I/O support within the LightNVM core, the end_io
      functions are refactored to take an end_io function pointer instead of
      testing for initialized media manager, followed by calling its end_io
      function.
      
      Sync I/O can then be implemented using a callback that signals I/O
      completion. This is similar to the logic found in blk_to_execute_io().
      By implementing it this way, the underlying device's I/O submission logic
      is abstracted away from core, targets, and media managers.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
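
The callback pattern described here can be sketched as follows. All `sy_` names are hypothetical, the stand-in device completes the request immediately, and a flag takes the place of the blocking wait a real implementation would need.

```c
#include <stddef.h>

struct sy_rq;
typedef void (*sy_end_io_fn)(struct sy_rq *rq, int error);

struct sy_rq {
    sy_end_io_fn end_io;   /* invoked by the device when the I/O finishes */
    void *priv;            /* context owned by whoever set end_io */
};

struct sy_wait {
    int done;
    int error;
};

/* Stand-in device: finishes the request at once with the given result. */
static void sy_submit(struct sy_rq *rq, int device_result)
{
    if (rq->end_io != NULL)
        rq->end_io(rq, device_result);
}

/* Completion callback used for synchronous I/O: record result, signal waiter. */
static void sy_sync_end_io(struct sy_rq *rq, int error)
{
    struct sy_wait *w = rq->priv;

    w->error = error;
    w->done = 1;
}

/* Submit and "wait"; returns the I/O result, or -1 if it never completed. */
static int sy_submit_sync(struct sy_rq *rq, int device_result)
{
    struct sy_wait w = { 0, 0 };

    rq->end_io = sy_sync_end_io;
    rq->priv = &w;
    sy_submit(rq, device_result);
    /* A real implementation would sleep here until w.done is set. */
    return w.done ? w.error : -1;
}
```

Because the submitter only sees a function pointer, asynchronous users can install their own callback without the submission path knowing which kind of caller it serves.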
  17. 29 Dec, 2015 1 commit
  18. 23 Dec, 2015 1 commit
  19. 09 Dec, 2015 1 commit
  20. 08 Dec, 2015 1 commit
  21. 02 Dec, 2015 4 commits
    • blk-mq: add a flags parameter to blk_mq_alloc_request · 6f3b0e8b
      Committed by Christoph Hellwig
      We already have the reserved flag, and a nowait flag awkwardly encoded as
      a gfp_t.  Add a real flags argument to make the scheme more extensible and
      allow for a nicer calling convention.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
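
A flags argument of the kind described could look like the sketch below. The `FL_` names, the two-pool model, and the result codes are assumptions for illustration only; the point is that "reserved" and "nowait" become independent bits in one parameter.

```c
/* Hypothetical allocation flags. */
#define FL_REQ_NOWAIT   (1u << 0)   /* fail rather than sleep when empty */
#define FL_REQ_RESERVED (1u << 1)   /* allocate from the reserved pool  */

enum fl_result { FL_OK, FL_WOULD_BLOCK, FL_MUST_WAIT };

struct fl_tags {
    int free_normal;
    int free_reserved;
};

/* Take one tag from the pool selected by flags. */
static enum fl_result fl_alloc_request(struct fl_tags *t, unsigned int flags)
{
    int *pool = (flags & FL_REQ_RESERVED) ? &t->free_reserved
                                          : &t->free_normal;

    if (*pool > 0) {
        (*pool)--;
        return FL_OK;
    }
    /* Empty pool: the caller's flags decide between failing and waiting. */
    return (flags & FL_REQ_NOWAIT) ? FL_WOULD_BLOCK : FL_MUST_WAIT;
}
```

New behaviours can later be added as further bits without changing the function signature again.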
    • null_blk: change type of completion_nsec to unsigned long · dbac1175
      Committed by Arianna Avanzini
      This commit at least doubles the maximum value for
      completion_nsec. This helps in special cases where one wants/needs to
      emulate an extremely slow I/O (for example to spot bugs).
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Signed-off-by: Arianna Avanzini <avanzini@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • null_blk: guarantee device restart in all irq modes · cf8ecc5a
      Committed by Arianna Avanzini
      In single-queue (block layer) mode, the function null_rq_prep_fn stops
      the device if alloc_cmd fails. Then, once stopped, the device must be
      restarted on the next command completion, so that the request(s) for
      which alloc_cmd failed can be requeued. Otherwise the device hangs.
      
      Unfortunately, device restart is currently performed only for delayed
      completions, i.e., in irqmode==2. This fact causes hangs, for the
      above reasons, with the other irqmodes in combination with single-queue
      block layer.
      
      This commit addresses this issue by making sure that, if stopped, the
      device is properly restarted for all irqmodes on completions.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Signed-off-by: Arianna Avanzini <avanzini@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • null_blk: set a separate timer for each command · 3c395a96
      Committed by Paolo Valente
      For the Timer IRQ mode (i.e., when command completions are delayed),
      there is one timer for each CPU. Each of these timers
      . has a completion queue associated with it, containing all the
        command completions to be executed when the timer fires;
      . is set, and a new completion-to-execute is inserted into its
        completion queue, every time the dispatch code for a new command
        happens to be executed on the CPU related to the timer.
      
      This implies that, if the dispatch of a new command happens to be
      executed on a CPU whose timer has already been set, but has not yet
      fired, then the timer is set again, to the completion time of the
      newly arrived command. When the timer eventually fires, all its queued
      completions are executed.
      
      This way of handling delayed command completions entails the following
      problem: if more than one command completion is inserted into the
      queue of a timer before the timer fires, then the expiration time for
      the timer is moved forward every time each of these completions is
      enqueued. As a consequence, only the last completion enqueued enjoys a
      correct execution time, while all previous completions are unjustly
      delayed until the last completion is executed (and at that time they
      are executed all together).
      
      Specifically, if all the above completions are enqueued almost at the
      same time, then the problem is negligible. On the opposite end, if
      every completion is enqueued a while after the previous completion was
      enqueued (in the extreme case, it is enqueued only right before the
      timer would have expired), then every enqueued completion, except for
      the last one, experiences an inflated delay, proportional to the number
      of completions enqueued after it. In the end, commands, and thus I/O
      requests, may be completed at an arbitrarily lower rate than the
      desired one.
      
      This commit addresses this issue by replacing per-CPU timers with
      per-command timers, i.e., by associating an individual timer with each
      command.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Signed-off-by: Arianna Avanzini <avanzini@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
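
The difference between the two schemes can be illustrated with a small simulation. This is a userspace sketch rather than the null_blk code: the `tm_` function names are made up, arrival times are assumed to be sorted in ascending order, and timers are modelled as plain integers on a single CPU.

```c
#include <stddef.h>

/*
 * Shared timer: every arriving command re-arms the one timer to
 * arrival + delay, and all queued completions run when it finally fires.
 */
static void tm_shared_timer(const long *arrive, size_t n, long delay, long *done)
{
    long expires = 0;
    size_t first = 0;   /* oldest command still waiting for the timer */

    for (size_t i = 0; i < n; i++) {
        if (i > first && arrive[i] >= expires) {
            /* The timer fired before this arrival: flush the queue. */
            for (size_t j = first; j < i; j++)
                done[j] = expires;
            first = i;
        }
        expires = arrive[i] + delay;   /* re-arm, pushing expiry forward */
    }
    for (size_t j = first; j < n; j++)
        done[j] = expires;
}

/* Per-command timers: each command completes exactly delay after arrival. */
static void tm_per_command_timer(const long *arrive, size_t n, long delay, long *done)
{
    for (size_t i = 0; i < n; i++)
        done[i] = arrive[i] + delay;
}
```

With arrivals at times 0, 80 and 160 and a delay of 100, the shared timer completes all three commands at time 260, so the first one waits 260 instead of 100. The per-command variant completes them at 100, 180 and 260.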
  22. 20 Nov, 2015 3 commits
  23. 17 Nov, 2015 1 commit
    • null_blk: register as a LightNVM device · b2b7e001
      Committed by Matias Bjørling
      Add support for registering as a LightNVM device. This allows us to
      evaluate the performance of the LightNVM subsystem.
      
      In /drivers/Makefile, LightNVM is moved above block device drivers
      to make sure that the LightNVM media managers have been initialized
      before drivers under /drivers/block are initialized.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Fix by Jens Axboe to remove unneeded slab cache and the following
      memory leak.
      Signed-off-by: Jens Axboe <axboe@fb.com>
  24. 08 Nov, 2015 1 commit
  25. 01 Oct, 2015 1 commit
  26. 03 Sep, 2015 2 commits
  27. 29 Jul, 2015 1 commit
    • block: add a bi_error field to struct bio · 4246a0b6
      Committed by Christoph Hellwig
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and of not being passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
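
Storing the error in the bio itself is what lets it survive queuing and travel up a chain. The sketch below is a simplified model under stated assumptions: `cb_bio` and `cb_bio_endio` are hypothetical, each parent has exactly one child, and the first error recorded on a parent is kept.

```c
#include <errno.h>
#include <stddef.h>

struct cb_bio {
    int bi_error;            /* 0 on success, negative errno on failure */
    int done;                /* set once the bio has completed */
    struct cb_bio *parent;   /* next bio up the chain, or NULL */
};

/* Complete a bio and propagate its error towards the top of the chain. */
static void cb_bio_endio(struct cb_bio *bio)
{
    while (bio != NULL) {
        struct cb_bio *parent = bio->parent;

        /* Keep the first error the parent has seen. */
        if (parent != NULL && bio->bi_error != 0 && parent->bi_error == 0)
            parent->bi_error = bio->bi_error;
        bio->done = 1;
        bio = parent;
    }
}
```

Here a child that fails with `-ENOSPC` leaves the same value in its parent, which a single "not up to date" flag could not express.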
  28. 23 Jul, 2015 1 commit
  29. 02 Jun, 2015 1 commit
    • null_blk: restart request processing on completion handler · 8b70f45e
      Committed by Akinobu Mita
      When irqmode=2 (IRQ completion handler is timer) and queue_mode=1
      (Block interface to use is rq), the completion handler should restart
      request handling for any pending requests on a queue, because request
      processing stops when more commands are queued than
      hw_queue_depth allows (null_rq_prep_fn returns BLKPREP_DEFER).
      
      Without this change, the following command cannot finish.
      
      	# modprobe null_blk irqmode=2 queue_mode=1 hw_queue_depth=1
      	# fio --name=t --rw=read --size=1g --direct=1 \
      	  --ioengine=libaio --iodepth=64 --filename=/dev/nullb0
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
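
The hang and its fix can be modelled with a toy queue. This sketch is not the driver code: the `rs_` names are hypothetical, requests are reduced to counters, and the `restart` argument stands for whether the completion handler restarts a stopped queue.

```c
#include <stdbool.h>

struct rs_queue {
    int depth;       /* maximum commands in flight (hw_queue_depth) */
    int in_flight;   /* commands handed to the device */
    int pending;     /* requests still waiting to be dispatched */
    bool stopped;    /* set when dispatch was deferred */
};

/* Dispatch pending requests until the depth limit defers one. */
static void rs_run_queue(struct rs_queue *q)
{
    while (q->pending > 0) {
        if (q->in_flight >= q->depth) {
            q->stopped = true;   /* deferred: nothing else runs the queue */
            return;
        }
        q->pending--;
        q->in_flight++;
    }
}

/* Complete one command; optionally restart a stopped queue. */
static void rs_complete_one(struct rs_queue *q, bool restart)
{
    q->in_flight--;
    if (restart && q->stopped) {
        q->stopped = false;
        rs_run_queue(q);
    }
}
```

With a depth of 1 and three pending requests, completing without a restart leaves two requests stranded on a stopped queue, which matches the fio run above never finishing. Restarting on each completion drains all three.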