1. 28 April 2009, 3 commits
    • block: merge blk_invoke_request_fn() into __blk_run_queue() · a538cd03
      Tejun Heo committed
      __blk_run_queue() wraps blk_invoke_request_fn() such that it
      additionally removes the plug and bails out early if the queue is
      empty.  Both extra operations have their own pending mechanisms and
      cause no correctness problems when performed superfluously.
      
      Since blk_start_queue() is the only user of blk_invoke_request_fn(),
      there isn't much reason to keep both functions around.  Merge
      blk_invoke_request_fn() into __blk_run_queue() and make
      blk_start_queue() use __blk_run_queue() instead.
      
      [ Impact: merge two subtly different internal functions ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a538cd03
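      A minimal sketch of the merged helper, assuming the queue API of that era
      (blk_remove_plug(), elv_queue_empty(), blk_queue_stopped(), q->request_fn());
      this illustrates the idea rather than the literal post-merge kernel code, and
      the *_sketch names are hypothetical:

      #include <linux/blkdev.h>

      /* Illustrative sketch only -- simplified from the description above. */
      void __blk_run_queue_sketch(struct request_queue *q)
      {
              blk_remove_plug(q);                 /* extra op folded in */

              if (unlikely(blk_queue_stopped(q)))
                      return;

              if (elv_queue_empty(q))             /* bail out early if idle */
                      return;

              q->request_fn(q);                   /* what blk_invoke_request_fn() did */
      }

      /* blk_start_queue() then just clears the stopped flag and runs the queue
       * (called with the queue lock held, as before). */
      void blk_start_queue_sketch(struct request_queue *q)
      {
              queue_flag_clear(QUEUE_FLAG_STOPPED, q);
              __blk_run_queue_sketch(q);
      }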
    • block: enable by default support for large devices and files on 32-bit archs · db29a6b4
      Bartlomiej Zolnierkiewicz committed
      Enable by default support for large devices and files (CONFIG_LBD):
      
      - With 1TB disks being commodity hardware, it is quite easy to hit the
        2TB limitation while building RAIDs etc., and many distros have been
        using CONFIG_LBD=y by default already (at least Fedora 10 and
        openSUSE 11.1).
      
      - This should also prevent a subtle ext4 filesystem compatibility issue:
        mke2fs.ext4 defaults to creating filesystems with the huge_files
        feature enabled, and such filesystems cannot later be mounted
        read-write on machines with CONFIG_LBD=n (it is quite easy to hit
        this issue when trying to use a filesystem created with a distro
        kernel on a system running a self-built kernel, think of USB disk
        enclosures & co.).
      
      While at it:
      
      - Clarify the config option help text w.r.t. mounting ext4 filesystems
        (they can be mounted with CONFIG_LBD=n, but only read-only).
      
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      db29a6b4
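      For context, a rough sketch of why CONFIG_LBD matters on 32-bit: in kernels
      of that era the width of sector_t (and blkcnt_t) depended on it, roughly as
      below (written from memory, not a verbatim copy of include/linux/types.h):

      /* Approximate shape of the type definitions gated by CONFIG_LBD. */
      #ifdef CONFIG_LBD
      typedef u64 sector_t;              /* 64-bit sector numbers: > 2TB devices OK */
      typedef u64 blkcnt_t;              /* 64-bit block counts: huge files OK */
      #else
      typedef unsigned long sector_t;    /* 32 bits on 32-bit archs: ~2TB ceiling */
      typedef unsigned long blkcnt_t;
      #endif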
    • block: clear req->errors on bio completion only for fs requests · 924cec77
      Tejun Heo committed
      Impact: subtle behavior change
      
      For fs requests, rq is only a carrier of bios, and the rq error status
      as a whole doesn't mean much.  This is why rq->errors is cleared on
      each partial completion of a request: at each partial completion the
      error status is transferred to the respective bios.
      
      For pc requests, rq->errors is used to carry the error status to the
      issuer, and thus __end_that_request_first() doesn't clear it in such
      cases.
      
      This distinction was fine till now, as only fs and pc requests have
      used bios and thus the bio completion path.  However, future changes
      will unify data access through bios, and all non-fs users care about
      the rq error status.  Clear rq->errors on bio completion only for fs
      requests.
      
      In general, the implicit clearing is a bit too subtle, especially as
      the meaning of rq->errors is completely dependent on the low level
      drivers.  Unifying / cleaning up rq->errors usage and letting the llds
      manage it would be better.  A TODO comment is added.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      924cec77
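      A minimal sketch of the check described above, as it might sit in the
      bio completion path (blk_fs_request() is the request-type test of that
      era; the surrounding function is elided and the snippet is illustrative,
      not the literal patch):

      /* Inside the per-bio completion loop: */
      if (blk_fs_request(rq))
              rq->errors = 0;   /* fs request: error already moved to the bios */
      /* pc requests keep rq->errors so the issuer can read the status. */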
2. 24 April 2009, 5 commits
    • cfq-iosched: cache prio_tree root in cfqq->p_root · f2d1f0ae
      Jens Axboe committed
      Currently we look up the prio_tree root from ->ioprio, but ->ioprio
      can change if either the process gets its IO priority changed
      explicitly, or if cfq decides to temporarily boost it.  So if we are
      unlucky, we can end up attempting to remove a node from a different
      rbtree root than the one it was added to.
      
      Fix this by using ->org_ioprio as the prio_tree index, since that
      only changes for explicit IO priority settings (not for a boost).
      Additionally, cache the rbtree root inside the cfqq, so we don't have
      to add code to reinsert the cfqq in the prio_tree if the IO priority
      changes.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      f2d1f0ae
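      A hedged sketch of the two ideas (index by ->org_ioprio, cache the root in
      the cfqq); field and helper names follow cfq-iosched.c of that time, but the
      code below is illustrative, not the actual patch:

      /* On insert: pick the root from the stable org_ioprio and remember it. */
      static void cfq_prio_tree_add_sketch(struct cfq_data *cfqd,
                                           struct cfq_queue *cfqq)
      {
              struct rb_root *root = &cfqd->prio_trees[cfqq->org_ioprio];

              /* ... rb_link_node() / rb_insert_color() of cfqq->p_node ... */
              cfqq->p_root = root;                /* cache for later removal */
      }

      /* On removal: erase from the cached root, not one derived from ->ioprio. */
      static void cfq_prio_tree_remove_sketch(struct cfq_queue *cfqq)
      {
              if (cfqq->p_root) {
                      rb_erase(&cfqq->p_node, cfqq->p_root);
                      cfqq->p_root = NULL;
              }
      }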
    • cfq-iosched: fix bug with aliased request and cooperation detection · 3ac6c9f8
      Jens Axboe committed
      cfq_prio_tree_lookup() should return the direct match, yet it always
      returns zero.  Fix that.
      
      cfq_prio_tree_add() assumes that we don't get a direct match, while it
      is very possible that we do.  Using O_DIRECT, you can have different
      cfqqs with matching requests, since you don't have the page cache to
      serialize things for you.  Fix this bug by only adding the cfqq if
      there isn't an existing match.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      3ac6c9f8
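      A sketch of the fixed add path: the lookup now returns the direct match and
      the add bails out instead of inserting an alias (root, parent, p and __cfqq
      are locals of cfq_prio_tree_add(); the lookup signature is approximate):

      /* Sketch of the relevant part of cfq_prio_tree_add() after the fix: */
      __cfqq = cfq_prio_tree_lookup(cfqd, root, cfqq->next_rq->sector,
                                    &parent, &p);
      if (__cfqq)
              return;          /* direct (aliased) match already in the tree */

      rb_link_node(&cfqq->p_node, parent, p);
      rb_insert_color(&cfqq->p_node, root);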
    • cfq-iosched: clear ->prio_trees[] on cfqd alloc · 26a2ac00
      Jens Axboe committed
      Not strictly needed, but we should make it clear that we init the
      rbtree roots here.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      26a2ac00
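      The corresponding initialization is tiny; a sketch of what it might look
      like in cfq_init_queue() (CFQ_PRIO_LISTS and RB_ROOT are the existing kernel
      names; illustrative only):

      /* Explicitly reset every per-priority prio_tree root at alloc time. */
      for (i = 0; i < CFQ_PRIO_LISTS; i++)
              cfqd->prio_trees[i] = RB_ROOT;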
    • block: fix intermittent dm timeout based oops · 17d5c8ca
      Hannes Reinecke committed
      Very rarely under stress testing of dm, oopses are occurring as
      something tampers with an old stack frame.  This has been traced back
      to blk_abort_queue() leaving a timeout_list pointing to the stack.
      The reason is that sometimes blk_abort_request() won't delete the
      timer (a small race window where the request is marked complete before
      the timer has been removed).  Fix this by splicing the usually empty
      local list back onto q->timeout_list.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      17d5c8ca
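      A simplified sketch of the flow in blk_abort_queue() with the fix applied:
      entries are spliced to a local list for iteration, and whatever remains is
      spliced back so no request is left pointing at the on-stack list head
      (illustrative, not the exact function body):

      void blk_abort_queue_sketch(struct request_queue *q)
      {
              unsigned long flags;
              struct request *rq, *tmp;
              LIST_HEAD(list);                     /* on-stack list head */

              spin_lock_irqsave(q->queue_lock, flags);

              elv_abort_queue(q);
              list_splice_init(&q->timeout_list, &list);

              list_for_each_entry_safe(rq, tmp, &list, timeout_list)
                      blk_abort_request(rq);       /* may race and leave an entry */

              /* The fix: put any leftovers back so they don't reference the stack. */
              list_splice(&list, &q->timeout_list);

              spin_unlock_irqrestore(q->queue_lock, flags);
      }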
    • block: simplify I/O stat accounting · 42dad764
      Jerome Marchand committed
      This simplifies the I/O stat accounting switching code and separates
      it completely from the I/O scheduler switch code.
      
      Requests are accounted according to the state of their request queue
      at the time of request allocation.  There is no longer any need to
      flush the request queue when switching the I/O accounting state.
      Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      42dad764
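      A sketch of the scheme: the queue's accounting state is copied into the
      request's flags at allocation, and accounting then only consults the request
      (REQ_IO_STAT and blk_queue_io_stat() are the kernel names of that era; the
      *_sketch helper name is hypothetical):

      /* At request allocation time, snapshot the queue's accounting state: */
      if (blk_queue_io_stat(q))
              rq->cmd_flags |= REQ_IO_STAT;

      /* Later checks test the request, not the queue, so flipping the iostats
       * sysfs knob never requires flushing in-flight requests: */
      static inline int blk_do_io_stat_sketch(struct request *rq)
      {
              return rq->cmd_flags & REQ_IO_STAT;
      }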
3. 22 April 2009, 6 commits
4. 15 April 2009, 11 commits
5. 07 April 2009, 7 commits
6. 06 April 2009, 3 commits
7. 03 April 2009, 1 commit
    • blktrace: fix pdu_len when tracing packet command requests · e2494e1b
      Li Zefan committed
      Impact: output all of the packet command, not just the first 4 / 8 bytes
      
      Since commit d7e3c324 ("block: add large command support"), struct
      request->cmd has been changed from unsigned char cmd[BLK_MAX_CDB] to
      unsigned char *cmd.
      
      v1 -> v2, by FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>:
      
      - make sure rq->cmd_len is always initialized, and then we can use
        rq->cmd_len instead of BLK_MAX_CDB.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      LKML-Reference: <49D4507E.2060602@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e2494e1b
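      A before/after sketch of the pdu length passed to the trace for packet
      commands (argument lists are abbreviated to the last two, pdu_len and
      pdu_data, which are the ones that matter here):

      /* Before: sizeof(rq->cmd) was BLK_MAX_CDB with the old fixed array,
       * but shrank to sizeof(void *) once ->cmd became a pointer:
       *
       *     __blk_add_trace(..., sizeof(rq->cmd), rq->cmd);
       *
       * After (sketch): use the always-initialized rq->cmd_len instead:
       *
       *     __blk_add_trace(..., rq->cmd_len, rq->cmd);
       */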
8. 26 March 2009, 2 commits
9. 24 March 2009, 2 commits
    • bsg: add support for tail queuing · 05378940
      Boaz Harrosh committed
      Currently, as inherited from sg.c, bsg submits asynchronous requests
      at the head of the queue (using "at_head" set in the call to
      blk_execute_rq_nowait()).  This is bad when the queues are full:
      requests will execute out of order and can starve the first submitted
      requests.
      
      A bit in the sg_io_v4->flags member is allocated to denote Q_AT_TAIL.
      Zero means queue at_head as before, to stay compatible with old code
      on the write/read path.  The SG_IO code path behavior was changed to
      match the write/read behavior.  SG_IO was very rarely used, so
      breaking compatibility with it is OK at this stage.
      
      sg_io_hdr in sg.h also has a flags member and uses 3 bits from the
      first nibble and one bit from the last nibble.  Even though none of
      these bits are supported by bsg, the second nibble is reserved for use
      by bsg, just in case.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      CC: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      05378940
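      A sketch of the submission-side check, assuming the new bit is named
      BSG_FLAG_Q_AT_TAIL in sg_io_v4->flags (the exact macro name is an
      assumption based on the commit text; the surrounding bsg.c code is
      elided and the snippet is illustrative):

      int at_head = 1;                          /* default: head of queue, as before */

      if (hdr->flags & BSG_FLAG_Q_AT_TAIL)      /* userspace asked for tail queuing */
              at_head = 0;

      blk_execute_rq_nowait(q, NULL, rq, at_head, bsg_rq_end_io);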
    • block: get rid of unused blkdev_free_rq() define · 50e17493
      Jens Axboe committed
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      50e17493