1. 11 May 2009, 2 commits
    • block: implement and enforce request peek/start/fetch · 9934c8c0
      Tejun Heo authored
      Until now the block layer has allowed two separate modes of request
      execution.  A request is always acquired from the request queue via
      elv_next_request(); after that, drivers are free either to dequeue it
      or to process it without dequeueing.  Dequeueing allows
      elv_next_request() to return the next request, so that multiple
      requests can be in flight.
      
      Executing requests without dequeueing has its merits, mainly that it
      allows drivers for simpler devices which can't do scatter-gather to
      deal with segments only, without considering request boundaries.
      However, the benefit this brings is dubious and declining, while the
      cost of the API ambiguity keeps increasing.  Segment-based drivers
      are usually for very old or limited devices, and since converting
      them to the dequeueing model isn't difficult, the mode doesn't
      justify the API overhead it puts on the block layer and its more
      modern users.
      
      Previous patches converted all block low-level drivers to the
      dequeueing model.  This patch completes the API transition by...
      
      * renaming elv_next_request() to blk_peek_request()
      
      * renaming blkdev_dequeue_request() to blk_start_request()
      
      * adding blk_fetch_request() which is a combination of peek and start
      
      * disallowing completion of queued (not started) requests
      
      * applying new API to all LLDs
      
      The renames are for consistency and to break out-of-tree code, so
      that it's apparent that out-of-tree drivers need updating; a sketch
      of the resulting driver issue loop is included below.
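      
      For illustration only, here is a minimal sketch of a driver issue
      loop under the new API.  blk_peek_request(), blk_start_request() and
      blk_fetch_request() are the functions introduced above; the mydrv_*
      helpers are hypothetical.
      
          static void mydrv_request_fn(struct request_queue *q)
          {
                  struct request *rq;
      
                  /* Peek at the head request without dequeueing it. */
                  while ((rq = blk_peek_request(q)) != NULL) {
                          if (!mydrv_can_issue(rq))
                                  break;  /* leave it queued, retry later */
      
                          /* Dequeue and mark the request started; only
                           * started requests may be completed. */
                          blk_start_request(rq);
                          mydrv_issue(rq);
                  }
          }
      
      When no peek-time check is needed, blk_fetch_request() combines the
      two steps into a single call.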
      
      [ Impact: block request issue API cleanup, no functional change ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: unsik Kim <donari75@gmail.com>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: drop request->hard_* and *nr_sectors · 2e46e8b2
      Tejun Heo authored
      struct request has had a few different ways to represent some
      properties of a request.  The ->hard_* fields represent the block
      layer's view of request progress (the completion cursor), while the
      ones without the prefix are supposed to represent the issue cursor
      and may be updated as necessary by the low-level drivers.  The
      trouble is that, since the block layer supports partial completion,
      the two cursors really aren't necessary and only cause confusion.
      In addition, manual management of request details from the low-level
      drivers is cumbersome and error-prone at the very least.
      
      Another interesting set of duplicate fields is rq->[hard_]nr_sectors
      and rq->{hard_cur|current}_nr_sectors versus rq->data_len and
      rq->bio->bi_size.  This is even more convoluted than the hard_ case.
      
      rq->[hard_]nr_sectors are initialized for requests with a bio, but
      blk_rq_bytes() uses them only for !pc requests.  rq->data_len is
      initialized for all requests, but blk_rq_bytes() uses it only for pc
      requests.  This causes a good amount of confusion throughout the
      block layer and its drivers, and determining the request length has
      been a bit of black magic which may or may not work depending on
      circumstances and on what the specific LLD is actually doing.
      
      rq->{hard_cur|current}_nr_sectors represent the number of sectors in
      the contiguous data area at the front.  This is mainly used by
      drivers which transfer data by walking the request segment by
      segment.  The value always equals rq->bio->bi_size >> 9.  However,
      the data length for pc requests may not be a multiple of 512 bytes,
      and using this field becomes a bit confusing.
      
      In general, having multiple fields to represent the same property
      leads only to confusion and subtle bugs.  With recent block low-level
      driver cleanups, no driver is accessing or manipulating these
      duplicate fields directly.  Drop all the duplicates.  Now rq->sector
      means the current sector, rq->data_len the current total length and
      rq->bio->bi_size the current segment length.  Everything else is
      defined in terms of these three and available only through
      accessors, as sketched below.
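      
      As a sketch of that accessor layer (the helper names are the ones
      this patch discusses; the bodies are the obvious definitions implied
      by the description above and may differ from the in-tree ones in
      detail):
      
          static inline sector_t blk_rq_pos(const struct request *rq)
          {
                  return rq->sector;                      /* current sector */
          }
      
          static inline unsigned int blk_rq_bytes(const struct request *rq)
          {
                  return rq->data_len;                    /* total length */
          }
      
          static inline unsigned int blk_rq_cur_bytes(const struct request *rq)
          {
                  return rq->bio ? rq->bio->bi_size : 0;  /* segment length */
          }
      
          static inline unsigned int blk_rq_sectors(const struct request *rq)
          {
                  return blk_rq_bytes(rq) >> 9;
          }
      
          static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
          {
                  return blk_rq_cur_bytes(rq) >> 9;
          }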
      
      * blk_recalc_rq_sectors() is collapsed into blk_update_request() and
        now handles pc and fs requests equally, apart from the rq->sector
        update.  This means that pc requests can now use partial completion
        too (no in-kernel user yet, though).
      
      * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
        now uses byte count as the primary data length.
      
      * blk_rq_pos() is now guaranteed to be always correct.  In-block
        users converted.
      
      * blk_rq_bytes() is now guaranteed to be always valid as is
        blk_rq_sectors().  In-block users converted.
      
      * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
        Whichever is more convenient is used.
      
      * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
        pointer to request.
      
      [ Impact: API cleanup, single way to represent one property of a request ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  2. 28 Apr 2009, 3 commits
  3. 24 Apr 2009, 1 commit
  4. 15 Apr 2009, 1 commit
  5. 07 Apr 2009, 2 commits
  6. 13 Mar 2009, 1 commit
  7. 02 Feb 2009, 1 commit
  8. 26 Dec 2008, 1 commit
  9. 17 Oct 2008, 1 commit
  10. 09 Oct 2008, 3 commits
    • block: add fault injection mechanism for faking request timeouts · 581d4e28
      Jens Axboe authored
      Only works for the generic request timer handling. Allows one to
      sporadically ignore request completions, thus exercising the timeout
      handling.
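      
      As a rough sketch of how such a mechanism can be wired in (assuming
      a fault-injection predicate along the lines of the
      blk_should_fake_timeout() helper; the exact plumbing is
      illustrative), the completion path can simply drop completions the
      injector selects, so that the request timer fires instead:
      
          void blk_complete_request(struct request *req)
          {
                  /* Sporadically pretend the completion never happened;
                   * the per-queue request timer will expire and the
                   * timeout path will run instead. */
                  if (unlikely(blk_should_fake_timeout(req->q)))
                          return;
                  __blk_complete_request(req);
          }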
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: unify request timeout handling · 242f9dcb
      Jens Axboe authored
      Right now SCSI and others do their own command timeout handling.
      Move those bits to the block layer.
      
      Instead of having a timer per command, we try to be a bit more clever
      and simply have one per queue.  This avoids the overhead of having to
      tear down and set up a timer for each command, so it will result in a
      lot less timer fiddling.
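      
      A minimal sketch of a driver hooking into the unified handling
      (assuming setters along the lines of blk_queue_rq_timeout() and
      blk_queue_rq_timed_out() plus the BLK_EH_* return codes; the mydrv_*
      names are hypothetical):
      
          static enum blk_eh_timer_return mydrv_timed_out(struct request *rq)
          {
                  if (mydrv_cmd_in_flight(rq))
                          return BLK_EH_RESET_TIMER;  /* re-arm, keep waiting */
                  return BLK_EH_NOT_HANDLED;          /* escalate to error handling */
          }
      
          /* at queue setup time */
          blk_queue_rq_timeout(q, 30 * HZ);           /* per-request deadline */
          blk_queue_rq_timed_out(q, mydrv_timed_out); /* called on expiry */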
      Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add support for IO CPU affinity · c7c22e4d
      Jens Axboe authored
      This patch adds support for controlling the IO completion CPU of
      either all requests on a queue, or on a per-request basis.  We export
      a sysfs variable (rq_affinity) which, if set, migrates completions
      of requests to the CPU that originally submitted them.  A bio helper
      (bio_set_completion_cpu()) is also added, so that queuers can ask
      for completion on a specific CPU.
      
      In testing, this has been shown to cut the system time by as much
      as 20-40% on synthetic workloads where CPU affinity is desired.
      
      This requires a little help from the architecture, so it'll only
      work as designed for archs that are using the new generic smp
      helper infrastructure.
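      
      For the per-request side, a sketch using the bio_set_completion_cpu()
      helper named above (the surrounding submission code is illustrative):
      
          struct bio *bio = bio_alloc(GFP_KERNEL, 1);
          /* ... set bi_bdev/bi_sector and add the data pages ... */
      
          /* Ask for this bio to be completed on the submitting CPU. */
          bio_set_completion_cpu(bio, raw_smp_processor_id());
          submit_bio(WRITE, bio);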
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  11. 03 Jul 2008, 1 commit
  12. 29 Apr 2008, 1 commit
  13. 04 Mar 2008, 1 commit
  14. 30 Jan 2008, 3 commits