1. 28 Apr, 2009 33 commits
    • block: merge blk_invoke_request_fn() into __blk_run_queue() · a538cd03
      Tejun Heo authored
      __blk_run_queue wraps blk_invoke_request_fn() such that it
      additionally removes plug and bails out early if the queue is empty.
      Both extra operations have their own pending mechanisms and don't
      cause any harm correctness-wise when they are done superfluously.
      
      The only user of blk_invoke_request_fn() being blk_start_queue(),
      there isn't much reason to keep both functions around.  Merge
      blk_invoke_request_fn() into __blk_run_queue() and make
      blk_start_queue() use __blk_run_queue() instead.
      
      [ Impact: merge two subtly different internal functions ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
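The merge described above can be pictured with a toy model. This is only a sketch in Python, not kernel code: `ToyQueue`, `run_queue` and `start_queue` are hypothetical stand-ins for the request queue, `__blk_run_queue()` and `blk_start_queue()`, and the real functions do more than is shown here.

```python
class ToyQueue:
    """Hypothetical stand-in for a block request queue."""

    def __init__(self):
        self.requests = []
        self.plugged = False
        self.request_fn_calls = 0

    def request_fn(self):
        # Stand-in for the driver's request function: consume everything.
        self.request_fn_calls += 1
        self.requests.clear()


def run_queue(q):
    """Toy version of the merged function: unplug, bail if empty, dispatch."""
    q.plugged = False          # removing the plug is harmless if not plugged
    if not q.requests:         # bailing out early is harmless if queue is empty
        return
    q.request_fn()


def start_queue(q):
    """The former sole user of the separate helper now calls run_queue()."""
    run_queue(q)
```

Calling `start_queue()` on an empty, plugged queue only clears the plug; once a request is queued, the same call dispatches it.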
    • block: implement blkdev_readpages · db2dbb12
      Jeff Moyer authored
      Doing a proper block dev ->readpages() speeds up the crazy dump(8)
      approach of using interleaved process IO.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: enable by default support for large devices and files on 32-bit archs · db29a6b4
      Bartlomiej Zolnierkiewicz authored
      Enable by default support for large devices and files (CONFIG_LBD):
      
      - With 1TB disks being a commodity hardware it is quite easy to hit 2TB
        limitation while building RAIDs etc. and many distros have been using
        CONFIG_LBD=y by default already (at least Fedora 10 and openSUSE 11.1).
      
      - This should also prevent a subtle ext4 filesystem compatibility issue:
        mke2fs.ext4 defaults to creating filesystems with huge_files feature
        enabled and such filesystems cannot be later mounted read-write on
        machines with CONFIG_LBD=n (it should be quite easy to hit this issue
        when trying to use filesystem created using distro kernel on system
        running the self-build kernel, think about USB disk enclosures & co.).
      
      While at it:
      
      - Clarify config option help text w.r.t. mounting ext4 filesystems
        (they can be mounted with CONFIG_LBD=n but in the read-only mode).
      
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
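The 2TB figure above can be checked with a little arithmetic. The sketch below assumes the limit comes from a 32-bit sector index combined with 512-byte sectors, and that enabling CONFIG_LBD widens the index to 64 bits; the function names are made up for this illustration.

```python
SECTOR_SIZE = 512   # bytes per sector (assumed)
TIB = 1 << 40       # one tebibyte in bytes
ONE_TB = 10**12     # a "1TB" commodity disk, in decimal bytes


def max_device_bytes(sector_bits, sector_size=SECTOR_SIZE):
    """Largest device size addressable with a sector index of this width."""
    return (1 << sector_bits) * sector_size


def fits(disk_bytes, disk_count, sector_bits):
    """True if a striped set of identical disks stays within the limit."""
    return disk_bytes * disk_count <= max_device_bytes(sector_bits)
```

Under these assumptions `max_device_bytes(32)` is exactly 2 TiB, so a stripe of two 1TB disks still fits but a stripe of three does not, which matches the remark that the limit is easy to hit when building RAIDs.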
    • ide-dma: don't reset request fields on dma_timeout_retry() · 586cf268
      Tejun Heo authored
      Impact: drop unnecessary code
      
      Now that everything uses bio and block operations, there is no need to
      reset request fields manually when retrying a request.  Every field is
      guaranteed to be always valid.  Drop unnecessary request field
      resetting from ide_dma_timeout_retry().
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide: drop rq->data handling from ide_map_sg() · 5ad960fe
      Tejun Heo authored
      Impact: remove code path which is no longer necessary
      
      All IDE data transfers now use rq->bio.  Simplify ide_map_sg()
      accordingly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • ide-atapi: kill unused fields and callbacks · 29d1a437
      Tejun Heo authored
      Impact: remove fields and code paths which are no longer necessary
      
      Now that ide-tape uses standard mechanisms to transfer data, special
      case handling for bh handling can be dropped from ide-atapi.  Drop the
      following:
      
      * pc->cur_pos, b_count, bh and b_data
      * drive->pc_update_buffers() and pc_io_buffers().
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-tape: simplify read/write functions · 4344d07f
      Tejun Heo authored
      Impact: cleanup
      
      idetape_chrdev_read/write() functions are unnecessarily complex when
      everything can be handled in a single loop.  Collapse
      idetape_add_chrdev_read/write_request() into the rw functions and
      simplify the implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-tape: use byte size instead of sectors on rw issue functions · 71294cf9
      Tejun Heo authored
      Impact: cleanup
      
      Byte size is what most issue functions deal with, make
      idetape_queue_rw_tail() and its wrappers take byte size instead of
      sector counts.  idetape_chrdev_read() and write() functions are
      converted to use tape->buffer_size instead of ctl from tape->cap.
      
      This cleans up code a little bit and will ease the next r/w
      reimplementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-tape: unify r/w init paths · 3596b664
      Tejun Heo authored
      Impact: cleanup
      
      Read and write init paths are almost identical.  Unify them into
      idetape_init_rw().
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-tape: kill idetape_bh · 6cf3d545
      Tejun Heo authored
      Impact: kill now unnecessary idetape_bh
      
      With everything using standard mechanisms, there is no need for
      idetape_bh anymore.  Kill it and use tape->buf, cur and valid to
      describe data buffer instead.
      
      Changes worth mentioning are...
      
      * idetape_queue_rq_tail() now always queues tape->buf and adjusts
        buffer state properly before completion.
      
      * idetape_pad_zeros() clears the buffer only once.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-tape: use standard data transfer mechanism · e998f30b
      Tejun Heo authored
      Impact: use standard way to transfer data
      
      ide-tape uses rq in an interesting way.  For r/w requests, rq->special
      is used to carry a private buffer management structure idetape_bh and
      rq->nr_sectors and current_nr_sectors are initialized to the number of
      idetape blocks which aren't necessarily 512 bytes.  Also,
      rq->current_nr_sectors is used to report back the residual count in
      units of idetape blocks.
      
      This peculiarity taxes both block layer and ide.  ide-atapi has
      different paths and hooks to accommodate it, what a rq means becomes
      quite confusing, and making changes at the block layer becomes
      difficult and error-prone.
      
      This patch makes ide-tape use bio instead.  With the previous patch,
      ide-tape currently is using a single contiguous buffer so replacing it
      isn't difficult.  Data buffer is mapped into bio using
      blk_rq_map_kern() in idetape_queue_rw_tail().  idetape_io_buffers()
      and idetape_update_buffers() are dropped and pc->bh is set to null to
      tell ide-atapi to use standard data transfer mechanism and idetape_bh
      byte counts are updated by the issuer on completion using the residual
      count.
      
      This change also nicely removes the FIXME in ide_pc_intr() where
      ide-tape rqs need to be completed using ide_rq_bytes() instead of
      blk_rq_bytes() (although this didn't really matter as the request
      didn't have bio).
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
    • ide-tape: use single continuous buffer · 7b13354e
      Tejun Heo authored
      Impact: simpler buffer allocation and handling, kills OOM, fix DMA transfers
      
      ide-tape has its own multiple buffer mechanism using struct
      idetape_bh.  It allocates buffer with decreasing order-of-two
      allocations so that it results in minimum number of segments.
      However, the implementation is quite complex and works in a way that
      no other block or ide driver works necessitating a lot of special case
      handling.
      
      The benefit this complex allocation scheme brings is questionable.
      For either PIO or DMA, the number of segments (16 maximum) doesn't
      make any noticeable difference, and the scheme doesn't negate the need
      for multiple-order allocations, which can fail under memory pressure or
      high fragmentation, although it does lower the highest order necessary
      by one when the buffer size isn't a power of two.
      
      As the first step to remove the custom buffer management, this patch
      makes ide-tape allocate a single continuous buffer.  The maximum order
      is four.  I doubt the change would cause any trouble but if it ever
      matters, it should be converted to regular sg mechanism like everyone
      else and even in that case dropping custom buffer handling and moving
      to standard mechanism first makes sense as an intermediate step.
      
      This patch makes the first bh to contain the whole buffer and drops
      multi bh handling code.  Following patches will make further changes.
      
      This patch has the side effect of killing OOM triggered by allocation
      path and fixing DMA transfers.  Previously, bug in alloc path
      triggered OOM on command issue and commands were passed to DMA engine
      without DMA-mapping all the segments.
      Signed-off-by: Tejun Heo <tj@kernel.org>
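The trade-off between the two allocation strategies can be sketched numerically. The Python below is a toy model, not the driver's allocator: it assumes the buffer size is given in whole pages and that an order-n allocation covers 2**n pages, and both function names are invented for the illustration.

```python
def single_alloc_order(pages):
    """Order of the one power-of-two allocation that covers `pages` pages."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return (pages - 1).bit_length()


def decreasing_orders(pages):
    """Orders used when covering `pages` with decreasing power-of-two chunks."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    orders = []
    order = pages.bit_length() - 1      # largest chunk that still fits
    while pages:
        if pages >= (1 << order):
            orders.append(order)
            pages -= 1 << order
        else:
            order -= 1
    return orders
```

For a 12-page buffer the chunked scheme uses orders 3 and 2 while a single allocation needs order 4, which is the "lower by one" case mentioned above. For a 16-page buffer both approaches need order 4, so the chunked scheme gains nothing.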
    • ide-atapi,tape,floppy: allow ->pc_callback() to change rq->data_len · eb6a61bb
      Tejun Heo authored
      Impact: allow residual count implementation in ->pc_callback()
      
      rq->data_len has two duties - carrying the number of input bytes on
      issue and carrying residual count back to the issuer on completion.
      ide-atapi completion callback ->pc_callback() is the right place to do
      this but currently ide-atapi depends on rq->data_len carrying the
      original request size after calling ->pc_callback() to complete the pc
      request.
      
      This patch makes ide_pc_intr(), ide_tape_issue_pc() and
      ide_floppy_issue_pc() cache length to complete before calling
      ->pc_callback() so that it can modify rq->data_len as necessary.
      
      Note: As using rq->data_len for two purposes can make cases like this
            incorrect in subtle ways, future changes will introduce separate
            field for residual count.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
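The ordering issue described above can be shown with a toy model. This is a Python sketch under stated assumptions, not the ide-atapi code: `ToyRequest`, `residual_callback` and `finish_pc` are hypothetical names, and the callback is assumed to turn `data_len` into a residual count.

```python
class ToyRequest:
    """Hypothetical request whose data_len serves two purposes."""

    def __init__(self, data_len):
        self.data_len = data_len        # bytes requested at issue time
        self.completed_bytes = None     # length the request was completed with


def residual_callback(rq, transferred):
    """Stand-in for ->pc_callback(): overwrite data_len with the residual."""
    rq.data_len -= transferred


def finish_pc(rq, transferred, callback):
    """Cache the length first, so the callback may freely change data_len."""
    length = rq.data_len                # cached before the callback runs
    callback(rq, transferred)
    rq.completed_bytes = length         # completion uses the cached value
```

With a 512-byte request of which 500 bytes were transferred, the request is still completed with the original 512 bytes while `data_len` ends up carrying the 12-byte residual.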
    • ide-tape,floppy: fix failed command completion after request sense · 08f370f0
      Tejun Heo authored
      Impact: fix infinite retry loop
      
      After a command fails, ide-tape and ide-floppy insert REQUEST_SENSE in
      front of the failed command and, according to the result, set
      pc->retries, flags and errors.  After REQUEST_SENSE is complete, the
      failed command is again at the front of the queue and if the verdict
      was to terminate the request, the issue functions try to complete it
      directly by calling drive->pc_callback() and returning ide_stopped.
      
      However, drive->pc_callback() doesn't complete a request.  It only
      prepares for completion of the request.  As a result, this creates an
      infinite loop where the failed request is retried perpetually.
      
      Fix it by actually ending the request by calling ide_complete_rq().
      Signed-off-by: Tejun Heo <tj@kernel.org>
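Why preparing a request without completing it loops forever can be seen in a small simulation. The Python below is a toy model with hypothetical names; the queue, the dictionary-based request and the step cap exist only for this illustration and do not mirror the real IDE data structures.

```python
from collections import deque


def issue_buggy(queue, rq):
    """Only prepares completion; the request stays at the head of the queue."""
    rq["prepared"] = True


def issue_fixed(queue, rq):
    """Prepares completion, then actually ends the request."""
    rq["prepared"] = True
    queue.popleft()                     # stand-in for ide_complete_rq()


def run(queue, issue, max_steps=10):
    """Keep issuing the head request; the cap stops the otherwise endless loop."""
    steps = 0
    while queue and steps < max_steps:
        issue(queue, queue[0])
        steps += 1
    return steps
```

With `issue_buggy` the loop only stops because of the step cap and the failed request is still queued, whereas with `issue_fixed` the queue drains after a single step.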
    • ide-pm: don't abuse rq->data · 765139ef
      Tejun Heo authored
      Impact: cleanup rq->data usage
      
      ide-pm uses rq->data to carry pointer to struct request_pm_state
      through request queue and rq->special is used to carry pointer to
      local struct ide_cmd, which isn't necessary.  Use rq->special for
      request_pm_state instead and use local ide_cmd in
      ide_start_power_step().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • ide-cd,atapi: use bio for internal commands · 02e7cf8f
      Tejun Heo authored
      Impact: unify request data buffer handling
      
      rq->data is used mostly to pass kernel buffer through request queue
      without using bio.  There are only a couple of places which still do
      this in kernel and converting to bio isn't difficult.
      
      This patch converts ide-cd and atapi to use bio instead of rq->data
      for request sense and internal pc commands.  With previous change to
      unify sense request handling, this is relatively easily achieved by
      adding blk_rq_map_kern() during sense_rq prep and PC issue.
      
      If blk_rq_map_kern() fails for sense, the error is deferred till sense
      issue and aborts the failed command which triggered the sense.  Note
      that this is a slim possibility as sense prep is done on each command
      issue, so for the above condition to actually trigger, all preps since
      the last sense issue till the issue of the request which would require
      a sense should fail.
      
      * do_request functions might sleep now.  This should be okay as ide
        request_fn - do_ide_request() - is invoked only from make_request
        and plug work.  Make sure this is the case by adding might_sleep()
        to do_ide_request().
      
      * Functions which access the read sense data before the sense request
        is complete now should access bio_data(sense_rq->bio) as the sense
        buffer might have been copied during blk_rq_map_kern().
      
      * ide-tape updated to map sg.
      
      * cdrom_do_block_pc() now doesn't have to deal with REQ_TYPE_ATA_PC
        special case.  Simplified.
      
      * tp_ops->output/input_data path dropped from ide_pc_intr().
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-atapi: convert ide-{floppy,tape} to using preallocated sense buffer · 06875320
      Borislav Petkov authored
      Since we're issuing REQ_TYPE_SENSE now we need to allow those types of
      rqs in the ->do_request callbacks. As a future improvement, sense_len
      assignment might be unified across all ATAPI devices. Borislav to
      check with specs and test.
      
      As a result, get rid of ide_queue_pc_head() and
      drive->request_sense_rq.
      
      tj: * Init request sense ide_atapi_pc from sense request.  In the
            longer term, it would probably be better to fold
            ide_create_request_sense_cmd() into its only current user -
            ide_floppy_get_format_progress().
      
          * ide_retry_pc() no longer takes @disk.
      
      CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-cd: convert to using generic sense request · c457ce87
      Borislav Petkov authored
      Preallocate a sense request in the ->do_request method and reinitialize
      it only on demand, in case it's been consumed in the IRQ handler path.
      The reason for this is that we don't want to be mapping rq to bio in
      the IRQ path and introduce all kinds of unnecessary hacks to the block
      layer.
      
      tj: * Both user and kernel PC requests expect sense data to be stored
            in separate storage other than drive->sense_data.  Copy sense
            data to rq->sense on completion if rq->sense is not NULL.  This
            fixes bogus sense data on PC requests.
      
      As a result, remove cdrom_queue_request_sense.
      
      CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide: add helpers for preparing sense requests · e69d800f
      Borislav Petkov authored
      This is in preparation of removing the queueing of a sense request out
      of the IRQ handler path.
      
      Use struct request_sense as a general sense buffer for all ATAPI
      devices ide-{floppy,tape,cd}.
      
      tj: * blk_get_request(__GFP_WAIT) can't be called from do_request() as
            it can cause deadlock.  Converted to use inline struct request
            and blk_rq_init().
      
          * Added xfer / cdb len selection depending on device type.
      
          * All sense prep logics folded into ide_prep_sense() which never
            fails.
      
          * hwif->rq clearing and sense_rq used handling moved into
            ide_queue_sense_rq().
      
          * blk_rq_map_kern() conversion is moved to later patch.
      
      CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-cd: don't abuse rq->buffer · 1f181d2b
      Tejun Heo authored
      Impact: rq->buffer usage cleanup
      
      ide-cd uses rq->buffer to carry pointer to the original request when
      issuing REQUEST_SENSE.  Use rq->special instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • ide-atapi: don't abuse rq->buffer · ac0b0113
      Tejun Heo authored
      Impact: rq->buffer usage cleanup
      
      ide-atapi uses rq->buffer as private opaque value for internal special
      requests.  rq->special isn't used for these cases (the only case where
      rq->special is used is for ide-tape rw requests).  Use rq->special
      instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • ide-taskfile: don't abuse rq->buffer · d868ca24
      Tejun Heo authored
      Impact: rq->buffer usage cleanup
      
      ide_raw_taskfile() directly uses rq->buffer to carry pointer to the
      data buffer.  This complicates both block interface and ide backend
      request handling.  Use blk_rq_map_kern() instead and drop special
      handling for REQ_TYPE_ATA_TASKFILE from ide_map_sg().
      
      Note that REQ_RW setting is moved upwards as blk_rq_map_kern() uses it
      to initialize bio rw flag.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • ide-floppy: block pc always uses bio · 8968932e
      Tejun Heo authored
      Impact: remove unnecessary code path
      
      Block pc requests always use bio and rq->data is always NULL.  No need
      to worry about !rq->bio cases in idefloppy_block_pc_cmd().  Note that
      ide-atapi uses ide_pio_bytes() for bio PIO transfer which handles sg
      fine.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • ide-cd: clear sense buffer before issuing request sense · 59a4f6f3
      Tejun Heo authored
      Impact: code simplification
      
      ide_cd_request_sense_fixup() clears the tail of the sense buffer if
      the device didn't completely fill it.  This patch makes
      cdrom_queue_request_sense() clear the sense buffer before issuing the
      command instead of clearing it afterwards.  This simplifies code and
      eases future changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide kill unused ide_cmd->special · 214ae191
      Tejun Heo authored
      Impact: removal of unused field
      
      No one uses ide_cmd->special anymore.  Kill it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide: don't set REQ_SOFTBARRIER · b2963ac1
      Tejun Heo authored
      ide doesn't have to worry about REQ_SOFTBARRIER.  Don't set it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide: use blk_run_queue() instead of blk_start_queueing() · 220d06b5
      Tejun Heo authored
      blk_start_queueing() is being phased out in favor of
      [__]blk_run_queue().  Switch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ide-tape: remove back-to-back REQUEST_SENSE detection · 0de57fb9
      Tejun Heo authored
      Impact: fix an oops which always triggers
      
      ide_tape_issue_pc() assumed drive->pc isn't NULL on invocation when
      checking for back-to-back request sense issues but drive->pc can be
      NULL and even when it's not NULL, it's not safe to dereference it once
      the previous command is complete because pc could have been freed or
      was on stack.  Kill back-to-back REQUEST_SENSE detection.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • block: clear req->errors on bio completion only for fs requests · 924cec77
      Tejun Heo authored
      Impact: subtle behavior change
      
      For fs requests, rq is only carrier of bios and rq error status as a
      whole doesn't mean much.  This is the reason why rq->errors is being
      cleared on each partial completion of a request as on each partial
      completion the error status is transferred to the respective bios.
      
      For pc requests, rq->errors is used to carry error status to the
      issuer and thus __end_that_request_first() doesn't clear it on such
      cases.
      
      The condition was fine till now as only fs and pc requests have used
      bio and thus the bio completion path.  However, future changes will
      unify data accesses to bio and all non fs users care about rq error
      status.  Clear rq->errors on bio completion only for fs requests.
      
      In general, the implicit clearing is a bit too subtle especially as
      the meaning of rq->errors is completely dependent on low level
      drivers.  Unifying / cleaning up rq->errors usage and letting llds
      manage it would be better.  TODO comment added.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jens Axboe <axboe@kernel.dk>
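The behavior change above only shows up for request types other than fs and pc. The Python below is a toy comparison, not the block layer code: `ToyRequest`, the `kind` strings and both functions are hypothetical, with the old rule modeled as "clear unless pc" and the new rule as "clear only for fs".

```python
from dataclasses import dataclass


@dataclass
class ToyRequest:
    kind: str           # "fs", "pc", or some other hypothetical request type
    errors: int = 0


def partial_completion_old(rq):
    """Old rule in this model: keep errors only for pc requests."""
    if rq.kind != "pc":
        rq.errors = 0


def partial_completion_new(rq):
    """New rule in this model: clear errors only for fs requests."""
    if rq.kind == "fs":
        rq.errors = 0
```

Both rules agree for fs and pc requests; a request of any other kind keeps its error status only under the new rule.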
    • loop: use BIO list management functions · e686307f
      Akinobu Mita authored
      Now that the bio list management stuff is generic, convert loop to use
      bio lists instead of its own private bio list implementation.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • hd: fix locking · e93b9fb7
      Tejun Heo authored
      hd dances around local irq and HD_IRQ enable without achieving much.
      It ends up transferring data from irq handler with both local irq and
      HD_IRQ disabled.  The only place it actually does something is while
      transferring the first block of a request which it does with HD_IRQ
      disabled but local irq enabled.
      
      Unfortunately, the dancing is horribly broken from locking POV.  IRQ
      and timeout handlers access block queue without grabbing the queue
      lock and running the driver in SMP configuration crashes the whole
      machine pretty quickly.
      
      Remove meaningless irq enable/disable dancing and add proper locking
      in issue, irq and timeout paths.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • mg_disk: fix CONFIG_LBD=y warning · 7090a0a9
      Bartlomiej Zolnierkiewicz authored
      drivers/block/mg_disk.c: In function ‘mg_dump_status’:
      drivers/block/mg_disk.c:265: warning: format ‘%ld’ expects type ‘long int’, but
      argument 2 has type ‘sector_t’
      
      [ Impact: kill build warning ]
      
      Cc: unsik Kim <donari75@gmail.com>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • mg_disk: fix locking · ac2ff946
      Tejun Heo authored
      IRQ and timeout handlers call functions which expect locked queue lock
      without locking it.  Fix it.
      
      While at it, convert 0s used as null pointer constant to NULLs.
      
      [ Impact: fix locking, cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: unsik Kim <donari75@gmail.com>
  2. 27 Apr, 2009 7 commits