1. 11 Sep, 2009 11 commits
    • block: add blk-iopoll, a NAPI like approach for block devices · 5e605b64
      Jens Axboe committed
      This borrows some code from NAPI and implements a polled completion
      mode for block devices. The idea is the same as NAPI - instead of
      doing the command completion when the irq occurs, schedule a dedicated
      softirq in the hopes that we will complete more IO when the iopoll
      handler is invoked. Devices have a budget of commands assigned, and will
      stay in polled mode as long as they continue to consume their budget
      from the iopoll softirq handler. If they do not, the device is set back
      to interrupt completion mode.
      
      This patch holds the core bits for blk-iopoll, device driver support
      sold separately.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
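The budget rule described in this commit message can be sketched roughly as follows. This is a toy model, not the blk-iopoll code itself: `struct toy_dev`, `toy_poll()` and `toy_iopoll_pass()` are hypothetical names, and the assumption is that a device leaves polled mode as soon as one pass reaps fewer completions than its budget.

```c
#include <stdbool.h>

/* Hypothetical device model: 'pending' completions wait to be reaped. */
struct toy_dev {
    int pending;   /* completed commands not yet processed */
    bool polled;   /* true while the device is in polled mode */
};

/* Reap at most 'budget' completions and return how many were reaped. */
int toy_poll(struct toy_dev *dev, int budget)
{
    int done = dev->pending < budget ? dev->pending : budget;

    dev->pending -= done;
    return done;
}

/* One pass of the (toy) poll handler for a single device. */
void toy_iopoll_pass(struct toy_dev *dev, int budget)
{
    int done = toy_poll(dev, budget);

    /* Budget not consumed: fall back to interrupt completion mode. */
    if (done < budget)
        dev->polled = false;
}
```

With a budget of 4 and 10 pending completions, the device in this model stays polled for two passes and drops back to interrupt mode on the third, when only 2 completions remain.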
    • block: improve queue_should_plug() by looking at IO depths · fb1e7538
      Jens Axboe committed
      Instead of just checking whether this device uses block layer
      tagging, we can improve the detection by looking at the maximum
      queue depth it has reached. If that crosses 4, then deem it a
      queuing device.
      
      This is important on high IOPS devices, since plugging hurts
      the performance there (it can be as much as 10-15% of the sys
      time).
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
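A minimal sketch of the depth heuristic described above, under stated assumptions: the `toy_` names are hypothetical, "crosses 4" is read as "in-flight count greater than 4", the queuing state is treated as sticky once set, and the plugging decision is reduced to "plug unless the device is a queuing device". The kernel's actual check may combine this with other conditions.

```c
#include <stdbool.h>

#define TOY_QUEUING_DEPTH 4   /* threshold taken from the commit text */

struct toy_queue {
    int in_flight;   /* requests currently issued to the device */
    bool queuing;    /* set once the depth threshold has been crossed */
};

/* Account for a request being issued to the device. */
void toy_issue(struct toy_queue *q)
{
    q->in_flight++;
    if (q->in_flight > TOY_QUEUING_DEPTH)
        q->queuing = true;   /* assumed sticky: never cleared again */
}

/* Account for a request completing. */
void toy_complete(struct toy_queue *q)
{
    if (q->in_flight > 0)
        q->in_flight--;
}

/* Simplified decision: plugging only helps non-queuing devices. */
bool toy_should_plug(const struct toy_queue *q)
{
    return !q->queuing;
}
```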
    • bio: first step in sanitizing the bio->bi_rw flag testing · 1f98a13f
      Jens Axboe committed
      Get rid of any functions that test for these bits and make callers
      use bio_rw_flagged() directly. Then it is at least directly apparent
      what variable and flag they check.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: make bio_rw_flagged() return a bool · e7e503ae
      Jens Axboe committed
      Makes for a saner interface, instead of returning the bit position.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
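To illustrate why a bool return is the saner interface, here is a small sketch with hypothetical names (`struct toy_bio`, `toy_flagged_raw()`, `toy_flagged()`); it is not the kernel's definition of bio_rw_flagged(). The raw variant returns the masked bit value, which is nonzero but not 1, so comparing it against 1 or `true` silently fails.

```c
#include <stdbool.h>

/* Hypothetical flag bit numbers, for illustration only. */
enum toy_rw_flag { TOY_RW_AHEAD = 1, TOY_RW_SYNC = 3 };

struct toy_bio {
    unsigned long rw;   /* bit field of toy_rw_flag bits */
};

/* Raw test: returns the masked value (e.g. 8 for bit 3), not 0 or 1. */
unsigned long toy_flagged_raw(const struct toy_bio *bio, enum toy_rw_flag flag)
{
    return bio->rw & (1UL << flag);
}

/* Bool test: always exactly true or false. */
bool toy_flagged(const struct toy_bio *bio, enum toy_rw_flag flag)
{
    return (bio->rw & (1UL << flag)) != 0;
}
```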
    • Send uevents for write_protect changes · e3264a4d
      Hannes Reinecke committed
      Whenever a block device changes its read-only attribute,
      notify userspace about it.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • cfq-iosched: no need to keep track of busy_rt_queues · d58b85e1
      Vivek Goyal committed
      o Get rid of busy_rt_queues infrastructure. Looks like it is redundant.
      
      o Once an RT queue gets a request, it immediately preempts any BE or IDLE
        queue. Otherwise the queue is put on the service tree, and the scheduler
        selects it before any BE or IDLE queue anyway. Hence there appears to be
        no need to keep track of how many busy RT queues are currently on the
        service tree.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • cfq-iosched: drain device queue before switching to a sync queue · 5ad531db
      Jens Axboe committed
      To lessen the impact of async IO on sync IO, let the device drain
      any async IO in progress when switching to a sync cfqq that has idling
      enabled.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • scsi,block: update SCSI to handle mixed merge failures · da6c5c72
      Tejun Heo committed
      Update scsi_io_completion() so that it fails requests only up to the
      next error boundary and retries the remainder.  This enables the block
      layer to merge requests with different failfast settings and still
      behave correctly on errors.  Allow merging of requests with different
      failfast settings.
      
      As SCSI is currently the only subsystem which follows failfast status,
      there's no need to worry about other block drivers for now.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: implement mixed merge of different failfast requests · 80a761fd
      Tejun Heo committed
      Failfast has different characteristics from other attributes.  When
      issuing, executing and successfully completing requests, failfast
      doesn't make any difference.  It only affects how a request is handled
      on failure.  Allowing requests with different failfast settings to be
      merged causes normal IOs to fail prematurely, while not allowing it has
      performance penalties, as failfast is used for read-aheads, which are
      likely to be located near in-flight or to-be-issued normal IOs.
      
      This patch introduces the concept of 'mixed merge'.  A request is a
      mixed merge if it is a merge of segments which require different
      handling on failure.  Currently the only mixable attributes are
      failfast ones (or lack thereof).
      
      When a bio with different failfast settings is added to an existing
      request or requests of different failfast settings are merged, the
      merged request is marked mixed.  Each bio carries failfast settings
      and the request always tracks the failfast state of the first bio.  When
      the request fails, blk_rq_err_bytes() can be used to determine how
      many bytes can be safely failed without crossing into an area which
      requires further retries.
      
      This allows request merging regardless of failfast settings while
      keeping the failure handling correct.
      
      This patch only implements mixed merge but doesn't enable it.  The
      next one will update SCSI to make use of mixed merge.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
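The error-boundary idea behind blk_rq_err_bytes() might look roughly like the sketch below. It is a simplified model with hypothetical types and names (`struct toy_bio`, `struct toy_req`, `toy_err_bytes()`, `TOY_FAILFAST_MASK`), assuming the request carries the failfast bits of its first bio and that a bio lacking any of those bits marks the start of an area that must be retried rather than failed.

```c
#include <stddef.h>

#define TOY_FAILFAST_MASK 0x7u   /* hypothetical: three failfast bits */

struct toy_bio {
    unsigned int flags;     /* failfast bits carried by this bio */
    unsigned int size;      /* bytes covered by this bio */
    struct toy_bio *next;   /* next bio in the merged request */
};

struct toy_req {
    unsigned int flags;     /* tracks the failfast state of the first bio */
    struct toy_bio *bio;    /* head of the bio chain */
};

/* Bytes from the front of the request that may be failed right away. */
unsigned int toy_err_bytes(const struct toy_req *rq)
{
    unsigned int ff = rq->flags & TOY_FAILFAST_MASK;
    unsigned int bytes = 0;
    const struct toy_bio *b;

    for (b = rq->bio; b != NULL; b = b->next) {
        /* Stop at the first bio that does not share all failfast bits. */
        if ((b->flags & ff) != ff)
            break;
        bytes += b->size;
    }
    return bytes;
}
```

In this model, a request made of two failfast bios followed by a normal bio reports only the bytes of the first two, so the normal part is left for a retry.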
    • block: use the same failfast bits for bio and request · a82afdfc
      Tejun Heo committed
      bio and request use the same set of failfast bits.  This patch makes
      the following changes to simplify things.
      
      * enumify BIO_RW* bits and reorder bits such that BIO_RW_FAILFAST_*
        bits coincide with __REQ_FAILFAST_* bits.
      
      * The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
        but the matching is useless anyway.  init_request_from_bio() is
        responsible for setting FAILFAST bits on FS requests and non-FS
        requests never use BIO_RW_AHEAD.  Drop the code and comment from
        blk_rq_bio_prep().
      
      * Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
        simplify FAILFAST flags handling in init_request_from_bio().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
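Once the bio and request failfast bits occupy the same positions, copying them becomes a single mask operation. The sketch below shows the idea with hypothetical bit numbers and names (`TOY_*`, `toy_init_req_flags()`); the rule that a read-ahead bio gets every failfast bit is an assumption made for illustration, based on the statement elsewhere in this log that failfast is used for read-aheads.

```c
/* Hypothetical bit positions shared by the bio and request flag words. */
enum toy_bit {
    TOY_BIT_FAILFAST_DEV = 0,
    TOY_BIT_FAILFAST_TRANSPORT = 1,
    TOY_BIT_FAILFAST_DRIVER = 2,
    TOY_BIT_RW_AHEAD = 3
};

#define TOY_FAILFAST_DEV       (1u << TOY_BIT_FAILFAST_DEV)
#define TOY_FAILFAST_TRANSPORT (1u << TOY_BIT_FAILFAST_TRANSPORT)
#define TOY_FAILFAST_DRIVER    (1u << TOY_BIT_FAILFAST_DRIVER)
#define TOY_RW_AHEAD           (1u << TOY_BIT_RW_AHEAD)

/* OR of all failfast bits, in the spirit of REQ_FAILFAST_MASK. */
#define TOY_FAILFAST_MASK \
    (TOY_FAILFAST_DEV | TOY_FAILFAST_TRANSPORT | TOY_FAILFAST_DRIVER)

/* Derive the failfast part of the request flags from the bio flags. */
unsigned int toy_init_req_flags(unsigned int bio_rw)
{
    if (bio_rw & TOY_RW_AHEAD)
        return TOY_FAILFAST_MASK;        /* assumed: read-ahead fails fast everywhere */
    return bio_rw & TOY_FAILFAST_MASK;   /* same bit positions, so just mask */
}
```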
    • md: Fix "strchr" [drivers/md/dm-log-userspace.ko] undefined! · 0d03d59d
      Geert Uytterhoeven committed
      Commit b8313b6d ("dm log: remove incorrect
      field from userspace table output") added a call to strstr() with a
      single-character "needle" string parameter.
      
      Unfortunately some versions of gcc replace such calls to strstr() by calls
      to strchr() behind our back.  This causes linking errors if strchr() is
      defined as an inline function in <asm/string.h> (e.g. on m68k):
      
      | WARNING: "strchr" [drivers/md/dm-log-userspace.ko] undefined!
      
      Avoid this by explicitly calling strchr() instead.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
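The substitution is safe because, for a one-character needle, the two standard C functions find the same position. A small user-space sketch (the helper names `find_space_strstr()` and `find_space_strchr()` are hypothetical, and the space character is only an example needle):

```c
#include <string.h>

/* Search with a single-character string needle. */
const char *find_space_strstr(const char *s)
{
    return strstr(s, " ");
}

/* Equivalent search with an explicit character. */
const char *find_space_strchr(const char *s)
{
    return strchr(s, ' ');
}
```

Calling strchr() explicitly gives the same result without relying on the compiler to rewrite the call.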
  2. 10 Sep, 2009 3 commits
    • Merge branch 'lookup-permissions-cleanup' · 526b6780
      Linus Torvalds committed
      * lookup-permissions-cleanup:
        jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()'
        ext[234]: move over to 'check_acl' permission model
        shmfs: use 'check_acl' instead of 'permission'
        Make 'check_acl()' a first-class filesystem op
        Simplify exec_permission_lite(), part 3
        Simplify exec_permission_lite() further
        Simplify exec_permission_lite() logic
        Do not call 'ima_path_check()' for each path component
    • binfmt_elf: fix PT_INTERP bss handling · 752015d1
      Roland McGrath committed
      In fs/binfmt_elf.c, load_elf_interp() calls padzero() for .bss even if
      the PT_LOAD has no PROT_WRITE and no .bss.  This generates EFAULT.
      
      Here is a small test case.  (Yes, there are other, useful PT_INTERP
      which have only .text and no .data/.bss.)
      
      	----- ptinterp.S
      	_start: .globl _start
      		 nop
      		 int3
      	-----
      	$ gcc -m32 -nostartfiles -nostdlib -o ptinterp ptinterp.S
      	$ gcc -m32 -Wl,--dynamic-linker=ptinterp -o hello hello.c
      	$ ./hello
      	Segmentation fault  # during execve() itself
      
      	After applying the patch:
      	$ ./hello
      	Trace trap  # user-mode execution after execve() finishes
      
      If the ELF headers are actually self-inconsistent, then dying is fine.
      But having no PROT_WRITE segment is perfectly normal and correct if
      there is no segment with p_memsz > p_filesz (i.e. bss).  John Reiser
      suggested checking for PROT_WRITE in the bss logic.  I think it makes
      most sense to simply apply the bss logic only when there is bss.
      
      This patch looks less trivial than it is due to some reindentation.
      It just moves the "if (last_bss > elf_bss) {" test up to include the
      partial-page bss logic as well as the more-pages bss logic.
      Reported-by: John Reiser <jreiser@bitwagon.com>
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
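The "only when there is bss" condition can be sketched in user space as below. This is a simplified model, not the load_elf_interp() code: `struct toy_phdr` and `toy_has_bss()` are hypothetical, and the segments are assumed to be the PT_LOAD entries only. The names `elf_bss` (end of file-backed data) and `last_bss` (end of in-memory image) follow the test quoted in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down view of a PT_LOAD program header. */
struct toy_phdr {
    unsigned long vaddr;    /* p_vaddr */
    unsigned long filesz;   /* p_filesz */
    unsigned long memsz;    /* p_memsz */
};

/* True if some segment needs zero-filled memory beyond its file data. */
bool toy_has_bss(const struct toy_phdr *ph, size_t n)
{
    unsigned long elf_bss = 0;    /* highest end of file-backed data */
    unsigned long last_bss = 0;   /* highest end of the memory image */
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long file_end = ph[i].vaddr + ph[i].filesz;
        unsigned long mem_end = ph[i].vaddr + ph[i].memsz;

        if (file_end > elf_bss)
            elf_bss = file_end;
        if (mem_end > last_bss)
            last_bss = mem_end;
    }
    /* The bss padding and mapping logic applies only in this case. */
    return last_bss > elf_bss;
}
```

An interpreter with a single text-only segment, like the ptinterp test case above, yields false here, so no bss handling would be attempted.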
    • Linux 2.6.31 · 74fca6a4
      Linus Torvalds committed
  3. 09 Sep, 2009 11 commits
  4. 08 Sep, 2009 5 commits
  5. 07 Sep, 2009 2 commits
  6. 06 Sep, 2009 8 commits