1. 08 5月, 2013 7 次提交
    • K
      aio: kill batch allocation · a1c8eae7
      Kent Overstreet 提交于
      Previously, allocating a kiocb required touching quite a few global
      (well, per kioctx) cachelines...  so batching up allocation to amortize
      those was worthwhile.  But we've gotten rid of some of those, and in
      another couple of patches kiocb allocation won't require writing to any
      shared cachelines, so that means we can just rip this code out.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a1c8eae7
    • K
      aio: use cancellation list lazily · 0460fef2
      Kent Overstreet 提交于
      Cancelling kiocbs requires adding them to a per kioctx linked list,
      which is one of the few things we need to take the kioctx lock for in
      the fast path.  But most kiocbs can't be cancelled - so if we just do
      this lazily, we can avoid quite a bit of locking overhead.
      
      While we're at it, instead of using a flag bit switch to using ki_cancel
      itself to indicate that a kiocb has been cancelled/completed.  This lets
      us get rid of ki_flags entirely.
      
      [akpm@linux-foundation.org: remove buggy BUG()]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0460fef2
    • K
      aio: make aio_put_req() lockless · 11599eba
      Kent Overstreet 提交于
      Freeing a kiocb needed to touch the kioctx for three things:
      
       * Pull it off the reqs_active list
       * Decrementing reqs_active
       * Issuing a wakeup, if the kioctx was in the process of being freed.
      
      This patch moves these to aio_complete(), for a couple reasons:
      
       * aio_complete() already has to issue the wakeup, so if we drop the
         kioctx refcount before aio_complete does its wakeup we don't have to
         do it twice.
       * aio_complete currently has to take the kioctx lock, so it makes sense
         for it to pull the kiocb off the reqs_active list too.
       * A later patch is going to change reqs_active to include unreaped
         completions - this will mean allocating a kiocb doesn't have to look
         at the ringbuffer. So taking the decrement of reqs_active out of
         kiocb_free() is useful prep work for that patch.
      
      This doesn't really affect cancellation, since existing (usb) code that
      implements a cancel function still calls aio_complete() - we just have
      to make sure that aio_complete does the necessary teardown for cancelled
      kiocbs.
      
      It does affect code paths where we free kiocbs that were never
      submitted; they need to decrement reqs_active and pull the kiocb off the
      reqs_active list.  This occurs in two places: kiocb_batch_free(), which
      is going away in a later patch, and the error path in io_submit_one.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      11599eba
    • K
      aio: move private stuff out of aio.h · 4e179bca
      Kent Overstreet 提交于
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4e179bca
    • K
      aio: kill return value of aio_complete() · 2d68449e
      Kent Overstreet 提交于
      Nothing used the return value, and it probably wasn't possible to use it
      safely for the locked versions (aio_complete(), aio_put_req()).  Just
      kill it.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Acked-by: NZach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2d68449e
    • Z
      aio: remove retry-based AIO · 41003a7b
      Zach Brown 提交于
      This removes the retry-based AIO infrastructure now that nothing in tree
      is using it.
      
      We want to remove retry-based AIO because it is fundemantally unsafe.
      It retries IO submission from a kernel thread that has only assumed the
      mm of the submitting task.  All other task_struct references in the IO
      submission path will see the kernel thread, not the submitting task.
      This design flaw means that nothing of any meaningful complexity can use
      retry-based AIO.
      
      This removes all the code and data associated with the retry machinery.
      The most significant benefit of this is the removal of the locking
      around the unused run list in the submission path.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Signed-off-by: NZach Brown <zab@redhat.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41003a7b
    • Z
      aio: remove dead code from aio.h · 4b49bb8a
      Zach Brown 提交于
      Signed-off-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b49bb8a
  2. 31 7月, 2012 1 次提交
  3. 05 7月, 2012 1 次提交
  4. 03 11月, 2011 1 次提交
    • J
      aio: allocate kiocbs in batches · 080d676d
      Jeff Moyer 提交于
      In testing aio on a fast storage device, I found that the context lock
      takes up a fair amount of cpu time in the I/O submission path.  The reason
      is that we take it for every I/O submitted (see __aio_get_req).  Since we
      know how many I/Os are passed to io_submit, we can preallocate the kiocbs
      in batches, reducing the number of times we take and release the lock.
      
      In my testing, I was able to reduce the amount of time spent in
      _raw_spin_lock_irq by .56% (average of 3 runs).  The command I used to
      test this was:
      
         aio-stress -O -o 2 -o 3 -r 8 -d 128 -b 32 -i 32 -s 16384 <dev>
      
      I also tested the patch with various numbers of events passed to
      io_submit, and I ran the xfstests aio group of tests to ensure I didn't
      break anything.
      Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Daniel Ehrenberg <dehrenberg@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      080d676d
  5. 27 7月, 2011 1 次提交
  6. 28 5月, 2010 1 次提交
    • J
      aio: fix the compat vectored operations · 9d85cba7
      Jeff Moyer 提交于
      The aio compat code was not converting the struct iovecs from 32bit to
      64bit pointers, causing either EINVAL to be returned from io_getevents, or
      EFAULT as the result of the I/O.  This patch passes a compat flag to
      io_submit to signal that pointer conversion is necessary for a given iocb
      array.
      
      A variant of this was tested by Michael Tokarev.  I have also updated the
      libaio test harness to exercise this code path with good success.
      Further, I grabbed a copy of ltp and ran the
      testcases/kernel/syscall/readv and writev tests there (compiled with -m32
      on my 64bit system).  All seems happy, but extra eyes on this would be
      welcome.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix CONFIG_COMPAT=n build]
      Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
      Reported-by: NMichael Tokarev <mjt@tls.msk.ru>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>		[2.6.35.1]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9d85cba7
  7. 16 12月, 2009 1 次提交
  8. 20 9月, 2009 1 次提交
  9. 01 7月, 2009 1 次提交
  10. 29 12月, 2008 1 次提交
    • J
      aio: make the lookup_ioctx() lockless · abf137dd
      Jens Axboe 提交于
      The mm->ioctx_list is currently protected by a reader-writer lock,
      so we always grab that lock on the read side for doing ioctx
      lookups. As the workload is extremely reader biased, turn this into
      an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
      the rwlock and use a spinlock for providing update side exclusion.
      
      There's usually only 1 entry on this list, so it doesn't make sense
      to look into fancier data structures.
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      abf137dd
  11. 17 10月, 2008 1 次提交
  12. 27 7月, 2008 1 次提交
  13. 29 4月, 2008 1 次提交
  14. 19 2月, 2008 1 次提交
    • A
      fs/block_dev.c: remove #if 0'ed code · 86b6c7a7
      Adrian Bunk 提交于
      Commit b2e895db #if 0'ed this code stating:
      
      <--  snip  -->
      
          [PATCH] revert blockdev direct io back to 2.6.19 version
      
          Andrew Vasquez is reporting as-iosched oopses and a 65% throughput
          slowdown due to the recent special-casing of direct-io against
          blockdevs.  We don't know why either of these things are occurring.
      
          The patch minimally reverts us back to the 2.6.19 code for a 2.6.20
          release.
      
      <--  snip  -->
      
      It has since been dead code, and unless someone wants to revive it now
      it's time to remove it.
      
      This patch also makes bio_release_pages() static again and removes the
      ki_bio_count member from struct kiocb, reverting changes that had been
      done for this dead code.
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Signed-off-by: NJens Axboe <axboe@carl.home.kernel.dk>
      86b6c7a7
  15. 14 2月, 2008 1 次提交
  16. 19 10月, 2007 1 次提交
  17. 20 7月, 2007 1 次提交
  18. 11 5月, 2007 1 次提交
    • D
      signal/timer/event: KAIO eventfd support example · 9c3060be
      Davide Libenzi 提交于
      This is an example about how to add eventfd support to the current KAIO code,
      in order to enable KAIO to post readiness events to a pollable fd (hence
      compatible with POSIX select/poll).  The KAIO code simply signals the eventfd
      fd when events are ready, and this triggers a POLLIN in the fd.  This patch
      uses a reserved for future use member of the struct iocb to pass an eventfd
      file descriptor, that KAIO will use to post events every time a request
      completes.  At that point, an aio_getevents() will return the completed result
      to a struct io_event.  I made a quick test program to verify the patch, and it
      runs fine here:
      
      http://www.xmailserver.org/eventfd-aio-test.c
      
      The test program uses poll(2), but it'd, of course, work with select and epoll
      too.
      
      This can allow to schedule both block I/O and other poll-able devices
      requests, and wait for results using select/poll/epoll.  In a typical
      scenario, an application would submit KAIO request using aio_submit(), and
      will also use epoll_ctl() on the whole other class of devices (that with the
      addition of signals, timers and user events, now it's pretty much complete),
      and then would:
      
      	epoll_wait(...);
      	for_each_event {
      		if (curr_event_is_kaiofd) {
      			aio_getevents();
      			dispatch_aio_events();
      		} else {
      			dispatch_epoll_event();
      		}
      	}
      Signed-off-by: NDavide Libenzi <davidel@xmailserver.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9c3060be
  19. 10 5月, 2007 1 次提交
  20. 14 12月, 2006 1 次提交
    • C
      [PATCH] optimize o_direct on block devices · e61c9018
      Chen, Kenneth W 提交于
      Implement block device specific .direct_IO method instead of going through
      generic direct_io_worker for block device.
      
      direct_io_worker() is fairly complex because it needs to handle O_DIRECT on
      file system, where it needs to perform block allocation, hole detection,
      extents file on write, and tons of other corner cases.  The end result is
      that it takes tons of CPU time to submit an I/O.
      
      For block device, the block allocation is much simpler and a tight triple
      loop can be written to iterate each iovec and each page within the iovec in
      order to construct/prepare bio structure and then subsequently submit it to
      the block layer.  This significantly speeds up O_D on block device.
      
      [akpm@osdl.org: small speedup]
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Zach Brown <zach.brown@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e61c9018
  21. 08 12月, 2006 1 次提交
  22. 22 11月, 2006 1 次提交
    • D
      WorkStruct: Separate delayable and non-delayable events. · 52bad64d
      David Howells 提交于
      Separate delayable work items from non-delayable work items be splitting them
      into a separate structure (delayed_work), which incorporates a work_struct and
      the timer_list removed from work_struct.
      
      The work_struct struct is huge, and this limits it's usefulness.  On a 64-bit
      architecture it's nearly 100 bytes in size.  This reduces that by half for the
      non-delayable type of event.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      52bad64d
  23. 01 10月, 2006 4 次提交
  24. 09 1月, 2006 1 次提交
  25. 14 11月, 2005 2 次提交
    • Z
      [PATCH] aio: don't ref kioctx after decref in put_ioctx · 5ef1c49f
      Zach Brown 提交于
      put_ioctx's refcount debugging was doing an atomic_read after dropping its
      reference when it wasn't the last ref, leaving a tiny race for another freeing
      thread to sneak into.  This shifts the debugging before the ops, uses BUG_ON,
      and reformats the defines a little.  Sadly, moving to inlines increased the
      code size but this change decreases the code size by a whole 9 bytes :)
      Signed-off-by: NZach Brown <zach.brown@oracle.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      5ef1c49f
    • Z
      [PATCH] aio: remove kioctx from mm_struct · 20dcae32
      Zach Brown 提交于
      Sync iocbs have a life cycle that don't need a kioctx.  Their retrying, if
      any, is done in the context of their owner who has allocated them on the
      stack.
      
      The sole user of a sync iocb's ctx reference was aio_complete() checking for
      an elevated iocb ref count that could never happen.  No path which grabs an
      iocb ref has access to sync iocbs.
      
      If we were to implement sync iocb cancelation it would be done by the owner of
      the iocb using its on-stack reference.
      
      Removing this chunk from aio_complete allows us to remove the entire kioctx
      instance from mm_struct, reducing its size by a third.  On a i386 testing box
      the slab size went from 768 to 504 bytes and from 5 to 8 per page.
      Signed-off-by: NZach Brown <zach.brown@oracle.com>
      Acked-by: NBenjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      20dcae32
  26. 07 11月, 2005 1 次提交
    • Z
      [PATCH] aio: remove aio_max_nr accounting race · d55b5fda
      Zach Brown 提交于
      AIO was adding a new context's max requests to the global total before
      testing if that resulting total was over the global limit.  This let
      innocent tasks get their new limit tested along with a racing guilty task
      that was crossing the limit.  This serializes the _nr accounting with a
      spinlock It also switches to using unsigned long for the global totals.
      Individual contexts are still limited to an unsigned int's worth of
      requests by the syscall interface.
      
      The problem and fix were verified with a simple program that spun creating
      and destroying a context while holding on to another long lived context.
      Before the patch a task creating a tiny context could get a spurious EAGAIN
      if it raced with a task creating a very large context that overran the
      limit.
      Signed-off-by: NZach Brown <zach.brown@oracle.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d55b5fda
  27. 18 10月, 2005 1 次提交
  28. 01 10月, 2005 1 次提交
    • Z
      [PATCH] aio: remove unlocked task_list test and resulting race · 897f15fb
      Zach Brown 提交于
      Only one of the run or kick path is supposed to put an iocb on the run
      list.  If both of them do it than one of them can end up referencing a
      freed iocb.  The kick path could delete the task_list item from the wait
      queue before getting the ctx_lock and putting the iocb on the run list.
      The run path was testing the task_list item outside the lock so that it
      could catch ki_retry methods that return -EIOCBRETRY *without* putting the
      iocb on a wait queue and promising to call kick_iocb.  This unlocked check
      could then race with the kick path to cause both to try and put the iocb on
      the run list.
      
      The patch stops the run path from testing task_list by requring that any
      ki_retry that returns -EIOCBRETRY *must* guarantee that kick_iocb() will be
      called in the future.  aio_p{read,write}, the only in-tree -EIOCBRETRY
      users, are updated.
      Signed-off-by: NZach Brown <zach.brown@oracle.com>
      Signed-off-by: NBenjamin LaHaise <bcrl@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      897f15fb
  29. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4