Commits · 1e142b29e210b5dfb2deeb6ce2210b60af16d2a6 · openeuler / Kernel

24 Feb, 2013 2 commits

mm: make do_mmap_pgoff return populate as a size in bytes, not as a bool · 41badc15

Authored by Michel Lespinasse on Feb 22, 2013

do_mmap_pgoff() rounds up the desired size to the next PAGE_SIZE
multiple, however there was no equivalent code in mm_populate(), which
caused issues.

This could be fixed by introducing the same rounding in mm_populate(),
however I think it's preferable to make do_mmap_pgoff() return populate
as a size rather than as a boolean, so we don't have to duplicate the
size rounding logic in mm_populate().

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Greg Ungerer <gregungerer@westnet.com.au>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

41badc15
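
The rounding described in the commit above can be sketched roughly as follows. This is an illustration only, not the kernel code: `SKETCH_PAGE_SIZE`, `page_align_up()` and `mmap_sketch()` are hypothetical names, and the 4096-byte page size is an assumption. The point is that the mapping routine rounds the length once and hands the rounded size back, so a later populate step does not need its own rounding logic.

```c
/* Assumed page size for illustration; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Round len up to the next multiple of the (power-of-two) page size.
 * Overflow for lengths near ULONG_MAX is not handled in this sketch.
 */
static unsigned long page_align_up(unsigned long len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical stand-in for a mapping routine: it rounds the length once
 * and reports through *populate how many bytes the caller should populate
 * (0 when no population was requested).
 */
static unsigned long mmap_sketch(unsigned long len, int want_populate,
				 unsigned long *populate)
{
	unsigned long rounded = page_align_up(len);

	*populate = want_populate ? rounded : 0;
	return rounded;
}
```

A caller would pass the returned `populate` value straight to its populate step and skip it entirely when the value is 0.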

mm: introduce mm_populate() for populating new vmas · bebeb3d6

Authored by Michel Lespinasse on Feb 22, 2013

When creating new mappings using the MAP_POPULATE / MAP_LOCKED flags (or
with MCL_FUTURE in effect), we want to populate the pages within the
newly created vmas.  This may take a while as we may have to read pages
from disk, so ideally we want to do this outside of the write-locked
mmap_sem region.

This change introduces mm_populate(), which is used to defer populating
such mappings until after the mmap_sem write lock has been released.
This is implemented as a generalization of the former do_mlock_pages(),
which accomplished the same task but was used during mlock() /
mlockall().

Signed-off-by: Michel Lespinasse <walken@google.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Greg Ungerer <gregungerer@westnet.com.au>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bebeb3d6

23 Jul, 2012 1 commit

aio: now fput() is OK from interrupt context; get rid of manual delayed __fput() · 3ffa3c0e

Authored by Al Viro on Jun 24, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3ffa3c0e

01 Jun, 2012 2 commits

switch aio and shm to do_mmap_pgoff(), make do_mmap() static · e3fc629d

Authored by Al Viro on May 30, 2012

after all, 0 bytes and 0 pages is the same thing...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e3fc629d

aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() · ac34ebb3

Authored by Christopher Yeoh on May 31, 2012

A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first parameter type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to. This is used
by process_vm_readv/writev where we need to validate that an iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ac34ebb3

22 May, 2012 1 commit

vfs: make AIO use the proper rw_verify_area() area helpers · a70b52ec

Authored by Linus Torvalds on May 21, 2012

We had for some reason overlooked the AIO interface, and it didn't use
the proper rw_verify_area() helper function that checks (for example)
mandatory locking on the file, and that the size of the access doesn't
cause us to overflow the provided offset limits etc.

Instead, AIO did just the security_file_permission() thing (that
rw_verify_area() also does) directly.

This fixes it to do all the proper helper functions, which not only
means that now mandatory file locking works with AIO too, we can
actually remove lines of code.

Reported-by: Manish Honap <manish_honap_vit@yahoo.co.in>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a70b52ec
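
One of the checks the message attributes to rw_verify_area() is that the access size must not overflow the offset limits. A minimal sketch of that kind of range check is shown below. It is not the kernel's implementation: `verify_range_sketch()` is a hypothetical name, and the mandatory-locking and security-hook parts of the helper are not modelled at all.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if the byte range [pos, pos + count) fits in a signed 64-bit
 * file offset, or -EINVAL if the offset is negative or the end of the
 * range would overflow.
 */
static int verify_range_sketch(int64_t pos, uint64_t count)
{
	/* The length itself must be representable as a signed offset. */
	if (count > (uint64_t)INT64_MAX)
		return -EINVAL;
	if (pos < 0)
		return -EINVAL;
	/* Written as a subtraction so the check itself cannot overflow. */
	if ((uint64_t)pos > (uint64_t)INT64_MAX - count)
		return -EINVAL;
	return 0;
}
```

The comparison is phrased as `pos > MAX - count` rather than `pos + count > MAX` because the addition is exactly the operation that might wrap.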

21 Apr, 2012 3 commits

kill mm argument of vm_munmap() · bfce281c

Authored by Al Viro on Apr 20, 2012

it's always current->mm

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bfce281c

aio: don't bother with unmapping when aio_free_ring() is coming from exit_aio() · 936af157

Authored by Al Viro on Apr 20, 2012

... since exit_mmap() is coming and it will munmap() everything anyway.
In all other cases aio_free_ring() has ctx->mm == current->mm; moreover,
all other callers of vm_munmap() have mm == current->mm, so this will
allow us to get rid of mm argument of vm_munmap().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

936af157

VM: add "vm_munmap()" helper function · a46ef99d

Authored by Linus Torvalds on Apr 20, 2012

Like the vm_brk() function, this is the same as "do_munmap()", except it
does the VM locking for the caller.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a46ef99d

01 Apr, 2012 2 commits

aio: take final put_ioctx() into callers of io_destroy() · a2e1859a

Authored by Al Viro on Mar 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a2e1859a

aio: merge aio_cancel_all() with wait_for_all_aios() · 06af121e

Authored by Al Viro on Mar 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06af121e

21 Mar, 2012 6 commits

aio: fix the comment in aio_kick_handler() · 9fcf03d0

Authored by Al Viro on Mar 13, 2012

It should've been changed when queue_work() became
queue_delayed_work(..., 0) in there.  It had always been
about not needing a delay, not about not using a specific
function...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9fcf03d0

aio: don't bother with cancel_delayed_work() in exit_aio() · cd1ea261

Authored by Al Viro on Mar 11, 2012

__put_ioctx() will cover it anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cd1ea261

aio: use cancel_delayed_work_sync() · bf50722a

Authored by Al Viro on Mar 11, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bf50722a

aio: aio_nr_lock is taken only synchronously now · 9fa1cb39

Authored by Al Viro on Mar 10, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9fa1cb39

aio: aio_nr decrements don't need to be delayed · 2dd542b7

Authored by Al Viro on Mar 10, 2012

we can do that right in __put_ioctx(); as the result, the loop
in ioctx_alloc() can be killed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2dd542b7

aio: don't bother with async freeing on failure in ioctx_alloc() · e23754f8

Authored by Al Viro on Mar 06, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e23754f8

20 Mar, 2012 1 commit

fs: remove the second argument of k[un]map_atomic() · e8e3c3d6

Authored by Cong Wang on Nov 25, 2011

Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Cong Wang <amwang@redhat.com>

e8e3c3d6

10 Mar, 2012 2 commits

aio: fix the "too late munmap()" race · c7b28555

Authored by Al Viro on Mar 08, 2012

Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing.  As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c7b28555

aio: fix io_setup/io_destroy race · 86b62a2c

Authored by Al Viro on Mar 07, 2012

Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit.  The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

86b62a2c

06 Mar, 2012 1 commit

aio: wake up waiters when freeing unused kiocbs · 880641bb

Authored by Jeff Moyer on Mar 05, 2012

Bart Van Assche reported a hung fio process when either hot-removing
storage or when interrupting the fio process itself.  The (pruned) call
trace for the latter looks like so:

  fio             D 0000000000000001     0  6849   6848 0x00000004
   ffff880092541b88 0000000000000046 ffff880000000000 ffff88012fa11dc0
   ffff88012404be70 ffff880092541fd8 ffff880092541fd8 ffff880092541fd8
   ffff880128b894d0 ffff88012404be70 ffff880092541b88 000000018106f24d
  Call Trace:
    schedule+0x3f/0x60
    io_schedule+0x8f/0xd0
    wait_for_all_aios+0xc0/0x100
    exit_aio+0x55/0xc0
    mmput+0x2d/0x110
    exit_mm+0x10d/0x130
    do_exit+0x671/0x860
    do_group_exit+0x44/0xb0
    get_signal_to_deliver+0x218/0x5a0
    do_signal+0x65/0x700
    do_notify_resume+0x65/0x80
    int_signal+0x12/0x17

The problem lies with the allocation batching code.  It will
opportunistically allocate kiocbs, and then trim back the list of iocbs
when there is not enough room in the completion ring to hold all of the
events.

In the case above, what happens is that the pruning back of events ends
up freeing up the last active request and the context is marked as dead,
so it is thus responsible for waking up waiters.  Unfortunately, the
code does not check for this condition, so we end up with a hung task.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@kernel.org>		[3.2.x only]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

880641bb

29 Feb, 2012 1 commit

fs: reduce the use of module.h wherever possible · 630d9c47

Authored by Paul Gortmaker on Nov 16, 2011

For files only using THIS_MODULE and/or EXPORT_SYMBOL, map
them onto including export.h -- or if the file isn't even
using those, then just delete the include.  Fix up any implicit
include dependencies that were being masked by module.h along
the way.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

630d9c47

14 Jan, 2012 1 commit

Unused iocbs in a batch should not be accounted as active. · 69e4747e

Authored by Gleb Natapov on Jan 08, 2012

Since commit 080d676d ("aio: allocate kiocbs in batches") iocbs are
allocated in a batch during processing of first iocbs.  All iocbs in a
batch are automatically added to ctx->active_reqs list and accounted in
ctx->reqs_active.

If one (not the last one) of the iocbs submitted by a user fails, further
iocbs are not processed, but they are still present in ctx->active_reqs
and accounted in ctx->reqs_active.  This causes the process to get stuck
in D state in wait_for_all_aios() on exit since ctx->reqs_active will
never go down to zero.  Furthermore, since kiocb_batch_free() frees an
iocb without removing it from the active_reqs list, the list becomes
corrupted, which may cause an oops.

Fix this by removing iocb from ctx->active_reqs and updating
ctx->reqs_active in kiocb_batch_free().

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@kernel.org   # 3.2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

69e4747e

03 Nov, 2011 1 commit

aio: allocate kiocbs in batches · 080d676d

Authored by Jeff Moyer on Nov 02, 2011

In testing aio on a fast storage device, I found that the context lock
takes up a fair amount of cpu time in the I/O submission path.  The reason
is that we take it for every I/O submitted (see __aio_get_req).  Since we
know how many I/Os are passed to io_submit, we can preallocate the kiocbs
in batches, reducing the number of times we take and release the lock.

In my testing, I was able to reduce the amount of time spent in
_raw_spin_lock_irq by .56% (average of 3 runs).  The command I used to
test this was:

   aio-stress -O -o 2 -o 3 -r 8 -d 128 -b 32 -i 32 -s 16384 <dev>

I also tested the patch with various numbers of events passed to
io_submit, and I ran the xfstests aio group of tests to ensure I didn't
break anything.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Daniel Ehrenberg <dehrenberg@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

080d676d
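
The effect of batching on lock traffic can be illustrated with a toy model. The sketch below is not the aio code: `struct ctx_sketch` and its helpers are hypothetical, and the "lock" is only a counter of acquisitions, which is enough to compare one lock round-trip per request against one per batch.

```c
/* Hypothetical context whose lock merely counts how often it is taken. */
struct ctx_sketch {
	unsigned long lock_acquisitions;
	unsigned long allocated;
};

static void ctx_lock(struct ctx_sketch *ctx)
{
	ctx->lock_acquisitions++;
}

static void ctx_unlock(struct ctx_sketch *ctx)
{
	(void)ctx;	/* nothing to release in this model */
}

/* Unbatched: one lock round-trip for every request. */
static void alloc_one_at_a_time(struct ctx_sketch *ctx, unsigned long nr)
{
	for (unsigned long i = 0; i < nr; i++) {
		ctx_lock(ctx);
		ctx->allocated++;
		ctx_unlock(ctx);
	}
}

/* Batched: one lock round-trip for every group of up to `batch` requests. */
static void alloc_in_batches(struct ctx_sketch *ctx, unsigned long nr,
			     unsigned long batch)
{
	if (batch == 0)
		batch = 1;	/* avoid looping forever on a zero batch size */

	while (nr > 0) {
		unsigned long n = nr < batch ? nr : batch;

		ctx_lock(ctx);
		ctx->allocated += n;
		ctx_unlock(ctx);
		nr -= n;
	}
}
```

With 100 requests and a batch size of 32, the batched variant takes the lock 4 times instead of 100.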

01 Nov, 2011 1 commit

Cross Memory Attach · fcf63409

Authored by Christopher Yeoh on Oct 31, 2011

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

Use of /proc/pid/mem has been considered, but there are issues with
using it:

- Does not allow for specifying iovecs for both src and dest; assuming
  preadv or pwritev was implemented, either the area read from or
  written to would need to be contiguous.
- Currently mem_read allows only processes who are currently
  ptrace'ing the target and are still able to ptrace the target to read
  from the target. This check could possibly be moved to the open call,
  but it's not clear exactly what race this restriction is stopping
  (reason appears to have been lost)
- Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
  domain socket is a bit ugly from a userspace point of view,
  especially when you may have hundreds if not (eventually) thousands
  of processes that all need to do this with each other
- Doesn't allow for some future use of the interface we would like to
  consider adding in the future (see below)
- Interestingly reading from /proc/pid/mem currently actually
  involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since the reader and writer need to work co-operatively, you
block if the pipe is not drained, which requires some wrapping to do
non-blocking on the send side or polling on the receive side.  In
all-to-all communication it requires ordering, otherwise you can
deadlock.  And in the example of many MPI tasks writing to one MPI task
vmsplice serialises the copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could achieve.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a math operation (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

We don't yet have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI).  However, this interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architectures should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <linux-man@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fcf63409

23 Mar, 2011 1 commit

aio: wake all waiters when destroying ctx · e91f90bb

Authored by Roland Dreier on Mar 22, 2011

The test program below will hang because io_getevents() uses
add_wait_queue_exclusive(), which means the wake_up() in io_destroy() only
wakes up one of the threads.  Fix this by using wake_up_all() in the aio
code paths where we want to make sure no one gets stuck.

	// t.c -- compile with gcc -lpthread -laio t.c

	#include <libaio.h>
	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	static const int nthr = 2;

	void *getev(void *ctx)
	{
		struct io_event ev;
		io_getevents(ctx, 1, 1, &ev, NULL);
		printf("io_getevents returned\n");
		return NULL;
	}

	int main(int argc, char *argv[])
	{
		io_context_t ctx = 0;
		pthread_t thread[nthr];
		int i;

		io_setup(1024, &ctx);

		for (i = 0; i < nthr; ++i)
			pthread_create(&thread[i], NULL, getev, ctx);

		sleep(1);

		io_destroy(ctx);

		for (i = 0; i < nthr; ++i)
			pthread_join(thread[i], NULL);

		return 0;
	}

Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e91f90bb

10 Mar, 2011 3 commits

aio: remove request submission batching · cf15900e

Authored by Jens Axboe on Mar 02, 2011

This should be useless now that we have on-stack plugging. So let's just
kill it.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

cf15900e

fs: make aio plug · 9f5b9425

Authored by Shaohua Li on Jul 01, 2010

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

9f5b9425

block: remove per-queue plugging · 7eaceacc

Authored by Jens Axboe on Mar 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7eaceacc

26 Feb, 2011 2 commits

aio: fix race between io_destroy() and io_submit() · 7137c6bd

Authored by Jan Kara on Feb 25, 2011

A race can occur when io_submit() races with io_destroy():

 CPU1						CPU2
io_submit()
  do_io_submit()
    ...
    ctx = lookup_ioctx(ctx_id);
						io_destroy()
    Now do_io_submit() holds the last reference to ctx.
    ...
    queue new AIO
    put_ioctx(ctx) - frees ctx with active AIOs

We solve this issue by checking whether ctx is being destroyed in AIO
submission path after adding new AIO to ctx.  Then we are guaranteed that
either io_destroy() waits for new AIO or we see that ctx is being
destroyed and bail out.

Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7137c6bd

aio: fix rcu ioctx lookup · 3bd9a5d7

Authored by Nick Piggin on Feb 25, 2011

aio-dio-invalidate-failure GPFs in aio_put_req from io_submit.

lookup_ioctx doesn't implement the rcu lookup pattern properly.
rcu_read_lock does not prevent refcount going to zero, so we might take
a refcount on a zero count ioctx.

Fix the bug by atomically testing for zero refcount before incrementing.

[jack@suse.cz: added comment into the code]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3bd9a5d7
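
The "test for zero before incrementing" step can be sketched in userspace with C11 atomics. This is an analogue of the idea, not the kernel code: `ref_get_unless_zero()` is a hypothetical name, and a plain `atomic_int` stands in for the context's reference count.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count is still non-zero. Returns true on
 * success; false means the object is already being torn down and must
 * not be used.
 */
static bool ref_get_unless_zero(atomic_int *refcount)
{
	int old = atomic_load(refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

A lookup path would call this on the object it found and treat a `false` result the same as "not found", instead of incrementing unconditionally.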

27 Jan, 2011 1 commit

fs/aio: aio_wq isn't used in memory reclaim path · d37adaa1

Authored by Tejun Heo on Jan 26, 2011

aio_wq isn't used during memory reclaim.  Convert to alloc_workqueue()
without WQ_MEM_RECLAIM.  It's possible to use system_wq but given that
the number of work items is determined from userland and the work item
may block, enforcing strict concurrency limit would be a good idea.

Also, move fput_work to system_wq so that aio_wq is used solely to
throttle the max concurrency of aio work items and fput_work doesn't
interact with other work items.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: linux-aio@kvack.org

d37adaa1

17 Jan, 2011 1 commit

aio: check return value of create_workqueue() · 27eaa1c9

Authored by Namhyung Kim on Dec 14, 2010

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

27eaa1c9

14 Jan, 2011 2 commits

aio: remove unused aio_run_iocbs() · d3486f8b

Authored by Jeff Moyer on Jan 12, 2011

aio_run_iocbs() is not used at all, so get rid of it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d3486f8b

aio: remove unnecessary check · 2e410255

Authored by Namhyung Kim on Jan 12, 2011

'nr >= min_nr >= 0' always satisfies 'nr >= 0' so the check is unnecessary.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2e410255

26 Oct, 2010 2 commits

new helper: ihold() · 7de9c6ee

Authored by Al Viro on Oct 23, 2010

Clones an existing reference to inode; caller must already hold one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7de9c6ee

aio: bump i_count instead of using igrab · 306fb097

Authored by Chris Mason on Aug 23, 2010

The aio batching code is using igrab to get an extra reference on the
inode so it can safely batch.  igrab will go ahead and take the global
inode spinlock, which can be a bottleneck on large machines doing lots
of AIO.

In this case, igrab isn't required because we already have a reference
on the file handle.  It is safe to just bump the i_count directly
on the inode.

Benchmarking shows this patch brings IOP/s on tons of flash up by about
2.5X.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

306fb097

23 Sep, 2010 1 commit

aio: do not return ERESTARTSYS as a result of AIO · a0c42bac

Authored by Jan Kara on Sep 22, 2010

OCFS2 can return ERESTARTSYS from its write function when the process is
signalled while waiting for a cluster lock (and the filesystem is mounted
with intr mount option).  Generally, it seems reasonable to allow
filesystems to return this error code from its IO functions.  As we must
not leak ERESTARTSYS (and similar error codes) to userspace as a result of
an AIO operation, we have to properly convert it to EINTR inside AIO code
(restarting the syscall isn't really an option because other AIO could
have been already submitted by the same io_submit syscall).

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a0c42bac
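
The conversion described above amounts to a small mapping applied to the result before it is reported to userspace. The sketch below is illustrative only: `aio_fixup_result()` is a hypothetical name, the `SKETCH_` constants are defined locally because the restart codes are kernel-internal and not exported to userspace headers, and the exact set of "similar error codes" handled is an assumption.

```c
#include <errno.h>

/* Kernel-internal restart codes, redefined here for illustration. */
#define SKETCH_ERESTARTSYS           512
#define SKETCH_ERESTARTNOINTR        513
#define SKETCH_ERESTARTNOHAND        514
#define SKETCH_ERESTART_RESTARTBLOCK 516

/*
 * Map restart codes, which only make sense while a syscall can still be
 * restarted, to -EINTR. Every other result passes through unchanged.
 */
static long aio_fixup_result(long ret)
{
	switch (ret) {
	case -SKETCH_ERESTARTSYS:
	case -SKETCH_ERESTARTNOINTR:
	case -SKETCH_ERESTARTNOHAND:
	case -SKETCH_ERESTART_RESTARTBLOCK:
		return -EINTR;
	default:
		return ret;
	}
}
```

Successful byte counts and ordinary errors such as -EIO are returned as they are.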

15 Sep, 2010 1 commit

aio: check for multiplication overflow in do_io_submit · 75e1c70f

Authored by Jeff Moyer on Sep 10, 2010

Tavis Ormandy pointed out that do_io_submit does not do proper bounds
checking on the passed-in iocb array:

       if (unlikely(nr < 0))
               return -EINVAL;

       if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
               return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

The attached patch checks for overflow, and if it is detected, the
number of iocbs submitted is scaled down to a number that will fit in
the long.  This is an ok thing to do, as sys_io_submit is documented as
returning the number of iocbs submitted, so callers should handle a
return value of less than the 'nr' argument passed in.

Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

75e1c70f
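
The scaling-down step can be sketched as a clamp applied before the multiplication. This is a standalone illustration, not the patch itself: `clamp_nr_iocbs()` is a hypothetical name, and returning -1 for a negative count stands in for the kernel's -EINVAL.

```c
#include <limits.h>

/*
 * Limit nr so that nr * sizeof(pointer) cannot overflow a long.
 * Returns the possibly reduced count, or -1 if nr is negative.
 */
static long clamp_nr_iocbs(long nr)
{
	const long max_nr = LONG_MAX / (long)sizeof(void *);

	if (nr < 0)
		return -1;
	if (nr > max_nr)
		nr = max_nr;
	return nr;
}
```

Because the syscall is documented as returning how many iocbs were actually submitted, a caller that passes an oversized count simply sees a smaller return value.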

06 Aug, 2010 1 commit

aio: fix wrong subsystem comments · 642b5123

Authored by Satoru Takeuchi on Aug 05, 2010

 - sys_io_destroy(): actually return -EINVAL if the context pointed to
   is invalid
 - sys_io_getevents(): An argument specifying timeout is not `when',
   but `timeout'.
 - sys_io_getevents(): Should describe what is returned if this syscall
   succeeds.

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

642b5123

openeuler / Kernel: synced successfully about 1 year ago