Commits · 71d8e532b1549a478e6a6a8a44f309d050294d00 · openeuler / Kernel

07 May, 2014 2 commits

start adding the tag to iov_iter · 71d8e532

Committed by Al Viro on Mar 05, 2014

For now, just use the same thing we pass to ->direct_IO() - it's all
iovec-based at the moment.  Pass it explicitly to iov_iter_init() and
account for kvec vs. iovec in there, by the same kludge NFS ->direct_IO()
uses.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

71d8e532

kill iov_iter_copy_from_user() · e7c24607

Committed by Al Viro on Apr 10, 2014

all callers can use copy_page_from_iter() and it actually simplifies
them.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e7c24607

04 Apr, 2014 1 commit

mm/process_vm_access.c: mark function as static · 2eb2e141

Committed by Rashika Kheria on Apr 03, 2014

Mark function as static in process_vm_access.c because it is not used
outside this file.

This eliminates the following warning in mm/process_vm_access.c:

  mm/process_vm_access.c:416:1: warning: no previous prototype for `compat_process_vm_rw' [-Wmissing-prototypes]

[akpm@linux-foundation.org: remove unneeded asmlinkage - compat_process_vm_rw isn't referenced from asm]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2eb2e141

02 Apr, 2014 10 commits

process_vm_access: tidy up a bit · 4bafbec7

Committed by Al Viro on Feb 05, 2014

saner variable names, update linuxdoc comments, etc.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4bafbec7

process_vm_access: don't bother with returning the amounts of bytes copied · 9acc1a0f

Committed by Al Viro on Feb 05, 2014

we can calculate that in the caller just fine, TYVM
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9acc1a0f

process_vm_rw_pages(): pass accurate amount of bytes · e21345f9

Committed by Al Viro on Feb 05, 2014

... makes passing the amount of pages unnecessary
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e21345f9

process_vm_access: take get_user_pages/put_pages one level up · 70eca12d

Committed by Al Viro on Feb 05, 2014

... and trim the fuck out of process_vm_rw_pages() argument list.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

70eca12d

process_vm_access: switch to copy_page_to_iter/iov_iter_copy_from_user · 240f3905

Committed by Al Viro on Feb 05, 2014

... rather than open-coding those.  As a side benefit, we get much saner
loop calling those; we can just feed entire pages, instead of the "copy
would span the iovec boundary, let's do it in two loop iterations" mess.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

240f3905
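
The "copy would span the iovec boundary" problem mentioned in 240f3905 can be illustrated in userspace. The following is a minimal sketch, not the kernel's implementation: `struct iov_cursor`, `cursor_copy_to()` and `demo_cursor()` are hypothetical names invented for this example, and the cursor only loosely mirrors the idea behind `iov_iter`. The caller hands over one contiguous buffer and the helper takes care of crossing segment boundaries:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical userspace cursor over an iovec array (not a kernel type). */
struct iov_cursor {
    const struct iovec *iov; /* current segment */
    size_t nr_segs;          /* segments remaining, including the current one */
    size_t offset;           /* bytes already consumed in the current segment */
};

/*
 * Copy up to len bytes from src into the segments behind the cursor,
 * advancing it as segments fill up. Returns the number of bytes copied,
 * which is less than len only if the segments run out.
 */
size_t cursor_copy_to(struct iov_cursor *c, const void *src, size_t len)
{
    const char *p = src;
    size_t copied = 0;

    assert(c != NULL);
    while (copied < len && c->nr_segs > 0) {
        size_t room = c->iov->iov_len - c->offset;
        size_t want = len - copied;
        size_t chunk = want < room ? want : room;

        memcpy((char *)c->iov->iov_base + c->offset, p + copied, chunk);
        copied += chunk;
        c->offset += chunk;

        /* Segment full (or empty to begin with): move to the next one. */
        if (c->offset == c->iov->iov_len) {
            c->iov++;
            c->nr_segs--;
            c->offset = 0;
        }
    }
    return copied;
}

/* Copies 6 bytes into a 3-byte and a 5-byte segment; returns 1 on success. */
int demo_cursor(void)
{
    char a[3], b[5];
    struct iovec segs[2] = {
        { .iov_base = a, .iov_len = sizeof(a) },
        { .iov_base = b, .iov_len = sizeof(b) },
    };
    struct iov_cursor c = { .iov = segs, .nr_segs = 2, .offset = 0 };
    size_t n = cursor_copy_to(&c, "abcdefgh", 6);

    return n == 6 && memcmp(a, "abc", 3) == 0 && memcmp(b, "def", 3) == 0 &&
           c.nr_segs == 1 && c.offset == 3;
}
```

Because the boundary handling lives inside the helper, a caller can pass a whole page-sized buffer in one call instead of splitting the copy by hand.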

process_vm_access: switch to iov_iter · 9f78bdfa

Committed by Al Viro on Feb 05, 2014

instead of keeping its pieces in separate variables and passing
pointers to all of them...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9f78bdfa

untangling process_vm_..., part 4 · 1291afc1

Committed by Al Viro on Feb 05, 2014

instead of passing vector size (by value) and index (by reference),
pass the number of elements remaining.  That's all we care about
in these functions by that point.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1291afc1

untangling process_vm_..., part 3 · 12e3004e

Committed by Al Viro on Feb 05, 2014

lift iov one more level out - from process_vm_rw_single_vec to
process_vm_rw_core().  Same story as with the previous commit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

12e3004e

untangling process_vm_..., part 2 · c61c7038

Committed by Al Viro on Feb 05, 2014

move iov to caller's stack frame; the value we assign to it on the
next call of process_vm_rw_pages() is equal to the value it had
the last time we were leaving process_vm_rw_pages().

drop lvec argument of process_vm_rw_pages() - it's not used anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c61c7038

untangling process_vm_..., part 1 · 480402e1

Committed by Al Viro on Feb 05, 2014

we want to massage it to use iov_iter.  This one is an equivalent
transformation - just introduce a local variable mirroring
lvec + *lvec_current.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

480402e1

06 Mar, 2014 1 commit

mm/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types · 2f2728f6

Committed by Heiko Carstens on Mar 04, 2014

In order to allow the COMPAT_SYSCALL_DEFINE macro to generate code that
performs proper zero and sign extension, convert all 64 bit parameters
to their corresponding 32 bit compat counterparts.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

2f2728f6

13 Mar, 2013 1 commit

Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys · 8aec0f5d

Committed by Mathieu Desnoyers on Feb 25, 2013

Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to
compat_process_vm_rw() shows that the compatibility code requires an
explicit "access_ok()" check before calling
compat_rw_copy_check_uvector(). The same difference seems to appear when
we compare fs/read_write.c:do_readv_writev() to
fs/compat.c:compat_do_readv_writev().

This subtle difference between the compat and non-compat requirements
should probably be debated, as it seems to be error-prone. In fact,
there are two other sites that use this function in the Linux kernel,
and they both seem to get it wrong:

Now shifting our attention to fs/aio.c, we see that aio_setup_iocb()
also ends up calling compat_rw_copy_check_uvector() through
aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to
be missing. Same situation for
security/keys/compat.c:compat_keyctl_instantiate_key_iov().

I propose that we add the access_ok() check directly into
compat_rw_copy_check_uvector(), so callers don't have to worry about it,
and it therefore makes the compat call code similar to its non-compat
counterpart. Place the access_ok() check in the same location where
copy_from_user() can trigger a -EFAULT error in the non-compat code, so
the ABI behaviors are alike on both compat and non-compat.

While we are here, fix compat_do_readv_writev() so it checks for
compat_rw_copy_check_uvector() negative return values.

And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error
handling.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8aec0f5d

01 Jun, 2012 1 commit

aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() · ac34ebb3

Committed by Christopher Yeoh on May 31, 2012

A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first parameter type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to. This is used
by process_vm_readv/writev where we need to validate that a iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ac34ebb3

03 Feb, 2012 1 commit

Fix race in process_vm_rw_core · 8cdb878d

Committed by Christopher Yeoh on Feb 02, 2012

This fixes the race in process_vm_rw_core found by Oleg (see

  http://article.gmane.org/gmane.linux.kernel/1235667/

for details).

This has been updated since I last sent it as the creation of the new
mm_access() function did almost exactly the same thing as parts of the
previous version of this patch did.

In order to use mm_access() even when /proc isn't enabled, we move it to
kernel/fork.c where other related process mm access functions already
are.
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8cdb878d

01 Nov, 2011 1 commit

Cross Memory Attach · fcf63409

Committed by Christopher Yeoh on Oct 31, 2011

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.
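
A minimal userspace sketch of the read direction described above, assuming Linux 3.2 or later with glibc 2.15 or later (which provides the `process_vm_readv()` wrapper). The helper names `read_remote()` and `demo_self_copy()` are hypothetical, and the demo targets the calling process itself so that no second process is needed:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Copy len bytes starting at remote_addr in process pid into local_buf.
 * Returns the number of bytes copied, or -1 with errno set.
 */
ssize_t read_remote(pid_t pid, void *local_buf, void *remote_addr, size_t len)
{
    struct iovec local = { .iov_base = local_buf, .iov_len = len };
    struct iovec remote = { .iov_base = remote_addr, .iov_len = len };

    assert(local_buf != NULL);
    /* One local and one remote segment; the flags argument must be 0. */
    return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}

/*
 * Reads a buffer out of our own address space through the syscall.
 * Returns 1 if the copy matches, 0 if it does not, and -1 if the
 * syscall itself failed (for example when a sandbox blocks it).
 */
int demo_self_copy(void)
{
    char src[] = "cross memory attach";
    char dst[sizeof(src)] = { 0 };
    ssize_t n = read_remote(getpid(), dst, src, sizeof(src));

    if (n < 0)
        return -1;
    return n == (ssize_t)sizeof(src) && memcmp(src, dst, sizeof(src)) == 0;
}
```

For a real source process, `pid` and `remote_addr` would come from that process (for instance over a socket), and the caller needs ptrace-style permission over it. `process_vm_writev()` takes the same arguments for the opposite direction.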

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
    written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
    ptrace'ing the target and are still able to ptrace the target to read
    from the target. This check could possibly be moved to the open call,
    but it's not clear exactly what race this restriction is stopping
    (reason appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
    domain socket is a bit ugly from a userspace point of view,
    especially when you may have hundreds if not (eventually) thousands
    of processes that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
    consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
    involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since you need the reader and writer working co-operatively if
the pipe is not drained then you block.  Which requires some wrapping to
do non blocking on the send side or polling on the receive.  In all to all
communication it requires ordering otherwise you can deadlock.  And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI), this interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architectures should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <linux-man@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fcf63409

openeuler / Kernel synced successfully 11 months ago