1. 17 Oct 2020, 1 commit
  2. 06 Oct 2020, 1 commit
    • D
      x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() · ec6347bb
      Dan Williams authored
      In reaction to a proposal to introduce a memcpy_mcsafe_fast()
      implementation Linus points out that memcpy_mcsafe() is poorly named
      relative to communicating the scope of the interface. Specifically what
      addresses are valid to pass as source, destination, and what faults /
      exceptions are handled.
      
      Of particular concern is that even though x86 might be able to handle
      the semantics of copy_mc_to_user() with its common copy_user_generic()
      implementation other archs likely need / want an explicit path for this
      case:
      
        On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
        >
        > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
        > >
        > > However now I see that copy_user_generic() works for the wrong reason.
        > > It works because the exception on the source address due to poison
        > > looks no different than a write fault on the user address to the
        > > caller, it's still just a short copy. So it makes copy_to_user() work
        > > for the wrong reason relative to the name.
        >
        > Right.
        >
        > And it won't work that way on other architectures. On x86, we have a
        > generic function that can take faults on either side, and we use it
        > for both cases (and for the "in_user" case too), but that's an
        > artifact of the architecture oddity.
        >
        > In fact, it's probably wrong even on x86 - because it can hide bugs -
        > but writing those things is painful enough that everybody prefers
        > having just one function.
      
      Replace a single top-level memcpy_mcsafe() with either
      copy_mc_to_user(), or copy_mc_to_kernel().
      
      Introduce an x86 copy_mc_fragile() name as the rename for the
      low-level x86 implementation formerly named memcpy_mcsafe(). It is used
      as the slow / careful backend that is supplanted by a fast
      copy_mc_generic() in a follow-on patch.
      
      One side-effect of this reorganization is that separating copy_mc_64.S
      to its own file means that perf no longer needs to track dependencies
      for its memcpy_64.S benchmarks.
      
       [ bp: Massage a bit. ]
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: <stable@vger.kernel.org>
      Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
      Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
      ec6347bb
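The split entry points named in this commit can be sketched in userspace C. This is illustrative only: the real kernel versions recover from machine-check faults via exception-table entries, which cannot be modeled here, and `copy_mc_fragile()` merely stands in for the former `memcpy_mcsafe()` backend. All three return the number of bytes NOT copied, 0 on full success.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model of the renamed entry points (illustrative only).
 * Return value: number of bytes NOT copied; 0 means full success. */
static size_t copy_mc_fragile(void *dst, const void *src, size_t len)
{
        /* stand-in for the former memcpy_mcsafe() slow/careful backend */
        memcpy(dst, src, len);
        return 0;
}

static size_t copy_mc_to_kernel(void *dst, const void *src, size_t len)
{
        /* source may be poisoned memory; destination is a kernel buffer */
        return copy_mc_fragile(dst, src, len);
}

static size_t copy_mc_to_user(void *udst, const void *src, size_t len)
{
        /* a real arch must also handle faults on the user-side destination */
        return copy_mc_fragile(udst, src, len);
}
```

The point of the rename is visible in the signatures: each name states which side may fault, where memcpy_mcsafe() said nothing about either.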
  3. 03 Oct 2020, 2 commits
    • C
      iov_iter: transparently handle compat iovecs in import_iovec · 89cd35c5
      Christoph Hellwig authored
      Use in compat_syscall to import either native or the compat iovecs, and
      remove the now superfluous compat_import_iovec.
      
      This removes the need for special compat logic in most callers, and
      the remaining ones can still be simplified by using __import_iovec
      with a bool compat parameter.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      89cd35c5
    • C
      iov_iter: refactor rw_copy_check_uvector and import_iovec · bfdc5970
      Christoph Hellwig authored
      Split rw_copy_check_uvector into two new helpers with more sensible
      calling conventions:
      
       - iovec_from_user copies an iovec from userspace either into the provided
         stack buffer if it fits, or allocates a new buffer for it.  Returns
         the actually used iovec.  It also verifies that iov_len does fit a
         signed type, and handles compat iovecs if the compat flag is set.
       - __import_iovec consolidates the native and compat versions of
         import_iovec. It calls iovec_from_user, then validates each iovec
         actually points to user addresses, and ensures the total length
         doesn't overflow.
      
      This has two major implications:
      
       - the access_process_vm case loses the total length checking, which
         wasn't required anyway, given that each call receives two iovecs
         for the local and remote side of the operation, and it verifies
         the total length on the local side already.
       - instead of a single loop there now are two loops over the iovecs.
         Given that the iovecs are cache hot this doesn't make a major
         difference.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      bfdc5970
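The iovec_from_user shape described above (stack buffer when it fits, heap allocation otherwise, iov_len checked against a signed type) can be sketched in userspace. Names carry an `_m` suffix and the struct is redefined locally because this is a model of the commit text, not the kernel code; compat handling is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

struct iovec_m { void *iov_base; size_t iov_len; };

/* Userspace sketch: copy the vector into the caller-provided fast array
 * when it fits, otherwise allocate; reject any iov_len that does not fit
 * a signed type.  Returns the iovec actually used, or NULL on error. */
static struct iovec_m *iovec_from_user_model(const struct iovec_m *uvec,
                                             unsigned long nr_segs,
                                             unsigned long fast_segs,
                                             struct iovec_m *fast_iov)
{
        struct iovec_m *iov = fast_iov;
        unsigned long i;

        if (nr_segs > fast_segs) {
                iov = malloc(nr_segs * sizeof(*iov));
                if (!iov)
                        return NULL;
        }
        for (i = 0; i < nr_segs; i++) {
                if ((ssize_t)uvec[i].iov_len < 0) {  /* must fit a signed type */
                        if (iov != fast_iov)
                                free(iov);
                        return NULL;
                }
                iov[i] = uvec[i];
        }
        return iov;
}
```

__import_iovec would then loop again over the returned array to validate addresses and the total length, which is the second loop the commit mentions.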
  4. 25 Sep 2020, 1 commit
  5. 21 Aug 2020, 3 commits
    • A
      saner calling conventions for csum_and_copy_..._user() · c693cc46
      Al Viro authored
      All callers of these primitives will
      	* discard anything we might've copied in case of error
      	* ignore the csum value in case of error
      	* always pass 0xffffffff as the initial sum, so the
      resulting csum value (in case of success, that is) will never be 0.
      
      That suggests the following calling conventions:
      	* don't pass err_ptr - just return 0 on error.
      	* don't bother with zeroing destination, etc. in case of error
      	* don't pass the initial sum - just use 0xffffffff.
      
      This commit does the minimal conversion in the instances of csum_and_copy_...();
      the changes of actual asm code behind them are done later in the series.
      Note that this asm code is often shared with csum_partial_copy_nocheck();
      the difference is that csum_partial_copy_nocheck() passes 0 for initial
      sum while csum_and_copy_..._user() passes 0xffffffff.  Fortunately, we are
      free to pass 0xffffffff in all cases and subsequent patches will use that
      freedom without any special comments.
      
      A part that could be split off: parisc and uml/i386 claimed to have
      csum_and_copy_to_user() instances of their own, but those were identical
      to the generic one, so we simply drop them.  Not sure if it's worth
      a separate commit...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      c693cc46
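The property this series relies on can be demonstrated directly: a folded 16-bit ones'-complement checksum started from 0xffffffff is equal mod 0xffff to one started from 0, and can never come out as 0 on success. The following is a plain userspace model of such a checksum, not the kernel's asm implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator down to 16 bits with end-around carry. */
static uint16_t csum_fold16(uint64_t sum)
{
        while (sum >> 16)
                sum = (sum & 0xffff) + (sum >> 16);
        return (uint16_t)sum;
}

/* Ones'-complement sum of 16-bit big-endian words, from a given initial
 * sum.  Model only; the kernel versions are arch-specific asm. */
static uint16_t csum_model(const uint8_t *p, size_t n, uint32_t initial)
{
        uint64_t sum = initial;
        size_t i;

        for (i = 0; i + 1 < n; i += 2)
                sum += (uint32_t)((p[i] << 8) | p[i + 1]);
        if (n & 1)
                sum += (uint32_t)(p[n - 1] << 8);
        return csum_fold16(sum);
}
```

Since 0xffffffff folds to 0xffff ("negative zero"), adding it changes nothing mod 0xffff, but it forces the folded result to be nonzero, so 0 becomes unambiguously usable as the error return.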
    • A
      csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum · 99a2c96d
      Al Viro authored
      Preparation for the change of calling conventions; right now all
      callers pass 0 as initial sum.  Passing 0xffffffff instead yields
      the values comparable mod 0xffff and guarantees that 0 will not
      be returned on success.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      99a2c96d
    • A
      csum_partial_copy_nocheck(): drop the last argument · cc44c17b
      Al Viro authored
      It's always 0.  Note that we theoretically could use ~0U as well -
      result will be the same modulo 0xffff, _if_ the damn thing did the
      right thing for any value of initial sum; later we'll make use of
      that when convenient.
      
      However, unlike csum_and_copy_..._user(), there are instances that
      did not work for arbitrary initial sums; c6x is one such.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      cc44c17b
  6. 30 Jun 2020, 1 commit
    • H
      iov_iter: Move unnecessary inclusion of crypto/hash.h · 7999096f
      Herbert Xu authored
      The header file linux/uio.h includes crypto/hash.h which pulls in
      most of the Crypto API.  Since linux/uio.h is used throughout the
      kernel this means that every tiny bit of change to the Crypto API
      causes the entire kernel to get rebuilt.
      
      This patch fixes this by moving it into lib/iov_iter.c instead
      where it is actually used.
      
      This patch also fixes the ifdef to use CRYPTO_HASH instead of just
      CRYPTO which does not guarantee the existence of ahash.
      
      Unfortunately a number of drivers were relying on linux/uio.h to
      provide access to linux/slab.h.  This patch adds inclusions of
      linux/slab.h as detected by build failures.
      
      Also skbuff.h was relying on this to provide a declaration for
      ahash_request.  This patch adds a forward declaration instead.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7999096f
  7. 21 Mar 2020, 1 commit
  8. 17 Dec 2019, 1 commit
  9. 16 Nov 2019, 1 commit
    • D
      pipe: Allow pipes to have kernel-reserved slots · 6718b6f8
      David Howells authored
      Split pipe->ring_size into two numbers:
      
       (1) pipe->ring_size - indicates the hard size of the pipe ring.
      
       (2) pipe->max_usage - indicates the maximum number of pipe ring slots that
           userspace orchestrated events can fill.
      
      This allows for a pipe that is both writable by the general kernel
      notification facility and by userspace, allowing plenty of ring space for
      notifications to be added whilst preventing userspace from being able to
      pin too much unswappable kernel space.
      Signed-off-by: David Howells <dhowells@redhat.com>
      6718b6f8
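The ring_size/max_usage split can be modeled with plain unsigned arithmetic: the ring has ring_size slots, but userspace-driven fills are capped at max_usage so kernel notifications keep reserved room. The helper name below mirrors the kernel's pipe_space_for_user(), but the body is a hypothetical userspace sketch.

```c
#include <assert.h>

/* How many slots userspace may still fill, given free-running head/tail
 * counters and a cap (max_usage) below the hard ring size.  Sketch only. */
static unsigned int pipe_space_for_user_model(unsigned int head,
                                              unsigned int tail,
                                              unsigned int max_usage)
{
        unsigned int occupancy = head - tail;   /* wrap-safe unsigned diff */

        return occupancy < max_usage ? max_usage - occupancy : 0;
}
```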
  10. 31 Oct 2019, 1 commit
    • D
      pipe: Use head and tail pointers for the ring, not cursor and length · 8cefc107
      David Howells authored
      Convert pipes to use head and tail pointers for the buffer ring rather than
      pointer and length as the latter requires two atomic ops to update (or a
      combined op) whereas the former only requires one.
      
       (1) The head pointer is the point at which production occurs and points to
           the slot in which the next buffer will be placed.  This is equivalent
           to pipe->curbuf + pipe->nrbufs.
      
           The head pointer belongs to the write-side.
      
       (2) The tail pointer is the point at which consumption occurs.  It points
           to the next slot to be consumed.  This is equivalent to pipe->curbuf.
      
           The tail pointer belongs to the read-side.
      
       (3) head and tail are allowed to run to UINT_MAX and wrap naturally.  They
           are only masked off when the array is being accessed, e.g.:
      
      	pipe->bufs[head & mask]
      
           This means that it is not necessary to have a dead slot in the ring as
           head == tail isn't ambiguous.
      
       (4) The ring is empty if "head == tail".
      
           A helper, pipe_empty(), is provided for this.
      
       (5) The occupancy of the ring is "head - tail".
      
           A helper, pipe_occupancy(), is provided for this.
      
       (6) The number of free slots in the ring is "pipe->ring_size - occupancy".
      
           A helper, pipe_space_for_user() is provided to indicate how many slots
           userspace may use.
      
       (7) The ring is full if "head - tail >= pipe->ring_size".
      
           A helper, pipe_full(), is provided for this.
      Signed-off-by: David Howells <dhowells@redhat.com>
      8cefc107
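The helpers enumerated above can be modeled directly in userspace C (with `_m` suffixes to mark them as sketches). The ring size must be a power of two so that `index & mask` works; head and tail are free-running unsigned counters that wrap naturally at UINT_MAX, which is exactly why no dead slot is needed.

```c
#include <assert.h>
#include <limits.h>

/* (4) the ring is empty iff head == tail */
static int pipe_empty_m(unsigned int head, unsigned int tail)
{
        return head == tail;
}

/* (5) occupancy is head - tail; well-defined even across wrap-around,
 * because unsigned subtraction is modular */
static unsigned int pipe_occupancy_m(unsigned int head, unsigned int tail)
{
        return head - tail;
}

/* (7) the ring is full iff occupancy reaches the slot limit */
static int pipe_full_m(unsigned int head, unsigned int tail,
                       unsigned int limit)
{
        return pipe_occupancy_m(head, tail) >= limit;
}
```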
  11. 23 Oct 2019, 1 commit
    • A
      compat_ioctl: reimplement SG_IO handling · 98aaaec4
      Arnd Bergmann authored
      There are two code locations that implement the SG_IO ioctl: the old
      sg.c driver, and the generic scsi_ioctl helper that is in turn used by
      multiple drivers.
      
      To eradicate the old compat_ioctl conversion handler for the SG_IO
      command, I implement a readable pair of put_sg_io_hdr() / get_sg_io_hdr()
      helper functions that can be used for both compat and native mode,
      and then I call this from both drivers.
      
      For the iovec handling, there is already a compat_import_iovec() function
      that can simply be called in place of import_iovec().
      
      To avoid having to pass the compat/native state through multiple
      indirections, I mark the SG_IO command itself as compatible in
      fs/compat_ioctl.c and use in_compat_syscall() to figure out where
      we are called from.
      
      As a side-effect of this, the sg.c driver now also accepts the 32-bit
      sg_io_hdr format in compat mode using the read/write interface, not
      just ioctl. This should improve compatibility with old 32-bit binaries,
      but it would break if any application intentionally passes the 64-bit
      data structure in compat mode here.
      
      Steffen Maier helped debug an issue in an earlier version of this patch.
      
      Cc: Steffen Maier <maier@linux.ibm.com>
      Cc: linux-scsi@vger.kernel.org
      Cc: Doug Gilbert <dgilbert@interlog.com>
      Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      98aaaec4
  12. 25 Sep 2019, 1 commit
  13. 01 Jun 2019, 1 commit
  14. 21 May 2019, 1 commit
  15. 15 May 2019, 1 commit
    • I
      mm/gup: change GUP fast to use flags rather than a write 'bool' · 73b0140b
      Ira Weiny authored
      To facilitate additional options to get_user_pages_fast() change the
      singular write parameter to be gup_flags.
      
      This patch does not change any functionality.  New functionality will
      follow in subsequent patches.
      
      Some of the get_user_pages_fast() call sites were unchanged because they
      already passed FOLL_WRITE or 0 for the write parameter.
      
      NOTE: It was suggested to change the ordering of the get_user_pages_fast()
      arguments to ensure that callers were converted.  This breaks the current
      GUP call site convention of having the returned pages be the final
      parameter.  So the suggestion was rejected.
      
      Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
      Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Mike Marshall <hubcap@omnibond.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      73b0140b
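The bool-to-flags conversion described above is mechanical and easy to sketch. The flag values below are illustrative stand-ins (hence the `_M` suffix), not the kernel's actual FOLL_* constants; the point is that a flags word, unlike a bool, leaves room for later options to be OR'd in.

```c
#include <assert.h>

#define FOLL_WRITE_M    0x01u
#define FOLL_LONGTERM_M 0x02u   /* example later flag; hypothetical value */

/* What each converted call site effectively does with its old bool. */
static unsigned int gup_flags_from_write(int write)
{
        return write ? FOLL_WRITE_M : 0;
}
```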
  16. 04 Apr 2019, 1 commit
  17. 27 Feb 2019, 1 commit
    • E
      iov_iter: optimize page_copy_sane() · 6daef95b
      Eric Dumazet authored
      Avoid cache line miss dereferencing struct page if we can.
      
      page_copy_sane() mostly deals with order-0 pages.
      
      Extra cache line miss is visible on TCP recvmsg() calls dealing
      with GRO packets (typically 45 page frags are attached to one skb).
      
      Bringing the 45 struct pages into cpu cache while copying the data
      is not free, since the freeing of the skb (and associated
      page frags put_page()) can happen after cache lines have been evicted.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      6daef95b
  18. 04 Jan 2019, 1 commit
    • L
      Remove 'type' argument from access_ok() function · 96d4f267
      Linus Torvalds authored
      Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
      of the user address range verification function since we got rid of the
      old racy i386-only code to walk page tables by hand.
      
      It existed because the original 80386 would not honor the write protect
      bit when in kernel mode, so you had to do COW by hand before doing any
      user access.  But we haven't supported that in a long time, and these
      days the 'type' argument is a purely historical artifact.
      
      A discussion about extending 'user_access_begin()' to do the range
      checking resulted in this patch, because there is no way we're going to
      move the old VERIFY_xyz interface to that model.  And it's best done at
      the end of the merge window when I've done most of my merges, so let's
      just get this done once and for all.
      
      This patch was mostly done with a sed-script, with manual fix-ups for
      the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
      
      There were a couple of notable cases:
      
       - csky still had the old "verify_area()" name as an alias.
      
       - the iter_iov code had magical hardcoded knowledge of the actual
         values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
         really used it)
      
       - microblaze used the type argument for a debug printout
      
      but other than those oddities this should be a total no-op patch.
      
      I tried to fix up all architectures, did fairly extensive grepping for
      access_ok() uses, and the changes are trivial, but I may have missed
      something.  Any missed conversion should be trivially fixable, though.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      96d4f267
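What remains after dropping the type argument is a pure range check. The following userspace model shows the two-argument shape this commit leaves behind; TASK_SIZE_M is a made-up user-space limit and the body is a sketch, not any architecture's actual access_ok().

```c
#include <assert.h>
#include <stdint.h>

#define TASK_SIZE_M 0x0000800000000000ull   /* hypothetical user-space limit */

/* Two-argument access_ok(): does [addr, addr+size) fit below the limit,
 * with no overflow?  No VERIFY_READ/VERIFY_WRITE distinction remains. */
static int access_ok_m(uint64_t addr, uint64_t size)
{
        return size <= TASK_SIZE_M && addr <= TASK_SIZE_M - size;
}
```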
  19. 13 Dec 2018, 2 commits
  20. 28 Nov 2018, 1 commit
  21. 26 Nov 2018, 1 commit
  22. 24 Oct 2018, 3 commits
    • D
      iov_iter: Add I/O discard iterator · 9ea9ce04
      David Howells authored
      Add a new iterator, ITER_DISCARD, that can only be used in READ mode and
      just discards any data copied to it.
      
      This is useful in a network filesystem for discarding any unwanted data
      sent by a server.
      Signed-off-by: David Howells <dhowells@redhat.com>
      9ea9ce04
    • D
      iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells authored
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      than chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  It can also be more efficient: to implement a switch over
      small contiguous integers, the compiler can use roughly 50% fewer compare
      instructions than it needs for a chain of bitwise-AND tests.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
      Signed-off-by: David Howells <dhowells@redhat.com>
      aa563d7b
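The separation described above can be sketched with the direction in the low bit of the type field and the iterator kind in the bits above it, behind accessors so callers can switch on the kind. The `_M` names and values are illustrative, not the kernel's actual constants.

```c
#include <assert.h>

#define ITER_READ_M    0u
#define ITER_WRITE_M   1u
#define ITER_IOVEC_M   4u
#define ITER_KVEC_M    8u
#define ITER_BVEC_M   16u

/* Accessor: the iterator kind, with the direction bit stripped. */
static unsigned int iov_iter_type_m(unsigned int type_field)
{
        return type_field & ~1u;
}

/* Accessor: the direction only (READ or WRITE). */
static unsigned int iov_iter_rw_m(unsigned int type_field)
{
        return type_field & 1u;
}
```

Because iov_iter_type_m() returns one of a few small constants, call sites can use a switch statement instead of a chain of bitwise-AND tests.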
    • D
      iov_iter: Use accessor function · 00e23707
      David Howells authored
      Use accessor functions to access an iterator's type and direction.  This
      allows for the possibility of using some other method of determining the
      type of iterator than if-chains with bitwise-AND conditions.
      Signed-off-by: David Howells <dhowells@redhat.com>
      00e23707
  23. 16 Jul 2018, 3 commits
  24. 15 May 2018, 1 commit
  25. 03 May 2018, 2 commits
  26. 12 Oct 2017, 1 commit
  27. 21 Sep 2017, 1 commit
  28. 07 Jul 2017, 1 commit
    • A
      iov_iter: saner checks on copyin/copyout · 09fc68dc
      Al Viro authored
      * might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic
      codepath also needs might_fault() coverage)
      * we have already done object size checks
      * we have *NOT* done access_ok() recently enough; we rely upon the
      iovec array having passed sanity checks back when it had been created
      and nothing having buggered it since.  However, that's very much
      non-local, so we'd better recheck that.
      
      So the thing we want does not match anything in uaccess - we need
      access_ok + kasan checks + raw copy without any zeroing.  Just define
      such helpers and use them here.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      09fc68dc
  29. 30 Jun 2017, 2 commits
  30. 10 Jun 2017, 1 commit
    • D
      x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations · 0aed55af
      Dan Williams authored
      The pmem driver has a need to transfer data with a persistent memory
      destination and be able to rely on the fact that the destination writes are not
      cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
      (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
      to ensure data-writes have reached a power-fail-safe zone in the platform. The
      fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
      around and fence previous writes with an "sfence".
      
      Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
      memcpy_flushcache, that guarantee that the destination buffer is not dirty in
      the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
      will be used to replace the "pmem api" (include/linux/pmem.h +
      arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
      and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
      config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
      otherwise.
      
      This is meant to satisfy the concern from Linus that if a driver wants to do
      something beyond the normal nocache semantics it should be something private to
      that driver [1], and Al's concern that anything uaccess related belongs with
      the rest of the uaccess code [2].
      
      The first consumer of this interface is a new 'copy_from_iter' dax operation so
      that pmem can inject cache maintenance operations without imposing this
      overhead on other dax-capable drivers.
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      0aed55af
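The config gating described in this commit can be sketched as follows. The macro name and function body are userspace stand-ins: when the architecture does not advertise flushcache-capable uaccess, memcpy_flushcache() degrades to a plain memcpy with no cache-bypass guarantee (the real x86 path uses movnt stores fenced by sfence, which cannot be modeled portably here).

```c
#include <assert.h>
#include <string.h>

/* Stand-in for CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE; left undefined to
 * exercise the fallback path, as on non-x86 architectures. */
#ifndef ARCH_HAS_UACCESS_FLUSHCACHE_M
static void memcpy_flushcache_m(void *dst, const void *src, size_t n)
{
        /* fallback: ordinary copy, destination may remain cache-dirty */
        memcpy(dst, src, n);
}
#endif
```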