1. 10 Nov, 2022 1 commit
  2. 04 Oct, 2022 1 commit
    • instrumented.h: allow instrumenting both sides of copy_from_user() · 33b75c1d
      Alexander Potapenko committed
      Introduce instrument_copy_from_user_before() and
      instrument_copy_from_user_after() hooks to be invoked before and after the
      call to copy_from_user().
      
      KASAN and KCSAN will be only using instrument_copy_from_user_before(), but
      for KMSAN we'll need to insert code after copy_from_user().
      
      Link: https://lkml.kernel.org/r/20220915150417.722975-4-glider@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
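
A minimal user-space sketch of the before/after hook pattern this commit describes. All names here (`hook_before`, `hook_after`, `fake_raw_copy`, `copy_with_hooks`) are hypothetical stand-ins, not the contents of instrumented.h. The assumption illustrated is that a "before" hook only knows the requested length, while an "after" hook can also see how many bytes were left uncopied, which is the kind of information a KMSAN-style tool would plausibly need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping, standing in for real sanitizer state. */
static size_t checked_bytes;	/* bytes a "before" hook validated */
static size_t initialized_bytes;	/* bytes an "after" hook marked as written */

/* Runs before the copy: only the requested length is known. */
static void hook_before(const void *to, size_t n)
{
	(void)to;
	checked_bytes += n;
}

/* Runs after the copy: 'left' is the number of bytes NOT copied. */
static void hook_after(const void *to, size_t n, size_t left)
{
	(void)to;
	assert(left <= n);
	initialized_bytes += n - left;
}

/* Stand-in for the raw copy: copies at most 'limit' bytes and
 * returns the number of bytes it could not copy. */
static size_t fake_raw_copy(void *to, const void *from, size_t n, size_t limit)
{
	size_t done = n < limit ? n : limit;

	memcpy(to, from, done);
	return n - done;
}

/* The copy wrapped by both hooks. */
static size_t copy_with_hooks(void *to, const void *from, size_t n, size_t limit)
{
	size_t left;

	hook_before(to, n);
	left = fake_raw_copy(to, from, n, limit);
	hook_after(to, n, left);
	return left;
}
```

With a simulated short copy, the "before" hook accounts for the full request while the "after" hook accounts only for what actually arrived.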
  3. 09 Aug, 2022 23 commits
    • fix copy_page_from_iter() for compound destinations · c03f05f1
      Al Viro committed
      Had been broken for ITER_BVEC et al. since forever (OK, since v3.17,
      when ITER_BVEC first appeared)...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • copy_page_to_iter(): don't split high-order page in case of ITER_PIPE · f0f6b614
      Al Viro committed
      ... just shove it into one pipe_buffer.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • expand those iov_iter_advance()... · 310d9d5a
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • pipe_get_pages(): switch to append_pipe() · 746de1f8
      Al Viro committed
      now that we are advancing the iterator, there's no need to
      treat the first page separately - just call append_pipe()
      in a loop.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • get rid of non-advancing variants · eba2d3d7
      Al Viro committed
      mechanical change; will be further massaged in subsequent commits
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • iov_iter: saner helper for page array allocation · 3cf42da3
      Al Viro committed
      All call sites of get_pages_array() are essentially identical now.
      Replace with common helper...
      
      Returns number of slots available in resulting array or 0 on OOM;
      it's up to the caller to make sure it doesn't ask for a zero-entry
      array (i.e. neither maxpages nor size are allowed to be zero).
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
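
The contract above (slots available on success, 0 on OOM, caller rules out zero-entry requests) can be sketched in user space as follows. This is an illustration under assumptions, not the kernel helper: `model_pages_array`, `MODEL_PAGE_SIZE`, and the use of `calloc()` are all hypothetical, and the `start` parameter (offset into the first page) is assumed from the way page counts are normally computed.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the model */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

struct page;	/* opaque here; only pointers to it are stored */

/*
 * Hypothetical model of a "make sure we have a page array" helper.
 *   res:      address of the array pointer; *res == NULL means "allocate"
 *   size:     bytes wanted (must be non-zero)
 *   start:    offset of the first byte within the first page
 *   maxpages: cap on the array length (must be non-zero); ~0U means no cap
 * Returns the number of usable slots, or 0 if allocation failed.
 */
static unsigned int model_pages_array(struct page ***res, size_t size,
				      size_t start, unsigned int maxpages)
{
	size_t count = DIV_ROUND_UP(size + start, MODEL_PAGE_SIZE);

	assert(size && maxpages);	/* zero-entry requests are a caller bug */
	if (count > maxpages)
		count = maxpages;
	if (!*res) {			/* no preallocated array was passed in */
		*res = calloc(count, sizeof(struct page *));
		if (!*res)
			return 0;
	}
	return (unsigned int)count;
}
```

A caller with a preallocated array passes its address and real capacity; a caller that wants allocation passes a pointer holding NULL and a large cap, and frees the array itself afterwards.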
    • fold __pipe_get_pages() into pipe_get_pages() · 85200084
      Al Viro committed
      ... and don't mangle maxsize there - turn the loop into counting
      one instead.  Easier to see that we won't run out of array that
      way.  Note that special treatment of the partial buffer in that
      thing is an artifact of the non-advancing semantics of
      iov_iter_get_pages() - if not for that, it would be append_pipe(),
      same as the body of the loop that follows it.  IOW, once we make
      iov_iter_get_pages() advancing, the whole thing will turn into
      	calculate how many pages do we want
      	allocate an array (if needed)
      	call append_pipe() that many times.
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_XARRAY: don't open-code DIV_ROUND_UP() · 0aa4fc32
      Al Viro committed
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts · 451c0ba9
      Al Viro committed
      same as for pipes and xarrays; after that iov_iter_get_pages() becomes
      a wrapper for __iov_iter_get_pages_alloc().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • unify xarray_get_pages() and xarray_get_pages_alloc() · 68fe506f
      Al Viro committed
      same as for pipes
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • unify pipe_get_pages() and pipe_get_pages_alloc() · acbdeb83
      Al Viro committed
      	The differences between those two are
      * pipe_get_pages() gets a non-NULL struct page ** value pointing to
      preallocated array + array size.
      * pipe_get_pages_alloc() gets an address of struct page ** variable that
      contains NULL, allocates the array and (on success) stores its address in
      that variable.
      
      	Not hard to combine - always pass struct page ***, have
      the previous pipe_get_pages_alloc() caller pass ~0U as cap for
      array size.
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • iov_iter_get_pages(): sanity-check arguments · c81ce28d
      Al Viro committed
      zero maxpages is bogus, but best treated as "just return 0";
      NULL pages, OTOH, should be treated as a hard bug.
      
      get rid of now completely useless checks in xarray_get_pages{,_alloc}().
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper · 91329559
      Al Viro committed
      Incidentally, ITER_XARRAY did *not* free the sucker in case when
      iter_xarray_populate_pages() returned 0...
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: fold data_start() and pipe_space_for_user() together · 12d426ab
      Al Viro committed
      All their callers are next to each other; all of them
      want the total amount of pages and, possibly, the
      offset in the partial final buffer.
      
      Combine into a new helper (pipe_npages()), fix the
      bogosity in pipe_space_for_user(), while we are at it.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: cache the type of last buffer · 10f525a8
      Al Viro committed
      We often need to find whether the last buffer is anon or not, and
      currently it's rather clumsy:
      	check if ->iov_offset is non-zero (i.e. that pipe is not empty)
      	if so, get the corresponding pipe_buffer and check its ->ops
      	if it's &default_pipe_buf_ops, we have an anon buffer.
      
      Let's replace the use of ->iov_offset (which is nowhere near similar to
      its role for other flavours) with signed field (->last_offset), with
      the following rules:
      	empty, no buffers occupied:		0
      	anon, with bytes up to N-1 filled:	N
      	zero-copy, with bytes up to N-1 filled:	-N
      
      That way abs(i->last_offset) is equal to what used to be in i->iov_offset
      and empty vs. anon vs. zero-copy can be distinguished by the sign of
      i->last_offset.
      
      	Checks for "should we extend the last buffer or should we start
      a new one?" become easier to follow that way.
      
      	Note that most of the operations can only be done in a sane
      state - i.e. when the pipe has nothing past the current position of
      iterator.  About the only thing that could be done outside of that
      state is iov_iter_advance(), which transitions to the sane state by
      truncating the pipe.  There are only two cases where we leave the
      sane state:
      	1) iov_iter_get_pages()/iov_iter_get_pages_alloc().  Will be
      dealt with later, when we make get_pages advancing - the callers are
      actually happier that way.
      	2) iov_iter copied, then something is put into the copy.  Since
      they share the underlying pipe, the original gets behind.  When we
      decide that we are done with the copy (original is not usable until then)
      we advance the original.  direct_io used to be done that way; nowadays
      it operates on the original and we do iov_iter_revert() to discard
      the excessive data.  At the moment there's nothing in the kernel that
      could do that to ITER_PIPE iterators, so this reason for insane state
      is theoretical right now.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
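
The sign convention for ->last_offset described above can be sketched as a few small helpers. The enum and function names are hypothetical, and the "extend only an anon buffer that still has room" rule is an assumption drawn from the commit text rather than a copy of the kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names for the three states encoded in the sign. */
enum last_buf_kind { LAST_EMPTY, LAST_ANON, LAST_ZERO_COPY };

/* 0: empty; N > 0: anon buffer filled up to N-1; -N: zero-copy, same fill. */
static enum last_buf_kind classify_last(int last_offset)
{
	if (last_offset == 0)
		return LAST_EMPTY;
	return last_offset > 0 ? LAST_ANON : LAST_ZERO_COPY;
}

/* abs(last_offset) is what used to be kept in ->iov_offset. */
static unsigned int filled_bytes(int last_offset)
{
	return (unsigned int)abs(last_offset);
}

/* Assumed rule: only an anon buffer with free space can be extended;
 * an empty pipe, a full buffer, or a zero-copy buffer needs a new one. */
static int can_extend_last(int last_offset, unsigned int page_size)
{
	assert(page_size > 0);
	return last_offset > 0 && (unsigned int)last_offset < page_size;
}
```

A single signed comparison replaces the older three-step check (non-zero offset, fetch the buffer, compare its ->ops).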
    • ITER_PIPE: clean iov_iter_revert() · 92acdc4f
      Al Viro committed
      Fold pipe_truncate() into it, clean up.  We can release buffers
      in the same loop where we walk backwards to the iterator beginning
      looking for the place where the new position will be.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: clean pipe_advance() up · 2c855de9
      Al Viro committed
      instead of setting ->iov_offset for new position and calling
      pipe_truncate() to adjust ->len of the last buffer and discard
      everything after it, adjust ->len at the same time we set ->iov_offset
      and use pipe_discard_from() to deal with buffers past that.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: lose iter_head argument of __pipe_get_pages() · ca591967
      Al Viro committed
      it's only used to get to the partial buffer we can add to,
      and that's always the last one, i.e. pipe->head - 1.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: fold push_pipe() into __pipe_get_pages() · e3b42964
      Al Viro committed
      	Expand the only remaining call of push_pipe() (in
      __pipe_get_pages()), combine it with the page-collecting loop there.
      
      Note that the only reason it's not a loop doing append_pipe() is
      that append_pipe() is advancing, while iov_iter_get_pages() is not.
      As soon as it switches to saner semantics, this thing will switch
      to using append_pipe().
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: allocate buffers as we go in copy-to-pipe primitives · 8fad7767
      Al Viro committed
      New helper: append_pipe().  Extends the last buffer if possible,
      allocates a new one otherwise.  Returns page and offset in it
      on success, NULL on failure.  iov_iter is advanced past the
      data we've got.
      
      Use that instead of push_pipe() in copy-to-pipe primitives;
      they get simpler that way.  Handling of short copy (in "mc" one)
      is done simply by iov_iter_revert() - iov_iter is in consistent
      state after that one, so we can use that.
      
      [Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in]
      [another braino fix, this time in copy_pipe_to_iter() and pipe_zero();
      caught by testcase from Hugh Dickins]
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
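
A toy model of the "extend the last buffer if possible, allocate a new one otherwise" behaviour described above, with a copy loop built on top of it. Everything here is hypothetical (`model_pipe`, `model_append`, `model_copy_to_pipe`, the fixed four-buffer capacity); the model has only anonymous buffers and treats "pipe full" as the failure case, so it is a sketch of the control flow, not of the kernel data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUF_SIZE 4096u	/* stands in for PAGE_SIZE */
#define MODEL_MAX_BUFS 4u	/* tiny fixed capacity for the model */

struct model_pipe {
	unsigned int nbufs;			/* buffers in use */
	unsigned int len[MODEL_MAX_BUFS];	/* bytes filled in each buffer */
	unsigned char data[MODEL_MAX_BUFS][MODEL_BUF_SIZE];
};

/*
 * Reserve up to *size bytes at the end of the pipe. Extends the last
 * buffer if it has room, otherwise starts a new one. On success returns
 * the buffer, stores the write offset in *off, trims *size to what fits,
 * and advances past the reserved space. Returns NULL when the pipe is full.
 */
static unsigned char *model_append(struct model_pipe *p, size_t *size,
				   unsigned int *off)
{
	unsigned int last;

	assert(*size > 0);
	if (p->nbufs && p->len[p->nbufs - 1] < MODEL_BUF_SIZE) {
		last = p->nbufs - 1;		/* room left: extend */
	} else {
		if (p->nbufs == MODEL_MAX_BUFS)
			return NULL;		/* nothing left to allocate */
		last = p->nbufs++;		/* start a new, empty buffer */
		p->len[last] = 0;
	}
	*off = p->len[last];
	if (*size > MODEL_BUF_SIZE - *off)
		*size = MODEL_BUF_SIZE - *off;
	p->len[last] += (unsigned int)*size;
	return p->data[last];
}

/* Copy n bytes into the pipe, one append at a time; returns bytes copied. */
static size_t model_copy_to_pipe(struct model_pipe *p, const void *src, size_t n)
{
	const unsigned char *s = src;
	size_t done = 0;

	while (done < n) {
		size_t chunk = n - done;
		unsigned int off;
		unsigned char *buf = model_append(p, &chunk, &off);

		if (!buf)
			break;
		memcpy(buf + off, s + done, chunk);
		done += chunk;
	}
	return done;
}
```

Because each append both reserves space and advances, the copy loop needs no separate "push buffers first, then fill them" phase.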
    • ITER_PIPE: helpers for adding pipe buffers · 47b7fcae
      Al Viro committed
      There are only two kinds of pipe_buffer in the area used by ITER_PIPE.
      
      1) anonymous - copy_to_iter() et al. end up creating those and copying
      data there.  They have zero ->offset, and their ->ops points to
      default_pipe_buf_ops.
      
      2) zero-copy ones - those come from copy_page_to_iter(), and page
      comes from caller.  ->offset is also caller-supplied - it might be
      non-zero.  ->ops points to page_cache_pipe_buf_ops.
      
      Move creation and insertion of those into helpers - push_anon(pipe, size)
      and push_page(pipe, page, offset, size) resp., separating them from
      the "could we avoid creating a new buffer by merging with the current
      head?" logic.
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ITER_PIPE: helper for getting pipe buffer by index · 2dcedb2a
      Al Viro committed
      pipe_buffer instances of a pipe are organized as a ring buffer,
      with power-of-2 size.  Indices are kept *not* reduced modulo ring
      size, so the buffer referred to by index N is
      	pipe->bufs[N & (pipe->ring_size - 1)].
      
      Ring size can change over the lifetime of a pipe, but not while
      the pipe is locked.  So for any iov_iter primitives it's a constant.
      Original conversion of pipes to this layout went overboard trying
      to microoptimize that - calculating pipe->ring_size - 1, storing
      it in a local variable and using it throughout the function.  In
      some cases it might be warranted, but most of the time it only
      obfuscates what's going on in there.
      
      Introduce a helper (pipe_buf(pipe, N)) that would encapsulate
      that and use it in the obvious cases.  More will follow...
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
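
The indexing rule above (indices kept unreduced, masked only on access) can be sketched with a small stand-alone ring. The `struct ring`, `ring_buf()` and `ring_occupancy()` names are hypothetical; only the masking expression comes from the commit message.

```c
#include <assert.h>

struct buf {
	unsigned int len;
};

/* Hypothetical miniature of a pipe's buffer ring. */
struct ring {
	unsigned int head;	/* next slot to fill, NOT reduced modulo size */
	unsigned int tail;	/* oldest occupied slot, NOT reduced either */
	unsigned int ring_size;	/* always a power of two */
	struct buf *bufs;
};

/* Map an unreduced slot number to its buffer by masking. */
static struct buf *ring_buf(const struct ring *r, unsigned int slot)
{
	/* the mask trick is only valid for power-of-two sizes */
	assert(r->ring_size && (r->ring_size & (r->ring_size - 1)) == 0);
	return &r->bufs[slot & (r->ring_size - 1)];
}

/* Unsigned subtraction stays correct even after head wraps past UINT_MAX. */
static unsigned int ring_occupancy(const struct ring *r)
{
	return r->head - r->tail;
}
```

Keeping the mask inside one accessor means callers can write `ring_buf(r, r->head - 1)` for the last buffer without carrying a precomputed mask variable around.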
    • new iov_iter flavour - ITER_UBUF · fcb14cb1
      Al Viro committed
      Equivalent of single-segment iovec.  Initialized by iov_iter_ubuf(),
      checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
      ones.
      
      We are going to expose the things like ->write_iter() et al. to those
      in subsequent commits.
      
      New predicate (user_backed_iter()) that is true for ITER_IOVEC and
      ITER_UBUF; places like direct-IO handling should use that for
      checking that pages we modify after getting them from iov_iter_get_pages()
      would need to be dirtied.
      
      DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
      will solve all problems - there's code that uses iter_is_iovec() to
      decide how to poke around in iov_iter guts and for that the predicate
      replacement obviously won't suffice.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
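
The warning above, that swapping one predicate for another is not enough when code pokes at iterator internals, can be illustrated with a reduced model. The types and functions below (`model_iter`, `model_user_backed()`, `model_first_seg_len()`) are hypothetical, and the enum lists only a few flavours; how the kernel actually implements its predicates is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical set of iterator flavours. */
enum model_iter_type { M_IOVEC, M_KVEC, M_BVEC, M_UBUF };

struct model_iovec {
	void *base;
	size_t len;
};

struct model_iter {
	enum model_iter_type type;
	size_t count;				/* bytes remaining */
	const struct model_iovec *iov;		/* meaningful only for M_IOVEC */
};

static bool model_is_iovec(const struct model_iter *i)
{
	return i->type == M_IOVEC;
}

static bool model_is_ubuf(const struct model_iter *i)
{
	return i->type == M_UBUF;
}

/* True for both userland-backed flavours. */
static bool model_user_backed(const struct model_iter *i)
{
	return model_is_iovec(i) || model_is_ubuf(i);
}

/*
 * Code that looks inside the iterator must still branch on the flavour:
 * a ubuf iterator has no segment array to dereference.
 */
static size_t model_first_seg_len(const struct model_iter *i)
{
	assert(model_user_backed(i));
	if (model_is_ubuf(i))
		return i->count;	/* the single buffer is the only segment */
	return i->iov[0].len;
}
```

A "does this memory belong to userland?" question is answered by the broad predicate, while any walk over segments needs the narrow ones.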
  4. 07 Jul, 2022 6 commits
  5. 29 Jun, 2022 2 commits
  6. 27 Jun, 2022 1 commit
  7. 12 Jun, 2022 1 commit
    • iov_iter: fix build issue due to possible type mis-match · 1c27f1fc
      Linus Torvalds committed
      Commit 6c776766 ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()")
      introduced a problem on some 32-bit architectures (at least arm, xtensa,
      csky, sparc and mips), that have a 'size_t' that is 'unsigned int'.
      
      The reason is that we now do
      
          min(nr * PAGE_SIZE - offset, maxsize);
      
      where 'nr' and 'offset' are both 'unsigned int', and PAGE_SIZE is
      'unsigned long'.  As a result, the normal C type rules mean that the
      first argument to 'min()' ends up being 'unsigned long'.
      
      In contrast, 'maxsize' is of type 'size_t'.
      
      Now, 'size_t' and 'unsigned long' are always the same physical type in
      the kernel, so you'd think this doesn't matter, and from an actual
      arithmetic standpoint it doesn't.
      
      But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if
      it could also be 'unsigned long'.  In that situation, both are unsigned
      32-bit types, but they are not the *same* type.
      
      And as a result 'min()' will complain about the distinct types (ignore
      the "pointer types" part of the error message: that's an artifact of the
      way we have made 'min()' check types for being the same):
      
        lib/iov_iter.c: In function 'iter_xarray_get_pages':
        include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror]
           20 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
              |                                   ^~
        lib/iov_iter.c:1464:16: note: in expansion of macro 'min'
         1464 |         return min(nr * PAGE_SIZE - offset, maxsize);
              |                ^~~
      
      This was not visible on 64-bit architectures (where we always define
      'size_t' to be 'unsigned long').
      
      Force these cases to use 'min_t(size_t, x, y)' to make the type explicit
      and avoid the issue.
      
      [ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned
        long' arithmetically. We've certainly historically seen environments
        with 16-bit address spaces and 32-bit 'unsigned long'.
      
        Similarly, even in 64-bit modern environments, 'size_t' could be its
        own type distinct from 'unsigned long', even if it were arithmetically
        identical.
      
        So the above type commentary is only really descriptive of the kernel
        environment, not some kind of universal truth for the kinds of wild
        and crazy situations that are allowed by the C standard ]
      Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
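
The type argument in this commit can be checked in user space with C11 `_Generic`. The sketch below is an illustration only: `MODEL_PAGE_SIZE`, `IS_UNSIGNED_LONG` and `model_min_t` are hypothetical, and `model_min_t` is a deliberately simplified stand-in that evaluates its arguments more than once, unlike the kernel macro it imitates.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* 'unsigned long', as in the commit text */

/* 1 if the expression's type is exactly 'unsigned long', else 0. */
#define IS_UNSIGNED_LONG(expr) _Generic((expr), unsigned long: 1, default: 0)

/* Simplified min_t(): cast both sides to one named type, then compare.
 * Evaluates x and y more than once, so pass side-effect-free operands. */
#define model_min_t(type, x, y) \
	((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* 'unsigned int' * 'unsigned long' - 'unsigned int' is 'unsigned long'. */
static int first_operand_is_unsigned_long(unsigned int nr, unsigned int offset)
{
	return IS_UNSIGNED_LONG(nr * MODEL_PAGE_SIZE - offset);
}

/* Platform-dependent: 1 where size_t is 'unsigned long', 0 where it is
 * 'unsigned int' (the 32-bit configurations named above). */
static int size_t_is_unsigned_long(void)
{
	return IS_UNSIGNED_LONG((size_t)0);
}

/* With an explicit type both operands agree, whatever size_t is. */
static size_t clamp_to_maxsize(unsigned int nr, unsigned int offset,
			       size_t maxsize)
{
	assert(nr > 0);
	return model_min_t(size_t, nr * MODEL_PAGE_SIZE - offset, maxsize);
}
```

On a platform where `size_t_is_unsigned_long()` returns 0, a strictly type-checked `min()` sees two distinct types of the same width, which is the mismatch the explicit-type form sidesteps.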
  8. 11 Jun, 2022 1 commit
    • iov_iter: Fix iter_xarray_get_pages{,_alloc}() · 6c776766
      David Howells committed
      The maths at the end of iter_xarray_get_pages() to calculate the actual
      size doesn't work under some circumstances, such as when it's been asked to
      extract a partial single page.  Various terms of the equation cancel out
      and you end up with actual == offset.  The same issue exists in
      iter_xarray_get_pages_alloc().
      
      Fix these to just use min() to select the lesser amount from between the
      amount of page content transcribed into the buffer, minus the offset, and
      the size limit specified.
      
      This doesn't appear to have caused a problem yet upstream because network
      filesystems aren't getting the pages from an xarray iterator, but rather
      passing it directly to the socket, which just iterates over it.  Cachefiles
      *does* do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for
      whole pages to be written or read.
      
      Fixes: 7ff50620 ("iov_iter: Add ITER_XARRAY")
      Reported-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Alexander Viro <viro@zeniv.linux.org.uk>
      cc: Dominique Martinet <asmadeus@codewreck.org>
      cc: Mike Marshall <hubcap@omnibond.com>
      cc: Gao Xiang <xiang@kernel.org>
      cc: linux-afs@lists.infradead.org
      cc: v9fs-developer@lists.sourceforge.net
      cc: devel@lists.orangefs.org
      cc: linux-erofs@lists.ozlabs.org
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
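
A worked sketch of the replacement calculation described above: the lesser of (page content transcribed minus the starting offset) and the caller's size limit. `model_xarray_actual()` and `MODEL_PAGE_SIZE` are hypothetical names, and the 4096-byte page size is an assumption for the numbers below.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the example */

/*
 * nr:      pages transcribed into the array
 * offset:  offset of the first wanted byte within the first page
 * maxsize: size limit supplied by the caller
 */
static size_t model_xarray_actual(unsigned int nr, size_t offset, size_t maxsize)
{
	size_t avail;

	assert(nr > 0 && offset < MODEL_PAGE_SIZE);
	avail = (size_t)nr * MODEL_PAGE_SIZE - offset;
	return avail < maxsize ? avail : maxsize;
}
```

For the partial-single-page case named in the commit, one page starting at offset 100 with a limit of 50 bytes yields 50, independent of the offset; with a generous limit the same page yields 4096 - 100 = 3996.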
  9. 21 Feb, 2022 1 commit
  10. 05 Jan, 2022 1 commit
  11. 24 Oct, 2021 1 commit
  12. 21 Oct, 2021 1 commit