- 10 11月, 2022 1 次提交
-
-
由 Logan Gunthorpe 提交于
Add iov_iter_get_pages_flags() and iov_iter_get_pages_alloc_flags() which take a flags argument that is passed to get_user_pages_fast(). This is so that FOLL_PCI_P2PDMA can be passed when appropriate. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221021174116.7200-4-logang@deltatee.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 04 10月, 2022 1 次提交
-
-
由 Alexander Potapenko 提交于
Introduce instrument_copy_from_user_before() and instrument_copy_from_user_after() hooks to be invoked before and after the call to copy_from_user(). KASAN and KCSAN will be only using instrument_copy_from_user_before(), but for KMSAN we'll need to insert code after copy_from_user(). Link: https://lkml.kernel.org/r/20220915150417.722975-4-glider@google.comSigned-off-by: NAlexander Potapenko <glider@google.com> Reviewed-by: NMarco Elver <elver@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 09 8月, 2022 23 次提交
-
-
由 Al Viro 提交于
had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when ITER_BVEC had first appeared)... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... just shove it into one pipe_buffer. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
now that we are advancing the iterator, there's no need to treat the first page separately - just call append_pipe() in a loop. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
mechanical change; will be further massaged in subsequent commits Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
All call sites of get_pages_array() are essenitally identical now. Replace with common helper... Returns number of slots available in resulting array or 0 on OOM; it's up to the caller to make sure it doesn't ask to zero-entry array (i.e. neither maxpages nor size are allowed to be zero). Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... and don't mangle maxsize there - turn the loop into counting one instead. Easier to see that we won't run out of array that way. Note that special treatment of the partial buffer in that thing is an artifact of the non-advancing semantics of iov_iter_get_pages() - if not for that, it would be append_pipe(), same as the body of the loop that follows it. IOW, once we make iov_iter_get_pages() advancing, the whole thing will turn into calculate how many pages do we want allocate an array (if needed) call append_pipe() that many times. Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
same as for pipes and xarrays; after that iov_iter_get_pages() becomes a wrapper for __iov_iter_get_pages_alloc(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
same as for pipes Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
The differences between those two are * pipe_get_pages() gets a non-NULL struct page ** value pointing to preallocated array + array size. * pipe_get_pages_alloc() gets an address of struct page ** variable that contains NULL, allocates the array and (on success) stores its address in that variable. Not hard to combine - always pass struct page ***, have the previous pipe_get_pages_alloc() caller pass ~0U as cap for array size. Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
zero maxpages is bogus, but best treated as "just return 0"; NULL pages, OTOH, should be treated as a hard bug. get rid of now completely useless checks in xarray_get_pages{,_alloc}(). Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Incidentally, ITER_XARRAY did *not* free the sucker in case when iter_xarray_populate_pages() returned 0... Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
All their callers are next to each other; all of them want the total amount of pages and, possibly, the offset in the partial final buffer. Combine into a new helper (pipe_npages()), fix the bogosity in pipe_space_for_user(), while we are at it. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
We often need to find whether the last buffer is anon or not, and currently it's rather clumsy: check if ->iov_offset is non-zero (i.e. that pipe is not empty) if so, get the corresponding pipe_buffer and check its ->ops if it's &default_pipe_buf_ops, we have an anon buffer. Let's replace the use of ->iov_offset (which is nowhere near similar to its role for other flavours) with signed field (->last_offset), with the following rules: empty, no buffers occupied: 0 anon, with bytes up to N-1 filled: N zero-copy, with bytes up to N-1 filled: -N That way abs(i->last_offset) is equal to what used to be in i->iov_offset and empty vs. anon vs. zero-copy can be distinguished by the sign of i->last_offset. Checks for "should we extend the last buffer or should we start a new one?" become easier to follow that way. Note that most of the operations can only be done in a sane state - i.e. when the pipe has nothing past the current position of iterator. About the only thing that could be done outside of that state is iov_iter_advance(), which transitions to the sane state by truncating the pipe. There are only two cases where we leave the sane state: 1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be dealt with later, when we make get_pages advancing - the callers are actually happier that way. 2) iov_iter copied, then something is put into the copy. Since they share the underlying pipe, the original gets behind. When we decide that we are done with the copy (original is not usable until then) we advance the original. direct_io used to be done that way; nowadays it operates on the original and we do iov_iter_revert() to discard the excessive data. At the moment there's nothing in the kernel that could do that to ITER_PIPE iterators, so this reason for insane state is theoretical right now. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Fold pipe_truncate() into it, clean up. We can release buffers in the same loop where we walk backwards to the iterator beginning looking for the place where the new position will be. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
instead of setting ->iov_offset for new position and calling pipe_truncate() to adjust ->len of the last buffer and discard everything after it, adjust ->len at the same time we set ->iov_offset and use pipe_discard_from() to deal with buffers past that. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
it's only used to get to the partial buffer we can add to, and that's always the last one, i.e. pipe->head - 1. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Expand the only remaining call of push_pipe() (in __pipe_get_pages()), combine it with the page-collecting loop there. Note that the only reason it's not a loop doing append_pipe() is that append_pipe() is advancing, while iov_iter_get_pages() is not. As soon as it switches to saner semantics, this thing will switch to using append_pipe(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
New helper: append_pipe(). Extends the last buffer if possible, allocates a new one otherwise. Returns page and offset in it on success, NULL on failure. iov_iter is advanced past the data we've got. Use that instead of push_pipe() in copy-to-pipe primitives; they get simpler that way. Handling of short copy (in "mc" one) is done simply by iov_iter_revert() - iov_iter is in consistent state after that one, so we can use that. [Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in] [another braino fix, this time in copy_pipe_to_iter() and pipe_zero(); caught by testcase from Hugh Dickins] Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
There are only two kinds of pipe_buffer in the area used by ITER_PIPE. 1) anonymous - copy_to_iter() et.al. end up creating those and copying data there. They have zero ->offset, and their ->ops points to default_pipe_page_ops. 2) zero-copy ones - those come from copy_page_to_iter(), and page comes from caller. ->offset is also caller-supplied - it might be non-zero. ->ops points to page_cache_pipe_buf_ops. Move creation and insertion of those into helpers - push_anon(pipe, size) and push_page(pipe, page, offset, size) resp., separating them from the "could we avoid creating a new buffer by merging with the current head?" logics. Acked-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
pipe_buffer instances of a pipe are organized as a ring buffer, with power-of-2 size. Indices are kept *not* reduced modulo ring size, so the buffer refered to by index N is pipe->bufs[N & (pipe->ring_size - 1)]. Ring size can change over the lifetime of a pipe, but not while the pipe is locked. So for any iov_iter primitives it's a constant. Original conversion of pipes to this layout went overboard trying to microoptimize that - calculating pipe->ring_size - 1, storing it in a local variable and using through the function. In some cases it might be warranted, but most of the times it only obfuscates what's going on in there. Introduce a helper (pipe_buf(pipe, N)) that would encapsulate that and use it in the obvious cases. More will follow... Reviewed-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(), checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC ones. We are going to expose the things like ->write_iter() et.al. to those in subsequent commits. New predicate (user_backed_iter()) that is true for ITER_IOVEC and ITER_UBUF; places like direct-IO handling should use that for checking that pages we modify after getting them from iov_iter_get_pages() would need to be dirtied. DO NOT assume that replacing iter_is_iovec() with user_backed_iter() will solve all problems - there's code that uses iter_is_iovec() to decide how to poke around in iov_iter guts and for that the predicate replacement obviously won't suffice. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 07 7月, 2022 6 次提交
-
-
由 Al Viro 提交于
... and calculate the offset in the caller Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Pass maxsize by reference, return length via the same. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
We return length + offset in page via *size. Don't bother - the caller can do that arithmetics just as well; just report the length to it. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
caller can do that just as easily Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
All callers can and should handle iov_iter_get_pages() returning fewer pages than requested. All in-kernel ones do. And it makes the arithmetical overflow analysis much simpler... Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
do what we do for iovec/kvec; that ends up generating better code, AFAICS. Reviewed-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 29 6月, 2022 2 次提交
-
-
由 Al Viro 提交于
Unlike other copying operations on ITER_PIPE, copy_mc_to_iter() can result in a short copy. In that case we need to trim the unused buffers, as well as the length of partially filled one - it's not enough to set ->head, ->iov_offset and ->count to reflect how much had we copied. Not hard to fix, fortunately... I'd put a helper (pipe_discard_from(pipe, head)) into pipe_fs_i.h, rather than iov_iter.c - it has nothing to do with iov_iter and having it will allow us to avoid an ugly kludge in fs/splice.c. We could put it into lib/iov_iter.c for now and move it later, but I don't see the point going that way... Cc: stable@kernel.org # 4.19+ Fixes: ca146f6f "lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()" Reviewed-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
we can do copyin/copyout under kmap_local_page(); it shouldn't overflow the kmap stack - the maximal footprint increase only by one here. Reviewed-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 27 6月, 2022 1 次提交
-
-
由 Keith Busch 提交于
The existing iov_iter_alignment() function returns the logical OR of address and length. For cases where address and length need to be considered separately, introduce a helper function that a caller can specificy length and address masks that indicate if the iov is unaligned. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220610195830.3574005-9-kbusch@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 6月, 2022 1 次提交
-
-
由 Linus Torvalds 提交于
Commit 6c776766 ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()") introduced a problem on some 32-bit architectures (at least arm, xtensa, csky,sparc and mips), that have a 'size_t' that is 'unsigned int'. The reason is that we now do min(nr * PAGE_SIZE - offset, maxsize); where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is 'unsigned long'. As a result, the normal C type rules means that the first argument to 'min()' ends up being 'unsigned long'. In contrast, 'maxsize' is of type 'size_t'. Now, 'size_t' and 'unsigned long' are always the same physical type in the kernel, so you'd think this doesn't matter, and from an actual arithmetic standpoint it doesn't. But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if it could also be 'unsigned long'. In that situation, both are unsigned 32-bit types, but they are not the *same* type. And as a result 'min()' will complain about the distinct types (ignore the "pointer types" part of the error message: that's an artifact of the way we have made 'min()' check types for being the same): lib/iov_iter.c: In function 'iter_xarray_get_pages': include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror] 20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) | ^~ lib/iov_iter.c:1464:16: note: in expansion of macro 'min' 1464 | return min(nr * PAGE_SIZE - offset, maxsize); | ^~~ This was not visible on 64-bit architectures (where we always define 'size_t' to be 'unsigned long'). Force these cases to use 'min_t(size_t, x, y)' to make the type explicit and avoid the issue. [ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned long' arithmetically. We've certainly historically seen environments with 16-bit address spaces and 32-bit 'unsigned long'. Similarly, even in 64-bit modern environments, 'size_t' could be its own type distinct from 'unsigned long', even if it were arithmetically identical. So the above type commentary is only really descriptive of the kernel environment, not some kind of universal truth for the kinds of wild and crazy situations that are allowed by the C standard ] Reported-by: NSudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/ Cc: Jeff Layton <jlayton@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 6月, 2022 1 次提交
-
-
由 David Howells 提交于
The maths at the end of iter_xarray_get_pages() to calculate the actual size doesn't work under some circumstances, such as when it's been asked to extract a partial single page. Various terms of the equation cancel out and you end up with actual == offset. The same issue exists in iter_xarray_get_pages_alloc(). Fix these to just use min() to select the lesser amount from between the amount of page content transcribed into the buffer, minus the offset, and the size limit specified. This doesn't appear to have caused a problem yet upstream because network filesystems aren't getting the pages from an xarray iterator, but rather passing it directly to the socket, which just iterates over it. Cachefiles *does* do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for whole pages to be written or read. Fixes: 7ff50620 ("iov_iter: Add ITER_XARRAY") Reported-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Mike Marshall <hubcap@omnibond.com> cc: Gao Xiang <xiang@kernel.org> cc: linux-afs@lists.infradead.org cc: v9fs-developer@lists.sourceforge.net cc: devel@lists.orangefs.org cc: linux-erofs@lists.ozlabs.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 21 2月, 2022 1 次提交
-
-
由 Max Kellermann 提交于
The functions copy_page_to_iter_pipe() and push_pipe() can both allocate a new pipe_buffer, but the "flags" member initializer is missing. Fixes: 241699cd ("new iov_iter flavour: pipe-backed") To: Alexander Viro <viro@zeniv.linux.org.uk> To: linux-fsdevel@vger.kernel.org To: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: NMax Kellermann <max.kellermann@ionos.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 1月, 2022 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
Take advantage of how kmap_local_folio() works to simplify the loop. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com>
-
- 24 10月, 2021 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Introduce a new nofault flag to indicate to iov_iter_get_pages not to fault in user pages. This is implemented by passing the FOLL_NOFAULT flag to get_user_pages, which causes get_user_pages to fail when it would otherwise fault in a page. We'll use the ->nofault flag to prevent iomap_dio_rw from faulting in pages when page faults are not allowed. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
-
- 21 10月, 2021 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Introduce a new fault_in_iov_iter_writeable helper for safely faulting in an iterator for writing. Uses get_user_pages() to fault in the pages without actually writing to them, which would be destructive. We'll use fault_in_iov_iter_writeable in gfs2 once we've determined that the iterator passed to .read_iter isn't in memory. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
-