Commits · bfc4ee39fdbb2deb8864785d5e5bc5cdd3b31a69 · openeuler / raspberrypi-kernel

04 May, 2006 1 commit

[PATCH] splice: fix unlocking of page on error ->prepare_write() · bfc4ee39

Authored by Jens Axboe on May 03, 2006

Looking at generic_file_buffered_write(), we need to unlock_page() if
->prepare_write() fails and it isn't due to racing with truncate().

Also trim the size if ->prepare_write() fails, if we have to.
Signed-off-by: Jens Axboe <axboe@suse.de>

bfc4ee39

02 May, 2006 8 commits

[PATCH] vmsplice: restrict stealing a little more · 330ab716

Authored by Jens Axboe on May 02, 2006

Apply the same rules as the anon pipe pages, only allow stealing
if no one else is using the page.
Signed-off-by: Jens Axboe <axboe@suse.de>

330ab716

[PATCH] splice: fix page LRU accounting · a893b99b

Authored by Jens Axboe on May 02, 2006

Currently we rely on the PIPE_BUF_FLAG_LRU flag being set correctly
to know whether we need to fiddle with page LRU state after stealing it,
however for some origins we just don't know if the page is on the LRU
list or not.

So remove PIPE_BUF_FLAG_LRU and do this check/add manually in pipe_to_file()
instead.
Signed-off-by: Jens Axboe <axboe@suse.de>

a893b99b

[PATCH] vmsplice: fix badly placed end paranthesis · 7591489a

Authored by Jens Axboe on May 02, 2006

We need to use the minimum of {len, PAGE_SIZE-off}, not {len, PAGE_SIZE}-off.
The latter doesn't make any sense, and could cause us to attempt negative
length transfers...
Signed-off-by: Jens Axboe <axboe@suse.de>

7591489a
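
To see why the parenthesis placement in 7591489a matters, the two expressions can be compared in a small user-space sketch. This is not the kernel code: `EXAMPLE_PAGE_SIZE` and both helper names are made up for illustration, and the buggy variant returns a signed value only so that the negative result is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define EXAMPLE_PAGE_SIZE 4096UL

/* Intended form: min(len, PAGE_SIZE - off), i.e. never cross the page end. */
static size_t chunk_len_fixed(size_t len, size_t off)
{
    assert(off < EXAMPLE_PAGE_SIZE);
    size_t room = EXAMPLE_PAGE_SIZE - off;
    return len < room ? len : room;
}

/* Misplaced parenthesis: min(len, PAGE_SIZE) - off, which can go negative. */
static long chunk_len_buggy(size_t len, size_t off)
{
    size_t m = len < EXAMPLE_PAGE_SIZE ? len : EXAMPLE_PAGE_SIZE;
    return (long)m - (long)off;
}
```

With `len = 100` and `off = 4000`, the intended form yields 96 bytes (what is left in the page), while the misplaced parenthesis yields -3900.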

[PATCH] vmsplice: allow user to pass in gift pages · 7afa6fd0

Authored by Jens Axboe on May 01, 2006

If SPLICE_F_GIFT is set, the user is basically giving these pages away to
the kernel. That means we can steal them, e.g. for page cache use, instead
of copying them.

The data must be properly page aligned and also a multiple of the page size
in length.
Signed-off-by: Jens Axboe <axboe@suse.de>

7afa6fd0
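
The alignment rule stated in 7afa6fd0 can be checked from user space before attempting a gift. The helper below is a hypothetical sketch, not the kernel's own validation: the name `gift_range_ok` and the requirement that the page size be a power of two are assumptions made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the buffer starts on a page boundary and its
 * length is a whole number of pages, as the commit message requires for
 * SPLICE_F_GIFT. page_size is assumed to be a power of two.
 */
static bool gift_range_ok(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    size_t mask = page_size - 1;
    return (addr & mask) == 0 && (len & mask) == 0;
}
```

A caller would typically obtain the page size from `sysconf(_SC_PAGESIZE)` and allocate the buffer with an aligned allocator before passing it to vmsplice().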

[PATCH] pipe: enable atomic copying of pipe data to/from user space · f6762b7a

Authored by Jens Axboe on May 01, 2006

The pipe ->map() method uses kmap() to virtually map the pages, which
is both slow and has known scalability issues on SMP. This patch enables
atomic copying of pipe pages, by pre-faulting data and using kmap_atomic()
instead.

lmbench bw_pipe and lat_pipe measurements agree this is a Good Thing. Here
are results from that on a UP machine with highmem (1.5GiB of RAM), running
first a UP kernel, SMP kernel, and SMP kernel patched.

Vanilla-UP:
Pipe bandwidth: 1622.28 MB/sec
Pipe bandwidth: 1610.59 MB/sec
Pipe bandwidth: 1608.30 MB/sec
Pipe latency: 7.3275 microseconds
Pipe latency: 7.2995 microseconds
Pipe latency: 7.3097 microseconds

Vanilla-SMP:
Pipe bandwidth: 1382.19 MB/sec
Pipe bandwidth: 1317.27 MB/sec
Pipe bandwidth: 1355.61 MB/sec
Pipe latency: 9.6402 microseconds
Pipe latency: 9.6696 microseconds
Pipe latency: 9.6153 microseconds

Patched-SMP:
Pipe bandwidth: 1578.70 MB/sec
Pipe bandwidth: 1579.95 MB/sec
Pipe bandwidth: 1578.63 MB/sec
Pipe latency: 9.1654 microseconds
Pipe latency: 9.2266 microseconds
Pipe latency: 9.1527 microseconds
Signed-off-by: Jens Axboe <axboe@suse.de>

f6762b7a

[PATCH] splice: call handle_ra_miss() on failure to lookup page · e27dedd8

Authored by Jens Axboe on May 01, 2006

Notify the readahead logic of the missing page. Suggested by
Oleg Nesterov.
Signed-off-by: Jens Axboe <axboe@suse.de>

e27dedd8

[PATCH] pipe: introduce ->pin() buffer operation · f84d7519

Authored by Jens Axboe on May 01, 2006

The ->map() function is really expensive on highmem machines right now,
since it has to use the slower kmap() instead of kmap_atomic(). Splice
rarely needs to access the virtual address of a page, so it's a waste
of time doing it.

Introduce ->pin() to take over the responsibility of making sure the
page data is valid. ->map() is then reduced to just kmap(). That way we
can also share most of the pipe buffer ops between pipe.c and splice.c.
Signed-off-by: Jens Axboe <axboe@suse.de>

f84d7519

[PATCH] splice: fix bugs in pipe_to_file() · 0568b409

Authored by Jens Axboe on May 01, 2006

Found by Oleg Nesterov <oleg@tv-sign.ru>, fixed by me.

- Only allow full pages to go to the page cache.
- Check page != buf->page instead of using PIPE_BUF_FLAG_STOLEN.
- Remember to clear 'stolen' if add_to_page_cache() fails.

And as a cleanup on that:

- Make the bottom fall-through logic a little less convoluted. Also make
  the steal path hold an extra reference to the page, so we don't have
  to differentiate between stolen and non-stolen at the end.
Signed-off-by: Jens Axboe <axboe@suse.de>

0568b409

30 Apr, 2006 1 commit

[PATCH] splice: fix bugs with stealing regular pipe pages · 46e678c9

Authored by Jens Axboe on Apr 30, 2006

- Check that page has suitable count for stealing in the regular pipes.
- pipe_to_file() assumes that the page is locked on successful steal, so
  do that in the pipe steal hook.
- Missing unlock_page() in add_to_page_cache() failure.
Signed-off-by: Jens Axboe <axboe@suse.de>

46e678c9

27 Apr, 2006 2 commits

[PATCH] splice: make the read-side do batched page lookups · eb20796b

Authored by Jens Axboe on Apr 27, 2006

Use the new find_get_pages_contig() to potentially look up the entire
splice range in one single call. This speeds up generic_file_splice_read()
quite a bit.
Signed-off-by: Jens Axboe <axboe@suse.de>

eb20796b

[PATCH] splice: switch to using page_cache_readahead() · eb645a24

Authored by Jens Axboe on Apr 27, 2006

Avoids doing useless work, when the file is fully cached.
Signed-off-by: Jens Axboe <axboe@suse.de>

eb645a24

26 Apr, 2006 4 commits

[PATCH] splice: rearrange moving to/from pipe helpers · 00522fb4

Authored by Jens Axboe on Apr 26, 2006

We need these for people writing their own ->splice_read/write hooks.
Signed-off-by: Jens Axboe <axboe@suse.de>

00522fb4

[PATCH] Add support for the sys_vmsplice syscall · 912d35f8

Authored by Jens Axboe on Apr 26, 2006

sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice()
moves data to a pipe, with the input being a user address range instead.

This uses an approach suggested by Linus, where we can hold partial ranges
inside the pages[] map. Hopefully this will be useful for network
receive support as well.
Signed-off-by: Jens Axboe <axboe@suse.de>

912d35f8

[PATCH] splice: fix offset problems · 016b661e

Authored by Jens Axboe on Apr 25, 2006

Make the move_from_pipe() actors return number of bytes processed, then
move_from_pipe() can decide more cleverly when to move on to the next
buffer.

This fixes problems with pipe offset and differing file offset.
Signed-off-by: Jens Axboe <axboe@suse.de>

016b661e

[PATCH] splice: fix min() warning · ba5f5d90

Authored by Andrew Morton on Apr 25, 2006

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>

ba5f5d90

20 Apr, 2006 1 commit

[PATCH] splice: fix smaller sized splice reads · 82aa5d61

Authored by Jens Axboe on Apr 20, 2006

Signed-off-by: Jens Axboe <axboe@suse.de>

82aa5d61

19 Apr, 2006 5 commits

[PATCH] splice: fixup writeout path after ->map changes · 9e0267c2

Authored by Jens Axboe on Apr 19, 2006

Since ->map() no longer locks the page, we need to adjust the handling
of those pages (and stealing) a little. This now passes full regressions
again.
Signed-off-by: Jens Axboe <axboe@suse.de>

9e0267c2

[PATCH] splice: offset fixes · a4514ebd

Authored by Jens Axboe on Apr 19, 2006

- We need to adjust *ppos for writes as well.
- Copy back modified offset value if one was passed in, similar to
  what sendfile does.
Signed-off-by: Jens Axboe <axboe@suse.de>

a4514ebd

[PATCH] tee: link_pipe() must be careful when dropping one of the pipe locks · 2a27250e

Authored by Jens Axboe on Apr 19, 2006

We need to ensure that we only drop a lock that is ordered last, to avoid
ABBA deadlocks with competing processes.
Signed-off-by: Jens Axboe <axboe@suse.de>

2a27250e
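
The rule in 2a27250e (only ever drop the lock that is ordered last) can be illustrated with plain pthread mutexes. This is a user-space sketch rather than the link_pipe() code: ordering the two locks by address and all helper names are assumptions chosen here for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Assumed global order for this sketch: the lower address is taken first. */
static pthread_mutex_t *ordered_first(pthread_mutex_t *a, pthread_mutex_t *b)
{
    return (uintptr_t)a < (uintptr_t)b ? a : b;
}

static pthread_mutex_t *ordered_last(pthread_mutex_t *a, pthread_mutex_t *b)
{
    return (uintptr_t)a < (uintptr_t)b ? b : a;
}

/* Take both locks in the global order, whichever way the caller names them. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != b);
    pthread_mutex_lock(ordered_first(a, b));
    pthread_mutex_lock(ordered_last(a, b));
}

/*
 * Temporarily release only the last-ordered lock. The first-ordered lock
 * stays held, so re-taking the dropped lock cannot invert the order.
 */
static void drop_last(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(ordered_last(a, b));
}

static void retake_last(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_lock(ordered_last(a, b));
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(ordered_last(a, b));
    pthread_mutex_unlock(ordered_first(a, b));
}
```

Dropping the first-ordered lock instead and re-taking it while still holding the other would acquire the locks in the opposite order, which is exactly the ABBA pattern the commit avoids.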

[PATCH] splice: cleanup the SPLICE_F_NONBLOCK handling · c4f895cb

Authored by Jens Axboe on Apr 19, 2006

- generic_file_splice_read() more readable and correct
- Don't bail on page allocation with NONBLOCK set, just don't allow
  direct blocking on IO (eg lock_page).
Signed-off-by: Jens Axboe <axboe@suse.de>

c4f895cb

[PATCH] splice: close i_size truncate races on read · 91ad66ef

Authored by Jens Axboe on Apr 19, 2006

We need to check i_size after doing a blocking readpage.
Signed-off-by: Jens Axboe <axboe@suse.de>

91ad66ef

11 Apr, 2006 8 commits

[PATCH] splice: add support for sys_tee() · 70524490

Authored by Jens Axboe on Apr 11, 2006

Basically an in-kernel implementation of tee, which uses splice and the
pipe buffers as an intelligent way to pass data around by reference.

Where the user space tee consumes the input and produces a stdout and
file output, this syscall merely duplicates the data inside a pipe to
another pipe. No data is copied, the output just grabs a reference to the
input pipe data.
Signed-off-by: Jens Axboe <axboe@suse.de>

70524490
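
From user space, the behaviour described in 70524490 is reachable through the tee(2) wrapper that glibc exposes on Linux. The sketch below wraps that call; the helper name `duplicate_pipe` and its -errno return convention are assumptions made here, not part of the commit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: duplicate up to len bytes from the pipe read end
 * src_rd into the pipe write end dst_wr. The data is not consumed from
 * src_rd, so it can still be read or spliced from there afterwards.
 * Returns the number of bytes duplicated, or -errno on failure.
 */
static ssize_t duplicate_pipe(int src_rd, int dst_wr, size_t len)
{
    assert(src_rd >= 0 && dst_wr >= 0);
    ssize_t n = tee(src_rd, dst_wr, len, 0);
    return n < 0 ? -errno : n;
}
```

After a successful call the same bytes are readable from both pipes, which is the "duplicate by reference" behaviour the commit message describes.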

[PATCH] splice: pass offset around for ->splice_read() and ->splice_write() · cbb7e577

Authored by Jens Axboe on Apr 11, 2006

We need not use ->f_pos as the offset for the file input/output. If the
user passed an offset pointer in through sys_splice(), just use that and
leave ->f_pos alone.
Signed-off-by: Jens Axboe <axboe@suse.de>

cbb7e577

[PATCH] splice: comment styles · 73d62d83

Authored by Ingo Molnar on Apr 11, 2006

 - capitalize consistently
 - end sentences in one way or another
 - update comment text to match the implementation
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>

73d62d83

[PATCH] splice: add Ingo as addition copyright holder · c2058e06

Authored by Jens Axboe on Apr 11, 2006

The comment is also somewhat out of date, correct that as well.
Signed-off-by: Jens Axboe <axboe@suse.de>

c2058e06

[PATCH] splice: unlikely() optimizations · 49570e9b

Authored by Jens Axboe on Apr 11, 2006

Also corrects a few comments. Patch mainly from Ingo, changes by me.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>

49570e9b

[PATCH] splice: speedups and optimizations · 6f767b04

Authored by Jens Axboe on Apr 11, 2006

- Kill the local variables that cache ->nrbufs, they just take up space.

- Only set do_wakeup for a real pipe. This is a big win for direct splicing.

- Kill i_mutex lock around ->f_pos update, regular io paths don't do this
  either.
Signed-off-by: Jens Axboe <axboe@suse.de>

6f767b04

[PATCH] splice: speedup __generic_file_splice_read · 7480a904

Authored by Jens Axboe on Apr 11, 2006

Using find_get_page() is a lot faster than find_or_create_page(). This
gets splice a lot closer to sendfile() for fd -> socket transfers.
Signed-off-by: Jens Axboe <axboe@suse.de>

7480a904

[PATCH] splice: add direct fd <-> fd splicing support · b92ce558

Authored by Jens Axboe on Apr 11, 2006

It's more efficient for sendfile() emulation. Basically we cache an
internal private pipe and just use that as the intermediate area for
pages. Direct splicing is not available from sys_splice(), it is only
meant to be used for sendfile() emulation.

Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at
exit for the normal fast path.
Signed-off-by: Jens Axboe <axboe@suse.de>

b92ce558

10 Apr, 2006 8 commits

[PATCH] splice: add optional input and output offsets · 529565dc

Authored by Ingo Molnar on Apr 10, 2006

add optional input and output offsets to sys_splice(), for seekable file
descriptors:

 asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
                            int fd_out, loff_t __user *off_out,
                            size_t len, unsigned int flags);

semantics are straightforward: f_pos will be updated with the offset
provided by user-space, before the splice transfer is about to begin.
Providing a NULL offset pointer means the existing f_pos will be used
(and updated in situ).  Providing an offset for a pipe results in
-ESPIPE. Providing an invalid offset pointer results in -EFAULT.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>

529565dc
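
The offset rules in 529565dc can be exercised from user space through the splice(2) wrapper in glibc. The sketch below relies only on the -ESPIPE rule for pipes; how ->f_pos is treated when an offset is supplied was changed by the later commit cbb7e577 listed above, so nothing here depends on it. The helper name `splice_at` and its -errno return convention are assumptions for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around splice(2). Pass NULL for an offset to use the
 * descriptor's current position; a non-NULL offset is only valid for
 * seekable descriptors and yields -ESPIPE when the descriptor is a pipe.
 * Returns the number of bytes moved, or -errno on failure.
 */
static ssize_t splice_at(int fd_in, loff_t *off_in,
                         int fd_out, loff_t *off_out, size_t len)
{
    assert(fd_in >= 0 && fd_out >= 0);
    ssize_t n = splice(fd_in, off_in, fd_out, off_out, len, 0);
    return n < 0 ? -errno : n;
}
```

Calling `splice_at()` with an offset pointer for a pipe end fails with -ESPIPE, while the same call with NULL offsets moves the data.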

[PATCH] introduce a "kernel-internal pipe object" abstraction · 3a326a2c

Authored by Ingo Molnar on Apr 10, 2006

separate out the 'internal pipe object' abstraction, and make it
usable to splice. This cleans up and fixes several aspects of the
internal splice APIs and the pipe code:

 - pipes: the allocation and freeing of pipe_inode_info is now more symmetric
   and more streamlined with existing kernel practices.

 - splice: small micro-optimization: less pointer dereferencing in splice
   methods
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Update XFS for the ->splice_read/->splice_write changes.
Signed-off-by: Jens Axboe <axboe@suse.de>

3a326a2c

[PATCH] splice: be smarter about calling do_page_cache_readahead() · 0b749ce3

Authored by Jens Axboe on Apr 10, 2006

We don't want to call into the read-ahead logic unless we are at the
start of a page, _or_ we have multiple pages to read.
Signed-off-by: Jens Axboe <axboe@suse.de>

0b749ce3

[PATCH] splice: optimize the splice buffer mapping · 49d0b21b

Authored by Jens Axboe on Apr 10, 2006

We don't really need to lock down the pages, just make sure they
are uptodate.
Signed-off-by: Jens Axboe <axboe@suse.de>

49d0b21b

[PATCH] splice: cleanup __generic_file_splice_read() · 16c523dd

Authored by Jens Axboe on Apr 10, 2006

The whole shadow/pages logic got overly complex, and this simpler
approach is actually faster in testing.
Signed-off-by: Jens Axboe <axboe@suse.de>

16c523dd

[PATCH] splice: only call wake_up_interruptible() when we really have to · c0bd1f65

Authored by Jens Axboe on Apr 10, 2006

__wake_up_common() is pretty heavy in the kernel profiles, this brings
it down to a more acceptable level.
Signed-off-by: Jens Axboe <axboe@suse.de>

c0bd1f65

[PATCH] splice: potential !page dereference · 9aefe431

Authored by Dave Jones on Apr 10, 2006

We can get to out: with a NULL page, which we probably
don't want to be calling page_cache_release() on.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@suse.de>

9aefe431

[PATCH] splice: mark the io page as accessed · c7f21e4f

Authored by Jens Axboe on Apr 10, 2006

We should do that, since we do the LRU manipulation ourselves now. Suggested
by Nick Piggin.
Signed-off-by: Jens Axboe <axboe@suse.de>

c7f21e4f

03 Apr, 2006 2 commits

[PATCH] splice: fix page stealing LRU handling. · 3e7ee3e7

Authored by Jens Axboe on Apr 02, 2006

Originally from Nick Piggin, just adapted to the newer branch.

You can't check PageLRU without holding zone->lru_lock.  The page
release code can get away with it only because the page refcount is 0 at
that point. Also, you can't reliably remove pages from the LRU unless
the refcount is 0. Ever.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jens Axboe <axboe@suse.de>

3e7ee3e7

[PATCH] splice: page stealing needs to wait_on_page_writeback() · ad8d6f0a

Authored by Jens Axboe on Apr 02, 2006

Thanks to Andrew for the good explanation of why this is so. akpm writes:

If a page is under writeback and we remove it from pagecache, it's still
going to get written to disk. But the VFS no longer knows about that page,
nor that this page is about to modify disk blocks.

So there might be scenarios in which those
blocks-which-are-about-to-be-written-to get reused for something else.
When writeback completes, it'll scribble on those blocks.

This won't happen in ext2/ext3-style filesystems in normal mode because the
page has buffers and try_to_release_page() will fail.

But ext2 in nobh mode doesn't attach buffers at all - it just sticks the
page in a BIO, finds some new blocks, points the BIO at those blocks and
lets it rip.

While that write IO's in flight, someone could truncate the file. Truncate
won't block on the writeout because the page isn't in pagecache any more.
So truncate will then free the blocks from the file under the page's feet.
Then something else can reallocate those blocks. Then write data to them.

Now, the original write completes, corrupting the filesystem.
Signed-off-by: Jens Axboe <axboe@suse.de>

ad8d6f0a