Commits · 79685b8deea4541d18882d8c07d0e99e788292ab · openanolis / cloud-kernel

27 Jul, 2007 1 commit

docbook: add pipes, other fixes · 79685b8d

Authored by Randy Dunlap on Jul 27, 2007

Fix some typos in pipe.c and splice.c.
Add pipes API to kernel-api.tmpl.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

79685b8d

21 Jul, 2007 1 commit

splice: fix bad unlock_page() in error case · 6a860c97

Authored by Jens Axboe on Jul 20, 2007

If add_to_page_cache_lru() fails, the page will not be locked. But
splice jumps to an error path that does a page release and unlock,
causing a BUG() in unlock_page().

Fix this by adding one more label that just releases the page. This bug
was actually triggered on EL5 by gurudas pai <gurudas.pai@oracle.com>
using fio.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6a860c97
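
The error path described in 6a860c97 can be sketched in userspace C. This is only an illustration: `fake_page`, `fake_unlock_page()` and `fake_write_page()` are hypothetical names, and the `insert_fails` flag stands in for add_to_page_cache_lru() failing. The point is the extra label that releases the page without unlocking it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page cache page; not the kernel's struct page. */
struct fake_page {
    bool locked;
    int refcount;
};

/* Counts unlocks of a page that was not locked (the kernel would hit BUG()). */
static int bad_unlocks;

static void fake_unlock_page(struct fake_page *page)
{
    if (!page->locked)
        bad_unlocks++;
    page->locked = false;
}

/*
 * insert_fails models add_to_page_cache_lru() failing, which leaves the
 * page unlocked; write_fails models a later error with the page locked.
 * Returns 0 on success and -1 on either failure.
 */
static int fake_write_page(struct fake_page *page, bool insert_fails,
                           bool write_fails)
{
    int ret = 0;

    page->refcount++;               /* reference taken at allocation */

    if (insert_fails) {
        ret = -1;
        goto out_release;           /* never locked: release only */
    }
    page->locked = true;            /* a successful insert leaves it locked */

    if (write_fails)
        ret = -1;

    fake_unlock_page(page);         /* the page is locked on these paths */
out_release:
    page->refcount--;
    return ret;
}
```

With a single shared label that both unlocks and releases, the `insert_fails` case would reach `fake_unlock_page()` on an unlocked page.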

20 Jul, 2007 4 commits

readahead: split ondemand readahead interface into two functions · cf914a7d

Authored by Rusty Russell on Jul 19, 2007

Split ondemand readahead interface into two functions.  I think this makes it
a little clearer for non-readahead experts (like Rusty).

Internally they both call ondemand_readahead(), but the page argument is
changed to an obvious boolean flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cf914a7d

readahead: pass real splice size · d8983910

Authored by Fengguang Wu on Jul 19, 2007

Pass real splice size to page_cache_readahead_ondemand().

The splice code works in chunks of 16 pages internally.  The readahead code
should be told of the overall splice size, instead of the internal chunk size.
Otherwise bad things may happen.  Imagine some 17-page random splice reads.
The code before this patch will result in two readahead calls: readahead(16);
readahead(1); That leads to one 16-page I/O and one 32-page I/O: one extra I/O
and 31 readahead miss pages.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d8983910
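
The 17-page case from d8983910 can be reproduced with a small sketch of the chunking alone. `chunk_sizes()` and `CHUNK_PAGES` are hypothetical names; the function lists the sizes a readahead layer would be told if each internal chunk reported only its own length, and it does not model the readahead algorithm itself.

```c
#include <stddef.h>

#define CHUNK_PAGES 16  /* internal splice batch size named in the commit */

/*
 * Fills sizes[] with the per-chunk sizes reported when every chunk passes
 * only its own length. Returns the number of chunks written (at most max).
 */
static size_t chunk_sizes(size_t total_pages, size_t *sizes, size_t max)
{
    size_t n = 0;

    while (total_pages > 0 && n < max) {
        size_t len = total_pages < CHUNK_PAGES ? total_pages : CHUNK_PAGES;

        sizes[n++] = len;
        total_pages -= len;
    }
    return n;
}
```

For a 17-page request this yields two entries, 16 and 1, which matches the two readahead calls described above; passing the real splice size means reporting 17 instead.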

readahead: move synchronous readahead call out of splice loop · 431a4820

Authored by Fengguang Wu on Jul 19, 2007

Move synchronous page_cache_readahead_ondemand() call out of splice loop.

This avoids one pointless page allocation/insertion in case of non-zero
ra_pages, or many pointless readahead calls in case of zero ra_pages.

Note that if a user sets ra_pages to less than PIPE_BUFFERS=16 pages, he will
not get expected readahead behavior anyway. The splice code works in batches
of 16 pages, which can be taken as another form of synchronous readahead.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

431a4820

readahead: convert splice invocations · a08a166f

Authored by Fengguang Wu on Jul 19, 2007

Convert splice reads to use on-demand readahead.
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a08a166f

16 Jul, 2007 1 commit

splice: direct splicing updates ppos twice · bcd4f3ac

Authored by Jens Axboe on Jul 16, 2007

OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> reported that he's noticed
nfsd read corruption in recent kernels, and did the hard work of
discovering that it's due to splice updating the file position twice.
This means that the next operation would start further ahead than it
should.

nfsd_vfs_read()
    splice_direct_to_actor()
        while(len) {
            do_splice_to()                     [update sd->pos]
                -> generic_file_splice_read()  [read from sd->pos]
            nfsd_direct_splice_actor()
                -> __splice_from_pipe()        [update sd->pos]

There's nothing wrong with the core splice code, but the direct
splicing is an addon that calls both input and output paths.
So it has to take care in locally caching offset so it remains correct.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

bcd4f3ac
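
The double update described in bcd4f3ac can be modelled in a few lines of userspace C. All names here (`mini_desc`, `mini_splice_to()`, `mini_actor()`) are hypothetical, and the sketch is not the kernel patch; it only shows why the input half should work on a locally cached offset when the output half also advances the shared position.

```c
/* Hypothetical miniature of the shared splice descriptor. */
struct mini_desc {
    long long pos;
};

/* Input half: reads n bytes at *ppos and advances *ppos. */
static void mini_splice_to(long long *ppos, long long n)
{
    *ppos += n;
}

/* Output half: consumes n bytes from the pipe and advances sd->pos. */
static void mini_actor(struct mini_desc *sd, long long n)
{
    sd->pos += n;
}

/* Both halves advance the same position: one step moves it by 2 * n. */
static long long direct_shared_pos(long long start, long long n)
{
    struct mini_desc sd = { start };

    mini_splice_to(&sd.pos, n);
    mini_actor(&sd, n);
    return sd.pos;
}

/* The input half works on a local copy, so only the actor moves sd.pos. */
static long long direct_cached_pos(long long start, long long n)
{
    struct mini_desc sd = { start };
    long long pos = sd.pos;         /* locally cached offset */

    mini_splice_to(&pos, n);
    mini_actor(&sd, n);
    return sd.pos;
}
```

In the shared variant the next operation would start `n` bytes too far ahead, which is the symptom reported for nfsd reads.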

13 Jul, 2007 2 commits

splice: fix offset mangling with direct splicing (sendfile) · 51a92c0f

Authored by Jens Axboe on Jul 13, 2007

If the output actor doesn't transfer the full amount of data, we will
increment ppos too much. Two related bugs in there:

- We need to break out and return actor() retval if it is shorter than
  what we spliced into the pipe.

- Adjust ppos only according to actor() return.

Also fix loop problem in generic_file_splice_read(), it should not keep
going when data has already been transferred.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

51a92c0f
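
The two rules listed in 51a92c0f can be sketched as a loop. `mini_actor_capped()` and `mini_direct_splice()` are hypothetical helpers, and the `cap` parameter is an assumption used to force a short transfer; the sketch is not the kernel implementation.

```c
/* Hypothetical actor: consumes at most cap of the avail bytes offered. */
static long mini_actor_capped(long avail, long cap)
{
    return avail < cap ? avail : cap;
}

/*
 * Moves up to len bytes in pipe-sized chunks. *ppos advances only by what
 * the actor reports, and a short actor return ends the loop.
 */
static long mini_direct_splice(long len, long chunk, long cap, long *ppos)
{
    long total = 0;

    while (len > 0) {
        long filled = len < chunk ? len : chunk;    /* spliced into the pipe */
        long done = mini_actor_capped(filled, cap); /* actually transferred */

        *ppos += done;
        total += done;
        len -= done;
        if (done < filled)
            break;
    }
    return total;
}
```

If the position were advanced by `filled` instead of `done`, a short transfer would leave it ahead of the data that actually reached the output.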

security: revalidate rw permissions for sys_splice and sys_vmsplice · 29ce2058

Authored by James Morris on Jul 13, 2007

Revalidate read/write permissions for splice(2) and vmsplice(2), in case
security policy has changed since the files were opened.
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

29ce2058

10 Jul, 2007 7 commits

pipe: add documentation and comments · 0845718d

Authored by Jens Axboe on Jun 12, 2007

As per Andrew Morton's request, here's a set of documentation for
the generic pipe_buf_operations hooks, the pipe, and pipe_buffer
structures.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0845718d

pipe: change the ->pin() operation to ->confirm() · cac36bb0

Authored by Jens Axboe on Jun 14, 2007

The name 'pin' was badly chosen, it doesn't pin a pipe buffer
in the most commonly used sense in the kernel. So change the
name to 'confirm', after debating this issue with Hugh
Dickins a bit.

A good return from ->confirm() means that the buffer is really
there, and that the contents are good.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cac36bb0

splice: completely document external interface with kerneldoc · 932cc6d4

Authored by Jens Axboe on Jun 21, 2007

Also add fs/splice.c as a kerneldoc target with a smaller blurb that
should be expanded to better explain the overview of splice.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

932cc6d4

pipe: allow passing around of ops private pointer · 497f9625

Authored by Jens Axboe on Jun 11, 2007

relay needs this for proper consumption handling, and the network
receive support needs it as well to look up the sk_buff on pipe
release.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

497f9625

splice: divorce the splice structure/function definitions from the pipe header · d6b29d7c

Authored by Jens Axboe on Jun 04, 2007

We need to move even more stuff into the header so that folks can use
the splice_to_pipe() implementation instead of open-coding a lot of
pipe knowledge (see relay implementation), so move to our own header
file finally.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d6b29d7c

vmsplice: add vmsplice-to-user support · 6a14b90b

Authored by Jens Axboe on Jun 14, 2007

A bit of a cheat, it actually just copies the data to userspace. But
this makes the interface nice and symmetric and enables people to build
on splice, with room for future improvement in performance.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6a14b90b

splice: abstract out actor data · c66ab6fa

Authored by Jens Axboe on Jun 12, 2007

For direct splicing (or private splicing), the output may not be a file.
So abstract out the handling into a specified actor function and put
the data in the splice_desc structure earlier, so we can build on top
of that.

This is the first step in better splice handling for drivers, and also
for implementing vmsplice _to_ user memory.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c66ab6fa

15 Jun, 2007 3 commits

splice: only check do_wakeup in splice_to_pipe() for a real pipe · 02676e5a

Authored by Jens Axboe on Jun 15, 2007

We only ever set do_wakeup to non-zero if the pipe has an inode
backing, so it's pointless to check outside the pipe->inode
check.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

02676e5a

splice: fix leak of pages on short splice to pipe · 00de00bd

Authored by Jens Axboe on Jun 15, 2007

If the destination pipe is full and we already transferred
data, we break out instead of waiting for more pipe room.
The exit logic looks at spd->nr_pages to see if we moved
everything inside the spd container, but we decrement that
variable in the loop to decide when spd has emptied.

Instead we want to compare to the original page count in
the spd, so cache that in a local variable.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

00de00bd
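
The counting problem fixed in 00de00bd can be illustrated with a small model. `mini_spd` and `mini_splice_to_pipe()` are hypothetical, and the `held[]` flags stand in for page references; the sketch shows the cleanup comparing against the original page count cached before the loop, as the commit message describes.

```c
#define MINI_MAX_PAGES 16

/* Hypothetical miniature of the page container handed to the pipe. */
struct mini_spd {
    int nr_pages;
    int held[MINI_MAX_PAGES];   /* 1 while the caller still owns a reference */
};

/* Moves at most pipe_room pages into the pipe; returns how many were moved. */
static int mini_splice_to_pipe(struct mini_spd *spd, int pipe_room)
{
    int spd_pages = spd->nr_pages;  /* original count, cached before the loop */
    int page_nr = 0;
    int moved;

    while (spd->nr_pages > 0 && pipe_room > 0) {
        spd->held[page_nr] = 0;     /* the pipe now owns this reference */
        page_nr++;
        spd->nr_pages--;            /* decremented to detect an empty spd */
        pipe_room--;
    }
    moved = page_nr;

    /* Release every page that did not make it into the pipe. */
    while (page_nr < spd_pages)
        spd->held[page_nr++] = 0;

    return moved;
}
```

With five pages and room for two, comparing `page_nr` against the decremented `spd->nr_pages` (3) instead of `spd_pages` (5) would release only one leftover page and leak the other two.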

splice: adjust balance_dirty_pages_ratelimited() call · 17ee4f49

Authored by Jens Axboe on Jun 15, 2007

As we have potentially dirtied more than 1 page, we should indicate as
such to the dirty page balancing. So call
balance_dirty_pages_ratelimited_nr() and pass in the approximate number
of pages we dirtied.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

17ee4f49
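
One plausible way to get the "approximate number of pages" mentioned in 17ee4f49 is to round the byte count up to whole pages. This is an assumption for illustration: `pages_dirtied()` is a hypothetical helper and the 4 KiB page size is hard-coded for the sketch.

```c
#define SKETCH_PAGE_SHIFT 12                        /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Rounds a byte count up to the number of pages it can touch at most. */
static unsigned long pages_dirtied(unsigned long bytes_written)
{
    return (bytes_written + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}
```

The result is approximate because a write that starts mid-page can straddle one more page than this count suggests.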

08 Jun, 2007 5 commits

splice: __generic_file_splice_read: fix read/truncate race · 620a324b

Authored by Jens Axboe on Jun 07, 2007

Original patch and description from Neil Brown <neilb@suse.de>,
merged and adapted to splice branch by me. Neil's text follows:

__generic_file_splice_read() currently samples the i_size at the start
and doesn't do so again unless it needs to call ->readpage to load
a page.  After ->readpage it has to re-sample i_size as a truncate
may have caused that page to be filled with zeros, and the read()
call should not see these.

However there are other activities that might cause ->readpage to be
called on a page between the time that __generic_file_splice_read()
samples i_size and when it finds that it has an uptodate page. These
include at least read-ahead and possibly another thread performing a
read.

So we must sample i_size *after* it has an uptodate page.  Thus the
current sampling at the start and after a read can be replaced with a
sampling before page addition into spd.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

620a324b

splice: __generic_file_splice_read: fix i_size_read() length checks · 475ecade

Authored by Hugh Dickins on Jun 07, 2007

__generic_file_splice_read's partial page check, at eof after readpage,
not only got its calculations wrong, but also reused the loff variable:
causing data corruption when splicing from a non-0 offset in the file's
last page (revealed by ext2 -b 1024 testing on a loop of a tmpfs file).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

475ecade
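
The partial-page check that 475ecade corrects can be sketched as a pure function. `usable_len()` and the `SK_` constants are hypothetical, a 4 KiB page is assumed, and this is not the kernel code; it shows one way to clamp a read that starts at a non-zero in-page offset `loff` without overwriting `loff` itself.

```c
#define SK_PAGE_SHIFT 12                    /* assumes 4 KiB pages */
#define SK_PAGE_SIZE  (1UL << SK_PAGE_SHIFT)

/*
 * Returns how many bytes of page `index` may be used, starting at in-page
 * offset loff, for a file of isize bytes, capped at want. Returns 0 when
 * the requested range lies entirely beyond end of file.
 */
static unsigned long usable_len(unsigned long long isize, unsigned long index,
                                unsigned long loff, unsigned long want)
{
    unsigned long long end_index;
    unsigned long plen = SK_PAGE_SIZE;      /* valid bytes in this page */

    if (isize == 0 || loff >= SK_PAGE_SIZE)
        return 0;

    end_index = (isize - 1) >> SK_PAGE_SHIFT;
    if (index > end_index)
        return 0;

    if (index == end_index)                 /* last page may be partial */
        plen = (unsigned long)((isize - 1) & (SK_PAGE_SIZE - 1)) + 1;

    if (plen <= loff)
        return 0;

    return want < plen - loff ? want : plen - loff;
}
```

For a 5000-byte file, the last page (index 1) holds 904 valid bytes, so a read at in-page offset 100 may use at most 804 of them.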

splice: move balance_dirty_pages_ratelimited() outside of splice actor · 20d698db

Authored by Jens Axboe on Jun 05, 2007

I've seen inode related deadlocks, so move this call outside of the
actor itself, which may hold the inode lock.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

20d698db

splice: remove do_splice_direct() symbol export · 267adc3e

Authored by Jens Axboe on Jun 08, 2007

It's only supposed to be used by do_sendfile(), which is never
modular. So kill the export.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

267adc3e

splice: move inode size check into generic_file_splice_read() · d366d398

Authored by Jens Axboe on Jun 01, 2007

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d366d398

08 May, 2007 2 commits

[PATCH] splice: always call into page_cache_readahead() · 86aa5ac5

Authored by Jens Axboe on May 08, 2007

Don't try to guess what the read-ahead logic will do, allow it
to make its own decisions.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

86aa5ac5

[PATCH] splice(): fix interaction with readahead · 9ae9d68c

Authored by Fengguang Wu on May 08, 2007

Eric Dumazet, thank you for disclosing this bug.

Readahead logic somehow fails to populate the page range with data.
It can be because

1) the readahead routine is not always called in the following lines of
fs/splice.c:
        if (!loff || nr_pages > 1)
                page_cache_readahead(mapping, &in->f_ra, in, index, nr_pages);

2) even when called, page_cache_readahead() won't guarantee the pages are there.
It won't submit readahead I/O for pages already in the radix tree, or when
(ra_pages == 0), or after 256 cache hits.

In your case, it should be because of the retried reads, which lead to
excessive cache hits and disable readahead at some point.

And that _one_ failure of readahead blocks the whole read process.
The application receives EAGAIN and retries the read, but
__generic_file_splice_read() refuses to make progress:

- in the previous invocation, it has allocated a blank page and inserted it
  into the radix tree, but never has the chance to start I/O for it: the test
  of SPLICE_F_NONBLOCK goes before that.

- in the retried invocation, the readahead code will neither get out of the
  cache hit mode, nor will it submit I/O for an already existing page.

Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

9ae9d68c

29 Mar, 2007 1 commit

[PATCH] splice: partial write fix · d9993c37

Authored by Dmitriy Monakhov on Mar 29, 2007

Currently, if a partial write happens during ->commit_write(), the page
is not marked as accessed and rebalanced.
Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d9993c37

27 Mar, 2007 3 commits

Export __splice_from_pipe() · 40bee44e

Authored by Mark Fasheh on Mar 21, 2007

Ocfs2 wants to implement its own splice write actor so that it can better
manage cluster / page locks. This lets us re-use the rest of splice write
while only providing our own code where it's actually important.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

40bee44e

2/2 splice: dont readpage · 08c72591

Authored by Nick Piggin on Mar 27, 2007

Splice does not need to readpage to bring the page uptodate before writing
to it, because prepare_write will take care of that for us.

Splice is also wrong to SetPageUptodate before the page is actually uptodate.
This results in the old uninitialised memory leak. This gets fixed as a
matter of course when removing the readpage logic.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

08c72591

1/2 splice: dont steal · 485ddb4b

Authored by Nick Piggin on Mar 27, 2007

Stealing pages with splice is problematic because we cannot just insert
an uptodate page into the pagecache and hope the filesystem can take care
of it later.

We also cannot just ClearPageUptodate, then hope prepare_write does not
write anything into the page, because I don't think prepare_write gives
that guarantee.

Remove support for SPLICE_F_MOVE for now. If we really want to bring it
back, we might be able to do so with the new filesystem buffered write
aops APIs I'm working on. If we really don't want to bring it back, then
we should decide that sooner rather than later, and remove the flag and
all the stealing infrastructure before anybody starts using it.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

485ddb4b

14 Dec, 2006 1 commit

[PATCH] constify pipe_buf_operations · d4c3cca9

Authored by Eric Dumazet on Dec 13, 2006

- pipe/splice should use const pipe_buf_operations and file_operations

- struct pipe_inode_info has an unused field "start" : get rid of it.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d4c3cca9

09 Dec, 2006 1 commit

[PATCH] VFS: change struct file to use struct path · 0f7fc9e4

Authored by Josef "Jeff" Sipek on Dec 08, 2006

This patch changes struct file to use struct path instead of having
independent pointers to struct dentry and struct vfsmount, and converts all
users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.

Additionally, it adds two #define's to make the transition easier for users of
the f_dentry and f_vfsmnt.
Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

0f7fc9e4

05 Nov, 2006 1 commit

[PATCH] splice: fix problem introduced with inode diet · ddac0d39

Authored by Jens Axboe on Nov 04, 2006

After the inode slimming patch that unionised i_pipe/i_bdev/i_cdev, it's
no longer enough to check for existence of ->i_pipe to verify that this
is a pipe.

Original patch from Eric Dumazet <dada1@cosmosbay.com>
Final solution suggested by Linus.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

ddac0d39

29 Oct, 2006 1 commit

[PATCH] mm: clean up pagecache allocation · 2ae88149

Authored by Nick Piggin on Oct 28, 2006

- Consolidate page_cache_alloc

- Fix splice: only the pagecache pages and filesystem data need to use
  mapping_gfp_mask.

- Fix grab_cache_page_nowait: same as splice, also honour NUMA placement.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

2ae88149

20 Oct, 2006 3 commits

[PATCH] Remove SUID when splicing into an inode · 8c34e2d6

Authored by Jens Axboe on Oct 17, 2006

Originally from Mark Fasheh <mark.fasheh@oracle.com>

generic_file_splice_write() does not remove S_ISUID or S_ISGID. This is
inconsistent with the way we generally write to files.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

8c34e2d6

[PATCH] Introduce generic_file_splice_write_nolock() · 6da61809

Authored by Mark Fasheh on Oct 17, 2006

This allows file systems to manage their own i_mutex locking while
still re-using the generic_file_splice_write() logic.

OCFS2 in particular wants this so that it can order cluster locks within
i_mutex.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6da61809

[PATCH] Take i_mutex in splice_from_pipe() · 62752ee1

Authored by Mark Fasheh on Oct 17, 2006

The splice_actor may be calling ->prepare_write() and ->commit_write(). We
want i_mutex on the inode being written to before calling those so that we
don't race i_size changes.

The double locking behavior is done elsewhere in splice.c, and if we
eventually want _nolock variants of generic_file_splice_write(), fs modules
might have to replicate the nasty locking code. We introduce
inode_double_lock() and inode_double_unlock() to consolidate the locking
rules into one set of functions.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

62752ee1

12 Oct, 2006 1 commit

[PATCH] splice: fix pipe_to_file() ->prepare_write() error path · e6e80f29

Authored by Jens Axboe on Oct 11, 2006

Don't jump to the unlock+release path, we already did that.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e6e80f29

01 Oct, 2006 1 commit

[PATCH] Update axboe@suse.de email address · 0fe23479

Authored by Jens Axboe on Sep 04, 2006

As people often look for the copyright in files to see who to mail,
update the link to a neutral one.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0fe23479

10 Jul, 2006 1 commit

[PATCH] splice: fix problems with sys_tee() · aadd06e5

Authored by Jens Axboe on Jul 10, 2006

Several issues noticed/fixed:

- We cannot reliably block in link_pipe() while holding both input and output
  mutexes. So do preparatory checks before locking down both mutexes and doing
  the link.

- The ipipe->nrbufs vs i check was bad, because we could have dropped the
  ipipe lock in-between. This causes us to potentially look at unknown
  buffers if we were racing with someone else reading this pipe.
Signed-off-by: Jens Axboe <axboe@suse.de>

aadd06e5
