Commits · 599bd19bdc4c6b20fd91d50f2f79dececbaf80c1 · openanolis / cloud-kernel

13 Mar, 2015 1 commit

Authored by Christoph Hellwig on Feb 11, 2015

There is no need to pass the total request length in the kiocb, as
it is already passed in through the iov_iter argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

66ee59af

04 Feb, 2015 8 commits

nfs: count DIO good bytes correctly with mirroring · 5fadeb47

Authored by Peng Tao on Jan 19, 2015

When resending to the MDS, we might resend multiple mirroring
requests to the MDS. As a result, nfs_direct_good_bytes() ends
up counting bytes multiple times, causing the application to
get wrong return values from read/write syscalls.

Fix it by tracking the start of a dreq and checking the range of
the pgio header.

Cc: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>

5fadeb47
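
The fix described in the commit above can be illustrated with a small model. This is a hedged sketch in Python, not the kernel's C code: the function name `good_bytes` and its parameters are hypothetical. It assumes the byte count is kept relative to the start of the direct request and is only raised when a header's range extends past what has already been counted, so a resend of the same range adds nothing.

```python
def good_bytes(dreq_start, count, hdr_start, hdr_good_bytes):
    """Return the updated byte count for one mirror of a direct request.

    dreq_start:     file offset where the direct request begins (assumed name)
    count:          bytes counted so far, relative to dreq_start
    hdr_start:      file offset where this header's I/O begins
    hdr_good_bytes: bytes this header transferred successfully
    """
    end_of_header = hdr_start + hdr_good_bytes
    # Only extend the count if this header reaches past the bytes
    # already accounted for; a resend of an old range changes nothing.
    if dreq_start + count < end_of_header:
        count = end_of_header - dreq_start
    return count


if __name__ == "__main__":
    count = good_bytes(0, 0, 0, 4096)         # first send of [0, 4096)
    count = good_bytes(0, count, 0, 4096)     # resend of the same range
    count = good_bytes(0, count, 4096, 4096)  # next range [4096, 8192)
    print(count)
```

Summing `hdr_good_bytes` for every completed header would count the resent range twice; comparing ranges against the request start avoids that in this model.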

nfs: add a helper to set NFS_ODIRECT_RESCHED_WRITES to direct writes · 012fa16d

Authored by Peng Tao on Dec 01, 2014

To allow pnfs LD to ask direct writes to be resent.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>

012fa16d

pnfs: fail comparison when bucket verifier not set · 80c76fe3

Authored by Weston Andros Adamson on Oct 01, 2014

This skips the WARN_ON_ONCE, but doesn't change behavior (the memcmp would
fail).
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>

80c76fe3

nfs: mirroring support for direct io · 0a00b77b

Authored by Weston Andros Adamson on Sep 19, 2014

The current mirroring code only notices short writes to the first
mirror. This patch keeps per-mirror byte counts and only considers
a byte to be written once all mirrors report so.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>

0a00b77b
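
The accounting rule in the commit above can be shown with a tiny model. This is a hedged sketch in Python, not the kernel's C code: the function name `bytes_written` and the list-of-counts representation are assumptions made for illustration. It treats a byte as written only once every mirror has reported it, which amounts to taking the minimum of the per-mirror counts.

```python
def bytes_written(per_mirror_counts):
    """Return how many bytes every mirror has acknowledged.

    per_mirror_counts: list of byte counts, one per mirror (hypothetical
    representation; the kernel keeps these counts in its own structures).
    """
    if not per_mirror_counts:
        return 0
    # A byte counts as written only if all mirrors wrote it, so the
    # mirror with the shortest write bounds the result.
    return min(per_mirror_counts)


if __name__ == "__main__":
    # The second mirror had a short write, so only its count is reported.
    print(bytes_written([4096, 2048, 4096]))
```

Looking only at the first mirror, as the older code is described as doing, would report 4096 here and hide the short write on the second mirror.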

nfs: add mirroring support to pgio layer · a7d42ddb

Authored by Weston Andros Adamson on Sep 19, 2014

This patch adds mirrored write support to the pgio layer. The default
is to use one mirror, but pgio callers may define callbacks to change
this to any value up to the (arbitrarily selected) limit of 16.

The basic idea is to break out members of nfs_pageio_descriptor that cannot
be shared between mirrored DSes and put them in a new structure.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>

a7d42ddb

pnfs: pass ds_commit_idx through the commit path · b57ff130

Authored by Weston Andros Adamson on Sep 05, 2014

Pass ds_commit_idx through the nfs commit path. It's used to select
the commit bucket when using pnfs and is ignored when not using pnfs.
Several functions had to be changed: nfs_retry_commit,
nfs_mark_request_commit, pnfs_mark_request_commit and the pnfs layout
driver .mark_request_commit functions.
Signed-off-by: Tom Haynes <loghyr@primarydata.com>

b57ff130

nfs: rename pgio header ds_idx to ds_commit_idx · 6cccbb6f

Authored by Weston Andros Adamson on Sep 16, 2014

'ds_commit_idx' is a better name - it is used to select the right
commit bucket for pnfs.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>

6cccbb6f

pnfs: Do not grab the commit_info lock twice when rescheduling writes · 085d1e33

Authored by Tom Haynes on Dec 11, 2014

Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Tom Haynes <loghyr@primarydata.com>

085d1e33

22 Jan, 2015 1 commit

nfs: fix dio deadlock when O_DIRECT flag is flipped · ee8a1a8b

Authored by Peng Tao on Jan 20, 2015

We only support swap files calling nfs_direct_IO. However, an application
might be able to get to nfs_direct_IO if it toggles the O_DIRECT flag
during IO, and it can deadlock because we grab inode->i_mutex in
nfs_file_direct_write(). So return 0 in that case. The generic
layer will then fall back to buffered IO.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ee8a1a8b

13 Nov, 2014 1 commit

nfs: fix pnfs direct write memory leak · 8c393f9a

Authored by Peng Tao on Nov 05, 2014

For pNFS direct writes, the layout driver may dynamically allocate ds_cinfo.buckets.
So we need to take care to free them when freeing the dreq.

Ideally this needs to be done inside the layout driver where ds_cinfo.buckets
are allocated. But buckets are attached to the dreq and reused across LD IO iterations.
So I feel it's OK to free them in the generic layer.

Cc: stable@vger.kernel.org [v3.4+]
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8c393f9a

14 Oct, 2014 1 commit

block: Remove REQ_KERNEL · e19a8a0a

Authored by Martin K. Petersen on Oct 14, 2014

REQ_KERNEL is no longer used. Remove it and drop the redundant uio
argument to nfs_file_direct_{read,write}.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

e19a8a0a

13 Sep, 2014 1 commit

NFS: Unconditionally enable commit code · f418c64b

Authored by Anna Schumaker on Sep 03, 2014

The goal is to create a generic NFS module with code that does not
depend on what versions of NFS are enabled.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f418c64b

13 Jul, 2014 1 commit

NFS: Remove 2 unused variables · aafe3750

Authored by Trond Myklebust on Jul 12, 2014

Cc: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

aafe3750

25 Jun, 2014 3 commits

nfs: remove unused writeverf code · c65e6254

Authored by Weston Andros Adamson on Jun 09, 2014

Remove duplicate writeverf structure from merge of nfs_pgio_header and
nfs_pgio_data and remove writeverf related flags and logic to handle
more than one RPC per nfs_pgio_header.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c65e6254

nfs: merge nfs_pgio_data into _header · d45f60c6

Authored by Weston Andros Adamson on Jun 09, 2014

struct nfs_pgio_data only exists as a member of nfs_pgio_header, but is
passed around everywhere, because there used to be multiple _data structs
per _header. Many of these functions then use the _data to find a pointer
to the _header. This patch cleans this up by merging the nfs_pgio_data
structure into nfs_pgio_header and passing nfs_pgio_header around instead.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d45f60c6

nfs: move nfs_pgio_data and remove nfs_rw_header · 1e7f3a48

Authored by Weston Andros Adamson on Jun 09, 2014

nfs_rw_header was used to allocate an nfs_pgio_header along with an
nfs_pgio_data, because a _header would need at least one _data.

Now there is only ever one nfs_pgio_data for each nfs_pgio_header -- move
it to nfs_pgio_header and get rid of nfs_rw_header.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1e7f3a48

29 May, 2014 6 commits

pnfs: support multiple verfs per direct req · 5002c586

Authored by Weston Andros Adamson on May 15, 2014

Support direct requests that span multiple pnfs data servers by
comparing nfs_pgio_header->verf to a cached verf in pnfs_commit_bucket.
Continue to use dreq->verf if the MDS is used / non-pNFS.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5002c586

nfs: add support for multiple nfs reqs per page · 2bfc6e56

Authored by Weston Andros Adamson on May 15, 2014

Add "page groups" - a circular list of nfs requests (struct nfs_page)
that all reference the same page. This gives nfs read and write paths
the ability to account for sub-page regions independently. This
somewhat follows the design of struct buffer_head's sub-page
accounting.

Only "head" requests are ever added/removed from the inode list in
the buffered write path. "head" and "sub" requests are treated the
same through the read path and the rest of the write/commit path.
Requests are given an extra reference across the life of the list.

Page groups are never rejoined after being split. If the read/write
request fails and the client falls back to another path (i.e. revert
to MDS in the pNFS case), the already split requests are pushed through
the recoalescing code again, which may split them further and then
coalesce them into properly sized requests on the wire. Fragmentation
shouldn't be a problem with the current design, because we flush all
requests in a page group when a non-contiguous request is added, so
the only time resplitting should occur is on a resend of a read or
write.

This patch lays the groundwork for sub-page splitting, but does not
actually do any splitting. For now all page groups have one request
as pg_test functions don't yet split pages. There are several related
patches that are needed to support multiple requests per page group.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2bfc6e56
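
The "page group" idea from the commit above, a circular list of requests that all reference the same page, can be sketched as follows. This is a hedged illustration in Python, not the kernel's C structures: the class `Request`, its fields `head` and `next_in_group`, and both helper functions are hypothetical names. The example links two requests to show the shape of the list, even though the commit notes that groups hold a single request until later patches add splitting.

```python
class Request:
    """Hypothetical stand-in for one NFS request covering part of a page."""

    def __init__(self, offset, length):
        self.offset = offset
        self.length = length
        self.head = self            # first request of the group
        self.next_in_group = self   # a lone request points at itself


def add_to_group(head, req):
    """Link req into head's circular list, just before the head."""
    last = head
    while last.next_in_group is not head:
        last = last.next_in_group
    last.next_in_group = req
    req.next_in_group = head
    req.head = head


def group_members(req):
    """Walk the circle once, starting from the group's head."""
    members = []
    cur = req.head
    while True:
        members.append(cur)
        cur = cur.next_in_group
        if cur is req.head:
            break
    return members


if __name__ == "__main__":
    head = Request(0, 1024)
    sub = Request(1024, 3072)
    add_to_group(head, sub)
    print([(r.offset, r.length) for r in group_members(sub)])
```

Because the list is circular, any member can reach every other request for the page, which is what lets sub-page regions be accounted for independently in this model.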

nfs: remove unused arg from nfs_create_request · 8c8f1ac1

Authored by Weston Andros Adamson on May 15, 2014

@inode is passed but not used.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8c8f1ac1

NFS: Move the write verifier into the nfs_pgio_header · f79d06f5

Authored by Anna Schumaker on May 06, 2014

The header had a pointer to the verifier that was set from the old write
data struct.  We don't need to keep the pointer around now that we have
shared structures.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f79d06f5

nfs: remove ->read_pageio_init from rpc ops · fab5fc25

Authored by Christoph Hellwig on Apr 16, 2014

The read_pageio_init method is just a very convoluted way to grab the
right nfs_pageio_ops vector.  The vector to choose is not a choice of
protocol version, but just a pNFS vs MDS I/O choice that can simply be
done inside nfs_pageio_init_read based on the presence of a layout
driver, and a new force_mds flag for the special case of falling back
to MDS I/O on a pNFS-capable volume.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

fab5fc25

nfs: remove ->write_pageio_init from rpc ops · a20c93e3

Authored by Christoph Hellwig on Apr 16, 2014

The write_pageio_init method is just a very convoluted way to grab the
right nfs_pageio_ops vector.  The vector to choose is not a choice of
protocol version, but just a pNFS vs MDS I/O choice that can simply be
done inside nfs_pageio_init_write based on the presence of a layout
driver, and a new force_mds flag for the special case of falling back
to MDS I/O on a pNFS-capable volume.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

a20c93e3

07 May, 2014 4 commits

new helper: iov_iter_get_pages_alloc() · 91f79c43

Authored by Al Viro on Mar 21, 2014

same as iov_iter_get_pages(), except that pages array is allocated
(kmalloc if possible, vmalloc if that fails) and left for caller to
free.  Lustre and NFS ->direct_IO() switched to it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

91f79c43

get rid of pointless iov_length() in ->direct_IO() · a6cbcd4a

Authored by Al Viro on Mar 04, 2014

all callers have iov_length(iter->iov, iter->nr_segs) == iov_iter_count(iter)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a6cbcd4a

convert the guts of nfs_direct_IO() to iov_iter · 619d30b4

Authored by Al Viro on Mar 04, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

619d30b4

pass iov_iter to ->direct_IO() · d8d3d94b

Authored by Al Viro on Mar 04, 2014

unmodified, for now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d8d3d94b

14 Jan, 2014 7 commits

nfs: page cache invalidation for dio · a9ab5e84

Authored by Christoph Hellwig on Nov 14, 2013

Make sure to properly invalidate the pagecache before performing direct I/O,
so that no stale pages are left around. This matches what the generic
direct I/O code does. Also take the i_mutex over the direct write submission
to avoid the livelock vs truncate waiting for i_dio_count to decrease, and
to avoid having the pagecache easily repopulated while direct I/O is in
progress. Again matching the generic direct I/O code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

a9ab5e84

nfs: take i_mutex during direct I/O reads · d0b9875d

Authored by Christoph Hellwig on Nov 14, 2013

We'll need the i_mutex to prevent i_dio_count from incrementing while
truncate is waiting for it to reach zero, and to protect against having
the pagecache repopulated after we flushed it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d0b9875d

nfs: merge nfs_direct_write into nfs_file_direct_write · 22cd1bf1

Authored by Christoph Hellwig on Nov 14, 2013

Simple code cleanup to prepare for later fixes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

22cd1bf1

nfs: merge nfs_direct_read into nfs_file_direct_read · 14a3ec79

Authored by Christoph Hellwig on Nov 14, 2013

Simple code cleanup to prepare for later fixes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

14a3ec79

nfs: increment i_dio_count for reads, too · 1f90ee27

Authored by Christoph Hellwig on Nov 14, 2013

i_dio_count is used to protect dio access against truncate. We want
to make sure there are no dio reads pending either when doing a
truncate. I suspect on plain NFS things might work even without
this, but once we use a pnfs layout driver that accesses backing devices
directly things will go bad without the proper synchronization.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1f90ee27

nfs: defer inode_dio_done call until size update is done · 2a009ec9

Authored by Christoph Hellwig on Nov 14, 2013

We need to have the I/O fully finished before telling the truncate code
that we are done.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2a009ec9

nfs: fix size updates for aio writes · 9811cd57

Authored by Christoph Hellwig on Nov 14, 2013

nfs_file_direct_write only updates the inode size if it succeeded and
returned the number of bytes written. But in the AIO case nfs_direct_wait
turns the return value into -EIOCBQUEUED and we skip the size update.

Instead the aio completion path should update it, which this patch
does. The implementation is a little hacky because there is no obvious
way to find out we are called for a write in nfs_direct_complete.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

9811cd57

06 Jan, 2014 1 commit

NFS: dprintk() should not print negative fileids and inode numbers · 1e8968c5

Authored by Niels de Vos on Dec 17, 2013

A fileid in NFS is a uint64. There are some occurrences where dprintk()
outputs a signed fileid. This leads to confusion and makes debugging
output harder to read (negative fileids matching positive inode numbers).
Signed-off-by: Niels de Vos <ndevos@redhat.com>
CC: Santosh Pradhan <spradhan@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1e8968c5
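
The confusion described in the commit above comes from reinterpreting an unsigned 64-bit value as signed. The following Python sketch is only an illustration of that effect, not the kernel's dprintk() code; the helper name `as_signed64` is hypothetical. It shows that any fileid with the top bit set prints as a negative number when treated as a signed 64-bit integer.

```python
import struct


def as_signed64(fileid):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    # Pack as unsigned ("Q"), then unpack the same 8 bytes as signed ("q").
    return struct.unpack("<q", struct.pack("<Q", fileid))[0]


if __name__ == "__main__":
    fileid = 2**63 + 5  # a valid uint64 fileid with the top bit set
    print("unsigned:", fileid)
    print("signed:  ", as_signed64(fileid))
```

Values below 2**63 look the same either way, which is why the mismatch only shows up for large fileids.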

25 Oct, 2013 1 commit

nfs: use %p[dD] instead of open-coded (and often racy) equivalents · 6de1472f

Authored by Al Viro on Sep 16, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6de1472f

30 Jul, 2013 1 commit

aio: Kill aio_rw_vect_retry() · 73a7075e

Authored by Kent Overstreet on May 09, 2013

This code doesn't serve any purpose anymore, since the aio retry
infrastructure has been removed.

This change should be safe because aio_read/write are also used for
synchronous IO, and called from do_sync_read()/do_sync_write() - and
there's no looping done in the sync case (the read and write syscalls).
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>

73a7075e

13 Dec, 2012 2 commits

nfs: fix page dirtying in NFS DIO read codepath · be7e9858

Authored by Jeff Layton on Dec 12, 2012

The NFS DIO code will dirty pages that catch read responses in order to
handle the case where someone is doing DIO reads into an mmapped buffer.
The existing code doesn't really do the right thing though since it
doesn't take into account the case where we might be attempting to read
past the EOF.

Fix the logic in that code to only dirty pages that ended up receiving
data from the read. Note too that it really doesn't matter if
NFS_IOHDR_ERROR is set or not. All that matters is if the page was
altered by the read.

Cc: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

be7e9858

nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ · 67fad106

Authored by Jeff Layton on Dec 12, 2012

Eryu provided a test program that would segfault when attempting to read
past the EOF on a file that was opened O_DIRECT. The buffer given to the
read() call was on the stack, and when he attempted to read past it it
would scribble over the rest of the stack page.

If we hit the end of the file on a DIO READ request, then we don't want
to zero out the rest of the buffer. These aren't pagecache pages after
all, and there's no guarantee that the buffers that were passed in
represent entire pages.

Cc: <stable@vger.kernel.org> # v3.5+
Cc: Fred Isaman <iisaman@netapp.com>
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

67fad106
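
The behavior the commit above asks for can be modeled in a few lines. This is a hedged sketch in Python, not the kernel's read completion code: the function `copy_read_result` and the use of a `bytearray` as the user buffer are assumptions for illustration. On a short read at EOF it copies only the bytes that arrived and leaves the remainder of the caller's buffer untouched.

```python
def copy_read_result(buf, received):
    """Copy only the bytes the server returned into the caller's buffer.

    buf:      bytearray standing in for the user buffer (hypothetical model)
    received: bytes actually returned by the read, possibly short at EOF
    Returns the number of bytes copied.
    """
    n = min(len(buf), len(received))
    buf[:n] = received[:n]
    # Deliberately no zero-fill of buf[n:]. That memory belongs to the
    # caller and need not be a whole page owned by the read path.
    return n


if __name__ == "__main__":
    user_buf = bytearray(b"XXXXXXXX")
    copied = copy_read_result(user_buf, b"abc")  # file ended after 3 bytes
    print(copied, bytes(user_buf))
```

Zero-filling past the returned data is reasonable for pagecache pages, but in this model it would overwrite caller memory that the read never covered.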

09 Oct, 2012 1 commit

NFS41: send real write size in layoutget · 6296556f

Authored by Peng Tao on Sep 25, 2012

For buffered writes, the block layout client scans the inode mapping to find
the next hole and uses offset-to-hole as the layoutget length. The object
layout client uses offset-to-isize as the layoutget length.

For direct writes, both block layout and object layout use dreq->bytes_left.
Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

6296556f

openanolis / cloud-kernel synced successfully more than 1 year ago