Commits · e7029206ff43f6cf7d6fcb741adb126f47200516 · OpenHarmony / kernel_linux

04 Aug, 2014 1 commit

nfs: check wait_on_bit_lock err in page_group_lock · e7029206

Authored by Weston Andros Adamson on Jul 17, 2014

Return errors from wait_on_bit_lock from nfs_page_group_lock.

Add a bool argument @wait to nfs_page_group_lock. If true, loop over
wait_on_bit_lock until it returns cleanly. If false, return the error
from wait_on_bit_lock.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e7029206
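
The @wait behaviour described in e7029206 can be sketched in user-space C roughly as follows. This is only an illustration, not the kernel code: `fake_bitlock`, `fake_wait_on_bit_lock` and `page_group_lock_sketch` are hypothetical names, and the stand-in lock simply fails with -EINTR a fixed number of times before succeeding.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bit lock whose wait can be interrupted. */
struct fake_bitlock {
    bool locked;
    int failures_left;   /* how many calls fail before the lock is taken */
};

/* Assumed behaviour: return -EINTR while failures remain, then lock. */
static int fake_wait_on_bit_lock(struct fake_bitlock *l)
{
    if (l->failures_left > 0) {
        l->failures_left--;
        return -EINTR;
    }
    assert(!l->locked);  /* the sketch never double-locks */
    l->locked = true;
    return 0;
}

/*
 * wait == true:  retry until the lock is taken; always returns 0.
 * wait == false: hand the first error back to the caller.
 */
static int page_group_lock_sketch(struct fake_bitlock *l, bool wait)
{
    int ret;

    do {
        ret = fake_wait_on_bit_lock(l);
    } while (wait && ret != 0);

    return ret;
}
```

A caller that cannot tolerate failure would pass `true`; a caller that can back out cleanly would pass `false` and check the return value.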

13 Jul, 2014 5 commits

NFS: Remove 2 unused variables · aafe3750

Authored by Trond Myklebust on Jul 12, 2014

Cc: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

aafe3750

nfs: handle multiple reqs in nfs_wb_page_cancel · 3e217045

Authored by Weston Andros Adamson on Jul 11, 2014

Use nfs_lock_and_join_requests to merge all subrequests into the head request -
this cancels and dereferences all subrequests.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

3e217045

nfs: handle multiple reqs in nfs_page_async_flush · d4581383

Authored by Weston Andros Adamson on Jul 11, 2014

Change nfs_find_and_lock_request so nfs_page_async_flush can handle multiple
requests in a page. There is only one request for a page the first time
nfs_page_async_flush is called, but if a write or commit fails, async_flush
is called again and there may be multiple requests associated with the page.
The solution is to merge all the requests in a page group into a single
request before calling nfs_pageio_add_request.

Rename nfs_find_and_lock_request to nfs_lock_and_join_requests and
change it to first lock all requests for the page, then cancel and merge
all subrequests into the head request.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d4581383

nfs: change find_request to find_head_request · 84d3a9a9

Authored by Weston Andros Adamson on Jul 11, 2014

nfs_page_find_request_locked* should find the head request for that page.
Rename the functions and add comments to make this clear, and fix a bug
that could return a subrequest when page_private isn't set on the page.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

84d3a9a9

nfs: mark nfs_page reqs with flag for extra ref · 17089a29

Authored by Weston Andros Adamson on Jul 11, 2014

Change the use of PG_INODE_REF - set it when taking extra reference on
subrequests and take care to only release once for each request.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

17089a29

25 Jun, 2014 5 commits

nfs: remove unused writeverf code · c65e6254

Authored by Weston Andros Adamson on Jun 09, 2014

Remove duplicate writeverf structure from merge of nfs_pgio_header and
nfs_pgio_data and remove writeverf related flags and logic to handle
more than one RPC per nfs_pgio_header.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c65e6254

nfs: merge nfs_pgio_data into _header · d45f60c6

Authored by Weston Andros Adamson on Jun 09, 2014

struct nfs_pgio_data only exists as a member of nfs_pgio_header, but is
passed around everywhere, because there used to be multiple _data structs
per _header. Many of these functions then use the _data to find a pointer
to the _header. This patch cleans this up by merging the nfs_pgio_data
structure into nfs_pgio_header and passing nfs_pgio_header around instead.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d45f60c6

nfs: rename members of nfs_pgio_data · 823b0c9d

Authored by Weston Andros Adamson on Jun 09, 2014

Rename "verf" to "writeverf" and "pages" to "page_array" to prepare for
merge of nfs_pgio_data and nfs_pgio_header.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

823b0c9d

nfs: move nfs_pgio_data and remove nfs_rw_header · 1e7f3a48

Authored by Weston Andros Adamson on Jun 09, 2014

nfs_rw_header was used to allocate an nfs_pgio_header along with an
nfs_pgio_data, because a _header would need at least one _data.

Now there is only ever one nfs_pgio_data for each nfs_pgio_header -- move
it to nfs_pgio_header and get rid of nfs_rw_header.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1e7f3a48

nfs: Fix cache_validity check in nfs_write_pageuptodate() · 18dd78c4

Authored by Scott Mayhew on Jun 20, 2014

NFS_INO_INVALID_DATA cannot be ignored, even if we have a delegation.

We're still having some problems with data corruption when multiple
clients are appending to a file and those clients are being granted
write delegations on open.

To reproduce:

Client A:
vi /mnt/`hostname -s`
while :; do echo "XXXXXXXXXXXXXXX" >>/mnt/file; sleep $(( $RANDOM % 5 )); done

Client B:
vi /mnt/`hostname -s`
while :; do echo "YYYYYYYYYYYYYYY" >>/mnt/file; sleep $(( $RANDOM % 5 )); done

What's happening is that in nfs_update_inode() we're recognizing that
the file size has changed and we're setting NFS_INO_INVALID_DATA
accordingly, but then we ignore the cache_validity flags in
nfs_write_pageuptodate() because we have a delegation.  As a result,
in nfs_updatepage() we're extending the write to cover the full page
even though we've not read in the data to begin with.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: <stable@vger.kernel.org> # v3.11+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

18dd78c4
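
The decision that 18dd78c4 changes can be sketched in simplified user-space C as below. This is an assumption-laden illustration and not the body of nfs_write_pageuptodate(): the struct, the flag value and the function names are hypothetical, and the real function consults more state than shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value for illustration; not the kernel's definition. */
#define SKETCH_INO_INVALID_DATA 0x1u

struct sketch_inode {
    unsigned int cache_validity;  /* bitmask of validity flags */
    bool have_delegation;         /* client holds a write delegation */
};

/* Shape of the bug: a delegation short-circuits the validity check. */
static bool pageuptodate_before(const struct sketch_inode *ino,
                                bool page_uptodate)
{
    if (ino->have_delegation)
        return page_uptodate;
    if (ino->cache_validity & SKETCH_INO_INVALID_DATA)
        return false;
    return page_uptodate;
}

/* Shape of the fix: INVALID_DATA is honoured with or without a delegation. */
static bool pageuptodate_after(const struct sketch_inode *ino,
                               bool page_uptodate)
{
    if (ino->cache_validity & SKETCH_INO_INVALID_DATA)
        return false;
    return page_uptodate;
}

/* The two versions differ only when a delegation is held while the
 * cached data is flagged invalid. */
static void sketch_demo(void)
{
    struct sketch_inode ino = { SKETCH_INO_INVALID_DATA, true };

    assert(pageuptodate_before(&ino, true) == true);   /* stale page trusted */
    assert(pageuptodate_after(&ino, true) == false);   /* stale page rejected */
}
```

When the function returns false, the caller must not extend the write to cover the full page, which is the corruption path the commit message describes.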

29 May, 2014 22 commits

nfs: page group support in nfs_mark_uptodate · d72ddcba

Authored by Weston Andros Adamson on May 15, 2014

Change how nfs_mark_uptodate checks to see if writes cover a whole page.

This patch should have no effect yet since all page groups currently
have one request, but will come into play when pg_test functions are
modified to split pages into sub-page regions.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d72ddcba

nfs: page group syncing in write path · 20633f04

Authored by Weston Andros Adamson on May 15, 2014

Operations that modify state for a whole page must be synchronized across
all requests within a page group. In the write path, these operations are
calling end_page_writeback and removing the head request from an inode.
Neither operation should be performed until all requests in a page group
have reached the point where they would perform it.

20633f04

nfs: add support for multiple nfs reqs per page · 2bfc6e56

Authored by Weston Andros Adamson on May 15, 2014

Add "page groups" - a circular list of nfs requests (struct nfs_page)
that all reference the same page. This gives nfs read and write paths
the ability to account for sub-page regions independently. This
somewhat follows the design of struct buffer_head's sub-page
accounting.

Only "head" requests are ever added/removed from the inode list in
the buffered write path. "head" and "sub" requests are treated the
same through the read path and the rest of the write/commit path.
Requests are given an extra reference across the life of the list.

Page groups are never rejoined after being split. If the read/write
request fails and the client falls back to another path (i.e., reverts
to the MDS in the pNFS case), the already split requests are pushed through
the recoalescing code again, which may split them further and then
coalesce them into properly sized requests on the wire. Fragmentation
shouldn't be a problem with the current design, because we flush all
requests in page group when a non-contiguous request is added, so
the only time resplitting should occur is on a resend of a read or
write.

This patch lays the groundwork for sub-page splitting, but does not
actually do any splitting. For now all page groups have one request
as pg_test functions don't yet split pages. There are several related
patches that are needed to support multiple requests per page group.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2bfc6e56
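
The "page group" described in 2bfc6e56 is a circular list of requests that share a head. A minimal user-space sketch of that data structure might look like the following; `sk_req` and its fields are simplified, hypothetical stand-ins for struct nfs_page, and locking and reference counting are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct nfs_page; field names are illustrative. */
struct sk_req {
    struct sk_req *head;   /* head request of the page group */
    struct sk_req *next;   /* next request in the circular list */
    unsigned int offset;   /* start of the region within the page */
    unsigned int bytes;    /* length of the region */
};

/* Start a new group (prev == NULL) or link req in right after prev. */
static void sk_group_add(struct sk_req *req, struct sk_req *prev)
{
    if (prev == NULL) {
        req->head = req;
        req->next = req;   /* a single request points back at itself */
    } else {
        req->head = prev->head;
        req->next = prev->next;
        prev->next = req;
    }
}

/* Walk the circle once from the head and total the bytes covered. */
static unsigned int sk_group_bytes(const struct sk_req *head)
{
    const struct sk_req *req = head;
    unsigned int total = 0;

    assert(head->head == head);   /* must start from the head request */
    do {
        total += req->bytes;
        req = req->next;
    } while (req != head);
    return total;
}
```

Assuming the regions do not overlap, a total equal to the page size would mean the group covers the whole page, which is the kind of check a per-group "is the page fully written" test needs.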

nfs: remove unused arg from nfs_create_request · 8c8f1ac1

Authored by Weston Andros Adamson on May 15, 2014

@inode is passed but not used.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8c8f1ac1

NFS: Create a common nfs_pageio_ops struct · 41d8d5b7

Authored by Anna Schumaker on May 06, 2014

At this point the read and write structures look identical, so combine
them into something shared by both.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

41d8d5b7

NFS: Create a common generic_pg_pgios() · cf485fcd

Authored by Anna Schumaker on May 06, 2014

What we have here is two functions that look identical.  Let's share
some more code!

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

cf485fcd

NFS: Create a common multiple_pgios() function · c3766276

Authored by Anna Schumaker on May 06, 2014

Once again, these two functions look identical in the read and write
case.  Time to combine them together!

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c3766276

NFS: Create a common initiate_pgio() function · 1ed26f33

Authored by Anna Schumaker on May 06, 2014

Most of this code is the same for both the read and write paths, so
combine everything and use the rw_ops when necessary.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1ed26f33

NFS: Create a generic_pgio function · ef2c488c

Authored by Anna Schumaker on May 06, 2014

These functions are almost identical on both the read and write side.
FLUSH_COND_STABLE will never be set for the read path, so leaving it in
the generic code won't hurt anything.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ef2c488c

NFS: Create a common pgio_error function · 844c9e69

Authored by Anna Schumaker on May 06, 2014

At this point, the read and write versions of this function look
identical so both should use the same function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

844c9e69

NFS: Create a common rpcsetup function for reads and writes · ce59515c

Authored by Anna Schumaker on May 06, 2014

Write adds a little bit of code dealing with flush flags, but since
"how" will always be 0 when reading we can share the code.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ce59515c

NFS: Create a common rpc_call_ops struct · 6f92fa45

Authored by Anna Schumaker on May 06, 2014

The read and write paths set up this struct in exactly the same way, so
create a single shared struct.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

6f92fa45

NFS: Create a common nfs_pgio_result_common function · 0eecb214

Authored by Anna Schumaker on May 06, 2014

Combining these functions will let me make a single nfs_rw_common_ops
struct (see the next patch).

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

0eecb214

NFS: Create a common pgio_rpc_prepare function · a4cdda59

Authored by Anna Schumaker on May 06, 2014

The read and write paths do exactly the same thing for the rpc_prepare
rpc_op. This patch combines them together into a single function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

a4cdda59

NFS: Create a common rw_header_alloc and rw_header_free function · 4a0de55c

Authored by Anna Schumaker on May 06, 2014

I create a new struct nfs_rw_ops to decide the differences between reads
and writes.  This struct will be set when initializing a new
nfs_pgio_descriptor, and then passed on to the nfs_rw_header when a new
header is allocated.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

4a0de55c

NFS: Create a common pgio_alloc and pgio_release function · 00bfa30a

Authored by Anna Schumaker on May 06, 2014

These functions are identical for the read and write paths so they can
be combined.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

00bfa30a

NFS: Move the write verifier into the nfs_pgio_header · f79d06f5

Authored by Anna Schumaker on May 06, 2014

The header had a pointer to the verifier that was set from the old write
data struct.  We don't need to keep the pointer around now that we have
shared structures.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f79d06f5

NFS: Create a common read and write header struct · c0752cdf

Authored by Anna Schumaker on May 06, 2014

The only difference is the write verifier field, but we can keep that
for a little bit longer.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c0752cdf

NFS: Create a common read and write data struct · 9c7e1b3d

Authored by Anna Schumaker on May 06, 2014

At this point, the only difference between nfs_read_data and
nfs_write_data is the write verifier.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

9c7e1b3d

NFS: Create a common results structure for reads and writes · 9137bdf3

Authored by Anna Schumaker on May 06, 2014

Reads and writes have very similar results.  This patch combines the two
structs together with comments to show where the differing fields are
used.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

9137bdf3

NFS: Create a common argument structure for reads and writes · 3c6b899c

Authored by Anna Schumaker on May 06, 2014

Reads and writes have very similar arguments. This patch combines them
together and documents the few fields used only by write.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

3c6b899c

nfs: remove ->write_pageio_init from rpc ops · a20c93e3

Authored by Christoph Hellwig on Apr 16, 2014

The write_pageio_init method is just a very convoluted way to grab the
right nfs_pageio_ops vector.  The vector to choose is not a choice of
protocol version, but just a pNFS vs MDS I/O choice that can simply be
done inside nfs_pageio_init_write based on the presence of a layout
driver, and a new force_mds flag for the special case of falling back
to MDS I/O on a pNFS-capable volume.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

a20c93e3

18 Apr, 2014 1 commit

arch: Mass conversion of smp_mb__*() · 4e857c58

Authored by Peter Zijlstra on Mar 17, 2014

Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

4e857c58

16 Apr, 2014 1 commit

NFS: Don't ignore suid/sgid bit changes after a successful write · 1f2edbe3

Authored by Trond Myklebust on Apr 13, 2014

If we suspect that the server may have cleared the suid/sgid bit,
then mark the inode for revalidation.

Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1f2edbe3

29 Jan, 2014 1 commit

nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING · 4db72b40

Authored by Jeff Layton on Jan 28, 2014

If the setting of NFS_INO_INVALIDATING gets reordered to before the
clearing of NFS_INO_INVALID_DATA, then another task may hit a race
window where both appear to be clear, even though the inode's pages are
still in need of invalidation. Fix this by adding the appropriate memory
barriers.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

4db72b40

28 Jan, 2014 1 commit

NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping · d529ef83

Authored by Jeff Layton on Jan 27, 2014

There is a possible race in how the nfs_invalidate_mapping function is
handled. Currently, we go and invalidate the pages in the file and then
clear NFS_INO_INVALID_DATA.

The problem is that it's possible for a stale page to creep into the
mapping after the page was invalidated (i.e., via readahead). If another
writer comes along and sets the flag after that happens but before
invalidate_inode_pages2 returns then we could clear the flag
without the cache having been properly invalidated.

So, we must clear the flag first and then invalidate the pages. Doing
this however, opens another race:

It's possible to have two concurrent read() calls that end up in
nfs_revalidate_mapping at the same time. The first one clears the
NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping.

Just before calling that though, the other task races in, checks the
flag and finds it cleared. At that point, it trusts that the mapping is
good and gets the lock on the page, allowing the read() to be satisfied
from the cache even though the data is no longer valid.

These effects are easily manifested by running diotest3 from the LTP
test suite on NFS. That program does a series of DIO writes and buffered
reads. The operations are serialized and page-aligned but the existing
code fails the test since it occasionally allows a read to come out of
the cache incorrectly. While mixing direct and buffered I/O isn't
recommended, I believe it's possible to hit this in other ways that just
use buffered I/O, though that situation is much harder to reproduce.

The problem is that the checking/clearing of that flag and the
invalidation of the mapping really need to be atomic. Fix this by
serializing concurrent invalidations with a bitlock.

At the same time, we also need to allow other places that check
NFS_INO_INVALID_DATA to check whether we might be in the middle of
invalidating the file, so fix up a couple of places that do that
to look for the new NFS_INO_INVALIDATING flag.

Doing this requires us to be careful not to set the bitlock
unnecessarily, so this code only does that if it believes it will
be doing an invalidation.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d529ef83
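
The idea in d529ef83 of making "check and clear the flag" atomic with respect to the invalidation, using a bitlock, can be sketched in user-space C11 as follows. This is a deliberately different mechanism from the kernel's: it uses a compare-and-swap loop and busy-waiting where the kernel uses its own bit-wait and locking primitives, and every name and bit value here is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration only. */
#define SK_INVALID_DATA  (1u << 0)  /* cached pages are stale */
#define SK_INVALIDATING  (1u << 1)  /* bitlock: an invalidation is in flight */

struct sk_inode {
    atomic_uint flags;
    int invalidations;  /* counts how often the cache was dropped */
};

/* Returns true if this caller performed the invalidation. */
static bool sk_revalidate_mapping(struct sk_inode *ino)
{
    unsigned int old = atomic_load(&ino->flags);
    unsigned int prev;

    for (;;) {
        if (old & SK_INVALIDATING) {
            /* Someone else is invalidating: wait for them to finish. */
            old = atomic_load(&ino->flags);
            continue;
        }
        if (!(old & SK_INVALID_DATA))
            return false;  /* nothing to do; the bitlock is never taken */
        /* Take the bitlock and clear the flag in one atomic step. */
        if (atomic_compare_exchange_weak(&ino->flags, &old,
                (old | SK_INVALIDATING) & ~SK_INVALID_DATA))
            break;
        /* CAS failed: old now holds the fresh value, so re-evaluate. */
    }

    ino->invalidations++;  /* stands in for dropping the cached pages */

    /* Release the bitlock; a concurrently set SK_INVALID_DATA survives. */
    prev = atomic_fetch_and(&ino->flags, ~SK_INVALIDATING);
    assert(prev & SK_INVALIDATING);  /* we must have been the holder */
    return true;
}
```

Because a second caller sees SK_INVALIDATING while the first is still dropping pages, it cannot observe "flag clear" and trust the cache prematurely, which is the race the commit message walks through. The fast path also never takes the bitlock unless an invalidation is actually needed.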

18 Jan, 2014 1 commit

nfs: always make sure page is up-to-date before extending a write to cover the entire page · 263b4509

Authored by Scott Mayhew on Jan 17, 2014

We should always make sure the cached page is up-to-date when we're
determining whether we can extend a write to cover the full page -- even
if we've received a write delegation from the server.

Commit c7559663 added logic to skip this check if we have a write
delegation, which can lead to data corruption such as the following
scenario if client B receives a write delegation from the NFS server:

Client A:
    # echo 123456789 > /mnt/file

Client B:
    # echo abcdefghi >> /mnt/file
    # cat /mnt/file
    0�D0�abcdefghi

Just because we hold a write delegation doesn't mean that we've read in
the entire page contents.

Cc: <stable@vger.kernel.org> # v3.11+
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

263b4509

06 Jan, 2014 1 commit

NFS: dprintk() should not print negative fileids and inode numbers · 1e8968c5

Authored by Niels de Vos on Dec 17, 2013

A fileid in NFS is a uint64. There are some occurrences where dprintk()
outputs a signed fileid. This leads to confusion and makes debugging
output harder to read (negative fileids matching positive inode numbers).

Signed-off-by: Niels de Vos <ndevos@redhat.com>
CC: Santosh Pradhan <spradhan@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1e8968c5

25 Oct, 2013 1 commit

nfs: use %p[dD] instead of open-coded (and often racy) equivalents · 6de1472f

Authored by Al Viro on Sep 16, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6de1472f

OpenHarmony / kernel_linux · last synced more than 4 years ago