Commits · 0198c00df2b6acf0358f2e88c48a0f1f1d45c044 · openeuler / Kernel

27 Dec, 2019 5 commits

nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request · 0198c00d

Committed by ZhangXiaoxu on Oct 12, 2019

mainline inclusion
from mainline-v5.4-rc2
commit 33ea5aaa
category: bugfix
bugzilla: 23211
CVE: NA

-------------------------------------------------

When running xfstests, warnings like the following are reported:

WARNING: CPU: 0 PID: 6235 at fs/nfs/inode.c:122 nfs_clear_inode+0x9c/0xd8
Modules linked in:
CPU: 0 PID: 6235 Comm: umount.nfs
Hardware name: linux,dummy-virt (DT)
pstate: 60000005 (nZCv daif -PAN -UAO)
pc : nfs_clear_inode+0x9c/0xd8
lr : nfs_evict_inode+0x60/0x78
sp : fffffc000f68fc00
x29: fffffc000f68fc00 x28: fffffe00c53155c0
x27: fffffe00c5315000 x26: fffffc0009a63748
x25: fffffc000f68fd18 x24: fffffc000bfaaf40
x23: fffffc000936d3c0 x22: fffffe00c4ff5e20
x21: fffffc000bfaaf40 x20: fffffe00c4ff5d10
x19: fffffc000c056000 x18: 000000000000003c
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000040 x14: 0000000000000228
x13: fffffc000c3a2000 x12: 0000000000000045
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : 0000000000000000
x7 : 0000000000000000 x6 : fffffc00084b027c
x5 : fffffc0009a64000 x4 : fffffe00c0e77400
x3 : fffffc000c0563a8 x2 : fffffffffffffffb
x1 : 000000000000764e x0 : 0000000000000001
Call trace:
 nfs_clear_inode+0x9c/0xd8
 nfs_evict_inode+0x60/0x78
 evict+0x108/0x380
 dispose_list+0x70/0xa0
 evict_inodes+0x194/0x210
 generic_shutdown_super+0xb0/0x220
 nfs_kill_super+0x40/0x88
 deactivate_locked_super+0xb4/0x120
 deactivate_super+0x144/0x160
 cleanup_mnt+0x98/0x148
 __cleanup_mnt+0x38/0x50
 task_work_run+0x114/0x160
 do_notify_resume+0x2f8/0x308
 work_pending+0x8/0x14

nrequests should be increased or decreased only if the PG_INODE_REF
flag is set.

However, nfs_inode_remove_request() may decrement it even when the
PG_INODE_REF flag is not set, which can lead to an nrequests count error.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0198c00d
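
The accounting rule described in commit 0198c00d can be illustrated with a small user-space sketch. This is not the kernel code: the struct names, the SKETCH_PG_INODE_REF bit and both functions are hypothetical stand-ins, and C11 atomics are assumed in place of the kernel's atomic_long_t and bit operations. The point is only that the counter moves exactly when the flag changes state.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for the kernel's PG_INODE_REF. */
#define SKETCH_PG_INODE_REF 1u

/* Hypothetical, simplified stand-ins for the inode and request structures. */
struct sketch_inode {
    atomic_long nrequests;   /* number of requests accounted to the inode */
};

struct sketch_request {
    atomic_uint flags;       /* carries SKETCH_PG_INODE_REF when accounted */
};

/* Account the request once: only the caller that sets the flag bumps the count. */
void sketch_add_request(struct sketch_inode *inode, struct sketch_request *req)
{
    unsigned int old = atomic_fetch_or(&req->flags, SKETCH_PG_INODE_REF);

    if (!(old & SKETCH_PG_INODE_REF))
        atomic_fetch_add(&inode->nrequests, 1);
}

/* Drop the accounting only if this request actually held the flag.
 * Returns true when the counter was decremented. */
bool sketch_remove_request(struct sketch_inode *inode, struct sketch_request *req)
{
    unsigned int old = atomic_fetch_and(&req->flags, ~SKETCH_PG_INODE_REF);

    if (old & SKETCH_PG_INODE_REF) {
        atomic_fetch_sub(&inode->nrequests, 1);
        return true;
    }
    return false;
}
```

With this shape, removing a request that was never accounted leaves nrequests untouched, which is the condition the WARNING in nfs_clear_inode() checks at unmount time.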

NFS: Pass error information to the pgio error cleanup routine · 1aea4c49

Committed by Trond Myklebust on Sep 09, 2019

[ Upstream commit df3accb8 ]

Allow the caller to pass error information when cleaning up a failed
I/O request so that we can conditionally take action to cancel the
request altogether if the error turned out to be fatal.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

1aea4c49

NFS: Don't interrupt file writeout due to fatal errors · 812ffba3

Committed by Trond Myklebust on Jun 12, 2019

mainline inclusion
from mainline-v5.2-rc1
commit 14bebe3c
category: bugfix
bugzilla: 15379
CVE: NA

-------------------------------------------------

When flushing out dirty pages, the fact that we may hit fatal errors
is not a reason to stop writeback. Those errors are reported through
fsync(), not through the flush mechanism.

Fixes: a6598813 ("NFS: Don't write back further requests if there...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

812ffba3
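
The behaviour described in commit 812ffba3 (keep flushing, report the error later through fsync()) can be modelled with a toy user-space sketch. Everything here is an assumption made for illustration: sketch_mapping, sketch_flush_all() and sketch_fsync() are hypothetical names, and per-page results are passed in as an array rather than produced by real I/O.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-file error state. */
struct sketch_mapping {
    int wb_err;   /* first writeback error seen, 0 if none */
};

/* Flush every page even if some fail. page_status[i] is 0 on success
 * or a negative errno-style value. Returns the number of pages attempted. */
size_t sketch_flush_all(struct sketch_mapping *m,
                        const int *page_status, size_t n)
{
    size_t attempted = 0;

    for (size_t i = 0; i < n; i++) {
        attempted++;
        /* Record the first error, but do not stop the flush. */
        if (page_status[i] < 0 && m->wb_err == 0)
            m->wb_err = page_status[i];
    }
    return attempted;
}

/* The recorded error is reported (and cleared) when the caller syncs. */
int sketch_fsync(struct sketch_mapping *m)
{
    int err = m->wb_err;

    m->wb_err = 0;
    return err;
}
```

In this model a failed page does not shorten the flush; the caller only learns about the failure when it calls the sync function.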

NFS: Don't use page_file_mapping after removing the page · 2b8dcc22

Committed by Benjamin Coddington on Feb 23, 2019

mainline inclusion
from mainline-5.0
commit d2ceb7e5
category: bugfix
bugzilla: 10400
CVE: NA

-------------------------------------------------

If nfs_page_async_flush() removes the page from the mapping, then we can't
use page_file_mapping() on it as nfs_updatepage() is wont to do when
receiving an error.  Instead, push the mapping to the stack before the page
is possibly truncated.

Fixes: 8fc75bed ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2b8dcc22

NFS: Fix up return value on fatal errors in nfs_page_async_flush() · d9c0d2c4

Committed by Trond Myklebust on Jan 29, 2019

commit 8fc75bed upstream.

Ensure that we return the fatal error value that caused us to exit
nfs_page_async_flush().

Fixes: c373fff7 ("NFSv4: Don't special case "launder"")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.12+
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d9c0d2c4

27 Jul, 2018 1 commit

NFS: Ensure we immediately start writeback on rescheduled writes · 7be7b3ca

Committed by Trond Myklebust on Jul 04, 2018

If the writes are being rescheduled due to a pNFS error, then we really
want to immediately start a new flush. The O_DIRECT code already does
this, so we only need to worry about buffered writes.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7be7b3ca

01 Jun, 2018 2 commits

NFS: Move call to nfs4_state_protect() to nfs4_commit_setup() · e9ae1ee2

Committed by Anna Schumaker on May 04, 2018

Rather than doing this in the generic NFS client code.  Let's put this
with the other v4 stuff so it's all in one place.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

e9ae1ee2

NFS: Move call to nfs4_state_protect_write() to nfs4_write_setup() · fb91fb0e

Committed by Anna Schumaker on May 04, 2018

This doesn't really need to be in the generic NFS client code, and I
think it makes more sense to keep the v4 code in one place.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

fb91fb0e

11 Apr, 2018 2 commits

NFSv4: Declare the size up to date after it was set. · f6cdfa6d

Committed by Trond Myklebust on Mar 27, 2018

When we've changed the file size, then ensure we declare it to be
up to date in the inode attributes.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

f6cdfa6d

NFS: More fine grained attribute tracking · 16e14375

Committed by Trond Myklebust on Mar 20, 2018

Currently, if the NFS_INO_INVALID_ATTR flag is set, for instance by
a call to nfs_post_op_update_inode_locked(), then it will not be cleared
until all the attributes have been revalidated. This means, for instance,
that NFSv4 writes will always force a full attribute revalidation.

Track the ctime, mtime, size and change attribute separately from the
other attributes so that we can have nfs_post_op_update_inode_locked()
set them correctly, and later have the cache consistency bitmask be
able to clear them.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

16e14375

20 Mar, 2018 1 commit

sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API · 723c921e

Committed by Peter Zijlstra on Mar 15, 2018

The old wait_on_atomic_t() is going to get removed, use the more
flexible wait_var_event() API instead.

No change in functionality.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

723c921e

09 Mar, 2018 1 commit

NFS: Fix unstable write completion · c4f24df9

Committed by Trond Myklebust on Mar 07, 2018

We do want to respect the FLUSH_SYNC argument to nfs_commit_inode() to
ensure that all outstanding COMMIT requests to the inode in question are
complete. Currently we may exit early from both nfs_commit_inode() and
nfs_write_inode() even if there are COMMIT requests in flight, or unstable
writes on the commit list.

In order to get the right semantics w.r.t. sync_inode(), we don't need
to have nfs_commit_inode() reset the inode dirty flags when called from
nfs_wb_page() and/or nfs_wb_all(). We just need to ensure that
nfs_write_inode() leaves them in the right state if there are outstanding
commits, or stable pages.
Reported-by: Scott Mayhew <smayhew@redhat.com>
Fixes: dc4fd9ab ("nfs: don't wait on commit in nfs_commit_inode()...")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

c4f24df9

29 Jan, 2018 1 commit

nfs: convert to new i_version API · 1eb5d98f

Committed by Jeff Layton on Jan 09, 2018

For NFS, we just use the "raw" API since the i_version is mostly
managed by the server. The exception there is when the client
holds a write delegation, but we only need to bump it once
there anyway to handle CB_GETATTR.
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>

1eb5d98f

15 Jan, 2018 1 commit

NFS: Add a cond_resched() to nfs_commit_release_pages() · 7f1bda44

Committed by Trond Myklebust on Dec 18, 2017

The commit list can get very large, and so we need a cond_resched()
in nfs_commit_release_pages() in order to ensure we don't hog the CPU
for excessive periods of time.
Reported-by: Mike Galbraith <efault@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7f1bda44
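
As a rough user-space analogue of the cond_resched() idea in commit 7f1bda44, the sketch below yields the CPU periodically while walking a long list. It is only an illustration under stated assumptions: sched_yield() from POSIX stands in for the kernel's cond_resched(), and the function name, the SKETCH_YIELD_INTERVAL constant and the "pending" array are hypothetical.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical batching interval; the kernel's cond_resched() needs no
 * such constant, it is used here only to keep the analogue cheap. */
#define SKETCH_YIELD_INTERVAL 64

/* Release every entry in a potentially very long list, giving other
 * runnable tasks a chance to run every SKETCH_YIELD_INTERVAL entries.
 * Returns the number of entries released; *yields counts the yield points. */
size_t sketch_release_all(int *pending, size_t n, size_t *yields)
{
    size_t released = 0;

    *yields = 0;
    for (size_t i = 0; i < n; i++) {
        pending[i] = 0;   /* stand-in for releasing one request */
        released++;
        if (released % SKETCH_YIELD_INTERVAL == 0) {
            sched_yield();   /* user-space analogue of cond_resched() */
            (*yields)++;
        }
    }
    return released;
}
```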

16 Dec, 2017 1 commit

nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests · dc4fd9ab

Committed by Scott Mayhew on Dec 08, 2017

If there were no commit requests, then nfs_commit_inode() should not
wait on the commit or mark the inode dirty, otherwise the following
BUG_ON can be triggered:

[ 1917.130762] kernel BUG at fs/inode.c:578!
[ 1917.130766] Oops: Exception in kernel mode, sig: 5 [#1]
[ 1917.130768] SMP NR_CPUS=2048 NUMA pSeries
[ 1917.130772] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc sg nx_crypto pseries_rng ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
[ 1917.130805] CPU: 2 PID: 14923 Comm: umount.nfs4 Tainted: G               ------------ T 3.10.0-768.el7.ppc64 #1
[ 1917.130810] task: c0000005ecd88040 ti: c00000004cea0000 task.ti: c00000004cea0000
[ 1917.130813] NIP: c000000000354178 LR: c000000000354160 CTR: c00000000012db80
[ 1917.130816] REGS: c00000004cea3720 TRAP: 0700   Tainted: G               ------------ T  (3.10.0-768.el7.ppc64)
[ 1917.130820] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI>  CR: 22002822  XER: 20000000
[ 1917.130828] CFAR: c00000000011f594 SOFTE: 1
GPR00: c000000000354160 c00000004cea39a0 c0000000014c4700 c0000000018cc750
GPR04: 000000000000c750 80c0000000000000 0600000000000000 04eeb76bea749a03
GPR08: 0000000000000034 c0000000018cc758 0000000000000001 d000000005e619e8
GPR12: c00000000012db80 c000000007b31200 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 c000000000dfc3ec 0000000000000000 c0000005eefc02c0
GPR28: d0000000079dbd50 c0000005b94a02c0 c0000005b94a0250 c0000005b94a01c8
[ 1917.130867] NIP [c000000000354178] .evict+0x1c8/0x350
[ 1917.130871] LR [c000000000354160] .evict+0x1b0/0x350
[ 1917.130873] Call Trace:
[ 1917.130876] [c00000004cea39a0] [c000000000354160] .evict+0x1b0/0x350 (unreliable)
[ 1917.130880] [c00000004cea3a30] [c0000000003558cc] .evict_inodes+0x13c/0x270
[ 1917.130884] [c00000004cea3af0] [c000000000327d20] .kill_anon_super+0x70/0x1e0
[ 1917.130896] [c00000004cea3b80] [d000000005e43e30] .nfs_kill_super+0x20/0x60 [nfs]
[ 1917.130900] [c00000004cea3c00] [c000000000328a20] .deactivate_locked_super+0xa0/0x1b0
[ 1917.130903] [c00000004cea3c80] [c00000000035ba54] .cleanup_mnt+0xd4/0x180
[ 1917.130907] [c00000004cea3d10] [c000000000119034] .task_work_run+0x114/0x150
[ 1917.130912] [c00000004cea3db0] [c00000000001ba6c] .do_notify_resume+0xcc/0x100
[ 1917.130916] [c00000004cea3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60
[ 1917.130919] Instruction dump:
[ 1917.130921] 7fc3f378 486734b5 60000000 387f00a0 38800003 4bdcb365 60000000 e95f00a0
[ 1917.130927] 694a0060 7d4a0074 794ad182 694a0001 <0b0a0000> 892d02a4 2f890000 40de0134
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org # 4.5+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

dc4fd9ab

18 Nov, 2017 1 commit

nfs/write: Use common error handling code in nfs_lock_and_join_requests() · 0671d8f1

Committed by Markus Elfring on Nov 07, 2017

Add a jump target so that a bit of exception handling can be better reused
at the end of this function.

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

0671d8f1
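
The "jump target" pattern that commit 0671d8f1 refers to is the usual C idiom of funnelling several failure paths through one cleanup label. The sketch below shows the idiom in isolation; it is not the body of nfs_lock_and_join_requests(), and sketch_ctx, sketch_lock_and_join() and the fail_at parameter are hypothetical names used to simulate failures.

```c
#include <errno.h>

/* Hypothetical state that must be undone on any failure. */
struct sketch_ctx {
    int locked;   /* 1 while the (simulated) lock is held */
    int refs;     /* (simulated) reference count */
};

/* fail_at selects a simulated failure: 0 = none, 1 = first step,
 * 2 = second step. On success the caller keeps the lock and the
 * reference; on failure both are released at the shared label. */
int sketch_lock_and_join(struct sketch_ctx *ctx, int fail_at)
{
    int ret;

    ctx->locked = 1;
    ctx->refs++;

    if (fail_at == 1) {
        ret = -EAGAIN;
        goto out_release;
    }
    if (fail_at == 2) {
        ret = -EIO;
        goto out_release;
    }
    return 0;

out_release:
    /* Single place for the cleanup shared by every error path. */
    ctx->refs--;
    ctx->locked = 0;
    return ret;
}
```

Keeping the unwind code in one place means a new failure path only needs to set the return value and jump to the label, rather than repeating the release sequence.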

12 Sep, 2017 2 commits

NFS: various changes relating to reporting IO errors. · bf4b4905

Committed by NeilBrown on Sep 11, 2017

1/ remove 'start' and 'end' args from nfs_file_fsync_commit().
   They aren't used.

2/ Make nfs_context_set_write_error() a "static inline" in internal.h
   so we can...

3/ Use nfs_context_set_write_error() instead of mapping_set_error()
   if nfs_pageio_add_request() fails before sending any request.
   NFS generally keeps errors in the open_context, not the mapping,
   so this is more consistent.

4/ If filemap_write_and_wait_range() reports any error, still
   check ctx->error.  The value in ctx->error is likely to be
   more useful.  As part of this, NFS_CONTEXT_ERROR_WRITE is
   cleared slightly earlier, before nfs_file_fsync_commit() is called,
   rather than at the start of that function.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

bf4b4905

NFS: Add static NFS I/O tracepoints · 8224b273

Committed by Chuck Lever on Aug 21, 2017

Tools like tcpdump and rpcdebug can be very useful. But there are
plenty of environments where they are difficult or impossible to
use. For example, we've had customers report I/O failures during
workloads so heavy that collecting network traffic or enabling
RPC debugging are themselves onerous.

The kernel's static tracepoints are lightweight (less likely to
introduce timing changes) and efficient (the trace data is compact).
They also work in scenarios where capturing network traffic is not
possible due to lack of hardware support (some InfiniBand HCAs) or
where data or network privacy is a concern.

Introduce tracepoints that show when an NFS READ, WRITE, or COMMIT
is initiated, and when it completes. Record the arguments and
results of each operation, which are not shown by existing sunrpc
module's tracepoints.

For instance, the recorded offset and count can be used to match an
"initiate" event to a "done" event. If an NFS READ result returns
fewer bytes than requested or zero, seeing the EOF flag can be
probative. Seeing an NFS4ERR_BAD_STATEID result is also an indication
of a particular class of problems. The timing information attached
to each event record can often be useful as well.

Usage example:

[root@manet tmp]# trace-cmd record -e nfs:*initiate* -e nfs:*done
/sys/kernel/debug/tracing/events/nfs/*initiate*/filter
/sys/kernel/debug/tracing/events/nfs/*done/filter
Hit Ctrl^C to stop recording
^CKernel buffer statistics:
  Note: "entries" are the entries left in the kernel ring buffer and are not
        recorded in the trace data. They should all be zero.

CPU: 0
entries: 0
overrun: 0
commit overrun: 0
bytes: 3680
oldest event ts:    78.367422
now ts:   100.124419
dropped events: 0
read events: 74

... and so on.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8224b273

10 Sep, 2017 4 commits

NFS: Count the bytes of skipped subrequests in nfs_lock_and_join_requests() · 1bd5d6d0

Committed by Trond Myklebust on Sep 09, 2017

If we skip a subrequest due to a zero refcount, we should still count
the byte range that it covered so that we accurately reconstruct the
original request size.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1bd5d6d0
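
The rule in commit 1bd5d6d0 (count the bytes of a subrequest even when it is skipped) can be sketched in a few lines. This is a simplified model, not the kernel loop: sketch_subreq and sketch_join_subrequests() are hypothetical, and the subrequests are assumed to cover disjoint byte ranges so that a plain sum reconstructs the original size.

```c
#include <stddef.h>

/* Hypothetical, simplified subrequest: only the fields the rule needs. */
struct sketch_subreq {
    unsigned int refcount;   /* 0 means it is already being freed */
    size_t bytes;            /* length of the byte range it covered */
};

/* Walk the group and return the reconstructed size of the original
 * request. Subrequests with a zero refcount are not joined, but their
 * bytes are still counted. *joined receives the number actually joined. */
size_t sketch_join_subrequests(const struct sketch_subreq *subs, size_t n,
                               size_t *joined)
{
    size_t total = 0;

    *joined = 0;
    for (size_t i = 0; i < n; i++) {
        total += subs[i].bytes;       /* counted whether or not we skip */
        if (subs[i].refcount == 0)
            continue;                 /* being freed elsewhere: skip it */
        (*joined)++;
    }
    return total;
}
```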

NFS: Don't hold the group lock when calling nfs_release_request() · 8b77484f

Committed by Trond Myklebust on Sep 09, 2017

That can deadlock if this is the last reference since
nfs_page_group_destroy() calls nfs_page_group_sync_on_bit().
Note that even if the page was removed from the subpage list,
the req->wb_head could still be pointing to the old head.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8b77484f

NFS: Remove pnfs_generic_transfer_commit_list() · 5d2a9d9d

Committed by Trond Myklebust on Sep 09, 2017

It's pretty much a duplicate of nfs_scan_commit_list() that also
clears the PG_COMMIT_TO_DS flag.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5d2a9d9d

NFS: nfs_lock_and_join_requests and nfs_scan_commit_list can deadlock · 137da553

Committed by Trond Myklebust on Sep 09, 2017

Since the commit list is not ordered, it is possible for nfs_scan_commit_list
to hold a request that nfs_lock_and_join_requests() is waiting for, while
at the same time trying to grab a request that nfs_lock_and_join_requests
already holds.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

137da553

07 Sep, 2017 1 commit

NFS: don't expect errors from mempool_alloc(). · 237f8306

Committed by NeilBrown on Aug 18, 2017

Commit fbe77c30 ("NFS: move rw_mode to nfs_pageio_header")
reintroduced some pointless code that commit 518662e0 ("NFS: fix
usage of mempools.") had recently removed.

Remove it again.

Cc: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

237f8306

20 Aug, 2017 1 commit

NFS: Remove unused parameter gfp_flags from nfs_pageio_init() · 3bde7afd

Committed by Trond Myklebust on Aug 20, 2017

Now that the mirror allocation has been moved, the parameter can go.
Also remove the redundant symbol export.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

3bde7afd

15 Aug, 2017 16 commits

NFS: Wait for requests that are locked on the commit list · 2ce209c4

Committed by Trond Myklebust on Aug 01, 2017

If a request is on the commit list, but is locked, we will currently skip
it, which can lead to livelocking when the commit count doesn't reduce
to zero.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2ce209c4

NFS: Switch to using mapping->private_lock for page writeback lookups. · 4b9bb25b

Committed by Trond Myklebust on Aug 01, 2017

Switch from using the inode->i_lock for this to avoid contention with
other metadata manipulation.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
4b9bb25b

NFS: Use an atomic_long_t to count the number of commits · 5cb953d4

Committed by Trond Myklebust on Aug 01, 2017

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5cb953d4

NFS: Use an atomic_long_t to count the number of requests · a6b6d5b8

Committed by Trond Myklebust on Aug 01, 2017

Rather than forcing us to take the inode->i_lock just in order to bump
the number.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

a6b6d5b8

NFSv4: Use a mutex to protect the per-inode commit lists · e824f99a

Committed by Trond Myklebust on Aug 01, 2017

The commit lists can get very large, so using the inode->i_lock can
end up affecting general metadata performance.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e824f99a

NFS: Refactor nfs_page_find_head_request() · b30d2f04

Committed by Trond Myklebust on Aug 01, 2017

Split out the 2 cases so that we can treat the locking differently.
The issue is that the locking in the swapcache case is highly
linked to the commit list locking.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b30d2f04

NFSv4: Convert nfs_lock_and_join_requests() to use nfs_page_find_head_request() · bd37d6fc

Committed by Trond Myklebust on Aug 01, 2017

Hide the locking from nfs_lock_and_join_requests() so that we can
separate out the requirements for swapcache pages.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

bd37d6fc

NFS: Fix up nfs_page_group_covers_page() · 7e8a30f8

Committed by Trond Myklebust on Jul 17, 2017

Fix up the test in nfs_page_group_covers_page(). The simplest implementation
is to check that we have a set of intersecting or contiguous subrequests
that connect page offset 0 to nfs_page_length(req->wb_page).
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7e8a30f8
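
The coverage test described in commit 7e8a30f8 (intersecting or contiguous subrequests connecting offset 0 to the page length) can be sketched as a simple interval walk. This is one possible reading, not the kernel implementation: sketch_range and sketch_group_covers() are hypothetical, and the subrequests are passed as a plain array rather than the kernel's linked page group.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subrequest extent within a page: [offset, offset + bytes). */
struct sketch_range {
    unsigned int offset;
    unsigned int bytes;
};

/* Return true if the ranges, in any order, form a chain of overlapping
 * or touching extents from offset 0 up to at least len. */
bool sketch_group_covers(const struct sketch_range *subs, size_t n,
                         unsigned int len)
{
    unsigned int pos = 0;   /* everything below pos is known to be covered */

    while (pos < len) {
        bool advanced = false;

        for (size_t i = 0; i < n; i++) {
            unsigned int start = subs[i].offset;
            unsigned int end = start + subs[i].bytes;

            /* A range that starts at or before pos and ends after it
             * extends the covered prefix. */
            if (start <= pos && end > pos) {
                pos = end;
                advanced = true;
            }
        }
        if (!advanced)
            return false;   /* a hole starts at pos */
    }
    return true;
}
```

The repeated scan makes the check independent of the order of the subrequests, at the cost of a quadratic worst case, which is acceptable for the handful of subrequests a single page can hold.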

NFS: Remove unused parameter from nfs_page_group_lock() · 1344b7ea

Committed by Trond Myklebust on Jul 17, 2017

nfs_page_group_lock() is now always called with the 'nonblock'
parameter set to 'false'.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1344b7ea

NFS: Remove nfs_page_group_clear_bits() · 902a4c00

Committed by Trond Myklebust on Jul 19, 2017

At this point, we only expect ever to potentially see PG_REMOVE and
PG_TEARDOWN being set on the subrequests.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

902a4c00

NFS: Fix nfs_page_group_destroy() and nfs_lock_and_join_requests() race cases · 5b2b5187

Committed by Trond Myklebust on Jul 19, 2017

Since nfs_page_group_destroy() does not take any locks on the requests
to be freed, we need to ensure that we don't inadvertently free the
request in nfs_destroy_unlinked_subrequests() while the last reference
is being released elsewhere.

Do this by:

1) Taking a reference to the request unless it is already being freed
2) Checking (under the page group lock) if PG_TEARDOWN is already set before
   freeing an unreferenced request in nfs_destroy_unlinked_subrequests()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

5b2b5187
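
Step 1 of commit 5b2b5187 ("taking a reference to the request unless it is already being freed") is the familiar get-unless-zero pattern. The sketch below shows that pattern with C11 atomics in user space; it is an illustration only, and sketch_ref, sketch_get_unless_zero() and sketch_put() are hypothetical names rather than the kernel's kref helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter standing in for the request's kref. */
struct sketch_ref {
    atomic_int count;
};

/* Take a reference only if the count has not already dropped to zero.
 * Returns false when the object is already on its way to being freed. */
bool sketch_get_unless_zero(struct sketch_ref *ref)
{
    int old = atomic_load(&ref->count);

    while (old != 0) {
        /* On failure, old is reloaded with the current value and the
         * zero check is repeated before retrying. */
        if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference. Returns true if the caller dropped the last one
 * and is therefore responsible for freeing the object. */
bool sketch_put(struct sketch_ref *ref)
{
    return atomic_fetch_sub(&ref->count, 1) == 1;
}
```

A caller that gets false back must treat the request as gone and must not touch or free it, which is what keeps it from racing with the path that is releasing the last reference.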

NFS: Further optimise nfs_lock_and_join_requests() · 74a6d4b5

Committed by Trond Myklebust on Jul 19, 2017

When locking the entire group in order to remove subrequests,
the locks are always taken in order, and with the page group
lock being taken after the page head is locked. The intention
is that:

1) The lock on the group head guarantees that requests may not
   be removed from the group (although new entries could be appended
   if we're not holding the group lock).
2) It is safe to drop and retake the page group lock while iterating
   through the list, in particular when waiting for a subrequest lock.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

74a6d4b5

NFS: Reduce inode->i_lock contention in nfs_lock_and_join_requests() · b5bab9bf

Committed by Trond Myklebust on Jul 17, 2017

We should no longer need the inode->i_lock, now that we've
straightened out the request locking. The locking schema is now:

1) Lock page head request
2) Lock the page group
3) Lock the subrequests one by one

Note that there is a subtle race with nfs_inode_remove_request() due
to the fact that the latter does not lock the page head, when removing
it from the struct page. Only the last subrequest is locked, hence
we need to re-check that the PagePrivate(page) is still set after
we've locked all the subrequests.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b5bab9bf

NFS: Remove page group limit in nfs_flush_incompatible() · 7e6cca6c

Committed by Trond Myklebust on Jul 17, 2017

nfs_try_to_update_request() should be able to cope now.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7e6cca6c

NFS: Teach nfs_try_to_update_request() to deal with request page_groups · f6032f21

Committed by Trond Myklebust on Jul 17, 2017

Simplify the code, and avoid some flushes to disk.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f6032f21

NFS: Fix the inode request accounting when pages have subrequests · b66aaa8d

Committed by Trond Myklebust on Jul 18, 2017

Both nfs_destroy_unlinked_subrequests() and nfs_lock_and_join_requests()
manipulate the inode flags adjusting the NFS_I(inode)->nrequests.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b66aaa8d
