Commits · 5cf9d70659594e1a75b34d18619d0bb6e0cbbafa · openeuler / raspberrypi-kernel

05 Sep, 2015 1 commit

NFS: Optimise away the close-to-open getattr if there is no cached data · 5cf9d706

Authored by Trond Myklebust on Sep 04, 2015

If there is no cached data, then there is no need to track the file
change attribute on close.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

28 Aug, 2015 1 commit

NFS: Check size by inode_newsize_ok in nfs_setattr · ae57ca0f

Authored by Kinglong Mee on Aug 26, 2015

Setting an rlimit for files on NFS currently has no effect.
The local process's rlimit should be checked by the NFS client.

Similarly, CIFS calls inode_change_ok() to check the rlimit on the client
in cifs_setattr_nounix() and cifs_setattr_unix().

v3: fix incorrect use of error
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
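
The client-side size check discussed in this commit can be modeled in user space roughly as follows. This is a simplified sketch, not the kernel's inode_newsize_ok(): `struct file_state`, `NO_LIMIT` and `newsize_ok()` are hypothetical names chosen for illustration, and the SIGXFSZ signal that the kernel sends when the rlimit is exceeded is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for "no limit" (RLIM_INFINITY in the kernel). */
#define NO_LIMIT UINT64_MAX

/* Hypothetical, simplified view of the state the check depends on. */
struct file_state {
	uint64_t size;        /* current file size in bytes */
	uint64_t fsize_limit; /* the process's RLIMIT_FSIZE, or NO_LIMIT */
	uint64_t max_bytes;   /* largest file size the filesystem supports */
	bool is_swapfile;     /* true while the file is an active swapfile */
};

/*
 * Return 0 if changing the file size to new_size is acceptable, or a
 * negative errno value otherwise.
 */
static int newsize_ok(const struct file_state *f, uint64_t new_size)
{
	assert(f);
	if (new_size > f->size) {
		/* Growing: enforce the local rlimit, then the fs maximum. */
		if (f->fsize_limit != NO_LIMIT && new_size > f->fsize_limit)
			return -EFBIG;
		if (new_size > f->max_bytes)
			return -EFBIG;
	} else if (f->is_swapfile) {
		/* Not growing: resizing an active swapfile is refused. */
		return -ETXTBSY;
	}
	return 0;
}
```

With a limit of 1000 bytes, growing a 100-byte file to 2000 bytes yields -EFBIG, while shrinking it succeeds unless the file is an active swapfile.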

26 Aug, 2015 1 commit

NFSv4: Force a post-op attribute update when holding a delegation · aaae3f00

Authored by Trond Myklebust on Aug 20, 2015

If the ctime or mtime or change attribute have changed because
of an operation we initiated, we should make sure that we force
an attribute update. However, we do not want to mark the page cache
for revalidation.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.0+

18 Aug, 2015 2 commits

NFS: Don't let the ctime override attribute barriers. · 7c2dad99

Authored by Trond Myklebust on Aug 06, 2015

Chuck reports seeing cases where a GETATTR that happens to race
with an asynchronous WRITE is overriding the file size, despite
the attribute barrier being set by the writeback code.

The culprit turns out to be the check in nfs_ctime_need_update(),
which sees that the ctime is newer than the cached ctime, and
assumes that it is safe to override the attribute barrier.
This patch removes that override, and ensures that attribute
barriers are always respected.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Fixes: a08a8cd3 ("NFS: Add attribute update barriers to NFS writebacks")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Remove nfs_release() · aff8d8dc

Authored by Anna Schumaker on Jul 13, 2015

And call nfs_file_clear_open_context() directly.  This makes it obvious
that nfs_file_release() will always return 0.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

23 Jul, 2015 3 commits

NFS: Remove the "NFS_CAP_CHANGE_ATTR" capability · cd812599

Authored by Trond Myklebust on Jul 05, 2015

Setting the change attribute has been mandatory for all NFS versions, since
commit 3a1556e8 ("NFSv2/v3: Simulate the change attribute"). We should
therefore not have anything be conditional on it being set/unset.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Set NFS_INO_REVAL_PAGECACHE if the change attribute is uninitialised · 5c675d64

Authored by Trond Myklebust on Jul 05, 2015

We can't allow caching of data until the change attribute has been
initialised correctly.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Don't revalidate the mapping if both size and change attr are up to date · 85a23cee

Authored by Trond Myklebust on Jul 05, 2015

If we've ensured that the size and the change attribute are both correct,
then there is no point in marking those attributes as needing revalidation
again. Only do so if we know the size is incorrect and was not updated.

Fixes: f2467b6f ("NFS: Clear NFS_INO_REVAL_PAGECACHE when...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

01 Jul, 2015 1 commit

nfs: Remove unneeded micro checking of CONFIG_PROC_FS · cd738ee9

Authored by Kinglong Mee on Jul 01, 2015

CONFIG_PROC_FS is already checked in include/linux/sunrpc/stats.h.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

02 Jun, 2015 1 commit

NFS: report more appropriate block size for directories. · 7ef5ca4f

Authored by NeilBrown on May 08, 2015

In glibc 2.21 (and several previous versions), a call to opendir() will
result in a 32K (BUFSIZ*4) buffer being allocated and passed to
getdents.

However a call to fdopendir() results in an 'fstat' request to
determine block size and a matching buffer allocated for subsequent
use with getdents.  This will typically be 1M.

The first getdents call on an NFS directory will always use
READDIR_PLUS (or NFSv4 equivalent) if available.  Subsequent getdents
calls only use this more expensive version if some 'stat' requests are
made between the getdents calls.

For this reason it is good to keep at least that first getdents call
relatively short.  When fdopendir() and readdir() are used on a large
directory, it takes approximately 32 times as long to complete as
using "opendir".  Current versions of 'find' use fdopendir() and
demonstrate this slowness.

'stat' on a directory currently returns the 'wsize'.  This number has
no meaning on directories.
Actual READDIR requests are limited to ->dtsize, which itself is
capped at 4 pages, coincidentally the same as BUFSIZ*4.
So this is a meaningful number to use as the blocksize on directories,
and has the effect of making 'find' on large directories go a lot
faster.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
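
The block size discussed in this commit is what user space sees in the st_blksize field. A minimal sketch of reading it is shown here, assuming a POSIX system; `stat_blksize()` is a hypothetical helper, and it uses stat() on a path for brevity where the message describes fdopendir() issuing an 'fstat' on an open descriptor.

```c
#include <assert.h>
#include <sys/stat.h>

/*
 * Return the preferred I/O block size that stat() reports for `path`,
 * or -1 if the path cannot be examined.
 */
static long stat_blksize(const char *path)
{
	struct stat st;

	assert(path);
	if (stat(path, &st) != 0)
		return -1;
	return (long)st.st_blksize;
}
```

Calling `stat_blksize()` on a directory of an NFS mount before and after this change would be one way to observe the difference between the 'wsize' and ->dtsize values described above.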

24 Apr, 2015 3 commits

nfs: Remove unneeded casts in nfs · c456aacf

Authored by Firo Yang on Apr 23, 2015

Don't unnecessarily cast allocation return value in
fs/nfs/inode.c::nfs_alloc_inode().
Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: Fetch MOUNTED_ON_FILEID when updating an inode · ea96d1ec

Authored by Anna Schumaker on Apr 03, 2015

2ef47eb1 (NFS: Fix use of nfs_attr_use_mounted_on_fileid()) was a good
start to fixing a circular directory structure warning for NFS v4
"junctioned" mountpoints.  Unfortunately, further testing continued to
generate this error.

My server is configured like this:

anna@nfsd ~ % df
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       9.1G  2.0G  6.5G  24% /
/dev/vdc1      1014M   33M  982M   4% /exports
/dev/vdc2      1014M   33M  982M   4% /exports/vol1
/dev/vdc3      1014M   33M  982M   4% /exports/vol1/vol2

anna@nfsd ~ % cat /etc/exports
/exports/          *(rw,async,no_subtree_check,no_root_squash)
/exports/vol1/     *(rw,async,no_subtree_check,no_root_squash)
/exports/vol1/vol2 *(rw,async,no_subtree_check,no_root_squash)

I've been running chown across the entire mountpoint twice in a row to
hit this problem.  The first run succeeds, but the second one fails with
the circular directory warning along with:

anna@client ~ % dmesg
[Apr 3 14:28] NFS: server 192.168.100.204 error: fileid changed
              fsid 0:39: expected fileid 0x100080, got 0x80

Where 0x80 is the mountpoint's fileid and 0x100080 is the mounted-on
fileid.

This patch fixes the issue by requesting an updated mounted-on fileid
from the server during nfs_update_inode(), and then checking that the
fileid stored in the nfs_inode matches either the fileid or mounted-on
fileid returned by the server.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
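
The comparison described in the last paragraph of this commit can be sketched as a small predicate. This is an illustrative user-space model only: `struct reply_fileids` and `fileid_consistent()` are hypothetical names, not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the file IDs in a server reply. */
struct reply_fileids {
	uint64_t fileid;            /* fileid of the object itself */
	bool has_mounted_on;        /* did the reply carry a mounted-on fileid? */
	uint64_t mounted_on_fileid; /* fileid of the directory mounted on */
};

/*
 * Return true if the fileid cached for the inode is consistent with the
 * reply, i.e. it equals either the fileid or the mounted-on fileid.
 */
static bool fileid_consistent(uint64_t cached, const struct reply_fileids *r)
{
	assert(r);
	if (cached == r->fileid)
		return true;
	return r->has_mounted_on && cached == r->mounted_on_fileid;
}
```

With the values from the dmesg output above, a cached fileid of 0x100080 is accepted when the reply carries fileid 0x80 together with mounted-on fileid 0x100080, and rejected when the mounted-on fileid is missing.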

NFS: Don't zap caches on fallocate() · 9a51940b

Authored by Anna Schumaker on Mar 16, 2015

This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE
operations so we can set the updated inode size and change attribute
directly.  DEALLOCATE will still need to release pagecache pages, so
nfs42_proc_deallocate() now calls truncate_pagecache_range() before
contacting the server.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

16 Apr, 2015 1 commit

VFS: normal filesystems (and lustre): d_inode() annotations · 2b0143b5

Authored by David Howells on Mar 17, 2015

that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

28 Mar, 2015 3 commits

NFS: Block new writes while syncing data in nfs_getattr() · 8c18d76b

Authored by Trond Myklebust on Mar 25, 2015

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4: Truncating file opens should also sync O_DIRECT writes · 9e1681c2

Authored by Trond Myklebust on Mar 25, 2015

We don't just want to sync out buffered writes, but also O_DIRECT ones.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Add a helper to sync both O_DIRECT and buffered writes · 4d346bea

Authored by Trond Myklebust on Mar 25, 2015

Then apply it to nfs_setattr() and nfs_getattr().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

04 Mar, 2015 2 commits

NFS: Don't write enable new pages while an invalidation is proceeding · ef070dcb

Authored by Trond Myklebust on Mar 03, 2015

nfs_vm_page_mkwrite() should wait until the page cache invalidation
is finished. This is the second patch in a 2 patch series to deprecate
the NFS client's reliance on nfs_release_page() in the context of
nfs_invalidate_mapping().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Fix a regression in the read() syscall · 874f9463

Authored by Trond Myklebust on Mar 02, 2015

When invalidating the page cache for a regular file, we want to first
sync all dirty data to disk and then call invalidate_inode_pages2().
The latter relies on nfs_launder_page() and nfs_release_page() to deal
respectively with dirty pages, and unstable written pages.

When commit 95905446 ("NFS: avoid deadlocks with loop-back mounted
NFS filesystems.") changed the behaviour of nfs_release_page(), then it
made it possible for invalidate_inode_pages2() to fail with an EBUSY.
Unfortunately, that error is then propagated back to read().

Let's therefore work around the problem for now by protecting the call
to sync the data and invalidate_inode_pages2() so that they are atomic
w.r.t. the addition of new writes.
Later on, we can revisit whether or not we still need nfs_launder_page()
and nfs_release_page().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

02 Mar, 2015 8 commits

NFSv4: Set a barrier in the update_changeattr() helper · 3235b403

Authored by Trond Myklebust on Feb 26, 2015

Ensure that we don't regress the changes that were made to the
directory.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

NFS: Fix nfs_post_op_update_inode() to set an attribute barrier · 92d64e47

Authored by Trond Myklebust on Feb 26, 2015

nfs_post_op_update_inode() is called after a self-induced attribute
update. Ensure that it also sets the barrier.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

NFS: Remove size hack in nfs_inode_attrs_need_update() · 00fb4c9f

Authored by Trond Myklebust on Feb 26, 2015

Prior to this patch, we used to always OK attribute updates that extended
the file size on the assumption that we might be performing writeback.
Now that we have attribute barriers to protect the writeback-related updates,
we should remove this hack, as it can cause truncate() operations to
apparently be reverted if/when a readahead or getattr RPC call races
with our on-the-wire SETATTR.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit · 8f8ba1d7

Authored by Trond Myklebust on Feb 26, 2015

Ensure that other operations that race with delegreturn and layoutcommit
cannot revert the attribute updates that were made on the server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

NFS: Add attribute update barriers to NFS writebacks · a08a8cd3

Authored by Trond Myklebust on Feb 26, 2015

Ensure that other operations that race with our write RPC calls
cannot revert the file size updates that were made on the server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

NFS: Set an attribute barrier on all updates · f5062003

Authored by Trond Myklebust on Feb 26, 2015

Ensure that we update the attribute barrier even if there were no
invalidations, provided that this value is newer than the old one.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

NFS: Add attribute update barriers to nfs_setattr_update_inode() · f044636d

Authored by Trond Myklebust on Feb 26, 2015

Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.

The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
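
The generation-counter idea in this commit can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel code: all `model_*` names and the `attr_generation` variable are hypothetical, and only the file size is tracked.

```c
#include <assert.h>
#include <stdbool.h>

/* Global attribute generation counter (hypothetical user-space model). */
static unsigned long attr_generation;

struct model_fattr {
	unsigned long gencount; /* generation when the request was set up */
	long long size;         /* file size reported by the server */
};

struct model_inode {
	unsigned long attr_gencount; /* barrier: newest generation applied */
	long long size;
};

/* Stamp a set of attributes before the RPC call is sent. */
static void model_fattr_init(struct model_fattr *fattr)
{
	assert(fattr);
	fattr->gencount = ++attr_generation;
	fattr->size = 0;
}

/* Bump the generation on the inode so that older requests are ignored. */
static void model_set_barrier(struct model_inode *inode)
{
	assert(inode);
	inode->attr_gencount = ++attr_generation;
}

/* Apply the attributes only if they were initialised after the barrier. */
static bool model_update_inode(struct model_inode *inode,
			       const struct model_fattr *fattr)
{
	assert(inode && fattr);
	/* Signed difference keeps the comparison valid across wraparound. */
	if ((long)(fattr->gencount - inode->attr_gencount) <= 0)
		return false;
	inode->size = fattr->size;
	inode->attr_gencount = fattr->gencount;
	return true;
}
```

In this model, a readahead whose attributes were initialised before a truncate set the barrier is rejected by `model_update_inode()`, so the stale file size it carries cannot overwrite the truncated size.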

NFS: Add a helper to set attribute barriers · 140e049c

Authored by Trond Myklebust on Feb 26, 2015

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>

14 Feb, 2015 1 commit

NFSv4: Kill unused nfs_inode->delegation_state field · bf40e556

Authored by Trond Myklebust on Feb 13, 2015

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

31 Jan, 2015 1 commit

nfs: prevent truncate on active swapfile · 3a7ed3ff

Authored by Omar Sandoval on Jan 08, 2015

Most filesystems prevent truncation of an active swapfile by way of
inode_newsize_ok, called from inode_change_ok. NFS doesn't call either
from nfs_setattr, presumably because most of these checks are expected
to be done server-side. However, the IS_SWAPFILE check can only be done
client-side, and truncating a swapfile can't possibly be good.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

22 Jan, 2015 1 commit

NFS: Fix use of nfs_attr_use_mounted_on_fileid() · 2ef47eb1

Authored by Anna Schumaker on Dec 09, 2014

This function call was being optimized out during nfs_fhget(), leading
to situations where we have a valid fileid but still want to use the
mounted_on_fileid.  For example, imagine we have our server configured
like this:

server % df
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       9.1G  6.5G  1.9G  78% /
/dev/vdb1       487M  2.3M  456M   1% /exports
/dev/vdc1       487M  2.3M  456M   1% /exports/vol1
/dev/vdd1       487M  2.3M  456M   1% /exports/vol2

If our client mounts /exports and tries to do a "chown -R" across the
entire mountpoint, we will get a nasty message warning us about a circular
directory structure.  Running chown with strace tells me that each directory
has the same device and inode number:

newfstatat(AT_FDCWD, "/nfs/", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
newfstatat(4, "vol1", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
newfstatat(4, "vol2", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0

With this patch the mounted_on_fileid values are used for st_ino, so the
directory loop warning isn't reported.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

21 Jan, 2015 1 commit

fs: remove mapping->backing_dev_info · b83ae6d4

Authored by Christoph Hellwig on Jan 14, 2015

Now that we never use the backing_dev_info pointer in struct address_space
we can simply remove it and save 4 to 8 bytes in every inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>

26 Nov, 2014 1 commit

nfs: Add ALLOCATE support · f4ac1674

Authored by Anna Schumaker on Nov 25, 2014

This patch adds support for using the NFS v4.2 operation ALLOCATE to
preallocate data in a file.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

25 Nov, 2014 1 commit

NFS: fix subtle change in COMMIT behavior · cb1410c7

Authored by Weston Andros Adamson on Nov 12, 2014

Recent work in the pgio layer made it possible for there to be more than one
request per page. This caused a subtle change in commit behavior, because
write.c:nfs_commit_unstable_pages compares the number of *pages* waiting for
writeback against the number of requests on a commit list to choose when to
send a COMMIT in a non-blocking flush.

This is probably hard to hit in normal operation - you have to be using
rsize/wsize < PAGE_SIZE, or pnfs with lots of boundaries that are not page
aligned to have a noticeable change in behavior.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

13 Nov, 2014 1 commit

nfs: Fix use of uninitialized variable in nfs_getattr() · 16caf5b6

Authored by Jan Kara on Oct 23, 2014

Variable 'err' may be uninitialized when nfs_getattr() uses it to
check whether it should call generic_fillattr() or not. That can result
in spurious error returns. Initialize 'err' properly.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

01 Oct, 2014 1 commit

NFS: Implement SEEK · 1c6dcbe5

Authored by Anna Schumaker on Sep 26, 2014

The SEEK operation is used when an application makes an lseek call with
either the SEEK_HOLE or SEEK_DATA flags set. I fall back on
nfs_file_llseek() if the server does not have SEEK support.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

11 Sep, 2014 1 commit

nfs: setattr can only change regular file sizes · 08a899d5

Authored by Christoph Hellwig on Sep 07, 2014

The VFS never calls setattr with ATTR_SIZE on anything but regular
files.  Remove the if check and turn it into an assert similar to
what some other file systems do.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

05 Aug, 2014 1 commit

NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes · 65b38851

Authored by Eric W. Biederman on Jul 31, 2014

The usage of pid_ns->child_reaper->nsproxy->net_ns in
nfs_server_list_open and nfs_client_list_open is not safe.

/proc for a pid namespace can remain mounted after all of the
processes in that pid namespace have exited.  There are also times
before the initial process in a pid namespace has started or after the
initial process in a pid namespace has exited where
pid_ns->child_reaper can be NULL or stale.  This makes the idiom
pid_ns->child_reaper->nsproxy a double whammy of problems.

Luckily all that needs to happen is to move /proc/fs/nfsfs/servers and
/proc/fs/nfsfs/volumes under /proc/net to /proc/net/nfsfs/servers and
/proc/net/nfsfs/volumes and add a symlink from the original location,
and to use seq_open_net as it has been designed.

Cc: stable@vger.kernel.org
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

04 Aug, 2014 1 commit

NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU · 912a108d

Authored by NeilBrown on Jul 14, 2014

This requires nfs_check_verifier to take an rcu_walk flag, and requires
an rcu version of nfs_revalidate_inode which returns -ECHILD rather
than making an RPC call.

With this, nfs_lookup_revalidate can call nfs_neg_need_reval in
RCU-walk mode.

We can also move the LOOKUP_RCU check past the nfs_check_verifier()
call in nfs_lookup_revalidate.

If RCU_WALK prevents nfs_check_verifier or nfs_neg_need_reval from
doing a full check, they return a status indicating that a revalidation
is required.  As this revalidation will not be possible in RCU_WALK
mode, -ECHILD will ultimately be returned, which is the desired result.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
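
The RCU-walk behaviour described in this commit can be illustrated with a user-space model: when a full check would need a blocking RPC call, the RCU variant returns -ECHILD instead. `struct walk_inode` and `model_revalidate()` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of an inode's cached-attribute state. */
struct walk_inode {
	bool attrs_valid; /* are the cached attributes still trusted? */
	int rpc_calls;    /* number of simulated RPC calls issued */
};

/*
 * Revalidate the inode. In RCU-walk mode we must not block, so instead of
 * issuing an RPC call we return -ECHILD and leave the retry to the caller.
 */
static int model_revalidate(struct walk_inode *inode, bool rcu_walk)
{
	assert(inode);
	if (inode->attrs_valid)
		return 0;
	if (rcu_walk)
		return -ECHILD;
	inode->rpc_calls++; /* stands in for the blocking RPC call */
	inode->attrs_valid = true;
	return 0;
}
```

A caller that receives -ECHILD in this model would simply call `model_revalidate()` again with `rcu_walk` set to false, which is the point at which the simulated RPC call is made.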

16 Jul, 2014 2 commits

sched: Allow wait_on_bit_action() functions to support a timeout · c1221321

Authored by NeilBrown on Jul 07, 2014

It is currently not possible for various wait_on_bit functions
to implement a timeout.

While the "action" function that is called to do the waiting
could certainly use schedule_timeout(), there is no way to carry
forward the remaining timeout after a false wake-up.
As false wake-ups are clearly possible, at least due to possible
hash collisions in bit_waitqueue(), this is a real problem.

The 'action' function is currently passed a pointer to the word
containing the bit being waited on.  No current action functions
use this pointer.  So changing it to something else will be a
little noisy but will have no immediate effect.

This patch changes the 'action' function to take a pointer to
the "struct wait_bit_key", which contains a pointer to the word
containing the bit so nothing is really lost.

It also adds a 'private' field to "struct wait_bit_key", which
is initialized to zero.

An action function can now implement a timeout with something
like

static int timed_out_waiter(struct wait_bit_key *key)
{
	unsigned long waited;
	if (key->private == 0) {
		key->private = jiffies;
		if (key->private == 0)
			key->private -= 1;
	}
	waited = jiffies - key->private;
	if (waited > 10 * HZ)
		return -EAGAIN;
	schedule_timeout(10 * HZ - waited);
	return 0;
}

If any other need for context in a waiter were found it would be
easy to use ->private for some other purpose, or even extend
"struct wait_bit_key".

My particular need is to support timeouts in nfs_release_page()
to avoid deadlocks with loopback mounted NFS.

While wait_on_bit_timeout() would be a cleaner interface, it
will not meet my need.  I need the timeout to be sensitive to
the state of the connection with the server, which could change.
So I need to use an 'action' interface.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
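
The way a start time kept in ->private lets a waiter carry the remaining timeout across false wake-ups can be modeled in user space. This is a sketch under stated assumptions: the clock is passed in as a tick count, and `MODEL_HZ`, `struct model_wait_key` and `model_timed_waiter()` are hypothetical names rather than kernel interfaces.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_HZ 100UL                /* hypothetical ticks per second */
#define MODEL_TIMEOUT (10 * MODEL_HZ) /* total time we are willing to wait */

/* Hypothetical stand-in for the 'private' field of struct wait_bit_key. */
struct model_wait_key {
	unsigned long private; /* tick at which waiting began; 0 = not started */
};

/*
 * Decide what to do on one wake-up at time `now` (in ticks).
 * Returns -EAGAIN once the total timeout has expired; otherwise returns 0
 * and stores in *sleep_ticks how long the caller may still sleep.
 */
static int model_timed_waiter(struct model_wait_key *key, unsigned long now,
			      unsigned long *sleep_ticks)
{
	unsigned long waited;

	assert(key && sleep_ticks);
	if (key->private == 0) {
		key->private = now;
		if (key->private == 0)
			key->private -= 1; /* keep 0 reserved for "not started" */
	}
	waited = now - key->private;
	if (waited > MODEL_TIMEOUT)
		return -EAGAIN;
	*sleep_ticks = MODEL_TIMEOUT - waited;
	return 0;
}
```

If the first call happens at tick 5000 and a false wake-up arrives at tick 5400, the second call reports 600 ticks left rather than restarting the full 1000-tick timeout.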

sched: Remove proliferation of wait_on_bit() action functions · 74316201

Authored by NeilBrown on Jul 07, 2014

The current "wait_on_bit" interface requires an 'action'
function to be provided which does the actual waiting.
There are over 20 such functions, many of them identical.
Most cases can be satisfied by one of just two functions, one
which uses io_schedule() and one which just uses schedule().

So:
 Rename wait_on_bit and        wait_on_bit_lock to
        wait_on_bit_action and wait_on_bit_lock_action
 to make it explicit that they need an action function.

 Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
 which are *not* given an action function but implicitly use
 a standard one.
 The decision to error-out if a signal is pending is now made
 based on the 'mode' argument rather than being encoded in the action
 function.

 All instances of the old wait_on_bit and wait_on_bit_lock which
 can use the new version have been changed accordingly and their
 action functions have been discarded.
 wait_on_bit{,_lock} does not return any specific error code in the
 event of a signal so the caller must check for non-zero and
 substitute their own error code as appropriate.

The wait_on_bit() call in __fscache_wait_on_invalidate() was
ambiguous as it specified TASK_UNINTERRUPTIBLE but used
fscache_wait_bit_interruptible as an action function.
David Howells confirms this should be uniformly
"uninterruptible".

The main remaining user of wait_on_bit{,_lock}_action is NFS
which needs to use a freezer-aware schedule() call.

A comment in fs/gfs2/glock.c notes that having multiple 'action'
functions is useful as they display differently in the 'wchan'
field of 'ps' (and /proc/$PID/wchan).
As the new bit_wait{,_io} functions are tagged "__sched", they
will not show up at all, but something higher in the stack.  So
the distinction will still be visible, only with different
function names (gfs2_glock_wait versus gfs2_glock_dq_wait in the
gfs2/glock.c case).

Since the first version of this patch (against 3.15) two new action
functions have appeared, one in NFS and one in CIFS.  CIFS also now
uses an action function that makes the same freezer-aware
schedule call as NFS.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>