提交 ecf3d1f1 编写于 作者: J Jeff Layton 提交者: Al Viro

vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op

The following set of operations on a NFS client and server will cause

    server# mkdir a
    client# cd a
    server# mv a a.bak
    client# sleep 30  # (or whatever the dir attrcache timeout is)
    client# stat .
    stat: cannot stat `.': Stale NFS file handle

Obviously, we should not be getting an ESTALE error back there since the
inode still exists on the server. The problem is that the lookup code
will call d_revalidate on the dentry that "." refers to, because NFS has
FS_REVAL_DOT set.

nfs_lookup_revalidate will see that the parent directory has changed and
will try to reverify the dentry by redoing a LOOKUP. That of course
fails, so the lookup code returns ESTALE.

The problem here is that d_revalidate is really a bad fit for this case.
What we really want to know at this point is whether the inode is still
good or not, but we don't really care what name it goes by or whether
the dcache is still valid.

Add a new d_op->d_weak_revalidate operation and have complete_walk call
that instead of d_revalidate. The intent there is to allow for a
"weaker" d_revalidate that just checks to see whether the inode is still
good. This is also gives us an opportunity to kill off the FS_REVAL_DOT
special casing.

[AV: changed method name, added note in porting, fixed confusion re
having it possibly called from RCU mode (it won't be)]

Cc: NeilBrown <neilb@suse.de>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
上级 4f4a4fad
...@@ -10,6 +10,7 @@ be able to use diff(1). ...@@ -10,6 +10,7 @@ be able to use diff(1).
--------------------------- dentry_operations -------------------------- --------------------------- dentry_operations --------------------------
prototypes: prototypes:
int (*d_revalidate)(struct dentry *, unsigned int); int (*d_revalidate)(struct dentry *, unsigned int);
int (*d_weak_revalidate)(struct dentry *, unsigned int);
int (*d_hash)(const struct dentry *, const struct inode *, int (*d_hash)(const struct dentry *, const struct inode *,
struct qstr *); struct qstr *);
int (*d_compare)(const struct dentry *, const struct inode *, int (*d_compare)(const struct dentry *, const struct inode *,
...@@ -25,6 +26,7 @@ prototypes: ...@@ -25,6 +26,7 @@ prototypes:
locking rules: locking rules:
rename_lock ->d_lock may block rcu-walk rename_lock ->d_lock may block rcu-walk
d_revalidate: no no yes (ref-walk) maybe d_revalidate: no no yes (ref-walk) maybe
d_weak_revalidate:no no yes no
d_hash no no no maybe d_hash no no no maybe
d_compare: yes no no maybe d_compare: yes no no maybe
d_delete: no yes no no d_delete: no yes no no
......
...@@ -441,3 +441,7 @@ d_make_root() drops the reference to inode if dentry allocation fails. ...@@ -441,3 +441,7 @@ d_make_root() drops the reference to inode if dentry allocation fails.
two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that
local filesystems can ignore tha argument - they are guaranteed that the local filesystems can ignore tha argument - they are guaranteed that the
object doesn't exist. It's remote/distributed ones that might care... object doesn't exist. It's remote/distributed ones that might care...
--
[mandatory]
FS_REVAL_DOT is gone; if you used to have it, add ->d_weak_revalidate()
in your dentry operations instead.
...@@ -900,6 +900,7 @@ defined: ...@@ -900,6 +900,7 @@ defined:
struct dentry_operations { struct dentry_operations {
int (*d_revalidate)(struct dentry *, unsigned int); int (*d_revalidate)(struct dentry *, unsigned int);
int (*d_weak_revalidate)(struct dentry *, unsigned int);
int (*d_hash)(const struct dentry *, const struct inode *, int (*d_hash)(const struct dentry *, const struct inode *,
struct qstr *); struct qstr *);
int (*d_compare)(const struct dentry *, const struct inode *, int (*d_compare)(const struct dentry *, const struct inode *,
...@@ -915,8 +916,13 @@ struct dentry_operations { ...@@ -915,8 +916,13 @@ struct dentry_operations {
d_revalidate: called when the VFS needs to revalidate a dentry. This d_revalidate: called when the VFS needs to revalidate a dentry. This
is called whenever a name look-up finds a dentry in the is called whenever a name look-up finds a dentry in the
dcache. Most filesystems leave this as NULL, because all their dcache. Most local filesystems leave this as NULL, because all their
dentries in the dcache are valid dentries in the dcache are valid. Network filesystems are different
since things can change on the server without the client necessarily
being aware of it.
This function should return a positive value if the dentry is still
valid, and zero or a negative error code if it isn't.
d_revalidate may be called in rcu-walk mode (flags & LOOKUP_RCU). d_revalidate may be called in rcu-walk mode (flags & LOOKUP_RCU).
If in rcu-walk mode, the filesystem must revalidate the dentry without If in rcu-walk mode, the filesystem must revalidate the dentry without
...@@ -927,6 +933,20 @@ struct dentry_operations { ...@@ -927,6 +933,20 @@ struct dentry_operations {
If a situation is encountered that rcu-walk cannot handle, return If a situation is encountered that rcu-walk cannot handle, return
-ECHILD and it will be called again in ref-walk mode. -ECHILD and it will be called again in ref-walk mode.
d_weak_revalidate: called when the VFS needs to revalidate a "jumped" dentry.
This is called when a path-walk ends at dentry that was not acquired by
doing a lookup in the parent directory. This includes "/", "." and "..",
as well as procfs-style symlinks and mountpoint traversal.
In this case, we are less concerned with whether the dentry is still
fully correct, but rather that the inode is still valid. As with
d_revalidate, most local filesystems will set this to NULL since their
dcache entries are always valid.
This function has the same return code semantics as d_revalidate.
d_weak_revalidate is only called after leaving rcu-walk mode.
d_hash: called when the VFS adds a dentry to the hash table. The first d_hash: called when the VFS adds a dentry to the hash table. The first
dentry passed to d_hash is the parent directory that the name is dentry passed to d_hash is the parent directory that the name is
to be hashed into. The inode is the dentry's inode. to be hashed into. The inode is the dentry's inode.
......
...@@ -137,6 +137,7 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, unsigned int flags) ...@@ -137,6 +137,7 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
const struct dentry_operations v9fs_cached_dentry_operations = { const struct dentry_operations v9fs_cached_dentry_operations = {
.d_revalidate = v9fs_lookup_revalidate, .d_revalidate = v9fs_lookup_revalidate,
.d_weak_revalidate = v9fs_lookup_revalidate,
.d_delete = v9fs_cached_dentry_delete, .d_delete = v9fs_cached_dentry_delete,
.d_release = v9fs_dentry_release, .d_release = v9fs_dentry_release,
}; };
......
...@@ -363,5 +363,5 @@ struct file_system_type v9fs_fs_type = { ...@@ -363,5 +363,5 @@ struct file_system_type v9fs_fs_type = {
.mount = v9fs_mount, .mount = v9fs_mount,
.kill_sb = v9fs_kill_super, .kill_sb = v9fs_kill_super,
.owner = THIS_MODULE, .owner = THIS_MODULE,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT, .fs_flags = FS_RENAME_DOES_D_MOVE,
}; };
...@@ -1358,6 +1358,7 @@ void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op) ...@@ -1358,6 +1358,7 @@ void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op)
WARN_ON_ONCE(dentry->d_flags & (DCACHE_OP_HASH | WARN_ON_ONCE(dentry->d_flags & (DCACHE_OP_HASH |
DCACHE_OP_COMPARE | DCACHE_OP_COMPARE |
DCACHE_OP_REVALIDATE | DCACHE_OP_REVALIDATE |
DCACHE_OP_WEAK_REVALIDATE |
DCACHE_OP_DELETE )); DCACHE_OP_DELETE ));
dentry->d_op = op; dentry->d_op = op;
if (!op) if (!op)
...@@ -1368,6 +1369,8 @@ void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op) ...@@ -1368,6 +1369,8 @@ void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op)
dentry->d_flags |= DCACHE_OP_COMPARE; dentry->d_flags |= DCACHE_OP_COMPARE;
if (op->d_revalidate) if (op->d_revalidate)
dentry->d_flags |= DCACHE_OP_REVALIDATE; dentry->d_flags |= DCACHE_OP_REVALIDATE;
if (op->d_weak_revalidate)
dentry->d_flags |= DCACHE_OP_WEAK_REVALIDATE;
if (op->d_delete) if (op->d_delete)
dentry->d_flags |= DCACHE_OP_DELETE; dentry->d_flags |= DCACHE_OP_DELETE;
if (op->d_prune) if (op->d_prune)
......
...@@ -600,14 +600,10 @@ static int complete_walk(struct nameidata *nd) ...@@ -600,14 +600,10 @@ static int complete_walk(struct nameidata *nd)
if (likely(!(nd->flags & LOOKUP_JUMPED))) if (likely(!(nd->flags & LOOKUP_JUMPED)))
return 0; return 0;
if (likely(!(dentry->d_flags & DCACHE_OP_REVALIDATE))) if (likely(!(dentry->d_flags & DCACHE_OP_WEAK_REVALIDATE)))
return 0; return 0;
if (likely(!(dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT))) status = dentry->d_op->d_weak_revalidate(dentry, nd->flags);
return 0;
/* Note: we do not d_invalidate() */
status = d_revalidate(dentry, nd->flags);
if (status > 0) if (status > 0)
return 0; return 0;
......
...@@ -1135,6 +1135,45 @@ static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags) ...@@ -1135,6 +1135,45 @@ static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
return error; return error;
} }
/*
* A weaker form of d_revalidate for revalidating just the dentry->d_inode
* when we don't really care about the dentry name. This is called when a
* pathwalk ends on a dentry that was not found via a normal lookup in the
* parent dir (e.g.: ".", "..", procfs symlinks or mountpoint traversals).
*
* In this situation, we just want to verify that the inode itself is OK
* since the dentry might have changed on the server.
*/
static int nfs_weak_revalidate(struct dentry *dentry, unsigned int flags)
{
int error;
struct inode *inode = dentry->d_inode;
/*
* I believe we can only get a negative dentry here in the case of a
* procfs-style symlink. Just assume it's correct for now, but we may
* eventually need to do something more here.
*/
if (!inode) {
dfprintk(LOOKUPCACHE, "%s: %s/%s has negative inode\n",
__func__, dentry->d_parent->d_name.name,
dentry->d_name.name);
return 1;
}
if (is_bad_inode(inode)) {
dfprintk(LOOKUPCACHE, "%s: %s/%s has dud inode\n",
__func__, dentry->d_parent->d_name.name,
dentry->d_name.name);
return 0;
}
error = nfs_revalidate_inode(NFS_SERVER(inode), inode);
dfprintk(LOOKUPCACHE, "NFS: %s: inode %lu is %s\n",
__func__, inode->i_ino, error ? "invalid" : "valid");
return !error;
}
/* /*
* This is called from dput() when d_count is going to 0. * This is called from dput() when d_count is going to 0.
*/ */
...@@ -1202,6 +1241,7 @@ static void nfs_d_release(struct dentry *dentry) ...@@ -1202,6 +1241,7 @@ static void nfs_d_release(struct dentry *dentry)
const struct dentry_operations nfs_dentry_operations = { const struct dentry_operations nfs_dentry_operations = {
.d_revalidate = nfs_lookup_revalidate, .d_revalidate = nfs_lookup_revalidate,
.d_weak_revalidate = nfs_weak_revalidate,
.d_delete = nfs_dentry_delete, .d_delete = nfs_dentry_delete,
.d_iput = nfs_dentry_iput, .d_iput = nfs_dentry_iput,
.d_automount = nfs_d_automount, .d_automount = nfs_d_automount,
......
...@@ -28,7 +28,7 @@ static struct file_system_type nfs4_remote_fs_type = { ...@@ -28,7 +28,7 @@ static struct file_system_type nfs4_remote_fs_type = {
.name = "nfs4", .name = "nfs4",
.mount = nfs4_remote_mount, .mount = nfs4_remote_mount,
.kill_sb = nfs_kill_super, .kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, .fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
}; };
static struct file_system_type nfs4_remote_referral_fs_type = { static struct file_system_type nfs4_remote_referral_fs_type = {
...@@ -36,7 +36,7 @@ static struct file_system_type nfs4_remote_referral_fs_type = { ...@@ -36,7 +36,7 @@ static struct file_system_type nfs4_remote_referral_fs_type = {
.name = "nfs4", .name = "nfs4",
.mount = nfs4_remote_referral_mount, .mount = nfs4_remote_referral_mount,
.kill_sb = nfs_kill_super, .kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, .fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
}; };
struct file_system_type nfs4_referral_fs_type = { struct file_system_type nfs4_referral_fs_type = {
...@@ -44,7 +44,7 @@ struct file_system_type nfs4_referral_fs_type = { ...@@ -44,7 +44,7 @@ struct file_system_type nfs4_referral_fs_type = {
.name = "nfs4", .name = "nfs4",
.mount = nfs4_referral_mount, .mount = nfs4_referral_mount,
.kill_sb = nfs_kill_super, .kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, .fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
}; };
static const struct super_operations nfs4_sops = { static const struct super_operations nfs4_sops = {
......
...@@ -292,7 +292,7 @@ struct file_system_type nfs_fs_type = { ...@@ -292,7 +292,7 @@ struct file_system_type nfs_fs_type = {
.name = "nfs", .name = "nfs",
.mount = nfs_fs_mount, .mount = nfs_fs_mount,
.kill_sb = nfs_kill_super, .kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, .fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
}; };
EXPORT_SYMBOL_GPL(nfs_fs_type); EXPORT_SYMBOL_GPL(nfs_fs_type);
...@@ -301,7 +301,7 @@ struct file_system_type nfs_xdev_fs_type = { ...@@ -301,7 +301,7 @@ struct file_system_type nfs_xdev_fs_type = {
.name = "nfs", .name = "nfs",
.mount = nfs_xdev_mount, .mount = nfs_xdev_mount,
.kill_sb = nfs_kill_super, .kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, .fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
}; };
const struct super_operations nfs_sops = { const struct super_operations nfs_sops = {
...@@ -331,7 +331,7 @@ struct file_system_type nfs4_fs_type = { ...@@ -331,7 +331,7 @@ struct file_system_type nfs4_fs_type = {
.name = "nfs4", .name = "nfs4",
.mount = nfs_fs_mount, .mount = nfs_fs_mount,
.kill_sb = nfs_kill_super, .kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, .fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
}; };
EXPORT_SYMBOL_GPL(nfs4_fs_type); EXPORT_SYMBOL_GPL(nfs4_fs_type);
......
...@@ -145,6 +145,7 @@ enum dentry_d_lock_class ...@@ -145,6 +145,7 @@ enum dentry_d_lock_class
struct dentry_operations { struct dentry_operations {
int (*d_revalidate)(struct dentry *, unsigned int); int (*d_revalidate)(struct dentry *, unsigned int);
int (*d_weak_revalidate)(struct dentry *, unsigned int);
int (*d_hash)(const struct dentry *, const struct inode *, int (*d_hash)(const struct dentry *, const struct inode *,
struct qstr *); struct qstr *);
int (*d_compare)(const struct dentry *, const struct inode *, int (*d_compare)(const struct dentry *, const struct inode *,
...@@ -192,6 +193,8 @@ struct dentry_operations { ...@@ -192,6 +193,8 @@ struct dentry_operations {
#define DCACHE_GENOCIDE 0x0200 #define DCACHE_GENOCIDE 0x0200
#define DCACHE_SHRINK_LIST 0x0400 #define DCACHE_SHRINK_LIST 0x0400
#define DCACHE_OP_WEAK_REVALIDATE 0x0800
#define DCACHE_NFSFS_RENAMED 0x1000 #define DCACHE_NFSFS_RENAMED 0x1000
/* this dentry has been "silly renamed" and has to be deleted on the last /* this dentry has been "silly renamed" and has to be deleted on the last
* dput() */ * dput() */
......
...@@ -1807,7 +1807,6 @@ struct file_system_type { ...@@ -1807,7 +1807,6 @@ struct file_system_type {
#define FS_HAS_SUBTYPE 4 #define FS_HAS_SUBTYPE 4
#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ #define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */
#define FS_USERNS_DEV_MOUNT 16 /* A userns mount does not imply MNT_NODEV */ #define FS_USERNS_DEV_MOUNT 16 /* A userns mount does not imply MNT_NODEV */
#define FS_REVAL_DOT 16384 /* Check the paths ".", ".." for staleness */
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
struct dentry *(*mount) (struct file_system_type *, int, struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *); const char *, void *);
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册