提交 · 8e6d782cab50884ba94324632700e6233a252f6a · openeuler / raspberrypi-kernel

09 11月, 2013 9 次提交

locks: break delegations on rename · 8e6d782c

由 J. Bruce Fields 提交于 9月 20, 2011

Cc: David Howells <dhowells@redhat.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8e6d782c

locks: helper functions for delegation breaking · 5a14696c

由 J. Bruce Fields 提交于 8月 28, 2012

We'll need the same logic for rename and link.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5a14696c

locks: break delegations on unlink · b21996e3

由 J. Bruce Fields 提交于 9月 20, 2011

We need to break delegations on any operation that changes the set of
links pointing to an inode.  Start with unlink.

Such operations also hold the i_mutex on a parent directory.  Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of a unresponsive NFS client.  To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:

	acquire locks
	...
	test for delegation; if found:
		take reference on inode
		release locks
		wait for delegation break
		drop reference on inode
		retry

It is possible this could never terminate.  (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.)  But this seems very unlikely.

The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack.  We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.

Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b21996e3

namei: minor vfs_unlink cleanup · 9accbb97

由 J. Bruce Fields 提交于 8月 28, 2012

We'll be using dentry->d_inode in one more place.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9accbb97

vfs: take i_mutex on renamed file · 6cedba89

由 J. Bruce Fields 提交于 3月 05, 2012

A read delegation is used by NFSv4 as a guarantee that a client can
perform local read opens without informing the server.

The open operation takes the last component of the pathname as an
argument, thus is also a lookup operation, and giving the client the
above guarantee means informing the client before we allow anything that
would change the set of names pointing to the inode.

Therefore, we need to break delegations on rename, link, and unlink.

We also need to prevent new delegations from being acquired while one of
these operations is in progress.

We could add some completely new locking for that purpose, but it's
simpler to use the i_mutex, since that's already taken by all the
operations we care about.

The single exception is rename.  So, modify rename to take the i_mutex
on the file that is being renamed.

Also fix up lockdep and Documentation/filesystems/directory-locking to
reflect the change.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

6cedba89

dcache: fix outdated DCACHE_NEED_LOOKUP comment · 13a2c3be

由 J. Bruce Fields 提交于 10月 23, 2013

The DCACHE_NEED_LOOKUP case referred to here was removed with
39e3c955 "vfs: remove
DCACHE_NEED_LOOKUP".

There are only four real_lookup() callers and all of them pass in an
unhashed dentry just returned from d_alloc.
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

13a2c3be

VFS: Put a small type field into struct dentry::d_flags · b18825a7

由 David Howells 提交于 9月 12, 2013

Put a type field into struct dentry::d_flags to indicate if the dentry is one
of the following types that relate particularly to pathwalk:

	Miss (negative dentry)
	Directory
	"Automount" directory (defective - no i_op->lookup())
	Symlink
	Other (regular, socket, fifo, device)

The type field is set to one of the first five types on a dentry by calls to
__d_instantiate() and d_obtain_alias() from information in the inode (if one is
given).

The type is cleared by dentry_unlink_inode() when it reconstitutes an existing
dentry as a negative dentry.

Accessors provided are:

	d_set_type(dentry, type)
	d_is_directory(dentry)
	d_is_autodir(dentry)
	d_is_symlink(dentry)
	d_is_file(dentry)
	d_is_negative(dentry)
	d_is_positive(dentry)

A bunch of checks in pathname resolution switched to those.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b18825a7

get rid of {lock,unlock}_rcu_walk() · 8b61e74f

由 Al Viro 提交于 11月 08, 2013

those have become aliases for rcu_read_{lock,unlock}()
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8b61e74f

RCU'd vfsmounts · 48a066e7

由 Al Viro 提交于 9月 29, 2013

* RCU-delayed freeing of vfsmounts
* vfsmount_lock replaced with a seqlock (mount_lock)
* sequence number from mount_lock is stored in nameidata->m_seq and
used when we exit RCU mode
* new vfsmount flag - MNT_SYNC_UMOUNT.  Set by umount_tree() when its
caller knows that vfsmount will have no surviving references.
* synchronize_rcu() done between unlocking namespace_sem in namespace_unlock()
and doing pending mntput().
* new helper: legitimize_mnt(mnt, seq).  Checks the mount_lock sequence
number against seq, then grabs reference to mnt.  Then it rechecks mount_lock
again to close the race and either returns success or drops the reference it
has acquired.  The subtle point is that in case of MNT_SYNC_UMOUNT we can
simply decrement the refcount and sod off - aforementioned synchronize_rcu()
makes sure that final mntput() won't come until we leave RCU mode.  We need
that, since we don't want to end up with some lazy pathwalk racing with
umount() and stealing the final mntput() from it - caller of umount() may
expect it to return only once the fs is shut down and we don't want to break
that.  In other cases (i.e. with MNT_SYNC_UMOUNT absent) we have to do
full-blown mntput() in case of mount_lock sequence number mismatch happening
just as we'd grabbed the reference, but in those cases we won't be stealing
the final mntput() from anything that would care.
* mntput_no_expire() doesn't lock anything on the fast path now.  Incidentally,
SMP and UP cases are handled the same way - no ifdefs there.
* normal pathname resolution does *not* do any writes to mount_lock.  It does,
of course, bump the refcounts of vfsmount and dentry in the very end, but that's
it.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

48a066e7

25 10月, 2013 1 次提交

split __lookup_mnt() in two functions · 474279dc

由 Al Viro 提交于 10月 01, 2013

Instead of passing the direction as argument (and checking it on every
step through the hash chain), just have separate __lookup_mnt() and
__lookup_mnt_last().  And use the standard iterators...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

474279dc

18 9月, 2013 1 次提交
- A
  atomic_open: take care of EEXIST in no-open case with O_CREAT|O_EXCL in fs/namei.c · 03da633a
  由 Al Viro 提交于 9月 16, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  03da633a
17 9月, 2013 1 次提交

vfs: don't set FILE_CREATED before calling ->atomic_open() · 116cc022

由 Miklos Szeredi 提交于 9月 16, 2013

If O_CREAT|O_EXCL are passed to open, then we know that either

 - the file is successfully created, or
 - the operation fails in some way.

So previously we set FILE_CREATED before calling ->atomic_open() so the
filesystem doesn't have to.  This, however, led to bugs in the
implementation that went unnoticed when the filesystem didn't check for
existence, yet returned success.  To prevent this kind of bug, require
filesystems to always explicitly set FILE_CREATED on O_CREAT|O_EXCL and
verify this in the VFS.

Also added a couple more verifications for the result of atomic_open():

 - Warn if filesystem set FILE_CREATED despite the lack of O_CREAT.
 - Warn if filesystem set FILE_CREATED but gave a negative dentry.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

116cc022

11 9月, 2013 7 次提交

D
Add missing unlocks to error paths of mountpoint_last. · da5338c7
由 Dave Jones 提交于 9月 10, 2013
```
Signed-off-by: NDave Jones <davej@fedoraproject.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
da5338c7
A
... and fold the renamed __vfs_follow_link() into its only caller · bcce56d5
由 Al Viro 提交于 9月 10, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
bcce56d5

fs: remove vfs_follow_link · aac34df1

由 Christoph Hellwig 提交于 9月 09, 2013

For a long time no filesystem has been using vfs_follow_link, and as seen
by recent filesystem submissions any new use is accidental as well.

Remove vfs_follow_link, document the replacement in
Documentation/filesystems/porting and also rename __vfs_follow_link
to match its only caller better.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

aac34df1

D
Add missing unlocks to error paths of mountpoint_last. · bcceeeba
由 Dave Jones 提交于 9月 10, 2013
```
Signed-off-by: NDave Jones <davej@fedoraproject.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
bcceeeba
A
... and fold the renamed __vfs_follow_link() into its only caller · 443ed254
由 Al Viro 提交于 9月 10, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
443ed254

fs: remove vfs_follow_link · 4aa32895

由 Christoph Hellwig 提交于 9月 09, 2013

For a long time no filesystem has been using vfs_follow_link, and as seen
by recent filesystem submissions any new use is accidental as well.

Remove vfs_follow_link, document the replacement in
Documentation/filesystems/porting and also rename __vfs_follow_link
to match its only caller better.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4aa32895

vfs: make sure we don't have a stale root path if unlazy_walk() fails · d0d27277

由 Linus Torvalds 提交于 9月 10, 2013

When I moved the RCU walk termination into unlazy_walk(), I didn't copy
quite all of it: for the successful RCU termination we properly add the
necessary reference counts to our temporary copy of the root path, but
for the failure case we need to make sure that any temporary root path
information is cleared out (since it does _not_ have the proper
reference counts from the RCU lookup).

We could clean up this mess by just always dropping the temporary root
information, but Al points out that that would mean that a single lookup
through symlinks could see multiple different root entries if it races
with another thread doing chroot.  Not that I think we should really
care (we had that before too, back before we had a copy of the root path
in the nameidata).

Al says he has a cunning plan.  In the meantime, this is the minimal fix
for the problem, even if it's not all that pretty.
Reported-by: NMace Moneta <moneta.mace@gmail.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d0d27277

09 9月, 2013 5 次提交

vfs: fix dentry RCU to refcounting possibly sleeping dput() · e5c832d5

由 Linus Torvalds 提交于 9月 08, 2013

This is the fix that the last two commits indirectly led up to - making
sure that we don't call dput() in a bad context on the dentries we've
looked up in RCU mode after the sequence count validation fails.

This basically expands d_rcu_to_refcount() into the callers, and then
fixes the callers to delay the dput() in the failure case until _after_
we've dropped all locks and are no longer in an RCU-locked region.

The case of 'complete_walk()' was trivial, since its failure case did
the unlock_rcu_walk() directly after the call to d_rcu_to_refcount(),
and as such that is just a pure expansion of the function with a trivial
movement of the resulting dput() to after 'unlock_rcu_walk()'.

In contrast, the unlazy_walk() case was much more complicated, because
not only does convert two different dentries from RCU to be reference
counted, but it used to not call unlock_rcu_walk() at all, and instead
just returned an error and let the caller clean everything up in
"terminate_walk()".

Happily, one of the dentries in question (called "parent" inside
unlazy_walk()) is the dentry of "nd->path", which terminate_walk() wants
a refcount to anyway for the non-RCU case.

So what the new and improved unlazy_walk() does is to first turn that
dentry into a refcounted one, and once that is set up, the error cases
can continue to use the terminate_walk() helper for cleanup, but for the
non-RCU case. Which makes it possible to drop out of RCU mode if we
actually hit the sequence number failure case.
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e5c832d5

A
introduce kern_path_mountpoint() · 2d864651
由 Al Viro 提交于 9月 08, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
2d864651

rename user_path_umountat() to user_path_mountpoint_at() · 197df04c

由 Al Viro 提交于 9月 08, 2013

... and move the extern from linux/namei.h to fs/internal.h,
along with that of vfs_path_lookup().
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

197df04c

A
take unlazy_walk() into umount_lookup_last() · 35759521
由 Al Viro 提交于 9月 08, 2013
```
... and massage it a bit to reduce nesting
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
35759521

vfs: use lockred "dead" flag to mark unrecoverably dead dentries · 0d98439e

由 Linus Torvalds 提交于 9月 08, 2013

This simplifies the RCU to refcounting code in particular.

I was originally intending to leave this for later, but walking through
all the dput() logic (see previous commit), I realized that the dput()
"might_sleep()" check was misleadingly weak. And I removed it as
misleading, both for performance profiling and for debugging.

However, the might_sleep() debugging case is actually true: the final
dput() can indeed sleep, if the inode of the dentry that you are
releasing ends up sleeping at iput time (see dentry_iput()). So the
problem with the might_sleep() in dput() wasn't that it wasn't true, it
was that it wasn't actually testing and triggering on the interesting
case.

In particular, just about *any* dput() can indeed sleep, if you happen
to race with another thread deleting the file in question, and you then
lose the race to the be the last dput() for that file. But because it's
a very rare race, the debugging code would never trigger it in practice.

Why is this problematic? The new d_rcu_to_refcount() (see commit
15570086: "vfs: reimplement d_rcu_to_refcount() using
lockref_get_or_lock()") does a dput() for the failure case, and it does
it under the RCU lock. So potentially sleeping really is a bug.

But there's no way I'm going to fix this with the previous complicated
"lockref_get_or_lock()" interface. And rather than revert to the old
and crufty nested dentry locking code (which did get this right by
delaying the reference count updates until they were verified to be
safe), let's make forward progress.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0d98439e

04 9月, 2013 1 次提交

vfs: allow umount to handle mountpoints without revalidating them · 8033426e

由 Jeff Layton 提交于 7月 26, 2013

Christopher reported a regression where he was unable to unmount a NFS
filesystem where the root had gone stale. The problem is that
d_revalidate handles the root of the filesystem differently from other
dentries, but d_weak_revalidate does not. We could simply fix this by
making d_weak_revalidate return success on IS_ROOT dentries, but there
are cases where we do want to revalidate the root of the fs.

A umount is really a special case. We generally aren't interested in
anything but the dentry and vfsmount that's attached at that point. If
the inode turns out to be stale we just don't care since the intent is
to stop using it anyway.

Try to handle this situation better by treating umount as a special
case in the lookup code. Have it resolve the parent using normal
means, and then do a lookup of the final dentry without revalidating
it. In most cases, the final lookup will come out of the dcache, but
the case where there's a trailing symlink or !LAST_NORM entry on the
end complicates things a bit.

Cc: Neil Brown <neilb@suse.de>
Reported-by: NChristopher T Vogan <cvogan@us.ibm.com>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8033426e

03 9月, 2013 1 次提交

vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock() · 15570086

由 Linus Torvalds 提交于 9月 02, 2013

This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c
and re-implements it using the lockref infrastructure instead.  It also
adds a lot of comments about what is actually going on, because turning
a dentry that was looked up using RCU into a long-lived reference
counted entry is one of the more subtle parts of the rcu walk.

We also used to be _particularly_ subtle in unlazy_walk() where we
re-validate both the dentry and its parent using the same sequence
count.  We used to do it by nesting the locks and then verifying the
sequence count just once.

That was silly, because nested locking is expensive, but the sequence
count check is not.  So this just re-validates the dentry and the parent
separately, avoiding the nested locking, and making the lockref lookup
possible.
Acked-by: NWaiman Long <waiman.long@hp.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

15570086

29 8月, 2013 2 次提交

vfs: make the dentry cache use the lockref infrastructure · 98474236

由 Waiman Long 提交于 8月 28, 2013

This just replaces the dentry count/lock combination with the lockref
structure that contains both a count and a spinlock, and does the
mechanical conversion to use the lockref infrastructure.

There are no semantic changes here, it's purely syntactic.  The
reference lockref implementation uses the spinlock exactly the same way
that the old dcache code did, and the bulk of this patch is just
expanding the internal "d_count" use in the dcache code to use
"d_lockref.count" instead.

This is purely preparation for the real change to make the reference
count updates be lockless during the 3.12 merge window.

[ As with the previous commit, this is a rewritten version of a concept
  originally from Waiman, so credit goes to him, blame for any errors
  goes to me.

  Waiman's patch had some semantic differences for taking advantage of
  the lockless update in dget_parent(), while this patch is
  intentionally a pure search-and-replace change with no semantic
  changes.     - Linus ]
Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

98474236

Revert "fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink" · f0cc6ffb

由 Linus Torvalds 提交于 8月 28, 2013

This reverts commit bb2314b4.

It wasn't necessarily wrong per se, but we're still busily discussing
the exact details of this all, so I'm going to revert it for now.

It's true that you can already do flink() through /proc and that flink()
isn't new.  But as Brad Spengler points out, some secure environments do
not mount proc, and flink adds a new interface that can avoid path
lookup of the source for those kinds of environments.

We may re-do this (and even mark it for stable backporting back in 3.11
and possibly earlier) once the whole discussion about the interface is done.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f0cc6ffb

05 8月, 2013 1 次提交

fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink · bb2314b4

由 Andy Lutomirski 提交于 8月 01, 2013

Every now and then someone proposes a new flink syscall, and this spawns
a long discussion of whether it would be a security problem.  I think
that this is missing the point: flink is *already* allowed without
privilege as long as /proc is mounted -- it's called AT_SYMLINK_FOLLOW.

Now that O_TMPFILE is here, the ability to create a file with O_TMPFILE,
write it, and link it in is very convenient.  The only problem is that
it requires that /proc be mounted so that you can do:

linkat(AT_FDCWD, "/proc/self/fd/<tmpfd>", dfd, path, AT_SYMLINK_NOFOLLOW)

This sucks -- it's much nicer to do:

linkat(tmpfd, "", dfd, path, AT_EMPTY_PATH)

Let's allow it.

If this turns out to be excessively scary, it we could instead require
that the inode in question be I_LINKABLE, but this seems pointless given
the /proc situation
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

bb2314b4

13 7月, 2013 1 次提交

Safer ABI for O_TMPFILE · bb458c64

由 Al Viro 提交于 7月 13, 2013

[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

bb458c64

29 6月, 2013 5 次提交

Don't pass inode to ->d_hash() and ->d_compare() · da53be12

由 Linus Torvalds 提交于 5月 21, 2013

Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry->d_sb).
A few cases that want more are actually safe with dentry->d_inode -
the only precaution needed is the check that it hadn't been replaced with
NULL by rmdir() or by overwriting rename(), which case should be simply
treated as cache miss.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

da53be12

allow the temp files created by open() to be linked to · f4e0c30c

由 Al Viro 提交于 6月 11, 2013

O_TMPFILE | O_CREAT => linkat() with AT_SYMLINK_FOLLOW and /proc/self/fd/<n>
as oldpath (i.e. flink()) will create a link
O_TMPFILE | O_CREAT | O_EXCL => ENOENT on attempt to link those guys
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

f4e0c30c

A
[O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now... · 60545d0d
由 Al Viro 提交于 6月 07, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
60545d0d
A
allow build_open_flags() to return an error · f9652e10
由 Al Viro 提交于 6月 11, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
f9652e10

do_last(): fix missing checks for LAST_BIND case · bc77daa7

由 Al Viro 提交于 6月 06, 2013

/proc/self/cwd with O_CREAT should fail with EISDIR.  /proc/self/exe, OTOH,
should fail with ENOTDIR when opened with O_DIRECTORY.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

bc77daa7

15 6月, 2013 1 次提交

use can_lookup() instead of direct checks of ->i_op->lookup · 05252901

由 Al Viro 提交于 6月 06, 2013

a couple of places got missed back when Linus has introduced that one...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

05252901

08 5月, 2013 1 次提交

audit: vfs: fix audit_inode call in O_CREAT case of do_last · 33e2208a

由 Jeff Layton 提交于 4月 12, 2013

Jiri reported a regression in auditing of open(..., O_CREAT) syscalls.
In older kernels, creating a file with open(..., O_CREAT) created
audit_name records that looked like this:

type=PATH msg=audit(1360255720.628:64): item=1 name="/abc/foo" inode=138810 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
type=PATH msg=audit(1360255720.628:64): item=0 name="/abc/" inode=138635 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0

...in recent kernels though, they look like this:

type=PATH msg=audit(1360255402.886:12574): item=2 name=(null) inode=264599 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
type=PATH msg=audit(1360255402.886:12574): item=1 name=(null) inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
type=PATH msg=audit(1360255402.886:12574): item=0 name="/abc/foo" inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0

Richard bisected to determine that the problems started with commit
bfcec708, but the log messages have changed with some later
audit-related patches.

The problem is that this audit_inode call is passing in the parent of
the dentry being opened, but audit_inode is being called with the parent
flag false. This causes later audit_inode and audit_inode_child calls to
match the wrong entry in the audit_names list.

This patch simply sets the flag to properly indicate that this inode
represents the parent. With this, the audit_names entries are back to
looking like they did before.

Cc: <stable@vger.kernel.org> # v3.7+
Reported-by: NJiri Jaburek <jjaburek@redhat.com>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Test By: Richard Guy Briggs <rbriggs@redhat.com>
Signed-off-by: NEric Paris <eparis@redhat.com>

33e2208a

09 3月, 2013 1 次提交

vfs: don't BUG_ON() if following a /proc fd pseudo-symlink results in a symlink · 7b54c165

由 Linus Torvalds 提交于 3月 08, 2013

It's "normal" - it can happen if the file descriptor you followed was
opened with O_NOFOLLOW.
Reported-by: NDave Jones <davej@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7b54c165

02 3月, 2013 1 次提交
- A
  constify path_get/path_put and fs_struct.c stuff · dcf787f3
  由 Al Viro 提交于 3月 01, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  dcf787f3
26 2月, 2013 1 次提交

vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op · ecf3d1f1

由 Jeff Layton 提交于 2月 20, 2013

The following set of operations on a NFS client and server will cause

    server# mkdir a
    client# cd a
    server# mv a a.bak
    client# sleep 30  # (or whatever the dir attrcache timeout is)
    client# stat .
    stat: cannot stat `.': Stale NFS file handle

Obviously, we should not be getting an ESTALE error back there since the
inode still exists on the server. The problem is that the lookup code
will call d_revalidate on the dentry that "." refers to, because NFS has
FS_REVAL_DOT set.

nfs_lookup_revalidate will see that the parent directory has changed and
will try to reverify the dentry by redoing a LOOKUP. That of course
fails, so the lookup code returns ESTALE.

The problem here is that d_revalidate is really a bad fit for this case.
What we really want to know at this point is whether the inode is still
good or not, but we don't really care what name it goes by or whether
the dcache is still valid.

Add a new d_op->d_weak_revalidate operation and have complete_walk call
that instead of d_revalidate. The intent there is to allow for a
"weaker" d_revalidate that just checks to see whether the inode is still
good. This is also gives us an opportunity to kill off the FS_REVAL_DOT
special casing.

[AV: changed method name, added note in porting, fixed confusion re
having it possibly called from RCU mode (it won't be)]

Cc: NeilBrown <neilb@suse.de>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ecf3d1f1