Commits · 4453641fe85f2ffda653e2e61b6a554dba1f0581 · openanolis / cloud-kernel

27 Sep, 2014 · 4 commits

__d_materialise_dentry(): flip the order of arguments · 4453641f

Committed by Al Viro on Sep 26, 2014

... thus making it much closer to (now unreachable, BTW) IS_ROOT(dentry)
case in __d_move().  A bit more and it'll fold in.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

__d_move(): fold manipulations with ->d_child/->d_subdirs · 9d8cd306

Committed by Al Viro on Sep 26, 2014

list_del() + list_add() is a slightly pessimised list_move()
list_del() + INIT_LIST_HEAD() is a slightly pessimised list_del_init()

Interleaving those makes the resulting code even worse. And harder to follow...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
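
The equivalences named in this commit message can be sketched in user space. The list below is a minimal re-implementation modeled on the kernel's `list_head`; it is an illustration under that assumption, not the kernel code, and it omits the pointer poisoning that the real `list_del()` performs (which is part of why the open-coded pairs are slightly pessimised).

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink only; the entry's own pointers are left stale in this sketch. */
static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

/* list_del() + INIT_LIST_HEAD() expressed as one operation. */
static void list_del_init(struct list_head *e)
{
    list_del(e);
    init_list_head(e);
}

/* list_del() + list_add() expressed as one operation. */
static void list_move(struct list_head *e, struct list_head *head)
{
    assert(e != head);  /* moving a head onto itself makes no sense */
    list_del(e);
    list_add(e, head);
}

static int list_empty(const struct list_head *h)
{
    return h->next == h;
}
```

Using the combined helpers keeps each entry's transition in one call, which is the readability point the commit makes about interleaved `->d_child`/`->d_subdirs` updates.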

don't open-code d_rehash() in d_materialise_unique() · 8527dd71

Committed by Al Viro on Sep 26, 2014

... and get rid of duplicate BUG_ON() there
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

pull rehashing and unlocking the target dentry into __d_materialise_dentry() · 5cc3821b

Committed by Al Viro on Sep 26, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

14 Sep, 2014 · 2 commits

move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon) · 6f18493e

Committed by Al Viro on Sep 11, 2014

and lock the right list there

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: fix bad hashing of dentries · 99d263d4

Committed by Linus Torvalds on Sep 13, 2014

Josef Bacik found a performance regression between 3.2 and 3.10 and
narrowed it down to commit bfcfaa77 ("vfs: use 'unsigned long'
accesses for dcache name comparison and hashing"). He reports:

 "The test case is essentially

      for (i = 0; i < 1000000; i++)
              mkdir("a$i");

  On xfs on a fio card this goes at about 20k dir/sec with 3.2, and 12k
  dir/sec with 3.10.  This is because we spend waaaaay more time in
  __d_lookup on 3.10 than in 3.2.

  The new hashing function for strings is suboptimal for <
  sizeof(unsigned long) string names (and hell even > sizeof(unsigned
  long) string names that I've tested).  I broke out the old hashing
  function and the new one into a userspace helper to get real numbers
  and this is what I'm getting:

      Old hash table had 1000000 entries, 0 dupes, 0 max dupes
      New hash table had 12628 entries, 987372 dupes, 900 max dupes
      We had 11400 buckets with a p50 of 30 dupes, p90 of 240 dupes, p99 of 567 dupes for the new hash

  My test does the hash, and then does the d_hash into an integer pointer
  array the same size as the dentry hash table on my system, and then
  just increments the value at the address we got to see how many
  entries we overlap with.

  As you can see the old hash function ended up with all 1 million
  entries in their own bucket, whereas the new one they are only
  distributed among ~12.5k buckets, which is why we're using so much
  more CPU in __d_lookup".

The reason for this hash regression is two-fold:

 - On 64-bit architectures the down-mixing of the original 64-bit
   word-at-a-time hash into the final 32-bit hash value is very
   simplistic and suboptimal, and just adds the two 32-bit parts
   together.

   In particular, because there is no bit shuffling and the mixing
   boundary is also a byte boundary, similar character patterns in the
   low and high word easily end up just canceling each other out.

 - the old byte-at-a-time hash mixed each byte into the final hash as it
   hashed the path component name, resulting in the low bits of the hash
   generally being a good source of hash data.  That is not true for the
   word-at-a-time case, and the hash data is distributed among all the
   bits.

The fix is the same in both cases: do a better job of mixing the bits up
and using as much of the hash data as possible.  We already have the
"hash_32|64()" functions to do that.
Reported-by: Josef Bacik <jbacik@fb.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
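
The down-mixing problem described above can be illustrated with a small user-space sketch. `fold_add()` adds the two 32-bit halves, as the message describes; `fold_mix()` is a generic multiplicative fold in the spirit of `hash_64()`. The function names and the multiplier are assumptions chosen for illustration and are not taken from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Naive fold: add the low and high 32-bit halves of the 64-bit hash. */
static uint32_t fold_add(uint64_t h)
{
    return (uint32_t)h + (uint32_t)(h >> 32);
}

/*
 * Assumed multiplier for this sketch: an odd 64-bit golden-ratio style
 * constant. The kernel's hash_64() at the time of this commit may have
 * used a different value.
 */
#define MIX_CONSTANT_64 UINT64_C(0x61C8864680B583EB)

/* Multiplicative fold: multiply, then keep the top `bits` bits. */
static uint32_t fold_mix(uint64_t h, unsigned int bits)
{
    assert(bits >= 1 && bits <= 32);
    return (uint32_t)((h * MIX_CONSTANT_64) >> (64 - bits));
}

/*
 * Build a 64-bit value whose halves cancel under fold_add():
 * high half = a, low half = -a (mod 2^32).
 */
static uint64_t cancelling_input(uint32_t a)
{
    return ((uint64_t)a << 32) | (uint32_t)(0u - a);
}
```

Every value produced by `cancelling_input()` folds to zero under plain addition, so distinct names with mirrored patterns in the two halves share one bucket. The multiplicative fold spreads carries across the whole word before truncating, which is why it separates such inputs.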

08 Aug, 2014 · 9 commits

fs: mark __d_obtain_alias static · 49c7dd28

Committed by Fengguang Wu on Jul 31, 2014

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: d_splice_alias should detect loops · 95ad5c29

Committed by J. Bruce Fields on Mar 12, 2014

I believe this can only happen in the case of a corrupted filesystem.
So -EIO looks like the appropriate error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: d_find_alias needn't recheck IS_ROOT && DCACHE_DISCONNECTED · 8d80d7da

Committed by J. Bruce Fields on Jan 16, 2014

If we get to this point and discover the dentry is not a root dentry, or
not DCACHE_DISCONNECTED--great, we always prefer that anyway.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: remove unused d_find_alias parameter · 52ed46f0

Committed by J. Bruce Fields on Jan 16, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: d_obtain_alias callers don't all want DISCONNECTED · 1a0a397e

Committed by J. Bruce Fields on Feb 14, 2014

There are a few d_obtain_alias callers that are using it to get the
root of a filesystem which may already have an alias somewhere else.

This is not the same as the filehandle-lookup case, and none of them
actually need DCACHE_DISCONNECTED set.

It isn't really a serious problem, but it would really be clearer if we
reserved DCACHE_DISCONNECTED for those cases where it's actually needed.

In the btrfs case this was causing a spurious printk from
nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting
default subvol", and this replaces that workaround.

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: d_splice_alias should ignore DCACHE_DISCONNECTED · da093a9b

Committed by J. Bruce Fields on Feb 17, 2014

Any IS_ROOT() alias should be safe to use; there's nothing special about
DCACHE_DISCONNECTED dentries.

Note that this is in fact useful for filesystems such as btrfs which can
legitimately encounter a directory with a preexisting IS_ROOT alias on a
lookup that crosses into a subvolume.  (Those aliases are currently
marked DCACHE_DISCONNECTED--but not really for any good reason, and
we'll change that soon.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: d_splice_alias mustn't create directory aliases · 908790fa

Committed by J. Bruce Fields on Feb 17, 2014

Currently if d_splice_alias finds a directory with an alias that is not
IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory.

Duplicate directory dentries are unacceptable; it is better just to
error out.

(In the case of a local filesystem the most likely case is filesystem
corruption: for example, perhaps two directories point to the same child
directory, and the other parent has already been found and cached.)

Note that distributed filesystems may encounter this case in normal
operation if a remote host moves a directory to a location different
from the one we last cached in the dcache.  For that reason, such
filesystems should instead use d_materialise_unique, which tries to move
the old directory alias to the right place instead of erroring out.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: close d_move race in d_splice_alias · 75a2352d

Committed by J. Bruce Fields on Feb 17, 2014

d_splice_alias will d_move an IS_ROOT() directory dentry into place if
one exists.  This should be safe as long as the dentry remains IS_ROOT,
but I can't see what guarantees that: once we drop the i_lock all we
hold here is the i_mutex on an unrelated parent directory.

Instead copy the logic of d_materialise_unique.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: move d_splice_alias · 3f70bd51

Committed by J. Bruce Fields on Feb 18, 2014

Just a trivial move to locate it near (similar) d_materialise_unique
code and save some forward references in a following patch.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

12 Jun, 2014 · 1 commit

lock_parent: don't step on stale ->d_parent of all-but-freed one · c2338f2d

Committed by Al Viro on Jun 12, 2014

Dentry that had been through (or into) __dentry_kill() might be seen
by shrink_dentry_list(); that's normal, it'll be taken off the shrink
list and freed if __dentry_kill() has already finished.  The problem
is, its ->d_parent might be pointing to already freed dentry, so
lock_parent() needs to be careful.

We need to check that dentry hasn't already gone into __dentry_kill()
*and* grab rcu_read_lock() before dropping ->d_lock - the latter makes
sure that whatever we see in ->d_parent after dropping ->d_lock it
won't be freed until we drop rcu_read_lock().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

07 Jun, 2014 · 1 commit

fs: convert use of typedef ctl_table to struct ctl_table · 1f7e0616

Committed by Joe Perches on Jun 06, 2014

This typedef is unnecessary and should just be removed.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

01 Jun, 2014 · 1 commit

dcache: add missing lockdep annotation · 9f12600f

Committed by Linus Torvalds on May 31, 2014

lock_parent() very much on purpose does nested locking of dentries, and
is careful to maintain the right order (lock parent first).  But because
it didn't annotate the nested locking order, lockdep thought it might be
a deadlock on d_lock, and complained.

Add the proper annotation for the inner locking of the child dentry to
make lockdep happy.

Introduced by commit 046b961b ("shrink_dentry_list(): take parent's
->d_lock earlier").
Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30 May, 2014 · 3 commits

dentry_kill() doesn't need the second argument now · 8cbf74da

Committed by Al Viro on May 29, 2014

it's 1 in the only remaining caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dealing with the rest of shrink_dentry_list() livelock · b2b80195

Committed by Al Viro on May 29, 2014

We have the same problem with ->d_lock order in the inner loop, where
we are dropping references to ancestors.  Same solution, basically -
instead of using dentry_kill() we use lock_parent() (introduced in the
previous commit) to get that lock in a safe way, recheck ->d_count
(in case if lock_parent() has ended up dropping and retaking ->d_lock
and somebody managed to grab a reference during that window), trylock
the inode->i_lock and use __dentry_kill() to do the rest.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

shrink_dentry_list(): take parent's ->d_lock earlier · 046b961b

Committed by Al Viro on May 29, 2014

The cause of livelocks there is that we are taking ->d_lock on
dentry and its parent in the wrong order, forcing us to use
trylock on the parent's one.  d_walk() takes them in the right
order, and unfortunately it's not hard to create a situation
when shrink_dentry_list() can't make progress since trylock
keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
keeps calling d_walk() disrupting the very shrink_dentry_list() it's
waiting for.

Solution is straightforward - if that trylock fails, let's unlock
the dentry itself and take locks in the right order.  We need to
stabilize ->d_parent without holding ->d_lock, but that's doable
using RCU.  And we'd better do that in the very beginning of the
loop in shrink_dentry_list(), since the checks on refcount, etc.
would need to be redone anyway.

That deals with a half of the problem - killing dentries on the
shrink list itself.  Another one (dropping their parents) is
in the next commit.

locking parent is interesting - it would be easy to do rcu_read_lock(),
lock whatever we think is a parent, lock dentry itself and check
if the parent is still the right one.  Except that we need to check
that *before* locking the dentry, or we are risking taking ->d_lock
out of order.  Fortunately, once the D1 is locked, we can check if
D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
can start or stop pointing to D1 only under D1->d_lock, so taking
D1->d_lock is enough.  In other words, the right solution is
rcu_read_lock/lock what looks like parent right now/check if it's
still our parent/rcu_read_unlock/lock the child.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
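
The lock-ordering recipe in this message (try the parent, and on failure drop the child, lock the parent, recheck, then relock the child) can be sketched in user space. Everything here is hypothetical scaffolding: `struct node`, the `atomic_flag` spinlock and `lock_parent_sketch()` stand in for the dentry, `->d_lock` and `lock_parent()`. The sketch assumes nodes are never freed, so it leaves out the `rcu_read_lock()` step that the kernel needs to keep `->d_parent` safe to dereference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    atomic_flag lock;               /* stand-in for ->d_lock */
    _Atomic(struct node *) parent;  /* stand-in for ->d_parent; self for a root */
};

static void node_init(struct node *n, struct node *parent)
{
    atomic_flag_clear(&n->lock);
    atomic_init(&n->parent, parent ? parent : n);
}

static void node_lock(struct node *n)
{
    while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static bool node_trylock(struct node *n)
{
    return !atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire);
}

static void node_unlock(struct node *n)
{
    atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/*
 * Called with child locked. Returns the locked parent with child still
 * locked, or NULL (child still locked) if child is a root.
 */
static struct node *lock_parent_sketch(struct node *child)
{
    struct node *parent;

    assert(child != NULL);
    parent = atomic_load(&child->parent);
    if (parent == child)
        return NULL;
    if (node_trylock(parent))
        return parent;          /* fast path: no reordering needed */

    node_unlock(child);         /* slow path: back off first */
    for (;;) {
        parent = atomic_load(&child->parent);
        node_lock(parent);      /* take the parent's lock first */
        if (parent == atomic_load(&child->parent))
            break;              /* still our parent, stable while held */
        node_unlock(parent);    /* child was moved meanwhile, retry */
    }
    if (parent == child)
        return NULL;            /* became a root; its own lock is held */
    node_lock(child);
    return parent;
}
```

Because the child's lock is dropped on the slow path, a caller has to recheck anything it learned about the child before the call, which matches the message's remark that the refcount checks would need to be redone anyway.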

29 May, 2014 · 2 commits

expand dentry_kill(dentry, 0) in shrink_dentry_list() · ff2fde99

Committed by Al Viro on May 28, 2014

Result will be massaged to saner shape in the next commits.  It is
ugly, no questions - the point of that one is to be a provably
equivalent transformation (and it might be worth splitting a bit
more).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

split dentry_kill() · e55fd011

Committed by Al Viro on May 28, 2014

... into trylocks and everything else.  The latter (actual killing)
is __dentry_kill().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

28 May, 2014 · 1 commit

lift the "already marked killed" case into shrink_dentry_list() · 64fd72e0

Committed by Al Viro on May 28, 2014

It can happen only when dentry_kill() is called with unlock_on_failure
equal to 0 - other callers had dentry pinned until the moment they've
got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead().

IOW, only one of three call sites of dentry_kill() might end up reaching
that code. Just move it there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

04 May, 2014 · 3 commits

dcache: don't need rcu in shrink_dentry_list() · 60942f2f

Committed by Miklos Szeredi on May 02, 2014

Since now the shrink list is private and nobody can free the dentry while
it is on the shrink list, we can remove RCU protection from this.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

more graceful recovery in umount_collect() · 9c8c10e2

Committed by Al Viro on May 02, 2014

Start with shrink_dcache_parent(), then scan what remains.

First of all, BUG() is very much an overkill here; we are holding
->s_umount, and hitting BUG() means that a lot of interesting stuff
will be hanging after that point (sync(2), for example).  Moreover,
in cases when there had been more than one leak, we'll be better
off reporting all of them.  And more than just the last component
of pathname - %pd is there for just such uses...

That was the last user of dentry_lru_del(), so kill it off...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

don't remove from shrink list in select_collect() · fe91522a

Committed by Al Viro on May 03, 2014

If we find something already on a shrink list, just increment
data->found and do nothing else.  Loops in shrink_dcache_parent() and
check_submounts_and_drop() will do the right thing - everything we
did put into our list will be evicted and if there had been nothing,
but data->found got non-zero, well, we have somebody else shrinking
those guys; just try again.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

01 May, 2014 · 5 commits

dentry_kill(): don't try to remove from shrink list · 41edf278

Committed by Al Viro on May 01, 2014

If the victim is on the shrink list, don't remove it from there.
If shrink_dentry_list() manages to remove it from the list before
we are done - fine, we'll just free it as usual.  If not - mark
it with new flag (DCACHE_MAY_FREE) and leave it there.

Eventually, shrink_dentry_list() will get to it, remove the sucker
from shrink list and call dentry_kill(dentry, 0).  Which is where
we'll deal with freeing.

Since now dentry_kill(dentry, 0) may happen after or during
dentry_kill(dentry, 1), we need to recognize that (by seeing
DCACHE_DENTRY_KILLED already set), unlock everything
and either free the sucker (in case DCACHE_MAY_FREE has been
set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

expand the call of dentry_lru_del() in dentry_kill() · 01b60351

Committed by Al Viro on Apr 29, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

new helper: dentry_free() · b4f0354e

Committed by Al Viro on Apr 29, 2014

The part of old d_free() that dealt with actual freeing of dentry.
Taken out of dentry_kill() into a separate function.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fold try_prune_one_dentry() · 5c47e6d0

Committed by Al Viro on Apr 29, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fold d_kill() and d_free() · 03b3b889

Committed by Al Viro on Apr 29, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

20 Apr, 2014 · 1 commit

fix races between __d_instantiate() and checks of dentry flags · 22213318

Committed by Al Viro on Apr 19, 2014

in non-lazy walk we need to be careful about dentry switching from
negative to positive - both ->d_flags and ->d_inode are updated,
and in some places we might see only one store.  The cases where
dentry has been obtained by dcache lookup with ->i_mutex held on
parent are safe - ->d_lock and ->i_mutex provide all the barriers
we need.  However, there are several places where we run into
trouble:
	* do_last() fetches ->d_inode, then checks ->d_flags and
assumes that inode won't be NULL unless d_is_negative() is true.
Race with e.g. creat() - we might have fetched the old value of
->d_inode (still NULL) and new value of ->d_flags (already not
DCACHE_MISS_TYPE).  Lin Ming has observed and reported the resulting
oops.
	* a bunch of places checks ->d_inode for being non-NULL,
then checks ->d_flags for "is it a symlink".  Race with symlink(2)
in case if our CPU sees ->d_inode update first - we see non-NULL
there, but ->d_flags still contains DCACHE_MISS_TYPE instead of
DCACHE_SYMLINK_TYPE.  Result: false negative on "should we follow
link here?", with subsequent unpleasantness.

Cc: stable@vger.kernel.org # 3.13 and 3.14 need that one
Reported-and-tested-by: Lin Ming <minggr@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

01 Apr, 2014 · 1 commit

vfs: add cross-rename · da1ce067

Committed by Miklos Szeredi on Apr 01, 2014

If flags contain RENAME_EXCHANGE then exchange source and destination files.
There's no restriction on the type of the files; e.g. a directory can be
exchanged with a symlink.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>

23 Mar, 2014 · 1 commit

make prepend_name() work correctly when called with negative *buflen · e825196d

Committed by Al Viro on Mar 23, 2014

In all callchains leading to prepend_name(), the value left in *buflen
is eventually discarded unused if prepend_name() has returned a negative.
So we are free to do what prepend() does, and subtract from *buflen
*before* checking for underflow (which turns into checking the sign
of subtraction result, of course).

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
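
The "subtract first, then check the sign" pattern mentioned here can be sketched as follows. `prepend_sketch()` is a user-space illustration modeled on the description of `prepend()`; its name and exact signature are assumptions. It relies on the caller discarding `*buflen` after an error, as the message says all call chains do.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Copy len bytes of str immediately in front of *buffer and move *buffer
 * back by len. *buflen is decremented before the underflow check, so a
 * *buflen that is already negative on entry still yields an error instead
 * of slipping past an unsigned-style "is there room" comparison.
 */
static int prepend_sketch(char **buffer, int *buflen, const char *str, int len)
{
    assert(len >= 0);
    *buflen -= len;
    if (*buflen < 0)
        return -ENAMETOOLONG;   /* *buffer is left untouched on failure */
    *buffer -= len;
    memcpy(*buffer, str, (size_t)len);
    return 0;
}
```

After a failure `*buflen` no longer describes the free space, which is acceptable only because callers treat the negative return as final.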

16 Mar, 2014 · 1 commit

drm: add pseudo filesystem for shared inodes · 31bbe16f

Committed by David Herrmann on Jan 03, 2014

Our current DRM design uses a single address_space for all users of the
same DRM device. However, there is no way to create an anonymous
address_space without an underlying inode. Therefore, we wait for the
first ->open() callback on a registered char-dev and take-over the inode
of the char-dev. This worked well so far, but has several drawbacks:
 - We screw with FS internals and rely on some non-obvious invariants like
   inode->i_mapping being the same as inode->i_data for char-devs.
 - We don't have any address_space prior to the first ->open() from
   user-space. This leads to ugly fallback code and we cannot allocate
   global objects early.

As pointed out by Al Viro, fs/anon_inode.c is *not* supposed to be used by
drivers for anonymous inode-allocation. Therefore, this patch follows the
proposed alternative solution and adds a pseudo filesystem mount-point to
DRM. We can then allocate private inodes including a private address_space
for each DRM device at initialization time.

Note that we could use:
  sysfs_get_inode(sysfs_mnt->mnt_sb, drm_device->dev->kobj.sd);
to get access to the underlying sysfs-inode of a "struct device" object.
However, most of this information is currently hidden and it's not clear
whether this address_space is suitable for driver access. Thus, unless
Linux allows anonymous address_space objects or driver-core provides a
public inode per device, we're left with our own private internal mount
point.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>

27 Jan, 2014 · 1 commit

__dentry_path() fixes · f6500801

Committed by Al Viro on Jan 26, 2014

* we need to save the starting point for restarts
* reject pathologically short buffers outright
Spotted-by: Denys Vlasenko <dvlasenk@redhat.com>
Spotted-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

26 Jan, 2014 · 1 commit

vfs: Remove second variable named error in __dentry_path · a8323da0

Committed by Eric W. Biederman on Jan 20, 2014

In commit  232d2d60
Author: Waiman Long <Waiman.Long@hp.com>
Date:   Mon Sep 9 12:18:13 2013 -0400

    dcache: Translating dentry into pathname without taking rename_lock

The __dentry_path locking was changed and the variable error was
intended to be moved outside of the loop.  Unfortunately the inner
declaration of error was not removed, resulting in a version of
__dentry_path that will never return an error.

Remove the problematic inner declaration of error and allow
__dentry_path to return errors once again.

Cc: stable@vger.kernel.org
Cc: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
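
The class of bug fixed here, an inner declaration shadowing the variable that the function returns, can be reproduced in a few lines. The functions below are hypothetical and only mirror the shape of the problem; they are not the `__dentry_path` code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical step: fails on one chosen attempt, succeeds otherwise. */
static int step(int attempt, int failing_attempt)
{
    return attempt == failing_attempt ? -ENAMETOOLONG : 0;
}

/* Buggy shape: the inner declaration shadows the outer `error`. */
static int walk_shadowed(int attempts, int failing_attempt)
{
    int error = 0;

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++) {
        int error = step(i, failing_attempt);  /* shadows the outer one */
        if (error)
            break;
    }
    return error;  /* the outer variable, which was never assigned */
}

/* Fixed shape: assign to the single outer variable. */
static int walk_fixed(int attempts, int failing_attempt)
{
    int error = 0;

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++) {
        error = step(i, failing_attempt);
        if (error)
            break;
    }
    return error;
}
```

Building with a shadow warning enabled (for example `-Wshadow` in GCC or Clang) flags the inner declaration in `walk_shadowed()`.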

13 Dec, 2013 · 1 commit

dcache: allow word-at-a-time name hashing with big-endian CPUs · a5c21dce

Committed by Will Deacon on Dec 12, 2013

When explicitly hashing the end of a string with the word-at-a-time
interface, we have to be careful which end of the word we pick up.

On big-endian CPUs, the upper-bits will contain the data we're after, so
ensure we generate our masks accordingly (and avoid hashing whatever
random junk may have been sitting after the string).

This patch adds a new dcache helper, bytemask_from_count, which creates
a mask appropriate for the CPU endianness.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
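
A rough user-space sketch of an endian-aware byte mask follows. It takes the endianness as a run-time parameter so both cases can be exercised on one machine; the name `bytemask_from_count_sketch()`, the 64-bit word size and that parameter are assumptions made for illustration, and the kernel helper may be defined differently (for instance as compile-time macros).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return a mask covering the cnt bytes of a loaded word that hold string
 * data. On little-endian the first bytes in memory are the low-order
 * bytes of the word; on big-endian they are the high-order bytes.
 */
static uint64_t bytemask_from_count_sketch(unsigned int cnt, bool big_endian)
{
    /* Shifting by the full word width would be undefined behaviour. */
    assert(cnt < sizeof(uint64_t));
    if (big_endian)
        return ~(~UINT64_C(0) >> (cnt * 8));  /* keep the top cnt bytes */
    return ~(~UINT64_C(0) << (cnt * 8));      /* keep the low cnt bytes */
}
```

Masking the final partial word this way keeps whatever bytes follow the string out of the hash on either byte order.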

27 Nov, 2013 · 1 commit

vfs: In d_path don't call d_dname on a mount point · f48cfddc

Committed by Eric W. Biederman on Nov 08, 2013

Aditya Kali (adityakali@google.com) wrote:
> Commit bf056bfa:
> "proc: Fix the namespace inode permission checks." converted
> the namespace files into symlinks. The same commit changed
> the way namespace bind mounts appear in /proc/mounts:
>   $ mount --bind /proc/self/ns/ipc /mnt/ipc
> Originally:
>   $ cat /proc/mounts | grep ipc
>   proc /mnt/ipc proc rw,nosuid,nodev,noexec 0 0
>
> After commit bf056bfa:
>   $ cat /proc/mounts | grep ipc
>   proc ipc:[4026531839] proc rw,nosuid,nodev,noexec 0 0
>
> This breaks userspace which expects the 2nd field in
> /proc/mounts to be a valid path.

The symlink /proc/<pid>/ns/{ipc,mnt,net,pid,user,uts} point to
dentries allocated with d_alloc_pseudo that we can mount, and
that have interesting names printed out with d_dname.

When these files are bind mounted /proc/mounts is not currently
displaying the mount point correctly because d_dname is called instead
of just displaying the path where the file is mounted.

Solve this by adding an explicit check to distinguish mounted pseudo
inodes and unmounted pseudo inodes.  Unmounted pseudo inodes always
use the mount of their filesystem as the mnt_root in their path, making
these two cases easy to distinguish.

CC: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

openanolis / cloud-kernel · synced successfully 10 months ago