Commits · 12a5b5294cb1896e9a3c9fca8ff5a7e3def4e8c6 · openeuler / Kernel

12 Aug, 2014 2 commits

Committed by Al Viro on Aug 10, 2014

Since 3.14 we had copy_tree() get the shadowing wrong - if we had one
vfsmount shadowing another (i.e. if A is a slave of B, C is mounted
on A/foo, then D got mounted on B/foo creating D' on A/foo shadowed
by C), copy_tree() of A would make a copy of D' shadow the copy of
C, not the other way around.

It's easy to fix, fortunately - just make sure that mount follows
the one that shadows it in mnt_child as well as in mnt_hash, and when
copy_tree() decides to attach a new mount, check if the last child
it has added to the same parent should be shadowing the new one.
And if it should, just use the same logics commit_tree() has - put the
new mount into the hash and children lists right after the one that
should shadow it.

Cc: stable@vger.kernel.org [3.14 and later]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

12a5b529

__generic_file_write_iter(): fix handling of sync error after DIO · 60bb4529

Committed by Al Viro on Aug 08, 2014

If DIO results in short write and sync write fails, we want to bugger off
whether the DIO part has written anything or not; the logics on the return
will take care of the right return value.

Cc: stable@vger.kernel.org [3.16]
Reported-by: Anton Altaparmakov <aia21@cam.ac.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

60bb4529

08 Aug, 2014 38 commits

switch iov_iter_get_pages() to passing maximal number of pages · c7f3888a

Committed by Al Viro on Jun 18, 2014

... instead of maximal size.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c7f3888a

fs: mark __d_obtain_alias static · 49c7dd28

Committed by Fengguang Wu on Jul 31, 2014

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

49c7dd28

dcache: d_splice_alias should detect loops · 95ad5c29

Committed by J. Bruce Fields on Mar 12, 2014

I believe this can only happen in the case of a corrupted filesystem.
So -EIO looks like the appropriate error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

95ad5c29

exportfs: update Exporting documentation · 96353895

Committed by J. Bruce Fields on Feb 18, 2014

Minor documentation updates:
	- refer to d_obtain_alias rather than d_alloc_anon
	- explain when to use d_splice_alias and when
	  d_materialise_unique.
	- cut some details of d_splice_alias/d_materialise_unique
	  implementation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

96353895

dcache: d_find_alias needn't recheck IS_ROOT && DCACHE_DISCONNECTED · 8d80d7da

Committed by J. Bruce Fields on Jan 16, 2014

If we get to this point and discover the dentry is not a root dentry, or
not DCACHE_DISCONNECTED--great, we always prefer that anyway.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8d80d7da

dcache: remove unused d_find_alias parameter · 52ed46f0

Committed by J. Bruce Fields on Jan 16, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

52ed46f0

dcache: d_obtain_alias callers don't all want DISCONNECTED · 1a0a397e

Committed by J. Bruce Fields on Feb 14, 2014

There are a few d_obtain_alias callers that are using it to get the
root of a filesystem which may already have an alias somewhere else.

This is not the same as the filehandle-lookup case, and none of them
actually need DCACHE_DISCONNECTED set.

It isn't really a serious problem, but it would really be clearer if we
reserved DCACHE_DISCONNECTED for those cases where it's actually needed.

In the btrfs case this was causing a spurious printk from
nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting
default subvol", and this replaces that workaround.

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1a0a397e

dcache: d_splice_alias should ignore DCACHE_DISCONNECTED · da093a9b

Committed by J. Bruce Fields on Feb 17, 2014

Any IS_ROOT() alias should be safe to use; there's nothing special about
DCACHE_DISCONNECTED dentries.

Note that this is in fact useful for filesystems such as btrfs which can
legitimately encounter a directory with a preexisting IS_ROOT alias on a
lookup that crosses into a subvolume.  (Those aliases are currently
marked DCACHE_DISCONNECTED--but not really for any good reason, and
we'll change that soon.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

da093a9b

dcache: d_splice_alias mustn't create directory aliases · 908790fa

Committed by J. Bruce Fields on Feb 17, 2014

Currently if d_splice_alias finds a directory with an alias that is not
IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory.

Duplicate directory dentries are unacceptable; it is better just to
error out.

(In the case of a local filesystem the most likely case is filesystem
corruption: for example, perhaps two directories point to the same child
directory, and the other parent has already been found and cached.)

Note that distributed filesystems may encounter this case in normal
operation if a remote host moves a directory to a location different
from the one we last cached in the dcache.  For that reason, such
filesystems should instead use d_materialise_unique, which tries to move
the old directory alias to the right place instead of erroring out.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

908790fa

dcache: close d_move race in d_splice_alias · 75a2352d

Committed by J. Bruce Fields on Feb 17, 2014

d_splice_alias will d_move an IS_ROOT() directory dentry into place if
one exists.  This should be safe as long as the dentry remains IS_ROOT,
but I can't see what guarantees that: once we drop the i_lock all we
hold here is the i_mutex on an unrelated parent directory.

Instead copy the logic of d_materialise_unique.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

75a2352d

dcache: move d_splice_alias · 3f70bd51

Committed by J. Bruce Fields on Feb 18, 2014

Just a trivial move to locate it near (similar) d_materialise_unique
code and save some forward references in a following patch.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3f70bd51

namei: trivial fix to vfs_rename_dir comment · d03b29a2

Committed by J. Bruce Fields on Feb 17, 2014

Looks like the directory loop check is actually done in renameat?
Whatever, leave this out rather than trying to keep it up to date with
the code.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d03b29a2

VFS: allow ->d_manage() to declare -EISDIR in rcu_walk mode. · b8faf035

Committed by NeilBrown on Aug 04, 2014

In REF-walk mode, ->d_manage can return -EISDIR to indicate
that the dentry is not really a mount trap (or even a mount point)
and that any mounts or any DCACHE_NEED_AUTOMOUNT flag should be
ignored.

RCU-walk mode doesn't currently support this, so if there is a dentry
with DCACHE_NEED_AUTOMOUNT set but which shouldn't be a mount-trap,
lookup_fast() will always drop in REF-walk mode.

With this patch, an -EISDIR from ->d_manage will always cause mounts
and automounts to be ignored, both in REF-walk and RCU-walk.
Bug-fixed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b8faf035

cifs: support RENAME_NOREPLACE · 7c33d597

Committed by Miklos Szeredi on Jul 23, 2014

This flag gives CIFS the ability to support its native rename semantics.

Implementation is simple: just bail out before trying to hack around the
noreplace semantics.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Steve French <smfrench@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7c33d597

hostfs: support rename flags · 9a423bb6

Committed by Miklos Szeredi on Jul 23, 2014

Support RENAME_NOREPLACE and RENAME_EXCHANGE flags on hostfs if the
underlying filesystem supports it.

Since renameat2(2) is not yet in any libc, use syscall(2) to invoke the
renameat2 syscall.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9a423bb6
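
The syscall(2) route mentioned in the hostfs commit can be sketched from userspace roughly as follows. This is only an illustration, not the hostfs code: the helper name `rename_with_flags` is hypothetical, the fallback flag definitions assume the values from the Linux UAPI headers, and the build is assumed to target Linux with `SYS_renameat2` available.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>       /* AT_FDCWD */
#include <sys/syscall.h> /* SYS_renameat2, if the headers know about it */
#include <unistd.h>      /* syscall() */

/* Fallbacks for headers that predate renameat2 (values as in the Linux UAPI). */
#ifndef RENAME_NOREPLACE
#define RENAME_NOREPLACE (1 << 0)
#endif
#ifndef RENAME_EXCHANGE
#define RENAME_EXCHANGE (1 << 1)
#endif

/*
 * Hypothetical helper: rename oldpath to newpath, both relative to the
 * current directory, passing flags through to renameat2.
 * Returns 0 on success or a negative errno value on failure.
 */
static int rename_with_flags(const char *oldpath, const char *newpath,
                             unsigned int flags)
{
#ifdef SYS_renameat2
    if (syscall(SYS_renameat2, AT_FDCWD, oldpath,
                AT_FDCWD, newpath, flags) == 0)
        return 0;
    return -errno;
#else
    /* Headers too old to know the syscall number. */
    (void)oldpath;
    (void)newpath;
    (void)flags;
    return -ENOSYS;
#endif
}
```

A caller would pass `RENAME_NOREPLACE` to refuse overwriting an existing target, or `RENAME_EXCHANGE` to swap two existing names; whether a given flag is honoured depends on the underlying filesystem.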

shmem: support RENAME_EXCHANGE · 37456771

Committed by Miklos Szeredi on Jul 23, 2014

This is really simple in tmpfs since the VFS already takes care of
shuffling the dentries.  Just adjust nlink on parent directories and touch
c & mtimes.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

37456771

shmem: support RENAME_NOREPLACE · 3b69ff51

Committed by Miklos Szeredi on Jul 23, 2014

Implement ->rename2 instead of ->rename.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3b69ff51

btrfs: add RENAME_NOREPLACE · 80ace85c

Committed by Miklos Szeredi on Jul 23, 2014

RENAME_NOREPLACE is trivial to implement for most filesystems: switch over
to ->rename2() and check for the supported flags.  The rest is done by the
VFS.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

80ace85c

bad_inode: add ->rename2() · a0dbc566

Committed by Miklos Szeredi on Jul 23, 2014

so we return -EIO instead of -EINVAL.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a0dbc566

fs: call rename2 if exists · 7177a9c4

Committed by Miklos Szeredi on Jul 23, 2014

Christoph Hellwig suggests:

1) make vfs_rename call ->rename2 if it exists instead of ->rename
2) switch all filesystems that you're adding NOREPLACE support for to
   use ->rename2
3) see how many ->rename instances we'll have left after a few
   iterations of 2.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7177a9c4
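
The dispatch rule in step 1, together with the per-filesystem flag check the RENAME_NOREPLACE commits describe, might look roughly like the userspace model below. Everything prefixed `demo_` is hypothetical and stands in for the real inode operations; the -EPERM and -EINVAL choices for the fallback paths are assumptions of this sketch, not taken from the commit text.

```c
#include <errno.h>
#include <stddef.h>

#ifndef RENAME_NOREPLACE
#define RENAME_NOREPLACE (1 << 0)
#endif

/* Hypothetical stand-in for the rename-related inode operations. */
struct demo_inode_ops {
    int (*rename)(const char *old_name, const char *new_name);
    int (*rename2)(const char *old_name, const char *new_name,
                   unsigned int flags);
};

/* Prefer ->rename2 when the filesystem provides it, else fall back. */
static int demo_vfs_rename(const struct demo_inode_ops *ops,
                           const char *old_name, const char *new_name,
                           unsigned int flags)
{
    if (!ops->rename && !ops->rename2)
        return -EPERM;   /* assumption: no rename support at all */
    if (ops->rename2)
        return ops->rename2(old_name, new_name, flags);
    if (flags)
        return -EINVAL;  /* assumption: legacy ->rename cannot take flags */
    return ops->rename(old_name, new_name);
}

/* A legacy filesystem: only ->rename, no flags. */
static int demo_legacy_rename(const char *old_name, const char *new_name)
{
    (void)old_name;
    (void)new_name;
    return 0;
}

/* A converted filesystem: ->rename2 that rejects flags it does not know. */
static int demo_rename2(const char *old_name, const char *new_name,
                        unsigned int flags)
{
    (void)old_name;
    (void)new_name;
    if (flags & ~RENAME_NOREPLACE)
        return -EINVAL;
    return 0;            /* the existence check itself is left to the VFS */
}

static const struct demo_inode_ops demo_legacy_ops = {
    .rename = demo_legacy_rename,
};
static const struct demo_inode_ops demo_converted_ops = {
    .rename2 = demo_rename2,
};
```

With this shape, converting a filesystem amounts to swapping its `->rename` for a `->rename2` that checks the flags it supports, which is the iteration step 2 describes.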

kernel/acct.c: fix coding style warnings and errors · 2577d92e

Committed by Ionut Alexa on Jul 31, 2014

Signed-off-by: Ionut Alexa <ionut.m.alexa@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2577d92e

death to mnt_pinned · 3064c356

Committed by Al Viro on Aug 07, 2014

Rather than playing silly buggers with vfsmount refcounts, just have
acct_on() ask fs/namespace.c for internal clone of file->f_path.mnt
and replace it with said clone.  Then attach the pin to original
vfsmount.  Voila - the clone will be alive until the file gets closed,
making sure that underlying superblock remains active, etc., and
we can drop the original vfsmount, so that it's not kept busy.
If the file lives until the final mntput of the original vfsmount,
we'll notice that there's an fs_pin (one in bsd_acct_struct that
holds that file) and mnt_pin_kill() will take it out.  Since
->kill() is synchronous, we won't proceed past that point until
these files are closed (and private clones of our vfsmount are
gone), so we get the same ordering warranties we used to get.

mnt_pin()/mnt_unpin()/->mnt_pinned is gone now, and good riddance -
it never became usable outside of kernel/acct.c (and racy wrt
umount even there).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3064c356

make fs/{namespace,super}.c forget about acct.h · 8fa1f1c2

Committed by Al Viro on May 21, 2014

These externs belong in fs/internal.h.  Rename (they are not acct-specific
anymore) and move them over there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8fa1f1c2

take fs_pin stuff to fs/* · efb170c2

Committed by Al Viro on Aug 07, 2014

Add a new field to fs_pin - kill(pin). That's what umount and r/o remount
will be calling for all pins attached to vfsmount and superblock resp.
Called after bumping the refcount, so it won't go away under us. Dropping
the refcount is responsibility of the instance. All generic stuff moved to
fs/fs_pin.c; the next step will rip all the knowledge of kernel/acct.c from
fs/super.c and fs/namespace.c. After that - death to mnt_pin(); it was
intended to be usable as generic mechanism for code that wants to attach
objects to vfsmount, so that they would not make the sucker busy and
would get killed on umount. Never got it right; it remained acct.c-specific
all along. Now it's very close to being killable.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

efb170c2

start carving bsd_acct_struct up · 1629d0eb

Committed by Al Viro on Aug 07, 2014

pull generic parts into struct fs_pin.  Eventually we want those
to replace mnt_pin()/mnt_unpin() mess; that stuff will move to
fs/*.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1629d0eb

acct: move mnt_pin() upwards. · 215748e6

Committed by Al Viro on Aug 07, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

215748e6

make acct_kill() wait for file closing. · 17c0a5aa

Committed by Al Viro on Aug 07, 2014

Do actual closing of file via schedule_work().  And use
__fput_sync() there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

17c0a5aa

drop ->s_umount around acct_auto_close() · 0aec09d0

Committed by Al Viro on Aug 07, 2014

just repeat the frozen check after regaining it, and check that sb
is still alive.  If several threads hit acct_auto_close() at the
same time, acct_auto_close() will survive that just fine.  And we
really don't want to play with writes and closing the file with
->s_umount held exclusive - it's a deadlock country.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0aec09d0

acct: get rid of acct_lock for acct->count · 2798d4ce

Committed by Al Viro on Aug 07, 2014

* make acct->count atomic and acct freeing - rcu-delayed.
* instead of grabbing acct_lock around the places where we take a reference,
do that under rcu_read_lock() with atomic_long_inc_not_zero().
* have the new acct locked before making ns->bacct point to it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2798d4ce
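
The "increment unless zero" reference grab named in this commit can be modelled in userspace with C11 atomics, as in the sketch below. It is not the kernel code: `struct demo_acct` and both helpers are hypothetical, and the RCU side (which keeps the memory valid while a reader attempts the grab) is not modelled at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object with an atomic reference count. */
struct demo_acct {
    atomic_long count;
};

/*
 * Take a reference only if the object is still live (count != 0).
 * Mirrors the idea behind atomic_long_inc_not_zero(): a count that has
 * already dropped to zero must never be resurrected.
 */
static bool demo_acct_get(struct demo_acct *acct)
{
    long old = atomic_load(&acct->count);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&acct->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool demo_acct_put(struct demo_acct *acct)
{
    return atomic_fetch_sub(&acct->count, 1) == 1;
}
```

In the kernel the failed grab is safe only because freeing is RCU-delayed, so a reader inside rcu_read_lock() can still look at the counter of an object that is on its way out.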

acct: get rid of acct_list · 215752fc

Committed by Al Viro on Aug 07, 2014

Put these suckers on per-vfsmount and per-superblock lists instead.
Note: right now it's still acct_lock for everything, but that's
going to change.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

215752fc

acct: simplify check_free_space() · 54a4d58a

Committed by Al Viro on Apr 19, 2014

a) file can't be NULL
b) file can't be changed under us
c) all writes are serialized by acct->lock; no need to mess with
spinlock there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

54a4d58a

acct: new lifetime rules · b8f00e6b

Committed by Al Viro on Aug 07, 2014

Do not reuse bsd_acct_struct after closing the damn thing.
Structure lifetime is controlled by refcount now.  We also
have a mutex in there, held over closing and writing (the
file is O_APPEND, so we are not losing any concurrency).

As the result, we do not need to bother with get_file()/fput()
on log write anymore.  Moreover, do_acct_process() only needs
acct itself; file and pidns are picked from it.

Killed instances are distinguished by having NULL ->ns.
Refcount is protected by acct_lock; anybody taking the
mutex needs to grab a reference first.

The things will get a lot simpler in the next commits - this
is just the minimal chunk switching to the new lifetime rules.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b8f00e6b

acct: serialize acct_on() · 9df7fa16

Committed by Al Viro on May 15, 2014

brute-force - on a global mutex that isn't nested into anything.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9df7fa16

acct() should honour the limits from the very beginning · 795a2f22

Committed by Al Viro on May 07, 2014

We need to check free space on the first write to freshly opened log.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

795a2f22

split the slow path in acct_process() off · e25ff11f

Committed by Al Viro on May 07, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e25ff11f

separate namespace-independent parts of filling acct_t · cdd37e23

Committed by Al Viro on Apr 26, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cdd37e23

acct: switch to __kernel_write() · ed44724b

Committed by Al Viro on Apr 19, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ed44724b

acct: encode_comp_t(0) is 0, fortunately... · ecfdb33d

Committed by Al Viro on Apr 19, 2014

There was an amusing bogosity in ac_rw calculation - it tried to
do encode_comp_t(encode_comp_t(0) / 1024).  Seeing that comp_t is
a 3-bit exponent + 13-bit mantissa... it's a good thing that 0 is
represented by all-bits-clear.

The history of that one is interesting - it was introduced in
2.1.68pre1, when acct.c had been reworked and moved to separate
file.  Two months later (2.1.86) somebody has noticed that the
sucker won't compile - there was no task_struct::io_usage.
At which point the ac_io calculation had changed from
encode_comp_t(current->io_usage) to encode_comp_t(0) and the
bug in the next line (absolutely real back then, had it ever
managed to compile) became a harmless bogosity.  Looks like
nobody has ever noticed until now.

Anyway, let's bury that idiocy now that it got noticed.  17 years
is long enough...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ecfdb33d
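
To see why encoding twice is bogus for anything but zero, here is a simplified model of the 3-bit exponent + 13-bit mantissa format described in this commit. It is a sketch under assumptions: the `demo_` names are hypothetical, the exponent is taken to be base 8 (one exponent step per 3-bit shift), and rounding and overflow handling are left out.

```c
#include <stdint.h>

#define MANTSIZE 13                        /* 13-bit mantissa */
#define EXPSIZE  3                         /* 3-bit exponent, base 8 (assumed) */
#define MAXFRACT ((1u << MANTSIZE) - 1)    /* largest mantissa value */

typedef uint16_t comp_t;

/*
 * Simplified encoder: shift the value right 3 bits at a time until it
 * fits in the mantissa, counting the shifts in the exponent.
 * Truncates instead of rounding; values needing more than 7 shifts
 * overflow the exponent field in this sketch.
 */
static comp_t demo_encode_comp_t(unsigned long value)
{
    unsigned int exp = 0;

    while (value > MAXFRACT) {
        value >>= EXPSIZE;
        exp++;
    }
    return (comp_t)((exp << MANTSIZE) | value);
}

/* Inverse of the encoder above (up to the truncated low bits). */
static unsigned long demo_decode_comp_t(comp_t c)
{
    unsigned long mantissa = c & MAXFRACT;
    unsigned int exp = c >> MANTSIZE;

    return mantissa << (EXPSIZE * exp);
}
```

Under this model zero encodes to all-bits-clear, so `encode(encode(0) / 1024)` is still 0. For a non-zero input such as 1 MiB, dividing the encoded bit pattern by 1024 and encoding again gives a value unrelated to the intended 1024 KiB, which is the latent bug the commit message describes.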

openeuler / Kernel synced successfully more than 1 year ago
