Commits · 0f1db7dee200127da4c07928189748918c312031 · openeuler / Kernel

05 Jul, 2015 7 commits

9p: cope with bogus responses from server in p9_client_{read,write} · 0f1db7de

Authored by Al Viro on Jul 04, 2015

if server claims to have written/read more than we'd told it to,
warn and cap the claimed byte count to avoid advancing more than
we are ready to.

0f1db7de
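
The capping described in this commit can be illustrated with a minimal userspace sketch. It is not the code from the patch: `cap_reply_count()` is a hypothetical helper name, and the warning goes to stderr here only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): if the peer claims to have
 * transferred more bytes than were requested, warn and cap the count so
 * the caller never advances further than it is prepared to.
 */
static size_t cap_reply_count(size_t requested, size_t claimed)
{
	if (claimed > requested) {
		fprintf(stderr, "bogus reply count (%zu > %zu)\n",
			claimed, requested);
		claimed = requested;
	}
	return claimed;
}
```

A caller would advance its buffer position by the returned value rather than by the raw count taken from the response.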

p9_client_write(): avoid double p9_free_req() · 67e808fb

Authored by Al Viro on Jul 04, 2015

Braino in "9p: switch p9_client_write() to passing it struct iov_iter *";
if response is impossible to parse and we discard the request, get
out of the loop right there.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

67e808fb

9p: forgetting to cancel request on interrupted zero-copy RPC · a84b69cb

Authored by Al Viro on Jul 04, 2015

If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.

Cc: stable@vger.kernel.org # v3.2 and later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a84b69cb

dax: bdev_direct_access() may sleep · 43c3dd08

Authored by Matthew Wilcox on Jul 03, 2015

The brd driver is the only in-tree driver that may sleep currently.
After some discussion on linux-fsdevel, we decided that any driver
may choose to sleep in its ->direct_access method.  To ensure that all
callers of bdev_direct_access() are prepared for this, add a call
to might_sleep().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

43c3dd08

block: Add support for DAX reads/writes to block devices · bbab37dd

Authored by Matthew Wilcox on Jul 03, 2015

If a block device supports the ->direct_access methods, bypass the normal
DIO path and use DAX to go straight to memcpy() instead of allocating
a DIO and a BIO.

Includes support for the DIO_SKIP_DIO_COUNT flag in DAX, as is done in
do_blockdev_direct_IO().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bbab37dd

dax: Use copy_from_iter_nocache · 872eb127

Authored by Matthew Wilcox on Jul 03, 2015

When userspace does a write, there's no need for the written data to
pollute the CPU cache.  This matches the original XIP code.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

872eb127

dax: Add block size note to documentation · 44f4c054

Authored by Matthew Wilcox on Jul 03, 2015

For block devices which are small enough, mkfs will default to creating
a filesystem with block sizes smaller than page size.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

44f4c054

01 Jul, 2015 4 commits

fs/file.c: __fget() and dup2() atomicity rules · 5ba97d28

Authored by Eric Dumazet on Jun 29, 2015

__fget() does lockless fetch of pointer from the descriptor
table, attempts to grab a reference and treats "it was already
zero" as "it's already gone from the table, we just hadn't
seen the store, let's fail".  Unfortunately, that breaks the
atomicity of dup2() - __fget() might see the old pointer,
notice that it's been already dropped and treat that as
"it's closed".  What we should be getting is either the
old file or new one, depending whether we come before or after
dup2().

Dmitry had the following test failing sometimes:

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int fd;
void *Thread(void *x) {
  char buf;
  int n = read(fd, &buf, 1);
  if (n != 1)
    exit(printf("read failed: n=%d errno=%d\n", n, errno));
  return 0;
}

int main()
{
  fd = open("/dev/urandom", O_RDONLY);
  int fd2 = open("/dev/urandom", O_RDONLY);
  if (fd == -1 || fd2 == -1)
    exit(printf("open failed\n"));
  pthread_t th;
  pthread_create(&th, 0, Thread, 0);
  if (dup2(fd2, fd) == -1)
    exit(printf("dup2 failed\n"));
  pthread_join(th, 0);
  if (close(fd) == -1)
    exit(printf("close failed\n"));
  if (close(fd2) == -1)
    exit(printf("close failed\n"));
  printf("DONE\n");
  return 0;
}

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

5ba97d28
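
The retry idea behind this fix can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions, not the kernel code: `struct file_sketch`, `get_unless_zero()` and `fget_sketch()` are hypothetical names, and the sketch assumes objects are never freed while a lookup is in flight (the kernel relies on RCU for that guarantee).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted file object. */
struct file_sketch {
	atomic_long count;
};

/* Take a reference only if the count has not already dropped to zero. */
static int get_unless_zero(struct file_sketch *f)
{
	long c = atomic_load(&f->count);

	while (c != 0) {
		/* On failure, c is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&f->count, &c, c + 1))
			return 1;
	}
	return 0;
}

/*
 * Lockless lookup of one descriptor slot.  A zero refcount means the
 * slot is being replaced, so reload the slot and retry instead of
 * reporting the descriptor as closed.  The caller ends up with either
 * the old object or the new one.
 */
static struct file_sketch *fget_sketch(_Atomic(struct file_sketch *) *slot)
{
	for (;;) {
		struct file_sketch *f = atomic_load(slot);

		if (!f)
			return NULL;	/* slot really is empty */
		if (get_unless_zero(f))
			return f;
		/* stale pointer: loop and read the slot again */
	}
}
```

The design point is that a failed reference grab is treated as "try again", never as "closed", which is what keeps dup2() atomic from the reader's point of view.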

fs/file.c: don't acquire files->file_lock in fd_install() · 8a81252b

Authored by Eric Dumazet on Jun 30, 2015

Mateusz Guzik reported :

 Currently obtaining a new file descriptor results in locking fdtable
 twice - once in order to reserve a slot and second time to fill it.

Holding the spinlock in __fd_install() is needed in case a resize is
done, or to prevent a resize.

Mateusz provided an RFC patch and a micro benchmark :
  http://people.redhat.com/~mguzik/pipebench.c

A resize is an unlikely operation in a process lifetime,
as table size is at least doubled at every resize.

We can use RCU instead of the spinlock.

__fd_install() must wait if a resize is in progress.

The resize must block new __fd_install() callers from starting,
and wait until ongoing installs have finished (synchronize_sched()).

resize should be attempted by a single thread to not waste resources.

rcu_sched variant is used, as __fd_install() and expand_fdtable() run
from process context.

It gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R)
CPU E5-2696 v2 @ 2.50GHz

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Mateusz Guzik <mguzik@redhat.com>
Tested-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8a81252b

fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation · 1af95de6

Authored by Wang YanQing on Jun 23, 2015

Concurrent execution of get_anon_bdev(), as well as kernel preemption,
can cause a race condition, so it isn't enough to check dev against
its upper limit with the equality operator only.

This patch fixes it.

Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1af95de6
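
Why an equality check is not enough can be shown with a small sketch. The helper names are hypothetical, and `MINORBITS` is assumed to be 20 as in the kernel's kdev_t.h; the point is only that a racing allocator can step past the limit, which `==` misses and `>=` catches.

```c
#include <stdbool.h>

/* Assumed value, mirroring the kernel's kdev_t.h definition. */
#define MINORBITS 20

/* Fragile form: only notices the limit when dev lands on it exactly. */
static bool anon_dev_over_limit_eq(int dev)
{
	return dev == (1 << MINORBITS);
}

/* Robust form: also catches values that have already gone past it. */
static bool anon_dev_over_limit(int dev)
{
	return dev >= (1 << MINORBITS);
}
```

If two callers race and one of them obtains `(1 << MINORBITS) + 1`, only the second helper reports the overflow.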

vfs: avoid creation of inode number 0 in get_next_ino · 2adc376c

Authored by Carlos Maiolino on Jun 25, 2015

Currently, get_next_ino() is able to create inodes with inode number = 0.
This has a bad impact on the filesystems relying on this function to generate
inode numbers.

While there is no problem at all in having inodes with number 0, userspace tools
which handle file management tasks can have problems handling these files, for
example the impossibility for users to delete these files, since glibc will
ignore them. So, I believe the best way is for the kernel to avoid creating them.

This problem has been raised previously, but the old thread didn't have any
other update for a year+, and I've seen too many users hitting the same issue
regarding the impossibility to delete files while using filesystems relying on
this function. So, I'm starting the thread again, with the same patch
that I believe is enough to address this problem.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2adc376c
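
A minimal sketch of skipping inode number 0 when an unsigned counter wraps is shown below. It is a simplification under stated assumptions: `next_ino_sketch()` is a hypothetical name, and it uses a single caller-supplied counter rather than whatever batching the real get_next_ino() does.

```c
#include <limits.h>

/*
 * Hypothetical helper: hand out the next inode number from *counter,
 * skipping 0 when the unsigned counter wraps around.
 */
static unsigned int next_ino_sketch(unsigned int *counter)
{
	unsigned int res = *counter + 1;	/* wraps to 0 after UINT_MAX */

	if (res == 0)
		res++;				/* never hand out inode number 0 */
	*counter = res;
	return res;
}
```

Starting from `UINT_MAX`, the next number handed out is 1 rather than 0.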

30 Jun, 2015 1 commit

namei: make set_root_rcu() return void · 06d7137e

Authored by Al Viro on Jun 29, 2015

The only caller that cares about its return value can just
as easily pick it from nd->root_seq itself.  We used to just
calculate it and return to caller, but these days we are
storing it in nd->root_seq in all cases.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06d7137e

24 Jun, 2015 15 commits

make simple_positive() public · dc3f4198

Authored by Al Viro on May 18, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dc3f4198

ufs: use dir_pages instead of ufs_dir_pages() · 5d754ced

Authored by Fabian Frederick on May 24, 2015

dir_pages was declared in a lot of filesystems.
Use the new dir_pages() from pagemap.h instead.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

5d754ced

pagemap.h: move dir_pages() over there · b57c2cb9

Authored by Fabian Frederick on May 24, 2015

That function was declared in a lot of filesystems to calculate
directory pages.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b57c2cb9
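
The calculation such a helper performs is a round-up of the directory size to whole pages. The sketch below assumes a 4 KiB page and uses hypothetical `SKETCH_` names rather than the kernel's page-size macros.

```c
/* Assumed page geometry for illustration: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Number of pages needed to hold a directory of the given byte size. */
static unsigned long dir_pages_sketch(unsigned long long size)
{
	return (unsigned long)((size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT);
}
```

With these assumptions, a 4096-byte directory occupies one page and a 4097-byte directory occupies two.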

remove the pointless include of lglock.h · e5e6e97f

Authored by Al Viro on Jun 04, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e5e6e97f

fs: cleanup slight list_entry abuse · db6172c4

Authored by Rasmus Villemoes on Mar 19, 2015

list_entry is just a wrapper for container_of, but it is arguably
wrong (and slightly confusing) to use it when the pointed-to struct
member is not a struct list_head. Use container_of directly instead.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

db6172c4
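
For readers unfamiliar with the idiom, here is a simplified userspace version of container_of. The kernel macro adds type checking that this sketch omits, and `struct inner`, `struct outer` and `outer_of()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct inner {
	int value;
};

struct outer {
	int id;
	struct inner in;	/* embedded member that is not a list_head */
};

/* Recover the enclosing struct outer from a pointer to its 'in' member. */
static struct outer *outer_of(struct inner *p)
{
	return container_of(p, struct outer, in);
}
```

Since the embedded member here is not a list head, spelling the conversion as container_of states the intent more directly than list_entry would.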

Merge branch 'fscache-fixes' into for-next · 8ea3a7c0

Authored by Al Viro on Jun 23, 2015

8ea3a7c0

xfs: Correctly lock inode when removing suid and file capabilities · a6de82ca

Authored by Jan Kara on May 21, 2015

Currently XFS calls file_remove_privs() without holding i_mutex. This is
wrong because that function can end up messing with file permissions and
file capabilities stored in xattrs for which we need i_mutex held.

Fix the problem by grabbing iolock exclusively when we will need to
change anything in permissions / xattrs.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a6de82ca

fs: Call security_ops->inode_killpriv on truncate · 45f147a1

Authored by Jan Kara on May 21, 2015

Comment in include/linux/security.h says that ->inode_killpriv() should
be called when setuid bit is being removed and that similar security
labels (in fact this applies only to file capabilities) should be
removed at this time as well. However we don't call ->inode_killpriv()
when we remove suid bit on truncate.

We fix the problem by calling ->inode_need_killpriv() and subsequently
->inode_killpriv() on truncate the same way as we do it on file write.

After this patch there's only one user of should_remove_suid() - ocfs2 -
and indeed it's buggy because it doesn't call ->inode_killpriv() on
write. However fixing it is difficult because of special locking
constraints.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

45f147a1

fs: Provide function telling whether file_remove_privs() will do anything · dbfae0cd

Authored by Jan Kara on May 21, 2015

Provide function telling whether file_remove_privs() will do anything.
Currently we only have should_remove_suid() and that does something
slightly different.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dbfae0cd

fs: Rename file_remove_suid() to file_remove_privs() · 5fa8e0a1

Authored by Jan Kara on May 21, 2015

file_remove_suid() is a misnomer since it removes also file capabilities
stored in xattrs and sets S_NOSEC flag. Also should_remove_suid() tells
something else than whether file_remove_suid() call is necessary which
leads to bugs.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

5fa8e0a1

fs: Fix S_NOSEC handling · 2426f391

Authored by Jan Kara on May 21, 2015

file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.

Fix the bug by checking actual mode bits before setting S_NOSEC.

CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2426f391
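
The idea of checking the actual mode bits before marking an inode as having nothing security-relevant can be sketched as follows. This is a userspace illustration with hypothetical helper names; it considers only the mode bits and ignores file capabilities stored in xattrs, which the kernel also has to account for.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: does the mode carry bits that a write by an
 * ordinary user would have to clear?  setuid always counts; setgid
 * counts only together with group-execute.
 */
static bool mode_has_priv_bits(mode_t mode)
{
	if (mode & S_ISUID)
		return true;
	if ((mode & S_ISGID) && (mode & S_IXGRP))
		return true;
	return false;
}

/* A "no security bits" shortcut flag is safe only when none are set. */
static bool may_set_nosec(mode_t mode)
{
	return !mode_has_priv_bits(mode);
}
```

The decision depends on the file's mode rather than on who is writing, so a write by root cannot leave a stale shortcut flag behind on a file that is still setuid.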

fs/posix_acl.c: make posix_acl_create() safer and cleaner · c0c3a718

Authored by Dan Carpenter on Jun 19, 2015

If posix_acl_create() returns an error code then "*acl" and "*default_acl"
can be uninitialized or point to freed memory.  This is a dangerous thing
to do.  For example, it causes a problem in ocfs2_reflink():

	fs/ocfs2/refcounttree.c:4327 ocfs2_reflink()
	error: potentially using uninitialized 'default_acl'.

I've re-written this so we set the pointers to NULL at the start.  I've
added a temporary "clone" variable to hold the value of "*acl" until end.
Setting them to NULL means we don't need the "no_acl" label.  We may
as well move the "apply_umask" stuff forward and remove that label as
well.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

c0c3a718
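
The pattern described here (clear the output pointers first, build the result in a temporary, publish it only at the end) can be sketched in userspace. `struct acl_sketch` and `create_acls_sketch()` are hypothetical, and the `fail` parameter exists only to simulate an error path.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ACL object. */
struct acl_sketch {
	int entries;
};

/*
 * Returns 0 on success and -1 on failure.  Both output pointers are set
 * to NULL up front, so callers never see uninitialized or stale values
 * on any error path.
 */
static int create_acls_sketch(int fail, struct acl_sketch **acl,
			      struct acl_sketch **default_acl)
{
	struct acl_sketch *clone;

	*acl = NULL;
	*default_acl = NULL;

	if (fail)
		return -1;	/* outputs stay NULL */

	clone = calloc(1, sizeof(*clone));
	if (!clone)
		return -1;

	*acl = clone;		/* publish only once everything succeeded */
	return 0;
}
```

A caller can then test or release the outputs unconditionally after a failure, because they are guaranteed to be NULL.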

nilfs2_direct_IO(): remove dead code · 6b6dabc8

Authored by Al Viro on Jun 21, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6b6dabc8

vfs: add seq_file_path() helper · 2726d566

Authored by Miklos Szeredi on Jun 19, 2015

Turn
	seq_path(..., &file->f_path, ...);
into
	seq_file_path(..., file, ...);

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2726d566

vfs: add file_path() helper · 9bf39ab2

Authored by Miklos Szeredi on Jun 19, 2015

Turn
	d_path(&file->f_path, ...);
into
	file_path(file, ...);

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9bf39ab2

19 Jun, 2015 2 commits

overlayfs: Make f_path always point to the overlay and f_inode to the underlay · 4bacc9c9

Authored by David Howells on Jun 18, 2015

Make file->f_path always point to the overlay dentry so that the path in
/proc/pid/fd is correct and to ensure that label-based LSMs have access to the
overlay as well as the underlay (path-based LSMs probably don't need it).

Using my union testsuite to set things up, before the patch I see:

	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
	...
	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
	[root@andromeda union-testsuite]# stat /mnt/a/foo107
	...
	Device: 23h/35d Inode: 13381       Links: 1
	...
	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
	...
	Device: 23h/35d Inode: 13381       Links: 1
	...

After the patch:

	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
	...
	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
	[root@andromeda union-testsuite]# stat /mnt/a/foo107
	...
	Device: 23h/35d Inode: 40346       Links: 1
	...
	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
	...
	Device: 23h/35d Inode: 40346       Links: 1
	...

Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
(which is correct).

The inode accessed, however, is the lower layer.  The union layer is on device
25h/37d and the upper layer on 24h/36d.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4bacc9c9

overlay: Call ovl_drop_write() earlier in ovl_dentry_open() · f25801ee

Authored by David Howells on Jun 18, 2015

Call ovl_drop_write() earlier in ovl_dentry_open() before we call vfs_open()
as we've done the copy up for which we needed the freeze-write lock by that
point.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f25801ee

18 Jun, 2015 2 commits

Merge branch 'for-linus' into for-next · 4ef51e8b

Authored by Al Viro on Jun 17, 2015

4ef51e8b

fs/ufs: restore s_lock mutex_init() · e4f95517

Authored by Fabian Frederick on Jun 17, 2015

Add last missing line in commit "cdd9eefd"
("fs/ufs: restore s_lock mutex")

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e4f95517

16 Jun, 2015 5 commits

ufs: don't touch mtime/ctime of directory being moved · 70d45cdb

Authored by Al Viro on Jun 16, 2015

See "ext2: Do not update mtime of a moved directory" (and followup in
"ext2: fix unbalanced kmap()/kunmap()") for background; this is UFS
equivalent - the same problem exists here.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

70d45cdb

ufs: don't bother with lock_ufs()/unlock_ufs() for directory access · a50e4a02

Authored by Al Viro on Jun 16, 2015

We are already serialized by ->i_mutex and operations on different
directories are independent.  These calls are just rudiments of
blind BKL conversion and they should've been removed back then.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a50e4a02

ufs: Fix possible deadlock when looking up directories · 514d748f

Authored by Jan Kara on Jun 02, 2015

Commit e4502c63 (ufs: deal with nfsd/iget races) made ufs
create inodes with I_NEW flag set. However ufs_mkdir() never cleared
this flag. Thus if someone ever tried to lookup the directory by inode
number, he would deadlock waiting for I_NEW to be cleared. Luckily this
mostly happens only if the filesystem is exported over NFS since
otherwise we have the inode attached to dentry and don't look it up by
inode number. In rare cases dentry can get freed without inode being
freed and then we'd hit the deadlock even without NFS export.

Fix the problem by clearing I_NEW before instantiating new directory
inode.

Fixes: e4502c63
Reported-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

514d748f

ufs: Fix warning from unlock_new_inode() · 12ecbb4b

Authored by Jan Kara on Jun 01, 2015

Commit e4502c63 (ufs: deal with nfsd/iget races) introduced
unlock_new_inode() call into ufs_add_nondir(). However that function
gets called also from ufs_link() which hands it already initialized
inode and thus unlock_new_inode() complains. The problem is harmless but
annoying.

Fix the problem by opencoding necessary stuff in ufs_link().

Fixes: e4502c63
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

12ecbb4b

fs/ufs: restore s_lock mutex · cdd9eefd

Authored by Fabian Frederick on Jun 10, 2015

Commit 0244756e ("ufs: sb mutex merge + mutex_destroy") generated
deadlocks in read/write mode on mkdir.

This patch partially reverts it keeping fixes by Andrew Morton and
mutex_destroy()

[AV: fixed a missing bit in ufs_remount()]

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reported-by: Ian Campbell <ian.campbell@citrix.com>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cdd9eefd

14 Jun, 2015 2 commits

fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge" · 13b987ea

Authored by Fabian Frederick on Jun 10, 2015

This reverts commit 9ef7db7f ("ufs: fix deadlocks introduced by sb
mutex merge"). That patch tried to fix commit 0244756e ("ufs: sb
mutex merge + mutex_destroy"), which is itself partially reverted due to
multiple deadlocks.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

13b987ea

ncpfs: successful rename() should invalidate caches for parents · 3f4a9494

Authored by Al Viro on Jun 06, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3f4a9494

29 May, 2015 1 commit

d_walk() might skip too much · 2159184e

Authored by Al Viro on May 28, 2015

when we find that a child has died while we'd been trying to ascend,
we should go into the first live sibling itself, rather than its sibling.

Off-by-one in question had been introduced in "deal with deadlock in
d_walk()" and the fix needs to be backported to all branches this one
has been backported to.

Cc: stable@vger.kernel.org # 3.2 and later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2159184e

15 May, 2015 1 commit

turn user_{path_at,path,lpath,path_dir}() into static inlines · b853a161

Authored by Al Viro on May 13, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b853a161

openeuler / Kernel: synced successfully more than 1 year ago