Commits · 656189d207f0ed267f41338f7ff86f98e420099f · openeuler / raspberrypi-kernel

29 Jul, 2016 19 commits

Committed by Miklos Szeredi on Jul 29, 2016

There's a superfluous newline in the warning message in ovl_d_real().
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

656189d2

ovl: remove duplicated include from super.c · 5f215013

Committed by Wei Yongjun on Jul 06, 2016

Remove duplicated include.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

5f215013

ovl: append MAY_READ when diluting write checks · 500cac3c

Committed by Vivek Goyal on Jul 13, 2016

Right now we remove MAY_WRITE/MAY_APPEND bits from mask if realfile is on
lower/. This is done as files on lower will never be written and will be
copied up. But to copy up a file, mounter should have MAY_READ permission
otherwise copy up will fail. So set MAY_READ in mask when MAY_WRITE is
reset.

Dan Walsh noticed this when he did access(lowerfile, W_OK) and it returned
True (context mounts) but when he tried to actually write to file, it
failed as mounter did not have permission on lower file.

[SzM] don't set MAY_READ if only MAY_APPEND is set without MAY_WRITE; this
won't trigger a copy-up.
Reported-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

500cac3c

ovl: dilute permission checks on lower only if not special file · e29841a0

Committed by Vivek Goyal on Jul 13, 2016

Right now if file is on lower/, we remove MAY_WRITE/MAY_APPEND bits from
mask as lower/ will never be written and file will be copied up. But this
is not true for special files. These files are not copied up and are opened
in place. So don't dilute the checks for these types of files.
Reported-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

e29841a0
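
The mask handling described in the two commits above can be sketched in plain C. This is an illustrative userspace model, not the kernel code: the `MAY_*` values are local stand-ins, `mounter_mask()` is a hypothetical helper name, and leaving an append-only mask unchanged is an assumption drawn from the [SzM] note.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's permission mask bits (assumed values). */
#define MAY_EXEC   0x1
#define MAY_WRITE  0x2
#define MAY_READ   0x4
#define MAY_APPEND 0x8

/*
 * Hypothetical helper: compute the mask used for the mounter's check on
 * the real inode.
 *
 * - Upper files and special files are checked with the mask unchanged,
 *   because they are written or opened in place.
 * - For a regular lower file, a write request is diluted: the write bits
 *   are dropped (lower is never written) and MAY_READ is added, because
 *   the mounter must be able to read the file in order to copy it up.
 * - A mask with MAY_APPEND but no MAY_WRITE is left alone (assumption).
 */
static int mounter_mask(int mask, bool is_upper, bool is_special)
{
	if (!is_upper && !is_special && (mask & MAY_WRITE)) {
		mask &= ~(MAY_WRITE | MAY_APPEND);
		mask |= MAY_READ;
	}
	return mask;
}
```

With this model, `access(lowerfile, W_OK)` only succeeds when the mounter can also read the lower file, which matches the failure Dan Walsh reported.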

ovl: fix POSIX ACL setting · d837a49b

Committed by Miklos Szeredi on Jul 29, 2016

Setting POSIX ACL needs special handling:

1) Some permission checks are done by ->setxattr() which now uses mounter's
creds ("ovl: do operations on underlying file system in mounter's
context").  These permission checks need to be done with current cred as
well.

2) Setting ACL can fail for various reasons.  We do not need to copy up in
these cases.

In the meantime switch to using generic_setxattr.

[Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr()
doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is
disabled, and I could not come up with an obvious way to do it.

This instead avoids the link error by defining two sets of ACL operations
and letting the compiler drop one of the two at compile time depending
on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code,
also leading to smaller code.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

d837a49b

ovl: share inode for hard link · 51f7e52d

Committed by Miklos Szeredi on Jul 29, 2016

Inode attributes are copied up to overlay inode (uid, gid, mode, atime,
mtime, ctime) so generic code using these fields works correctly. If a hard
link is created in overlayfs separate inodes are allocated for each link.
If chmod/chown/etc. is performed on one of the links then the inode
belonging to the other ones won't be updated.

This patch attempts to fix this by sharing inodes for hard links.

Use inode hash (with real inode pointer as a key) to make sure overlay
inodes are shared for hard links on upper. Hard links on lower are still
split (which is not user observable until the copy-up happens, see
Documentation/filesystems/overlayfs.txt under "Non-standard behavior").

The inode is only inserted in the hash if it is non-directory and upper.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

51f7e52d

ovl: store real inode pointer in ->i_private · 39b681f8

Committed by Miklos Szeredi on Jul 29, 2016

To get from overlay inode to real inode we currently use 'struct
ovl_entry', which has lifetime connected to overlay dentry. This is okay,
since each overlay dentry had a new overlay inode allocated.

The following patch will break that assumption, so ovl_entry can no longer be
used for this. This patch stores the real inode directly in i_private, with
the lowest bit used to indicate whether the inode is upper or lower.

Lifetime rules remain: ovl_inode_real() must only be used while the caller
holds a ref on the overlay dentry (and hence on the real dentry), or within
RCU protected regions.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

39b681f8
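
The "lowest bit as a flag" idea from the commit above can be sketched as ordinary pointer tagging. This is a minimal userspace illustration, not the kernel implementation: `struct real_inode`, `UPPER_FLAG`, and the three helper names are hypothetical. It assumes the pointed-to object is aligned to at least two bytes, so bit 0 of its address is always zero and free to carry the flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the real inode; any type with alignment >= 2 works. */
struct real_inode {
	int ino;
};

/* Bit 0 of the stored value marks an upper inode (hypothetical name). */
#define UPPER_FLAG ((uintptr_t)1)

/* Pack the pointer and the upper/lower flag into one pointer-sized value. */
static void *pack_real(struct real_inode *inode, bool is_upper)
{
	return (void *)((uintptr_t)inode | (is_upper ? UPPER_FLAG : 0));
}

/* Recover the original pointer by masking the flag bit off. */
static struct real_inode *unpack_real(void *priv)
{
	return (struct real_inode *)((uintptr_t)priv & ~UPPER_FLAG);
}

/* Report whether the packed value refers to an upper inode. */
static bool priv_is_upper(void *priv)
{
	return ((uintptr_t)priv & UPPER_FLAG) != 0;
}
```

Storing both facts in one word means a reader gets the pointer and the upper/lower state from a single load, which fits lookups done under RCU.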

ovl: permission: return ECHILD instead of ENOENT · a999d7e1

Committed by Miklos Szeredi on Jul 29, 2016

The error is due to RCU and is temporary.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

a999d7e1

ovl: update atime on upper · d719e8f2

Committed by Miklos Szeredi on Jul 29, 2016

Fix atime update logic in overlayfs.

This patch adds an i_op->update_time() handler to overlayfs inodes.  This
forwards atime updates to the upper layer only.  No atime updates are done
on lower layers.

Remove implicit atime updates to underlying files and directories with
O_NOATIME.  Remove explicit atime update in ovl_readlink().

Clear atime related mnt flags from cloned upper mount.  This means atime
updates are controlled purely by overlayfs mount options.

Reported-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

d719e8f2

ovl: fix sgid on directory · bb0d2b8a

Committed by Miklos Szeredi on Jul 29, 2016

When creating directory in workdir, the group/sgid inheritance from the
parent dir was omitted completely.  Fix this by calling inode_init_owner()
on overlay inode and using the resulting uid/gid/mode to create the file.

Unfortunately the sgid bit can be stripped off due to umask, so need to
reset the mode in this case in workdir before moving the directory in
place.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

bb0d2b8a
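
The group/sgid inheritance that the commit above restores can be sketched as follows. This is a simplified userspace model of the inheritance rule only, not the kernel's inode_init_owner(): `struct owner`, `init_owner()`, and `MODE_SGID` are hypothetical names, and the sketch ignores the later umask step that the commit message says may strip the sgid bit again.

```c
#include <stdbool.h>

/* Same value as S_ISGID on Linux; defined locally to stay self-contained. */
#define MODE_SGID 02000u

/* Ownership and mode chosen for a newly created object. */
struct owner {
	unsigned uid;
	unsigned gid;
	unsigned mode;
};

/*
 * Hypothetical helper modelling sgid inheritance:
 * - by default the new object gets the caller's uid and gid;
 * - if the parent directory has the sgid bit set, the new object takes
 *   the parent's gid instead;
 * - a new directory under an sgid parent also inherits the sgid bit.
 */
static struct owner init_owner(unsigned caller_uid, unsigned caller_gid,
			       unsigned parent_gid, unsigned parent_mode,
			       unsigned mode, bool is_dir)
{
	struct owner o = { caller_uid, caller_gid, mode };

	if (parent_mode & MODE_SGID) {
		o.gid = parent_gid;
		if (is_dir)
			o.mode |= MODE_SGID;
	}
	return o;
}
```

The resulting uid, gid, and mode are what would then be used to create the object in workdir before it is moved into place.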

ovl: simplify permission checking · 9c630ebe

Committed by Miklos Szeredi on Jul 29, 2016

The fact that we always do permission checking on the overlay inode and
clear MAY_WRITE for checking access to the lower inode allows cruft to be
removed from ovl_permission().

1) "default_permissions" option effectively did generic_permission() on the
overlay inode with i_mode, i_uid and i_gid updated from underlying
filesystem. This is what we do by default now. It did the update using
vfs_getattr() but that's only needed if the underlying filesystem can
change (which is not allowed). We may later introduce a "paranoia_mode"
that verifies that mode/uid/gid are not changed.

2) splitting out the IS_RDONLY() check from inode_permission() also becomes
unnecessary once we remove the MAY_WRITE from the lower inode check.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

9c630ebe

ovl: do not require mounter to have MAY_WRITE on lower · 754f8cb7

Committed by Vivek Goyal on Jul 01, 2016

Now we have two levels of checks in ovl_permission(). overlay inode
is checked with the creds of task while underlying inode is checked
with the creds of mounter.

Looks like mounter does not have to have WRITE access to files on lower/.
So remove the MAY_WRITE from access mask for checks on underlying
lower inode.

This means task should still have the MAY_WRITE permission on lower
inode and mounter is not required to have MAY_WRITE.

It also solves the problem of read only NFS mounts being used as lower.
If __inode_permission(lower_inode, MAY_WRITE) is called on read only
NFS, it fails. By resetting MAY_WRITE, check succeeds and case of
read only NFS should work with overlay without having to specify any
special mount options (default permission).
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

754f8cb7

ovl: do operations on underlying file system in mounter's context · 1175b6b8

Committed by Vivek Goyal on Jul 01, 2016

Given we are now doing checks on both the overlay inode and the underlying
inode, we should be able to do checks and operations on the underlying file
system using the mounter's context.

So modify all operations to do checks/operations on underlying dentry/inode
in the context of mounter.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

1175b6b8

ovl: modify ovl_permission() to do checks on two inodes · c0ca3d70

Committed by Vivek Goyal on Jul 01, 2016

Right now ovl_permission() calls __inode_permission(realinode), to do
permission checks on real inode and no checks are done on overlay inode.

Modify it to do checks both on overlay inode as well as underlying inode.
Checks on overlay inode will be done with the creds of calling task while
checks on underlying inode will be done with the creds of mounter.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

c0ca3d70

ovl: define ->get_acl() for overlay inodes · 39a25b2b

Committed by Vivek Goyal on Jul 01, 2016

Now we are planning to do DAC permission checks on overlay inode
itself. And to make it work, we will need to make sure we can get acls from
underlying inode. So define ->get_acl() for overlay inodes and this in turn
calls into underlying filesystem to get acls, if any.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

39a25b2b

ovl: move some common code in a function · 72e48481

Committed by Vivek Goyal on Jun 16, 2016

ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some
common code which can be moved into a separate function.  No functionality
change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

72e48481

ovl: store ovl_entry in inode->i_private for all inodes · 58ed4e70

Committed by Andreas Gruenbacher on May 26, 2016

Previously this was only done for directory inodes. Doing so for all
inodes makes for a nice cleanup in ovl_permission at zero cost.

Inodes are not shared for hard links on the overlay, so this works fine.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

58ed4e70

ovl: use generic_delete_inode · eead4f2d

Committed by Miklos Szeredi on Jul 29, 2016

No point in keeping overlay inodes around since they will never be reused.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

eead4f2d

ovl: check mounter creds on underlying lookup · c1b2cc1a

Committed by Miklos Szeredi on Jul 29, 2016

The hash salting changes meant that we can no longer reuse the hash in the
overlay dentry to look up the underlying dentry.

Instead of lookup_hash(), use lookup_one_len_unlocked() and switch to
mounter's creds (like we do for all other operations later in the series).

Now the lookup_hash() export introduced in 4.6 by 3c9fe8cd ("vfs: add
lookup_hash() helper") is unused and can possibly be removed; its
usefulness negated by the hash salting and the idea that mounter's creds
should be used on operations on underlying filesystems.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 8387ff25 ("vfs: make the string hashes salt the hash")

c1b2cc1a

22 Jul, 2016 1 commit

ovl: verify upper dentry in ovl_remove_and_whiteout() · cfc9fde0

Committed by Maxim Patlasov on Jul 21, 2016

The upper dentry may become stale before we call ovl_lock_rename_workdir.
For example, someone could (mistakenly or maliciously) manually unlink(2)
it directly from upperdir.

To ensure it is not stale, let's look it up after ovl_lock_rename_workdir
and check if it matches the upper dentry.

Essentially, it is the same problem and similar solution as in
commit 11f37104 ("ovl: verify upper dentry before unlink and rename").
Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>

cfc9fde0

16 Jul, 2016 1 commit

xfs: fix type confusion in xfs_ioc_swapext · 3e0a3965

Committed by Jann Horn on Sep 11, 2015

Without this check, the following XFS_I invocations would return bad
pointers when used on non-XFS inodes (perhaps pointers into preceding
allocator chunks).

This could be used by an attacker to trick xfs_swap_extents into
performing locking operations on attacker-chosen structures in kernel
memory, potentially leading to code execution in the kernel.  (I have
not investigated how likely this is to be usable for an attack in
practice.)
Signed-off-by: Jann Horn <jann@thejh.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3e0a3965

08 Jul, 2016 2 commits

ecryptfs: don't allow mmap when the lower fs doesn't support it · f0fe970d

Committed by Jeff Mahoney on Jul 05, 2016

There are legitimate reasons to disallow mmap on certain files, notably
in sysfs or procfs.  We shouldn't emulate mmap support on file systems
that don't offer support natively.

CVE-2016-1583
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
[tyhicks: clean up f_op check by using ecryptfs_file_to_lower()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

f0fe970d

Revert "ecryptfs: forbid opening files without mmap handler" · 78c4e172

Committed by Jeff Mahoney on Jul 05, 2016

This reverts commit 2f36db71.

It fixed a local root exploit but also introduced a dependency on
the lower file system implementing an mmap operation just to open a file,
which is a bit of a heavy hammer.  The right fix is to have mmap depend
on the existence of the mmap handler instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

78c4e172

06 Jul, 2016 2 commits

nfs_atomic_open(): prevent parallel nfs_lookup() on a negative hashed · c94c0953

Committed by Al Viro on Jul 05, 2016

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c94c0953

Use the right predicate in ->atomic_open() instances · 00699ad8

Committed by Al Viro on Jul 05, 2016

->atomic_open() can be given an in-lookup dentry *or* a negative one
found in dcache.  Use d_in_lookup() to tell one from another, rather
than d_unhashed().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

00699ad8

04 Jul, 2016 2 commits

ovl: Copy up underlying inode's ->i_mode to overlay inode · 07a2daab

Committed by Vivek Goyal on Jul 01, 2016

Right now when a new overlay inode is created, we initialize overlay
inode's ->i_mode from underlying inode ->i_mode but we retain only
file type bits (S_IFMT) and discard permission bits.

This patch changes it and retains permission bits too. This should allow
overlay to do permission checks on overlay inode itself in task context.

[SzM] It also fixes clearing suid/sgid bits on write.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>

07a2daab

ovl: handle ATTR_KILL* · b99c2d91

Committed by Miklos Szeredi on Jul 04, 2016

Before 4bacc9c9 ("overlayfs: Make f_path...") file->f_path pointed to
the underlying file, hence suid/sgid removal on write worked fine.

After that patch file->f_path pointed to the overlay file, and the file
mode bits weren't copied to overlay_inode->i_mode.  So the suid/sgid
removal simply stopped working.

The fix is to copy the mode bits, but then ovl_setattr() needs to clear
ATTR_MODE to avoid the BUG() in notify_change().  So do this first, then in
the next patch copy the mode.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>

b99c2d91

03 Jul, 2016 1 commit

ovl: warn instead of error if d_type is not supported · e7c0b599

Committed by Vivek Goyal on Jul 01, 2016

overlay needs underlying fs to support d_type. Recently I put in a
patch to detect this condition and started failing mount if
underlying fs did not support d_type.

But this breaks existing configurations over kernel upgrade. Those who
are running docker (partially broken configuration) with xfs not
supporting d_type, are surprised that after kernel upgrade docker does
not run anymore.

https://github.com/docker/docker/issues/22937#issuecomment-229881315

So instead of erroring out, detect broken configuration and warn
about it. This should allow existing docker setups to continue
working after kernel upgrade.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 45aebeaf ("ovl: Ensure upper filesystem supports d_type")
Cc: <stable@vger.kernel.org> 4.6

e7c0b599

01 Jul, 2016 5 commits

locks: use file_inode() · 6343a212

Committed by Miklos Szeredi on Jul 01, 2016

(Another one for the f_path debacle.)

ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.

The reason is that generic_add_lease() used filp->f_path.dentry->inode
while all the others use file_inode().  This makes a difference for files
opened on overlayfs since the former will point to the overlay inode and the
latter to the underlying inode.

So generic_add_lease() added the lease to the overlay inode and
generic_delete_lease() removed it from the underlying inode.  When the file
was released the lease remained on the overlay inode's lock list, resulting
in use after free.
Reported-by: Eryu Guan <eguan@redhat.com>
Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6343a212

namespace: update event counter when umounting a deleted dentry · e06b933e

Committed by Andrey Ulanov on Apr 15, 2016

- m_start() in fs/namespace.c expects that ns->event is incremented each
  time a mount is added or removed from ns->list.
- umount_tree() removes items from the list but does not increment event
  counter, expecting that it's done before the function is called.
- There are some codepaths that call umount_tree() without updating
  "event" counter. e.g. from __detach_mounts().
- When this happens m_start may reuse a cached mount structure that no
  longer belongs to ns->list (i.e. use after free which usually leads
  to infinite loop).

This change fixes the above problem by incrementing global event counter
before invoking umount_tree().

Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589
Cc: stable@vger.kernel.org
Signed-off-by: Andrey Ulanov <andreyu@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e06b933e

9p: use file_dentry() · b403f0e3

Committed by Miklos Szeredi on Jun 29, 2016

v9fs may be used as lower layer of overlayfs and accessing f_path.dentry
can lead to a crash.  In this case it's a NULL pointer dereference in
p9_fid_create().

Fix by replacing direct access of file->f_path.dentry with the
file_dentry() accessor, which will always return a native object.
Reported-by: Alessio Igor Bogani <alessioigorbogani@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Alessio Igor Bogani <alessioigorbogani@gmail.com>
Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b403f0e3

lockd: unregister notifier blocks if the service fails to come up completely · cb7d224f

Committed by Scott Mayhew on Jun 30, 2016

If the lockd service fails to start up then we need to be sure that the
notifier blocks are not registered, otherwise a subsequent start of the
service could cause the same notifier to be registered twice, leading to
soft lockups.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 0751ddf7 "lockd: Register callbacks on the inetaddr_chain..."
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

cb7d224f

writeback: inode cgroup wb switch should not call ihold() · 74524955

Committed by Tahsin Erdogan on Jun 16, 2016

Asynchronous wb switching of inodes takes an additional ref count on an
inode to make sure inode remains valid until switchover is completed.

However, anyone calling ihold() must already have a ref count on inode,
but in this case inode->i_count may already be zero:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 917 at fs/inode.c:397 ihold+0x2b/0x30
CPU: 1 PID: 917 Comm: kworker/u4:5 Not tainted 4.7.0-rc2+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: writeback wb_workfn (flush-8:16)
 0000000000000000 ffff88007ca0fb58 ffffffff805990af 0000000000000000
 0000000000000000 ffff88007ca0fb98 ffffffff80268702 0000018d000004e2
 ffff88007cef40e8 ffff88007c9b89a8 ffff880079e3a740 0000000000000003
Call Trace:
 [<ffffffff805990af>] dump_stack+0x4d/0x6e
 [<ffffffff80268702>] __warn+0xc2/0xe0
 [<ffffffff802687d8>] warn_slowpath_null+0x18/0x20
 [<ffffffff8035b4ab>] ihold+0x2b/0x30
 [<ffffffff80367ecc>] inode_switch_wbs+0x11c/0x180
 [<ffffffff80369110>] wbc_detach_inode+0x170/0x1a0
 [<ffffffff80369abc>] writeback_sb_inodes+0x21c/0x530
 [<ffffffff80369f7e>] wb_writeback+0xee/0x1e0
 [<ffffffff8036a147>] wb_workfn+0xd7/0x280
 [<ffffffff80287531>] ? try_to_wake_up+0x1b1/0x2b0
 [<ffffffff8027bb09>] process_one_work+0x129/0x300
 [<ffffffff8027be06>] worker_thread+0x126/0x480
 [<ffffffff8098cde7>] ? __schedule+0x1c7/0x561
 [<ffffffff8027bce0>] ? process_one_work+0x300/0x300
 [<ffffffff80280ff4>] kthread+0xc4/0xe0
 [<ffffffff80335578>] ? kfree+0xc8/0x100
 [<ffffffff809903cf>] ret_from_fork+0x1f/0x40
 [<ffffffff80280f30>] ? __kthread_parkme+0x70/0x70
---[ end trace aaefd2fd9f306bc4 ]---
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>

74524955

30 Jun, 2016 3 commits

fuse: serialize dirops by default · 5c672ab3

Committed by Miklos Szeredi on Jun 30, 2016

Negotiate with userspace filesystems whether they support parallel readdir
and lookup.  Disable parallelism by default for fear of breaking fuse
filesystems.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 9902af79 ("parallel lookups: actual switch to rwsem")
Fixes: d9b3dbdc ("fuse: switch to ->iterate_shared()")

5c672ab3

configfs: Remove ppos increment in configfs_write_bin_file · f8608985

Committed by Marek Vasut on May 18, 2016

The simple_write_to_buffer() already increments the @ppos on success,
see fs/libfs.c simple_write_to_buffer() comment:

"
On success, the number of bytes written is returned and the offset @ppos
advanced by this number, or negative value is returned on error.
"

If the configfs_write_bin_file() is invoked with @count smaller than the
total length of the written binary file, it will be invoked multiple times.
Since configfs_write_bin_file() increments @ppos on success, after calling
simple_write_to_buffer(), the @ppos is incremented twice.

Subsequent invocation of configfs_write_bin_file() will result in the next
piece of data being written to the offset twice as long as the length of
the previous write, thus creating buffer with "holes" in it.

The simple testcase using DTO follows:
  $ mkdir /sys/kernel/config/device-tree/overlays/1
  $ dd bs=1 if=foo.dtbo of=/sys/kernel/config/device-tree/overlays/1/dtbo
Without this patch, the testcase will result in twice as big buffer in the
kernel, which is then passed to the cfs_overlay_item_dtbo_write() .
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>

f8608985
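
The double increment described in the commit above can be reproduced with a small userspace model. This is a sketch under stated assumptions, not the configfs code: `copy_and_advance()` is a hypothetical stand-in for a helper that, like simple_write_to_buffer(), advances the position itself, and `write_buggy()` / `write_fixed()` are hypothetical wrappers showing the behaviour before and after the fix.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of a write helper that copies data into buf at *ppos
 * and advances *ppos by the number of bytes copied. Returns the number of
 * bytes copied, or -1 if the position is outside the buffer.
 */
static long copy_and_advance(char *buf, size_t cap, long *ppos,
			     const char *src, size_t count)
{
	if (*ppos < 0 || (size_t)*ppos >= cap)
		return -1;
	if (count > cap - (size_t)*ppos)
		count = cap - (size_t)*ppos;
	memcpy(buf + *ppos, src, count);
	*ppos += (long)count;
	return (long)count;
}

/* Before the fix: the caller advances the position a second time. */
static long write_buggy(char *buf, size_t cap, long *ppos,
			const char *src, size_t count)
{
	long len = copy_and_advance(buf, cap, ppos, src, count);

	if (len > 0)
		*ppos += len;	/* second increment leaves a hole */
	return len;
}

/* After the fix: the helper's own increment is the only one. */
static long write_fixed(char *buf, size_t cap, long *ppos,
			const char *src, size_t count)
{
	return copy_and_advance(buf, cap, ppos, src, count);
}
```

Writing one byte at a time, as the `dd bs=1` test case does, the buggy variant places successive bytes at offsets 0, 2, 4 and so on, which is the buffer with "holes" that the message describes.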

vfs: merge .d_select_inode() into .d_real() · 2d902671

Committed by Miklos Szeredi on Jun 30, 2016

The two methods essentially do the same: find the real dentry/inode
belonging to an overlay dentry.  The difference is in the usage:

vfs_open() uses ->d_select_inode() and expects the function to perform
copy-up if necessary based on the open flags argument.

file_dentry() uses ->d_real() passing in the overlay dentry as well as the
underlying inode.

vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
with a zero inode would have worked just as well here.

This patch merges the functionality of ->d_select_inode() into ->d_real()
by adding an 'open_flags' argument to the latter.

[Al Viro] Make the signature of d_real() match that of ->d_real() again.
And constify the inode argument, while we are at it.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

2d902671

29 Jun, 2016 3 commits

ovl: get_write_access() in truncate · 03bea604

Committed by Miklos Szeredi on Jun 29, 2016

When truncating a file we should check write access on the underlying
inode.  And we should do so on the lower file as well (before copy-up) for
consistency.

Original patch and test case by Aihua Zhang.

 - - >o >o - - test.c - - >o >o - -
#include <stdio.h>
#include <errno.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	int ret;

	ret = truncate(argv[0], 4096);
	if (ret != -1) {
		fprintf(stderr, "truncate(argv[0]) should have failed\n");
		return 1;
	}
	if (errno != ETXTBSY) {
		perror("truncate(argv[0])");
		return 1;
	}

	return 0;
}
 - - >o >o - - >o >o - - >o >o - -
Reported-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>

03bea604

ovl: fix dentry leak for default_permissions · a4859d75

Committed by Miklos Szeredi on Jun 29, 2016

When using the 'default_permissions' mount option, ovl_permission() on
non-directories was missing a dput(alias), resulting in "BUG Dentry still
in use".
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 8d3095f4 ("ovl: default permissions")
Cc: <stable@vger.kernel.org> # v4.5+

a4859d75

NFS: Fix another OPEN_DOWNGRADE bug · e547f262

Committed by Trond Myklebust on Jun 25, 2016

Olga Kornievskaia reports that the following test fails to trigger
an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.

	fd0 = open(foo, RDRW)   -- should be open on the wire for "both"
	fd1 = open(foo, RDONLY)  -- should be open on the wire for "read"
	close(fd0) -- should trigger an open_downgrade
	read(fd1)
	close(fd1)

The issue is that we're missing a check for whether or not the current
state transitioned from an O_RDWR state as opposed to having transitioned
from a combination of O_RDONLY and O_WRONLY.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code")
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

e547f262

28 Jun, 2016 1 commit

dax: fix offset overflow in dax_io · 02395435

Committed by Eric Sandeen on Jun 23, 2016

This isn't functionally apparent for some reason, but
when we test io at extreme offsets at the end of the loff_t
range, such as in fstests xfs/071, the calculation of
"max" in dax_io() can be wrong due to pos + size overflowing.

For example,

# xfs_io -c "pwrite 9223372036854771712 512" /mnt/test/file

enters dax_io with:

start 0x7ffffffffffff000
end   0x7ffffffffffff200

and the rounded up "size" variable is 0x1000.  This yields:

pos + size 0x8000000000000000 (overflows loff_t)
       end 0x7ffffffffffff200

Due to the overflow, the min() function picks the wrong
value for the "max" variable, and when we send (max - pos)
into e.g. copy_from_iter_pmem() it is also the wrong value.

This somehow(tm) gets magically absorbed without incident,
probably because iter->count is correct.  But it seems best
to fix it up properly by comparing the two values as
unsigned.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

02395435
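
The arithmetic in the commit above can be checked with a small userspace model. This is an illustrative sketch, not the dax_io() code: `max_signed()` and `max_unsigned()` are hypothetical helpers, and the signed comparison is modelled on raw 64-bit patterns (by flipping the sign bit) so the example avoids undefined signed overflow in C.

```c
#include <stdint.h>

#define SIGN_BIT (UINT64_C(1) << 63)

/*
 * Model of a signed 64-bit "a < b" evaluated on raw bit patterns.
 * Flipping the sign bit maps the signed order onto the unsigned order.
 */
static int signed_less(uint64_t a, uint64_t b)
{
	return (a ^ SIGN_BIT) < (b ^ SIGN_BIT);
}

/* min(pos + size, end) as a signed comparison would see it. */
static uint64_t max_signed(uint64_t pos, uint64_t size, uint64_t end)
{
	uint64_t sum = pos + size;	/* may pass the signed maximum */

	return signed_less(sum, end) ? sum : end;
}

/* min(pos + size, end) with both values compared as unsigned. */
static uint64_t max_unsigned(uint64_t pos, uint64_t size, uint64_t end)
{
	uint64_t sum = pos + size;

	return sum < end ? sum : end;
}
```

With the values from the message (pos 0x7ffffffffffff000, size 0x1000, end 0x7ffffffffffff200), the signed model selects the overflowed sum because it looks negative, while the unsigned comparison selects `end`.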