- 28 Jul, 2016 20 commits
-
-
Committed by Yan, Zheng
We have no need to search cap flushes by TID. In most cases, we just need to know the TID of the oldest cap flush. A list is ideal for this usage. Signed-off-by: Yan, Zheng <zyan@redhat.com>
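The reasoning in this message can be illustrated with a small sketch. It assumes TIDs are handed out in increasing order and new flushes are appended at the tail, so the oldest pending TID is always at the head of the list. The names `flush_node`, `flush_list`, `flush_add`, `flush_del` and `oldest_flush_tid` are hypothetical and are not taken from the ceph client.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending-flush entry on a circular doubly linked list. */
struct flush_node {
    uint64_t tid;
    struct flush_node *prev, *next;
};

struct flush_list {
    struct flush_node head;   /* sentinel; head.next is the oldest entry */
    uint64_t last_tid;        /* last TID handed out */
};

static void flush_list_init(struct flush_list *l)
{
    l->head.tid = 0;
    l->head.prev = l->head.next = &l->head;
    l->last_tid = 0;
}

/* Assign the next TID and append at the tail, keeping the list TID-ordered. */
static uint64_t flush_add(struct flush_list *l, struct flush_node *n)
{
    n->tid = ++l->last_tid;
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
    return n->tid;
}

/* Unlink an entry in O(1); acknowledgements may arrive in any order. */
static void flush_del(struct flush_node *n)
{
    assert(n->prev != NULL && n->next != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Oldest pending TID, or 0 when nothing is pending. */
static uint64_t oldest_flush_tid(const struct flush_list *l)
{
    return l->head.next == &l->head ? 0 : l->head.next->tid;
}
```

With this layout, neither adding a flush nor asking for the oldest TID needs a search, which is the property the message relies on.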
-
Committed by Yan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
This helps the recovering MDS reconstruct the internal state that tracks pending snapflushes. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
To mount a non-default filesystem, the user currently needs to provide the mds namespace ID. This is inconvenient. This patch lets the user mount a filesystem by name. If the user wants to mount a non-default filesystem, the client first subscribes to fsmap.user, then subscribes to mdsmap.<ID> after getting the ID of the filesystem. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Jeff Layton
We can now handle the snapshot cases under RCU, as well as the non-snapshot case when we don't need to queue up a lease renewal. Allow LOOKUP_RCU walks to proceed under those conditions. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Jeff Layton
Under rcuwalk, we need to take extra care when dereferencing d_parent. We want to do that once and pass a pointer to dentry_lease_is_valid. Also, we must ensure that that function can handle the case where we're racing with d_release. Check whether "di" is NULL under the d_lock, and just return 0 if so. Finally, we still need to kick off a renewal job if the lease is getting close to expiration. If that's the case, then just drop out of rcuwalk mode since that could block. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Jeff Layton
To check for a valid dentry lease, we need to get at the ceph_dentry_info. Under rcuwalk though, we may end up with a dentry that is on its way to destruction. Since we need to take the d_lock in dentry_lease_is_valid already, we can just ensure that we clear the d_fsinfo pointer out under the same lock before destroying it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Jeff Layton
Nothing calls it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Miklos Szeredi
Pretty simple: just use ceph_dentry_info.time instead (which was already there, unused). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Colin Ian King
Trivial fix to a spelling mistake in a pr_err message. Signed-off-by: Colin Ian King <colin.king@canonical.com>
-
Committed by Yan, Zheng
old_snapc->seq is used in dout(...) Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
Otherwise ceph_sync_write_unsafe() may access/modify a freed inode. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
ceph_aio_complete() can free the ceph_aio_request struct before the code exits the while loop. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
Track the usage count for each individual fmode bit. This can reduce the array size by half. Signed-off-by: Yan, Zheng <zyan@redhat.com>
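A minimal sketch of per-bit usage counting, assuming a file mode is a small bitmask. The constant `MODE_BITS` and the names `mode_refs`, `mode_get`, `mode_put` and `modes_in_use` are hypothetical; the sketch shows only the counting idea, not the ceph data structures or the exact size reduction the message reports.

```c
#include <assert.h>

enum { MODE_BITS = 4 };   /* hypothetical number of distinct mode bits */

/* One counter per bit instead of one counter per bit combination. */
struct mode_refs {
    int nr_by_bit[MODE_BITS];
};

/* Take a reference for every bit set in mode. */
static void mode_get(struct mode_refs *r, unsigned mode)
{
    assert(mode < (1u << MODE_BITS));
    for (int i = 0; i < MODE_BITS; i++)
        if (mode & (1u << i))
            r->nr_by_bit[i]++;
}

/* Drop the references taken by a matching mode_get(). */
static void mode_put(struct mode_refs *r, unsigned mode)
{
    assert(mode < (1u << MODE_BITS));
    for (int i = 0; i < MODE_BITS; i++)
        if (mode & (1u << i)) {
            assert(r->nr_by_bit[i] > 0);
            r->nr_by_bit[i]--;
        }
}

/* Union of all bits that still have at least one user. */
static unsigned modes_in_use(const struct mode_refs *r)
{
    unsigned mask = 0;
    for (int i = 0; i < MODE_BITS; i++)
        if (r->nr_by_bit[i] > 0)
            mask |= 1u << i;
    return mask;
}
```

Counting per combination needs one slot for every possible mask value, while counting per bit needs one slot per bit, and the set of bits in use can still be recovered exactly.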
-
Committed by Yan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
This patch adds code that decodes pool namespace information in cap messages and request replies. The pool namespace is saved in i_layout; it will be passed to libceph when doing read/write. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Yan, Zheng
Add a pool namespace pointer to struct ceph_file_layout and struct ceph_object_locator. The pool namespace is used when mapping an object to a PG; it's also used when composing an OSD request. The namespace pointer in struct ceph_file_layout is RCU protected, so libceph can read the namespace without taking a lock. Signed-off-by: Yan, Zheng <zyan@redhat.com> [idryomov@gmail.com: ceph_oloc_destroy(), misc minor changes] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Define a new ceph_file_layout structure and rename the old ceph_file_layout to ceph_file_layout_legacy. This is preparation for adding a namespace to the ceph_file_layout structure. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Committed by Ilya Dryomov
An on-stack oid in ceph_ioctl_get_dataloc() is not initialized, resulting in a WARN and a NULL pointer dereference later on. We will have more of these on-stack in the future, so fix it with a convenience macro. Fixes: d30291b9 ("libceph: variable-sized ceph_object_id") Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 22 Jul, 2016 1 commit
-
-
Committed by Maxim Patlasov
The upper dentry may become stale before we call ovl_lock_rename_workdir. For example, someone could (mistakenly or maliciously) manually unlink(2) it directly from upperdir. To ensure it is not stale, let's look it up after ovl_lock_rename_workdir and check if it matches the upper dentry. Essentially, it is the same problem and similar solution as in commit 11f37104 ("ovl: verify upper dentry before unlink and rename"). Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
-
- 16 Jul, 2016 1 commit
-
-
Committed by Jann Horn
Without this check, the following XFS_I invocations would return bad pointers when used on non-XFS inodes (perhaps pointers into preceding allocator chunks). This could be used by an attacker to trick xfs_swap_extents into performing locking operations on attacker-chosen structures in kernel memory, potentially leading to code execution in the kernel. (I have not investigated how likely this is to be usable for an attack in practice.) Signed-off-by: Jann Horn <jann@thejh.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
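The hazard described here is a general one: a container_of-style downcast is only valid when the generic object really is embedded in the filesystem-specific one. The sketch below illustrates such a guard under stated assumptions. All type and function names (`file_ops`, `vfs_inode`, `fs_private_inode`, `my_fs_ops`, `to_private`) are hypothetical, and the check shown is not claimed to be the one the XFS patch adds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic objects, standing in for VFS structures. */
struct file_ops { const char *name; };
struct vfs_inode { const struct file_ops *fops; };

/* Hypothetical filesystem-private inode embedding the generic one. */
struct fs_private_inode {
    long lock_word;
    struct vfs_inode vfs;
};

static const struct file_ops my_fs_ops = { "my-fs" };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Downcast only when the inode is known to belong to this filesystem.
 * Without the ops check, a foreign inode would yield a pointer to memory
 * preceding it, which is not a fs_private_inode at all.
 */
static struct fs_private_inode *to_private(struct vfs_inode *inode)
{
    assert(inode != NULL);
    if (inode->fops != &my_fs_ops)
        return NULL;
    return container_of(inode, struct fs_private_inode, vfs);
}
```

Callers then treat a NULL result as "wrong filesystem" and fail the operation before touching any private fields.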
-
- 08 Jul, 2016 2 commits
-
-
Committed by Jeff Mahoney
There are legitimate reasons to disallow mmap on certain files, notably in sysfs or procfs. We shouldn't emulate mmap support on file systems that don't offer support natively. CVE-2016-1583 Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: stable@vger.kernel.org [tyhicks: clean up f_op check by using ecryptfs_file_to_lower()] Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
-
Committed by Jeff Mahoney
This reverts commit 2f36db71. It fixed a local root exploit but also introduced a dependency on the lower file system implementing an mmap operation just to open a file, which is a bit of a heavy hammer. The right fix is to have mmap depend on the existence of the mmap handler instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
-
- 06 Jul, 2016 2 commits
-
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
->atomic_open() can be given an in-lookup dentry *or* a negative one found in dcache. Use d_in_lookup() to tell one from another, rather than d_unhashed(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 04 Jul, 2016 2 commits
-
-
Committed by Vivek Goyal
Right now when a new overlay inode is created, we initialize the overlay inode's ->i_mode from the underlying inode's ->i_mode, but we retain only the file type bits (S_IFMT) and discard the permission bits. This patch changes that and retains the permission bits too. This should allow overlay to do permission checks on the overlay inode itself in task context. [SzM] It also fixes clearing suid/sgid bits on write. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org>
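The effect of masking with S_IFMT can be shown in user space. The sketch below is illustrative only: `overlay_mode_type_only`, `overlay_mode_full` and `has_setid_bits` are hypothetical helpers, not overlayfs functions. It shows that a mode reduced to its file type bits no longer carries the suid/sgid bits that write-time clearing would look for.

```c
#define _XOPEN_SOURCE 700   /* for S_IFMT and S_ISUID under strict ISO C modes */
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Shape described as the old behaviour: keep only the file type bits. */
static mode_t overlay_mode_type_only(mode_t real_mode)
{
    return real_mode & S_IFMT;
}

/* Shape described as the new behaviour: keep type and permission bits. */
static mode_t overlay_mode_full(mode_t real_mode)
{
    return real_mode;
}

/* True when the mode carries a set-user-ID or set-group-ID bit. */
static int has_setid_bits(mode_t mode)
{
    return (mode & (S_ISUID | S_ISGID)) != 0;
}

/* Small self-check on a set-user-ID regular file with mode 0755. */
void overlay_mode_demo(void)
{
    mode_t real = S_IFREG | S_ISUID | 0755;

    assert(overlay_mode_type_only(real) == S_IFREG);
    assert(!has_setid_bits(overlay_mode_type_only(real)));
    assert(has_setid_bits(overlay_mode_full(real)));
}
```

Any check that inspects only the copied mode sees a plain regular file in the first case, which matches the message's remark that suid/sgid clearing on write depended on the permission bits being present.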
-
Committed by Miklos Szeredi
Before 4bacc9c9 ("overlayfs: Make f_path...") file->f_path pointed to the underlying file, hence suid/sgid removal on write worked fine. After that patch file->f_path pointed to the overlay file, and the file mode bits weren't copied to overlay_inode->i_mode. So the suid/sgid removal simply stopped working. The fix is to copy the mode bits, but then ovl_setattr() needs to clear ATTR_MODE to avoid the BUG() in notify_change(). So do this first, then in the next patch copy the mode. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org>
-
- 03 Jul, 2016 1 commit
-
-
Committed by Vivek Goyal
overlay needs the underlying fs to support d_type. Recently I put in a patch to detect this condition and started failing the mount if the underlying fs did not support d_type. But this breaks existing configurations across kernel upgrades. Those who are running docker (a partially broken configuration) on xfs without d_type support are surprised that docker no longer runs after a kernel upgrade. https://github.com/docker/docker/issues/22937#issuecomment-229881315 So instead of erroring out, detect the broken configuration and warn about it. This should allow existing docker setups to continue working after a kernel upgrade. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 45aebeaf ("ovl: Ensure upper filesystem supports d_type") Cc: <stable@vger.kernel.org> 4.6
-
- 01 Jul, 2016 5 commits
-
-
Committed by Miklos Szeredi
(Another one for the f_path debacle.) ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask. The reason is that generic_add_lease() used filp->f_path.dentry->inode while all the others use file_inode(). This makes a difference for files opened on overlayfs, since the former will point to the overlay inode and the latter to the underlying inode. So generic_add_lease() added the lease to the overlay inode and generic_delete_lease() removed it from the underlying inode. When the file was released the lease remained on the overlay inode's lock list, resulting in use after free. Reported-by: Eryu Guan <eguan@redhat.com> Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Committed by Andrey Ulanov
  - m_start() in fs/namespace.c expects that ns->event is incremented each time a mount is added to or removed from ns->list.
  - umount_tree() removes items from the list but does not increment the event counter, expecting that this is done before the function is called.
  - There are some codepaths that call umount_tree() without updating the "event" counter, e.g. from __detach_mounts().
  - When this happens m_start may reuse a cached mount structure that no longer belongs to ns->list (i.e. use after free, which usually leads to an infinite loop).
This change fixes the above problem by incrementing the global event counter before invoking umount_tree(). Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589 Cc: stable@vger.kernel.org Signed-off-by: Andrey Ulanov <andreyu@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Miklos Szeredi
v9fs may be used as lower layer of overlayfs and accessing f_path.dentry can lead to a crash. In this case it's a NULL pointer dereference in p9_fid_create(). Fix by replacing direct access of file->f_path.dentry with the file_dentry() accessor, which will always return a native object. Reported-by: Alessio Igor Bogani <alessioigorbogani@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Alessio Igor Bogani <alessioigorbogani@gmail.com> Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Scott Mayhew
If the lockd service fails to start up then we need to be sure that the notifier blocks are not registered, otherwise a subsequent start of the service could cause the same notifier to be registered twice, leading to soft lockups. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Cc: stable@vger.kernel.org Fixes: 0751ddf7 "lockd: Register callbacks on the inetaddr_chain..." Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Committed by Tahsin Erdogan
Asynchronous wb switching of inodes takes an additional ref count on an inode to make sure the inode remains valid until switchover is completed. However, anyone calling ihold() must already have a ref count on the inode, but in this case inode->i_count may already be zero:
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 917 at fs/inode.c:397 ihold+0x2b/0x30
    CPU: 1 PID: 917 Comm: kworker/u4:5 Not tainted 4.7.0-rc2+ #49
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: writeback wb_workfn (flush-8:16)
     0000000000000000 ffff88007ca0fb58 ffffffff805990af 0000000000000000
     0000000000000000 ffff88007ca0fb98 ffffffff80268702 0000018d000004e2
     ffff88007cef40e8 ffff88007c9b89a8 ffff880079e3a740 0000000000000003
    Call Trace:
     [<ffffffff805990af>] dump_stack+0x4d/0x6e
     [<ffffffff80268702>] __warn+0xc2/0xe0
     [<ffffffff802687d8>] warn_slowpath_null+0x18/0x20
     [<ffffffff8035b4ab>] ihold+0x2b/0x30
     [<ffffffff80367ecc>] inode_switch_wbs+0x11c/0x180
     [<ffffffff80369110>] wbc_detach_inode+0x170/0x1a0
     [<ffffffff80369abc>] writeback_sb_inodes+0x21c/0x530
     [<ffffffff80369f7e>] wb_writeback+0xee/0x1e0
     [<ffffffff8036a147>] wb_workfn+0xd7/0x280
     [<ffffffff80287531>] ? try_to_wake_up+0x1b1/0x2b0
     [<ffffffff8027bb09>] process_one_work+0x129/0x300
     [<ffffffff8027be06>] worker_thread+0x126/0x480
     [<ffffffff8098cde7>] ? __schedule+0x1c7/0x561
     [<ffffffff8027bce0>] ? process_one_work+0x300/0x300
     [<ffffffff80280ff4>] kthread+0xc4/0xe0
     [<ffffffff80335578>] ? kfree+0xc8/0x100
     [<ffffffff809903cf>] ret_from_fork+0x1f/0x40
     [<ffffffff80280f30>] ? __kthread_parkme+0x70/0x70
    ---[ end trace aaefd2fd9f306bc4 ]---
Signed-off-by: Tahsin Erdogan <tahsin@google.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 30 Jun, 2016 2 commits
-
-
Committed by Miklos Szeredi
Negotiate with userspace filesystems whether they support parallel readdir and lookup. Disable parallelism by default for fear of breaking fuse filesystems. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 9902af79 ("parallel lookups: actual switch to rwsem") Fixes: d9b3dbdc ("fuse: switch to ->iterate_shared()")
-
Committed by Marek Vasut
The simple_write_to_buffer() already increments the @ppos on success, see fs/libfs.c simple_write_to_buffer() comment: "On success, the number of bytes written is returned and the offset @ppos advanced by this number, or negative value is returned on error." If configfs_write_bin_file() is invoked with @count smaller than the total length of the written binary file, it will be invoked multiple times. Since configfs_write_bin_file() increments @ppos on success, after calling simple_write_to_buffer(), the @ppos is incremented twice. A subsequent invocation of configfs_write_bin_file() will result in the next piece of data being written at an offset twice as large as the length of the previous write, thus creating a buffer with "holes" in it. The simple testcase using DTO follows:
    $ mkdir /sys/kernel/config/device-tree/overlays/1
    $ dd bs=1 if=foo.dtbo of=/sys/kernel/config/device-tree/overlays/1/dtbo
Without this patch, the testcase will result in a buffer twice as big in the kernel, which is then passed to cfs_overlay_item_dtbo_write(). Signed-off-by: Marek Vasut <marex@denx.de> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Christoph Hellwig <hch@lst.de> Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
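The double increment can be reproduced with a user-space model. In the sketch below, `copy_and_advance` is a hypothetical stand-in that mimics the documented contract of simple_write_to_buffer() (copy at the offset, then advance it); `write_chunk_buggy`, `write_chunk_fixed` and `feed_bytewise` are likewise illustrative names, not configfs functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy up to count bytes to buf at *ppos and advance *ppos by that amount. */
static long copy_and_advance(char *buf, size_t cap, long *ppos,
                             const char *from, size_t count)
{
    assert(*ppos >= 0);
    size_t pos = (size_t)*ppos;

    if (pos >= cap || count == 0)
        return 0;
    if (count > cap - pos)
        count = cap - pos;
    memcpy(buf + pos, from, count);
    *ppos += (long)count;
    return (long)count;
}

/* Pre-fix shape: the caller advances the offset a second time. */
static long write_chunk_buggy(char *buf, size_t cap, long *ppos,
                              const char *from, size_t count)
{
    long n = copy_and_advance(buf, cap, ppos, from, count);

    if (n > 0)
        *ppos += n;   /* double increment: leaves a hole of n bytes */
    return n;
}

/* Post-fix shape: rely on the helper's own offset update. */
static long write_chunk_fixed(char *buf, size_t cap, long *ppos,
                              const char *from, size_t count)
{
    return copy_and_advance(buf, cap, ppos, from, count);
}

typedef long (*write_fn)(char *, size_t, long *, const char *, size_t);

/* Feed data one byte per call, like `dd bs=1`; returns the final offset. */
static long feed_bytewise(write_fn fn, char *buf, size_t cap,
                          const char *data, size_t len)
{
    long pos = 0;

    for (size_t i = 0; i < len; i++)
        fn(buf, cap, &pos, data + i, 1);
    return pos;
}
```

Feeding the four bytes "abcd" into a zeroed buffer ends at offset 4 with the fixed variant and at offset 8 with the buggy one, with a zero byte after each data byte, which corresponds to the "holes" and the doubled buffer size the message describes.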
-
- 29 Jun, 2016 3 commits
-
-
Committed by Miklos Szeredi
When truncating a file we should check write access on the underlying inode. And we should do so on the lower file as well (before copy-up) for consistency. Original patch and test case by Aihua Zhang.
    - - >o >o - - test.c - - >o >o - -
    #include <stdio.h>
    #include <errno.h>
    #include <unistd.h>
    int main(int argc, char *argv[])
    {
        int ret;
        ret = truncate(argv[0], 4096);
        if (ret != -1) {
            fprintf(stderr, "truncate(argv[0]) should have failed\n");
            return 1;
        }
        if (errno != ETXTBSY) {
            perror("truncate(argv[0])");
            return 1;
        }
        return 0;
    }
    - - >o >o - - >o >o - - >o >o - -
Reported-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
-
Committed by Miklos Szeredi
When using the 'default_permissions' mount option, ovl_permission() on non-directories was missing a dput(alias), resulting in "BUG Dentry still in use". Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 8d3095f4 ("ovl: default permissions") Cc: <stable@vger.kernel.org> # v4.5+
-
Committed by Trond Myklebust
Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.
    fd0 = open(foo, RDRW) -- should be open on the wire for "both"
    fd1 = open(foo, RDONLY) -- should be open on the wire for "read"
    close(fd0) -- should trigger an open_downgrade
    read(fd1)
    close(fd1)
The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code") Cc: stable@vger.kernel.org # 2.6.33+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
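The expected behaviour in this test can be modelled with per-mode open counts. The sketch below is a simplified model under assumptions, not the NFS client's state machine: the enum values and the names `open_state`, `state_open` and `state_close` are hypothetical, and the share access held "on the wire" is tracked as a plain bitmask.

```c
#include <assert.h>

enum { SHARE_NONE = 0, SHARE_READ = 1, SHARE_WRITE = 2, SHARE_BOTH = 3 };
enum action { ACT_NONE, ACT_DOWNGRADE, ACT_CLOSE };

/* Hypothetical per-file state: local open counts plus the wire share mode. */
struct open_state {
    int n_rdonly, n_wronly, n_rdwr;
    unsigned wire;
};

/* Share access still required by the remaining local opens. */
static unsigned share_needed(const struct open_state *s)
{
    unsigned want = SHARE_NONE;

    if (s->n_rdwr > 0)
        want |= SHARE_BOTH;
    if (s->n_rdonly > 0)
        want |= SHARE_READ;
    if (s->n_wronly > 0)
        want |= SHARE_WRITE;
    return want;
}

static void state_open(struct open_state *s, unsigned mode)
{
    assert(mode == SHARE_READ || mode == SHARE_WRITE || mode == SHARE_BOTH);
    if (mode == SHARE_BOTH)
        s->n_rdwr++;
    else if (mode == SHARE_READ)
        s->n_rdonly++;
    else
        s->n_wronly++;
    s->wire |= mode;
}

/* Drop one local open and report what should go on the wire. */
static enum action state_close(struct open_state *s, unsigned mode)
{
    assert(mode == SHARE_READ || mode == SHARE_WRITE || mode == SHARE_BOTH);
    if (mode == SHARE_BOTH)
        s->n_rdwr--;
    else if (mode == SHARE_READ)
        s->n_rdonly--;
    else
        s->n_wronly--;

    unsigned want = share_needed(s);

    if (want == SHARE_NONE) {
        s->wire = SHARE_NONE;
        return ACT_CLOSE;
    }
    if (want != s->wire) {
        s->wire = want;       /* narrower than what is held: downgrade */
        return ACT_DOWNGRADE;
    }
    return ACT_NONE;
}
```

Replaying the reported sequence (open for both, open for read, close the first descriptor, close the second) through this model yields a downgrade to "read" followed by a close, which is the wire behaviour the message says the test should observe.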
-
- 28 Jun, 2016 1 commit
-
-
Committed by Eric Sandeen
This isn't functionally apparent for some reason, but when we test io at extreme offsets at the end of the loff_t range, such as in fstests xfs/071, the calculation of "max" in dax_io() can be wrong due to pos + size overflowing. For example,
    # xfs_io -c "pwrite 9223372036854771712 512" /mnt/test/file
enters dax_io with:
    start 0x7ffffffffffff000
    end   0x7ffffffffffff200
and the rounded up "size" variable is 0x1000. This yields:
    pos + size 0x8000000000000000 (overflows loff_t)
    end        0x7ffffffffffff200
Due to the overflow, the min() function picks the wrong value for the "max" variable, and when we send (max - pos) into e.g. copy_from_iter_pmem() it is also the wrong value. This somehow(tm) gets magically absorbed without incident, probably because iter->count is correct. But it seems best to fix it up properly by comparing the two values as unsigned. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
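The arithmetic in this message can be checked in user space. The sketch below uses int64_t as a stand-in for loff_t and two hypothetical helpers, `max_signed` and `max_unsigned`, that model the signed and unsigned comparisons; it is not the dax_io() code. Because signed overflow is undefined in C, the sum is formed in unsigned arithmetic and converted back, which wraps on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Signed comparison: the wrapped sum becomes a large negative number,
 * so it compares as smaller than end and is picked by the minimum.
 */
static int64_t max_signed(int64_t pos, int64_t size, int64_t end)
{
    assert(pos >= 0 && size >= 0 && end >= pos);
    /* Implementation-defined conversion; wraps on common platforms. */
    int64_t sum = (int64_t)((uint64_t)pos + (uint64_t)size);

    return sum < end ? sum : end;
}

/* Unsigned comparison: the sum stays larger than end, so end is picked. */
static uint64_t max_unsigned(int64_t pos, int64_t size, int64_t end)
{
    assert(pos >= 0 && size >= 0 && end >= pos);
    uint64_t sum = (uint64_t)pos + (uint64_t)size;

    return sum < (uint64_t)end ? sum : (uint64_t)end;
}
```

With pos = 0x7ffffffffffff000, size = 0x1000 and end = 0x7ffffffffffff200, the signed variant returns the wrapped value while the unsigned variant returns end, so (max - pos) is 0x200 only in the unsigned case.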
-