- 12 Apr 2015, 14 commits
-
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
identical to import_single_range()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
We don't need req in either of those. We don't need nr_segs in the caller. We don't really need len in the caller either - iov_iter_count(&iter) will do.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
The only non-trivial detail is that we do it before rw_verify_area(), so we'd better cap the length ourselves in the aio_setup_single_rw() case (for the vectored case, rw_copy_check_uvector() will do that for us).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Get it closer to matching {compat_,}rw_copy_check_uvector().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Andrew Elble
We have observed a BUG() crash in fs/attr.c:notify_change(). The crash occurs during an rsync into a filesystem that is exported via NFS.

1.) fs/attr.c:notify_change() modifies the caller's version of attr.
2.) 6de0ec00 ("VFS: make notify_change pass ATTR_KILL_S*ID to setattr operations") introduced a BUG() restriction such that "no function will ever call notify_change() with both ATTR_MODE and ATTR_KILL_S*ID set". Under some circumstances, though, it will have assisted in setting the caller's version of attr to this very combination.
3.) 27ac0ffe ("locks: break delegations on any attribute modification") introduced code to handle breaking delegations. This can result in notify_change() being re-called. attr _must_ be explicitly reset to avoid triggering the BUG() established in #2.
4.) The path that triggers this is via fs/open.c:chmod_common(). The combination of attr flags set here and in the first call to notify_change(), along with a later failed break_deleg_wait(), results in notify_change() being called again via retry_deleg without resetting attr.

The solution is to move retry_deleg in chmod_common() a bit further up, to ensure attr is completely reset. There are other places where this could seemingly occur, such as fs/utimes.c:utimes_common(), but the attr flags are not initially set in such a way as to trigger this.

Fixes: 27ac0ffe ("locks: break delegations on any attribute modification")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
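For illustration, a minimal sketch of the fixed flow in chmod_common() (paraphrased from the changelog, assuming the usual locals there; the surrounding mnt_want_write()/mnt_drop_write() pairing is elided):

	retry_deleg:
		mutex_lock(&inode->i_mutex);
		error = security_path_chmod(path, mode);
		if (error)
			goto out_unlock;
		/* rebuilt on every pass, so a retry never re-enters
		 * notify_change() with the ATTR_KILL_S*ID bits it added */
		newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
		newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
		error = notify_change(path->dentry, &newattrs, &delegated_inode);
	out_unlock:
		mutex_unlock(&inode->i_mutex);
		if (delegated_inode) {
			error = break_deleg_wait(&delegated_inode);
			if (!error)
				goto retry_deleg;	/* attr fully reset above */
		}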
-
By J. Bruce Fields
On a distributed filesystem it's possible for lookup to discover that a directory it just found is already cached elsewhere in the directory hierarchy. The dcache won't let us keep the directory in both places, so we have to move the dentry to the new location from the place we previously had it cached. If the parent has changed, then this requires all the same locks as we'd need to do a cross-directory rename. But we're already in lookup holding one parent's i_mutex, so it's too late to acquire those locks in the right order.

The (unreliable) solution in __d_unalias is to trylock() the required locks and return -EBUSY if it fails. I see no particular reason for returning -EBUSY, and -ESTALE is already the result of some other lookup races on NFS. I think -ESTALE is the more helpful error return. It also allows us to take advantage of the logic Jeff Layton added in c6a94284 ("vfs: fix renameat to retry on ESTALE errors") and ancestors, which hopefully resolves some of these errors before they're returned to userspace.

I can reproduce these cases using NFS with:

	ssh root@$client '
		mount -olookupcache=pos '$server':'$export' /mnt/
		mkdir /mnt/TO
		mkdir /mnt/DIR
		touch /mnt/DIR/test.txt
		while true; do
			strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
		done '
	ssh root@$server '
		while true; do
			mv $export/DIR $export/TO/DIR
			mv $export/TO/DIR $export/DIR
		done '

It also helps to add some other concurrent use of the directory on the client (e.g., "ls /mnt/TO"). And you can replace the server-side mv's by client-side mv's that are repeatedly killed. (If the client is interrupted while waiting for the RENAME response then it's left with a dentry that has to go under one parent or the other, but it doesn't yet know which.)

Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Anton Altaparmakov
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
For one thing, LOOKUP_DIRECTORY will be dealt with in do_last(). For another, name can be an empty string, but not NULL - no callers pass that, and it would oops immediately if they did.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Just make const char iname[] the last member and compare name->name with name->iname instead of checking name->separate.

We need to make sure that an out-of-line name doesn't end up allocated adjacent to the struct filename referring to it; fortunately, that's easy to achieve - just allocate that struct filename with one byte in ->iname[], so that ->iname[0] will be inside the same object and thus have an address different from that of the out-of-line name.

[spotted by Boqun Feng <boqun.feng@gmail.com>]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
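For reference, a sketch of the resulting layout (the field set is abridged and illustrative; the point is the flexible array member and the pointer comparison that replaces ->separate):

	struct filename {
		const char		*name;	/* pointer to the actual string */
		const __user char	*uptr;	/* original userland pointer */
		int			refcnt;
		const char		iname[];	/* short names stored inline */
	};

	/* hypothetical helper, for clarity: replaces the old ->separate flag */
	static bool filename_is_inline(const struct filename *name)
	{
		return name->name == name->iname;
	}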
-
- 26 Mar 2015, 1 commit
-
-
By Christoph Hellwig
struct kiocb is now a generic I/O container, so move it to fs.h. Also do an #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 25 Mar 2015, 4 commits
-
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
All callers were passing it the ->name of some struct filename.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
By Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 20 Mar 2015, 1 commit
-
-
By Christoph Hellwig
Due to a merge error when creating c5c707f9 ("nfsd: implement pNFS layout recalls"), we recursively call nfsd4_cb_layout_fail from itself, leading to stack overflows.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: c5c707f9 ("nfsd: implement pNFS layout recalls")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4layouts.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 3c1bfa1..1028a06 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -587,8 +587,6 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls)
 	rpc_ntop((struct sockaddr *)&clp->cl_addr, addr_str,
 		 sizeof(addr_str));
 
-	nfsd4_cb_layout_fail(ls);
-
 	printk(KERN_WARNING
 		"nfsd: client %s failed to respond to layout recall. "
 		"  Fencing..\n", addr_str);
--
1.9.1
-
- 19 Mar 2015, 1 commit
-
-
By Tom Van Braeckel
The misc subsystem (which is used for /dev/fuse) initializes private_data to point to the misc device when a driver has registered a custom open file operation, and initializes it to NULL when a custom open file operation has *not* been provided.

This subtle quirk is confusing, to the point where kernel code registers *empty* file open operations to have private_data point to the misc device structure. And it leads to bugs, where the addition or removal of a custom open file operation surprisingly changes the initial contents of a file's private_data structure.

So to simplify things in the misc subsystem, a patch [1] has been proposed to *always* set the private_data to point to the misc device, instead of only doing this when a custom open file operation has been registered. But before this patch can be applied, we need to modify drivers that assume a misc device file's private_data is initialized to NULL because they didn't register a custom open file operation, so that they no longer rely on this assumption.

FUSE uses private_data to store the fuse_conn and errors out if this is not initialized to NULL at mount time. Hence, we now set a file's private_data to NULL explicitly, to be independent of whatever value the misc subsystem initializes it to by default.

[1] https://lkml.org/lkml/2014/12/4/939

Reported-by: Giedrius Statkevicius <giedriuswork@gmail.com>
Reported-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Tom Van Braeckel <tomvanbraeckel@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
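The shape of the change is small (a sketch; it assumes a dedicated open hook, here called fuse_dev_open(), wired into the /dev/fuse file operations):

	static int fuse_dev_open(struct inode *inode, struct file *file)
	{
		/*
		 * The fuse device file's private_data will hold the fuse_conn
		 * once mounted; start from NULL explicitly instead of relying
		 * on whatever the misc subsystem happens to put there.
		 */
		file->private_data = NULL;
		return 0;
	}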
-
- 18 Mar 2015, 8 commits
-
-
By hujianyang
After the introduction of multi-lower-layer support, users can mount a read-only partition as the leftmost lowerdir instead of using it as the upperdir, and a read-only upperdir may cause an error like

	overlayfs: failed to create directory ./workdir/work

during mount. This patch checks the s_flags of the upper fs and returns an error if it is a read-only partition. The check of upper_mnt->mnt_sb->s_flags can be removed now.

This patch also removes the /* FIXME: workdir is not needed for a R/O mount */ comment from ovl_fill_super() because:

1) for the upper-fs r/o case: setting a r/o partition as upper is now prevented, so there is no need to care about workdir here.
2) for the "mount overlay -o ro" with a r/w upper fs case: users can remount overlayfs r/w, so workdir should not be omitted.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
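Roughly what the new check amounts to in ovl_fill_super() (a sketch; the error label and message wording are illustrative):

	/* a read-only filesystem cannot serve as upper; use it as a lower layer */
	if (upperpath.mnt->mnt_sb->s_flags & MS_RDONLY) {
		pr_err("overlayfs: upper fs is r/o, try multi-lower layers mount\n");
		err = -EINVAL;
		goto out_put_upperpath;
	}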
-
By hujianyang
The recent multi-lower-layer mount support allows upperdir and workdir to be omitted, which in turn allows overlayfs to be mounted with only one lowerdir directory. That makes no sense and carries potential risk. This patch checks the total number of lower directories to prevent mounting overlayfs with only one directory. Also, an error message is added to indicate when the number of lower directories exceeds the OVL_MAX_STACK limit.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
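The two checks, sketched (helper and field names follow the ovl_fill_super() code described in the changelog and are assumptions here; exact messages are illustrative):

	err = -EINVAL;
	stacklen = ovl_split_lowerdirs(lowertmp);
	if (stacklen > OVL_MAX_STACK) {
		pr_err("overlayfs: too many lower directories, limit is %d\n",
		       OVL_MAX_STACK);
		goto out_free_lowertmp;
	} else if (!ufs->config.upperdir && stacklen == 1) {
		/* with no upper layer, a single lower layer is pointless */
		pr_err("overlayfs: at least 2 lowerdir are needed while upperdir nonexistent\n");
		goto out_free_lowertmp;
	}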
-
By hujianyang
Overlayfs should print an error message if an incorrect mount option is caught, like other filesystems do. After this patch, improper option input is clearly reported.

Reported-by: Fabian Sturm <fabian.sturm@aduu.de>
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
-
By Josef Bacik
We are keeping track of how many extents we need to reserve properly, based on the amount we want to write, but we were still incrementing outstanding_extents if we wrote less than what we requested. This isn't quite right, since we will be limited to our max extent size. So instead let's do something horrible! Keep track of how many outstanding_extents we reserved, and decrement each time we allocate an extent. If we use our entire reserve, make sure to jack up outstanding_extents on the inode so the accounting works out properly. Thanks,

Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
-
By Josef Bacik
I introduced a regression wrt outstanding_extents accounting. These are tricky areas that aren't easily covered by xfstests, as we could change MAX_EXTENT_SIZE at any time. So add sanity tests to cover the various tricky conditions, in order to make sure we don't introduce regressions in the future. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
-
By Josef Bacik
If we fail during our sanity tests we could get NULL derefs, because we unload the module before the dummy extent buffers are freed via RCU. So check for this case and just free the things directly. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
-
By Josef Bacik
My fix "Btrfs: fix merge delalloc logic" only fixed half of the problems; it didn't fix the case where we have two large extents on either side and then join them together with a new small extent. We need to instead keep track of how many extents we have accounted for with each side of the new extent, and then see how many extents we need for the new large extent. If they match, then we know we need to keep our reservation; otherwise we need to drop our reservation. This shows up with a case like this:

	[BTRFS_MAX_EXTENT_SIZE+4K][4K HOLE][BTRFS_MAX_EXTENT_SIZE+4K]

Previously the logic would have said that the number of extents required for the new size (3) is larger than the number of extents required for the largest side (2), therefore we need to keep our reservation. But this isn't the case, since both sides require a reservation of 2, which leads to 4 for the whole range currently reserved, but we only need 3, so we need to drop one of the reservations. The same problem existed for splits: we'd think we only need 3 extents when creating the hole, but in reality we need 4. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
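To make the arithmetic concrete, a worked sketch (assuming BTRFS_MAX_EXTENT_SIZE of 128M, its value in this era; the rounding-up division mirrors what the accounting does):

	#define SZ_4K			4096ULL
	#define BTRFS_MAX_EXTENT_SIZE	(128ULL * 1024 * 1024)

	/* outstanding extents needed for a contiguous range of 'size' bytes */
	static unsigned long long num_extents(unsigned long long size)
	{
		return (size + BTRFS_MAX_EXTENT_SIZE - 1) / BTRFS_MAX_EXTENT_SIZE;
	}

	/*
	 * For [MAX+4K][4K HOLE][MAX+4K]:
	 *   per side: num_extents(BTRFS_MAX_EXTENT_SIZE + SZ_4K) == 2, so 4
	 *             extents are reserved in total before the hole is filled;
	 *   merged:   num_extents(2 * BTRFS_MAX_EXTENT_SIZE + 3 * SZ_4K) == 3.
	 * One of the four reservations must therefore be dropped; comparing
	 * the merged count (3) only against the larger side (2) would have
	 * wrongly kept all four.
	 */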
-
By Kirill A. Shutemov
As pointed out by a recent post[1] on exploiting DRAM physical imperfections, /proc/PID/pagemap exposes sensitive information which can be used to mount attacks. This disallows anybody without CAP_SYS_ADMIN from reading the pagemap.

[1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html

[ Eventually we might want to do something more finegrained, but for now this is the simple model. - Linus ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Seaborn <mseaborn@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
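The fix boils down to an open-time capability gate (a sketch of the added open handler; wiring it into the pagemap file_operations is omitted):

	static int pagemap_open(struct inode *inode, struct file *file)
	{
		/* do not disclose physical addresses to unprivileged
		   userspace (closes a rowhammer attack vector) */
		if (!capable(CAP_SYS_ADMIN))
			return -EPERM;
		return 0;
	}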
-
- 17 Mar 2015, 2 commits
-
-
By Josef Bacik
Writing the block group cache will modify the extent tree quite a bit, because it truncates the old space cache and pre-allocates new stuff. To try and cut down on the churn, let's do the setup dance first; then later on hopefully we can avoid looping with newly dirtied roots. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
-
By NeilBrown
Kernfs supports two styles of read: direct_read and seqfile_read. The latter supports 'poll' correctly thanks to the update of '->event' in kernfs_seq_show. The former does not, as '->event' is never updated on a read. So add an appropriate update in kernfs_file_direct_read(). This was noticed because some 'md' sysfs attributes were recently changed to use direct reads.

Reported-by: Prakash Punnoor <prakash@punnoor.de>
Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Fixes: 750f199e
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
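The fix itself is essentially one line in kernfs_file_direct_read(), mirroring what kernfs_seq_show() already does (a sketch; the surrounding locking is elided):

	/* record the current event counter so a later poll() does not
	 * keep reporting this attribute as changed after it was read */
	of->event = atomic_read(&of->kn->attr.open->event);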
-
- 14 Mar 2015, 9 commits
-
-
By Jeff Layton
It's possible that "fl" won't point at a valid lock at this point, so use "victim" instead, which is either a valid lock or NULL.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
-
By Josef Bacik
Dave could hit this assert consistently when running btrfs/078. This is because when we update the block groups we could truncate the free space, which would try to delete the csums for that range and dirty the csum root. For this to happen we have to have already written out the csum root, so it's kind of hard to hit this case. This patch fixes this by changing the logic to only write the dirty block groups if the dirty_cowonly_roots list is empty. This will get us the same effect as before, since we add the extent root last, and will cover the case that we dirty some other root again but not the extent root. Thanks,

Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Josef Bacik
Direct IO can easily pass in a buffer that is greater than BTRFS_MAX_EXTENT_SIZE, so take this into account when reserving extents in the delalloc reservation code. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Josef Bacik
My patch to properly count outstanding extents wrt MAX_EXTENT_SIZE introduced a regression when re-dirtying already dirty areas. We have logic in split to make sure we are taking the largest space into account, but didn't have it for merge, so it was sometimes making us think we were turning a tiny extent into a huge extent, when in reality we already had a huge extent and needed to use the other side in our logic. This fixes the regression that was reported by a user on the list. Thanks,

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Liu Bo
Case (oper1->seq > oper2->seq) should differ from case (oper1->seq < oper2->seq).

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Liu Bo
This problem is uncovered by a test case: http://patchwork.ozlabs.org/patch/244297

fsync() can report success when it actually doesn't. When we have several threads running fsync() at the same time, and in one fsync() we get a transaction abortion due to some problem (in the test case it's disk failures), other fsync()s may return successfully, which makes userspace programs think that data is now safely flushed to disk.

This happens because, after those fsync()s fail btrfs_sync_log() due to disk failures, they get to try btrfs_commit_transaction(), where they find that there is already a transaction being committed, so they just call wait_for_commit() and return. Note that we actually check "trans->aborted" in btrfs_end_transaction, but it's likely that the error has not yet been propagated at that point; only after wait_for_commit() are we sure whether the transaction was committed successfully.

This adds the necessary check, and the test now passes.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
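The added check, sketched against the join-an-existing-commit path in btrfs_commit_transaction() (placement approximate; field names per the changelog):

	wait_for_commit(root, cur_trans);
	/* the commit we waited on may itself have aborted; do not let
	 * this fsync() report success in that case */
	if (unlikely(cur_trans->aborted))
		ret = cur_trans->aborted;
	btrfs_put_transaction(cur_trans);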
-
By Fabian Frederick
This patch fixes a mips compilation warning:

	fs/btrfs/disk-io.c: In function 'btrfs_check_super_valid':
	fs/btrfs/disk-io.c:3927:21: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'unsigned int' [-Wformat]

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Christoph Hellwig
Most callers in the kernel want to perform synchronous file I/O, but still have to bloat the stack with a full struct kiocb. Split out the parts needed in filesystem code from those in the aio code, and only allocate those needed to pass down arguments on the stack. The aio code embeds the generic iocb in the one it allocates and can easily get back to it by using container_of.

Also add a ->ki_complete method to struct kiocb; this is used to call into the aio code and thus removes the dependency on aio for filesystems implementing asynchronous operations. It will also allow other callers to substitute their own completion callback.

We also add a new ->ki_flags field to work around the nasty layering violation recently introduced in commit 5e33f6 ("usb: gadget: ffs: add eventfd notification about ffs events").

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
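Approximately the slimmed-down container that results (a sketch of the generic side; the aio code embeds this in its own, larger iocb and recovers it via container_of):

	struct kiocb {
		struct file	*ki_filp;
		loff_t		ki_pos;
		/* completion hook: aio (or any other caller) plugs in here,
		 * so filesystems no longer depend on aio internals */
		void		(*ki_complete)(struct kiocb *iocb, long ret, long ret2);
		void		*private;
		int		ki_flags;
	};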
-
By Christoph Hellwig
The AIO interface is fairly complex because it tries to allow filesystems to always work async and then wake up a synchronous caller through aio_complete. It turns out that basically no one was doing this, to avoid the complexity and context switches, and we've already fixed up the remaining users and can now get rid of this case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-