- 14 3月, 2011 1 次提交
-
-
由 Al Viro 提交于
Fix for a dumb preadv()/pwritev() compat bug - unlike the native variants, compat_... ones forget to check FMODE_P{READ,WRITE}, so e.g. on pipe the native preadv() will fail with -ESPIPE and compat one will act as readv() and succeed. Not critical, but it's a clear bug with trivial fix. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 3月, 2011 10 次提交
-
-
由 J. Bruce Fields 提交于
Without this patch, inodes are not promptly freed on last close of an unlinked file by an nfs client: client$ mount -tnfs4 server:/export/ /mnt/ client$ tail -f /mnt/FOO ... server$ df -i /export server$ rm /export/FOO (^C the tail -f) server$ df -i /export server$ echo 2 >/proc/sys/vm/drop_caches server$ df -i /export the df's will show that the inode is not freed on the filesystem until the last step, when it could have been freed after killing the client's tail -f. On-disk data won't be deallocated either, leading to possible spurious ENOSPC. This occurs because when the client does the close, it arrives in a compound with a putfh and a close, processed like: - putfh: look up the filehandle. The only alias found for the inode will be DCACHE_UNHASHED alias referenced by the filp this, so it creates a new DCACHE_DISCONECTED dentry and returns that instead. - close: closes the existing filp, which is destroyed immediately by dput() since it's DCACHE_UNHASHED. - end of the compound: release the reference to the current filehandle, and dput() the new DCACHE_DISCONECTED dentry, which gets put on the unused list instead of being destroyed immediately. Nick Piggin suggested fixing this by allowing d_obtain_alias to return the unhashed dentry that is referenced by the filp, instead of making it create a new dentry. Leave __d_find_alias() alone to avoid changing behavior of other callers. Also nfsd doesn't need all the checks of __d_find_alias(); any dentry, hashed or unhashed, disconnected or not, should work. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Marco Stornelli 提交于
In the fallocate path the kernel doesn't check for the immutable/append flag. It's possible to have a race condition in this scenario: an application open a file in read/write and it does something, meanwhile root set the immutable flag on the file, the application at that point can call fallocate with success. In addition, we don't allow to do any unreserve operation on an append only file but only the reserve one. Signed-off-by: NMarco Stornelli <marco.stornelli@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
can't blindly check nd->flags in ->d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
can't blindly check nd->flags in ->d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
can't blindly check nd->flags in ->d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
can't blindly check nd->flags in ->d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
can't blindly check nd->flags in ->d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
can't blindly check nd->flags in ->d_revalidate() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... it returns an error unconditionally Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 09 3月, 2011 2 次提交
-
-
由 Al Viro 提交于
We leave it at whatever it had been pointing to after the first link_path_walk() had failed with -ESTALE. Things do not work well after that... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 roel 提交于
Index i was already used in the outer loop Cc: stable@kernel.org Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 08 3月, 2011 2 次提交
-
-
由 Al Viro 提交于
a) struct inode is not going to be freed under ->d_compare(); however, the thing PROC_I(inode)->sysctl points to just might. Fortunately, it's enough to make freeing that sucker delayed, provided that we don't step on its ->unregistering, clear the pointer to it in PROC_I(inode) before dropping the reference and check if it's NULL in ->d_compare(). b) I'm not sure that we *can* walk into NULL inode here (we recheck dentry->seq between verifying that it's still hashed / fetching dentry->d_inode and passing it to ->d_compare() and there's no negative hashed dentries in /proc/sys/*), but if we can walk into that, we really should not have ->d_compare() return 0 on it! Said that, I really suspect that this check can be simply killed. Nick? Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 J. Bruce Fields 提交于
In case of a nonempty list, the return on error here is obviously bogus; it ends up being a pointer to the list head instead of to any valid delegation on the list. In particular, if nfsd4_delegreturn() hits this case, and you're quite unlucky, then renew_client may oops, and it may take an embarassingly long time to figure out why. Facepalm. BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 IP: [<ffffffff81292965>] nfsd4_delegreturn+0x125/0x200 ... Cc: stable@kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 05 3月, 2011 3 次提交
-
-
由 Neil Horman 提交于
The "bad_page()" page allocator sanity check was reported recently (call chain as follows): bad_page+0x69/0x91 free_hot_cold_page+0x81/0x144 skb_release_data+0x5f/0x98 __kfree_skb+0x11/0x1a tcp_ack+0x6a3/0x1868 tcp_rcv_established+0x7a6/0x8b9 tcp_v4_do_rcv+0x2a/0x2fa tcp_v4_rcv+0x9a2/0x9f6 do_timer+0x2df/0x52c ip_local_deliver+0x19d/0x263 ip_rcv+0x539/0x57c netif_receive_skb+0x470/0x49f :virtio_net:virtnet_poll+0x46b/0x5c5 net_rx_action+0xac/0x1b3 __do_softirq+0x89/0x133 call_softirq+0x1c/0x28 do_softirq+0x2c/0x7d do_IRQ+0xec/0xf5 default_idle+0x0/0x50 ret_from_intr+0x0/0xa default_idle+0x29/0x50 cpu_idle+0x95/0xb8 start_kernel+0x220/0x225 _sinittext+0x22f/0x236 It occurs because an skb with a fraglist was freed from the tcp retransmit queue when it was acked, but a page on that fraglist had PG_Slab set (indicating it was allocated from the Slab allocator (which means the free path above can't safely free it via put_page. We tracked this back to an nfsv4 setacl operation, in which the nfs code attempted to fill convert the passed in buffer to an array of pages in __nfs4_proc_set_acl, which gets used by the skb->frags list in xs_sendpages. __nfs4_proc_set_acl just converts each page in the buffer to a page struct via virt_to_page, but the vfs allocates the buffer via kmalloc, meaning the PG_slab bit is set. We can't create a buffer with kmalloc and free it later in the tcp ack path with put_page, so we need to either: 1) ensure that when we create the list of pages, no page struct has PG_Slab set or 2) not use a page list to send this data Given that these buffers can be multiple pages and arbitrarily sized, I think (1) is the right way to go. I've written the below patch to allocate a page from the buddy allocator directly and copy the data over to it. This ensures that we have a put_page free-able page for every entry that winds up on an skb frag list, so it can be safely freed when the frame is acked. We do a put page on each entry after the rpc_call_sync call so as to drop our own reference count to the page, leaving only the ref count taken by tcp_sendpages. This way the data will be properly freed when the ack comes in Successfully tested by myself to solve the above oops. Note, as this is the result of a setacl operation that exceeded a page of data, I think this amounts to a local DOS triggerable by an uprivlidged user, so I'm CCing security on this as well. Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> CC: Trond Myklebust <Trond.Myklebust@netapp.com> CC: security@kernel.org CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sage Weil 提交于
Otherwise you can do things like # mkdir .snap/foo # cd .snap/foo/.snap # ls <badness> Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Al Viro 提交于
failure exits on the no-O_CREAT side of do_filp_open() merge with those of O_CREAT one; unfortunately, if do_path_lookup() returns -ESTALE, we'll get out_filp:, notice that we are about to return -ESTALE without having trying to create the sucker with LOOKUP_REVAL and jump right into the O_CREAT side of code. And proceed to try and create a file. Usually that'll fail with -ESTALE again, but we can race and get that attempt of pathname resolution to succeed. open() without O_CREAT really shouldn't end up creating files, races or not. The real fix is to rearchitect the whole do_filp_open(), but for now splitting the failure exits will do. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 04 3月, 2011 3 次提交
-
-
由 Sage Weil 提交于
First, this was racy anyway: d_release isn't called until well after the dentry is unhashed. Second, this runs afoul of the recent dcache change that clears d_parent prior to calling d_release (949854d0), causing a NULL pointer dereference. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Do not set the I_COMPLETE flag on directories until we resolve races with dcache pruning. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
This reverts commit 97d79b40. This fails to account for d_parent changes due to rename or disconnected dentries due to submounts or NFS reexports. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 03 3月, 2011 9 次提交
-
-
由 Al Viro 提交于
merge hfs_unlink() and hfs_rmdir(), while we are at it. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
(256 << sizeof(x)) - 1 is not the maximal possible value of x... In reality, the maximal allowed value for UDF FileLinkCount is 65535. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
if directory has so many subdirectories that its link count is set to 1 (i.e. "can't tell accurately") and reiserfs_new_inode() fails, we shouldn't decrement the parent's link count in cleanup path; that's what DEC_DIR_INODE_NLINK() is for. As it is, we end up with parent suddenly getting zero i_nlink, with very unpleasant effects. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Paul Bolle 提交于
This message looks like an error (which it isn't) when booting with a flattened device tree. Remove the message from normal kernel builds. Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
-
- 02 3月, 2011 3 次提交
-
-
由 Josh Hunt 提交于
vfs_rename_other() does not lock renamed inode with i_mutex. Thus changing i_nlink in a non-atomic manner (which happens in ext2_rename()) can corrupt it as reported and analyzed by Josh. In fact, there is no good reason to mess with i_nlink of the moved file. We did it presumably to simulate linking into the new directory and unlinking from an old one. But the practical effect of this is disputable because fsck can possibly treat file as being properly linked into both directories without writing any error which is confusing. So we just stop increment-decrement games with i_nlink which also fixes the corruption. CC: stable@kernel.org CC: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJosh Hunt <johunt@akamai.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Alex Elder 提交于
Commit 493f3358 added this call to xfs_fs_geometry() in order to avoid passing kernel stack data back to user space: + memset(geo, 0, sizeof(*geo)); Unfortunately, one of the callers of that function passes the address of a smaller data type, cast to fit the type that xfs_fs_geometry() requires. As a result, this can happen: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: f87aca93 Pid: 262, comm: xfs_fsr Not tainted 2.6.38-rc6-493f3358+ #1 Call Trace: [<c12991ac>] ? panic+0x50/0x150 [<c102ed71>] ? __stack_chk_fail+0x10/0x18 [<f87aca93>] ? xfs_ioc_fsgeometry_v1+0x56/0x5d [xfs] Fix this by fixing that one caller to pass the right type and then copy out the subset it is interested in. Note: This patch is an alternative to one originally proposed by Eric Sandeen. Reported-by: NJeffrey Hundstad <jeffrey.hundstad@mnsu.edu> Signed-off-by: NAlex Elder <aelder@sgi.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Tested-by: NJeffrey Hundstad <jeffrey.hundstad@mnsu.edu>
-
由 Ryusuke Konishi 提交于
According to the report from Jiro SEKIBA titled "regression in 2.6.37?" (Message-Id: <8739n8vs1f.wl%jir@sekiba.com>), on 2.6.37 and later kernels, lscp command no longer displays "i" flag on checkpoints that snapshot operations or garbage collection created. This is a regression of nilfs2 checkpointing function, and it's critical since it broke behavior of a part of nilfs2 applications. For instance, snapshot manager of TimeBrowse gets to create meaningless snapshots continuously; snapshot creation triggers another checkpoint, but applications cannot distinguish whether the new checkpoint contains meaningful changes or not without the i-flag. This patch fixes the regression and brings that application behavior back to normal. Reported-by: NJiro SEKIBA <jir@unicus.jp> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: NJiro SEKIBA <jir@unicus.jp> Cc: stable <stable@kernel.org> [2.6.37]
-
- 01 3月, 2011 1 次提交
-
-
由 Randy Dunlap 提交于
Fix new kernel-doc warning in fs/block_dev.c: Warning(fs/block_dev.c:937): No description found for parameter 'kill_dirty' Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 2月, 2011 5 次提交
-
-
由 Jan Kara 提交于
A race can occur when io_submit() races with io_destroy(): CPU1 CPU2 io_submit() do_io_submit() ... ctx = lookup_ioctx(ctx_id); io_destroy() Now do_io_submit() holds the last reference to ctx. ... queue new AIO put_ioctx(ctx) - frees ctx with active AIOs We solve this issue by checking whether ctx is being destroyed in AIO submission path after adding new AIO to ctx. Then we are guaranteed that either io_destroy() waits for new AIO or we see that ctx is being destroyed and bail out. Cc: Nick Piggin <npiggin@kernel.dk> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
aio-dio-invalidate-failure GPFs in aio_put_req from io_submit. lookup_ioctx doesn't implement the rcu lookup pattern properly. rcu_read_lock does not prevent refcount going to zero, so we might take a refcount on a zero count ioctx. Fix the bug by atomically testing for zero refcount before incrementing. [jack@suse.cz: added comment into the code] Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NNick Piggin <npiggin@kernel.dk> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Timo Warns 提交于
The kernel automatically evaluates partition tables of storage devices. The code for evaluating LDM partitions (in fs/partitions/ldm.c) contains a bug that causes a kernel oops on certain corrupted LDM partitions. A kernel subsystem seems to crash, because, after the oops, the kernel no longer recognizes newly connected storage devices. The patch changes ldm_parse_vmdb() to Validate the value of vblk_size. Signed-off-by: NTimo Warns <warns@pre-sense.de> Cc: Eugene Teo <eugeneteo@kernel.sg> Acked-by: NRichard Russon <ldm@flatcap.org> Cc: Harvey Harrison <harvey.harrison@gmail.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davide Libenzi 提交于
In several places, an epoll fd can call another file's ->f_op->poll() method with ep->mtx held. This is in general unsafe, because that other file could itself be an epoll fd that contains the original epoll fd. The code defends against this possibility in its own ->poll() method using ep_call_nested, but there are several other unsafe calls to ->poll elsewhere that can be made to deadlock. For example, the following simple program causes the call in ep_insert recursively call the original fd's ->poll, leading to deadlock: #include <unistd.h> #include <sys/epoll.h> int main(void) { int e1, e2, p[2]; struct epoll_event evt = { .events = EPOLLIN }; e1 = epoll_create(1); e2 = epoll_create(2); pipe(p); epoll_ctl(e2, EPOLL_CTL_ADD, e1, &evt); epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &evt); write(p[1], p, sizeof p); epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt); return 0; } On insertion, check whether the inserted file is itself a struct epoll, and if so, do a recursive walk to detect whether inserting this file would create a loop of epoll structures, which could lead to deadlock. [nelhage@ksplice.com: Use epmutex to serialize concurrent inserts] Signed-off-by: NDavide Libenzi <davidel@xmailserver.org> Signed-off-by: NNelson Elhage <nelhage@ksplice.com> Reported-by: NNelson Elhage <nelhage@ksplice.com> Tested-by: NNelson Elhage <nelhage@ksplice.com> Cc: <stable@kernel.org> [2.6.34+, possibly earlier] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Anton Blanchard 提交于
I'm seeing the following oops when testing afs: Unable to handle kernel paging request for data at address 0x00000008 ... NIP [c0000000003393b0] .afs_unlink_writeback+0x38/0xc0 LR [c00000000033987c] .afs_put_writeback+0x98/0xec Call Trace: [c00000000345f600] [c00000000033987c] .afs_put_writeback+0x98/0xec [c00000000345f690] [c00000000033ae80] .afs_write_begin+0x6a4/0x75c [c00000000345f790] [c00000000012b77c] .generic_file_buffered_write+0x148/0x320 [c00000000345f8d0] [c00000000012e1b8] .__generic_file_aio_write+0x37c/0x3e4 [c00000000345f9d0] [c00000000012e2a8] .generic_file_aio_write+0x88/0xfc [c00000000345fa90] [c0000000003390a8] .afs_file_write+0x10c/0x178 [c00000000345fb40] [c000000000188788] .do_sync_write+0xc4/0x128 [c00000000345fcc0] [c000000000189658] .vfs_write+0xe8/0x1d8 [c00000000345fd70] [c000000000189884] .SyS_write+0x68/0xb0 [c00000000345fe30] [c000000000008564] syscall_exit+0x0/0x40 afs_write_begin hits an error and calls afs_unlink_writeback. In there we do list_del_init on an uninitialised list. The patch below initialises ->link when creating the afs_writeback struct. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 2月, 2011 1 次提交
-
-
由 Miklos Szeredi 提交于
Commit e1181ee6 "vfs: pass struct file to do_truncate on O_TRUNC opens" broke the behavior of open(O_TRUNC|O_RDONLY) in fuse. Fuse assumed that when called from open, a truncate() will be done, not an ftruncate(). Fix by restoring the old behavior, based on the ATTR_OPEN flag. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-