- 29 4月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
This removes a number of silly games around strncpy_from_user() in do_getname(), and removes that helper function entirely. We instead make getname_flags() just use strncpy_from_user() properly directly. Removing the wrapper function simplifies things noticeably, mostly because we no longer play the unnecessary games with segments (x86 strncpy_from_user() no longer needs the hack), but also because the empty path handling is just much more obvious. The return value of "strncpy_to_user()" is much more obvious than checking an odd error return case from do_getname(). [ non-x86 architectures were notified of this change several weeks ago, since it is possible that they have copied the old broken x86 strncpy_from_user. But nobody reacted, so .. See http://www.spinics.net/lists/linux-arch/msg17313.html for details ] Cc: linux-arch@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 4月, 2012 8 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit a32744d4. While that commit was technically the right thing to do, and made the x86-64 compat mode work identically to native 32-bit mode (and thus fixing the problem with a 32-bit systemd install on a 64-bit kernel), it turns out that the automount binaries had workarounds for this compat problem. Now, the workarounds are disgusting: doing an "uname()" to find out the architecture of the kernel, and then comparing it for the 64-bit cases and fixing up the size of the read() in automount for those. And they were confused: it's not actually a generic 64-bit issue at all, it's very much tied to just x86-64, which has different alignment for an 'u64' in 64-bit mode than in 32-bit mode. But the end result is that fixing the compat layer actually breaks the case of a 32-bit automount on a x86-64 kernel. There are various approaches to fix this (including just doing a "strcmp()" on current->comm and comparing it to "automount"), but I think that I will do the one that teaches pipes about a special "packet mode", which will allow user space to not have to care too deeply about the padding at the end of the autofs packet. That change will make the compat workaround unnecessary, so let's revert it first, and get automount working again in compat mode. The packetized pipes will then fix autofs for systemd. Reported-and-requested-by: NMichael Tokarev <mjt@tls.msk.ru> Cc: Ian Kent <raven@themaw.net> Cc: stable@kernel.org # for 3.3 Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chris Mason 提交于
We're spending huge amounts of time on lock contention during end_io processing because we unconditionally assume we are overwriting an existing extent in the file for each IO. This checks to see if we are outside i_size, and if so, it uses a less expensive readonly search of the btree to look for existing extents. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Btrfs has an optimization where it will preallocate dentries during readdir to fill in enough information to open the inode without an extra lookup. But, we're calling d_alloc, which is doing GFP_KERNEL allocations, and that leads to deadlocks because our readdir code has tree locks held. For now, disable this optimization. We'll fix the gfp mask in the next merge window. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Daniel J Blueman 提交于
Fix out-of-space checking, addressing a warning and potential resource leak when resizing the filesystem down while allocating blocks. Signed-off-by: NDaniel J Blueman <daniel@quora.org> Reviewed-by: NJosef Bacik <josef@redhat.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Stefan Behrens 提交于
may_commit_transaction() calls spin_lock(&space_info->lock); spin_lock(&delayed_rsv->lock); and update_global_block_rsv() calls spin_lock(&block_rsv->lock); spin_lock(&sinfo->lock); Lockdep complains about this at run time. Everywhere except in update_global_block_rsv(), the space_info lock is the outer lock, therefore the locking order in update_global_block_rsv() is changed. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Daniel J Blueman 提交于
I was seeing root_list corruption on unmount during fs resize in 3.4-rc4; add correct locking to address this. Signed-off-by: NDaniel J Blueman <daniel@quora.org> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Jan Schmidt 提交于
btrfs_map_block sets mirror_num, so that the repair code knows eventually which device gave us the read error. For RAID10, mirror_num must be 1 or 2. Before this fix mirror_num was incorrectly related to our stripe index. Signed-off-by: NJan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Josef Bacik 提交于
btrfs_start_delalloc_inodes will just walk the list of delalloc inodes and start writing them out, but it doesn't splice the list or anything so as long as somebody is doing work on the box you could end up in this section _forever_. So just remove it, it's not needed anyway since sync will start writeback on all inodes anyway, all we need to do is wait for ordered extents and then we can commit the transaction. In my horrible torture test sync goes from taking 4 minutes to about 1.5 minutes. Thanks, Signed-off-by: NJosef Bacik <josef@redhat.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
- 26 4月, 2012 4 次提交
-
-
由 Will Deacon 提交于
Revert commit 85e72aa5 ("proc: clear_refs: do not clear reserved pages"), which was a quick fix suitable for -stable until ARM had been moved over to the gate_vma mechanism: https://lkml.org/lkml/2012/1/14/55 With commit f9d4861f ("ARM: 7294/1: vectors: use gate_vma for vectors user mapping"), ARM does now use the gate_vma, so the PageReserved check can be removed from the proc code. Signed-off-by: NWill Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Acked-by: NHugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
This fixes the below reported false lockdep warning. e096d0c7 ("lockdep: Add helper function for dir vs file i_mutex annotation") added a similar annotation for every other inode in hugetlbfs but missed the root inode because it was allocated by a separate function. For HugeTLB fs we allow taking i_mutex in mmap. HugeTLB fs doesn't support file write and its file read callback is modified in a05b0855 ("hugetlbfs: avoid taking i_mutex from hugetlbfs_read()") to not take i_mutex. Hence for HugeTLB fs with regular files we really don't take i_mutex with mmap_sem held. ====================================================== [ INFO: possible circular locking dependency detected ] 3.4.0-rc1+ #322 Not tainted ------------------------------------------------------- bash/1572 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<ffffffff810f1618>] might_fault+0x40/0x90 but task is already holding lock: (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff81125f88>] vfs_readdir+0x56/0xa8 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#12){+.+.+.}: [<ffffffff810a09e5>] lock_acquire+0xd5/0xfa [<ffffffff816a2f5e>] __mutex_lock_common+0x48/0x350 [<ffffffff816a3325>] mutex_lock_nested+0x2a/0x31 [<ffffffff811fb8e1>] hugetlbfs_file_mmap+0x7d/0x104 [<ffffffff810f859a>] mmap_region+0x272/0x47d [<ffffffff810f8a39>] do_mmap_pgoff+0x294/0x2ee [<ffffffff810f8b65>] sys_mmap_pgoff+0xd2/0x10e [<ffffffff8103d19e>] sys_mmap+0x1d/0x1f [<ffffffff816a5922>] system_call_fastpath+0x16/0x1b -> #0 (&mm->mmap_sem){++++++}: [<ffffffff810a0256>] __lock_acquire+0xa81/0xd75 [<ffffffff810a09e5>] lock_acquire+0xd5/0xfa [<ffffffff810f1645>] might_fault+0x6d/0x90 [<ffffffff81125d62>] filldir+0x6a/0xc2 [<ffffffff81133a83>] dcache_readdir+0x5c/0x222 [<ffffffff81125fa8>] vfs_readdir+0x76/0xa8 [<ffffffff811260b6>] sys_getdents+0x79/0xc9 [<ffffffff816a5922>] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#12); lock(&mm->mmap_sem); lock(&sb->s_type->i_mutex_key#12); lock(&mm->mmap_sem); *** DEADLOCK *** 1 lock held by bash/1572: #0: (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff81125f88>] vfs_readdir+0x56/0xa8 stack backtrace: Pid: 1572, comm: bash Not tainted 3.4.0-rc1+ #322 Call Trace: [<ffffffff81699a3c>] print_circular_bug+0x1f8/0x209 [<ffffffff810a0256>] __lock_acquire+0xa81/0xd75 [<ffffffff810f38aa>] ? handle_pte_fault+0x5ff/0x614 [<ffffffff8109e622>] ? mark_lock+0x2d/0x258 [<ffffffff810f1618>] ? might_fault+0x40/0x90 [<ffffffff810a09e5>] lock_acquire+0xd5/0xfa [<ffffffff810f1618>] ? might_fault+0x40/0x90 [<ffffffff816a3249>] ? __mutex_lock_common+0x333/0x350 [<ffffffff810f1645>] might_fault+0x6d/0x90 [<ffffffff810f1618>] ? might_fault+0x40/0x90 [<ffffffff81125d62>] filldir+0x6a/0xc2 [<ffffffff81133a83>] dcache_readdir+0x5c/0x222 [<ffffffff81125cf8>] ? sys_ioctl+0x74/0x74 [<ffffffff81125cf8>] ? sys_ioctl+0x74/0x74 [<ffffffff81125cf8>] ? sys_ioctl+0x74/0x74 [<ffffffff81125fa8>] vfs_readdir+0x76/0xa8 [<ffffffff811260b6>] sys_getdents+0x79/0xc9 [<ffffffff816a5922>] system_call_fastpath+0x16/0x1b Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Dave Jones <davej@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Glauber Costa 提交于
While stressing the kernel with with failing allocations today, I hit the following chain of events: alloc_page_buffers(): bh = alloc_buffer_head(GFP_NOFS); if (!bh) goto no_grow; <= path taken grow_dev_page(): bh = alloc_page_buffers(page, size, 0); if (!bh) goto failed; <= taken, consequence of the above and then the failed path BUG()s the kernel. The failure is inserted a litte bit artificially, but even then, I see no reason why it should be deemed impossible in a real box. Even though this is not a condition that we expect to see around every time, failed allocations are expected to be handled, and BUG() sounds just too much. As a matter of fact, grow_dev_page() can return NULL just fine in other circumstances, so I propose we just remove it, then. Signed-off-by: NGlauber Costa <glommer@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jason Baron 提交于
An epoll_ctl(,EPOLL_CTL_ADD,,) operation can return '-ELOOP' to prevent circular epoll dependencies from being created. However, in that case we do not properly clear the 'tfile_check_list'. Thus, add a call to clear_tfile_check_list() for the -ELOOP case. Signed-off-by: NJason Baron <jbaron@redhat.com> Reported-by: NYurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Cc: Nelson Elhage <nelhage@nelhage.com> Cc: Davide Libenzi <davidel@xmailserver.org> Tested-by: NAlexandra N. Kossovsky <Alexandra.Kossovsky@oktetlabs.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 4月, 2012 2 次提交
-
-
由 Sachin Prabhu 提交于
cifs_show_options uses the wrong conversion specifier for uid, gid, rsize & wsize. Correct this to %u to match it to the variable type 'unsigned integer'. Signed-off-by: NSachin Prabhu <sprabhu@redhat.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
由 Sachin Prabhu 提交于
Show backupuid/backupgid in /proc/mounts for cifs shares mounted with the backupuid/backupgid feature. Also consolidate the two separate checks for pvolume_info->backupuid_specified into a single if condition in cifs_setup_cifs_sb(). Signed-off-by: NSachin Prabhu <sprabhu@redhat.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
- 24 4月, 2012 4 次提交
-
-
由 Bob Peterson 提交于
This patch instructs DLM to prevent an "in place" conversion, where the lock just stays on the granted queue, and instead forces the conversion to the back of the convert queue. This is done on upward conversions only. This is useful in cases where, for example, a lock is frequently needed in PR on one node, but another node needs it temporarily in EX to update it. This may happen, for example, when the rindex is being updated by gfs2_grow. The gfs2_grow needs to have the lock in EX, but the other nodes need to re-read it to retrieve the updates. The glock is already granted in PR on the non-growing nodes, so this prevents them from continually re-granting the lock in PR, and forces the EX from gfs2_grow to go through. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Eldad Zack 提交于
sb info is only checked with quota support. fs/ext4/super.c: In function ‘parse_options’: fs/ext4/super.c:1600:23: warning: unused variable ‘sbi’ [-Wunused-variable] Signed-off-by: NEldad Zack <eldad@fogrefinery.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Shaohua Li 提交于
flush request is issued in transaction commit code path, so looks using GFP_KERNEL to allocate memory for flush request bio falls into the classic deadlock issue. I saw btrfs and dm get it right, but ext4, xfs and md are using GFP. Signed-off-by: NShaohua Li <shli@fusionio.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
-
由 David Teigland 提交于
The QUECVT flag should not prevent conversions from being granted immediately when the convert queue is empty. Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 22 4月, 2012 2 次提交
-
-
由 Trond Myklebust 提交于
To ensure that we don't reuse their identifiers. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
Retest the RB_EMPTY_NODE() condition under the spin lock to ensure that we don't call rb_erase() more than once on the same state owner. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 21 4月, 2012 9 次提交
-
-
由 Al Viro 提交于
it's always current->mm Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... since exit_mmap() is coming and it will munmap() everything anyway. In all other cases aio_free_ring() has ctx->mm == current->mm; moreover, all other callers of vm_munmap() have mm == current->mm, so this will allow us to get rid of mm argument of vm_munmap(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Trond Myklebust 提交于
The NFSv4 spec is ambiguous about whether or not it is permissible to reuse open owner names, so play it safe. This patch adds a timestamp to the state_owner structure, and combines that with the IDA based uniquifier. Fixes a regression whereby the Linux server returns NFS4ERR_BAD_SEQID. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Linus Torvalds 提交于
This continues the theme started with vm_brk() and vm_munmap(): vm_mmap() does the same thing as do_mmap(), but additionally does the required VM locking. This uninlines (and rewrites it to be clearer) do_mmap(), which sadly duplicates it in mm/mmap.c and mm/nommu.c. But that way we don't have to export our internal do_mmap_pgoff() function. Some day we hopefully don't have to export do_mmap() either, if all modular users can become the simpler vm_mmap() instead. We're actually very close to that already, with the notable exception of the (broken) use in i810, and a couple of stragglers in binfmt_elf. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Like the vm_brk() function, this is the same as "do_munmap()", except it does the VM locking for the caller. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
It does the same thing as "do_brk()", except it handles the VM locking too. It turns out that all external callers want that anyway, so we can make do_brk() static to just mm/mmap.c while at it. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Kara 提交于
When hostname contains colon (e.g. when it is an IPv6 address) it needs to be enclosed in brackets to make parsing of NFS device string possible. Fix nfs_do_root_mount() to enclose hostname properly when needed. NFS code actually does not need this as it does not parse the string passed by nfs_do_root_mount() but the device string is exposed to userspace in /proc/mounts. CC: Josh Boyer <jwboyer@redhat.com> CC: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NJan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Cc: <stable@vger.kernel.org> Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Fred Isaman 提交于
Cc: <stable@vger.kernel.org> Signed-off-by: NFred Isaman <iisaman@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 20 4月, 2012 4 次提交
-
-
由 Jeff Layton 提交于
In the recent update of the cifs_iovec_write code to use async writes, the handling of the file position was broken. That patch added a local "offset" variable to handle the offset, and then only updated the original "*poffset" before exiting. Unfortunately, it copied off the original offset from the beginning, instead of doing so after generic_write_checks had been called. Fix this by moving the initialization of "offset" after that in the function. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
由 Trond Myklebust 提交于
If the file wasn't opened for writing, then truncate and ftruncate need to report the appropriate errors. Reported-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
-
由 Trond Myklebust 提交于
Since we may be simulating flock() locks using NFS byte range locks, we can't rely on the VFS having checked the file open mode for us. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
-
由 Trond Myklebust 提交于
All callers of nfs4_handle_exception() that need to handle NFS4ERR_OPENMODE correctly should set exception->inode Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
-
- 19 4月, 2012 6 次提交
-
-
由 Stefan Behrens 提交于
The bitfield member mount_opt was too small by one bit to hold the mount option that enabled to include data extents in the integrity checker. Since the same issue happened when the BTRFS_MOUNT_PANIC_ON_FATAL_ERROR option was added (git rebase silently merges so that the increase of the size of the bitfield member is lost), the bit limit was removed entirely. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
-
由 Stefan Behrens 提交于
Each CRC or header error was counted twice, this is now fixed. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
-
由 Stefan Behrens 提交于
When a filesystem is mounted with the degraded option, it is possible that some of the devices are not there. btrfs_ioctl_dev_info() crashs in this case because the device name is a NULL pointer. This ioctl was only used for scrub. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de>
-
由 Arne Jansen 提交于
It is basically a good thing if we are interruptible when waiting for free space, but the generality in which it is implemented currently leads to system calls being interruptible that are not documented this way. For example git can't handle interrupted unlink(), leading to corrupt repos under space pressure. Instead we raise the bar to only be interruptible by SIGKILL. Thanks to David Sterba for suggesting this. Signed-off-by: NArne Jansen <sensille@gmx.net>
-
由 Dan Carpenter 提交于
The caller expects this function to return with the lock held and releases it immediately on error. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
-
由 Josef Bacik 提交于
A user reported a panic where we were trying to fix a bad mirror but the mirror number we were giving was 0, which is invalid. This is because we don't do the transid verification until after the read, so as far as the read code is concerned the read was a success. So instead store the mirror we read from so that if there is some failure post read we know which mirror to try next and which mirror needs to be fixed if we find a good copy of the block. Thanks, Signed-off-by: NJosef Bacik <josef@redhat.com>
-