1. 29 Apr, 2012 1 commit
    • VFS: clean up and simplify getname_flags() · 3f9f0aa6
      Linus Torvalds committed
      This removes a number of silly games around strncpy_from_user() in
      do_getname(), and removes that helper function entirely.  We instead
      make getname_flags() just use strncpy_from_user() properly directly.
      
      Removing the wrapper function simplifies things noticeably, mostly
      because we no longer play the unnecessary games with segments (x86
      strncpy_from_user() no longer needs the hack), but also because the
      empty path handling is just much more obvious.  The return value of
      "strncpy_from_user()" is much more obvious than checking an odd error
      return case from do_getname().
      
      [ non-x86 architectures were notified of this change several weeks ago,
        since it is possible that they have copied the old broken x86
        strncpy_from_user. But nobody reacted, so .. See
      
          http://www.spinics.net/lists/linux-arch/msg17313.html
      
        for details ]
      
      Cc: linux-arch@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
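The return-value handling described in this commit can be modelled in userspace. The sketch below is not the kernel code: `model_strncpy_from_user()`, `model_getname_flags()`, `MODEL_PATH_MAX` and `MODEL_LOOKUP_EMPTY` are hypothetical stand-ins, and a NULL source pointer is used to imitate a faulting user address. It assumes the documented convention that strncpy_from_user() returns the string length without the NUL, the byte limit if no NUL was found, or -EFAULT.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PATH_MAX     16   /* tiny stand-in for PATH_MAX (assumption) */
#define MODEL_LOOKUP_EMPTY 0x1  /* hypothetical stand-in for LOOKUP_EMPTY */

/*
 * Userspace model of the strncpy_from_user() return convention:
 * length of the string (excluding the NUL) on success, `count` if no
 * NUL was seen within `count` bytes, -EFAULT on a bad source (modelled
 * here as a NULL pointer).
 */
static long model_strncpy_from_user(char *dst, const char *src, long count)
{
	long i;

	if (src == NULL)
		return -EFAULT;
	for (i = 0; i < count; i++) {
		dst[i] = src[i];
		if (src[i] == '\0')
			return i;
	}
	return count;
}

/* Returns 0 on success or a negative errno for each of the three cases. */
static int model_getname_flags(char *buf, const char *user, int flags)
{
	long len = model_strncpy_from_user(buf, user, MODEL_PATH_MAX);

	if (len < 0)
		return (int)len;        /* fault while copying */
	if (len == 0 && !(flags & MODEL_LOOKUP_EMPTY))
		return -ENOENT;         /* empty path, and caller did not allow it */
	if (len >= MODEL_PATH_MAX)
		return -ENAMETOOLONG;   /* no NUL within the limit */
	return 0;
}
```

With the single return value, the empty-path case is just `len == 0`, which is the simplification the commit message points at.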
  2. 28 Apr, 2012 8 commits
    • Revert "autofs: work around unhappy compat problem on x86-64" · fcbf94b9
      Linus Torvalds committed
      This reverts commit a32744d4.
      
      While that commit was technically the right thing to do, and made the
      x86-64 compat mode work identically to native 32-bit mode (and thus
      fixing the problem with a 32-bit systemd install on a 64-bit kernel), it
      turns out that the automount binaries had workarounds for this compat
      problem.
      
      Now, the workarounds are disgusting: doing an "uname()" to find out the
      architecture of the kernel, and then comparing it for the 64-bit cases
      and fixing up the size of the read() in automount for those.  And they
      were confused: it's not actually a generic 64-bit issue at all, it's
      very much tied to just x86-64, which has different alignment for a
      'u64' in 64-bit mode than in 32-bit mode.
      
      But the end result is that fixing the compat layer actually breaks the
      case of a 32-bit automount on an x86-64 kernel.
      
      There are various approaches to fix this (including just doing a
      "strcmp()" on current->comm and comparing it to "automount"), but I
      think that I will do the one that teaches pipes about a special "packet
      mode", which will allow user space to not have to care too deeply about
      the padding at the end of the autofs packet.
      
      That change will make the compat workaround unnecessary, so let's revert
      it first, and get automount working again in compat mode.  The
      packetized pipes will then fix autofs for systemd.
      Reported-and-requested-by: Michael Tokarev <mjt@tls.msk.ru>
      Cc: Ian Kent <raven@themaw.net>
      Cc: stable@kernel.org # for 3.3
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
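The alignment difference behind this compat problem can be reproduced in a small sketch. It assumes a 64-bit build with GCC or Clang where `uint64_t` is naturally 8-byte aligned; the `aligned(4)` typedef imitates the i386 layout, and the struct and function names are hypothetical, not the real autofs packet definition.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * 64-bit integer with only 4-byte alignment, imitating how the i386 ABI
 * places a u64 inside a struct. GCC/Clang extension; hypothetical name.
 */
typedef uint64_t u64_align4 __attribute__((aligned(4)));

/* Layout as a native 64-bit build sees it: padded to a multiple of 8. */
struct pkt_native {
	uint64_t ino;
	uint32_t len;
};

/* Same fields as a 32-bit x86 build would lay them out: no tail padding. */
struct pkt_compat {
	u64_align4 ino;
	uint32_t len;
};

static size_t native_size(void) { return sizeof(struct pkt_native); }
static size_t compat_size(void) { return sizeof(struct pkt_compat); }
```

Under those assumptions the two sizes differ by four bytes of tail padding, which is the kind of mismatch a fixed-size read() of a packet runs into.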
    • Btrfs: reduce lock contention during extent insertion · dc7fdde3
      Chris Mason committed
      We're spending huge amounts of time on lock contention during
      end_io processing because we unconditionally assume we are overwriting
      an existing extent in the file for each IO.
      
      This checks to see if we are outside i_size, and if so, it uses a
      less expensive readonly search of the btree to look for existing
      extents.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: avoid deadlocks from GFP_KERNEL allocations during btrfs_real_readdir · fede766f
      Chris Mason committed
      Btrfs has an optimization where it will preallocate dentries during
      readdir to fill in enough information to open the inode without an extra
      lookup.
      
      But, we're calling d_alloc, which is doing GFP_KERNEL allocations, and
      that leads to deadlocks because our readdir code has tree locks held.
      
      For now, disable this optimization.  We'll fix the gfp mask in the next
      merge window.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: Fix space checking during fs resize · 7654b724
      Daniel J Blueman committed
      Fix out-of-space checking, addressing a warning and potential resource
      leak when resizing the filesystem down while allocating blocks.
      Signed-off-by: Daniel J Blueman <daniel@quora.org>
      Reviewed-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: fix block_rsv and space_info lock ordering · 1f699d38
      Stefan Behrens committed
      may_commit_transaction() calls
              spin_lock(&space_info->lock);
              spin_lock(&delayed_rsv->lock);
      and update_global_block_rsv() calls
              spin_lock(&block_rsv->lock);
              spin_lock(&sinfo->lock);
      
      Lockdep complains about this at run time.
      Everywhere except in update_global_block_rsv(), the space_info lock is
      the outer lock, therefore the locking order in update_global_block_rsv()
      is changed.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
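The ordering problem described in this commit can be illustrated with a toy order checker. This is a rough sketch of the idea behind such a check, not lockdep itself: it records which lock was held while another was taken and flags the reverse order. All names (`order_checker`, `oc_lock`, `has_inversion`) are hypothetical.

```c
#include <stdbool.h>

enum { SPACE_INFO, BLOCK_RSV, NR_LOCKS };

struct order_checker {
	bool before[NR_LOCKS][NR_LOCKS]; /* before[a][b]: a was held while taking b */
	bool held[NR_LOCKS];
	bool inversion;                  /* set once both a->b and b->a were seen */
};

static void oc_lock(struct order_checker *oc, int id)
{
	for (int h = 0; h < NR_LOCKS; h++) {
		if (!oc->held[h])
			continue;
		if (oc->before[id][h])
			oc->inversion = true; /* id->h was recorded, now h->id */
		oc->before[h][id] = true;
	}
	oc->held[id] = true;
}

static void oc_unlock(struct order_checker *oc, int id)
{
	oc->held[id] = false;
}

/* Take two locks nested, then release them in reverse order. */
static void take_pair(struct order_checker *oc, int outer, int inner)
{
	oc_lock(oc, outer);
	oc_lock(oc, inner);
	oc_unlock(oc, inner);
	oc_unlock(oc, outer);
}

/* Returns true if the two call paths use inconsistent lock ordering. */
static bool has_inversion(bool fixed)
{
	struct order_checker oc = {0};

	/* Order used by may_commit_transaction(): space_info outside. */
	take_pair(&oc, SPACE_INFO, BLOCK_RSV);

	/* Order used by update_global_block_rsv(), before and after the fix. */
	if (fixed)
		take_pair(&oc, SPACE_INFO, BLOCK_RSV);
	else
		take_pair(&oc, BLOCK_RSV, SPACE_INFO);

	return oc.inversion;
}
```

The checker needs no threads: seeing both orders on any paths is enough to report a potential deadlock, which is why the warning appears at run time even when nothing has actually hung.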
    • Btrfs: Prevent root_list corruption · 1daf3540
      Daniel J Blueman committed
      I was seeing root_list corruption on unmount during fs resize in 3.4-rc4; add
      correct locking to address this.
      Signed-off-by: Daniel J Blueman <daniel@quora.org>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: fix repair code for RAID10 · 3e74317a
      Jan Schmidt committed
      btrfs_map_block sets mirror_num, so that the repair code knows eventually
      which device gave us the read error. For RAID10, mirror_num must be 1 or 2.
      Before this fix mirror_num was incorrectly related to our stripe index.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
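The mirror_num arithmetic can be sketched under a simplified RAID10 layout. The layout is an assumption for illustration: stripes are grouped into sets of `sub_stripes` mirrors, and a read picks one stripe inside the group. The helper names are hypothetical, and the "buggy" variant is only a guess at the shape of the old behaviour the message describes, not a copy of the btrfs code.

```c
/*
 * Index of the first stripe in the mirror group that holds stripe_nr,
 * assuming num_stripes devices grouped into sets of sub_stripes mirrors.
 */
static int raid10_group_start(int stripe_nr, int num_stripes, int sub_stripes)
{
	int groups = num_stripes / sub_stripes;

	return (stripe_nr % groups) * sub_stripes;
}

/* Mirror number relative to the group start: always 1..sub_stripes. */
static int raid10_mirror_num(int stripe_index, int group_start)
{
	return stripe_index - group_start + 1;
}

/* Assumed pre-fix shape: tied to the absolute stripe index. */
static int raid10_mirror_num_buggy(int stripe_index)
{
	return stripe_index + 1;
}
```

With four devices and two sub-stripes, a read served from stripe index 3 in the second group gives mirror number 2 with the relative formula, while the index-based variant yields 4, which is outside the valid range of 1 or 2.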
    • Btrfs: do not start delalloc inodes during sync · 996d282c
      Josef Bacik committed
      btrfs_start_delalloc_inodes will just walk the list of delalloc inodes and
      start writing them out, but it doesn't splice the list or anything, so as
      long as somebody is doing work on the box you could end up in this section
      _forever_.  So just remove it; it's not needed, since sync will start
      writeback on all inodes anyway.  All we need to do is wait for ordered
      extents and then we can commit the transaction.  In my horrible torture test
      sync goes from taking 4 minutes to about 1.5 minutes.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  3. 26 Apr, 2012 4 commits
    • revert "proc: clear_refs: do not clear reserved pages" · 63f61a6f
      Will Deacon committed
      Revert commit 85e72aa5 ("proc: clear_refs: do not clear reserved
      pages"), which was a quick fix suitable for -stable until ARM had been
      moved over to the gate_vma mechanism:
      
      https://lkml.org/lkml/2012/1/14/55
      
      With commit f9d4861f ("ARM: 7294/1: vectors: use gate_vma for vectors user
      mapping"), ARM does now use the gate_vma, so the PageReserved check can be
      removed from the proc code.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: Nicolas Pitre <nico@linaro.org>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlbfs: lockdep annotate root inode properly · 65ed7601
      Aneesh Kumar K.V committed
      This fixes the below reported false lockdep warning.  e096d0c7
      ("lockdep: Add helper function for dir vs file i_mutex annotation") added
      a similar annotation for every other inode in hugetlbfs but missed the
      root inode because it was allocated by a separate function.
      
      For HugeTLB fs we allow taking i_mutex in mmap.  HugeTLB fs doesn't
      support file write and its file read callback is modified in a05b0855
      ("hugetlbfs: avoid taking i_mutex from hugetlbfs_read()") to not take
      i_mutex.  Hence for HugeTLB fs with regular files we really don't take
      i_mutex with mmap_sem held.
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.4.0-rc1+ #322 Not tainted
       -------------------------------------------------------
       bash/1572 is trying to acquire lock:
        (&mm->mmap_sem){++++++}, at: [<ffffffff810f1618>] might_fault+0x40/0x90
      
       but task is already holding lock:
        (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff81125f88>] vfs_readdir+0x56/0xa8
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&sb->s_type->i_mutex_key#12){+.+.+.}:
              [<ffffffff810a09e5>] lock_acquire+0xd5/0xfa
              [<ffffffff816a2f5e>] __mutex_lock_common+0x48/0x350
              [<ffffffff816a3325>] mutex_lock_nested+0x2a/0x31
              [<ffffffff811fb8e1>] hugetlbfs_file_mmap+0x7d/0x104
              [<ffffffff810f859a>] mmap_region+0x272/0x47d
              [<ffffffff810f8a39>] do_mmap_pgoff+0x294/0x2ee
              [<ffffffff810f8b65>] sys_mmap_pgoff+0xd2/0x10e
              [<ffffffff8103d19e>] sys_mmap+0x1d/0x1f
              [<ffffffff816a5922>] system_call_fastpath+0x16/0x1b
      
       -> #0 (&mm->mmap_sem){++++++}:
              [<ffffffff810a0256>] __lock_acquire+0xa81/0xd75
              [<ffffffff810a09e5>] lock_acquire+0xd5/0xfa
              [<ffffffff810f1645>] might_fault+0x6d/0x90
              [<ffffffff81125d62>] filldir+0x6a/0xc2
              [<ffffffff81133a83>] dcache_readdir+0x5c/0x222
              [<ffffffff81125fa8>] vfs_readdir+0x76/0xa8
              [<ffffffff811260b6>] sys_getdents+0x79/0xc9
              [<ffffffff816a5922>] system_call_fastpath+0x16/0x1b
      
       other info that might help us debug this:
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&sb->s_type->i_mutex_key#12);
                                      lock(&mm->mmap_sem);
                                      lock(&sb->s_type->i_mutex_key#12);
         lock(&mm->mmap_sem);
      
        *** DEADLOCK ***
      
       1 lock held by bash/1572:
        #0:  (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff81125f88>] vfs_readdir+0x56/0xa8
      
       stack backtrace:
       Pid: 1572, comm: bash Not tainted 3.4.0-rc1+ #322
       Call Trace:
        [<ffffffff81699a3c>] print_circular_bug+0x1f8/0x209
        [<ffffffff810a0256>] __lock_acquire+0xa81/0xd75
        [<ffffffff810f38aa>] ? handle_pte_fault+0x5ff/0x614
        [<ffffffff8109e622>] ? mark_lock+0x2d/0x258
        [<ffffffff810f1618>] ? might_fault+0x40/0x90
        [<ffffffff810a09e5>] lock_acquire+0xd5/0xfa
        [<ffffffff810f1618>] ? might_fault+0x40/0x90
        [<ffffffff816a3249>] ? __mutex_lock_common+0x333/0x350
        [<ffffffff810f1645>] might_fault+0x6d/0x90
        [<ffffffff810f1618>] ? might_fault+0x40/0x90
        [<ffffffff81125d62>] filldir+0x6a/0xc2
        [<ffffffff81133a83>] dcache_readdir+0x5c/0x222
        [<ffffffff81125cf8>] ? sys_ioctl+0x74/0x74
        [<ffffffff81125cf8>] ? sys_ioctl+0x74/0x74
        [<ffffffff81125cf8>] ? sys_ioctl+0x74/0x74
        [<ffffffff81125fa8>] vfs_readdir+0x76/0xa8
        [<ffffffff811260b6>] sys_getdents+0x79/0xc9
        [<ffffffff816a5922>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/buffer.c: remove BUG() in possible but rare condition · 61065a30
      Glauber Costa committed
      While stressing the kernel with failing allocations today, I hit the
      following chain of events:
      
      alloc_page_buffers():
      
      	bh = alloc_buffer_head(GFP_NOFS);
      	if (!bh)
      		goto no_grow; <= path taken
      
      grow_dev_page():
              bh = alloc_page_buffers(page, size, 0);
              if (!bh)
                      goto failed;  <= taken, consequence of the above
      
      and then the failed path BUG()s the kernel.
      
      The failure is inserted a little bit artificially, but even then, I see no
      reason why it should be deemed impossible in a real box.
      
      Even though this is not a condition that we expect to see around every
      time, failed allocations are expected to be handled, and BUG() sounds just
      too much.  As a matter of fact, grow_dev_page() can return NULL just fine
      in other circumstances, so I propose we just remove it, then.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
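The change in failure handling can be sketched in userspace: an allocation failure is passed up as NULL instead of stopping the program. The names below (`model_alloc_page_buffers`, `model_grow_dev_page`, `inject_failure`, `struct buf_head`) are hypothetical stand-ins, and `malloc()` replaces the kernel allocator.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical fault-injection switch, standing in for a failing allocator. */
static bool inject_failure;

struct buf_head {
	int size;
};

/* Model of the inner allocation: may fail and return NULL. */
static struct buf_head *model_alloc_page_buffers(int size)
{
	struct buf_head *bh;

	if (inject_failure)
		return NULL;
	bh = malloc(sizeof(*bh));
	if (bh)
		bh->size = size;
	return bh;
}

/*
 * Model of the caller after the change: a failed allocation is reported
 * to the caller as NULL rather than treated as an impossible condition.
 */
static struct buf_head *model_grow_dev_page(int size)
{
	struct buf_head *bh = model_alloc_page_buffers(size);

	if (!bh)
		return NULL; /* the old failed path would have BUG()ed here */
	return bh;
}
```

Callers then check the result the same way they already do for the other NULL returns mentioned in the message.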
    • epoll: clear the tfile_check_list on -ELOOP · 13d51807
      Jason Baron committed
      An epoll_ctl(,EPOLL_CTL_ADD,,) operation can return '-ELOOP' to prevent
      circular epoll dependencies from being created.  However, in that case we
      do not properly clear the 'tfile_check_list'.  Thus, add a call to
      clear_tfile_check_list() for the -ELOOP case.
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
      Cc: Nelson Elhage <nelhage@nelhage.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Tested-by: Alexandra N. Kossovsky <Alexandra.Kossovsky@oktetlabs.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
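The -ELOOP case can be triggered from user space with two epoll instances that try to watch each other. The sketch below assumes a Linux system; the helper name `epoll_cycle_errno()` is hypothetical. It shows only the user-visible error, not the kernel-side tfile_check_list cleanup that this commit adds.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns the errno seen when trying to close an epoll cycle, 0 if the
 * second add unexpectedly succeeded, or -1 if the setup itself failed.
 */
static int epoll_cycle_errno(void)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int a = epoll_create1(0);
	int b = epoll_create1(0);
	int result = -1;

	if (a < 0 || b < 0)
		goto out;

	ev.data.fd = b;
	if (epoll_ctl(a, EPOLL_CTL_ADD, b, &ev) < 0) /* a watches b */
		goto out;

	ev.data.fd = a;
	if (epoll_ctl(b, EPOLL_CTL_ADD, a, &ev) < 0) /* b watching a closes the loop */
		result = errno;
	else
		result = 0;
out:
	if (a >= 0)
		close(a);
	if (b >= 0)
		close(b);
	return result;
}
```

On a kernel that performs the loop check, the second `epoll_ctl()` call is expected to fail with ELOOP.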
  4. 25 Apr, 2012 2 commits
  5. 24 Apr, 2012 4 commits
  6. 22 Apr, 2012 2 commits
  7. 21 Apr, 2012 9 commits
  8. 20 Apr, 2012 4 commits
  9. 19 Apr, 2012 6 commits