1. 09 5月, 2009 17 次提交
  2. 08 5月, 2009 1 次提交
  3. 07 5月, 2009 2 次提交
    • J
      fiemap: fix problem with setting FIEMAP_EXTENT_LAST · df3935ff
      Josef Bacik 提交于
      Fix a problem where the generic block based fiemap stuff would not
      properly set FIEMAP_EXTENT_LAST on the last extent.  I've reworked things
      to keep track if we go past the EOF, and mark the last extent properly.
      The problem was reported by and tested by Eric Sandeen.
      Tested-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@redhat.com>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <xfs-masters@oss.sgi.com>
      Cc: <linux-btrfs@vger.kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <Joel.Becker@oracle.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      df3935ff
    • W
      inotify: use GFP_NOFS in kernel_event() to work around a lockdep false-positive · 381a80e6
      Wu Fengguang 提交于
      There is what we believe to be a false positive reported by lockdep.
      
      inotify_inode_queue_event() => take inotify_mutex => kernel_event() =>
      kmalloc() => SLOB => alloc_pages_node() => page reclaim => slab reclaim =>
      dcache reclaim => inotify_inode_is_dead => take inotify_mutex => deadlock
      
      The plan is to fix this via lockdep annotation, but that is proving to be
      quite involved.
      
      The patch flips the allocation over to GFP_NFS to shut the warning up, for
      the 2.6.30 release.
      
      Hopefully we will fix this for real in 2.6.31.  I'll queue a patch in -mm
      to switch it back to GFP_KERNEL so we don't forget.
      
        =================================
        [ INFO: inconsistent lock state ]
        2.6.30-rc2-next-20090417 #203
        ---------------------------------
        inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
        kswapd0/380 [HC0[0]:SC0[0]:HE1:SE1] takes:
         (&inode->inotify_mutex){+.+.?.}, at: [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0
        {RECLAIM_FS-ON-W} state was registered at:
          [<ffffffff81079188>] mark_held_locks+0x68/0x90
          [<ffffffff810792a5>] lockdep_trace_alloc+0xf5/0x100
          [<ffffffff810f5261>] __kmalloc_node+0x31/0x1e0
          [<ffffffff81130652>] kernel_event+0xe2/0x190
          [<ffffffff81130826>] inotify_dev_queue_event+0x126/0x230
          [<ffffffff8112f096>] inotify_inode_queue_event+0xc6/0x110
          [<ffffffff8110444d>] vfs_create+0xcd/0x140
          [<ffffffff8110825d>] do_filp_open+0x88d/0xa20
          [<ffffffff810f6b68>] do_sys_open+0x98/0x140
          [<ffffffff810f6c50>] sys_open+0x20/0x30
          [<ffffffff8100c272>] system_call_fastpath+0x16/0x1b
          [<ffffffffffffffff>] 0xffffffffffffffff
        irq event stamp: 690455
        hardirqs last  enabled at (690455): [<ffffffff81564fe4>] _spin_unlock_irqrestore+0x44/0x80
        hardirqs last disabled at (690454): [<ffffffff81565372>] _spin_lock_irqsave+0x32/0xa0
        softirqs last  enabled at (690178): [<ffffffff81052282>] __do_softirq+0x202/0x220
        softirqs last disabled at (690157): [<ffffffff8100d50c>] call_softirq+0x1c/0x50
      
        other info that might help us debug this:
        2 locks held by kswapd0/380:
         #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810d0bd7>] shrink_slab+0x37/0x180
         #1:  (&type->s_umount_key#17){++++..}, at: [<ffffffff8110cfbf>] shrink_dcache_memory+0x11f/0x1e0
      
        stack backtrace:
        Pid: 380, comm: kswapd0 Not tainted 2.6.30-rc2-next-20090417 #203
        Call Trace:
         [<ffffffff810789ef>] print_usage_bug+0x19f/0x200
         [<ffffffff81018bff>] ? save_stack_trace+0x2f/0x50
         [<ffffffff81078f0b>] mark_lock+0x4bb/0x6d0
         [<ffffffff810799e0>] ? check_usage_forwards+0x0/0xc0
         [<ffffffff8107b142>] __lock_acquire+0xc62/0x1ae0
         [<ffffffff810f478c>] ? slob_free+0x10c/0x370
         [<ffffffff8107c0a1>] lock_acquire+0xe1/0x120
         [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
         [<ffffffff81562d43>] mutex_lock_nested+0x63/0x420
         [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
         [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
         [<ffffffff81012fe9>] ? sched_clock+0x9/0x10
         [<ffffffff81077165>] ? lock_release_holdtime+0x35/0x1c0
         [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0
         [<ffffffff8110c9dc>] dentry_iput+0xbc/0xe0
         [<ffffffff8110cb23>] d_kill+0x33/0x60
         [<ffffffff8110ce23>] __shrink_dcache_sb+0x2d3/0x350
         [<ffffffff8110cffa>] shrink_dcache_memory+0x15a/0x1e0
         [<ffffffff810d0cc5>] shrink_slab+0x125/0x180
         [<ffffffff810d1540>] kswapd+0x560/0x7a0
         [<ffffffff810ce160>] ? isolate_pages_global+0x0/0x2c0
         [<ffffffff81065a30>] ? autoremove_wake_function+0x0/0x40
         [<ffffffff8107953d>] ? trace_hardirqs_on+0xd/0x10
         [<ffffffff810d0fe0>] ? kswapd+0x0/0x7a0
         [<ffffffff8106555b>] kthread+0x5b/0xa0
         [<ffffffff8100d40a>] child_rip+0xa/0x20
         [<ffffffff8100cdd0>] ? restore_args+0x0/0x30
         [<ffffffff81065500>] ? kthread+0x0/0xa0
         [<ffffffff8100d400>] ? child_rip+0x0/0x20
      
      [eparis@redhat.com: fix audit too]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      381a80e6
  4. 06 5月, 2009 2 次提交
  5. 05 5月, 2009 1 次提交
  6. 04 5月, 2009 1 次提交
    • S
      [CIFS] NTLMSSP reenabled after move from connect.c to sess.c · 0b3cc858
      Steve French 提交于
      The NTLMSSP code was removed from fs/cifs/connect.c and merged
      (75% smaller, cleaner) into fs/cifs/sess.c
      
      As with the old code it requires that cifs be built with
      CONFIG_CIFS_EXPERIMENTAL, the /proc/fs/cifs/Experimental flag
      must be set to 2, and mount must turn on extended security
      (e.g. with sec=krb5).
      
      Although NTLMSSP encapsulated in SPNEGO is not enabled yet,
      "raw" ntlmssp is common and useful in some cases since it
      offers more complete security negotiation, and is the
      default way of negotiating security for many Windows systems.
      SPNEGO encapsulated NTLMSSP will be able to reuse the same
      code.
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      0b3cc858
  7. 03 5月, 2009 7 次提交
    • T
      NFS: Close page_mkwrite() races · 7fdf5230
      Trond Myklebust 提交于
      Follow up to Nick Piggin's patches to ensure that nfs_vm_page_mkwrite
      returns with the page lock held, and sets the VM_FAULT_LOCKED flag.
      
      See http://bugzilla.kernel.org/show_bug.cgi?id=12913Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7fdf5230
    • O
      ptrace: s/parent/real_parent/ in binfmt_elf_fdpic.c · 0ae05fb2
      Oleg Nesterov 提交于
      ->real_parent is the parent. ->parent may be the tracer.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NRoland McGrath <roland@redhat.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ae05fb2
    • K
      mm: fix Committed_AS underflow on large NR_CPUS environment · 00a62ce9
      KOSAKI Motohiro 提交于
      The Committed_AS field can underflow in certain situations:
      
      >         # while true; do cat /proc/meminfo  | grep _AS; sleep 1; done | uniq -c
      >               1 Committed_AS: 18446744073709323392 kB
      >              11 Committed_AS: 18446744073709455488 kB
      >               6 Committed_AS:    35136 kB
      >               5 Committed_AS: 18446744073709454400 kB
      >               7 Committed_AS:    35904 kB
      >               3 Committed_AS: 18446744073709453248 kB
      >               2 Committed_AS:    34752 kB
      >               9 Committed_AS: 18446744073709453248 kB
      >               8 Committed_AS:    34752 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               7 Committed_AS: 18446744073709454080 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               5 Committed_AS: 18446744073709454080 kB
      >               6 Committed_AS: 18446744073709320960 kB
      
      Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does
      not check for underflow.
      
      But NR_CPUS proportional isn't good calculation.  In general,
      possibility of lock contention is proportional to the number of online
      cpus, not theorical maximum cpus (NR_CPUS).
      
      The current kernel has generic percpu-counter stuff.  using it is right
      way.  it makes code simplify and percpu_counter_read_positive() don't
      make underflow issue.
      Reported-by: NDave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: <stable@kernel.org>		[All kernel versions]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00a62ce9
    • I
      alpha: binfmt_aout fix · 74641f58
      Ivan Kokshaysky 提交于
      This fixes the problem introduced by commit 3bfacef4 (get rid of
      special-casing the /sbin/loader on alpha): osf/1 ecoff binary segfaults
      when binfmt_aout built as module.  That happens because aout binary
      handler gets on the top of the binfmt list due to late registration, and
      kernel attempts to execute the binary without preparatory work that must
      be done by binfmt_loader.
      
      Fixed by changing the registration order of the default binfmt handlers
      using list_add_tail() and introducing insert_binfmt() function which
      places new handler on the top of the binfmt list.  This might be generally
      useful for installing arch-specific frontends for default handlers or just
      for overriding them.
      Signed-off-by: NIvan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Richard Henderson <rth@twiddle.net
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      74641f58
    • V
      pagemap: require aligned-length, non-null reads of /proc/pid/pagemap · 08161786
      Vitaly Mayatskikh 提交于
      The intention of commit aae8679b
      ("pagemap: fix bug in add_to_pagemap, require aligned-length reads of
      /proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a
      multiple of 8 bytes, but now it allows to read 0 bytes, which actually
      puts some data to user's buffer.  According to POSIX, if count is zero,
      read() should return zero and has no other results.
      Signed-off-by: NVitaly Mayatskikh <v.mayatskih@gmail.com>
      Cc: Thomas Tuttle <ttuttle@google.com>
      Acked-by: NMatt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      08161786
    • N
      mm: close page_mkwrite races · b827e496
      Nick Piggin 提交于
      Change page_mkwrite to allow implementations to return with the page
      locked, and also change it's callers (in page fault paths) to hold the
      lock until the page is marked dirty.  This allows the filesystem to have
      full control of page dirtying events coming from the VM.
      
      Rather than simply hold the page locked over the page_mkwrite call, we
      call page_mkwrite with the page unlocked and allow callers to return with
      it locked, so filesystems can avoid LOR conditions with page lock.
      
      The problem with the current scheme is this: a filesystem that wants to
      associate some metadata with a page as long as the page is dirty, will
      perform this manipulation in its ->page_mkwrite.  It currently then must
      return with the page unlocked and may not hold any other locks (according
      to existing page_mkwrite convention).
      
      In this window, the VM could write out the page, clearing page-dirty.  The
      filesystem has no good way to detect that a dirty pte is about to be
      attached, so it will happily write out the page, at which point, the
      filesystem may manipulate the metadata to reflect that the page is no
      longer dirty.
      
      It is not always possible to perform the required metadata manipulation in
      ->set_page_dirty, because that function cannot block or fail.  The
      filesystem may need to allocate some data structure, for example.
      
      And the VM cannot mark the pte dirty before page_mkwrite, because
      page_mkwrite is allowed to fail, so we must not allow any window where the
      page could be written to if page_mkwrite does fail.
      
      This solution of holding the page locked over the 3 critical operations
      (page_mkwrite, setting the pte dirty, and finally setting the page dirty)
      closes out races nicely, preventing page cleaning for writeout being
      initiated in that window.  This provides the filesystem with a strong
      synchronisation against the VM here.
      
      - Sage needs this race closed for ceph filesystem.
      - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913).
      - I need it for fsblock.
      - I suspect other filesystems may need it too (eg. btrfs).
      - I have converted buffer.c to the new locking. Even simple block allocation
        under dirty pages might be susceptible to i_size changing under partial page
        at the end of file (we also have a buffer.c-side problem here, but it cannot
        be fixed properly without this patch).
      - Other filesystems (eg. NFS, maybe btrfs) will need to change their
        page_mkwrite functions themselves.
      
      [ This also moves page_mkwrite another step closer to fault, which should
        eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a
        filesystem calldown and page lock/unlock cycle in __do_fault. ]
      
      [akpm@linux-foundation.org: fix derefs of NULL ->mapping]
      Cc: Sage Weil <sage@newdream.net>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b827e496
    • I
      autofs4: fix incorrect return in autofs4_mount_busy() · a8985f3a
      Ian Kent 提交于
      Fix an obvious incorrect return status in autofs4_mount_busy().
      Signed-off-by: NIan Kent <raven@themaw.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8985f3a
  8. 02 5月, 2009 7 次提交
  9. 01 5月, 2009 2 次提交