1. 23 Aug, 2012 1 commit
    • block: replace __getblk_slow misfix by grow_dev_page fix · 676ce6d5
      Committed by Hugh Dickins
      Commit 91f68c89 ("block: fix infinite loop in __getblk_slow")
      is not good: a successful call to grow_buffers() cannot guarantee
      that the page won't be reclaimed before the immediate next call to
      __find_get_block(), which is why there was always a loop there.
      
      Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
      inode #19278: block 664: comm cc1: unable to read itable block" on console,
      which pointed to this commit.
      
      I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
      sometimes fails from a missing header file, under memory pressure on
      ppc G5.  I've never seen this on x86, and I've never seen it on 3.5-rc7
      itself, despite that commit being in there: bisection pointed to an
      irrelevant pinctrl merge, but hard to tell when failure takes between
      18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).
      
      (I've since found such __ext4_get_inode_loc errors in /var/log/messages
      from previous weeks: why the message never appeared on console until
      yesterday morning is a mystery for another day.)
      
      Revert 91f68c89, restoring __getblk_slow() to how it was (plus
      a checkpatch nitfix).  Simplify the interface between grow_buffers()
      and grow_dev_page(), and avoid the infinite loop beyond end of device
      by instead checking init_page_buffers()'s end_block there (I presume
      that's more efficient than a repeated call to blkdev_max_block()),
      returning -ENXIO to __getblk_slow() in that case.
      
      And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
      comment, but that is worrying: are all users of __getblk() really
      now prepared for a NULL bh beyond end of device, or will some oops??
      
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
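The retry behaviour described in this commit message can be modelled in userspace. The following is a minimal Python sketch, not the kernel code: the `BlockDevice` class, the `reclaims_pending` counter and the buffer strings are hypothetical stand-ins, and only the shape of the loop (look up, grow, retry, give up on a negative return such as -ENXIO) follows the description above.

```python
import errno


class BlockDevice:
    """Toy stand-in for a block device and its buffer cache (hypothetical)."""

    def __init__(self, nr_blocks, reclaims_pending=0):
        self.nr_blocks = nr_blocks                # device size in blocks
        self.cache = {}                           # block number -> buffer
        self.reclaims_pending = reclaims_pending  # grown pages reclaimed at once


def find_get_block(dev, block):
    """Model of the cache lookup: return the buffer, or None on a miss."""
    return dev.cache.get(block)


def grow_buffers(dev, block):
    """Model of growing the cache: negative errno past end of device, else 1."""
    if block >= dev.nr_blocks:
        return -errno.ENXIO
    if dev.reclaims_pending > 0:
        # Simulate memory pressure: the page is created, then reclaimed
        # before the caller gets a chance to look it up.
        dev.reclaims_pending -= 1
        return 1
    dev.cache[block] = f"buffer-{block}"
    return 1


def getblk_slow(dev, block):
    """Retry until the lookup succeeds; give up only on a hard error."""
    while True:
        bh = find_get_block(dev, block)
        if bh is not None:
            return bh
        if grow_buffers(dev, block) < 0:
            return None
```

In this model a successful `grow_buffers()` does not guarantee that the next lookup hits, which is why the loop is needed, and a block beyond the end of the device yields `None`, which is the case the message worries callers may not handle.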
  2. 22 Aug, 2012 6 commits
  3. 21 Aug, 2012 7 commits
  4. 17 Aug, 2012 13 commits
    • autofs4 - fix expire check · d807ff83
      Committed by Ian Kent
      In some cases when an autofs indirect mount is contained in a file
      system that is marked as shared (such as when systemd does the
      equivalent of "mount --make-rshared /" early in the boot), mounts
      stop expiring.
      
      When this happens the first expiry check on a mountpoint dentry in
      autofs_expire_indirect() sees a mountpoint dentry with a higher
      than minimal reference count. Consequently the dentry is considered
      busy and the actual expiry check is never done.
      
      This particular check was originally meant as an optimisation to
      detect a path walk in progress but with the addition of rcu-walk
      it can be ineffective anyway.
      
      Removing the test allows automounts to expire again since the
      actual expire check doesn't rely on the dentry reference count.
      
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ext4: fix kernel BUG on large-scale rm -rf commands · 89a4e48f
      Committed by Theodore Ts'o
      Commit 968dee77: "ext4: fix hole punch failure when depth is greater
      than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel
      crashes when users ran "rm -rf" on a large directory hierarchy on
      ext4 filesystems on RAID devices:
      
          BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
      
          Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710)
          Call Trace:
           [<ffffffff81236483>] ? __ext4_handle_dirty_metadata+0x83/0x110
           [<ffffffff812353d3>] ext4_ext_truncate+0x193/0x1d0
           [<ffffffff8120a8cf>] ? ext4_mark_inode_dirty+0x7f/0x1f0
           [<ffffffff81207e05>] ext4_truncate+0xf5/0x100
           [<ffffffff8120cd51>] ext4_evict_inode+0x461/0x490
           [<ffffffff811a1312>] evict+0xa2/0x1a0
           [<ffffffff811a1513>] iput+0x103/0x1f0
           [<ffffffff81196d84>] do_unlinkat+0x154/0x1c0
           [<ffffffff8118cc3a>] ? sys_newfstatat+0x2a/0x40
           [<ffffffff81197b0b>] sys_unlinkat+0x1b/0x50
           [<ffffffff816135e9>] system_call_fastpath+0x16/0x1b
          Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 <48> 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00
      
          RIP  [<ffffffff81233164>] ext4_ext_remove_space+0xa34/0xdf0
      
      This could be reproduced as follows:
      
      The problem in commit 968dee77 was that it caused the variable 'i' to
      be left uninitialized if the truncate required more space than was
      available in the journal.  This resulted in the function
      ext4_ext_truncate_extend_restart() returning -EAGAIN, which caused
      ext4_ext_remove_space() to restart the truncate operation after
      starting a new jbd2 handle.
      
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Reported-by: Marti Raudsepp <marti@juffo.org>
      Tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: fix long mount times on very big file systems · 0548bbb8
      Committed by Theodore Ts'o
      Commit 8aeb00ff85a: "ext4: fix overhead calculation used by
      ext4_statfs()" introduced an O(n**2) calculation which makes very large
      file systems take forever to mount.  Fix this with an optimization for
      non-bigalloc file systems.  (For bigalloc file systems the overhead
      needs to be set in the superblock.)
      
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: don't call ext4_error while block group is locked · 7a4c5de2
      Committed by Theodore Ts'o
      While in ext4_validate_block_bitmap(), if a block allocation bitmap
      is found to be invalid, we call ext4_error() while the block group is
      still locked.  This causes ext4_commit_super() to call a function
      which might sleep while in an atomic context.
      
      There's no need to keep the block group locked at this point, so hoist
      the ext4_error() call up to ext4_validate_block_bitmap() and release
      the block group spinlock before calling ext4_error().
      
      The reported stack trace can be found at:
      
      	http://article.gmane.org/gmane.comp.file-systems.ext4/33731
      
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
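The pattern in this fix (decide under the lock, report only after releasing it) can be illustrated in userspace. This is a rough Python sketch under stated assumptions: `group_lock` stands in for the block group spinlock, `report_error()` for an error path that might sleep, and `bitmap_is_valid()` is a hypothetical placeholder check. None of these names come from the kernel.

```python
import threading

group_lock = threading.Lock()   # stand-in for the block group spinlock
reported = []                   # (message, lock_was_held) pairs


def report_error(message):
    """Stand-in for an error path that may sleep; records whether the lock was held."""
    reported.append((message, group_lock.locked()))


def bitmap_is_valid(bitmap):
    """Hypothetical check: treat a non-empty bitmap as valid."""
    return len(bitmap) > 0


def validate_block_bitmap(bitmap):
    """Check the bitmap under the lock, but report a failure only after unlocking."""
    with group_lock:
        ok = bitmap_is_valid(bitmap)
    if not ok:
        # The lock has been released here, so a sleeping call is safe in this model.
        report_error("invalid block bitmap")
    return ok
```

Each entry in `reported` records `False` for the lock state, showing that the reporting call never runs while the lock is held.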
    • NFS: return -ENOKEY when the upcall fails to map the name · 12dfd080
      Committed by Bryan Schumaker
      This allows the normal error-paths to handle the error, rather than
      making a special call to complete_request_key() just for this instance.
      
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Tested-by: William Dauchy <wdauchy@gmail.com>
      Cc: stable@vger.kernel.org [>= 3.4]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: Clear key construction data if the idmap upcall fails · c5066945
      Committed by Bryan Schumaker
      idmap_pipe_downcall already clears this field if the upcall succeeds,
      but if it fails (rpc.idmapd isn't running) the field will still be set
      on the next call triggering a BUG_ON().  This patch tries to handle all
      possible ways that the upcall could fail and clear the idmap key data
      for each one.
      
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Tested-by: William Dauchy <wdauchy@gmail.com>
      Cc: stable@vger.kernel.org [>= 3.4]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Don't use private xdr_stream fields in decode_getacl · cff298c7
      Committed by Trond Myklebust
      Instead of using the private field xdr->p from struct xdr_stream,
      use the public xdr_stream_pos().
      
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: Fix the acl cache size calculation · b291f1b1
      Committed by Trond Myklebust
      Currently, we do not take into account the size of the 16 byte
      struct nfs4_cached_acl header, when deciding whether or not we should
      cache the acl data.  Consequently, we will end up allocating an
      8k buffer in order to fit a maximum size 4k acl.
      
      This patch adjusts the calculation so that we limit the cache size
      to 4k for the acl header+data.
      
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
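The size arithmetic in this commit message can be worked through with a short Python sketch. The 16-byte header and the 4k limit come from the message. The power-of-two rounding in `next_power_of_two()` is an assumption about the allocator, made only to reproduce the "8k buffer for a 4k acl" effect, and all function names are hypothetical.

```python
ACL_HEADER_SIZE = 16   # size of the struct nfs4_cached_acl header, per the commit message
CACHE_LIMIT = 4096     # the 4k limit named in the commit message


def next_power_of_two(n):
    """Round n up to a power of two (assumed allocator behaviour for this sketch)."""
    size = 1
    while size < n:
        size *= 2
    return size


def old_should_cache(acl_len):
    """Before the fix: only the ACL data length is compared with the limit."""
    return acl_len <= CACHE_LIMIT


def new_should_cache(acl_len):
    """After the fix: header plus data must fit within the limit."""
    return ACL_HEADER_SIZE + acl_len <= CACHE_LIMIT


def allocation_size(acl_len):
    """Buffer size needed for header plus data, rounded up as assumed above."""
    return next_power_of_two(ACL_HEADER_SIZE + acl_len)
```

Under these assumptions a 4096-byte ACL passes the old check but needs 4112 bytes, which rounds up to 8192. With the adjusted check the largest cached ACL is 4080 bytes, so header plus data fit in 4096.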
    • NFSv4: Fix pointer arithmetic in decode_getacl · 519d3959
      Committed by Trond Myklebust
      Resetting the cursor xdr->p to a previous value is not a safe
      practice: if the xdr_stream has crossed out of the initial iovec,
      then a bunch of other fields would need to be reset too.
      
      Fix this issue by using xdr_enter_page() so that the buffer gets
      page aligned at the bitmap _before_ we decode it.
      
      Also fix the confusion of the ACL length with the page buffer length
      by not adding the base offset to the ACL length...
      
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
    • NFS: Alias the nfs module to nfs4 · 425e776d
      Committed by bjschuma@gmail.com
      This allows distros to remove the line from their modprobe
      configuration.
      
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: Fix a regression when loading the NFS v4 module · 1ae811ee
      Committed by bjschuma@gmail.com
      Some systems have a modprobe.d/nfs.conf file that sets an nfs4 alias
      pointing to nfs.ko, rather than nfs4.ko.  This can prevent the v4 module
      from loading on mount, since the kernel sees that something named "nfs4"
      has already been loaded.  To work around this, I've renamed the modules
      to "nfsv2.ko", "nfsv3.ko" and "nfsv4.ko".
      
      I also had to move the nfs4_fs_type back to nfs.ko to ensure that `mount
      -t nfs4` still works.
      
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • autofs4 - fix get_next_positive_subdir() · a45440f0
      Committed by Ian Kent
      Following a report of a crash during an automount expire I found that
      the locking in fs/autofs4/expire.c:get_next_positive_subdir() was wrong.
      Not only is the locking wrong but the function is more complex than it
      needs to be.
      
      The function is meant to calculate (and dget) the next entry in the list
      of directories contained in the root of an autofs mount point (an autofs
      indirect mount to be precise). The main problem was that the d_lock of
      the owner of the list was not being taken when walking the list, which
      lead to list corruption under load. The only other lock that needs to
      be taken is against the next dentry candidate so it can be checked for
      usability.
      
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfs: fix propagation of atomic_open create error on negative dentry · 62b2ce96
      Committed by Sage Weil
      If ->atomic_open() returns -ENOENT, we take care to return the create
      error (e.g., EACCES), if any.  Do the same when ->atomic_open() returns 1
      and provides a negative dentry.
      
      This fixes a regression where an unprivileged open O_CREAT fails with
      ENOENT instead of EACCES, introduced with the new atomic_open code.  It
      is tested by the open/08.t test in the pjd posix test suite, and was
      observed on top of fuse (backed by ceph-fuse).
      
      Signed-off-by: Sage Weil <sage@inktank.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
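The error-selection rule described in this commit message can be written as a small decision function. This is a Python sketch of the rule as stated, not the VFS code: `open_result()` and its parameters are hypothetical names, and errors are modelled as negative errno values.

```python
import errno


def open_result(atomic_open_ret, dentry_is_negative, create_error):
    """Pick the value to report after ->atomic_open() returns.

    atomic_open_ret: value returned by the filesystem (negative errno, 0 or 1)
    dentry_is_negative: True if the dentry handed back has no inode
    create_error: 0, or a negative errno explaining why creation is not allowed
    """
    if create_error:
        # Case that was already handled: plain -ENOENT from ->atomic_open().
        if atomic_open_ret == -errno.ENOENT:
            return create_error
        # Case added by the fix: return value 1 together with a negative dentry.
        if atomic_open_ret == 1 and dentry_is_negative:
            return create_error
    return atomic_open_ret
```

With this rule an unprivileged open with O_CREAT that hits a negative dentry reports the create error (for example -EACCES) rather than falling through to ENOENT.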
  5. 15 Aug, 2012 10 commits
  6. 09 Aug, 2012 3 commits