1. 12 Jul, 2008 2 commits
    • ext4: New inode allocation for FLEX_BG meta-data groups. · 772cb7c8
      Jose R. Santos committed
      This patch mostly controls the way inodes are allocated in order to
      make ialloc aware of flex_bg block group grouping.  It achieves this
      by bypassing the Orlov allocator when block group meta-data are packed
      together through mke2fs.  Since the impact on the block allocator is
      minimal, this patch should have little or no effect on other block
      allocation algorithms. By controlling the inode allocation, it can
      basically control where the initial search for new blocks begins and
      thus indirectly manipulate the block allocator.
      
      This allocator favors data and meta-data locality so the disk will
      gradually be filled from block group zero upward.  This helps improve
      performance by reducing seek time.  Since the group of inode tables
      within one flex_bg is treated as one giant inode table, uninitialized
      block groups would not need to partially initialize as many inode
      tables as with Orlov, which would help fsck time as the filesystem usage
      goes up.
      Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
      Signed-off-by: Valerie Clement <valerie.clement@bull.net>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
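The grouping described above can be pictured with a small sketch. It assumes that mke2fs packs a power-of-two number of block groups into each flex_bg, so that a block group maps to its flex group with a shift. The struct and function names here are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical layout descriptor: 2^log_groups_per_flex block groups
 * share one packed run of meta-data (assumption for this sketch). */
struct flex_layout {
    unsigned log_groups_per_flex;
};

/* Map a block group number to the flex group that contains it. */
static uint32_t flex_group_of(const struct flex_layout *l, uint32_t block_group)
{
    return block_group >> l->log_groups_per_flex;
}

/* First block group of a flex group: a natural place to begin a
 * search that fills the disk from block group zero upward. */
static uint32_t first_group_in_flex(const struct flex_layout *l, uint32_t flex)
{
    return flex << l->log_groups_per_flex;
}
```

With 16 groups per flex group (a shift of 4), block groups 0 to 15 fall into flex group 0 and block group 17 falls into flex group 1, whose first member is block group 16.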
    • jbd2: Add commit time into the commit block · 736603ab
      Theodore Ts'o committed
      Carlo Wood has demonstrated that it's possible to recover deleted
      files from the journal.  Something that will make this easier is if we
      can put the time of the commit into the commit block.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  2. 14 Jul, 2008 3 commits
  3. 12 Jul, 2008 16 commits
  4. 13 Jul, 2008 2 commits
  5. 12 Jul, 2008 1 commit
    • Fix reference counting race on log buffers · 49641f1a
      Dave Chinner committed
      When we release the iclog, we do an atomic_dec_and_lock to determine if
      we are the last reference and need to trigger update of log headers and
      writeout.  However, in xlog_state_get_iclog_space() we also need to
      check if we have the last reference count there.  If we do, we release
      the log buffer, otherwise we decrement the reference count.
      
      But the compare and decrement in xlog_state_get_iclog_space() is not
      atomic, so both places can see a reference count of 2 and neither will
      release the iclog.  That leads to a filesystem hang.
      
      Close the race by replacing the atomic_read() and atomic_dec() pair with
      atomic_add_unless() to ensure that they are executed atomically.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Tim Shimmin <tes@sgi.com>
      Tested-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
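The race and its fix can be illustrated in userspace. The sketch below is not the kernel code: it rebuilds the semantics of atomic_add_unless() on top of C11 atomics, and the helper put_ref_unless_last() is a hypothetical name introduced only for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of atomic_add_unless(v, a, u): add a to *v
 * unless *v equals u. Returns true if the add was performed. */
static bool add_unless(atomic_int *v, int a, int u)
{
    int old = atomic_load(v);

    while (old != u) {
        /* On failure, old is refreshed with the current value and
         * the comparison against u is repeated. */
        if (atomic_compare_exchange_weak(v, &old, old + a))
            return true;
    }
    return false;
}

/* Hypothetical helper: drop one reference unless the caller holds the
 * last one. Returns true when the caller holds the last reference and
 * must take the release path itself. */
static bool put_ref_unless_last(atomic_int *refcnt)
{
    return !add_unless(refcnt, -1, 1);
}
```

Because the comparison and the decrement happen in one compare-and-swap, two callers that both start from a count of 2 cannot both conclude that someone else will do the release: one decrements to 1, and the other then observes 1 and takes the release path.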
  6. 11 Jul, 2008 2 commits
    • exec: fix stack excutability without PT_GNU_STACK · 96a8e13e
      Hugh Dickins committed
      Kernel Bugzilla #11063 points out that on some architectures (e.g. x86_32)
      exec'ing an ELF without a PT_GNU_STACK program header should default to an
      executable stack; but this got broken by the unlimited argv feature because
      stack vma is now created before the right personality has been established:
      so breaking old binaries using nested function trampolines.
      
      Therefore re-evaluate VM_STACK_FLAGS in setup_arg_pages, where stack
      vm_flags used to be set, before the mprotect_fixup.  Checking through
      our existing VM_flags, none would have changed since insert_vm_struct:
      so this seems safer than finding a way through the personality labyrinth.
      
      Reported-by: pageexec@freemail.hu
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: Fix flags in ocfs2_file_lock · e988cf1c
      Mark Fasheh committed
      The stack-glue merge changed the way we use flags in dlmglue in that we now
      use the fs/dlm equivalents. Unfortunately, a merge error left the new flock
      code only partially updated. This took a while to show up though, because
      the lock level constants are actually identical between o2dlm and fs/dlm.
      The *_CONVERT and *_NOQUEUE flags have different values though, which is
      eventually causing a crash in flags_to_o2dlm().
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
  7. 09 Jul, 2008 2 commits
    • reiserfs: discard prealloc in reiserfs_delete_inode · eb35c218
      Jeff Mahoney committed
      With the removal of struct file from the xattr code,
      reiserfs_file_release() isn't used anymore, so the prealloc isn't
      discarded.  This causes hangs later down the line.
      
      This patch adds it to reiserfs_delete_inode.  In most cases it will be a
      no-op due to it already having been called, but will avoid hangs with
      xattrs.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • NFS: Fix readdir cache invalidation · 2aac05a9
      Trond Myklebust committed
      invalidate_inode_pages2_range() takes page offset arguments, not byte
      ranges.
      
      Another thought is that individual pages might perhaps get evicted by VM
      pressure, in which case we might perhaps want to re-read not only the
      evicted page, but all subsequent pages too (in case the server returns
      more/less data per page so that the alignment of the next entry
      changes). We should therefore remove the condition that we only do this on
      page->index==0.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
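The distinction between byte ranges and page offsets is easy to get wrong, so a short sketch may help. It assumes 4 KiB pages and an inclusive end index; the macro, struct and function names are hypothetical and only illustrate the conversion a caller has to perform before passing a range to a page-indexed interface.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Inclusive range of page indices. */
struct page_range {
    uint64_t first;
    uint64_t last;
};

/* Convert a byte offset into the index of the page that holds it. */
static uint64_t byte_to_page_index(uint64_t byte_off)
{
    return byte_off >> SKETCH_PAGE_SHIFT;
}

/* Convert an inclusive byte range into an inclusive page-index range. */
static struct page_range bytes_to_page_range(uint64_t start, uint64_t end)
{
    struct page_range r = {
        byte_to_page_index(start),
        byte_to_page_index(end),
    };
    return r;
}
```

Passing raw byte offsets where page indices are expected would address pages far beyond the intended ones: byte 8192 is on page 2, not page 8192.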
  8. 08 Jul, 2008 1 commit
  9. 06 Jul, 2008 2 commits
  10. 05 Jul, 2008 7 commits
  11. 03 Jul, 2008 1 commit
    • 9p: fix O_APPEND in legacy mode · 2e4bef41
      Eric Van Hensbergen committed
      The legacy protocol's open operation doesn't handle an append operation
      (it is expected that the client take care of it).  We were incorrectly
      passing the extended protocol's flag through even in legacy mode.  This
      was reported in bugzilla report #10689.  This patch fixes the problem
      by disallowing extended protocol open modes from being passed in legacy
      mode and implementing append functionality on the client side by adding
      a seek after the open.
      Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
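The "seek after the open" idea can be sketched with ordinary POSIX calls. This is a userspace analogy, not the 9p client code, and open_emulating_append() is a hypothetical name: the append flag is masked out before the open request is issued, and the file position is then moved to the end.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path without forwarding O_APPEND, then emulate append by
 * seeking to the end of the file. Returns a descriptor or -1. */
static int open_emulating_append(const char *path, int flags, mode_t mode)
{
    int want_append = flags & O_APPEND;
    int fd = open(path, flags & ~O_APPEND, mode);

    if (fd < 0)
        return -1;

    if (want_append && lseek(fd, 0, SEEK_END) == (off_t)-1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that a single seek only positions the first write. Unlike a real O_APPEND descriptor, later writes are not repositioned to the end if another writer extends the file in the meantime.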
  12. 01 Jul, 2008 1 commit
    • Properly notify block layer of sync writes · 18ce3751
      Jens Axboe committed
      fsync_buffers_list() and sync_dirty_buffer() both issue async writes and
      then immediately wait on them. Conceptually, that makes them sync writes
      and we should treat them as such so that the IO schedulers can handle
      them appropriately.
      
      This patch fixes a write starvation issue that Lin Ming reported, where
      xx is stuck for more than 2 minutes because of a large number of
      synchronous IO in the system:
      
      INFO: task kjournald:20558 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
      message.
      kjournald     D ffff810010820978  6712 20558      2
      ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2
      ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb
      0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537
      Call Trace:
      [<ffffffff803ba6f2>] kobject_get+0x12/0x17
      [<ffffffff80247537>] getnstimeofday+0x2f/0x83
      [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
      [<ffffffff8066d195>] io_schedule+0x5d/0x9f
      [<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f
      [<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f
      [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
      [<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78
      [<ffffffff80243909>] wake_bit_function+0x0/0x23
      [<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb
      [<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6
      [<ffffffff8023a676>] lock_timer_base+0x26/0x4b
      [<ffffffff8030300a>] kjournald+0xc1/0x1fb
      [<ffffffff802438db>] autoremove_wake_function+0x0/0x2e
      [<ffffffff80302f49>] kjournald+0x0/0x1fb
      [<ffffffff802437bb>] kthread+0x47/0x74
      [<ffffffff8022de51>] schedule_tail+0x28/0x5d
      [<ffffffff8020cac8>] child_rip+0xa/0x12
      [<ffffffff80243774>] kthread+0x0/0x74
      [<ffffffff8020cabe>] child_rip+0x0/0x12
      
      Lin Ming confirms that this patch fixes the issue. I've run tests with
      it for the past week and no ill effects have been observed, so I'm
      proposing it for inclusion into 2.6.26.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>