1. 14 Jun, 2010 1 commit
  2. 12 Jun, 2010 1 commit
    • ext4: Clean up s_dirt handling · a0375156
      Theodore Ts'o authored
      We don't need to set s_dirt in most of the ext4 code when journaling
      is enabled.  In ext3/4 some of the summary statistics for # of free
      inodes, blocks, and directories are calculated from the per-block
      group statistics when the file system is mounted or unmounted.  As a
      result the superblock doesn't have to be updated, either via the
      journal or by setting s_dirt.  There are a few exceptions, most
      notably when resizing the file system, where the superblock needs to
      be modified --- and in that case it should be done as a journalled
      operation if possible, and s_dirt set only in no-journal mode.
      
      This patch will optimize out some unneeded disk writes when using ext4
      with a journal.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
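The point that the superblock summary counters can be recomputed from per-block-group statistics can be illustrated with a toy model. This is a user-space Python sketch, not kernel code; `GroupDesc` and `summarize` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupDesc:
    """Hypothetical stand-in for one block group's statistics."""
    free_blocks: int
    free_inodes: int
    used_dirs: int


def summarize(groups):
    """Recompute filesystem-wide totals from the per-group statistics."""
    return {
        "free_blocks": sum(g.free_blocks for g in groups),
        "free_inodes": sum(g.free_inodes for g in groups),
        "used_dirs": sum(g.used_dirs for g in groups),
    }
```

Because the totals are derivable this way, a model like this never needs to persist them on every change, which is the reason the commit gives for skipping most superblock updates.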
  3. 05 Jun, 2010 1 commit
  4. 03 Jun, 2010 1 commit
  5. 28 May, 2010 2 commits
  6. 24 May, 2010 5 commits
  7. 22 May, 2010 3 commits
  8. 17 May, 2010 20 commits
  9. 16 May, 2010 6 commits
    • ext4: don't use quota reservation for speculative metadata · 72b8ab9d
      Eric Sandeen authored
      Because we can badly over-reserve metadata when we
      calculate worst-case, it complicates things for quota, since
      we must reserve and then claim later, retry on EDQUOT, etc.
      Quota is also a generally smaller pool than fs free blocks,
      so this over-reservation hurts more, and more often.
      
      I'm of the opinion that it's not the worst thing to allow
      metadata to push a user slightly over quota.  This simplifies
      the code and avoids the false quota rejections that result
      from worst-case speculation.
      
      This patch stops the speculative quota-charging for
      worst-case metadata requirements, and just charges quota
      when the blocks are allocated at writeout.  It also is
      able to remove the try-again loop on EDQUOT.
      
      This patch has been tested indirectly by running the xfstests
      suite with a hack to mount & enable quota prior to the test.
      
      I also did a more specific test of fragmenting freespace
      and then doing a large delalloc write under quota; quota
      stopped me at the right amount of file IO, and then the
      writeout generated enough metadata (due to the fragmentation)
      that it put me slightly over quota, as expected.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
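The behaviour described in this commit message (data writes are stopped at the quota limit, while metadata allocated at writeout may push usage slightly over) can be sketched with a toy model. This is Python, not kernel code; the `Quota` class, its method names, and `QuotaExceeded` are hypothetical and only mirror the behaviour the message describes.

```python
class QuotaExceeded(Exception):
    """Stands in for an EDQUOT failure in this toy model."""


class Quota:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def charge_data(self, blocks):
        # Data blocks are checked against the limit and refused if over.
        if self.used + blocks > self.limit:
            raise QuotaExceeded()
        self.used += blocks

    def charge_metadata(self, blocks):
        # Metadata is charged only when actually allocated at writeout,
        # and is allowed to exceed the limit (no speculative reservation).
        self.used += blocks
```

With no worst-case reservation made up front, there is nothing to claim later and no retry loop in the model.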
    • ext4: init statistics after journal recovery · 84061e07
      Dmitry Monakhov authored
      Currently the block/inode/dir counters are initialized before the
      journal is recovered. After journal recovery this information will
      probably change, and the free block count is critical for correct
      delalloc mode accounting.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=15768
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: clean up inode bitmaps manipulation in ext4_free_inode · d17413c0
      Dmitry Monakhov authored
      - Reorganize the locking scheme to batch two atomic operations into one.
        This also allows us to state that a healthy group must obey the
        following rule:
        ext4_free_inodes_count(sb, gdp) == ext4_count_free(inode_bitmap, NUM);
      - Fix a possible undefined pointer dereference.
      - Even if the group descriptor stats aren't accessible, we have to
        update the inode bitmaps.
      - Move the update of non-group members out of group_lock.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
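The consistency rule stated in this commit (the group descriptor's free inode count equals the number of free bits in the inode bitmap) can be checked in a toy model. This is a Python sketch, not the kernel's `ext4_count_free`; the function names are hypothetical, and it assumes a set bit means "in use" with bits numbered from the least significant bit of each byte.

```python
def count_free(bitmap: bytes, nbits: int) -> int:
    """Count clear (free) bits among the first nbits of the bitmap."""
    used = 0
    for i in range(nbits):
        # Assumption: bit i lives in byte i // 8 at position i % 8.
        if (bitmap[i // 8] >> (i % 8)) & 1:
            used += 1
    return nbits - used


def group_is_consistent(free_inodes_count: int, inode_bitmap: bytes,
                        nbits: int) -> bool:
    """Check the rule: descriptor count == free bits in the bitmap."""
    return free_inodes_count == count_free(inode_bitmap, nbits)
```

Updating the bitmap and the counter under one lock, as the first bullet describes, is what lets such a rule hold at every point where the lock is not held.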
    • ext4: Do not zero out uninitialized extents beyond i_size · 21ca087a
      Dmitry Monakhov authored
      The extents code will sometimes zero out blocks and mark them as
      initialized instead of splitting an extent into several smaller ones.
      This optimization, however, causes problems if the extent is beyond
      i_size because fsck will complain if there are uninitialized blocks
      after i_size as this can not be distinguished from an inode that has
      an incorrect i_size field.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=15742
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: don't scan/accumulate more pages than mballoc will allocate · c445e3e0
      Eric Sandeen authored
      There was a bug reported on RHEL5 that a 10G dd on a 12G box
      had a very, very slow sync after that.
      
      At issue was the loop in write_cache_pages scanning all the way
      to the end of the 10G file, even though the subsequent call
      to mpage_da_submit_io would only actually write a smallish amt; then
      we went back to the write_cache_pages loop ... wasting tons of time
      in calling __mpage_da_writepage for thousands of pages we would
      just revisit (many times) later.
      
      Upstream it's not such a big issue for sys_sync because we get
      to the loop with a much smaller nr_to_write, which limits the loop.
      
      However, talking with Aneesh he realized that fsync upstream still
      gets here with a very large nr_to_write and we face the same problem.
      
      This patch makes mpage_add_bh_to_extent stop the loop after we've
      accumulated 2048 pages, by setting mpd->io_done = 1; which ultimately
      causes the write_cache_pages loop to break.
      
      Repeating the test with a dirty_ratio of 80 (to leave something for
      fsync to do), I don't see huge IO performance gains, but the reduction
      in cpu usage is striking: 80% usage with stock, and 2% with the
      below patch.  Instrumenting the loop in write_cache_pages clearly
      shows that we are wasting time here.
      
      Eventually we need to change mpage_da_map_pages() to also submit its
      I/O to the block layer, subsuming mpage_da_submit_io(), and then
      change it to call ext4_get_blocks() multiple times.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
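The fix described in this commit, stopping the scan once 2048 pages have been accumulated, can be sketched as a capped accumulation loop. This is a toy Python model, not the kernel's write_cache_pages or mpage_add_bh_to_extent; `accumulate` and `MAX_PAGES` are hypothetical names, and the returned flag only plays the role that the message assigns to mpd->io_done.

```python
MAX_PAGES = 2048  # cap taken from the commit message


def accumulate(dirty_pages, cap=MAX_PAGES):
    """Collect dirty pages until the cap is hit.

    Returns (batch, io_done). io_done is True when the cap stopped the
    scan, so the caller should submit the batch before scanning further.
    """
    batch = []
    for page in dirty_pages:
        batch.append(page)
        if len(batch) >= cap:
            return batch, True  # stop scanning; do not visit later pages
    return batch, False
```

The saving comes from not touching pages beyond the cap: in this model a scan over 10,000 dirty pages visits only the first 2048 before handing the batch off.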
    • ext4: stop issuing discards if not supported by device · a30eec2a
      Eric Sandeen authored
      Turn off issuance of discard requests if the device does
      not support it - similar to the action we take for barriers.
      This will save a little computation time if a non-discardable
      device is mounted with -o discard, and also makes it obvious
      that it's not doing what was asked at mount time ...
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
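The "turn it off after the first unsupported request" pattern described in this commit can be sketched as follows. This is a toy Python model, not kernel code; the `Mount` class and its fields are hypothetical, and the use of EOPNOTSUPP as the "not supported" signal is an assumption that the commit message does not state.

```python
import errno


class Mount:
    """Hypothetical model of a mount with the discard option requested."""

    def __init__(self, discard_requested, device_supports_discard):
        self.discard = discard_requested
        self.device_supports_discard = device_supports_discard
        self.attempts = 0
        self.warnings = []

    def _issue_discard(self, start, length):
        # Assumed device behaviour: fail with -EOPNOTSUPP if unsupported.
        self.attempts += 1
        return 0 if self.device_supports_discard else -errno.EOPNOTSUPP

    def free_extent(self, start, length):
        if not self.discard:
            return
        if self._issue_discard(start, length) == -errno.EOPNOTSUPP:
            # Warn once and stop issuing discards for this mount.
            self.warnings.append("discard not supported, disabling")
            self.discard = False
```

After the first failure the flag is cleared, so later frees skip the discard path entirely and the single warning makes it visible that the mount option is not in effect.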