1. 01 Nov, 2011 3 commits
  2. 31 Oct, 2011 3 commits
    • T
      ext4: optimize locking for end_io extent conversion · b82e384c
      Theodore Ts'o committed
      Now that we are doing the locking correctly, we need to grab the
      i_completed_io_lock twice per end_io.  We can clean this up by
      removing the structure from the i_completed_io_list, and using this as
      the locking mechanism to prevent ext4_flush_completed_IO() racing
      against ext4_end_io_work(), instead of clearing the
      EXT4_IO_END_UNWRITTEN in io->flag.
      
      In addition, if ext4_convert_unwritten_extents() returns an error,
      we no longer keep the end_io structure on the linked list.  Keeping
      it there doesn't help, because it tends to lock up the file system and
      wedge the system.  That's one way to call attention to the problem, but
      it doesn't help the overall robustness of the system.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      b82e384c
    • T
      ext4: remove unnecessary call to waitqueue_active() · 4e298021
      Theodore Ts'o committed
      The usage of waitqueue_active() is not necessary, and introduces (I
      believe) a hard-to-hit race.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      4e298021
    • T
      ext4: Use correct locking for ext4_end_io_nolock() · d73d5046
      Tao Ma committed
      We must hold i_completed_io_lock when manipulating anything on the
      i_completed_io_list linked list.  This includes io->lock, which we
      were checking in ext4_end_io_nolock().
      
      So move this check to ext4_end_io_work().  This also has the bonus of
      avoiding extra work if it is already done without needing to take the
      mutex.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      d73d5046
  3. 29 Oct, 2011 7 commits
  4. 27 Oct, 2011 4 commits
    • E
      ext4: optimize memmove lengths in extent/index insertions · 80e675f9
      Eric Gouriou committed
      ext4_ext_insert_extent() (respectively ext4_ext_insert_index())
      was using EXT_MAX_EXTENT() (resp. EXT_MAX_INDEX()) to determine
      how many entries needed to be moved beyond the insertion point.
      In practice this means that (320 - I) * 24 bytes were memmove()'d
      when I is the insertion point, rather than (#entries - I) * 24 bytes.
      
      This patch uses EXT_LAST_EXTENT() (resp. EXT_LAST_INDEX()) instead
      to only move existing entries. The code flow is also simplified
      slightly to highlight similarities and reduce code duplication in
      the insertion logic.
      
      This patch reduces system CPU consumption by over 25% on a 4kB
      synchronous append DIO write workload when used with the
      pre-2.6.39 x86_64 memmove() implementation. With the much faster
      2.6.39 memmove() implementation we still see a decrease in
      system CPU usage between 2% and 7%.
      
      Note that the ext_debug() output changes with this patch, splitting
      some log information between entries. Users of the ext_debug() output
      should note that the "move %d" units changed from reporting the number
      of bytes moved to reporting the number of entries moved.
      Signed-off-by: Eric Gouriou <egouriou@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      80e675f9
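The difference between moving up to the node's capacity and moving only the entries in use can be sketched in user-space C. This is an illustration only, not the kernel code: `struct node`, `node_insert()` and `CAPACITY` are hypothetical names, with `CAPACITY` standing in for the limit that EXT_MAX_EXTENT() represents and `count` standing in for the number of existing entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAPACITY 8  /* hypothetical node capacity for this sketch */

struct entry {
    unsigned int block;  /* logical block, used only to tell entries apart */
};

struct node {
    size_t count;                  /* entries currently in use */
    struct entry slots[CAPACITY];
};

/*
 * Insert e at index pos, shifting only the entries that exist.
 * Returns the number of entries moved (entries, not bytes).
 */
static size_t node_insert(struct node *n, size_t pos, struct entry e)
{
    assert(n->count < CAPACITY && pos <= n->count);

    /* Move (count - pos) entries, not (CAPACITY - pos). */
    size_t to_move = n->count - pos;
    if (to_move > 0)
        memmove(&n->slots[pos + 1], &n->slots[pos],
                to_move * sizeof(struct entry));

    n->slots[pos] = e;
    n->count++;
    return to_move;
}
```

With three entries in an eight-slot node, inserting at index 2 moves one entry here, whereas a capacity-based length would have moved six slots' worth of bytes.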
    • E
      ext4: optimize ext4_ext_convert_to_initialized() · 6f91bc5f
      Eric Gouriou committed
      This patch introduces a fast path in ext4_ext_convert_to_initialized()
      for the case when the conversion can be performed by transferring
      the newly initialized blocks from the uninitialized extent into
      an adjacent initialized extent. Doing so removes the expensive
      invocations of memmove() which occur during extent insertion and
      the subsequent merge.
      
      In practice this should be the common case for clients performing
      append writes into files pre-allocated via
      fallocate(FALLOC_FL_KEEP_SIZE). In such a workload performed via
      direct IO and when using a suboptimal implementation of memmove()
      (x86_64 prior to the 2.6.39 rewrite), this patch reduces kernel CPU
      consumption by 32%.
      
      Two new trace points are added to ext4_ext_convert_to_initialized()
      to offer visibility into its operations. No exit trace point has
      been added due to the multiplicity of return points. This can be
      revisited once the upstream cleanup is backported.
      Signed-off-by: Eric Gouriou <egouriou@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      6f91bc5f
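The fast path described in this commit message, handing blocks from the front of an uninitialized extent to the initialized extent just before it, might look roughly like the following user-space sketch. It is a simplified model under stated assumptions: `struct extent` and `convert_by_transfer()` are hypothetical names, and the sketch ignores physical block numbers and maximum extent lengths, both of which real code would have to check.

```c
#include <stdbool.h>

struct extent {
    unsigned int start;  /* first logical block */
    unsigned int len;    /* number of blocks */
    bool initialized;
};

/*
 * Try to mark the first `count` blocks of the uninitialized extent `cur`
 * as initialized by growing `prev`, the initialized extent that ends
 * exactly where `cur` begins. No entries are inserted or shifted.
 * Returns true if the fast path applied, false if the caller must fall
 * back to the general (split and insert) path.
 */
static bool convert_by_transfer(struct extent *prev, struct extent *cur,
                                unsigned int count)
{
    if (!prev->initialized || cur->initialized)
        return false;
    if (prev->start + prev->len != cur->start)
        return false;  /* not logically adjacent */
    if (count == 0 || count >= cur->len)
        return false;  /* whole-extent conversion is left to the slow path */

    prev->len += count;   /* initialized extent absorbs the blocks */
    cur->start += count;  /* uninitialized extent shrinks from the front */
    cur->len -= count;
    return true;
}
```

Because only two length fields and one start field change, the per-write cost in this model does not depend on how many extents follow in the node, which is the property that avoids the memmove() calls.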
    • T
      jbd/jbd2: factor out common functions from the jbd[2] header files · 44606672
      Thomas Gleixner committed
      The state bits and the lock functions of jbd and jbd2 are
      identical.  Share them.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      44606672
    • R
      jbd2: fix build when CONFIG_BUG is not enabled · 44705754
      Randy Dunlap committed
      Fix build error when CONFIG_BUG is not enabled:
      
      fs/jbd2/transaction.c:1175:3: error: implicit declaration of function '__WARN'
      
      by changing __WARN() to WARN_ON(), as suggested by
      Arnaud Lacombe <lacombar@gmail.com>.
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Arnaud Lacombe <lacombar@gmail.com>
      44705754
  5. 26 Oct, 2011 10 commits
  6. 25 Oct, 2011 3 commits
    • D
      ext4: prevent stack overrun in ext4_file_open · cf803903
      Darrick J. Wong committed
      In ext4_file_open, the filesystem records the mountpoint of the first
      file that is opened after mounting the filesystem.  It does this by
      allocating a 64-byte stack buffer, calling d_path() to grab the mount
      point through which this file was accessed, and then memcpy()ing 64
      bytes into the superblock's s_last_mounted field, starting from the
      return value of d_path(), which is stored as "cp".  However, if cp >
      buf (which it frequently is since path components are prepended
      starting at the end of buf) then we can end up copying stack data into
      the superblock.
      
      Writing stack variables into the superblock doesn't sound like a great
      idea, so use strlcpy instead.  Andi Kleen suggested using strlcpy
      instead of strncpy.
      Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      cf803903
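The overrun described in this commit message comes from copying a fixed number of bytes starting at a pointer that sits near the end of a buffer. A user-space sketch of the situation and of a bounded, string-aware copy follows. It assumes nothing about the kernel code beyond what the message states: `path_at_end()` is a hypothetical stand-in for the way d_path() is described as filling the buffer from the end, and `bounded_copy()` is a hypothetical helper with strlcpy-like semantics, written out because strlcpy() is not available in every C library.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_LEN 64  /* size of the destination field in this sketch */

/*
 * Place `path` at the END of buf and return a pointer to its first
 * character, so the returned pointer is usually greater than buf.
 * Returns NULL if the path (with its NUL) does not fit.
 */
static char *path_at_end(char *buf, size_t buflen, const char *path)
{
    size_t n = strlen(path) + 1;  /* include the terminating NUL */
    if (n > buflen)
        return NULL;
    char *cp = buf + buflen - n;
    memcpy(cp, path, n);
    return cp;
}

/*
 * Copy a NUL-terminated string into a fixed-size field. Unlike
 * memcpy(dst, src, dstlen), this never reads past the source's NUL,
 * so nothing beyond the end of the source buffer is touched.
 * Truncates if needed and always NUL-terminates; returns strlen(src).
 */
static size_t bounded_copy(char *dst, size_t dstlen, const char *src)
{
    size_t srclen = strlen(src);
    if (dstlen > 0) {
        size_t n = srclen < dstlen - 1 ? srclen : dstlen - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

In this model, `memcpy(field, cp, FIELD_LEN)` would read `FIELD_LEN` bytes starting at `cp`, running past the end of `buf` whenever `cp > buf`; `bounded_copy(field, sizeof(field), cp)` stops at the path's terminator instead.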
    • D
      ext4: update EOFBLOCKS flag on fallocate properly · a4e5d88b
      Dmitry Monakhov committed
      EOFBLOCK_FL should be updated if called w/o FALLOC_FL_KEEP_SIZE.
      Currently it happens only if a new extent was allocated.
      
      TESTCASE:
      fallocate test_file -n -l4096
      fallocate test_file -l4096
      The last fallocate cmd has updated the size, but kept EOFBLOCK_FL set, and
      fsck will complain about that.
      
      Also remove the ping pong in ext4_fallocate() in case of new extents,
      where ext4_ext_map_blocks() clears the EOFBLOCKS bit, and later
      ext4_falloc_update_inode() restores it again.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a4e5d88b
    • D
      ext4: remove messy logic from ext4_ext_rm_leaf · 750c9c47
      Dmitry Monakhov committed
      - Both callers (truncate and punch_hole) already align the left end point,
        so we no longer need split logic here.
      - Remove dead duplicated code.
      - Call ext4_ext_dirty only after we have updated eh_entries, otherwise
        we'll lose the entries update. Regression caused by d583fb87,
        266th testcase in xfstests (http://patchwork.ozlabs.org/patch/120872)
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      750c9c47
  7. 22 Oct, 2011 1 commit
    • D
      ext4: cleanup ext4_ext_grow_indepth code · 1939dd84
      Dmitry Monakhov committed
      Currently the code gives the impression that the grow procedure is very
      complicated and that some mythical paths and blocks are involved. But in
      fact, growing in depth is a relatively simple procedure:
       1) Just create a new meta block and copy the root data to that block.
       2) Convert the root from extent to index if the old depth == 0
       3) Update the root block pointer
      
      This patch does the following:
       - Reorganize the code to make it more self-explanatory
       - Do not pass the path parameter to new_meta_block() in order to
         provoke allocation from the inode's group, because the top-level block
         should sit closer to its inode, not to the leaf data block.
      
         [ This happens anyway, due to logic in mballoc; we should drop
           the path parameter from new_meta_block() entirely.  -- tytso ]
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      1939dd84
  8. 21 Oct, 2011 3 commits
  9. 18 Oct, 2011 6 commits