1. 17 Dec, 2014 1 commit
  2. 06 Nov, 2014 1 commit
    • D
      ext4: move_extent improve bh vanishing success factor · 88c6b61f
      Dmitry Monakhov committed
      Xiaoguang Wang has reported sporadic EBUSY failures of ext4/302.
      Unfortunately, there is nothing we can do if some other task holds a
      reference to the BH, so we must return EBUSY in this case.  But we can try
      kicking the journal to see if the other task releases the bh reference
      after the commit is complete.  Also decrease false positives by
      properly checking for ENOSPC and retrying the allocation after kicking
      the journal, which is done by ext4_should_retry_alloc().
      
      [ Modified by tytso to properly check for ENOSPC. ]
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      88c6b61f
  3. 12 Oct, 2014 1 commit
  4. 02 Sep, 2014 4 commits
    • T
      ext4: rename ext4_ext_find_extent() to ext4_find_extent() · ed8a1a76
      Theodore Ts'o committed
      Make the function name less redundant.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      ed8a1a76
    • T
      ext4: reuse path object in ext4_move_extents() · 3bdf14b4
      Theodore Ts'o committed
      Reuse the path object in ext4_move_extents() so we don't unnecessarily
      free and reallocate it.
      
      Also clean up the get_ext_path() wrapper so that it has the same
      semantics of freeing the path object on error as ext4_ext_find_extent().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      3bdf14b4
    • T
      ext4: allow a NULL argument to ext4_ext_drop_refs() · b7ea89ad
      Theodore Ts'o committed
      Teach ext4_ext_drop_refs() to accept a NULL argument, much like
      kfree().  This allows us to drop a lot of checks to make sure path is
      non-NULL before calling ext4_ext_drop_refs().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      b7ea89ad
    • T
      ext4: teach ext4_ext_find_extent() to free path on error · 705912ca
      Theodore Ts'o committed
      Right now, there are places where it is all too easy to leak memory
      on an error path, via a usage like this:
      
      	struct ext4_ext_path *path = NULL;
      
      	while (...) {
      		...
      		path = ext4_ext_find_extent(inode, block, path, 0);
      		if (IS_ERR(path)) {
      			/* oops, if path was non-NULL before the call to
      			   ext4_ext_find_extent, we've leaked it!  :-(  */
      			...
      			return PTR_ERR(path);
      		}
      		...
      	}
      
      Unfortunately, there are some code paths where we are doing the following
      instead:
      
      	path = ext4_ext_find_extent(inode, block, orig_path, 0);
      
      and where it's important that we _not_ free orig_path in the case
      where ext4_ext_find_extent() returns an error.
      
      So change the function signature of ext4_ext_find_extent() so that it
      takes a struct ext4_ext_path ** for its third argument, and by
      default, on an error, it will free the struct ext4_ext_path, and then
      zero out the struct ext4_ext_path * pointer.  In order to avoid
      causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes
      ext4_ext_find_extent() to use the original behavior of forcing the
      caller to deal with freeing the original path pointer on the error
      case.
      
      The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this
      allows for a gentle transition and makes the patches easier to verify.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      705912ca
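The free-on-error convention described in this commit can be modelled in userspace. The sketch below is only an illustration: `struct path`, `find_path()` and `NOFREE_ON_ERR` are hypothetical stand-ins for `struct ext4_ext_path`, `ext4_ext_find_extent()` and `EXT4_EX_NOFREE_ON_ERR`, the lookup failure is simulated with a negative block number, and the function returns an errno-style int instead of an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ext4_ext_path. */
struct path {
    int depth;
};

#define NOFREE_ON_ERR 0x1 /* models EXT4_EX_NOFREE_ON_ERR */

/*
 * Look up `block`, reusing *ppath when the caller already has one.
 * On error the path is freed and *ppath is set to NULL, unless the
 * caller passed NOFREE_ON_ERR and owns a pre-existing path.
 * Returns 0 on success or a negative errno value.
 */
static int find_path(struct path **ppath, long block, int flags)
{
    struct path *p;
    int allocated = 0;

    assert(ppath != NULL);
    p = *ppath;
    if (p == NULL) {
        p = calloc(1, sizeof(*p));
        if (p == NULL)
            return -ENOMEM;
        allocated = 1;
    }

    if (block < 0) { /* simulated lookup failure */
        if (allocated || !(flags & NOFREE_ON_ERR)) {
            free(p);
            *ppath = NULL;
        }
        return -EIO;
    }

    p->depth = 1;
    *ppath = p;
    return 0;
}
```

Because the callee clears the caller's pointer, a loop that keeps reusing the same path variable cannot leak the old object on the error path; with `NOFREE_ON_ERR` the caller keeps ownership and must free it itself.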
  5. 31 Aug, 2014 2 commits
  6. 28 Jul, 2014 1 commit
  7. 13 May, 2014 1 commit
  8. 21 Apr, 2014 1 commit
    • L
      ext4: rename uninitialized extents to unwritten · 556615dc
      Lukas Czerner committed
      Currently in ext4 there is quite a mess when it comes to naming
      unwritten extents. Sometimes we call it uninitialized and sometimes we
      refer to it as unwritten.
      
      The right name for the extent which has been allocated but does not
      contain any written data is _unwritten_. Other file systems are
      using this name consistently, even the buffer head state refers to it as
      unwritten. We need to fix this confusion in ext4.
      
      This commit changes every reference to an uninitialized extent (meaning
      allocated but unwritten) to unwritten extent. This includes comments,
      function names and variable names. It even covers abbreviation of the
      word uninitialized (such as uninit) and some misspellings.
      
      This commit does not change any of the code paths at all. This has been
      confirmed by comparing md5sums of the assembly code of each object file
      after all the function names were stripped from it.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      556615dc
  9. 24 Feb, 2014 1 commit
  10. 18 Feb, 2014 1 commit
  11. 09 Nov, 2013 1 commit
  12. 17 Aug, 2013 1 commit
  13. 17 Jun, 2013 1 commit
  14. 20 Apr, 2013 1 commit
  15. 12 Apr, 2013 1 commit
  16. 10 Apr, 2013 1 commit
  17. 09 Apr, 2013 1 commit
    • D
      ext4: implementation of a new ioctl called EXT4_IOC_SWAP_BOOT · 393d1d1d
      Dr. Tilmann Bubeck committed
      Add a new ioctl, EXT4_IOC_SWAP_BOOT which swaps i_blocks and
      associated attributes (like i_blocks, i_size, i_flags, ...) from the
      specified inode with inode EXT4_BOOT_LOADER_INO (#5). This is
      typically used to store a boot loader in a secure part of the
      filesystem, where it can't be changed by a normal user by accident.
      The data blocks of the previous boot loader will be associated with
      the given inode.
      
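      This userspace program is a simple example of the usage:
      
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/ioctl.h>
      #include <unistd.h>
      
      /*
       * Fallback for headers that do not export the ioctl number. The value
       * is assumed to match the definition in the kernel's fs/ext4/ext4.h.
       */
      #ifndef EXT4_IOC_SWAP_BOOT
      #define EXT4_IOC_SWAP_BOOT _IO('f', 17)
      #endif
      
      int main(int argc, char *argv[])
      {
        int fd;
        int err;
      
        if ( argc != 2 ) {
          printf("usage: ext4-swap-boot-inode FILE-TO-SWAP\n");
          exit(1);
        }
      
        /* Open the file whose blocks will be swapped with the boot loader inode. */
        fd = open(argv[1], O_WRONLY);
        if ( fd < 0 ) {
          perror("open");
          exit(1);
        }
      
        err = ioctl(fd, EXT4_IOC_SWAP_BOOT);
        if ( err < 0 ) {
          perror("ioctl");
          exit(1);
        }
      
        close(fd);
        exit(0);
      }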
      
      [ Modified by Theodore Ts'o to fix a number of bugs in the original code.]
      Signed-off-by: Dr. Tilmann Bubeck <t.bubeck@reinform.de>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      393d1d1d
  18. 18 Mar, 2013 1 commit
  19. 04 Mar, 2013 1 commit
  20. 23 Feb, 2013 1 commit
  21. 18 Feb, 2013 1 commit
  22. 09 Feb, 2013 1 commit
    • T
      ext4: pass context information to jbd2__journal_start() · 9924a92a
      Theodore Ts'o committed
      So we can better understand what bits of ext4 are responsible for
      long-running jbd2 handles, use jbd2__journal_start() so we can pass
      context information for logging purposes.
      
      The recommended way for finding the longer-running handles is:
      
         T=/sys/kernel/debug/tracing
         EVENT=$T/events/jbd2/jbd2_handle_stats
         echo "interval > 5" > $EVENT/filter
         echo 1 > $EVENT/enable
      
         ./run-my-fs-benchmark
      
         cat $T/trace > /tmp/problem-handles
      
      This will list handles that were active for longer than 20ms (the
      filter threshold of 5 is in jiffies, i.e. 4ms each at the HZ=250
      implied by the figures here).  Having longer-running handles is bad,
      because a commit started at the wrong time could stall for those 20+
      milliseconds, which could delay an fsync() or an O_SYNC operation.
      Here is an example line from the trace file describing a handle which
      lived on for 311 jiffies, or over 1.2 seconds:
      
      postmark-2917  [000] ....   196.435786: jbd2_handle_stats: dev 254,32 
         tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
         dirtied_blocks 0
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      9924a92a
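The interval figures in this commit message are jiffy counts, so turning them into wall-clock time depends on the kernel's tick rate. The helper below is a small userspace sketch of that arithmetic; `jiffies_to_ms_model()` is a hypothetical name, and HZ=250 is an assumption inferred from the message's own numbers (311 jiffies described as just over 1.2 seconds).

```c
#include <assert.h>

/*
 * Convert a jiffy count to milliseconds for a given tick rate.
 * `hz` is the number of jiffies per second and must be non-zero.
 */
static unsigned long jiffies_to_ms_model(unsigned long jiffies,
                                         unsigned long hz)
{
    assert(hz != 0);
    return jiffies * 1000UL / hz;
}
```

Under that assumption, `jiffies_to_ms_model(5, 250)` gives the 20ms filter threshold and `jiffies_to_ms_model(311, 250)` gives 1244ms for the example trace line. On a kernel built with a different HZ the same filter value would select a different duration.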
  23. 02 Feb, 2013 1 commit
  24. 29 Nov, 2012 1 commit
    • T
      ext4: rationalize ext4_extents.h inclusion · 4a092d73
      Theodore Ts'o committed
      Previously, ext4_extents.h was being included at the end of ext4.h,
      which was bad for a number of reasons: (a) it was not being included
      in the expected place, and (b) it caused the header to be included
      multiple times.  There were #ifdef's to prevent this from causing any
      problems, but it still was unnecessary.
      
      By moving the function declarations that were in ext4_extents.h to
      ext4.h, which is where the function declarations for the rest of
      ext4 can be found, we can stop including ext4_extents.h from
      ext4.h at all, and then include ext4_extents.h only where it is
      needed in ext4's source files.
      
      It should be possible to move a few more things into ext4.h, and
      further reduce the number of source files that need to #include
      ext4_extents.h, but that's a cleanup for another day.
      Reported-by: Sachin Kamat <sachin.kamat@linaro.org>
      Reported-by: Wei Yongjun <weiyj.lk@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      4a092d73
  25. 29 Sep, 2012 1 commit
  26. 27 Sep, 2012 6 commits
    • W
      ext4: convert to use leXX_add_cpu() · ba39ebb6
      Wei Yongjun committed
      Convert cpu_to_leXX(leXX_to_cpu(E1) + E2) to use leXX_add_cpu().
      
      The dpatch engine was used to auto-generate this patch.
      (https://github.com/weiyj/dpatch)
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      ba39ebb6
    • W
      ext4: remove redundant offset check in mext_check_arguments() · cbb4ee83
      Wang Sheng-Hui committed
      In the check code above, if orig_start != donor_start, we would
      return -EINVAL. So here, orig_start must be equal to donor_start.
      Remove the redundant check here.
      Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      cbb4ee83
    • D
      ext4: reimplement uninit extent optimization for move_extent_per_page() · 8c854473
      Dmitry Monakhov committed
      An uninitialized extent may become initialized (by a parallel writeback
      task) at any moment after we drop i_data_sem, so we have to recheck the
      extent's state after we hold the page's lock and i_data_sem.
      
      If we are about to change a page's mapping, we must hold the page's lock
      in order to serialize against other users.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      8c854473
    • D
      ext4: clean up online defrag bugs in move_extent_per_page() · bb557488
      Dmitry Monakhov committed
      Non-exhaustive list of bugs:
      1) The uninitialized extent optimization does not hold the page's lock
         and simply replaces branches; after that, the writeback code goes
         crazy because the block mapping changed under its feet:
         kernel BUG at fs/ext4/inode.c:1434!  ( 288'th xfstress)
      
      2) An uninitialized extent may become initialized right after we
         drop i_data_sem, so the extent state must be rechecked
      
      3) Locked pages go uptodate via the following sequence:
         ->readpage(page); lock_page(page); use_that_page(page)
         But after readpage() one may invalidate the page because it is
         uptodate and unlocked (the reclaimer does that)
         As a result: kernel bug at include/linux/buffer_head.c:133!
      
      4) We call write_begin() with an already opened transaction, which
         results in the following deadlock:
      ->move_extent_per_page()
        ->ext4_journal_start()-> hold journal transaction
        ->write_begin()
          ->ext4_da_write_begin()
            ->ext4_nonda_switch()
              ->writeback_inodes_sb_if_idle()  --> will wait for journal_stop()
      
      5) try_to_release_page() may fail, and it does fail if one of the page's
         bhs was pinned by the journal
      
      6) If we are about to change a page's mapping we MUST hold its lock during
         the entire remapping procedure; this is true for both pages (the
         original and the donor one)
      
      Fixes:
      
      - Avoid (1) and (2) simply by temporarily dropping the uninitialized extent
        handling optimization; this will be reimplemented later.
      
      - Fix (3) by manually forcing the page to uptodate state w/o dropping its lock
      
      - Fix (4) by rearranging the existing locking:
        from: journal_start(); ->write_begin
        to: write_begin(); journal_extend()
      - Fix (5) simply by checking the return value
      - Fix (6) by locking both pages (the original and the donor one) during the
        extent swap with the help of mext_page_double_lock()
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      bb557488
    • D
      ext4: online defrag is not supported for journaled files · f066055a
      Dmitry Monakhov committed
      Proper block swap for inodes with full journaling enabled is
      a truly non-obvious task. To be on the safe side, let's
      explicitly disable it for now.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      f066055a
    • D
      ext4: move_extent code cleanup · 03bd8b9b
      Dmitry Monakhov committed
      - Remove useless checks, because it is too late to check that inode != NULL
        once it has already been dereferenced several times.
      - The double lock routines look very ugly, and their lock ordering relies
        on the order of i_ino, while other kernel code relies on the order of
        pointers.  Let's make them simple and clean.
      - Check that the inodes belong to the same SB as soon as possible.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      03bd8b9b
  27. 10 Sep, 2011 1 commit
  28. 06 Jun, 2011 1 commit
    • L
      ext4: Fix max file size and logical block counting of extent format file · f17722f9
      Lukas Czerner committed
      Kazuya Mio reported that he was able to hit BUG_ON(next == lblock)
      in ext4_ext_put_gap_in_cache() while creating a sparse file in extent
      format and filling the tail of the file up to its end. We will hit the
      BUG_ON when we write the last block (2^32-1) into the sparse file.
      
      The root cause of the problem lies in the fact that we specifically set
      s_maxbytes so that the block at s_maxbytes fits into the on-disk extent
      format, which is 32 bits long. However, we are not storing the start and
      end block numbers, but rather the start block number and the length in
      blocks. It means that in order to cover the extent from 0 to
      EXT_MAX_BLOCK we need EXT_MAX_BLOCK+1 to fit into len (because we are
      counting block 0 as well), and it does not.
      
      The only way to fix it without changing the meaning of the struct
      ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes
      by one fs block so we can cover the whole extent we can get by the
      on-disk extent format.
      
      Also, in many places EXT_MAX_BLOCK is used as a length instead of the
      maximum logical block number that the name suggests; it is all a bit
      messy. So this commit renames it to EXT_MAX_BLOCKS and changes its usage
      in some places to actually be the maximum number of blocks in the extent.
      
      The bug which this commit fixes can be reproduced as follows:
      
       dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-2))
       sync
       dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((2**32-1))
      Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      f17722f9
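The off-by-one at the heart of this fix is easy to check numerically: a range that starts at block 0 and ends at the largest 32-bit block number contains one block more than a 32-bit length can express. The helpers below are a hypothetical userspace illustration of that arithmetic, not ext4 code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks in the inclusive range [first, last]. */
static uint64_t range_len(uint32_t first, uint32_t last)
{
    assert(first <= last);
    return (uint64_t)last - first + 1;
}

/* Whether a block count can be stored in a 32-bit length field. */
static int fits_in_u32(uint64_t len)
{
    return len <= UINT32_MAX;
}
```

`range_len(0, UINT32_MAX)` is 2^32, which `fits_in_u32()` rejects, while `range_len(0, UINT32_MAX - 1)` is 2^32-1 and fits. Lowering the maximum file size by one filesystem block therefore keeps every representable range within the length field.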
  29. 19 May, 2011 1 commit
  30. 28 Oct, 2010 1 commit
  31. 27 Jul, 2010 1 commit