1. 05 Sep, 2014 1 commit
  2. 02 Sep, 2014 18 commits
    • Z
      ext4: track extent status tree shrinker delay statistics · eb68d0e2
      Committed by Zheng Liu
      This commit adds some statistics to the extent status tree shrinker.  The
      purpose of adding these is to collect more details when we
      encounter a stall caused by the extent status tree shrinker.  Here we count
      the following statistics:
        stats:
          the number of all objects on all extent status trees
          the number of reclaimable objects on lru list
          cache hits/misses
          the last sorted interval
          the number of inodes on lru list
        average:
          scan time for shrinking some objects
          the number of shrunk objects
        maximum:
          the inode that has max nr. of objects on lru list
          the maximum scan time for shrinking some objects
      
      The output looks like below:
        $ cat /proc/fs/ext4/sda1/es_shrinker_info
        stats:
          28228 objects
          6341 reclaimable objects
          5281/631 cache hits/misses
          586 ms last sorted interval
          250 inodes on lru list
        average:
          153 us scan time
          128 shrunk objects
        maximum:
          255 inode (255 objects, 198 reclaimable)
          125723 us max scan time
      
      If the lru list has never been sorted, the following line will not be
      printed:
          586 ms last sorted interval
      If the lru list is empty, the following lines will not be printed
      either:
          250 inodes on lru list
        ...
        maximum:
          255 inode (255 objects, 198 reclaimable)
          0 us max scan time
      
      This commit also defines a new trace point that prints some
      details in __ext4_es_shrink().
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      eb68d0e2
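The "average" and "maximum" counters listed above can be illustrated with a small userspace sketch. The commit message does not say how the average is computed, so this assumes a plain running mean; the struct and function names are hypothetical and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters; field names are illustrative, not the kernel's. */
struct shrink_stats {
	uint64_t scans;          /* number of shrinker scans recorded */
	uint64_t total_scan_us;  /* sum of all scan times, in microseconds */
	uint64_t total_shrunk;   /* sum of objects reclaimed by all scans */
	uint64_t max_scan_us;    /* longest single scan seen so far */
};

/* Record one shrinker scan: how long it took and how many objects it freed. */
static void record_scan(struct shrink_stats *s, uint64_t scan_us, uint64_t shrunk)
{
	assert(s);
	s->scans++;
	s->total_scan_us += scan_us;
	s->total_shrunk += shrunk;
	if (scan_us > s->max_scan_us)
		s->max_scan_us = scan_us;
}

/* Assumption: a plain mean over all scans; returns 0 before the first scan. */
static uint64_t avg_scan_us(const struct shrink_stats *s)
{
	return s->scans ? s->total_scan_us / s->scans : 0;
}

static uint64_t avg_shrunk(const struct shrink_stats *s)
{
	return s->scans ? s->total_shrunk / s->scans : 0;
}
```

A large gap between the average and the maximum scan time, as in the sample output (153 us versus 125723 us), is the kind of outlier these counters are meant to expose.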
    • Z
      ext4: improve extents status tree trace point · e963bb1d
      Committed by Zheng Liu
      This commit improves the trace points of the extent status tree.  We rename
      trace_ext4_es_shrink_enter in ext4_es_count() because it is also used
      in ext4_es_scan() and we cannot tell the two apart in the output.
      
      This commit also fixes a variable name in a trace point to
      keep it consistent with the others.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      e963bb1d
    • S
      ext4: fix comments about get_blocks · d91bd2c1
      Committed by Seunghun Lee
      get_blocks is renamed to get_block.
      Signed-off-by: Seunghun Lee <waydi1@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      d91bd2c1
    • D
      ext4: enable block_validity by default · 45f1a9c3
      Committed by Darrick J. Wong
      Enable by default the block_validity feature, which checks for
      collisions between newly allocated blocks and critical system
      metadata.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      45f1a9c3
    • T
      jbd2: fold __wait_cp_io into jbd2_log_do_checkpoint() · 88fe1acb
      Committed by Theodore Ts'o
      __wait_cp_io() is only called by jbd2_log_do_checkpoint().  Fold it in
      to make it a bit easier to understand.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      88fe1acb
    • T
      jbd2: fold __process_buffer() into jbd2_log_do_checkpoint() · be1158cc
      Committed by Theodore Ts'o
      __process_buffer() is only called by jbd2_log_do_checkpoint(), and it
      had a very complex locking protocol where it would be called with the
      j_list_lock, and sometimes exit with the lock held (if the return code
      was 0), or release the lock.
      
      This was confusing both to humans and to smatch (which erroneously
      complained that the lock was taken twice).
      
      Folding __process_buffer() to the caller allows us to simplify the
      control flow, making the resulting function easier to read and reason
      about, and dropping the compiled size of fs/jbd2/checkpoint.c by 150
      bytes (over 4% of the text size).
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      be1158cc
    • T
      ext4: rename ext4_ext_find_extent() to ext4_find_extent() · ed8a1a76
      Committed by Theodore Ts'o
      Make the function name less redundant.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      ed8a1a76
    • T
      ext4: reuse path object in ext4_move_extents() · 3bdf14b4
      Committed by Theodore Ts'o
      Reuse the path object in ext4_move_extents() so we don't unnecessarily
      free and reallocate it.
      
      Also clean up the get_ext_path() wrapper so that it has the same
      semantics of freeing the path object on error as ext4_ext_find_extent().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      3bdf14b4
    • T
      ext4: reuse path object in ext4_ext_shift_extents() · ee4bd0d9
      Committed by Theodore Ts'o
      Now that the semantics of ext4_ext_find_extent() are much cleaner,
      it's safe and more efficient to reuse the path object across the
      multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      ee4bd0d9
    • T
      ext4: teach ext4_ext_find_extent() to realloc path if necessary · 10809df8
      Committed by Theodore Ts'o
      This adds additional safety in case we for some reason end up reusing a
      path structure which isn't big enough for the current depth of the inode.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      10809df8
    • T
      ext4: allow a NULL argument to ext4_ext_drop_refs() · b7ea89ad
      Committed by Theodore Ts'o
      Teach ext4_ext_drop_refs() to accept a NULL argument, much like
      kfree().  This allows us to drop a lot of checks to make sure path is
      non-NULL before calling ext4_ext_drop_refs().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      b7ea89ad
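The NULL-tolerant convention described above can be sketched in userspace C. The types below are hypothetical stand-ins (not the ext4 structures), and drop_refs() only mimics the idea of releasing one buffer reference per tree level.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 4

/* Hypothetical stand-ins for the kernel types; not the real ext4 structures. */
struct buf { int refcount; };
struct path_entry { struct buf *bh; };
struct ext_path { int depth; struct path_entry level[MAX_LEVELS]; };

/*
 * Release the buffer held at each level of the path.
 * A NULL path is a no-op, in the same spirit as kfree(NULL), so callers
 * do not need their own "if (path)" guard.
 */
static void drop_refs(struct ext_path *path)
{
	int i;

	if (!path)
		return;
	assert(path->depth >= 0 && path->depth < MAX_LEVELS);
	for (i = 0; i <= path->depth; i++) {
		if (path->level[i].bh) {
			path->level[i].bh->refcount--;
			path->level[i].bh = NULL;	/* makes a second call harmless */
		}
	}
}
```

With the check inside the helper, a cleanup path can call drop_refs(path) unconditionally, which is the simplification the commit message describes.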
    • T
      ext4: call ext4_ext_drop_refs() from ext4_ext_find_extent() · 523f431c
      Committed by Theodore Ts'o
      In nearly all of the calls to ext4_ext_find_extent() where the caller
      is trying to recycle the path object, ext4_ext_drop_refs() gets called
      to release the buffer heads before the path object gets overwritten.
      To simplify things for the callers, and to avoid the possibility of a
      memory leak, make ext4_ext_find_extent() responsible for dropping the
      buffers.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      523f431c
    • T
      ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code · dfe50809
      Committed by Theodore Ts'o
      Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(),
      ext4_split_extent(), ext4_convert_unwritten_extents_endio().
      
      This requires fixing all of their callers to cope with
      ext4_ext_find_extent() freeing the struct ext4_ext_path object in case
      of an error, and there are interlocking dependencies all the way up to
      ext4_ext_map_blocks(), ext4_swap_extents(), and
      ext4_ext_remove_space().
      
      Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it
      is no longer necessary.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      dfe50809
    • T
      ext4: drop EXT4_EX_NOFREE_ON_ERR in convert_initialized_extent() · 4f224b8b
      Committed by Theodore Ts'o
      Transfer responsibility for freeing struct ext4_ext_path on error to
      ext4_ext_find_extent().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      4f224b8b
    • T
      ext4: collapse ext4_convert_initialized_extents() · e8b83d93
      Committed by Theodore Ts'o
      The function ext4_convert_initialized_extents() is only called by a
      single function --- ext4_ext_convert_initalized_extents().  Inline the
      code and get rid of the unnecessary bits in order to simplify the code.
      
      Rename ext4_ext_convert_initalized_extents() to
      convert_initalized_extents() since it's a static function that is
      actually only used in a single caller, ext4_ext_map_blocks().
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      e8b83d93
    • T
      ext4: teach ext4_ext_find_extent() to free path on error · 705912ca
      Committed by Theodore Ts'o
      Right now, there are places where it is all too easy to leak memory
      on an error path, via a usage like this:
      
      	struct ext4_ext_path *path = NULL;
      
      	while (...) {
      		...
      		path = ext4_ext_find_extent(inode, block, path, 0);
      		if (IS_ERR(path)) {
      			/* oops, if path was non-NULL before the call to
      			   ext4_ext_find_extent, we've leaked it!  :-(  */
      			...
      			return PTR_ERR(path);
      		}
      		...
      	}
      
      Unfortunately, there are some code paths where we are doing the following
      instead:
      
      	path = ext4_ext_find_extent(inode, block, orig_path, 0);
      
      and where it's important that we _not_ free orig_path in the case
      where ext4_ext_find_extent() returns an error.
      
      So change the function signature of ext4_ext_find_extent() so that it
      takes a struct ext4_ext_path ** for its third argument, and by
      default, on an error, it will free the struct ext4_ext_path, and then
      zero out the struct ext4_ext_path * pointer.  In order to avoid
      causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes
      ext4_ext_find_extent() to use the original behavior of forcing the
      caller to deal with freeing the original path pointer on the error
      case.
      
      The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this
      allows for a gentle transition and makes the patches easier to verify.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      705912ca
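The ownership change described above (pass a pointer to the path pointer, free on error by default, opt out with a flag) can be sketched in userspace C. Everything here is a hypothetical simplification: find_extent(), struct ext_path and EX_NOFREE_ON_ERR are illustrative names, and the sketch returns an int status, which is not necessarily how the kernel function reports errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the ext4 definitions. */
struct ext_path { int depth; };
#define EX_NOFREE_ON_ERR 0x1	/* caller keeps ownership of *ppath on error */

/*
 * Look up `block`, reusing *ppath when the caller passes one in.
 * Returns 0 on success or a negative errno on failure.  On failure a
 * caller-supplied path is freed and *ppath is set to NULL, unless
 * EX_NOFREE_ON_ERR is set, in which case *ppath is left untouched.
 * A negative block number simulates a failed lookup.
 */
static int find_extent(long block, struct ext_path **ppath, int flags)
{
	struct ext_path *orig = *ppath;
	struct ext_path *path = orig ? orig : calloc(1, sizeof(struct ext_path));

	if (!path)
		return -ENOMEM;
	if (block < 0) {
		if (path != orig) {
			free(path);		/* allocated in this call */
		} else if (!(flags & EX_NOFREE_ON_ERR)) {
			free(orig);
			*ppath = NULL;		/* caller cannot reuse a stale pointer */
		}
		return -EIO;
	}
	path->depth = 0;
	*ppath = path;
	return 0;
}
```

Because the callee clears the caller's pointer, the loop from the commit message can return on error without leaking the path it passed in on a previous iteration.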
    • T
      ext4: fix accidental flag aliasing in ext4_map_blocks flags · bd30d702
      Committed by Theodore Ts'o
      Commit b8a86845 introduced an accidental flag aliasing between
      EXT4_EX_NOCACHE and EXT4_GET_BLOCKS_CONVERT_UNWRITTEN.
      
      Fortunately, this didn't introduce any untoward side effects --- we
      got lucky.  Nevertheless, fix this and leave a warning to hopefully
      keep this from happening in the future.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      bd30d702
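One way to guard against this kind of accidental aliasing is a compile-time check that two flag families never share a bit. The sketch below uses made-up flag names and values (they are not the ext4 definitions) and the C11 _Static_assert keyword; the commit itself only mentions leaving a warning.

```c
#include <assert.h>

/* Hypothetical flag values for illustration; not the real ext4 definitions. */
#define GET_BLOCKS_CREATE		0x0001
#define GET_BLOCKS_CONVERT_UNWRITTEN	0x0100
#define GET_BLOCKS_FLAG_MASK		0x0fff	/* bits reserved for the first family */

#define EX_NOCACHE			0x1000	/* second family lives above the mask */
#define EX_FORCE_CACHE			0x2000

/* Fail the build if the second family ever strays into the first family's bits. */
_Static_assert(((EX_NOCACHE | EX_FORCE_CACHE) & GET_BLOCKS_FLAG_MASK) == 0,
	       "EX_* flags alias GET_BLOCKS_* flags");

/* Runtime helper: non-zero when two flag words share at least one bit. */
static int flags_alias(unsigned int a, unsigned int b)
{
	return (a & b) != 0;
}
```

If someone later defined EX_NOCACHE as 0x0100 in this sketch, the build would stop at the _Static_assert instead of silently treating one flag as the other.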
    • T
      ext4: fix ZERO_RANGE bug hidden by flag aliasing · 713e8dde
      Committed by Theodore Ts'o
      We accidentally aliased the EXT4_EX_NOCACHE and EXT4_GET_CONVERT_UNWRITTEN
      flags, which apparently was hiding a bug that was unmasked when this
      flag aliasing issue was addressed (see the subsequent commit).  The
      reproduction case was:
      
         fsx -N 10000 -l 500000 -r 4096 -t 4096 -w 4096 -Z -R -W /vdb/junk
      
      ... which would cause fsx to report corruption in the data file.
      
      The fix we have is a bit of an overkill, but I'd much rather be
      conservative for now, and we can optimize ZERO_RANGE_FL handling
      later.  The fact that we need to zap the extent_status cache for the
      inode is unfortunate, but correctness is far more important than
      performance.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      713e8dde
  3. 01 Sep, 2014 1 commit
  4. 31 Aug, 2014 3 commits
  5. 30 Aug, 2014 11 commits
  6. 29 Aug, 2014 6 commits
    • D
      ext4: fix same-dir rename when inline data directory overflows · d80d448c
      Committed by Darrick J. Wong
      When performing a same-directory rename, it's possible that adding or
      setting the new directory entry will cause the directory to overflow
      the inline data area, which causes the directory to be converted to an
      extent-based directory.  Under this circumstance it is necessary to
      re-read the directory when deleting the old dirent because the "old
      directory" context still points to i_block in the inode table, which
      is now an extent tree root!  The delete fails with an FS error, and
      the subsequent fsck complains about incorrect link counts and
      hardlinked directories.
      
      Test case (originally found with flat_dir_test in the metadata_csum
      test program):
      
      # mkfs.ext4 -O inline_data /dev/sda
      # mount /dev/sda /mnt
      # mkdir /mnt/x
      # touch /mnt/x/changelog.gz /mnt/x/copyright /mnt/x/README.Debian
      # sync
      # for i in /mnt/x/*; do mv $i $i.longer; done
      # ls -la /mnt/x/
      total 0
      -rw-r--r-- 1 root root 0 Aug 25 12:03 changelog.gz.longer
      -rw-r--r-- 1 root root 0 Aug 25 12:03 copyright
      -rw-r--r-- 1 root root 0 Aug 25 12:03 copyright.longer
      -rw-r--r-- 1 root root 0 Aug 25 12:03 README.Debian.longer
      
      (Hey!  Why are there four files now??)
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      d80d448c
    • D
      jbd2: fix descriptor block size handling errors with journal_csum · db9ee220
      Committed by Darrick J. Wong
      It turns out that there are some serious problems with the on-disk
      format of journal checksum v2.  The foremost is that the function to
      calculate descriptor tag size returns sizes that are too big.  This
      causes alignment issues on some architectures and is compounded by the
      fact that some parts of jbd2 use the structure size (incorrectly) to
      determine the presence of a 64bit journal instead of checking the
      feature flags.
      
      Therefore, introduce journal checksum v3, which enlarges the
      descriptor block tag format to allow for full 32-bit checksums of
      journal blocks, fix the journal tag function to return the correct
      sizes, and fix the jbd2 recovery code to use feature flags to
      determine 64bitness.
      
      Add a few function helpers so we don't have to open-code quite so
      many pieces.
      
      Switching to a 16-byte block size was found to increase journal size
      overhead by a maximum of 0.1%, to convert a 32-bit journal with no
      checksumming to a 32-bit journal with checksum v3 enabled.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reported-by: TR Reardon <thomas_reardon@hotmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      db9ee220
    • D
      jbd2: fix infinite loop when recovering corrupt journal blocks · 022eaa75
      Committed by Darrick J. Wong
      When recovering the journal, don't fall into an infinite loop if we
      encounter a corrupt journal block.  Instead, just skip the block and
      return an error, which fails the mount and thus forces the user to run
      a full filesystem fsck.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      022eaa75
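The "skip the block and return an error" behaviour can be illustrated with a toy replay loop. This is a hypothetical in-memory model, not the jbd2 recovery code: struct jblock and replay() are invented for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical in-memory model of journal blocks; not the jbd2 structures. */
struct jblock { int corrupt; int replayed; };

/*
 * Replay every block at most once.  A corrupt block is skipped: the
 * cursor still advances, so the loop always terminates.  The first
 * error is remembered and returned so the caller can fail the mount.
 */
static int replay(struct jblock *blocks, size_t n)
{
	int err = 0;
	size_t i;

	assert(blocks || n == 0);
	for (i = 0; i < n; i++) {
		if (blocks[i].corrupt) {
			if (!err)
				err = -EIO;
			continue;	/* skip it; never retry the same block */
		}
		blocks[i].replayed = 1;
	}
	return err;
}
```

The key property is that the error is carried out of the loop rather than handled by retrying, which is what would otherwise keep the loop spinning on the same bad block.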
    • D
      ext4: update i_disksize coherently with block allocation on error path · 6603120e
      Committed by Dmitry Monakhov
      With delalloc blocks, i_disksize may be less than i_size.  So we
      have to update i_disksize each time we allocate and submit some
      blocks beyond i_disksize.  We weren't doing this on the error paths,
      so fix this.
      
      testcase: xfstest generic/019
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      6603120e
    • M
      dm crypt: fix access beyond the end of allocated space · d49ec52f
      Committed by Mikulas Patocka
      The DM crypt target accesses memory beyond allocated space resulting in
      a crash on 32 bit x86 systems.
      
      This bug is very old (it dates back to 2.6.25 commit 3a7f6c99 "dm
      crypt: use async crypto").  However, this bug was masked by the fact
      that kmalloc rounds the size up to the next power of two.  This bug
      wasn't exposed until 3.17-rc1 commit 298a9fa0 ("dm crypt: use per-bio
      data").  By switching to using per-bio data there was no longer any
      padding beyond the end of a dm-crypt allocated memory block.
      
      To minimize allocation overhead dm-crypt puts several structures into one
      block allocated with kmalloc.  The block holds struct ablkcipher_request,
      cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
      struct dm_crypt_request and an initialization vector.
      
      The variable dmreq_start is set to offset of struct dm_crypt_request
      within this memory block.  dm-crypt allocates the block with this size:
      cc->dmreq_start + sizeof(struct dm_crypt_request) + cc->iv_size.
      
      When accessing the initialization vector, dm-crypt uses the function
      iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
      + 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).
      
      dm-crypt allocated "cc->iv_size" bytes beyond the end of dm_crypt_request
      structure.  However, when dm-crypt accesses the initialization vector, it
      takes a pointer to the end of dm_crypt_request, aligns it, and then uses
      it as the initialization vector.  If the end of dm_crypt_request is not
      aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
      alignment causes the initialization vector to point beyond the allocated
      space.
      
      Fix this bug by calculating the variable iv_size_padding and adding it
      to the allocated size.
      
      Also correct the alignment of dm_crypt_request.  struct dm_crypt_request
      is specific to dm-crypt (it isn't used by the crypto subsystem at all),
      so it is aligned on __alignof__(struct dm_crypt_request).
      
      Also align per_bio_data_size on ARCH_KMALLOC_MINALIGN, so that it is
      aligned as if the block was allocated with kmalloc.
      Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
      Tested-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      d49ec52f
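The overrun described above is plain offset arithmetic, so it can be reproduced with numbers. The sketch below works on byte offsets within the block and assumes the block's base address is itself aligned to the cipher alignment; the helper names and the sizes in the usage note are made up, and the padding formula is one simple way to cover the gap, not necessarily the exact expression used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Offset one past the IV when the IV starts at the aligned end of the request. */
static size_t iv_end(size_t dmreq_start, size_t dmreq_size,
		     size_t iv_size, size_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return ALIGN_UP(dmreq_start + dmreq_size, align) + iv_size;
}

/* Allocation size as described in the text: no room for alignment. */
static size_t alloc_unpadded(size_t dmreq_start, size_t dmreq_size, size_t iv_size)
{
	return dmreq_start + dmreq_size + iv_size;
}

/* Extra bytes needed so the aligned IV still fits inside the block. */
static size_t iv_padding(size_t dmreq_start, size_t dmreq_size, size_t align)
{
	size_t end = dmreq_start + dmreq_size;

	assert(align && (align & (align - 1)) == 0);
	return ALIGN_UP(end, align) - end;
}
```

For example, with a request that ends at offset 60, a 16-byte IV and 16-byte alignment, the IV is placed at offset 64 and ends at 80, while the unpadded allocation is only 76 bytes; adding the 4 bytes of padding closes the gap.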
    • L
      Merge tag 'backlight-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · 59753a80
      Committed by Linus Torvalds
      Pull backlight fix from Lee Jones:
       "One simple fix to invalidate GPIO non-request"
      
      * tag 'backlight-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
        pwm-backlight: Fix bogus request for GPIO#0 when instantiated from DT
      59753a80