1. 19 5月, 2015 5 次提交
    • T
      ext4 crypto: use slab caches · 8ee03714
      Theodore Ts'o 提交于
      Use slab caches the ext4_crypto_ctx and ext4_crypt_info structures for
      slighly better memory efficiency and debuggability.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      8ee03714
    • T
      ext4: clean up superblock encryption mode fields · f5aed2c2
      Theodore Ts'o 提交于
      The superblock fields s_file_encryption_mode and s_dir_encryption_mode
      are vestigal, so remove them as a cleanup.  While we're at it, allow
      file systems with both encryption and inline_data enabled at the same
      time to work correctly.  We can't have encrypted inodes with inline
      data, but there's no reason to prohibit unencrypted inodes from using
      the inline data feature.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      f5aed2c2
    • T
      ext4 crypto: reorganize how we store keys in the inode · b7236e21
      Theodore Ts'o 提交于
      This is a pretty massive patch which does a number of different things:
      
      1) The per-inode encryption information is now stored in an allocated
         data structure, ext4_crypt_info, instead of directly in the node.
         This reduces the size usage of an in-memory inode when it is not
         using encryption.
      
      2) We drop the ext4_fname_crypto_ctx entirely, and use the per-inode
         encryption structure instead.  This remove an unnecessary memory
         allocation and free for the fname_crypto_ctx as well as allowing us
         to reuse the ctfm in a directory for multiple lookups and file
         creations.
      
      3) We also cache the inode's policy information in the ext4_crypt_info
         structure so we don't have to continually read it out of the
         extended attributes.
      
      4) We now keep the keyring key in the inode's encryption structure
         instead of releasing it after we are done using it to derive the
         per-inode key.  This allows us to test to see if the key has been
         revoked; if it has, we prevent the use of the derived key and free
         it.
      
      5) When an inode is released (or when the derived key is freed), we
         will use memset_explicit() to zero out the derived key, so it's not
         left hanging around in memory.  This implies that when a user logs
         out, it is important to first revoke the key, and then unlink it,
         and then finally, to use "echo 3 > /proc/sys/vm/drop_caches" to
         release any decrypted pages and dcache entries from the system
         caches.
      
      6) All this, and we also shrink the number of lines of code by around
         100.  :-)
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      b7236e21
    • T
      ext4 crypto: separate kernel and userspace structure for the key · e2881b1b
      Theodore Ts'o 提交于
      Use struct ext4_encryption_key only for the master key passed via the
      kernel keyring.
      
      For internal kernel space users, we now use struct ext4_crypt_info.
      This will allow us to put information from the policy structure so we
      can cache it and avoid needing to constantly looking up the extended
      attribute.  We will do this in a spearate patch.  This patch is mostly
      mechnical to make it easier for patch review.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      e2881b1b
    • T
      ext4 crypto: optimize filename encryption · 5b643f9c
      Theodore Ts'o 提交于
      Encrypt the filename as soon it is passed in by the user.  This avoids
      our needing to encrypt the filename 2 or 3 times while in the process
      of creating a filename.
      
      Similarly, when looking up a directory entry, encrypt the filename
      early, or if the encryption key is not available, base-64 decode the
      file syystem so that the hash value and the last 16 bytes of the
      encrypted filename is available in the new struct ext4_filename data
      structure.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      5b643f9c
  2. 15 5月, 2015 1 次提交
  3. 02 5月, 2015 3 次提交
  4. 16 4月, 2015 3 次提交
  5. 12 4月, 2015 6 次提交
  6. 11 4月, 2015 3 次提交
  7. 08 4月, 2015 1 次提交
  8. 17 2月, 2015 1 次提交
  9. 13 2月, 2015 1 次提交
  10. 20 1月, 2015 1 次提交
  11. 03 12月, 2014 1 次提交
  12. 26 11月, 2014 4 次提交
    • J
      ext4: limit number of scanned extents in status tree shrinker · dd475925
      Jan Kara 提交于
      Currently we scan extent status trees of inodes until we reclaim nr_to_scan
      extents. This can however require a lot of scanning when there are lots
      of delayed extents (as those cannot be reclaimed).
      
      Change shrinker to work as shrinkers are supposed to and *scan* only
      nr_to_scan extents regardless of how many extents did we actually
      reclaim. We however need to be careful and avoid scanning each status
      tree from the beginning - that could lead to a situation where we would
      not be able to reclaim anything at all when first nr_to_scan extents in
      the tree are always unreclaimable. We remember with each inode offset
      where we stopped scanning and continue from there when we next come
      across the inode.
      
      Note that we also need to update places calling __es_shrink() manually
      to pass reasonable nr_to_scan to have a chance of reclaiming anything and
      not just 1.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      dd475925
    • Z
      ext4: change LRU to round-robin in extent status tree shrinker · edaa53ca
      Zheng Liu 提交于
      In this commit we discard the lru algorithm for inodes with extent
      status tree because it takes significant effort to maintain a lru list
      in extent status tree shrinker and the shrinker can take a long time to
      scan this lru list in order to reclaim some objects.
      
      We replace the lru ordering with a simple round-robin.  After that we
      never need to keep a lru list.  That means that the list needn't be
      sorted if the shrinker can not reclaim any objects in the first round.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      edaa53ca
    • Z
      ext4: cache extent hole in extent status tree for ext4_da_map_blocks() · 2f8e0a7c
      Zheng Liu 提交于
      Currently extent status tree doesn't cache extent hole when a write
      looks up in extent tree to make sure whether a block has been allocated
      or not.  In this case, we don't put extent hole in extent cache because
      later this extent might be removed and a new delayed extent might be
      added back.  But it will cause a defect when we do a lot of writes.  If
      we don't put extent hole in extent cache, the following writes also need
      to access extent tree to look at whether or not a block has been
      allocated.  It brings a cache miss.  This commit fixes this defect.
      Also if the inode doesn't have any extent, this extent hole will be
      cached as well.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2f8e0a7c
    • J
      ext4: fix block reservation for bigalloc filesystems · cbd7584e
      Jan Kara 提交于
      For bigalloc filesystems we have to check whether newly requested inode
      block isn't already part of a cluster for which we already have delayed
      allocation reservation. This check happens in ext4_ext_map_blocks() and
      that function sets EXT4_MAP_FROM_CLUSTER if that's the case. However if
      ext4_da_map_blocks() finds in extent cache information about the block,
      we don't call into ext4_ext_map_blocks() and thus we always end up
      getting new reservation even if the space for cluster is already
      reserved. This results in overreservation and premature ENOSPC reports.
      
      Fix the problem by checking for existing cluster reservation already in
      ext4_da_map_blocks(). That simplifies the logic and actually allows us
      to get rid of the EXT4_MAP_FROM_CLUSTER flag completely.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      cbd7584e
  13. 21 11月, 2014 1 次提交
  14. 10 11月, 2014 1 次提交
  15. 14 10月, 2014 1 次提交
    • D
      ext4: check s_chksum_driver when looking for bg csum presence · 813d32f9
      Darrick J. Wong 提交于
      Convert the ext4_has_group_desc_csum predicate to look for a checksum
      driver instead of the metadata_csum flag and change the bg checksum
      calculation function to look for GDT_CSUM before taking the crc16
      path.
      
      Without this patch, if we mount with ^uninit_bg,^metadata_csum and
      later metadata_csum gets turned on by accident, the block group
      checksum functions will incorrectly assume that checksumming is
      enabled (metadata_csum) but that crc16 should be used
      (!s_chksum_driver).  This is totally wrong, so fix the predicate
      and the checksum formula selection.
      
      (Granted, if the metadata_csum feature bit gets enabled on a live FS
      then something underhanded is going on, but we could at least avoid
      writing garbage into the on-disk fields.)
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Cc: stable@vger.kernel.org
      813d32f9
  16. 13 10月, 2014 1 次提交
  17. 06 10月, 2014 1 次提交
    • T
      ext4: add ext4_iget_normal() which is to be used for dir tree lookups · f4bb2981
      Theodore Ts'o 提交于
      If there is a corrupted file system which has directory entries that
      point at reserved, metadata inodes, prohibit them from being used by
      treating them the same way we treat Boot Loader inodes --- that is,
      mark them to be bad inodes.  This prohibits them from being opened,
      deleted, or modified via chmod, chown, utimes, etc.
      
      In particular, this prevents a corrupted file system which has a
      directory entry which points at the journal inode from being deleted
      and its blocks released, after which point Much Hilarity Ensues.
      Reported-by: NSami Liedes <sami.liedes@iki.fi>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      f4bb2981
  18. 11 9月, 2014 1 次提交
  19. 05 9月, 2014 2 次提交
  20. 02 9月, 2014 2 次提交
    • Z
      ext4: track extent status tree shrinker delay statictics · eb68d0e2
      Zheng Liu 提交于
      This commit adds some statictics in extent status tree shrinker.  The
      purpose to add these is that we want to collect more details when we
      encounter a stall caused by extent status tree shrinker.  Here we count
      the following statictics:
        stats:
          the number of all objects on all extent status trees
          the number of reclaimable objects on lru list
          cache hits/misses
          the last sorted interval
          the number of inodes on lru list
        average:
          scan time for shrinking some objects
          the number of shrunk objects
        maximum:
          the inode that has max nr. of objects on lru list
          the maximum scan time for shrinking some objects
      
      The output looks like below:
        $ cat /proc/fs/ext4/sda1/es_shrinker_info
        stats:
          28228 objects
          6341 reclaimable objects
          5281/631 cache hits/misses
          586 ms last sorted interval
          250 inodes on lru list
        average:
          153 us scan time
          128 shrunk objects
        maximum:
          255 inode (255 objects, 198 reclaimable)
          125723 us max scan time
      
      If the lru list has never been sorted, the following line will not be
      printed:
          586ms last sorted interval
      If there is an empty lru list, the following lines also will not be
      printed:
          250 inodes on lru list
        ...
        maximum:
          255 inode (255 objects, 198 reclaimable)
          0 us max scan time
      
      Meanwhile in this commit a new trace point is defined to print some
      details in __ext4_es_shrink().
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      eb68d0e2
    • T
      ext4: rename ext4_ext_find_extent() to ext4_find_extent() · ed8a1a76
      Theodore Ts'o 提交于
      Make the function name less redundant.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      ed8a1a76