1. 28 May, 2013 1 commit
  2. 22 May, 2013 3 commits
    • L
      ext4: use ->invalidatepage() length argument · ca99fdd2
      Committed by Lukas Czerner
      The ->invalidatepage() aop now accepts a range to invalidate, so we can make
      use of it in all ext4 invalidatepage routines.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • L
      jbd2: change jbd2_journal_invalidatepage to accept length · 259709b0
      Committed by Lukas Czerner
      invalidatepage now accepts a range to invalidate, and two file
      systems using jbd2 also implement the punch hole feature, which can benefit
      from this. We need to implement the same thing for the jbd2 layer in order to
      allow those file systems to benefit from this functionality.

      This commit adds a length argument to jbd2_journal_invalidatepage()
      and updates all instances in ext4 and ocfs2.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • L
      mm: change invalidatepage prototype to accept length · d47992f8
      Committed by Lukas Czerner
      Currently there is no way to truncate a partial page where the end
      truncate point is not at the end of the page. This is because it was not
      needed and the existing functionality was enough for the file system truncate
      operation to work properly. However, more file systems now support the punch
      hole feature, and it can benefit from mm supporting truncating a page just
      up to a certain point.

      Specifically, with this functionality truncate_inode_pages_range() can
      be changed so it supports truncating a partial page at the end of the
      range (currently it will BUG_ON() if 'end' is not at the end of the
      page).

      This commit changes the invalidatepage() address space operation
      prototype to accept a range to be invalidated and updates all the instances
      of it.

      We also change block_invalidatepage() in the same way and actually
      make use of the new length argument to implement range invalidation.

      Actual file system implementations will follow, except for the file systems
      where the changes are really simple and should not change the behaviour
      in any way. An implementation of truncate_page_range() which will be able
      to accept page-unaligned ranges will follow as well.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Hugh Dickins <hughd@google.com>
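The range semantics described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's block_invalidatepage(): the `sketch_page` type and the `sketch_*` functions are hypothetical names, the page size is fixed at 4096 bytes, and a "buffer" is reduced to a validity flag. It assumes that only buffers lying entirely inside the byte range [offset, offset + length) are discarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MIN_BLOCK 512u

/* Hypothetical model of a page: valid[i] is true while block i holds data. */
struct sketch_page {
    unsigned int block_size;
    bool valid[SKETCH_PAGE_SIZE / SKETCH_MIN_BLOCK];
};

/* Mark every block of the page as valid for the given block size. */
static void sketch_page_init(struct sketch_page *p, unsigned int block_size)
{
    assert(block_size >= SKETCH_MIN_BLOCK);
    assert(SKETCH_PAGE_SIZE % block_size == 0);
    p->block_size = block_size;
    for (size_t i = 0; i < SKETCH_PAGE_SIZE / SKETCH_MIN_BLOCK; i++)
        p->valid[i] = i < SKETCH_PAGE_SIZE / block_size;
}

/*
 * Invalidate the byte range [offset, offset + length) of the page.
 * Only blocks fully contained in the range are discarded; a block that
 * the range merely touches is left alone. Returns the number of blocks
 * discarded by this call.
 */
static unsigned int sketch_invalidate_range(struct sketch_page *p,
                                            unsigned int offset,
                                            unsigned int length)
{
    unsigned int stop = offset + length;
    unsigned int discarded = 0;

    assert(stop >= offset && stop <= SKETCH_PAGE_SIZE);
    for (unsigned int curr = 0; curr < SKETCH_PAGE_SIZE; curr += p->block_size) {
        unsigned int next = curr + p->block_size;

        if (next > stop)        /* block extends past the end of the range */
            break;
        if (offset <= curr && p->valid[curr / p->block_size]) {
            p->valid[curr / p->block_size] = false;
            discarded++;
        }
    }
    return discarded;
}
```

With 1024-byte blocks, invalidating offset 1024 and length 2048 discards the two middle blocks and leaves the first and last intact, which is the partial-page case the old offset-only prototype could not express.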
  3. 12 May, 2013 1 commit
  4. 08 May, 2013 1 commit
  5. 07 May, 2013 1 commit
  6. 06 May, 2013 1 commit
    • L
      ext4: limit group search loop for non-extent files · e6155736
      Committed by Lachlan McIlroy
      In the case where we are allocating for a non-extent file,
      we must limit the groups we allocate from to those below
      2^32 blocks, and ext4_mb_regular_allocator() attempts to
      do this initially by putting a cap on ngroups for the
      subsequent search loop.
      
      However, the initial target group comes in from the
      allocation context (ac), and it may already be beyond
      the artificially limited ngroups.  In this case,
      the check
      
      	if (group == ngroups)
      		group = 0;
      
      at the top of the loop is never true, and the loop will
      run away.
      
      Catch this case inside the loop and reset the search to
      start at group 0.
      
      [sandeen@redhat.com: add commit msg & comments]
      Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
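The runaway described above can be reproduced in a few lines of user-space C. This is a minimal sketch, not the ext4_mb_regular_allocator() code: `sketch_out_of_range_visits` is a hypothetical helper, and the `>=` comparison is assumed here as one way to "catch this case inside the loop"; the commit message itself does not spell out the exact check.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the search loop: visit ngroups groups starting at
 * start_group and count how many visited groups lie at or beyond the
 * (artificially limited) ngroups.
 */
static unsigned int sketch_out_of_range_visits(unsigned int start_group,
                                               unsigned int ngroups,
                                               bool use_fixed_check)
{
    unsigned int group = start_group;
    unsigned int bad = 0;

    assert(ngroups > 0);
    for (unsigned int i = 0; i < ngroups; group++, i++) {
        if (use_fixed_check) {
            /* Assumed fix: also wrap when we start beyond the limit. */
            if (group >= ngroups)
                group = 0;
        } else {
            /* Original check: only wraps when hitting ngroups exactly. */
            if (group == ngroups)
                group = 0;
        }
        if (group >= ngroups)
            bad++;              /* searched a group we must not allocate from */
    }
    return bad;
}
```

Starting at group 10 with ngroups capped at 4, the equality check never fires and every visited group is out of range, while the wrapped version scans groups 0 to 3.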
  7. 03 May, 2013 1 commit
    • Y
      ext4: fix fio regression · e30b5dca
      Committed by Yan, Zheng
      We (Linux Kernel Performance project) found a regression introduced
      by commit:
      
        f7fec032 ext4: track all extent status in extent status tree
      
      The commit causes about 20% performance decrease in fio random write
      test. Profiler shows that rb_next() uses a lot of CPU time. The call
      stack is:
      
        rb_next
        ext4_es_find_delayed_extent
        ext4_map_blocks
        _ext4_get_block
        ext4_get_block_write
        __blockdev_direct_IO
        ext4_direct_IO
        generic_file_direct_write
        __generic_file_aio_write
        ext4_file_write
        aio_rw_vect_retry
        aio_run_iocb
        do_io_submit
        sys_io_submit
        system_call_fastpath
        io_submit
        td_io_getevents
        io_u_queued_complete
        thread_main
        main
        __libc_start_main
      
      The cause is that ext4_es_find_delayed_extent() doesn't have an
      upper bound; it keeps searching until a delayed extent is found.
      When there are a lot of non-delayed entries in the extent status
      tree, ext4_es_find_delayed_extent() may use a lot of CPU time.
      Reported-by: LKP project <lkp@linux.intel.com>
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
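The effect of a missing upper bound can be shown with a simplified model. The sketch below is an assumption-laden illustration, not the ext4 extent status tree: extents live in a sorted array instead of an rb-tree, and `sketch_extent` and `sketch_find_delayed` are hypothetical names. It shows one plausible way to bound the search, by passing an inclusive end block and stopping at the first extent that starts beyond it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent record: blocks [lblk, lblk + len), sorted by lblk. */
struct sketch_extent {
    unsigned int lblk;
    unsigned int len;
    bool delayed;
};

/*
 * Find the first delayed extent overlapping the inclusive block range
 * [start, end]. Returns its index, or -1 if there is none. The number of
 * in-range extents examined is stored in *visited when it is non-NULL.
 */
static long sketch_find_delayed(const struct sketch_extent *es, size_t n,
                                unsigned int start, unsigned int end,
                                size_t *visited)
{
    size_t seen = 0;
    long found = -1;

    assert(start <= end);
    for (size_t i = 0; i < n; i++) {
        if (es[i].lblk + es[i].len <= start)
            continue;           /* entirely before the range */
        if (es[i].lblk > end)
            break;              /* upper bound reached: stop searching */
        seen++;
        if (es[i].delayed) {
            found = (long)i;
            break;
        }
    }
    if (visited)
        *visited = seen;
    return found;
}
```

Without the `end` check, a lookup over a tree full of non-delayed entries would walk every remaining entry before giving up, which matches the rb_next() hot spot in the profile above.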
  8. 23 April, 2013 1 commit
  9. 22 April, 2013 4 commits
  10. 21 April, 2013 1 commit
  11. 20 April, 2013 4 commits
    • T
      ext4: fix readdir error in case inline_data+^dir_index. · c4d8b023
      Committed by Tao Ma
      Zach reported a problem that if inline data is enabled, we don't
      tell the difference between the offset of '.' and '..'. A
      getdents will fail if the user only wants to get '.'. What's
      worse, we may encounter duplicate dir entries, as the offsets
      for an inline dir and a non-inline one are quite different.

      This patch only tries to resolve this problem when dir_index
      is disabled. In this case, f_pos is the real offset within
      the dir block, so for an inline dir, we just pretend that
      we are a dir block and return the offset like a normal
      dir block does.
      Reported-by: Zach Brown <zab@redhat.com>
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • T
      ext4: fix readdir error in the case of inline_data+dir_index · 8af0f082
      Committed by Tao Ma
      Zach reported a problem that if inline data is enabled, we don't
      tell the difference between the offset of '.' and '..'. A
      getdents will fail if the user only wants to get '.', and what's worse,
      if a conversion happens while the user calls getdents
      many times, he/she may get the same entry twice.

      In theory, a dir block would also fail if it is converted to a
      hashed-index based dir since f_pos will become a hash value, not the
      real one, but it doesn't happen.  A deeper investigation shows that
      we use a hash based solution even for a normal dir if the dir_index
      feature is enabled.

      So this patch just adds a new htree_inlinedir_to_tree for inline dirs,
      and if we find that the hash index is supported, we do what
      we do for a dir block.
      Reported-by: Zach Brown <zab@redhat.com>
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • D
    • J
      ext4: move quota initialization out of inode allocation transaction · eb9cc7e1
      Committed by Jan Kara
      Inode allocation transaction is pretty heavy (246 credits with quotas
      and extents before previous patch, still around 200 after it).  This is
      mostly due to credits required for allocation of quota structures
      (credits there are heavily overestimated but it's difficult to make
      better estimates if we don't want to wire non-trivial assumptions about
      quota format into filesystem).
      
      So move quota initialization out of the allocation transaction. That way
      a transaction for quota structure allocation will be started only if we
      need to look up the quota structure on disk (rare), and furthermore it will
      be started for each quota type separately, not for all of them at once.
      This reduces the maximum transaction size to 34 in most cases and to 73 in
      the worst case.
      
      [ Modified by tytso to clean up the cleanup paths for error handling.
        Also use a separate call to ext4_std_error() for each failure so it
        is easier for someone who is debugging a problem in this function to
        determine which function call failed. ]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  12. 19 April, 2013 1 commit
  13. 12 April, 2013 5 commits
  14. 11 April, 2013 2 commits
  15. 10 April, 2013 8 commits
  16. 09 April, 2013 4 commits
    • E
      ext4: fix free space estimate in ext4_nonda_switch() · 5c1ff336
      Committed by Eric Whitney
      Values stored in s_freeclusters_counter and s_dirtyclusters_counter
      are both in cluster units.  Remove the cluster-to-block conversion
      applied to s_freeclusters_counter, which caused an inflated estimate of
      free space because s_dirtyclusters_counter is not similarly
      converted.  Rename free_blocks and dirty_blocks to better reflect
      the units these variables contain and to avoid future confusion.  This
      fix corrects ENOSPC failures for xfstests 127 and 231 on bigalloc
      file systems.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
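The unit mismatch described above is easy to see with concrete numbers. The following is a sketch only: the `sketch_*` helpers are hypothetical, and the "free space below 1.5 times the dirty space" rule is an illustrative threshold assumed for this example, not a statement of what ext4_nonda_switch() actually checks. The point is that comparing a block count against a cluster count inflates the free side by the cluster ratio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a cluster count to blocks; cluster_bits is log2(blocks per cluster). */
static uint64_t sketch_c2b(uint64_t clusters, unsigned int cluster_bits)
{
    assert(cluster_bits < 32);
    return clusters << cluster_bits;
}

/*
 * Illustrative rule (an assumption for this sketch): report low space when
 * free space is less than 1.5x the dirty space. Both arguments must be in
 * the same unit for the comparison to mean anything.
 */
static bool sketch_low_on_space(uint64_t free_units, uint64_t dirty_units)
{
    return 2 * free_units < 3 * dirty_units;
}

/* Mixed units: free converted to blocks, dirty left in clusters. */
static bool sketch_nonda_switch_mixed(uint64_t free_clusters,
                                      uint64_t dirty_clusters,
                                      unsigned int cluster_bits)
{
    return sketch_low_on_space(sketch_c2b(free_clusters, cluster_bits),
                               dirty_clusters);
}

/* Consistent units: both values stay in clusters. */
static bool sketch_nonda_switch_fixed(uint64_t free_clusters,
                                      uint64_t dirty_clusters)
{
    return sketch_low_on_space(free_clusters, dirty_clusters);
}
```

With 100 free clusters, 80 dirty clusters, and 16 blocks per cluster, the mixed-unit comparison sees 1600 "free" against 80 dirty and reports plenty of space, while the consistent comparison correctly reports that space is tight.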
    • J
      ext4: fix deadlock with quota feature · bcb13850
      Committed by Jan Kara
      We didn't mark hidden quota files with S_NOQUOTA flag and thus quota was
      accounted even for quota files. Thus we could recurse back to quota code
      when adding new blocks to quota file which can easily deadlock. Mark
      hidden quota files properly.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • D
      ext4: fix incorrect lock ordering for ext4_ind_migrate · e8238f9a
      Committed by Dmitry Monakhov
      The existing lock ordering is journal -> i_data_sem, but
      ext4_ind_migrate() grabs the locks in the opposite order, which may result in
      deadlock.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • D
      ext4: implementation of a new ioctl called EXT4_IOC_SWAP_BOOT · 393d1d1d
      Committed by Dr. Tilmann Bubeck
      Add a new ioctl, EXT4_IOC_SWAP_BOOT which swaps i_blocks and
      associated attributes (like i_blocks, i_size, i_flags, ...) from the
      specified inode with inode EXT4_BOOT_LOADER_INO (#5). This is
      typically used to store a boot loader in a secure part of the
      filesystem, where it can't be changed by a normal user by accident.
      The data blocks of the previous boot loader will be associated with
      the given inode.
      
      This user-space program is a simple example of the usage:
      
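      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/ioctl.h>
      #include <unistd.h>

      /* Not in user-space headers; assumed to mirror fs/ext4/ext4.h. */
      #ifndef EXT4_IOC_SWAP_BOOT
      #define EXT4_IOC_SWAP_BOOT _IO('f', 17)
      #endif

      int main(int argc, char *argv[])
      {
        int fd;
        int err;

        if ( argc != 2 ) {
          printf("usage: ext4-swap-boot-inode FILE-TO-SWAP\n");
          exit(1);
        }

        /* Open the file whose blocks will be swapped with the boot loader inode. */
        fd = open(argv[1], O_WRONLY);
        if ( fd < 0 ) {
          perror("open");
          exit(1);
        }

        /* The ioctl takes no argument; it acts on the inode behind fd. */
        err = ioctl(fd, EXT4_IOC_SWAP_BOOT);
        if ( err < 0 ) {
          perror("ioctl");
          exit(1);
        }

        close(fd);
        exit(0);
      }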
      
      [ Modified by Theodore Ts'o to fix a number of bugs in the original code.]
      Signed-off-by: Dr. Tilmann Bubeck <t.bubeck@reinform.de>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  17. 04 April, 2013 1 commit