1. 30 Apr, 2017 1 commit
  2. 16 Mar, 2017 1 commit
    • ext4: mark inode dirty after converting inline directory · b9cf625d
      Eric Biggers committed
      If ext4_convert_inline_data() was called on a directory with inline
      data, the filesystem was left in an inconsistent state (as considered by
      e2fsck) because the file size was not increased to cover the new block.
      This happened because the inode was not marked dirty after i_disksize
      was updated.  Fix this by marking the inode dirty at the end of
      ext4_finish_convert_inline_dir().
      
      This bug was probably not noticed before because most users mark the
      inode dirty afterwards for other reasons.  But if userspace executed
      FS_IOC_SET_ENCRYPTION_POLICY with invalid parameters, as exercised by
      'kvm-xfstests -c adv generic/396', then the inode was never marked dirty
      after updating i_disksize.
      
      Cc: stable@vger.kernel.org  # 3.10+
      Fixes: 3c47d541
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  3. 05 Feb, 2017 2 commits
  4. 23 Jan, 2017 1 commit
  5. 12 Jan, 2017 2 commits
    • ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores · b907f2d5
      Theodore Ts'o committed
      There is no need to call ext4_mark_inode_dirty while holding xattr_sem
      or i_data_sem, so where it's easy to avoid it, move it out from the
      critical region.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea() · c755e251
      Theodore Ts'o committed
      The xattr_sem deadlock problems fixed in commit 2e81a4ee: "ext4:
      avoid deadlock when expanding inode size" didn't include the use of
      xattr_sem in fs/ext4/inline.c.  With the addition of project quota
      which added a new extra inode field, this exposed deadlocks in the
      inline_data code similar to the ones fixed by 2e81a4ee.
      
      The deadlock can be reproduced via:
      
         dmesg -n 7
         mke2fs -t ext4 -O inline_data -Fq -I 256 /dev/vdc 32768
         mount -t ext4 -o debug_want_extra_isize=24 /dev/vdc /vdc
         mkdir /vdc/a
         umount /vdc
         mount -t ext4 /dev/vdc /vdc
         echo foo > /vdc/a/foo
      
      and looks like this:
      
      [   11.158815] 
      [   11.160276] =============================================
      [   11.161960] [ INFO: possible recursive locking detected ]
      [   11.161960] 4.10.0-rc3-00015-g011b30a8a3cf #160 Tainted: G        W      
      [   11.161960] ---------------------------------------------
      [   11.161960] bash/2519 is trying to acquire lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1225a4b>] ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960] 
      [   11.161960] but task is already holding lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] other info that might help us debug this:
      [   11.161960]  Possible unsafe locking scenario:
      [   11.161960] 
      [   11.161960]        CPU0
      [   11.161960]        ----
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960] 
      [   11.161960]  *** DEADLOCK ***
      [   11.161960] 
      [   11.161960]  May be due to missing lock nesting notation
      [   11.161960] 
      [   11.161960] 4 locks held by bash/2519:
      [   11.161960]  #0:  (sb_writers#3){.+.+.+}, at: [<c11a2414>] mnt_want_write+0x1e/0x3e
      [   11.161960]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<c119508b>] path_openat+0x338/0x67a
      [   11.161960]  #2:  (jbd2_handle){++++..}, at: [<c123314a>] start_this_handle+0x582/0x622
      [   11.161960]  #3:  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] stack backtrace:
      [   11.161960] CPU: 0 PID: 2519 Comm: bash Tainted: G        W       4.10.0-rc3-00015-g011b30a8a3cf #160
      [   11.161960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
      [   11.161960] Call Trace:
      [   11.161960]  dump_stack+0x72/0xa3
      [   11.161960]  __lock_acquire+0xb7c/0xcb9
      [   11.161960]  ? kvm_clock_read+0x1f/0x29
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  lock_acquire+0x106/0x18a
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  down_write+0x39/0x72
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ? _raw_read_unlock+0x22/0x2c
      [   11.161960]  ? jbd2_journal_extend+0x1e2/0x262
      [   11.161960]  ? __ext4_journal_get_write_access+0x3d/0x60
      [   11.161960]  ext4_mark_inode_dirty+0x17d/0x26d
      [   11.161960]  ? ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_try_add_inline_entry+0x69/0x152
      [   11.161960]  ext4_add_entry+0xa3/0x848
      [   11.161960]  ? __brelse+0x14/0x2f
      [   11.161960]  ? _raw_spin_unlock_irqrestore+0x44/0x4f
      [   11.161960]  ext4_add_nondir+0x17/0x5b
      [   11.161960]  ext4_create+0xcf/0x133
      [   11.161960]  ? ext4_mknod+0x12f/0x12f
      [   11.161960]  lookup_open+0x39e/0x3fb
      [   11.161960]  ? __wake_up+0x1a/0x40
      [   11.161960]  ? lock_acquire+0x11e/0x18a
      [   11.161960]  path_openat+0x35c/0x67a
      [   11.161960]  ? sched_clock_cpu+0xd7/0xf2
      [   11.161960]  do_filp_open+0x36/0x7c
      [   11.161960]  ? _raw_spin_unlock+0x22/0x2c
      [   11.161960]  ? __alloc_fd+0x169/0x173
      [   11.161960]  do_sys_open+0x59/0xcc
      [   11.161960]  SyS_open+0x1d/0x1f
      [   11.161960]  do_int80_syscall_32+0x4f/0x61
      [   11.161960]  entry_INT80_32+0x2f/0x2f
      [   11.161960] EIP: 0xb76ad469
      [   11.161960] EFLAGS: 00000286 CPU: 0
      [   11.161960] EAX: ffffffda EBX: 08168ac8 ECX: 00008241 EDX: 000001b6
      [   11.161960] ESI: b75e46bc EDI: b7755000 EBP: bfbdb108 ESP: bfbdafc0
      [   11.161960]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      
      Cc: stable@vger.kernel.org # 3.10 (requires 2e81a4ee as a prereq)
      Reported-by: George Spelvin <linux@sciencehorizons.net>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
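The self-deadlock in the lockdep report above has a simple shape: a function takes a lock, then calls a helper that tries to take the same lock. The following is a minimal userspace sketch of that shape, not the kernel patch. All `demo_*` names are hypothetical, and an error-checking pthread mutex stands in for `xattr_sem` so that the second acquisition returns `EDEADLK` instead of hanging. The "reordered" variant shows the general remedy of calling the helper outside the critical region, in the spirit of b907f2d5; the actual fix in c755e251 may work differently.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for an inode guarded by one lock. */
struct demo_inode {
    pthread_mutex_t xattr_lock;
    int dirty;
};

/* Use an error-checking mutex so a recursive lock attempt is reported. */
int demo_inode_init(struct demo_inode *ino)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!err)
        err = pthread_mutex_init(&ino->xattr_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    ino->dirty = 0;
    return err;
}

/* Stand-in for a "mark dirty" helper that itself needs the lock. */
int demo_mark_dirty(struct demo_inode *ino)
{
    int err = pthread_mutex_lock(&ino->xattr_lock);

    if (err)
        return err; /* EDEADLK if the caller already holds the lock */
    ino->dirty = 1;
    pthread_mutex_unlock(&ino->xattr_lock);
    return 0;
}

/* Problematic shape: the helper is called while the lock is held. */
int demo_add_entry_nested(struct demo_inode *ino)
{
    int err = pthread_mutex_lock(&ino->xattr_lock);

    if (err)
        return err;
    err = demo_mark_dirty(ino); /* second acquisition of the same lock */
    pthread_mutex_unlock(&ino->xattr_lock);
    return err;
}

/* Reordered shape: leave the critical region, then call the helper. */
int demo_add_entry_reordered(struct demo_inode *ino)
{
    int err = pthread_mutex_lock(&ino->xattr_lock);

    if (err)
        return err;
    /* ... modify the data protected by the lock ... */
    pthread_mutex_unlock(&ino->xattr_lock);
    return demo_mark_dirty(ino);
}
```

After `demo_inode_init()`, calling `demo_add_entry_nested()` reports the recursive acquisition, while `demo_add_entry_reordered()` completes and leaves the inode marked dirty.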
  6. 10 Dec, 2016 1 commit
  7. 21 Nov, 2016 1 commit
  8. 15 Nov, 2016 1 commit
    • ext4: use current_time() for inode timestamps · eeca7ea1
      Deepa Dinamani committed
      CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
      current_time() will be transitioned to be y2038 safe
      along with vfs.
      
      current_time() returns timestamps according to the
      granularities set in the super_block.
      The granularity check in ext4_current_time() to call
      current_time() or CURRENT_TIME_SEC is not required.
      Use current_time() directly to obtain timestamps
      unconditionally, and remove ext4_current_time().
      
      Quota files are assumed to be on the same filesystem.
      Hence, use current_time() for these files as well.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
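The commit message says timestamps are returned according to the granularity set in the super_block. A minimal userspace sketch of what that truncation could look like follows. It is not kernel code: `demo_timespec_trunc()` and its `gran_ns` parameter are hypothetical names, and the rounding rule is an assumption made for illustration.

```c
#include <time.h>

#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Truncate a timestamp to a granularity given in nanoseconds.
 * gran_ns == 1 keeps full nanosecond precision; a granularity of one
 * second (or coarser) drops the sub-second part entirely.
 */
struct timespec demo_timespec_trunc(struct timespec t, long gran_ns)
{
    if (gran_ns <= 1) {
        /* nanosecond granularity: nothing to do */
    } else if (gran_ns >= DEMO_NSEC_PER_SEC) {
        t.tv_nsec = 0;
    } else {
        /* round down to a multiple of the granularity */
        t.tv_nsec -= t.tv_nsec % gran_ns;
    }
    return t;
}
```

With a single helper that already respects the per-filesystem granularity, a caller no longer needs its own check to choose between a nanosecond and a one-second time source, which is the simplification the commit describes.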
  9. 11 Jul, 2016 1 commit
  10. 27 Apr, 2016 1 commit
  11. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
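The first two rewrite rules in the commit message rely on the two shift constants being equal, so the shift by their difference is a shift by zero. A minimal userspace sketch of that argument follows. The `DEMO_*` macros and functions are hypothetical stand-ins for the kernel names, and the shift value 12 (4 KiB pages) is an assumed value for illustration.

```c
/* Userspace stand-ins for the kernel macros; 12 is an assumed page shift. */
#define DEMO_PAGE_SHIFT       12
#define DEMO_PAGE_CACHE_SHIFT DEMO_PAGE_SHIFT
#define DEMO_PAGE_SIZE        (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_CACHE_SIZE  (1UL << DEMO_PAGE_CACHE_SHIFT)

/* Old style: convert a page-cache index to a page index. */
unsigned long demo_old_index(unsigned long idx)
{
    /* The difference of the shifts is 0, so this is a no-op. */
    return idx << (DEMO_PAGE_CACHE_SHIFT - DEMO_PAGE_SHIFT);
}

/* New style after the rewrite: the expression is used as-is. */
unsigned long demo_new_index(unsigned long idx)
{
    return idx;
}

/* Byte offset to page index, written with the single remaining constant. */
unsigned long demo_offset_to_index(unsigned long long pos)
{
    return (unsigned long)(pos >> DEMO_PAGE_SHIFT);
}
```

Because both forms compute the same value for every input, the mechanical substitution performed by the semantic patch does not change behaviour under the stated assumption that the two constants are equal.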
  12. 10 Mar, 2016 1 commit
  13. 09 Mar, 2016 1 commit
  14. 09 Jan, 2016 1 commit
  15. 18 Oct, 2015 1 commit
  16. 19 May, 2015 1 commit
    • ext4 crypto: optimize filename encryption · 5b643f9c
      Theodore Ts'o committed
      Encrypt the filename as soon as it is passed in by the user.  This avoids
      our needing to encrypt the filename 2 or 3 times while in the process
      of creating a filename.
      
      Similarly, when looking up a directory entry, encrypt the filename
      early, or if the encryption key is not available, base-64 decode the
      filename so that the hash value and the last 16 bytes of the
      encrypted filename are available in the new struct ext4_filename data
      structure.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  17. 16 Apr, 2015 1 commit
  18. 12 Apr, 2015 2 commits
  19. 03 Apr, 2015 1 commit
    • ext4: fix transposition typo in format string · 80cfb71e
      Rasmus Villemoes committed
      According to C99, %*.s means the same as %*.0s, in other words, print as
      many spaces as the field width argument says and effectively ignore the
      string argument. That is certainly not what was meant here. The kernel's
      printf implementation, however, treats it as if the . was not there,
      i.e. as %*s. I don't know if de->name is nul-terminated or not, but in
      any case I'm guessing the intention was to use de->name_len as precision
      instead of field width.
      
      [ Note: this is debugging code which is commented out, so this is not
        a security issue; a developer would have to explicitly enable
        INLINE_DIR_DEBUG before this would be an issue. ]
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
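The difference between the two format strings discussed above can be checked in userspace with the standard C library. This is a minimal sketch: `struct demo_dirent` and the `demo_format_*` functions are hypothetical, and the behaviour shown is that of a C99 `snprintf`, not of the kernel's printf implementation, which the commit message says treats `%*.s` differently.

```c
#include <stdio.h>

/* Hypothetical directory entry whose name is NOT NUL-terminated. */
struct demo_dirent {
    unsigned char name_len;
    char name[8];
};

/*
 * Transposed form: name_len is consumed as the field width, and the
 * bare "." means precision 0, so no characters of the name are printed.
 */
int demo_format_transposed(char *out, size_t outsz, const struct demo_dirent *de)
{
    return snprintf(out, outsz, "%*.s", (int)de->name_len, de->name);
}

/*
 * Intended form: name_len is consumed as the precision, so at most
 * name_len bytes are read and no terminating NUL is required.
 */
int demo_format_fixed(char *out, size_t outsz, const struct demo_dirent *de)
{
    return snprintf(out, outsz, "%.*s", (int)de->name_len, de->name);
}
```

For an entry with `name_len` 3 and name bytes `foo` followed by unrelated data, the transposed form yields three spaces and the fixed form yields `foo`.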
  20. 06 Dec, 2014 1 commit
    • ext4: ext4_da_convert_inline_data_to_extent drop locked page after error · 50db71ab
      Dmitry Monakhov committed
      Testcase:
      xfstests generic/270
      MKFS_OPTIONS="-q -I 256 -O inline_data,64bit"
      
      Call Trace:
       [<ffffffff81144c76>] lock_page+0x35/0x39 -------> DEADLOCK
       [<ffffffff81145260>] pagecache_get_page+0x65/0x15a
       [<ffffffff811507fc>] truncate_inode_pages_range+0x1db/0x45c
       [<ffffffff8120ea63>] ? ext4_da_get_block_prep+0x439/0x4b6
       [<ffffffff811b29b7>] ? __block_write_begin+0x284/0x29c
       [<ffffffff8120e62a>] ? ext4_change_inode_journal_flag+0x16b/0x16b
       [<ffffffff81150af0>] truncate_inode_pages+0x12/0x14
       [<ffffffff81247cb4>] ext4_truncate_failed_write+0x19/0x25
       [<ffffffff812488cf>] ext4_da_write_inline_data_begin+0x196/0x31c
       [<ffffffff81210dad>] ext4_da_write_begin+0x189/0x302
       [<ffffffff810c07ac>] ? trace_hardirqs_on+0xd/0xf
       [<ffffffff810ddd13>] ? read_seqcount_begin.clone.1+0x9f/0xcc
       [<ffffffff8114309d>] generic_perform_write+0xc7/0x1c6
       [<ffffffff810c040e>] ? mark_held_locks+0x59/0x77
       [<ffffffff811445d1>] __generic_file_write_iter+0x17f/0x1c5
       [<ffffffff8120726b>] ext4_file_write_iter+0x2a5/0x354
       [<ffffffff81185656>] ? file_start_write+0x2a/0x2c
       [<ffffffff8107bcdb>] ? bad_area_nosemaphore+0x13/0x15
       [<ffffffff811858ce>] new_sync_write+0x8a/0xb2
       [<ffffffff81186e7b>] vfs_write+0xb5/0x14d
       [<ffffffff81186ffb>] SyS_write+0x5c/0x8c
       [<ffffffff816f2529>] system_call_fastpath+0x12/0x17
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  21. 03 Dec, 2014 2 commits
  22. 13 Oct, 2014 1 commit
  23. 11 Sep, 2014 1 commit
  24. 29 Jul, 2014 1 commit
  25. 15 Jul, 2014 1 commit
    • ext4: make ext4_has_inline_data() as a inline function · 83447ccb
      Zheng Liu committed
      ext4_has_inline_data() is now used in many code paths, so we should
      make it an inline function to avoid burning some CPU cycles.
      
      Change in text size:
      
               text     data      bss     dec     hex filename
      before: 326110    19258    5528  350896   55ab0 fs/ext4/ext4.o
      after:  326227    19258    5528  351013   55b25 fs/ext4/ext4.o
      
      I use the following script to measure the CPU usage.
      
        #!/bin/bash
      
        shm_base='/dev/shm'
        img=${shm_base}/ext4-img
        mnt=/mnt/loop
      
        e2fsprgs_base=$HOME/e2fsprogs
        mkfs=${e2fsprgs_base}/misc/mke2fs
        fsck=${e2fsprgs_base}/e2fsck/e2fsck
      
        sudo umount $mnt
        dd if=/dev/zero of=$img bs=4k count=3145728
        ${mkfs} -t ext4 -O inline_data -F $img
        sudo mount -t ext4 -o loop $img $mnt
      
        # start testing...
        testdir="${mnt}/testdir"
        mkdir $testdir
        cd $testdir
      
        echo "start testing..."
        for ((cnt=0;cnt<100;cnt++)); do
      
        for ((i=0;i<5;i++)); do
        	for ((j=0;j<5;j++)); do
        		for ((k=0;k<5;k++)); do
        			for ((l=0;l<5;l++)); do
        				mkdir -p $i/$j/$k/$l
        				echo "$i-$j-$k-$l" > $i/$j/$k/$l/testfile
        			done
        		done
        	done
        done
      
        ls -R $testdir > /dev/null
        rm -rf $testdir/*
      
        done
      
      The result of `perf top -G -U` is as below.
      
      vanilla:
       13.92%  [ext4]  [k] ext4_do_update_inode
        9.36%  [ext4]  [k] __ext4_get_inode_loc
        4.07%  [ext4]  [k] ftrace_define_fields_ext4_writepages
        3.83%  [ext4]  [k] __ext4_handle_dirty_metadata
        3.42%  [ext4]  [k] ext4_get_inode_flags
        2.71%  [ext4]  [k] ext4_mark_iloc_dirty
        2.46%  [ext4]  [k] ftrace_define_fields_ext4_direct_IO_enter
        2.26%  [ext4]  [k] ext4_get_inode_loc
        2.22%  [ext4]  [k] ext4_has_inline_data
        [...]
      
      After applying the patch, we no longer see ext4_has_inline_data()
      because it has been inlined and perf cannot sample it.  This does not
      prove that CPU cycles are saved, but at least the function-call
      overhead is eliminated.  So IMHO we'd better inline this function.
      
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
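The kind of change the commit describes, turning a small out-of-line predicate into a header-style `static inline` function, can be sketched as follows. This is an illustration under stated assumptions, not the ext4 source: `struct demo_inode_info`, `DEMO_FLAG_INLINE_DATA`, and `demo_has_inline_data()` are hypothetical names, and the exact condition tested is assumed.

```c
#include <stdbool.h>

#define DEMO_FLAG_INLINE_DATA (1u << 0) /* hypothetical flag bit */

/* Hypothetical per-inode state consulted by the predicate. */
struct demo_inode_info {
    unsigned int flags;
    unsigned short inline_off; /* 0 means "no inline data area" */
};

/*
 * Defined as static inline so the compiler can expand the test at each
 * call site instead of emitting a function call for a two-term check.
 */
static inline bool demo_has_inline_data(const struct demo_inode_info *ei)
{
    return (ei->flags & DEMO_FLAG_INLINE_DATA) && ei->inline_off != 0;
}
```

The trade-off matches the size table in the commit message: each call site grows slightly, which is consistent with the small increase in text size, in exchange for removing the call overhead on hot paths.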
  26. 13 May, 2014 1 commit
  27. 12 May, 2014 1 commit
  28. 12 Jan, 2014 1 commit
  29. 08 Jan, 2014 1 commit
  30. 07 Jan, 2014 2 commits
  31. 30 Oct, 2013 2 commits
  32. 01 Jul, 2013 1 commit
  33. 29 Jun, 2013 1 commit
  34. 01 Jun, 2013 1 commit