1. 30 4月, 2017 2 次提交
  2. 26 3月, 2017 1 次提交
  3. 05 2月, 2017 1 次提交
  4. 12 1月, 2017 1 次提交
    • T
      ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea() · c755e251
      Theodore Ts'o 提交于
      The xattr_sem deadlock problems fixed in commit 2e81a4ee: "ext4:
      avoid deadlock when expanding inode size" didn't include the use of
      xattr_sem in fs/ext4/inline.c.  With the addition of project quota
      which added a new extra inode field, this exposed deadlocks in the
      inline_data code similar to the ones fixed by 2e81a4ee.
      
      The deadlock can be reproduced via:
      
         dmesg -n 7
         mke2fs -t ext4 -O inline_data -Fq -I 256 /dev/vdc 32768
         mount -t ext4 -o debug_want_extra_isize=24 /dev/vdc /vdc
         mkdir /vdc/a
         umount /vdc
         mount -t ext4 /dev/vdc /vdc
         echo foo > /vdc/a/foo
      
      and looks like this:
      
      [   11.158815] 
      [   11.160276] =============================================
      [   11.161960] [ INFO: possible recursive locking detected ]
      [   11.161960] 4.10.0-rc3-00015-g011b30a8a3cf #160 Tainted: G        W      
      [   11.161960] ---------------------------------------------
      [   11.161960] bash/2519 is trying to acquire lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1225a4b>] ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960] 
      [   11.161960] but task is already holding lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] other info that might help us debug this:
      [   11.161960]  Possible unsafe locking scenario:
      [   11.161960] 
      [   11.161960]        CPU0
      [   11.161960]        ----
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960] 
      [   11.161960]  *** DEADLOCK ***
      [   11.161960] 
      [   11.161960]  May be due to missing lock nesting notation
      [   11.161960] 
      [   11.161960] 4 locks held by bash/2519:
      [   11.161960]  #0:  (sb_writers#3){.+.+.+}, at: [<c11a2414>] mnt_want_write+0x1e/0x3e
      [   11.161960]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<c119508b>] path_openat+0x338/0x67a
      [   11.161960]  #2:  (jbd2_handle){++++..}, at: [<c123314a>] start_this_handle+0x582/0x622
      [   11.161960]  #3:  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] stack backtrace:
      [   11.161960] CPU: 0 PID: 2519 Comm: bash Tainted: G        W       4.10.0-rc3-00015-g011b30a8a3cf #160
      [   11.161960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
      [   11.161960] Call Trace:
      [   11.161960]  dump_stack+0x72/0xa3
      [   11.161960]  __lock_acquire+0xb7c/0xcb9
      [   11.161960]  ? kvm_clock_read+0x1f/0x29
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  lock_acquire+0x106/0x18a
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  down_write+0x39/0x72
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ? _raw_read_unlock+0x22/0x2c
      [   11.161960]  ? jbd2_journal_extend+0x1e2/0x262
      [   11.161960]  ? __ext4_journal_get_write_access+0x3d/0x60
      [   11.161960]  ext4_mark_inode_dirty+0x17d/0x26d
      [   11.161960]  ? ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_try_add_inline_entry+0x69/0x152
      [   11.161960]  ext4_add_entry+0xa3/0x848
      [   11.161960]  ? __brelse+0x14/0x2f
      [   11.161960]  ? _raw_spin_unlock_irqrestore+0x44/0x4f
      [   11.161960]  ext4_add_nondir+0x17/0x5b
      [   11.161960]  ext4_create+0xcf/0x133
      [   11.161960]  ? ext4_mknod+0x12f/0x12f
      [   11.161960]  lookup_open+0x39e/0x3fb
      [   11.161960]  ? __wake_up+0x1a/0x40
      [   11.161960]  ? lock_acquire+0x11e/0x18a
      [   11.161960]  path_openat+0x35c/0x67a
      [   11.161960]  ? sched_clock_cpu+0xd7/0xf2
      [   11.161960]  do_filp_open+0x36/0x7c
      [   11.161960]  ? _raw_spin_unlock+0x22/0x2c
      [   11.161960]  ? __alloc_fd+0x169/0x173
      [   11.161960]  do_sys_open+0x59/0xcc
      [   11.161960]  SyS_open+0x1d/0x1f
      [   11.161960]  do_int80_syscall_32+0x4f/0x61
      [   11.161960]  entry_INT80_32+0x2f/0x2f
      [   11.161960] EIP: 0xb76ad469
      [   11.161960] EFLAGS: 00000286 CPU: 0
      [   11.161960] EAX: ffffffda EBX: 08168ac8 ECX: 00008241 EDX: 000001b6
      [   11.161960] ESI: b75e46bc EDI: b7755000 EBP: bfbdb108 ESP: bfbdafc0
      [   11.161960]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      
      Cc: stable@vger.kernel.org # 3.10 (requires 2e81a4ee as a prereq)
      Reported-by: NGeorge Spelvin <linux@sciencehorizons.net>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      c755e251
  5. 02 12月, 2016 2 次提交
    • E
      ext4: correctly detect when an xattr value has an invalid size · d7614cc1
      Eric Biggers 提交于
      It was possible for an xattr value to have a very large size, which
      would then pass validation on 32-bit architectures due to a pointer
      wraparound.  Fix this by validating the size in a way which avoids
      pointer wraparound.
      
      It was also possible that a value's size would fit in the available
      space but its padded size would not.  This would cause an out-of-bounds
      memory write in ext4_xattr_set_entry when replacing the xattr value.
      For example, if an xattr value of unpadded size 253 bytes went until the
      very end of the inode or block, then using setxattr(2) to replace this
      xattr's value with 256 bytes would cause a write to the 3 bytes past the
      end of the inode or buffer, and the new xattr value would be incorrectly
      truncated.  Fix this by requiring that the padded size fit in the
      available space rather than the unpadded size.
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      d7614cc1
    • E
      ext4: don't read out of bounds when checking for in-inode xattrs · 290ab230
      Eric Biggers 提交于
      With i_extra_isize equal to or close to the available space, it was
      possible for us to read past the end of the inode when trying to detect
      or validate in-inode xattrs.  Fix this by checking for the needed extra
      space first.
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
      290ab230
  6. 15 11月, 2016 2 次提交
  7. 15 10月, 2016 2 次提交
  8. 30 8月, 2016 7 次提交
  9. 12 8月, 2016 2 次提交
    • J
      ext4: avoid deadlock when expanding inode size · 2e81a4ee
      Jan Kara 提交于
      When we need to move xattrs into external xattr block, we call
      ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end
      up calling ext4_mark_inode_dirty() again which will recurse back into
      the inode expansion code leading to deadlocks.
      
      Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move
      its management into ext4_expand_extra_isize_ea() since its manipulation
      is safe there (due to xattr_sem) from possible races with
      ext4_xattr_set_handle() which plays with it as well.
      
      CC: stable@vger.kernel.org   # 4.4.x
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      2e81a4ee
    • J
      ext4: properly align shifted xattrs when expanding inodes · 443a8c41
      Jan Kara 提交于
      We did not count with the padding of xattr value when computing desired
      shift of xattrs in the inode when expanding i_extra_isize. As a result
      we could create unaligned start of inline xattrs. Account for alignment
      properly.
      
      CC: stable@vger.kernel.org  # 4.4.x-
      Signed-off-by: NJan Kara <jack@suse.cz>
      443a8c41
  10. 11 8月, 2016 2 次提交
    • J
      ext4: fix xattr shifting when expanding inodes part 2 · 418c12d0
      Jan Kara 提交于
      When multiple xattrs need to be moved out of inode, we did not properly
      recompute total size of xattr headers in the inode and the new header
      position. Thus when moving the second and further xattr we asked
      ext4_xattr_shift_entries() to move too much and from the wrong place,
      resulting in possible xattr value corruption or general memory
      corruption.
      
      CC: stable@vger.kernel.org  # 4.4.x
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      418c12d0
    • J
      ext4: fix xattr shifting when expanding inodes · d0141191
      Jan Kara 提交于
      The code in ext4_expand_extra_isize_ea() treated new_extra_isize
      argument sometimes as the desired target i_extra_isize and sometimes as
      the amount by which we need to grow current i_extra_isize. These happen
      to coincide when i_extra_isize is 0 which used to be the common case and
      so nobody noticed this until recently when we added i_projid to the
      inode and so i_extra_isize now needs to grow from 28 to 32 bytes.
      
      The result of these bugs was that we sometimes unnecessarily decided to
      move xattrs out of inode even if there was enough space and we often
      ended up corrupting in-inode xattrs because arguments to
      ext4_xattr_shift_entries() were just wrong. This could demonstrate
      itself as BUG_ON in ext4_xattr_shift_entries() triggering.
      
      Fix the problem by introducing new isize_diff variable and use it where
      appropriate.
      
      CC: stable@vger.kernel.org   # 4.4.x
      Reported-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      d0141191
  11. 04 7月, 2016 1 次提交
  12. 23 3月, 2016 1 次提交
  13. 23 2月, 2016 4 次提交
    • A
      mbcache: add reusable flag to cache entries · 6048c64b
      Andreas Gruenbacher 提交于
      To reduce amount of damage caused by single bad block, we limit number
      of inodes sharing an xattr block to 1024. Thus there can be more xattr
      blocks with the same contents when there are lots of files with the same
      extended attributes. These xattr blocks naturally result in hash
      collisions and can form long hash chains and we unnecessarily check each
      such block only to find out we cannot use it because it is already
      shared by too many inodes.
      
      Add a reusable flag to cache entries which is cleared when a cache entry
      has reached its maximum refcount.  Cache entries which are not marked
      reusable are skipped by mb_cache_entry_find_{first,next}. This
      significantly speeds up mbcache when there are many same xattr blocks.
      For example for xattr-bench with 5 values and each process handling
      20000 files, the run for 64 processes is 25x faster with this patch.
      Even for 8 processes the speedup is almost 3x. We have also verified
      that for situations where there is only one xattr block of each kind,
      the patch doesn't have a measurable cost.
      
      [JK: Remove handling of setting the same value since it is not needed
      anymore, check for races in e_reusable setting, improve changelog,
      add measurements]
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6048c64b
    • J
      ext4: shortcut setting of xattr to the same value · 3fd16462
      Jan Kara 提交于
      When someone tried to set xattr to the same value (i.e., not changing
      anything) we did all the work of removing original xattr, possibly
      breaking references to shared xattr block, inserting new xattr, and
      merging xattr blocks again. Since this is not so rare operation and it
      is relatively cheap for us to detect this case, check for this and
      shortcut xattr setting in that case.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      3fd16462
    • J
      mbcache2: rename to mbcache · 7a2508e1
      Jan Kara 提交于
      Since old mbcache code is gone, let's rename new code to mbcache since
      number 2 is now meaningless. This is just a mechanical replacement.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      7a2508e1
    • J
      ext4: convert to mbcache2 · 82939d79
      Jan Kara 提交于
      The conversion is generally straightforward. The only tricky part is
      that xattr block corresponding to found mbcache entry can get freed
      before we get buffer lock for that block. So we have to check whether
      the entry is still valid after getting buffer lock.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      82939d79
  14. 07 1月, 2016 1 次提交
  15. 14 12月, 2015 1 次提交
  16. 14 11月, 2015 1 次提交
  17. 18 10月, 2015 2 次提交
  18. 16 4月, 2015 1 次提交
  19. 03 4月, 2015 2 次提交
  20. 13 10月, 2014 1 次提交
  21. 17 9月, 2014 1 次提交
    • D
      ext4: check EA value offset when loading · a0626e75
      Darrick J. Wong 提交于
      When loading extended attributes, check each entry's value offset to
      make sure it doesn't collide with the entries.
      
      Without this check it is easy to crash the kernel by mounting a
      malicious FS containing a file with an EA wherein e_value_offs = 0 and
      e_value_size > 0 and then deleting the EA, which corrupts the name
      list.
      
      (See the f_ea_value_crash test's FS image in e2fsprogs for an example.)
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      a0626e75
  22. 05 9月, 2014 1 次提交
    • T
      ext4: prepare to drop EXT4_STATE_DELALLOC_RESERVED · e3cf5d5d
      Theodore Ts'o 提交于
      The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
      because it was too hard to make sure the mballoc and get_block flags
      could be reliably passed down through all of the codepaths that end up
      calling ext4_mb_new_blocks().
      
      Since then, we have mb_flags passed down through most of the code
      paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
      as it used to.
      
      This commit plumbs in the last of what is required, and then adds a
      WARN_ON check to make sure we haven't missed anything.  If this passes
      a full regression test run, we can then drop
      EXT4_STATE_DELALLOC_RESERVED.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      e3cf5d5d
  23. 13 5月, 2014 1 次提交