1. 22 Jun, 2017 3 commits
    • ext4: lock inode before calling ext4_orphan_add() · 0de5983d
      Tahsin Erdogan committed
      ext4_orphan_add() requires caller to be holding the inode lock.
      Add missing lock statements.
      
       WARNING: CPU: 3 PID: 1806 at fs/ext4/namei.c:2731 ext4_orphan_add+0x4e/0x240
       CPU: 3 PID: 1806 Comm: python Not tainted 4.12.0-rc1+ #746
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       task: ffff880135d466c0 task.stack: ffffc900014b0000
       RIP: 0010:ext4_orphan_add+0x4e/0x240
       RSP: 0018:ffffc900014b3d50 EFLAGS: 00010246
       RAX: 0000000000000000 RBX: ffff8801348fe1f0 RCX: ffffc900014b3c64
       RDX: 0000000000000000 RSI: ffff8801348fe1f0 RDI: ffff8801348fe1f0
       RBP: ffffc900014b3da0 R08: 0000000000000000 R09: ffffffff80e82025
       R10: 0000000000004692 R11: 000000000000468d R12: ffff880137598000
       R13: ffff880137217000 R14: ffff880134ac58d0 R15: 0000000000000000
       FS:  00007fc50f09e740(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00000000008bc2e0 CR3: 00000001375ac000 CR4: 00000000000006e0
       Call Trace:
        ext4_xattr_inode_orphan_add.constprop.19+0x9d/0xf0
        ext4_xattr_delete_inode+0x1c4/0x2f0
        ext4_evict_inode+0x15a/0x7f0
        evict+0xc0/0x1a0
        iput+0x16a/0x270
        do_unlinkat+0x172/0x290
        SyS_unlink+0x11/0x20
        entry_SYSCALL_64_fastpath+0x18/0xad
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix lockdep warning about recursive inode locking · 33d201e0
      Tahsin Erdogan committed
      Setting a large xattr value may require writing the attribute contents
      to an external inode. In this case we may need to lock the xattr inode
      along with the parent inode. This doesn't pose a deadlock risk because
      xattr inodes are not directly visible to the user and their access is
      restricted.
      
      Assign a lockdep subclass to xattr inode's lock.
      
       ============================================
       WARNING: possible recursive locking detected
       4.12.0-rc1+ #740 Not tainted
       --------------------------------------------
       python/1822 is trying to acquire lock:
        (&sb->s_type->i_mutex_key#15){+.+...}, at: [<ffffffff804912ca>] ext4_xattr_set_entry+0x65a/0x7b0
      
       but task is already holding lock:
        (&sb->s_type->i_mutex_key#15){+.+...}, at: [<ffffffff803d6687>] vfs_setxattr+0x57/0xb0
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&sb->s_type->i_mutex_key#15);
         lock(&sb->s_type->i_mutex_key#15);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       4 locks held by python/1822:
        #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff803d0eef>] mnt_want_write+0x1f/0x50
        #1:  (&sb->s_type->i_mutex_key#15){+.+...}, at: [<ffffffff803d6687>] vfs_setxattr+0x57/0xb0
        #2:  (jbd2_handle){.+.+..}, at: [<ffffffff80493f40>] start_this_handle+0xf0/0x420
        #3:  (&ei->xattr_sem){++++..}, at: [<ffffffff804920ba>] ext4_xattr_set_handle+0x9a/0x4f0
      
       stack backtrace:
       CPU: 0 PID: 1822 Comm: python Not tainted 4.12.0-rc1+ #740
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       Call Trace:
        dump_stack+0x67/0x9e
        __lock_acquire+0x5f3/0x1750
        lock_acquire+0xb5/0x1d0
        down_write+0x2c/0x60
        ext4_xattr_set_entry+0x65a/0x7b0
        ext4_xattr_block_set+0x1b2/0x9b0
        ext4_xattr_set_handle+0x322/0x4f0
        ext4_xattr_set+0x144/0x1a0
        ext4_xattr_user_set+0x34/0x40
        __vfs_setxattr+0x66/0x80
        __vfs_setxattr_noperm+0x69/0x1c0
        vfs_setxattr+0xa2/0xb0
        setxattr+0x12e/0x150
        path_setxattr+0x87/0xb0
        SyS_setxattr+0xf/0x20
        entry_SYSCALL_64_fastpath+0x18/0xad
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: xattr-in-inode support · e50e5129
      Andreas Dilger committed
      Large xattr support is implemented for EXT4_FEATURE_INCOMPAT_EA_INODE.
      
      If the size of an xattr value is larger than will fit in a single
      external block, then the xattr value will be saved into the body
      of an external xattr inode.
      
      This also helps support a larger number of xattrs, since only the headers
      will be stored in the in-inode space or the single external block.
      
      The inode is referenced from the xattr header via "e_value_inum",
      which was formerly "e_value_block", but that field was never used.
      The e_value_size still contains the xattr size so that listing
      xattrs does not need to look up the inode if the data is not accessed.
      
      struct ext4_xattr_entry {
              __u8    e_name_len;     /* length of name */
              __u8    e_name_index;   /* attribute name index */
              __le16  e_value_offs;   /* offset in disk block of value */
              __le32  e_value_inum;   /* inode in which value is stored */
              __le32  e_value_size;   /* size of attribute value */
              __le32  e_hash;         /* hash value of name and value */
              char    e_name[0];      /* attribute name */
      };
      
      The xattr inode is marked with the EXT4_EA_INODE_FL flag and also
      holds a back-reference to the owning inode in its i_mtime field,
      allowing ext4/e2fsck to verify that the correct inode is accessed.
      
      [ Applied fix by Dan Carpenter to avoid freeing an ERR_PTR. ]
      
      Lustre-Jira: https://jira.hpdd.intel.com/browse/LU-80
      Lustre-bugzilla: https://bugzilla.lustre.org/show_bug.cgi?id=4424
      Signed-off-by: Kalpak Shah <kalpak.shah@sun.com>
      Signed-off-by: James Simmons <uja.ornl@gmail.com>
      Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
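
The entry layout described in the commit above can be illustrated with a small user-space sketch. It assumes that a nonzero e_value_inum means the value lives in a separate xattr inode; the names xattr_entry_model and value_in_ea_inode are hypothetical, and the fixed-width types stand in for the kernel's __u8/__le16/__le32 (a zero test gives the same answer regardless of byte order).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the on-disk entry header; the trailing name is omitted. */
struct xattr_entry_model {
    uint8_t  e_name_len;    /* length of name */
    uint8_t  e_name_index;  /* attribute name index */
    uint16_t e_value_offs;  /* offset in disk block of value */
    uint32_t e_value_inum;  /* inode in which value is stored */
    uint32_t e_value_size;  /* size of attribute value */
    uint32_t e_hash;        /* hash value of name and value */
};

/* Assumption: a nonzero e_value_inum marks a value held in an xattr inode. */
static int value_in_ea_inode(const struct xattr_entry_model *e)
{
    assert(e != NULL);
    return e->e_value_inum != 0;
}
```

Because e_value_size is kept in the header either way, a listing operation can report sizes without consulting value_in_ea_inode() at all.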
  2. 25 May, 2017 1 commit
  3. 30 Apr, 2017 4 commits
  4. 26 Mar, 2017 1 commit
  5. 05 Feb, 2017 1 commit
  6. 12 Jan, 2017 1 commit
    • ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea() · c755e251
      Theodore Ts'o committed
      The xattr_sem deadlock problems fixed in commit 2e81a4ee: "ext4:
      avoid deadlock when expanding inode size" didn't include the use of
      xattr_sem in fs/ext4/inline.c.  With the addition of project quota
      which added a new extra inode field, this exposed deadlocks in the
      inline_data code similar to the ones fixed by 2e81a4ee.
      
      The deadlock can be reproduced via:
      
         dmesg -n 7
         mke2fs -t ext4 -O inline_data -Fq -I 256 /dev/vdc 32768
         mount -t ext4 -o debug_want_extra_isize=24 /dev/vdc /vdc
         mkdir /vdc/a
         umount /vdc
         mount -t ext4 /dev/vdc /vdc
         echo foo > /vdc/a/foo
      
      and looks like this:
      
      [   11.158815] 
      [   11.160276] =============================================
      [   11.161960] [ INFO: possible recursive locking detected ]
      [   11.161960] 4.10.0-rc3-00015-g011b30a8a3cf #160 Tainted: G        W      
      [   11.161960] ---------------------------------------------
      [   11.161960] bash/2519 is trying to acquire lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1225a4b>] ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960] 
      [   11.161960] but task is already holding lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] other info that might help us debug this:
      [   11.161960]  Possible unsafe locking scenario:
      [   11.161960] 
      [   11.161960]        CPU0
      [   11.161960]        ----
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960] 
      [   11.161960]  *** DEADLOCK ***
      [   11.161960] 
      [   11.161960]  May be due to missing lock nesting notation
      [   11.161960] 
      [   11.161960] 4 locks held by bash/2519:
      [   11.161960]  #0:  (sb_writers#3){.+.+.+}, at: [<c11a2414>] mnt_want_write+0x1e/0x3e
      [   11.161960]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<c119508b>] path_openat+0x338/0x67a
      [   11.161960]  #2:  (jbd2_handle){++++..}, at: [<c123314a>] start_this_handle+0x582/0x622
      [   11.161960]  #3:  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] stack backtrace:
      [   11.161960] CPU: 0 PID: 2519 Comm: bash Tainted: G        W       4.10.0-rc3-00015-g011b30a8a3cf #160
      [   11.161960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
      [   11.161960] Call Trace:
      [   11.161960]  dump_stack+0x72/0xa3
      [   11.161960]  __lock_acquire+0xb7c/0xcb9
      [   11.161960]  ? kvm_clock_read+0x1f/0x29
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  lock_acquire+0x106/0x18a
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  down_write+0x39/0x72
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ? _raw_read_unlock+0x22/0x2c
      [   11.161960]  ? jbd2_journal_extend+0x1e2/0x262
      [   11.161960]  ? __ext4_journal_get_write_access+0x3d/0x60
      [   11.161960]  ext4_mark_inode_dirty+0x17d/0x26d
      [   11.161960]  ? ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_try_add_inline_entry+0x69/0x152
      [   11.161960]  ext4_add_entry+0xa3/0x848
      [   11.161960]  ? __brelse+0x14/0x2f
      [   11.161960]  ? _raw_spin_unlock_irqrestore+0x44/0x4f
      [   11.161960]  ext4_add_nondir+0x17/0x5b
      [   11.161960]  ext4_create+0xcf/0x133
      [   11.161960]  ? ext4_mknod+0x12f/0x12f
      [   11.161960]  lookup_open+0x39e/0x3fb
      [   11.161960]  ? __wake_up+0x1a/0x40
      [   11.161960]  ? lock_acquire+0x11e/0x18a
      [   11.161960]  path_openat+0x35c/0x67a
      [   11.161960]  ? sched_clock_cpu+0xd7/0xf2
      [   11.161960]  do_filp_open+0x36/0x7c
      [   11.161960]  ? _raw_spin_unlock+0x22/0x2c
      [   11.161960]  ? __alloc_fd+0x169/0x173
      [   11.161960]  do_sys_open+0x59/0xcc
      [   11.161960]  SyS_open+0x1d/0x1f
      [   11.161960]  do_int80_syscall_32+0x4f/0x61
      [   11.161960]  entry_INT80_32+0x2f/0x2f
      [   11.161960] EIP: 0xb76ad469
      [   11.161960] EFLAGS: 00000286 CPU: 0
      [   11.161960] EAX: ffffffda EBX: 08168ac8 ECX: 00008241 EDX: 000001b6
      [   11.161960] ESI: b75e46bc EDI: b7755000 EBP: bfbdb108 ESP: bfbdafc0
      [   11.161960]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      
      Cc: stable@vger.kernel.org # 3.10 (requires 2e81a4ee as a prereq)
      Reported-by: George Spelvin <linux@sciencehorizons.net>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  7. 02 Dec, 2016 2 commits
    • ext4: correctly detect when an xattr value has an invalid size · d7614cc1
      Eric Biggers committed
      It was possible for an xattr value to have a very large size, which
      would then pass validation on 32-bit architectures due to a pointer
      wraparound.  Fix this by validating the size in a way which avoids
      pointer wraparound.
      
      It was also possible that a value's size would fit in the available
      space but its padded size would not.  This would cause an out-of-bounds
      memory write in ext4_xattr_set_entry when replacing the xattr value.
      For example, if an xattr value of unpadded size 253 bytes went until the
      very end of the inode or block, then using setxattr(2) to replace this
      xattr's value with 256 bytes would cause a write to the 3 bytes past the
      end of the inode or buffer, and the new xattr value would be incorrectly
      truncated.  Fix this by requiring that the padded size fit in the
      available space rather than the unpadded size.
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
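
The two checks described in the commit above (no pointer wraparound, and the padded size must fit) can be sketched in user space as follows. This is not the kernel code: xattr_value_fits and MODEL_XATTR_PAD are hypothetical names, and the 4-byte padding is inferred from the commit's own example (253 bytes padding to 256). The comparison works on lengths rather than on pointers, so nothing can wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_XATTR_PAD 4u  /* assumed padding unit, inferred from 253 -> 256 */

/*
 * Return 1 if a value of `size` bytes starting at offset `offs` fits inside
 * a region of `region_len` bytes once padded, and 0 otherwise.
 */
static int xattr_value_fits(uint32_t offs, uint32_t size, size_t region_len)
{
    if (offs > region_len)
        return 0;

    uint64_t avail = (uint64_t)region_len - offs;

    /* Round up in 64 bits so a huge `size` cannot overflow the sum. */
    uint64_t padded = ((uint64_t)size + (MODEL_XATTR_PAD - 1)) &
                      ~(uint64_t)(MODEL_XATTR_PAD - 1);

    return padded <= avail;
}
```

With a 256-byte region, a 253-byte value at offset 0 is accepted, while the same value at offset 3 is rejected because its padded size (256) exceeds the 253 bytes that remain.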
    • ext4: don't read out of bounds when checking for in-inode xattrs · 290ab230
      Eric Biggers committed
      With i_extra_isize equal to or close to the available space, it was
      possible for us to read past the end of the inode when trying to detect
      or validate in-inode xattrs.  Fix this by checking for the needed extra
      space first.
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
  8. 15 Nov, 2016 2 commits
  9. 15 Oct, 2016 2 commits
  10. 30 Aug, 2016 7 commits
  11. 12 Aug, 2016 2 commits
    • ext4: avoid deadlock when expanding inode size · 2e81a4ee
      Jan Kara committed
      When we need to move xattrs into an external xattr block, we call
      ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end
      up calling ext4_mark_inode_dirty() again, which will recurse back into
      the inode expansion code, leading to deadlocks.
      
      Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move
      its management into ext4_expand_extra_isize_ea() since its manipulation
      is safe there (due to xattr_sem) from possible races with
      ext4_xattr_set_handle() which plays with it as well.
      
      CC: stable@vger.kernel.org   # 4.4.x
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
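
The recursion guard described in the commit above can be modelled in a few lines of user-space C. This is a simplified sketch, not the ext4 implementation: inode_model, model_expand and model_mark_dirty are hypothetical names, the no_expand field stands in for EXT4_STATE_NO_EXPAND, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct inode_model {
    int no_expand;     /* stands in for the EXT4_STATE_NO_EXPAND flag */
    int expand_calls;  /* how many times the expansion body actually ran */
};

static void model_mark_dirty(struct inode_model *ino);

/* Expansion sets the flag for its whole duration, then clears it. */
static void model_expand(struct inode_model *ino)
{
    ino->no_expand = 1;
    ino->expand_calls++;
    /* Moving xattrs to an external block dirties the inode again. */
    model_mark_dirty(ino);
    ino->no_expand = 0;
}

/* Marking the inode dirty triggers expansion unless one is in progress. */
static void model_mark_dirty(struct inode_model *ino)
{
    assert(ino != NULL);
    if (ino->no_expand)
        return;
    model_expand(ino);
}
```

Without the flag test in model_mark_dirty(), the two functions would call each other without end, which is the user-space analogue of the recursion the commit guards against.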
    • ext4: properly align shifted xattrs when expanding inodes · 443a8c41
      Jan Kara committed
      We did not account for the padding of the xattr value when computing
      the desired shift of xattrs in the inode when expanding i_extra_isize.
      As a result we could create an unaligned start of inline xattrs.
      Account for alignment properly.
      
      CC: stable@vger.kernel.org  # 4.4.x-
      Signed-off-by: Jan Kara <jack@suse.cz>
  12. 11 Aug, 2016 2 commits
    • ext4: fix xattr shifting when expanding inodes part 2 · 418c12d0
      Jan Kara committed
      When multiple xattrs need to be moved out of inode, we did not properly
      recompute total size of xattr headers in the inode and the new header
      position. Thus when moving the second and further xattr we asked
      ext4_xattr_shift_entries() to move too much and from the wrong place,
      resulting in possible xattr value corruption or general memory
      corruption.
      
      CC: stable@vger.kernel.org  # 4.4.x
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix xattr shifting when expanding inodes · d0141191
      Jan Kara committed
      The code in ext4_expand_extra_isize_ea() treated new_extra_isize
      argument sometimes as the desired target i_extra_isize and sometimes as
      the amount by which we need to grow current i_extra_isize. These happen
      to coincide when i_extra_isize is 0 which used to be the common case and
      so nobody noticed this until recently when we added i_projid to the
      inode and so i_extra_isize now needs to grow from 28 to 32 bytes.
      
      The result of these bugs was that we sometimes unnecessarily decided to
      move xattrs out of inode even if there was enough space and we often
      ended up corrupting in-inode xattrs because arguments to
      ext4_xattr_shift_entries() were just wrong. This could demonstrate
      itself as BUG_ON in ext4_xattr_shift_entries() triggering.
      
      Fix the problem by introducing new isize_diff variable and use it where
      appropriate.
      
      CC: stable@vger.kernel.org   # 4.4.x
      Reported-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
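
The distinction drawn in the commit above, between the target i_extra_isize and the amount of growth, can be shown with a small sketch. The helper names below are hypothetical and the free-space check is a simplification of what the kernel does; only the 28 and 32 byte figures come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes by which the extra area must grow to reach the target size. */
static unsigned isize_diff(unsigned cur_extra_isize, unsigned new_extra_isize)
{
    return new_extra_isize > cur_extra_isize
               ? new_extra_isize - cur_extra_isize
               : 0;
}

/* Growth fits in place if the free in-inode space covers the difference. */
static int growth_fits(unsigned cur_extra_isize, unsigned new_extra_isize,
                       unsigned free_bytes)
{
    return free_bytes >= isize_diff(cur_extra_isize, new_extra_isize);
}
```

Growing from 28 to 32 bytes needs only 4 free bytes. Comparing the free space against the target (32) instead would wrongly conclude that xattrs must be moved out of the inode, and the two quantities only coincide when the current size is 0.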
  13. 04 Jul, 2016 1 commit
  14. 23 Mar, 2016 1 commit
  15. 23 Feb, 2016 4 commits
    • mbcache: add reusable flag to cache entries · 6048c64b
      Andreas Gruenbacher committed
      To reduce the amount of damage caused by a single bad block, we limit the
      number of inodes sharing an xattr block to 1024. Thus there can be more xattr
      blocks with the same contents when there are lots of files with the same
      extended attributes. These xattr blocks naturally result in hash
      collisions and can form long hash chains and we unnecessarily check each
      such block only to find out we cannot use it because it is already
      shared by too many inodes.
      
      Add a reusable flag to cache entries which is cleared when a cache entry
      has reached its maximum refcount.  Cache entries which are not marked
      reusable are skipped by mb_cache_entry_find_{first,next}. This
      significantly speeds up mbcache when there are many same xattr blocks.
      For example for xattr-bench with 5 values and each process handling
      20000 files, the run for 64 processes is 25x faster with this patch.
      Even for 8 processes the speedup is almost 3x. We have also verified
      that for situations where there is only one xattr block of each kind,
      the patch doesn't have a measurable cost.
      
      [JK: Remove handling of setting the same value since it is not needed
      anymore, check for races in e_reusable setting, improve changelog,
      add measurements]
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
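
The reusable flag described in the commit above can be illustrated with a toy hash chain. This is a simplified model and not the mbcache API: cache_entry_model, share_entry and find_reusable are hypothetical names, and only the 1024 sharing limit is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_REFCOUNT_MAX 1024u  /* sharing limit quoted in the commit message */

struct cache_entry_model {
    unsigned refcount;               /* inodes sharing this xattr block */
    int reusable;                    /* cleared once the block is full */
    unsigned key;                    /* hash of the block contents */
    struct cache_entry_model *next;  /* next entry on the same chain */
};

/* Take one more reference; a full entry stops being offered for reuse. */
static void share_entry(struct cache_entry_model *e)
{
    assert(e != NULL);
    e->refcount++;
    if (e->refcount >= MODEL_REFCOUNT_MAX)
        e->reusable = 0;
}

/* Walk the chain, skipping entries that can no longer be shared. */
static struct cache_entry_model *find_reusable(struct cache_entry_model *head,
                                               unsigned key)
{
    for (struct cache_entry_model *e = head; e != NULL; e = e->next)
        if (e->key == key && e->reusable)
            return e;
    return NULL;
}
```

In this model the lookup never hands back a full entry, so the caller does not have to inspect a candidate block only to find that it cannot take another reference.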
    • ext4: shortcut setting of xattr to the same value · 3fd16462
      Jan Kara committed
      When someone tried to set an xattr to the same value (i.e., not changing
      anything), we did all the work of removing the original xattr, possibly
      breaking references to a shared xattr block, inserting the new xattr, and
      merging xattr blocks again. Since this is not such a rare operation and it
      is relatively cheap for us to detect this case, check for it and
      shortcut xattr setting in that case.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
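
The shortcut described in the commit above amounts to a cheap equality test before any rewrite work starts. A minimal sketch, assuming the old and new values are both available as plain byte buffers; xattr_value_unchanged is a hypothetical helper and not a function from fs/ext4:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the new value is byte-for-byte identical to the stored one. */
static int xattr_value_unchanged(const void *old_val, size_t old_len,
                                 const void *new_val, size_t new_len)
{
    if (old_len != new_len)
        return 0;
    if (old_len == 0)
        return 1;  /* two empty values are equal; avoid memcmp on NULL */
    return memcmp(old_val, new_val, old_len) == 0;
}
```

A caller would test this first and return early when it yields 1, skipping the remove, insert and merge steps entirely.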
    • mbcache2: rename to mbcache · 7a2508e1
      Jan Kara committed
      Since old mbcache code is gone, let's rename new code to mbcache since
      number 2 is now meaningless. This is just a mechanical replacement.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: convert to mbcache2 · 82939d79
      Jan Kara committed
      The conversion is generally straightforward. The only tricky part is
      that xattr block corresponding to found mbcache entry can get freed
      before we get buffer lock for that block. So we have to check whether
      the entry is still valid after getting buffer lock.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  16. 07 Jan, 2016 1 commit
  17. 14 Dec, 2015 1 commit
  18. 14 Nov, 2015 1 commit
  19. 18 Oct, 2015 2 commits
  20. 16 Apr, 2015 1 commit