1. 02 Jul, 2022 1 commit
  2. 26 May, 2022 1 commit
    • ext4: fix bug_on in ext4_writepages · fb4e2c7c
      Committed by Ye Bin
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I58A6T
      CVE: NA
      
      ---------------------------
      
      We got the following issue:
      EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
      ------------[ cut here ]------------
      kernel BUG at fs/ext4/inode.c:2708!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
      RIP: 0010:ext4_writepages+0x1977/0x1c10
      RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
      RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
      RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
      R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
      R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
      FS:  00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       do_writepages+0x130/0x3a0
       filemap_fdatawrite_wbc+0x83/0xa0
       filemap_flush+0xab/0xe0
       ext4_alloc_da_blocks+0x51/0x120
       __ext4_ioctl+0x1534/0x3210
       __x64_sys_ioctl+0x12c/0x170
       do_syscall_64+0x3b/0x90
      
      It may happen as follows:
      1. write inline_data inode
      vfs_write
        new_sync_write
          ext4_file_write_iter
            ext4_buffered_write_iter
              generic_perform_write
                ext4_da_write_begin
                  ext4_da_write_inline_data_begin -> if the inline data area is
                  too small, a block is allocated for the write, so the mapping
                  ends up with a dirty page
                      ext4_da_convert_inline_data_to_extent -> clears EXT4_STATE_MAY_INLINE_DATA
      2. fallocate
      do_vfs_ioctl
        ioctl_preallocate
          vfs_fallocate
            ext4_fallocate
              ext4_convert_inline_data
                ext4_convert_inline_data_nolock
                  ext4_map_blocks -> on failure, goes on to restore the inline data
                  ext4_restore_inline_data
                    ext4_create_inline_data
                    ext4_write_inline_data
                    ext4_set_inode_state -> sets EXT4_STATE_MAY_INLINE_DATA on the inode
      3. writepages
      __ext4_ioctl
        ext4_alloc_da_blocks
          filemap_flush
            filemap_fdatawrite_wbc
              do_writepages
                ext4_writepages
                  if (ext4_has_inline_data(inode))
                    BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))
      
      The root cause of this issue is that, in delayed allocation mode, the inline
      data is not destroyed until ext4_writepages is called, but by then the inode
      may already have been converted from inline to extent.
      To solve this issue, we call filemap_flush first.
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
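The three steps in the commit message above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `struct toy_inode` and every `toy_*` function are hypothetical stand-ins, not kernel code, and the `flush_first` switch merely models "call filemap_flush first" without claiming where the real patch places that call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; all names here are illustrative, not kernel APIs. */
struct toy_inode {
    bool has_inline_data;  /* models ext4_has_inline_data() */
    bool may_inline_data;  /* models EXT4_STATE_MAY_INLINE_DATA */
    bool dirty_pages;      /* delayed-allocation pages awaiting writeback */
};

/* Step 1: a delalloc write that outgrows the inline area. */
static void toy_da_write(struct toy_inode *ino)
{
    ino->may_inline_data = false;
    ino->dirty_pages = true;
}

/* Step 3: writeback. Returns false where the kernel would hit the BUG_ON. */
static bool toy_writepages(struct toy_inode *ino)
{
    if (!ino->dirty_pages)
        return true;
    if (ino->has_inline_data) {
        if (ino->may_inline_data)
            return false;             /* the BUG_ON condition */
        ino->has_inline_data = false; /* inline data is destroyed only here */
    }
    ino->dirty_pages = false;
    return true;
}

/* Step 2: a conversion whose block mapping fails and restores inline data. */
static void toy_convert_inline_data(struct toy_inode *ino, bool flush_first)
{
    if (flush_first)
        (void)toy_writepages(ino);    /* models flushing before converting */
    if (!ino->has_inline_data)
        return;                       /* nothing left to convert */
    ino->may_inline_data = true;      /* restore path sets the flag again */
}

/* Runs steps 1-3; returns true if writeback finishes without the bug. */
bool toy_scenario(bool flush_first)
{
    struct toy_inode ino = { true, true, false };

    toy_da_write(&ino);
    toy_convert_inline_data(&ino, flush_first);
    return toy_writepages(&ino);
}
```

In this model `toy_scenario(false)` reaches the BUG_ON condition, while `toy_scenario(true)` does not, because the pending writeback destroys the inline data before the failed conversion can set the flag again.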
  3. 19 Apr, 2022 1 commit
    • ext4: fix fs corruption when tring to remove a non-empty directory with IO error · 5ac69702
      Committed by Ye Bin
      mainline inclusion
      from mainline-v5.18-rc1
      commit 7aab5c84
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I52WOM
      CVE: NA
      
      --------------------------------
      
      We injected an I/O error while running rmdir on a non-empty directory and got the following issue:
      step1: mkfs.ext4 -F /dev/sda
      step2: mount /dev/sda  test
      step3: cd test
      step4: mkdir -p 1/2
      step5: rmdir 1
      	[  110.920551] ext4_empty_dir: inject fault
      	[  110.921926] EXT4-fs warning (device sda): ext4_rmdir:3113: inode #12:
      	comm rmdir: empty directory '1' has too many links (3)
      step6: cd ..
      step7: umount test
      step8: fsck.ext4 -f /dev/sda
      	e2fsck 1.42.9 (28-Dec-2013)
      	Pass 1: Checking inodes, blocks, and sizes
      	Pass 2: Checking directory structure
      	Entry '..' in .../??? (13) has deleted/unused inode 12.  Clear<y>? yes
      	Pass 3: Checking directory connectivity
      	Unconnected directory inode 13 (...)
      	Connect to /lost+found<y>? yes
      	Pass 4: Checking reference counts
      	Inode 13 ref count is 3, should be 2.  Fix<y>? yes
      	Pass 5: Checking group summary information
      
      	/dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
      	/dev/sda: 12/131072 files (0.0% non-contiguous), 26157/524288 blocks
      
      ext4_rmdir
      	if (!ext4_empty_dir(inode))
      		goto end_rmdir;
      ext4_empty_dir
      	bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE);
      	if (IS_ERR(bh))
      		return true;
      Currently, if reading the directory block fails, 'ext4_empty_dir' returns
      true and the directory is assumed to be empty, which leads to the issue above.
      To solve this, make 'ext4_empty_dir' return false when reading the directory
      block fails. To avoid making things worse when the file system is already
      corrupted, 'ext4_empty_dir' also returns false in that case.
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Cc: stable@kernel.org
      Link: https://lore.kernel.org/r/20220228024815.3952506-1-yebin10@huawei.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      
      conflicts:
      fs/ext4/namei.c
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
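The change in error handling described in this commit can be illustrated with a tiny userspace model. It is a sketch only: the `toy_*` names and the three-valued read result are assumptions made for illustration and do not mirror the real `ext4_empty_dir()` signature.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of reading the first directory block. */
enum toy_read_result {
    TOY_READ_EMPTY,        /* block read fine, only "." and ".." present */
    TOY_READ_HAS_ENTRIES,  /* block read fine, other entries present */
    TOY_READ_IO_ERROR      /* the read itself failed */
};

/* Old behaviour: a failed read is reported as "directory is empty". */
bool toy_empty_dir_old(enum toy_read_result r)
{
    if (r == TOY_READ_IO_ERROR)
        return true;
    return r == TOY_READ_EMPTY;
}

/* Fixed behaviour: a failed read is reported as "not empty". */
bool toy_empty_dir_fixed(enum toy_read_result r)
{
    if (r == TOY_READ_IO_ERROR)
        return false;
    return r == TOY_READ_EMPTY;
}

/* rmdir may proceed only when the directory is reported empty. */
bool toy_rmdir_allowed(bool (*empty_dir)(enum toy_read_result),
                       enum toy_read_result r)
{
    return empty_dir(r);
}
```

With the old helper, an I/O error lets the removal go ahead on a directory that may still contain entries; with the fixed helper the removal is refused, which is the conservative choice when the directory contents cannot be verified.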
  4. 12 Apr, 2022 1 commit
  5. 22 Oct, 2021 2 commits
  6. 14 Sep, 2021 1 commit
  7. 22 Mar, 2021 2 commits
  8. 22 Feb, 2021 1 commit
  9. 27 Dec, 2019 2 commits
  10. 14 Nov, 2018 1 commit
  11. 27 Aug, 2018 1 commit
  12. 10 Jul, 2018 1 commit
  13. 17 Jun, 2018 1 commit
  14. 16 Jun, 2018 1 commit
    • ext4: clear i_data in ext4_inode_info when removing inline data · 6e8ab72a
      Committed by Theodore Ts'o
      When converting an inode from storing the data in-line to a data
      block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
      copy of the i_blocks[] array.  It was not clearing the copy of
      i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
      used by ext4_map_blocks().
      
      This didn't matter much if we are using extents, since the extents
      header would be invalid and thus the extents code would re-initialize
      the extents tree.  But if we are using indirect blocks, the previous
      contents of the i_blocks array will be treated as block numbers, with
      potentially catastrophic results to the file system integrity and/or
      user data.
      
      This gets worse if the file system is using a 1k block size and
      s_first_data is zero, but even without this, the file system can get
      quite badly corrupted.
      
      This addresses CVE-2018-10881.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=200015
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
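The "two copies" problem described above can be shown with a minimal userspace model. This is a sketch under assumptions: `struct toy_inode` and the `toy_*` helpers are hypothetical, and the 15-slot array size follows ext4's `EXT4_N_BLOCKS`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_N_BLOCKS 15 /* same slot count as EXT4_N_BLOCKS */

/* Toy inode holding both copies of the block array (names are illustrative). */
struct toy_inode {
    uint32_t raw_i_block[TOY_N_BLOCKS]; /* models the on-disk copy */
    uint32_t i_data[TOY_N_BLOCKS];      /* models the in-memory copy */
};

/* Fill both copies with bytes that stand in for inline file data. */
void toy_fill_inline(struct toy_inode *ino)
{
    memset(ino->raw_i_block, 0xAB, sizeof(ino->raw_i_block));
    memset(ino->i_data, 0xAB, sizeof(ino->i_data));
}

/* Drop the inline data; clear_in_memory selects old (false) or fixed (true). */
void toy_destroy_inline_data(struct toy_inode *ino, bool clear_in_memory)
{
    memset(ino->raw_i_block, 0, sizeof(ino->raw_i_block));
    if (clear_in_memory)
        memset(ino->i_data, 0, sizeof(ino->i_data));
}

/* True if the in-memory copy still holds bytes a block mapper could misread. */
bool toy_has_stale_block_numbers(const struct toy_inode *ino)
{
    for (int i = 0; i < TOY_N_BLOCKS; i++)
        if (ino->i_data[i] != 0)
            return true;
    return false;
}
```

Clearing only the on-disk copy leaves leftover inline bytes in `i_data`, which an indirect-block mapper would interpret as block numbers; clearing both copies removes them.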
  15. 02 Jun, 2018 1 commit
  16. 23 May, 2018 1 commit
  17. 01 Feb, 2018 1 commit
  18. 29 Jan, 2018 1 commit
  19. 18 Dec, 2017 1 commit
    • ext4: fix up remaining files with SPDX cleanups · f5166768
      Committed by Theodore Ts'o
      A number of ext4 source files were skipped because their copyright
      permission statements didn't match the expected text used by the
      automated conversion utilities.  I've added SPDX tags for the rest.
      
      While looking at some of these files, I've noticed that we have quite
      a bit of variation on the licenses that were used --- in particular
      some of the Red Hat licenses on the jbd2 files use a GPL2+ license,
      and we have some files that have a LGPL-2.1 license (which was quite
      surprising).
      
      I've not attempted to do any license changes.  Even if it is perfectly
      legal to relicense to GPL 2.0-only for consistency's sake, that should
      be done with ext4 developer community discussion.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  20. 12 Oct, 2017 1 commit
  21. 02 Oct, 2017 1 commit
  22. 22 Jun, 2017 1 commit
    • ext4: xattr-in-inode support · e50e5129
      Committed by Andreas Dilger
      Large xattr support is implemented for EXT4_FEATURE_INCOMPAT_EA_INODE.
      
      If the size of an xattr value is larger than will fit in a single
      external block, then the xattr value will be saved into the body
      of an external xattr inode.
      
      This also helps support a larger number of xattrs, since only the headers
      will be stored in the in-inode space or the single external block.
      
      The inode is referenced from the xattr header via "e_value_inum",
      which was formerly "e_value_block", but that field was never used.
      The e_value_size still contains the xattr size so that listing
      xattrs does not need to look up the inode if the data is not accessed.
      
      struct ext4_xattr_entry {
              __u8    e_name_len;     /* length of name */
              __u8    e_name_index;   /* attribute name index */
              __le16  e_value_offs;   /* offset in disk block of value */
              __le32  e_value_inum;   /* inode in which value is stored */
              __le32  e_value_size;   /* size of attribute value */
              __le32  e_hash;         /* hash value of name and value */
              char    e_name[0];      /* attribute name */
      };
      
      The xattr inode is marked with the EXT4_EA_INODE_FL flag and also
      holds a back-reference to the owning inode in its i_mtime field,
      allowing ext4 and e2fsck to verify that the correct inode is accessed.
      
      [ Applied fix by Dan Carpenter to avoid freeing an ERR_PTR. ]
      
      Lustre-Jira: https://jira.hpdd.intel.com/browse/LU-80
      Lustre-bugzilla: https://bugzilla.lustre.org/show_bug.cgi?id=4424
      Signed-off-by: Kalpak Shah <kalpak.shah@sun.com>
      Signed-off-by: James Simmons <uja.ornl@gmail.com>
      Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
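The entry layout quoted in this commit can be checked with a userspace mirror of the struct. This is a sketch with stated assumptions: `struct toy_xattr_entry` uses host-endian fixed-width types instead of the kernel's `__le16`/`__le32`, the trailing name is written as a C99 flexible array member, and `toy_value_in_ea_inode()` is a hypothetical helper that treats a non-zero `e_value_inum` as "value lives in a separate inode".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the on-disk entry header (illustrative, host-endian). */
struct toy_xattr_entry {
    uint8_t  e_name_len;    /* length of name */
    uint8_t  e_name_index;  /* attribute name index */
    uint16_t e_value_offs;  /* offset in disk block of value */
    uint32_t e_value_inum;  /* inode in which value is stored */
    uint32_t e_value_size;  /* size of attribute value */
    uint32_t e_hash;        /* hash value of name and value */
    char     e_name[];      /* attribute name */
};

/* Assumed convention: a non-zero inode number means an external xattr inode. */
bool toy_value_in_ea_inode(const struct toy_xattr_entry *entry)
{
    return entry->e_value_inum != 0;
}

/* Size is still available from the header without opening the xattr inode. */
uint32_t toy_value_size(const struct toy_xattr_entry *entry)
{
    return entry->e_value_size;
}
```

The fixed part of the header is 16 bytes, with `e_value_inum` occupying the four bytes at offset 4 that the commit says were formerly the unused `e_value_block` field, so the header size does not change.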
  23. 25 May, 2017 1 commit
  24. 30 Apr, 2017 1 commit
  25. 16 Mar, 2017 1 commit
    • ext4: mark inode dirty after converting inline directory · b9cf625d
      Committed by Eric Biggers
      If ext4_convert_inline_data() was called on a directory with inline
      data, the filesystem was left in an inconsistent state (as considered by
      e2fsck) because the file size was not increased to cover the new block.
      This happened because the inode was not marked dirty after i_disksize
      was updated.  Fix this by marking the inode dirty at the end of
      ext4_finish_convert_inline_dir().
      
      This bug was probably not noticed before because most users mark the
      inode dirty afterwards for other reasons.  But if userspace executed
      FS_IOC_SET_ENCRYPTION_POLICY with invalid parameters, as exercised by
      'kvm-xfstests -c adv generic/396', then the inode was never marked dirty
      after updating i_disksize.
      
      Cc: stable@vger.kernel.org  # 3.10+
      Fixes: 3c47d541
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  26. 05 Feb, 2017 2 commits
  27. 23 Jan, 2017 1 commit
  28. 12 Jan, 2017 2 commits
    • ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores · b907f2d5
      Committed by Theodore Ts'o
      There is no need to call ext4_mark_inode_dirty while holding xattr_sem
      or i_data_sem, so where it's easy to avoid it, move it out from the
      critical region.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea() · c755e251
      Committed by Theodore Ts'o
      The xattr_sem deadlock problems fixed in commit 2e81a4ee: "ext4:
      avoid deadlock when expanding inode size" didn't include the use of
      xattr_sem in fs/ext4/inline.c.  With the addition of project quota
      which added a new extra inode field, this exposed deadlocks in the
      inline_data code similar to the ones fixed by 2e81a4ee.
      
      The deadlock can be reproduced via:
      
         dmesg -n 7
         mke2fs -t ext4 -O inline_data -Fq -I 256 /dev/vdc 32768
         mount -t ext4 -o debug_want_extra_isize=24 /dev/vdc /vdc
         mkdir /vdc/a
         umount /vdc
         mount -t ext4 /dev/vdc /vdc
         echo foo > /vdc/a/foo
      
      and looks like this:
      
      [   11.158815] 
      [   11.160276] =============================================
      [   11.161960] [ INFO: possible recursive locking detected ]
      [   11.161960] 4.10.0-rc3-00015-g011b30a8a3cf #160 Tainted: G        W      
      [   11.161960] ---------------------------------------------
      [   11.161960] bash/2519 is trying to acquire lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1225a4b>] ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960] 
      [   11.161960] but task is already holding lock:
      [   11.161960]  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] other info that might help us debug this:
      [   11.161960]  Possible unsafe locking scenario:
      [   11.161960] 
      [   11.161960]        CPU0
      [   11.161960]        ----
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960]   lock(&ei->xattr_sem);
      [   11.161960] 
      [   11.161960]  *** DEADLOCK ***
      [   11.161960] 
      [   11.161960]  May be due to missing lock nesting notation
      [   11.161960] 
      [   11.161960] 4 locks held by bash/2519:
      [   11.161960]  #0:  (sb_writers#3){.+.+.+}, at: [<c11a2414>] mnt_want_write+0x1e/0x3e
      [   11.161960]  #1:  (&type->i_mutex_dir_key){++++++}, at: [<c119508b>] path_openat+0x338/0x67a
      [   11.161960]  #2:  (jbd2_handle){++++..}, at: [<c123314a>] start_this_handle+0x582/0x622
      [   11.161960]  #3:  (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
      [   11.161960] 
      [   11.161960] stack backtrace:
      [   11.161960] CPU: 0 PID: 2519 Comm: bash Tainted: G        W       4.10.0-rc3-00015-g011b30a8a3cf #160
      [   11.161960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
      [   11.161960] Call Trace:
      [   11.161960]  dump_stack+0x72/0xa3
      [   11.161960]  __lock_acquire+0xb7c/0xcb9
      [   11.161960]  ? kvm_clock_read+0x1f/0x29
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  ? __lock_is_held+0x36/0x66
      [   11.161960]  lock_acquire+0x106/0x18a
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  down_write+0x39/0x72
      [   11.161960]  ? ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ext4_expand_extra_isize_ea+0x3d/0x4cd
      [   11.161960]  ? _raw_read_unlock+0x22/0x2c
      [   11.161960]  ? jbd2_journal_extend+0x1e2/0x262
      [   11.161960]  ? __ext4_journal_get_write_access+0x3d/0x60
      [   11.161960]  ext4_mark_inode_dirty+0x17d/0x26d
      [   11.161960]  ? ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
      [   11.161960]  ext4_try_add_inline_entry+0x69/0x152
      [   11.161960]  ext4_add_entry+0xa3/0x848
      [   11.161960]  ? __brelse+0x14/0x2f
      [   11.161960]  ? _raw_spin_unlock_irqrestore+0x44/0x4f
      [   11.161960]  ext4_add_nondir+0x17/0x5b
      [   11.161960]  ext4_create+0xcf/0x133
      [   11.161960]  ? ext4_mknod+0x12f/0x12f
      [   11.161960]  lookup_open+0x39e/0x3fb
      [   11.161960]  ? __wake_up+0x1a/0x40
      [   11.161960]  ? lock_acquire+0x11e/0x18a
      [   11.161960]  path_openat+0x35c/0x67a
      [   11.161960]  ? sched_clock_cpu+0xd7/0xf2
      [   11.161960]  do_filp_open+0x36/0x7c
      [   11.161960]  ? _raw_spin_unlock+0x22/0x2c
      [   11.161960]  ? __alloc_fd+0x169/0x173
      [   11.161960]  do_sys_open+0x59/0xcc
      [   11.161960]  SyS_open+0x1d/0x1f
      [   11.161960]  do_int80_syscall_32+0x4f/0x61
      [   11.161960]  entry_INT80_32+0x2f/0x2f
      [   11.161960] EIP: 0xb76ad469
      [   11.161960] EFLAGS: 00000286 CPU: 0
      [   11.161960] EAX: ffffffda EBX: 08168ac8 ECX: 00008241 EDX: 000001b6
      [   11.161960] ESI: b75e46bc EDI: b7755000 EBP: bfbdb108 ESP: bfbdafc0
      [   11.161960]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      
      Cc: stable@vger.kernel.org # 3.10 (requires 2e81a4ee as a prereq)
      Reported-by: George Spelvin <linux@sciencehorizons.net>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
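The recursive acquisition in the lockdep report above can be modelled in a few lines of userspace C. This is only a sketch: `struct toy_sem` is a hypothetical non-recursive lock, and the "unnested" variant shows one generic way to avoid the self-deadlock (calling the dirty-marking step after the lock is released). It is not presented as the change this commit actually makes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive write lock; a second acquire by the holder must fail. */
struct toy_sem {
    bool held;
};

static bool toy_down_write(struct toy_sem *s)
{
    if (s->held)
        return false;  /* a real rwsem would block forever here */
    s->held = true;
    return true;
}

static void toy_up_write(struct toy_sem *s)
{
    s->held = false;
}

/* Models marking the inode dirty, which may expand the inode and lock xattr_sem. */
static bool toy_mark_inode_dirty(struct toy_sem *xattr_sem)
{
    if (!toy_down_write(xattr_sem))
        return false;  /* would self-deadlock */
    toy_up_write(xattr_sem);
    return true;
}

/* Call chain from the trace: dirty-marking happens while xattr_sem is held. */
bool toy_add_entry_nested(void)
{
    struct toy_sem xattr_sem = { false };
    bool ok;

    toy_down_write(&xattr_sem);
    ok = toy_mark_inode_dirty(&xattr_sem);
    toy_up_write(&xattr_sem);
    return ok;
}

/* Illustrative alternative: release the lock before marking the inode dirty. */
bool toy_add_entry_unnested(void)
{
    struct toy_sem xattr_sem = { false };

    toy_down_write(&xattr_sem);
    /* ... modify the inline directory here ... */
    toy_up_write(&xattr_sem);
    return toy_mark_inode_dirty(&xattr_sem);
}
```

`toy_add_entry_nested()` reports the failed second acquisition that lockdep flags as a possible deadlock, while `toy_add_entry_unnested()` completes because the lock is free by the time the dirty-marking step needs it.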
  29. 10 Dec, 2016 1 commit
  30. 21 Nov, 2016 1 commit
  31. 15 Nov, 2016 1 commit
    • ext4: use current_time() for inode timestamps · eeca7ea1
      Committed by Deepa Dinamani
      CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
      current_time() will be transitioned to be y2038 safe
      along with vfs.
      
      current_time() returns timestamps according to the
      granularities set in the super_block.
      The granularity check in ext4_current_time() to call
      current_time() or CURRENT_TIME_SEC is not required.
      Use current_time() directly to obtain timestamps
      unconditionally, and remove ext4_current_time().
      
      Quota files are assumed to be on the same filesystem.
      Hence, use current_time() for these files as well.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
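What "timestamps according to the granularities set in the super_block" means can be sketched as a truncation step in userspace C. The code below is an assumption-laden model: `struct toy_timespec` and `toy_timestamp_truncate()` are hypothetical names, and the granularity is passed in nanoseconds the way a per-filesystem time granularity value would be.

```c
#include <assert.h>

#define TOY_NSEC_PER_SEC 1000000000L

/* Minimal stand-in for a seconds/nanoseconds timestamp. */
struct toy_timespec {
    long long tv_sec;
    long tv_nsec;
};

/*
 * Round a timestamp down to the filesystem's granularity (in nanoseconds).
 * gran == 1 keeps full resolution; gran == one second drops the
 * sub-second part; values in between round tv_nsec down to a multiple.
 */
struct toy_timespec toy_timestamp_truncate(struct toy_timespec t, long gran)
{
    if (gran == 1) {
        /* nothing to do */
    } else if (gran == TOY_NSEC_PER_SEC) {
        t.tv_nsec = 0;
    } else if (gran > 1 && gran < TOY_NSEC_PER_SEC) {
        t.tv_nsec -= t.tv_nsec % gran;
    }
    return t;
}
```

Because the truncation depends only on the granularity value, a single helper covers both second-resolution and nanosecond-resolution filesystems, which is why a per-filesystem branch such as the one in `ext4_current_time()` becomes unnecessary.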
  32. 11 Jul, 2016 1 commit
  33. 27 Apr, 2016 1 commit
  34. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
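Why the shift rewrites in this commit are behaviour-preserving can be seen in a short userspace sketch. The `TOY_*` macros and helper names are hypothetical, and the shift value of 12 (4 KiB pages) is an assumption chosen for illustration.

```c
#include <assert.h>

/* Illustrative values: both shifts are equal, as they were in practice. */
#define TOY_PAGE_SHIFT        12
#define TOY_PAGE_SIZE         (1UL << TOY_PAGE_SHIFT)
#define TOY_PAGE_CACHE_SHIFT  TOY_PAGE_SHIFT
#define TOY_PAGE_CACHE_SIZE   (1UL << TOY_PAGE_CACHE_SHIFT)

/* Old style: convert between page-cache units and page units. */
unsigned long toy_index_old(unsigned long index)
{
    return index << (TOY_PAGE_CACHE_SHIFT - TOY_PAGE_SHIFT);
}

/* New style after the conversion: the shift by zero is simply dropped. */
unsigned long toy_index_new(unsigned long index)
{
    return index;
}

/* Byte offset of a page-cache index, old and new spelling. */
unsigned long toy_offset_old(unsigned long index)
{
    return index * TOY_PAGE_CACHE_SIZE;
}

unsigned long toy_offset_new(unsigned long index)
{
    return index * TOY_PAGE_SIZE;
}
```

Since the two shift constants are identical, every `(PAGE_CACHE_SHIFT - PAGE_SHIFT)` shift is a shift by zero, so removing it and renaming the size constants yields the same values.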
  35. 10 Mar, 2016 1 commit