1. 05 Dec, 2019 1 commit
  2. 05 Oct, 2019 1 commit
    • ext4: fix punch hole for inline_data file systems · 091c754d
      Theodore Ts'o committed
      commit c1e8220bd316d8ae8e524df39534b8a412a45d5e upstream.
      
      If a program attempts to punch a hole on an inline data file, we need
      to convert it to a normal file first.
      
      This was detected using ext4/032 using the adv configuration.  Simple
      reproducer:
      
      mke2fs -Fq -t ext4 -O inline_data /dev/vdc
      mount /vdc
      echo "" > /vdc/testfile
      xfs_io -c 'truncate 33554432' /vdc/testfile
      xfs_io -c 'fpunch 0 1048576' /vdc/testfile
      umount /vdc
      e2fsck -fy /dev/vdc
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 16 Sep, 2019 1 commit
  4. 28 Jul, 2019 2 commits
  5. 31 May, 2019 2 commits
  6. 22 May, 2019 2 commits
  7. 17 Jan, 2019 2 commits
    • ext4: fix special inode number checks in __ext4_iget() · 5dc41af3
      Theodore Ts'o committed
      commit 191ce17876c9367819c4b0a25b503c0f6d9054d8 upstream.
      
      The check for special (reserved) inode numbers in __ext4_iget()
      was broken by commit 8a363970d1dc ("ext4: avoid declaring fs
      inconsistent due to invalid file handles").  This was caused by a
      botched reversal of the sense of the flag now known as
      EXT4_IGET_SPECIAL (when it was previously named EXT4_IGET_NORMAL).
      Fix the logic appropriately.
      
      Fixes: 8a363970d1dc ("ext4: avoid declaring fs inconsistent...")
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
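
Illustration (not part of the commit): the sense of the flag described in the message above can be sketched in user space roughly as follows. The `TOY_` names, the function signature, and passing the inode limits in as parameters are assumptions made for this sketch; it is not the kernel's code, and the rule that numbers below the root inode are always refused is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_IGET_SPECIAL 0x1u /* hypothetical stand-in for EXT4_IGET_SPECIAL */
#define TOY_ROOT_INO     2u   /* ext4 root directory inode number */

/*
 * Returns true when an inode lookup should be refused.
 * first_ino is the first non-reserved inode number and inodes_count the
 * total number of inodes; both are passed in rather than read from a
 * superblock, which is an assumption of this sketch.
 */
static bool toy_ino_rejected(uint32_t ino, unsigned int flags,
			     uint32_t first_ino, uint32_t inodes_count)
{
	assert(first_ino > TOY_ROOT_INO);

	/* Reserved inodes other than root need an explicit "special" request. */
	if (!(flags & TOY_IGET_SPECIAL) &&
	    ino < first_ino && ino != TOY_ROOT_INO)
		return true;

	/* Numbers below root or past the end of the inode table are refused. */
	return ino < TOY_ROOT_INO || ino > inodes_count;
}
```

With `first_ino` set to 11, a lookup of inode 8 is refused for an ordinary caller and allowed for one passing `TOY_IGET_SPECIAL`; an inverted flag test would produce the opposite result, which is the kind of reversal the message describes.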
    • ext4: make sure enough credits are reserved for dioread_nolock writes · 7c2ea25e
      Theodore Ts'o committed
      commit 812c0cab2c0dfad977605dbadf9148490ca5d93f upstream.
      
      There are enough credits reserved for most dioread_nolock writes;
      however, if the extent tree is sufficiently deep, and/or quota is
      enabled, the code was not allowing for all eventualities when
      reserving journal credits for the unwritten extent conversion.
      
      This problem can be seen using xfstests ext4/034:
      
         WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
         Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
         RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
         	...
         EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
         EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
         EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28

      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. 10 Jan, 2019 2 commits
    • ext4: check for shutdown and r/o file system in ext4_write_inode() · 0cb4f655
      Theodore Ts'o committed
      commit 18f2c4fcebf2582f96cbd5f2238f4f354a0e4847 upstream.
      
      If the file system has been shut down or is read-only, then
      ext4_write_inode() needs to bail out early.
      
      Also use jbd2_complete_transaction() instead of ext4_force_commit() so
      we only force a commit if it is needed.

      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: avoid declaring fs inconsistent due to invalid file handles · 26366388
      Theodore Ts'o committed
      commit 8a363970d1dc38c4ec4ad575c862f776f468d057 upstream.
      
      If we receive a file handle, either from NFS or open_by_handle_at(2),
      and it points at an inode which has not been initialized, and the file
      system has metadata checksums enabled, we shouldn't try to get the
      inode, discover the checksum is invalid, and then declare the file
      system as being inconsistent.
      
      This can be reproduced by creating a test file system via "mke2fs -t
      ext4 -O metadata_csum /tmp/foo.img 8M", mounting it, cd'ing into that
      directory, and then running the following program.
      
      #define _GNU_SOURCE
      #include <fcntl.h>
      
      struct handle {
      	struct file_handle fh;
      	unsigned char fid[MAX_HANDLE_SZ];
      };
      
      int main(int argc, char **argv)
      {
      	struct handle h = {{8, 1 }, { 12, }};
      
      	open_by_handle_at(AT_FDCWD, &h.fh, O_RDONLY);
      	return 0;
      }
      
      Google-Bug-Id: 120690101
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. 21 Nov, 2018 1 commit
  10. 16 Sep, 2018 2 commits
    • ext4, dax: set ext4_dax_aops for dax files · cce6c9f7
      Toshi Kani committed
      A sync syscall on a DAX file needs to flush the processor cache, but
      it currently does not do so for existing DAX files.  This is because
      'ext4_da_aops', rather than 'ext4_dax_aops', is set as the
      address_space_operations of existing DAX files, since the S_DAX flag
      is set after ext4_set_aops() in the open path.
      
        New file
        --------
        lookup_open
          ext4_create
            __ext4_new_inode
              ext4_set_inode_flags   // Set S_DAX flag
            ext4_set_aops            // Set aops to ext4_dax_aops
      
        Existing file
        -------------
        lookup_open
          ext4_lookup
            ext4_iget
              ext4_set_aops          // Set aops to ext4_da_aops
              ext4_set_inode_flags   // Set S_DAX flag
      
      Change ext4_iget() to initialize i_flags before ext4_set_aops().
      
      Fixes: 5f0663bb ("ext4, dax: introduce ext4_dax_aops")
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Suggested-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
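
Illustration (not part of the commit): the ordering problem in the two call chains above can be modelled with a toy inode. Every `toy_`/`TOY_` name here is hypothetical and only mirrors the role of the kernel function named in its comment; the sketch assumes a DAX mount, so the flag helper always sets the DAX bit.

```c
#define TOY_S_DAX 0x1u /* hypothetical stand-in for the S_DAX inode flag */

enum toy_aops { TOY_DA_AOPS, TOY_DAX_AOPS };

struct toy_dax_inode {
	unsigned int i_flags;
	enum toy_aops aops;
};

/* Mirrors the role of ext4_set_inode_flags(): mark the inode as DAX. */
static void toy_set_inode_flags(struct toy_dax_inode *inode)
{
	inode->i_flags |= TOY_S_DAX;
}

/* Mirrors the role of ext4_set_aops(): pick aops from the current flags. */
static void toy_set_aops(struct toy_dax_inode *inode)
{
	inode->aops = (inode->i_flags & TOY_S_DAX) ? TOY_DAX_AOPS : TOY_DA_AOPS;
}

/* Order before the fix: aops are chosen before S_DAX is known. */
static void toy_iget_old(struct toy_dax_inode *inode)
{
	toy_set_aops(inode);
	toy_set_inode_flags(inode);
}

/* Order after the fix: flags first, then aops. */
static void toy_iget_fixed(struct toy_dax_inode *inode)
{
	toy_set_inode_flags(inode);
	toy_set_aops(inode);
}
```

Starting from a zeroed inode, `toy_iget_old()` leaves it with `TOY_DA_AOPS` even though the DAX flag ends up set, while `toy_iget_fixed()` selects `TOY_DAX_AOPS`.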
    • ext4, dax: add ext4_bmap to ext4_dax_aops · 94dbb631
      Toshi Kani committed
      The ext4 mount path calls .bmap on the journal inode. This currently
      works for the DAX mount case because ext4_iget() always sets
      'ext4_da_aops' for regular files.
      
      In preparation to fix ext4_iget() to set 'ext4_dax_aops' for ext4
      DAX files, add ext4_bmap() to 'ext4_dax_aops', since bmap works for
      DAX inodes.
      
      Fixes: 5f0663bb ("ext4, dax: introduce ext4_dax_aops")
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Suggested-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
  11. 12 Sep, 2018 1 commit
  12. 02 Sep, 2018 1 commit
  13. 18 Aug, 2018 1 commit
  14. 02 Aug, 2018 1 commit
  15. 30 Jul, 2018 2 commits
  16. 10 Jul, 2018 1 commit
  17. 17 Jun, 2018 1 commit
  18. 16 Jun, 2018 1 commit
  19. 23 May, 2018 1 commit
  20. 14 May, 2018 2 commits
  21. 10 May, 2018 1 commit
    • ext4: use raw i_version value for ea_inode · e254d1af
      Eryu Guan committed
      Currently, creating large xattr (e.g. 2k) in ea_inode would cause
      ea_inode refcount corruption, e.g.
      
        Pass 4: Checking reference counts
        Extended attribute inode 13 ref count is 0, should be 1. Fix? no
      
      This is because we save the lower 32 bits of the refcount in
      inode->i_version and store it in raw_inode->i_disk_version on disk.
      But since commit ee73f9a5 ("ext4: convert to new i_version
      API"), we load/store modified i_disk_version from/to disk instead of
      raw value, which causes on-disk ea_inode refcount corruption.
      
      Fix it by loading/storing raw i_version/i_disk_version, because it's
      a self-managed value in this case.
      
      Fixes: ee73f9a5 ("ext4: convert to new i_version API")
      Cc: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  22. 31 Mar, 2018 1 commit
    • ext4, dax: introduce ext4_dax_aops · 5f0663bb
      Dan Williams committed
      In preparation for the dax implementation to start associating dax pages
      to inodes via page->mapping, we need to provide a 'struct
      address_space_operations' instance for dax. Otherwise, direct-I/O
      triggers incorrect page cache assumptions and warnings.
      
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: linux-ext4@vger.kernel.org
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  23. 30 Mar, 2018 1 commit
  24. 28 Mar, 2018 1 commit
  25. 26 Mar, 2018 1 commit
  26. 22 Mar, 2018 4 commits
    • ext4: remove EXT4_STATE_DIOREAD_LOCK flag · 1d39834f
      Nikolay Borisov committed
      Commit 16c54688 ("ext4: Allow parallel DIO reads") reworked the way
      locking happens around parallel dio reads. This resulted in obviating
      the need for EXT4_STATE_DIOREAD_LOCK flag and accompanying logic.
      Currently this amounts to dead code so let's remove it. No functional
      changes.

      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • ext4: fix offset overflow on 32-bit archs in ext4_iomap_begin() · fe23cb65
      Jiri Slaby committed
      ext4_iomap_begin() has a bug where offset returned in the iomap
      structure will be truncated to unsigned long size. On 64-bit
      architectures this is fine but on 32-bit architectures obviously not.
      Not many places actually use the offset stored in the iomap structure,
      but one of the visible failures is in the SEEK_HOLE / SEEK_DATA
      implementation.
      If we create a file like:
      
      dd if=/dev/urandom of=file bs=1k seek=8m count=1
      
      then
      
      lseek64("file", 0x100000000ULL, SEEK_DATA)
      
      wrongly returns 0x100000000 on an unfixed kernel while it should return
      0x200000000. Avoid the overflow with a proper type cast.
      
      Fixes: 545052e9 ("ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA")
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org # v4.15
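
Illustration (not part of the commit): the truncation described above can be reproduced in user space by modelling the 32-bit `unsigned long` with `uint32_t`. The function names are hypothetical, and the 4 KiB block size used in the note below is an assumption; the sketch shows only how the shift loses its high bits, not the full SEEK_DATA behaviour.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unfixed behaviour as seen on a 32-bit arch: the block number lives in a
 * 32-bit unsigned long, so the shift wraps before the result is widened.
 */
static uint64_t toy_offset_truncated(uint32_t first_block, unsigned int blkbits)
{
	assert(blkbits < 32);
	return (uint32_t)(first_block << blkbits);
}

/* Fixed behaviour: widen to 64 bits first, then shift. */
static uint64_t toy_offset_widened(uint32_t first_block, unsigned int blkbits)
{
	assert(blkbits < 32);
	return (uint64_t)first_block << blkbits;
}
```

Assuming 4 KiB blocks (`blkbits` of 12), the data written by the `dd` command sits in block 0x200000. `toy_offset_widened(0x200000, 12)` yields 0x200000000, whereas `toy_offset_truncated(0x200000, 12)` yields 0 because the upper bits are discarded.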
    • ext4: update i_disksize if direct write past ondisk size · 45d8ec4d
      Eryu Guan committed
      Currently in the ext4 direct write path, we update i_disksize only
      when the new eof is greater than i_size, and don't update it even when
      the new eof is greater than i_disksize but less than i_size. This
      doesn't work well with delalloc buffered writes, which update i_size
      and i_disksize only when delalloc blocks are resolved (at writeback
      time). The i_disksize from a direct write can be lost if a previous
      buffered write succeeded at write time but failed at writeback time,
      which results in a corrupted on-disk inode size.
      
      Consider this case: first, a buffered write puts 4k of data into a new
      file at offset 16k with delayed allocation. Then a direct write puts
      4k of data into the same file at offset 4k before the delalloc blocks
      are resolved. The direct write doesn't update i_disksize because it
      writes within i_size (20k), but its extent tree metadata has been
      committed to the journal. Then writeback of the delalloc blocks fails
      (due to a device error etc.), and the i_size/i_disksize from the
      buffered write can't be written to disk (still zero). A subsequent
      umount/mount cycle recovers the journal and writes the extent tree
      metadata from the direct write to disk, but with i_disksize being
      zero.
      
      Fix it by updating i_disksize too in direct write path when new eof
      is greater than i_disksize but less than i_size, so i_disksize is
      always consistent with direct write.
      
      This fixes occasional i_size corruption in fstests generic/475.

      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
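
Illustration (not part of the commit): the old and new update rules described in the message can be compared on a toy inode. The struct and function names are hypothetical, sizes are plain byte counts, and the sketch deliberately ignores locking and journaling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two sizes; not the kernel's inode structure. */
struct toy_size_inode {
	int64_t i_size;     /* in-memory file size */
	int64_t i_disksize; /* size that would be recorded on disk */
};

/* Old rule: i_disksize moves only when the direct write extends i_size. */
static void toy_dio_end_old(struct toy_size_inode *inode, int64_t end)
{
	assert(end >= 0);
	if (end > inode->i_size) {
		inode->i_size = end;
		inode->i_disksize = end;
	}
}

/* Fixed rule: i_disksize also moves when the write ends past it. */
static void toy_dio_end_fixed(struct toy_size_inode *inode, int64_t end)
{
	assert(end >= 0);
	if (end > inode->i_size)
		inode->i_size = end;
	if (end > inode->i_disksize)
		inode->i_disksize = end;
}
```

For the scenario in the message, start with `i_size` of 20480 and `i_disksize` of 0, then apply a direct write ending at 8192. The old rule leaves `i_disksize` at 0; the fixed rule raises it to 8192 and leaves `i_size` unchanged.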
    • ext4: protect i_disksize update by i_data_sem in direct write path · 73fdad00
      Eryu Guan committed
      i_disksize updates should be protected by i_data_sem, either by taking
      the lock explicitly or by using the ext4_update_i_disksize() helper. But
      the i_disksize updates in ext4_direct_IO_write() are not protected at
      all, so they may race with i_disksize updates in the writeback path of
      delalloc buffered writes.
      
      This is found by code inspection, and I didn't hit any i_disksize
      corruption due to this bug. Thanks to Jan Kara for catching this bug and
      suggesting the fix!

      Reported-by: Jan Kara <jack@suse.cz>
      Suggested-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
  27. 29 Jan, 2018 2 commits
  28. 10 Jan, 2018 1 commit
    • ext4: fix a race in the ext4 shutdown path · abbc3f93
      Harshad Shirwadkar committed
      This patch fixes a race between the shutdown path and bio completion
      handling. In the ext4 direct io path with async io, after submitting a
      bio to the block layer, if journal starting fails,
      ext4_direct_IO_write() would bail out pretending that the IO
      failed. The caller would have had no way of knowing whether or not the
      IO was successfully submitted. So instead, we return -EIOCBQUEUED in
      this case. Now, the caller knows that the IO was submitted.  The bio
      completion handler takes care of the error.
      
      Tested: Ran the shutdown xfstest test 461 in loop for over 2 hours across
      4 machines resulting in over 400 runs. Verified that the race didn't
      occur. Usually the race was seen in about 20-30 iterations.

      Signed-off-by: Harshad Shirwadkar <harshads@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org