1. 12 Apr, 2021 1 commit
  2. 24 Jan, 2021 1 commit
  3. 23 Dec, 2020 3 commits
  4. 07 Nov, 2020 2 commits
  5. 22 Oct, 2020 1 commit
    • ext4: main fast-commit commit path · aa75f4d3
      Committed by Harshad Shirwadkar
      This patch adds main fast commit commit path handlers. The overall
      patch can be divided into two inter-related parts:
      
      (A) Metadata updates tracking
      
          This part consists of helper functions to track changes that need
          to be committed during a commit operation. These updates are
          maintained by Ext4 in different in-memory queues. Following are
          the APIs and their short description that are implemented in this
          patch:
      
          - ext4_fc_track_link/unlink/creat() - Track link, unlink and creat
            operations
          - ext4_fc_track_range() - Track changed logical block offsets of
            inodes
          - ext4_fc_track_inode() - Track inodes
          - ext4_fc_mark_ineligible() - Mark the file system as ineligible
            for fast commit
          - ext4_fc_start_update() / ext4_fc_stop_update() /
            ext4_fc_start_ineligible() / ext4_fc_stop_ineligible() - These
            functions are useful for coordinating inode updates with
            commits.
      
      (B) Main commit Path
      
          This part consists of functions to convert updates tracked in
          in-memory data structures into on-disk commits. Function
          ext4_fc_commit() is the main entry point to the commit path.
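
      The range tracking described in part (A) can be pictured with a small
      user-space sketch. It assumes that one contiguous range of changed
      logical blocks is kept per inode and widened on every update. The
      struct fc_range type and the fc_range_track() helper are hypothetical
      names made up for this illustration, not the kernel's code.

```c
#include <stdint.h>

/* Hypothetical per-inode tracking state; not the kernel's structure. */
struct fc_range {
    uint32_t start; /* first changed logical block */
    uint32_t len;   /* number of blocks covered; 0 means nothing tracked */
};

/*
 * Widen the tracked range so that it also covers the inclusive block
 * range [start, end]. The caller must pass start <= end.
 */
static void fc_range_track(struct fc_range *r, uint32_t start, uint32_t end)
{
    if (r->len == 0) {
        /* Nothing tracked yet: adopt the new range as is. */
        r->start = start;
        r->len = end - start + 1;
        return;
    }

    uint32_t old_end = r->start + r->len - 1;
    uint32_t new_start = start < r->start ? start : r->start;
    uint32_t new_end = end > old_end ? end : old_end;

    r->start = new_start;
    r->len = new_end - new_start + 1;
}
```

      Keeping a single covering range trades precision for simplicity: two
      distant updates cause the blocks between them to be treated as changed
      too.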
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
      Link: https://lore.kernel.org/r/20201015203802.3597742-6-harshadshirwadkar@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  6. 18 Oct, 2020 1 commit
  7. 20 Aug, 2020 1 commit
    • ext4: limit the length of per-inode prealloc list · 27bc446e
      Committed by brookxu
      In the scenario of writing sparse files, the per-inode prealloc list may
      be very long, resulting in high overhead for ext4_mb_use_preallocated().
      To circumvent this problem, we limit the maximum length of per-inode
      prealloc list to 512 and allow users to modify it.
      
      After patching, we observed that the share of CPU time spent in sys
      dropped and that system throughput increased significantly. We created
      a process that writes a sparse file; its running time on the fixed
      kernel was significantly lower, as follows:
      
      Running time on unfixed kernel:
      [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
      real    0m2.051s
      user    0m0.008s
      sys     0m2.026s
      
      Running time on fixed kernel:
      [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
      real    0m0.471s
      user    0m0.004s
      sys     0m0.395s
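
      The idea of capping a list can be sketched in user space as follows.
      This is a minimal illustration under stated assumptions, not the
      patch itself: struct pa_node, struct pa_list and both helpers are
      hypothetical, new entries are assumed to be added at the head, and
      the sketch ignores whether an entry is still in use.

```c
#include <stdlib.h>

/* Hypothetical list node standing in for one preallocation. */
struct pa_node {
    unsigned long lstart;  /* logical start block of the preallocation */
    struct pa_node *next;
};

/* Hypothetical list head with a configurable length cap. */
struct pa_list {
    struct pa_node *head;
    unsigned int count;
    unsigned int max;      /* e.g. 512, adjustable by the user */
};

/* Add a new entry at the head. Returns 0 on success, -1 if out of memory. */
static int pa_list_add(struct pa_list *l, unsigned long lstart)
{
    struct pa_node *n = malloc(sizeof(*n));

    if (!n)
        return -1;
    n->lstart = lstart;
    n->next = l->head;
    l->head = n;
    l->count++;
    return 0;
}

/* Keep the first l->max entries, free the rest, return how many were freed. */
static unsigned int pa_list_trim(struct pa_list *l)
{
    struct pa_node **pp = &l->head;
    unsigned int kept = 0, freed = 0;

    while (*pp && kept < l->max) {
        pp = &(*pp)->next;
        kept++;
    }

    struct pa_node *n = *pp;
    *pp = NULL;                /* cut the list after the kept entries */
    while (n) {
        struct pa_node *next = n->next;
        free(n);
        n = next;
        freed++;
    }
    l->count = kept;
    return freed;
}
```

      With the length bounded, a linear walk over the list costs at most
      max steps, which is the effect the timings above point to.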
      Signed-off-by: Chunguang Xu <brookxu@tencent.com>
      Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  8. 06 Aug, 2020 3 commits
  9. 04 Jun, 2020 2 commits
  10. 06 Mar, 2020 1 commit
  11. 27 Dec, 2019 1 commit
    • ext4: Optimize ext4 DIO overwrites · 8cd115bd
      Committed by Jan Kara
      Currently we start a transaction for mapping every extent for writing
      using direct IO. This is unnecessary when we know we are overwriting
      already allocated blocks, and the overhead of starting a transaction can
      be significant, especially for multithreaded workloads doing small writes.
      Use iomap operations that avoid starting a transaction for direct IO
      overwrites.
      
      This improves throughput of 4k random writes - fio jobfile:
      [global]
      rw=randrw
      norandommap=1
      invalidate=0
      bs=4k
      numjobs=16
      time_based=1
      ramp_time=30
      runtime=120
      group_reporting=1
      ioengine=psync
      direct=1
      size=16G
      filename=file1.0.0:file1.0.1:file1.0.2:file1.0.3:file1.0.4:file1.0.5:file1.0.6:file1.0.7:file1.0.8:file1.0.9:file1.0.10:file1.0.11:file1.0.12:file1.0.13:file1.0.14:file1.0.15:file1.0.16:file1.0.17:file1.0.18:file1.0.19:file1.0.20:file1.0.21:file1.0.22:file1.0.23:file1.0.24:file1.0.25:file1.0.26:file1.0.27:file1.0.28:file1.0.29:file1.0.30:file1.0.31
      file_service_type=random
      nrfiles=32
      
      from 3018MB/s to 4059MB/s in my test VM running test against simulated
      pmem device (note that before iomap conversion, this workload was able
      to achieve 3708MB/s because old direct IO path avoided transaction start
      for overwrites as well). For dax, the win is even larger improving
      throughput from 3042MB/s to 4311MB/s.
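
      The condition under which a write counts as a pure overwrite can be
      sketched as below. This is a simplified user-space model, assuming
      that a write needs no transaction when it stays within i_size and is
      fully covered by one already allocated, written extent. struct
      toy_extent and toy_is_overwrite() are hypothetical names, not the
      iomap or ext4 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extent description, using byte offsets for simplicity. */
struct toy_extent {
    uint64_t start;  /* first byte covered by the extent */
    uint64_t len;    /* length of the extent in bytes */
    bool written;    /* false for holes and unwritten extents */
};

/*
 * Return true if the write [pos, pos + len) only overwrites blocks that
 * are already allocated and written, so no metadata needs to change.
 */
static bool toy_is_overwrite(const struct toy_extent *ext, uint64_t i_size,
                             uint64_t pos, uint64_t len)
{
    if (len == 0 || pos + len > i_size)
        return false;            /* empty or extending write */
    if (!ext->written)
        return false;            /* allocation or conversion needed */
    return pos >= ext->start && pos + len <= ext->start + ext->len;
}
```

      Any write that fails the check falls back to the regular path, which
      starts a transaction as before.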
      Reported-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20191218174433.19380-1-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  12. 23 Dec, 2019 3 commits
  13. 06 Nov, 2019 5 commits
  14. 13 Aug, 2019 1 commit
    • ext4: add basic fs-verity support · c93d8f88
      Committed by Eric Biggers
      Add most of fs-verity support to ext4.  fs-verity is a filesystem
      feature that enables transparent integrity protection and authentication
      of read-only files.  It uses a dm-verity like mechanism at the file
      level: a Merkle tree is used to verify any block in the file in
      log(filesize) time.  It is implemented mainly by helper functions in
      fs/verity/.  See Documentation/filesystems/fsverity.rst for the full
      documentation.
      
      This commit adds all of ext4 fs-verity support except for the actual
      data verification, including:
      
      - Adding a filesystem feature flag and an inode flag for fs-verity.
      
      - Implementing the fsverity_operations to support enabling verity on an
        inode and reading/writing the verity metadata.
      
      - Updating ->write_begin(), ->write_end(), and ->writepages() to support
        writing verity metadata pages.
      
      - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl().
      
      ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
      past the end of the file, starting at the first 64K boundary beyond
      i_size.  This approach works because (a) verity files are readonly, and
      (b) pages fully beyond i_size aren't visible to userspace but can be
      read/written internally by ext4 with only some relatively small changes
      to ext4.  This approach avoids having to depend on the EA_INODE feature
      and on rearchitecturing ext4's xattr support to support paging
      multi-gigabyte xattrs into memory, and to support encrypting xattrs.
      Note that the verity metadata *must* be encrypted when the file is,
      since it contains hashes of the plaintext data.
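
      The placement rule for the verity metadata can be written as a
      one-line computation. The sketch below assumes the offset is i_size
      rounded up to a multiple of 64K, so a file whose size is already a
      multiple of 64K would have its metadata start exactly at i_size; that
      edge case is an assumption of this example. The function and macro
      names are hypothetical.

```c
#include <stdint.h>

#define VERITY_METADATA_ALIGN 65536ULL  /* 64K, as stated above */

/* Byte offset at which the verity metadata would start for a file. */
static uint64_t verity_metadata_pos(uint64_t i_size)
{
    return (i_size + VERITY_METADATA_ALIGN - 1) /
           VERITY_METADATA_ALIGN * VERITY_METADATA_ALIGN;
}
```

      For example, a 100000-byte file would have its metadata start at
      offset 131072, the first 64K boundary past the data.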
      
      This patch incorporates work by Theodore Ts'o and Chandan Rajendra.
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
  15. 12 Aug, 2019 1 commit
  16. 06 Jul, 2019 1 commit
  17. 10 Jun, 2019 1 commit
    • ext4: enforce the immutable flag on open files · 02b016ca
      Committed by Theodore Ts'o
      According to the chattr man page, "a file with the 'i' attribute
      cannot be modified..."  Historically, this was only enforced when the
      file was opened, per the rest of the description, "... and the file
      can not be opened in write mode".
      
      There is general agreement that we should standardize all file systems
      to prevent modifications even for files that were opened at the time
      the immutable flag is set.  Eventually, a change to enforce this at
      the VFS layer should be landing in mainline.  Until then, enforce this
      at the ext4 level to prevent xfstests generic/553 from failing.
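
      Enforcing the flag on already open files amounts to re-checking it on
      every modifying operation rather than only at open time. A minimal
      user-space sketch, assuming a hypothetical struct toy_inode and flag
      bit (neither is the kernel's definition):

```c
#include <errno.h>

#define TOY_IMMUTABLE_FL 0x10u  /* hypothetical flag bit for this sketch */

/* Hypothetical in-memory inode carrying only the flags word. */
struct toy_inode {
    unsigned int flags;
};

/*
 * Called at the start of every modifying operation (write, truncate,
 * attribute change). Returns 0 if allowed, -EPERM if the inode has
 * become immutable since it was opened.
 */
static int toy_write_check(const struct toy_inode *inode)
{
    if (inode->flags & TOY_IMMUTABLE_FL)
        return -EPERM;
    return 0;
}
```

      Because the check runs per operation, setting the flag takes effect
      immediately for descriptors that were opened for writing earlier.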
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
      Cc: stable@kernel.org
  18. 11 May, 2019 1 commit
    • ext4: fix data corruption caused by overlapping unaligned and aligned IO · 57a0da28
      Committed by Lukas Czerner
      Unaligned AIO must be serialized because the zeroing of partial blocks
      of unaligned AIO can result in data corruption in case it's overlapping
      another in flight IO.
      
      Currently we wait for all unwritten extents before we submit unaligned
      AIO, which protects data when the unaligned AIO follows overlapping
      IO. However, if an unaligned AIO is followed by overlapping aligned AIO,
      we can still end up corrupting data.
      
      To fix this, we must make sure that the unaligned AIO is the only IO in
      flight by waiting for unwritten extents conversion not just before the
      IO submission, but right after it as well.
      
      This problem can be reproduced by xfstest generic/538
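
      Whether an IO is unaligned in the sense used here can be decided from
      its offset and length alone. The helper below is a hypothetical
      user-space sketch that assumes the block size is a power of two; it
      is not the kernel's function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the IO [pos, pos + len) does not start or does not end
 * on a block boundary, which means a partial block may need zeroing.
 * blocksize must be a power of two.
 */
static bool io_is_unaligned(uint64_t pos, uint64_t len, uint32_t blocksize)
{
    uint64_t mask = (uint64_t)blocksize - 1;

    return ((pos | (pos + len)) & mask) != 0;
}
```

      With 4096-byte blocks, a 4096-byte write at offset 4096 is aligned,
      while the same write at offset 512 touches two partial blocks and is
      therefore unaligned.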
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
  19. 15 Mar, 2019 1 commit
    • ext4: fix data corruption caused by unaligned direct AIO · 372a03e0
      Committed by Lukas Czerner
      Ext4 needs to serialize unaligned direct AIO because the zeroing of
      partial blocks of two competing unaligned AIOs can result in data
      corruption.
      
      However it decides not to serialize if the potentially unaligned aio is
      past i_size with the rationale that no pending writes are possible past
      i_size. Unfortunately if the i_size is not block aligned and the second
      unaligned write lands past i_size, but still into the same block, it has
      the potential of corrupting the previous unaligned write to the same
      block.
      
      This is a (very simplified) reproducer from Frank
      
          // 41472 = (10 * 4096) + 512
          // 37376 = 41472 - 4096
      
          ftruncate(fd, 41472);
          io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
          io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);
      
          io_submit(io_ctx, 1, &iocbs[0]);
          io_submit(io_ctx, 1, &iocbs[1]);
      
          io_getevents(io_ctx, 2, 2, events, NULL);
      
      Without this patch the 512B range from 40960 up to the start of the
      second unaligned write (41472) is going to be zeroed overwriting the data
      written by the first write. This is a data corruption.
      
      00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
      *
      0000a000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
      
      With this patch the data corruption is avoided because we will recognize
      the unaligned_aio and wait for the unwritten extent conversion.
      
      00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
      *
      0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
      *
      0000b200
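
      The difference between the old and the stricter rule can be checked
      with the reproducer's numbers. The sketch below is a user-space model
      of one plausible form of the check, assuming the fix compares the
      write position against i_size rounded up to the block size. All
      function names are hypothetical, and blocksize is assumed to be a
      power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x up to the next multiple of blocksize (a power of two). */
static uint64_t align_up(uint64_t x, uint64_t blocksize)
{
    return (x + blocksize - 1) & ~(blocksize - 1);
}

/* Old rule: skip serialization whenever the write starts at or past i_size. */
static bool skip_serialize_old(uint64_t pos, uint64_t i_size)
{
    return pos >= i_size;
}

/*
 * Stricter rule: skip serialization only if the write starts at or beyond
 * the end of the block that contains i_size, so that it cannot share a
 * block with an earlier unaligned write.
 */
static bool skip_serialize_new(uint64_t pos, uint64_t i_size,
                               uint64_t blocksize)
{
    return pos >= align_up(i_size, blocksize);
}
```

      With i_size = 41472 and 4096-byte blocks, the second write at offset
      41472 passes the old test and is not serialized, although it shares
      the block starting at 40960 with the first write. Under the stricter
      rule the threshold is 45056, so the write is serialized.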
      Reported-by: Frank Sorenson <fsorenso@redhat.com>
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Fixes: e9e3bcec ("ext4: serialize unaligned asynchronous DIO")
      Cc: stable@vger.kernel.org
  20. 18 Aug, 2018 1 commit
  21. 14 May, 2018 3 commits
  22. 08 Jan, 2018 2 commits
  23. 03 Nov, 2017 3 commits