1. 15 Sep 2020, 1 commit
  2. 12 Sep 2020, 11 commits
    • f2fs: change virtual mapping way for compression pages · 6fcaebac
      Daeho Jeong authored
      While profiling f2fs compression work, I found that vmap() calls show
      unexpected spikes in execution time in our test environment and are
      a bottleneck in the f2fs decompression path. Replacing them with
      vm_map_ram() improves f2fs decompression speed considerably.
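
      A minimal sketch of the substitution, assuming a helper pair around
      the cluster's page array (names are illustrative, not the actual
      f2fs patch):

        #include <linux/mm.h>
        #include <linux/vmalloc.h>

        /* vm_map_ram() serves small mappings from a per-CPU block pool,
         * avoiding the global work that dominates the vmap() profile. */
        static void *map_cluster_pages(struct page **pages, unsigned int count)
        {
                return vm_map_ram(pages, count, -1);    /* -1: any NUMA node */
        }

        static void unmap_cluster_pages(void *addr, unsigned int count)
        {
                vm_unmap_ram(addr, count);
        }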
      
      [Verification]
      Android Pixel 3(ARM64, 6GB RAM, 128GB UFS)
      Turned on only 0-3 little cores(at 1.785GHz)
      
      dd if=/dev/zero of=dummy bs=1m count=1000
      echo 3 > /proc/sys/vm/drop_caches
      dd if=dummy of=/dev/zero bs=512k
      
      - w/o compression -
      1048576000 bytes (0.9 G) copied, 2.082554 s, 480 M/s
      1048576000 bytes (0.9 G) copied, 2.081634 s, 480 M/s
      1048576000 bytes (0.9 G) copied, 2.090861 s, 478 M/s
      
      - before patch -
      1048576000 bytes (0.9 G) copied, 7.407527 s, 135 M/s
      1048576000 bytes (0.9 G) copied, 7.283734 s, 137 M/s
      1048576000 bytes (0.9 G) copied, 7.291508 s, 137 M/s
      
      - after patch -
      1048576000 bytes (0.9 G) copied, 1.998959 s, 500 M/s
      1048576000 bytes (0.9 G) copied, 1.987554 s, 503 M/s
      1048576000 bytes (0.9 G) copied, 1.986380 s, 503 M/s
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change return value of f2fs_disable_compressed_file to bool · 78134d03
      Daeho Jeong authored
      The returned integer is not used anywhere, so change the return type
      to bool.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change i_compr_blocks of inode to atomic value · c2759eba
      Daeho Jeong authored
      writepages() can be invoked concurrently for the same file by
      different threads, such as a thread fsyncing the file and a kworker
      kernel thread. So updating i_compr_blocks without protection is
      racy, and we need to make it an atomic value. Since i_compr_blocks
      does not need 64 bits, plain atomic_t is enough; atomic64_t is
      unnecessary.
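
      A sketch of the accessors this implies (assuming the counter lives
      in f2fs_inode_info; helper names are illustrative):

        /* atomic_t is 32 bits on all architectures, which is plenty for
         * a per-inode compressed-block count. */
        static void add_compr_blocks(struct f2fs_inode_info *fi, int blocks)
        {
                atomic_add(blocks, &fi->i_compr_blocks);
        }

        static int get_compr_blocks(struct f2fs_inode_info *fi)
        {
                return atomic_read(&fi->i_compr_blocks);
        }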
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: ignore compress mount option on image w/o compression feature · 69c0dd29
      Chao Yu authored
      Ignore the compress mount option on an image without the compression
      feature, keeping the behavior consistent with passing the option to
      a kernel built without compression support, so that mount does not
      fail in this case.
      Reported-by: Kyungmin Park <kyungmin.park@samsung.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: allocate proper size memory for zstd decompress · 0e2b7385
      Chao Yu authored
      As 5kft <5kft@5kft.org> reported:
      
       kworker/u9:3: page allocation failure: order:9, mode:0x40c40(GFP_NOFS|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
       CPU: 3 PID: 8168 Comm: kworker/u9:3 Tainted: G         C        5.8.3-sunxi #trunk
       Hardware name: Allwinner sun8i Family
       Workqueue: f2fs_post_read_wq f2fs_post_read_work
       [<c010d6d5>] (unwind_backtrace) from [<c0109a55>] (show_stack+0x11/0x14)
       [<c0109a55>] (show_stack) from [<c056d489>] (dump_stack+0x75/0x84)
       [<c056d489>] (dump_stack) from [<c0243b53>] (warn_alloc+0xa3/0x104)
       [<c0243b53>] (warn_alloc) from [<c024473b>] (__alloc_pages_nodemask+0xb87/0xc40)
       [<c024473b>] (__alloc_pages_nodemask) from [<c02267c5>] (kmalloc_order+0x19/0x38)
       [<c02267c5>] (kmalloc_order) from [<c02267fd>] (kmalloc_order_trace+0x19/0x90)
       [<c02267fd>] (kmalloc_order_trace) from [<c047c665>] (zstd_init_decompress_ctx+0x21/0x88)
       [<c047c665>] (zstd_init_decompress_ctx) from [<c047e9cf>] (f2fs_decompress_pages+0x97/0x228)
       [<c047e9cf>] (f2fs_decompress_pages) from [<c045d0ab>] (__read_end_io+0xfb/0x130)
       [<c045d0ab>] (__read_end_io) from [<c045d141>] (f2fs_post_read_work+0x61/0x84)
       [<c045d141>] (f2fs_post_read_work) from [<c0130b2f>] (process_one_work+0x15f/0x3b0)
       [<c0130b2f>] (process_one_work) from [<c0130e7b>] (worker_thread+0xfb/0x3e0)
       [<c0130e7b>] (worker_thread) from [<c0135c3b>] (kthread+0xeb/0x10c)
       [<c0135c3b>] (kthread) from [<c0100159>]
      
      zstd may allocate a large amount of memory for {,de}compression,
      which can cause file copy failures on low-end devices with very
      little memory.
      
      For decompression, allocate memory sized for the current file's
      cluster size instead of the maximum cluster size.
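
      Under the pre-5.16 in-kernel zstd API, the sizing logic is roughly
      this sketch (locals illustrative):

        /* Bound the workspace by this file's cluster size, not the
         * filesystem-wide maximum, so a small cluster no longer needs
         * an order-9 allocation. */
        unsigned int window_size = PAGE_SIZE << log_cluster_size;
        size_t workspace_size = ZSTD_DStreamWorkspaceBound(window_size);
        void *workspace = f2fs_kvmalloc(sbi, workspace_size, GFP_NOFS);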
      Reported-by: 5kft <5kft@5kft.org>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change compr_blocks of superblock info to 64bit · ae999bb9
      Daeho Jeong authored
      compr_blocks in the superblock info is currently not a 64-bit value,
      yet we accumulate each inode's 64-bit i_compr_blocks count into it.
      Change it to a 64-bit value to avoid overflow.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add block address limit check to compressed file · 4eda1682
      Daeho Jeong authored
      Add a block address range check for the compressed-file case and
      avoid calling get_data_block_bmap() for compressed files.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check position in move range ioctl · aad1383c
      Dan Robertson authored
      When the move range ioctl is used, check the input and output
      positions and ensure they are non-negative. Without this check,
      f2fs_get_dnode_of_data may hit a memory bug.
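
      The validation amounts to something like this sketch (variable
      names follow the ioctl argument layout):

        /* reject negative offsets before they can reach
         * f2fs_get_dnode_of_data() */
        if (pos_in < 0 || pos_out < 0)
                return -EINVAL;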
      Signed-off-by: Dan Robertson <dan@dlrobertson.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: correct statistic of APP_DIRECT_IO/APP_DIRECT_READ_IO · 335cac8b
      Jack Qiu authored
      APP_DIRECT_IO/APP_DIRECT_READ_IO were not updated when receiving
      async DIO. For example:
       fio -filename=/data/test.0 -bs=1m -ioengine=libaio -direct=1
           -name=fill -size=10m -numjobs=1 -iodepth=32 -rw=write
      Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Simplify SEEK_DATA implementation · 4cb03fec
      Matthew Wilcox (Oracle) authored
      Instead of finding the first dirty page and then seeing if it matches
      the index of a block that is NEW_ADDR, delay the lookup of the dirty
      bit until we've actually found a block that's NEW_ADDR.
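
      The reordered check looks roughly like this (a sketch, not the
      patch itself; the dirty-tag probe is only paid for NEW_ADDR blocks):

        /* NEW_ADDR means the block is reserved but unwritten; it only
         * counts as data if a dirty page exists at that index. */
        if (blkaddr == NEW_ADDR &&
            xa_get_mark(&mapping->i_pages, index, PAGECACHE_TAG_DIRTY))
                goto found;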
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support age threshold based garbage collection · 093749e2
      Chao Yu authored
      There are several issues in the current background GC algorithm:
      - the valid block count is one of the key factors in the cost
      overhead calculation, so the CB algorithm will choose a segment with
      few valid blocks as a victim even when the segment is young or sits
      in a hot area; that is not appropriate.
      - GCed data/node blocks go to the existing logs regardless of
      whether the moved data share the same update frequency, so hot and
      cold data may be mixed again.
      - the GC allocator mainly uses LFS-type segments, which consumes
      free segments more quickly.
      
      This patch introduces a new algorithm, age threshold based garbage
      collection, to solve the above issues. It has three main steps
      (a simplified sketch follows the list):
      
      1. select a source victim:
      - set an age threshold, and select candidates based on the
      threshold: e.g.
       if 0 means youngest and 100 means oldest, an age threshold of 80
       selects dirty segments whose age is in the range [80, 100] as
       candidates;
      - set a candidate_ratio threshold, and select candidates based on
      the ratio, so that we can shrink the candidates to the oldest
      segments;
      - select the segment with the fewest valid blocks, in order to
      migrate blocks with minimum cost;
      
      2. select a target victim:
      - select candidates based on the age threshold;
      - set a candidate_radius threshold, and search for candidates whose
      age is around the source victim's; the search radius should be less
      than the radius threshold;
      - select the segment with the most valid blocks, in order to avoid
      the chosen target being migrated again.
      
      3. merge valid blocks from the source victim into the target victim
      with the SSR allocator.
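
      A greatly simplified sketch of the two selection policies (all
      names are illustrative; the real implementation walks an rb-tree
      keyed by segment mtime):

        struct cand { unsigned int segno; unsigned int vblocks; u64 age; };

        /* step 1: among old-enough segments, fewest valid blocks wins,
         * minimizing migration cost. */
        static bool better_source(const struct cand *a, const struct cand *b)
        {
                return a->vblocks < b->vblocks;
        }

        /* step 2: only consider targets whose age lies within a radius
         * of the source's age; among those, most valid blocks wins, so
         * the target is unlikely to be re-picked as a future source. */
        static bool target_in_radius(const struct cand *c, u64 src_age,
                                     u64 radius)
        {
                return c->age <= src_age + radius &&
                       c->age + radius >= src_age;
        }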
      
      Test steps:
      - create 160 dirty segments:
       * half of them have 128 valid blocks per segment
       * the other half have 384 valid blocks per segment
      - run background GC
      
      Benefit: GC count and block movement count both decrease noticeably:
      
      - Before:
        - Valid: 86
        - Dirty: 1
        - Prefree: 11
        - Free: 6001 (6001)
      
      GC calls: 162 (BG: 220)
        - data segments : 160 (160)
        - node segments : 2 (2)
      Try to move 41454 blocks (BG: 41454)
        - data blocks : 40960 (40960)
        - node blocks : 494 (494)
      
      IPU: 0 blocks
      SSR: 0 blocks in 0 segments
      LFS: 41364 blocks in 81 segments
      
      - After:
      
        - Valid: 87
        - Dirty: 0
        - Prefree: 4
        - Free: 6008 (6008)
      
      GC calls: 75 (BG: 76)
        - data segments : 74 (74)
        - node segments : 1 (1)
      Try to move 12813 blocks (BG: 12813)
        - data blocks : 12544 (12544)
        - node blocks : 269 (269)
      
      IPU: 0 blocks
      SSR: 12032 blocks in 77 segments
      LFS: 855 blocks in 2 segments
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 11 Sep 2020, 12 commits
    • f2fs: Use generic casefolding support · eca4873e
      Daniel Rosenberg authored
      This switches f2fs over to the generic support provided in
      the previous patch.
      
      Since casefolded dentries behave the same in ext4 and f2fs, we decrease
      the maintenance burden by unifying them, and any optimizations will
      immediately apply to both.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • fs: Add standard casefolding support · c843843e
      Daniel Rosenberg authored
      This adds general supporting functions for filesystems that use
      utf8 casefolding. It provides standard dentry_operations and adds the
      necessary structures in struct super_block to allow this standardization.
      
      The new dentry operations are functionally equivalent to the existing
      operations in ext4 and f2fs, apart from the use of utf8_casefold_hash to
      avoid an allocation.
      
      By providing a common implementation, all users can benefit from any
      optimizations without needing to port over improvements.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • unicode: Add utf8_casefold_hash · 3d7bfea8
      Daniel Rosenberg authored
      This adds a case insensitive hash function to allow taking the hash
      without needing to allocate a casefolded copy of the string.
      
      The existing d_hash implementations for casefolding allocate memory
      within rcu-walk; by avoiding that allocation we can be more
      efficient and need not worry about a failed allocation.
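
      A sketch of a d_hash implementation built on the new helper, close
      to the generic one added in the companion patch (it assumes the
      sb->s_encoding field introduced by that series):

        static int ci_d_hash(const struct dentry *dentry, struct qstr *str)
        {
                const struct inode *dir = READ_ONCE(dentry->d_inode);

                if (!dir || !IS_CASEFOLDED(dir))
                        return 0;

                /* hashes the casefolded name without allocating a
                 * casefolded copy of the string */
                return utf8_casefold_hash(dentry->d_sb->s_encoding,
                                          dentry, str);
        }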
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: compress: use more readable atomic_t type for {cic,dic}.ref · e6c3948d
      Chao Yu authored
      A refcount_t variable should never drop below one, so it is a little
      confusing to use one as a pending compressed page count; change it
      to atomic_t for better readability.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix compile warning · 17d7648d
      Chao Yu authored
      This patch fixes the below compile warning reported by LKP
      (kernel test robot):
      
      cppcheck warnings: (new ones prefixed by >>)
      
      >> fs/f2fs/file.c:761:9: warning: Identical condition 'err', second condition is always false [identicalConditionAfterEarlyExit]
          return err;
                 ^
         fs/f2fs/file.c:753:6: note: first condition
          if (err)
              ^
         fs/f2fs/file.c:761:9: note: second condition
          return err;
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support 64-bits key in f2fs rb-tree node entry · 2e9b2bb2
      Chao Yu authored
      With this, we can add an entry into the rb-tree using a 64-bit
      segment time as the key.
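
      The extended entry looks roughly like this (a sketch; the new key
      is unioned with the existing extent fields):

        struct rb_entry {
                struct rb_node rb_node;
                union {
                        struct {
                                unsigned int ofs;
                                unsigned int len;
                        };
                        unsigned long long key; /* e.g. segment mtime */
                };
        };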
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: inherit mtime of original block during GC · c5d02785
      Chao Yu authored
      Don't let f2fs' internal GC ruin the original aging degree of a
      segment.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: record average update time of segment · 6f3a01ae
      Chao Yu authored
      Previously, once we updated one block in a segment, we set the
      segment's mtime to the current time, making an aged segment look
      freshest and causing GC with the cost-benefit algorithm to miss it.
      This patch changes mtime to record the average block update time
      instead of the last update time.
      
      There is no need to reset mtime for a prefree segment: since
      se->valid_blocks is zero, the old se->mtime carries no weight in the
      calculation below:
      
      	se->mtime = div_u64(se->mtime * se->valid_blocks + mtime,
      					se->valid_blocks + 1);
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce inmem curseg · d0b9e42a
      Chao Yu authored
      The previous implementation of aligned pinfile allocation would:
      - allocate a new segment on the cold data log regardless of whether
      the last used segment was partially used, making IO more random;
      - force concurrent cold data/GCed IO into the warm data area, which
      can hurt hot/cold data separation;
      
      This patch introduces a new type of log named 'inmem curseg'. The
      differences from a normal curseg are:
      - it reuses existing segment types (CURSEG_XXX_NODE/DATA);
      - it exists only in memory; its segno, blkofs, and summary will not
       be persisted into the checkpoint area;
      
      With this new feature, we can enhance the scalability of logs;
      special allocators can be created for specific purposes:
      - a pure LFS allocator for aligned pinfile allocation or file
      defragmentation
      - a pure SSR allocator for a later feature
      
      So, let's update aligned pinfile allocation to use this new inmem
      curseg framework.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: compress: remove unneeded code · 376207af
      Chao Yu authored
      - f2fs_write_multi_pages
       - f2fs_compress_pages
        - init_compress_ctx
        - compress_pages
        - destroy_compress_ctx  --- 1
       - f2fs_write_compressed_pages
       - destroy_compress_ctx  --- 2
      
      The destroy_compress_ctx() call in f2fs_write_multi_pages() is
      redundant; remove it.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove duplicated type casting · e90027d2
      Xiaojun Wang authored
      Since DUMMY_WRITTEN_PAGE and ATOMIC_WRITTEN_PAGE are already defined
      as unsigned long, we don't need to cast them again.
      Signed-off-by: Xiaojun Wang <wangxiaojun11@huawei.com>
      Reported-by: Jack Qiu <jack.qiu@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support zone capacity less than zone size · de881df9
      Aravind Ramesh authored
      NVMe Zoned Namespace devices can have a zone capacity smaller than
      the zone size. Zone capacity indicates the maximum number of sectors
      that are usable in a zone, beginning from the first sector of the
      zone. This makes the sectors after the zone capacity, up to the zone
      size, unusable. This patch tracks zone size and zone capacity in
      zoned devices and calculates the usable blocks per segment and
      usable segments per section.
      
      If the zone capacity is less than the zone size, only segments that
      start before the zone capacity are marked as free segments; all
      segments at and beyond the zone capacity are treated as permanently
      used. In cases where the zone capacity does not align with the
      segment size, the last segment will start before and end beyond the
      zone capacity. For such spanning segments, only sectors within the
      zone capacity are used.
      
      During writes and GC, manage the usable segments in a section and
      the usable blocks per segment. Segments beyond the zone capacity are
      never allocated and need not be garbage collected; only segments
      before the zone capacity need garbage collection. For spanning
      segments, write only up to the zone capacity, based on the number of
      usable blocks in that segment.
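
      The per-segment accounting reduces to something like this sketch
      (helper and parameter names are illustrative, in units of blocks):

        static unsigned int usable_blks_in_seg(unsigned int seg_start,
                                               unsigned int seg_end,
                                               unsigned int zone_cap_end)
        {
                if (seg_start >= zone_cap_end)
                        return 0;               /* beyond capacity: unusable */
                if (seg_end <= zone_cap_end)
                        return seg_end - seg_start;     /* fully usable */
                /* spanning segment: usable only up to the capacity */
                return zone_cap_end - seg_start;
        }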
      
      Zone capacity is device specific and cannot be configured by the
      user. Since NVMe ZNS device zones are sequential-write-only, a block
      device with conventional zones, or any regular block device, is
      needed alongside the ZNS device for f2fs metadata operations.
      
      A typical nvme-cli output of a zoned device shows zone start,
      capacity, and write pointer as below:
      
      SLBA: 0x0     WP: 0x0     Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      
      Here the zone size is 64MB and the capacity is 49MB; the WP is at
      the zone start since the zones are in the EMPTY state. For each
      zone, only zone start + 49MB is usable; any lba/sector after 49MB
      cannot be read or written, and the drive will fail such attempts.
      So the second zone starts at 64MB and is usable until 113MB
      (64 + 49), the range between 113MB and 128MB is again unusable, the
      next zone starts at 128MB, and so on.
      Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  4. 09 Sep 2020, 3 commits
    • f2fs: Return EOF on unaligned end of file DIO read · 20d0a107
      Gabriel Krisman Bertazi authored
      Reading past end of file returns EOF for aligned reads but -EINVAL
      for unaligned reads on f2fs.  While the documentation is not strict
      about this corner case, most filesystems return EOF in this case,
      as iomap filesystems do.  This patch consolidates the behavior for
      f2fs by making it return EOF (0).
      
      It can be verified by a read loop on a file that does a partial read
      before EOF (a file that doesn't end at an aligned address).  The
      following code fails on an unaligned file on f2fs, but not on
      btrfs, ext4, or xfs.
      
        while (done < total) {
          ssize_t delta = pread(fd, buf + done, total - done, off + done);
          if (!delta)
            break;
          ...
        }
      
      It is arguable whether filesystems should actually return EOF or
      -EINVAL, but since iomap filesystems support it, and so does the
      original DIO code, it seems reasonable to consolidate on that.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix indefinite loop scanning for free nid · e2cab031
      Sahitya Tummala authored
      If sbi->ckpt->next_free_nid is not NAT block aligned and there are
      free nids in that NAT block between the start of the block and
      next_free_nid, then those free nids will not be scanned in
      scan_nat_page().  This results in a mismatch between
      nm_i->available_nids and the sum of nm_i->free_nid_count over all
      scanned NAT blocks, with nm_i->available_nids always greater than
      that sum.  Under this condition, if we use up all the currently
      scanned free nids, f2fs_alloc_nid() will loop forever, as
      nm_i->available_nids is still nonzero but nm_i->free_nid_count of
      the partially scanned NAT block is zero.
      
      Fix this to align the nm_i->next_scan_nid to the first nid of the
      corresponding NAT block.
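
      The alignment amounts to roughly this (a sketch using the existing
      NAT macros):

        /* round next_scan_nid down to the first nid of its NAT block so
         * scan_nat_page() sees the whole block */
        nm_i->next_scan_nid = NAT_BLOCK_OFFSET(nm_i->next_scan_nid) *
                                        NAT_ENTRY_PER_BLOCK;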
      Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Fix type of section block count variables · 123aaf77
      Shin'ichiro Kawasaki authored
      Commit da52f8ad ("f2fs: get the right gc victim section when section
      has several segments") added code to count the blocks of each
      section using variables of type 'unsigned short', which is 2 bytes
      on many systems.  However, the counts can exceed the 2-byte range,
      and the type conversion then yields wrong values.  In particular,
      when f2fs sections have exactly USHRT_MAX + 1 blocks, the count
      becomes 0.  This triggers an infinite loop in init_dirty_segmap() at
      mount time.  Fix this by changing the type of the variables to
      block_t.
      
      Fixes: da52f8ad ("f2fs: get the right gc victim section when section has several segments")
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  5. 06 Sep 2020, 4 commits
    • io_uring: fix linked deferred ->files cancellation · c127a2a1
      Pavel Begunkov authored
      While looking for ->files in ->defer_list, consider that requests there
      may actually be links.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix cancel of deferred reqs with ->files · b7ddce3c
      Pavel Begunkov authored
      While trying to cancel requests with ->files, it also should look for
      requests in ->defer_list, otherwise it might end up hanging a thread.
      
      Cancel all requests in ->defer_list up to the last request there with
      matching ->files, that's needed to follow drain ordering semantics.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • xfs: don't update mtime on COW faults · b17164e2
      Mikulas Patocka authored
      When running in DAX mode, if the user maps a page with MAP_PRIVATE
      and PROT_WRITE, the xfs filesystem would incorrectly update ctime
      and mtime when the user hits a COW fault.
      
      This breaks building of the Linux kernel.  How to reproduce:
      
       1. extract the Linux kernel tree on dax-mounted xfs filesystem
       2. run make clean
       3. run make -j12
       4. run make -j12
      
      at step 4, make would incorrectly rebuild the whole kernel (although it
      was already built in step 3).
      
      The reason for the breakage is that almost all object files depend
      on objtool.  When we run objtool, it takes a COW page fault on its
      .data section, and these faults incorrectly update the timestamp of
      the objtool binary.  The updated timestamp causes make to rebuild
      the whole tree.
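
      The fix hinges on a check of this shape (a sketch; the ext2 fix
      below is analogous).  A COW fault happens on a private mapping, so
      it must not count as a timestamp-updating write fault:

        static inline bool is_write_fault(struct vm_fault *vmf)
        {
                /* only shared, writable faults dirty the inode times */
                return (vmf->flags & FAULT_FLAG_WRITE) &&
                       (vmf->vma->vm_flags & VM_SHARED);
        }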
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ext2: don't update mtime on COW faults · 1ef6ea0e
      Mikulas Patocka authored
      When running in DAX mode, if the user maps a page with MAP_PRIVATE
      and PROT_WRITE, the ext2 filesystem would incorrectly update ctime
      and mtime when the user hits a COW fault.
      
      This breaks building of the Linux kernel.  How to reproduce:
      
       1. extract the Linux kernel tree on dax-mounted ext2 filesystem
       2. run make clean
       3. run make -j12
       4. run make -j12
      
      at step 4, make would incorrectly rebuild the whole kernel (although it
      was already built in step 3).
      
      The reason for the breakage is that almost all object files depend
      on objtool.  When we run objtool, it takes a COW page fault on its
      .data section, and these faults incorrectly update the timestamp of
      the objtool binary.  The updated timestamp causes make to rebuild
      the whole tree.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 05 Sep 2020, 2 commits
  7. 03 Sep 2020, 2 commits
  8. 02 Sep 2020, 2 commits
  9. 01 Sep 2020, 1 commit
  10. 31 Aug 2020, 1 commit
    • affs: fix basic permission bits to actually work · d3a84a8d
      Max Staudt authored
      The basic permission bits (protection bits in AmigaOS) have been
      broken in Linux's AFFS: it would only set bits, but never clear
      them.  Also, contrary to the documentation, the Archived bit was not
      handled.
      
      Let's fix this for good, and set the bits such that Linux and classic
      AmigaOS can coexist in the most peaceful manner.
      
      Also, update the documentation to represent the current state of things.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Staudt <max@enpas.org>
      Signed-off-by: David Sterba <dsterba@suse.com>
  11. 29 Aug 2020, 1 commit