1. 06 Apr, 2019 1 commit
    • f2fs: Fix use of number of devices · 0916878d
      Damien Le Moal committed
      For a single device mount using a zoned block device, the zone
      information for the device is stored in the sbi->devs single entry
      array and sbi->s_ndevs is set to 1. This differs from a single device
      mount using a regular block device which does not allocate sbi->devs
      and sets sbi->s_ndevs to 0.
      
      However, the sbi->s_ndevs == 0 condition is used throughout the code to
      differentiate a single device mount from a multi-device mount where
      sbi->s_ndevs is always larger than 1. This results in problems with
      single zoned block device volumes as these are treated as multi-device
      mounts but do not have the start_blk and end_blk information set. One
      of the problems observed is that zone discards are skipped, resulting in
      write commands being issued to full zones or unaligned to a zone write
      pointer.
      
      Fix this problem by simply treating the cases sbi->s_ndevs == 0 (single
      regular block device mount) and sbi->s_ndevs == 1 (single zoned block
      device mount) in the same manner. This is done by introducing the
      helper function f2fs_is_multi_device() and using this helper in place
      of direct tests of sbi->s_ndevs value, improving code readability.
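
      A minimal sketch of the kind of helper described above. The struct
      `sbi_sketch` and the name `sketch_is_multi_device()` are hypothetical
      stand-ins for illustration (only the device count is modeled); this is
      not the kernel's actual definition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's superblock info structure;
 * only the device count is modeled here. */
struct sbi_sketch {
    int s_ndevs; /* 0: single regular device, 1: single zoned device, >1: multi-device */
};

/* Treat s_ndevs == 0 and s_ndevs == 1 the same way:
 * both describe a single-device mount. */
static inline bool sketch_is_multi_device(const struct sbi_sketch *sbi)
{
    return sbi->s_ndevs > 1;
}
```

      Callers would then test the helper instead of comparing the count to
      zero, so a single zoned device is no longer mistaken for a
      multi-device volume.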
      
      Fixes: 7bb3a371 ("f2fs: Fix zoned block device support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 27 Dec, 2018 2 commits
    • f2fs: check PageWriteback flag for ordered case · bae0ee7a
      Chao Yu committed
      For all ordered cases in f2fs_wait_on_page_writeback(), we need to
      check PageWriteback status, so clean up by relocating the check
      into f2fs_wait_on_page_writeback().
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use kvmalloc, if kmalloc is failed · 5222595d
      Jaegeuk Kim committed
      One report says memalloc failure during mount.
      
       (unwind_backtrace) from [<c010cd4c>] (show_stack+0x10/0x14)
       (show_stack) from [<c049c6b8>] (dump_stack+0x8c/0xa0)
       (dump_stack) from [<c024fcf0>] (warn_alloc+0xc4/0x160)
       (warn_alloc) from [<c0250218>] (__alloc_pages_nodemask+0x3f4/0x10d0)
       (__alloc_pages_nodemask) from [<c0270450>] (kmalloc_order_trace+0x2c/0x120)
       (kmalloc_order_trace) from [<c03fa748>] (build_node_manager+0x35c/0x688)
       (build_node_manager) from [<c03de494>] (f2fs_fill_super+0xf0c/0x16cc)
       (f2fs_fill_super) from [<c02a5864>] (mount_bdev+0x15c/0x188)
       (mount_bdev) from [<c03da624>] (f2fs_mount+0x18/0x20)
       (f2fs_mount) from [<c02a68b8>] (mount_fs+0x158/0x19c)
       (mount_fs) from [<c02c3c9c>] (vfs_kern_mount+0x78/0x134)
       (vfs_kern_mount) from [<c02c76ac>] (do_mount+0x474/0xca4)
       (do_mount) from [<c02c8264>] (SyS_mount+0x94/0xbc)
       (SyS_mount) from [<c0108180>] (ret_fast_syscall+0x0/0x48)
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 14 Dec, 2018 1 commit
  4. 27 Nov, 2018 6 commits
  5. 17 Oct, 2018 2 commits
    • f2fs: submit cached bio to avoid endless PageWriteback · 48018b4c
      Chao Yu committed
      When migrating encrypted blocks from the background GC thread, we only
      add them into the f2fs inner bio cache but forget to submit the cached
      bio. This may cause a potential deadlock when we wait for page
      writeback to complete. Fix it.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: checkpoint disabling · 4354994f
      Daniel Rosenberg committed
      Note that this requires "f2fs: return correct errno in f2fs_gc".
      
      This adds a lightweight non-persistent snapshotting scheme to f2fs.
      
      To use, mount with the option checkpoint=disable, and to return to
      normal operation, remount with checkpoint=enable. If the filesystem
      is shut down before remounting with checkpoint=enable, it will revert
      back to its apparent state when it was first mounted with
      checkpoint=disable. This is useful for situations where you wish to be
      able to roll back the state of the disk in case of some critical
      failure.
      Signed-off-by: Daniel Rosenberg <drosen@google.com>
      [Jaegeuk Kim: use SB_RDONLY instead of MS_RDONLY]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 01 Oct, 2018 1 commit
  7. 29 Sep, 2018 2 commits
  8. 20 Sep, 2018 1 commit
  9. 13 Sep, 2018 1 commit
  10. 21 Aug, 2018 2 commits
    • f2fs: readahead encrypted block during GC · 6aa58d8a
      Chao Yu committed
      During GC, for each encrypted block, we read the block synchronously
      into a meta page and then submit it to the current cold data log area.
      
      This block read model with 4k granularity can perform poorly, so, as
      when migrating non-encrypted blocks, let's readahead encrypted blocks
      as well to improve migration performance.
      
      To implement this, we choose the meta page whose index is the old block
      address of the encrypted block and readahead the ciphertext into this
      page. Later, if the readahead page is still up to date, we load its
      data into the target meta page and submit the write IO.
      
      Note that for OPU, truncation, and deletion, we need to invalidate the
      meta page after we invalidate the old block address, to make sure we
      won't load invalid data from the target meta page during encrypted
      block migration.
      
      for ((i = 0; i < 1000; i++))
      do {
              xfs_io -f /mnt/f2fs/dir/$i -c "pwrite 0 128k" -c "fsync";
      } done
      
      for ((i = 0; i < 1000; i+=2))
      do {
              rm /mnt/f2fs/dir/$i;
      } done
      
      ret = ioctl(fd, F2FS_IOC_GARBAGE_COLLECT, 0);
      
      Before:
                    gc-6549  [001] d..1 214682.212797: block_rq_insert: 8,32 RA 32768 () 786400 + 64 [gc]
                    gc-6549  [001] d..1 214682.212802: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.213892: block_bio_queue: 8,32 R 67494144 + 8 [gc]
                    gc-6549  [001] .... 214682.213899: block_getrq: 8,32 R 67494144 + 8 [gc]
                    gc-6549  [001] .... 214682.213902: block_plug: [gc]
                    gc-6549  [001] d..1 214682.213905: block_rq_insert: 8,32 R 4096 () 67494144 + 8 [gc]
                    gc-6549  [001] d..1 214682.213908: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.226405: block_bio_queue: 8,32 R 67494152 + 8 [gc]
                    gc-6549  [001] .... 214682.226412: block_getrq: 8,32 R 67494152 + 8 [gc]
                    gc-6549  [001] .... 214682.226414: block_plug: [gc]
                    gc-6549  [001] d..1 214682.226417: block_rq_insert: 8,32 R 4096 () 67494152 + 8 [gc]
                    gc-6549  [001] d..1 214682.226420: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.226904: block_bio_queue: 8,32 R 67494160 + 8 [gc]
                    gc-6549  [001] .... 214682.226910: block_getrq: 8,32 R 67494160 + 8 [gc]
                    gc-6549  [001] .... 214682.226911: block_plug: [gc]
                    gc-6549  [001] d..1 214682.226914: block_rq_insert: 8,32 R 4096 () 67494160 + 8 [gc]
                    gc-6549  [001] d..1 214682.226916: block_unplug: [gc] 1
      
      After:
                    gc-5678  [003] .... 214327.025906: block_bio_queue: 8,32 R 67493824 + 8 [gc]
                    gc-5678  [003] .... 214327.025908: block_bio_backmerge: 8,32 R 67493824 + 8 [gc]
                    gc-5678  [003] .... 214327.025915: block_bio_queue: 8,32 R 67493832 + 8 [gc]
                    gc-5678  [003] .... 214327.025917: block_bio_backmerge: 8,32 R 67493832 + 8 [gc]
                    gc-5678  [003] .... 214327.025923: block_bio_queue: 8,32 R 67493840 + 8 [gc]
                    gc-5678  [003] .... 214327.025925: block_bio_backmerge: 8,32 R 67493840 + 8 [gc]
                    gc-5678  [003] .... 214327.025932: block_bio_queue: 8,32 R 67493848 + 8 [gc]
                    gc-5678  [003] .... 214327.025934: block_bio_backmerge: 8,32 R 67493848 + 8 [gc]
                    gc-5678  [003] .... 214327.025941: block_bio_queue: 8,32 R 67493856 + 8 [gc]
                    gc-5678  [003] .... 214327.025943: block_bio_backmerge: 8,32 R 67493856 + 8 [gc]
                    gc-5678  [003] .... 214327.025953: block_bio_queue: 8,32 R 67493864 + 8 [gc]
                    gc-5678  [003] .... 214327.025955: block_bio_backmerge: 8,32 R 67493864 + 8 [gc]
                    gc-5678  [003] .... 214327.025962: block_bio_queue: 8,32 R 67493872 + 8 [gc]
                    gc-5678  [003] .... 214327.025964: block_bio_backmerge: 8,32 R 67493872 + 8 [gc]
                    gc-5678  [003] .... 214327.025970: block_bio_queue: 8,32 R 67493880 + 8 [gc]
                    gc-5678  [003] .... 214327.025972: block_bio_backmerge: 8,32 R 67493880 + 8 [gc]
                    gc-5678  [003] .... 214327.026000: block_bio_queue: 8,32 WS 34123776 + 2048 [gc]
                    gc-5678  [003] .... 214327.026019: block_getrq: 8,32 WS 34123776 + 2048 [gc]
                    gc-5678  [003] d..1 214327.026021: block_rq_insert: 8,32 R 131072 () 67493632 + 256 [gc]
                    gc-5678  [003] d..1 214327.026023: block_unplug: [gc] 1
                    gc-5678  [003] d..1 214327.026026: block_rq_issue: 8,32 R 131072 () 67493632 + 256 [gc]
                    gc-5678  [003] .... 214327.026046: block_plug: [gc]
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc · 6f8d4455
      Jaegeuk Kim committed
      f2fs_gc(), when called by f2fs_balance_fs(), must be called outside of
      fi->i_gc_rwsem[WRITE], since f2fs_gc() can try to grab it in a loop.
      
      If it hits the maximum number of retries in GC, let's give a chance to
      release gc_mutex for a short time in order not to go into a livelock in
      the worst case.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  11. 15 Aug, 2018 1 commit
    • f2fs: rework fault injection handling to avoid a warning · 7fa750a1
      Arnd Bergmann committed
      When CONFIG_F2FS_FAULT_INJECTION is disabled, we get a warning about an
      unused label:
      
      fs/f2fs/segment.c: In function '__submit_discard_cmd':
      fs/f2fs/segment.c:1059:1: error: label 'submit' defined but not used [-Werror=unused-label]
      
      This could be fixed by adding another #ifdef around it, but the more
      reliable way of doing this seems to be to remove the other #ifdefs
      where that is easily possible.
      
      By defining time_to_inject() as a trivial stub, most of the checks for
      CONFIG_F2FS_FAULT_INJECTION can go away. This also leads to nicer
      formatting of the code.
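
      The stub approach can be sketched as follows. The names
      `SKETCH_FAULT_INJECTION`, `sketch_time_to_inject()` and
      `sketch_submit()` are hypothetical, and the injection policy in the
      enabled branch is an assumption made only so the example is complete;
      it does not reproduce the f2fs implementation.

```c
#include <stdbool.h>

/* Build with -DSKETCH_FAULT_INJECTION to enable the hypothetical injection path. */
#ifdef SKETCH_FAULT_INJECTION
static unsigned int sketch_inject_count;
static unsigned int sketch_inject_rate = 2; /* assumption: inject on every 2nd call */

static inline bool sketch_time_to_inject(void)
{
    if (++sketch_inject_count >= sketch_inject_rate) {
        sketch_inject_count = 0;
        return true;
    }
    return false;
}
#else
/* Trivial stub: always false, so callers need no #ifdef and the
 * compiler can discard the dead branch. */
static inline bool sketch_time_to_inject(void)
{
    return false;
}
#endif

/* The caller is written once, without preprocessor conditionals or
 * labels that only exist in one configuration. */
static int sketch_submit(void)
{
    if (sketch_time_to_inject())
        return -5; /* -EIO, written numerically to keep the sketch header-free */
    return 0;
}
```

      Because the call site compiles in both configurations, there is no
      label or variable that becomes unused when the option is off.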
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  12. 02 Aug, 2018 1 commit
  13. 29 Jul, 2018 1 commit
    • f2fs: fix to skip GC if type in SSA and SIT is inconsistent · 10d255c3
      Chao Yu committed
      If segment type in SSA and SIT is inconsistent, we will encounter below
      BUG_ON during GC, to avoid this panic, let's just skip doing GC on such
      segment.
      
      The bug is triggered with image reported in below link:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=200223
      
      [  388.060262] ------------[ cut here ]------------
      [  388.060268] kernel BUG at /home/y00370721/git/devf2fs/gc.c:989!
      [  388.061172] invalid opcode: 0000 [#1] SMP
      [  388.061773] Modules linked in: f2fs(O) bluetooth ecdh_generic xt_tcpudp iptable_filter ip_tables x_tables lp ttm drm_kms_helper drm intel_rapl sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel fb_sys_fops ppdev aes_x86_64 syscopyarea crypto_simd sysfillrect parport_pc joydev sysimgblt glue_helper parport cryptd i2c_piix4 serio_raw mac_hid btrfs hid_generic usbhid hid raid6_pq psmouse pata_acpi floppy
      [  388.064247] CPU: 7 PID: 4151 Comm: f2fs_gc-7:0 Tainted: G           O    4.13.0-rc1+ #26
      [  388.065306] Hardware name: Xen HVM domU, BIOS 4.1.2_115-900.260_ 11/06/2015
      [  388.066058] task: ffff880201583b80 task.stack: ffffc90004d7c000
      [  388.069948] RIP: 0010:do_garbage_collect+0xcc8/0xcd0 [f2fs]
      [  388.070766] RSP: 0018:ffffc90004d7fc68 EFLAGS: 00010202
      [  388.071783] RAX: ffff8801ed227000 RBX: 0000000000000001 RCX: ffffea0007b489c0
      [  388.072700] RDX: ffff880000000000 RSI: 0000000000000001 RDI: ffffea0007b489c0
      [  388.073607] RBP: ffffc90004d7fd58 R08: 0000000000000003 R09: ffffea0007b489dc
      [  388.074619] R10: 0000000000000000 R11: 0052782ab317138d R12: 0000000000000018
      [  388.075625] R13: 0000000000000018 R14: ffff880211ceb000 R15: ffff880211ceb000
      [  388.076687] FS:  0000000000000000(0000) GS:ffff880214fc0000(0000) knlGS:0000000000000000
      [  388.083277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  388.084536] CR2: 0000000000e18c60 CR3: 00000001ecf2e000 CR4: 00000000001406e0
      [  388.085748] Call Trace:
      [  388.086690]  ? find_next_bit+0xb/0x10
      [  388.088091]  f2fs_gc+0x1a8/0x9d0 [f2fs]
      [  388.088888]  ? lock_timer_base+0x7d/0xa0
      [  388.090213]  ? try_to_del_timer_sync+0x44/0x60
      [  388.091698]  gc_thread_func+0x342/0x4b0 [f2fs]
      [  388.092892]  ? wait_woken+0x80/0x80
      [  388.094098]  kthread+0x109/0x140
      [  388.095010]  ? f2fs_gc+0x9d0/0x9d0 [f2fs]
      [  388.096043]  ? kthread_park+0x60/0x60
      [  388.097281]  ret_from_fork+0x25/0x30
      [  388.098401] Code: ff ff 48 83 e8 01 48 89 44 24 58 e9 27 f8 ff ff 48 83 e8 01 e9 78 fc ff ff 48 8d 78 ff e9 17 fb ff ff 48 83 ef 01 e9 4d f4 ff ff <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
      [  388.100864] RIP: do_garbage_collect+0xcc8/0xcd0 [f2fs] RSP: ffffc90004d7fc68
      [  388.101810] ---[ end trace 81c73d6e6b7da61d ]---
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  14. 05 Jun, 2018 1 commit
    • f2fs: let sync node IO interrupt async one · c29fd0c0
      Chao Yu committed
      Although mixed sync/async IOs can have contiguous LBAs, they have
      different IO priorities, so the block IO scheduler adds them to
      different queues and commits them separately. This results in split
      IOs, which causes worse performance.
      
      This patch gives high priority to synchronous node IO, meaning that
      once a synchronous flow starts, it can interrupt the asynchronous
      writeback flow of the system flusher, so larger IOs can be expected.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  15. 01 Jun, 2018 8 commits
    • f2fs: clean up symbol namespace · 4d57b86d
      Chao Yu committed
      As Ted reported:
      
      "Hi, I was looking at f2fs's sources recently, and I noticed that there
      is a very large number of non-static symbols which don't have a f2fs
      prefix.  There's well over a hundred (see attached below).
      
      As one example, in fs/f2fs/dir.c there is:
      
      unsigned char get_de_type(struct f2fs_dir_entry *de)
      
      This function is clearly only useful for f2fs, but it has a generic
      name.  This means that if any other file system tries to have the same
      symbol name, there will be a symbol conflict and the kernel would not
      successfully build.  It also means that when someone is looking f2fs
      sources, it's not at all obvious whether a function such as
      read_data_page(), invalidate_blocks(), is a generic kernel function
      found in the fs, mm, or block layers, or a f2fs specific function.
      
      You might want to fix this at some point.  Hopefully Kent's bcachefs
      isn't similarly using genericly named functions, since that might
      cause conflicts with f2fs's functions --- but just as this would be a
      problem that we would rightly insist that Kent fix, this is something
      that we should have rightly insisted that f2fs should have fixed
      before it was integrated into the mainline kernel.
      
      acquire_orphan_inode
      add_ino_entry
      add_orphan_inode
      allocate_data_block
      allocate_new_segments
      alloc_nid
      alloc_nid_done
      alloc_nid_failed
      available_free_memory
      ...."
      
      This patch adds the "f2fs_" prefix to all non-static symbols in order to:
      a) avoid conflicts with other generic kernel symbols;
      b) indicate that a function is f2fs specific rather than generic.
      Reported-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to let caller retry allocating block address · fe16efe6
      Chao Yu committed
      Configure io_bits with 2 and enable LFS mode, generic/013 reports below dmesg:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000104
      *pdpt = 0000000029b7b001 *pde = 0000000000000000
      Oops: 0002 [#1] PREEMPT SMP
      Modules linked in: crc32_generic zram f2fs(O) rfcomm bnep bluetooth ecdh_generic snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq pcbc joydev snd_seq_device aesni_intel snd_timer aes_i586 snd crypto_simd cryptd soundcore i2c_piix4 serio_raw mac_hid video parport_pc ppdev lp parport hid_generic psmouse usbhid hid e1000
      CPU: 0 PID: 11161 Comm: fsstress Tainted: G           O      4.17.0-rc2 #38
      Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      EIP: f2fs_submit_page_write+0x28d/0x550 [f2fs]
      EFLAGS: 00010206 CPU: 0
      EAX: e863dcd8 EBX: 00000000 ECX: 00000100 EDX: 00000200
      ESI: e863dcf4 EDI: f6f82768 EBP: e863dbb0 ESP: e863db74
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      CR0: 80050033 CR2: 00000104 CR3: 29a62020 CR4: 000406f0
      Call Trace:
       do_write_page+0x6f/0xc0 [f2fs]
       write_data_page+0x4a/0xd0 [f2fs]
       do_write_data_page+0x327/0x630 [f2fs]
       __write_data_page+0x34b/0x820 [f2fs]
       __f2fs_write_data_pages+0x42d/0x8c0 [f2fs]
       f2fs_write_data_pages+0x27/0x30 [f2fs]
       do_writepages+0x1a/0x70
       __filemap_fdatawrite_range+0x94/0xd0
       filemap_write_and_wait_range+0x3d/0xa0
       __generic_file_write_iter+0x11a/0x1f0
       f2fs_file_write_iter+0xdd/0x3b0 [f2fs]
       __vfs_write+0xd2/0x150
       vfs_write+0x9b/0x190
       ksys_write+0x45/0x90
       sys_write+0x16/0x20
       do_fast_syscall_32+0xaa/0x22c
       entry_SYSENTER_32+0x4c/0x7b
      EIP: 0xb7fc8c51
      EFLAGS: 00000246 CPU: 0
      EAX: ffffffda EBX: 00000003 ECX: 09cde000 EDX: 00001000
      ESI: 00000003 EDI: 00001000 EBP: 00000000 ESP: bfbded38
       DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      Code: e8 f9 77 34 c9 8b 45 e0 8b 80 b8 00 00 00 39 45 d8 0f 84 bb 02 00 00 8b 45 e0 8b 80 b8 00 00 00 8d 50 d8 8b 08 89 55 f0 8b 50 04 <89> 51 04 89 0a c7 00 00 01 00 00 c7 40 04 00 02 00 00 8b 45 dc
      EIP: f2fs_submit_page_write+0x28d/0x550 [f2fs] SS:ESP: 0068:e863db74
      CR2: 0000000000000104
      ---[ end trace 4cac79c0d1305ee6 ]---
      
      allocate_data_block will submit all sequential pending IOs sorted by a
      FIFO list. If we fail to submit another user's IO due to an unaligned
      write, we retry allocating a new block address for the current IO, which
      initializes fio.list again. If the fio was already in the list, this can
      break the FIFO list and result in the panic above.
      
      Thread A			Thread B
      - do_write_page
       - allocate_data_block
        - list_add_tail
        : fioA cached in FIFO list.
      				- do_write_page
      				 - allocate_data_block
      				  - list_add_tail
      				  : fioB cached in FIFO list.
      				 - f2fs_submit_page_write
      				 : fail to submit IO
      				 - allocate_data_block
      				  - INIT_LIST_HEAD
       - f2fs_submit_page_write
        - list_del  <-- NULL pointer dereference
      
      This patch adds fio.retry parameter to indicate failure status for each
      IO, and avoid bailing out if there is still pending IO in FIFO list for
      fixing.
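
      The failure mode in the diagram (re-initializing a list node that is
      still queued) can be illustrated with a small user-space sketch. The
      `sketch_list` type and helpers below are hypothetical and only mimic a
      circular doubly linked list; they show how the list becomes
      inconsistent, not how the patch repairs it.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, written for this illustration. */
struct sketch_list {
    struct sketch_list *next, *prev;
};

/* Point the node at itself, as an "init list head" operation does. */
static void sketch_init(struct sketch_list *n)
{
    n->next = n;
    n->prev = n;
}

/* Append node n just before head, i.e. at the tail of the FIFO. */
static void sketch_add_tail(struct sketch_list *n, struct sketch_list *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* A node is consistently linked if both neighbours point back at it. */
static bool sketch_consistent(const struct sketch_list *n)
{
    return n->next->prev == n && n->prev->next == n;
}
```

      If two nodes are appended to a head and the first is then passed to
      `sketch_init()` again while still linked, the head still points at that
      node but the node no longer points back, so a later removal operates on
      stale neighbour pointers.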
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix error path of move_data_page · 14a28559
      Chao Yu committed
      This patch fixes error path of move_data_page:
      - clear cold data flag if it fails to write page.
      - redirty page for non-ENOMEM case.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid stucking GC due to atomic write · 2ef79ecb
      Chao Yu committed
      f2fs doesn't allow abuse of the atomic write class interface, so besides
      limiting the total memory usage of in-mem pages, we need to limit
      atomic-write usage as well when the filesystem is seriously fragmented.
      Otherwise we may run into an infinite loop during foreground GC because
      target blocks in the victim segment belong to a file that has been
      atomically opened for a long time.
      
      Now, we detect failures due to atomic write in foreground GC. If the
      count exceeds a threshold, we drop all atomically written data in the
      cache. With this, I expect the system can keep running safely and
      prevent DoS attacks.
      
      In addition, this patch shows GC skip information in debugfs; for now
      it just shows the count of skips caused by atomic write.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce sbi->gc_mode to determine the policy · 5b0e9539
      Jaegeuk Kim committed
      This is to avoid sbi->gc_thread pointer access.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: keep migration IO order in LFS mode · 107a805d
      Chao Yu committed
      For non-migration IO, we keep the submission order of data/node blocks
      the same as the allocation sequence by sorting IOs in the per-log
      io_list list, but migration IO could be out of order.
      
      In LFS mode, we should keep all IOs, including migration IO, ordered,
      so this patch adds an additional lock to keep the submission order.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • Revert "f2fs: add ovp valid_blocks check for bg gc victim to fg_gc" · 299254d8
      Chao Yu committed
      For extreme case:
      10 section, op = 10%, no_fggc_threshold = 90%
      All section usage: 85% 85% 85% 85% 90% 90% 95% 95% 95% 95%
      
      During foreground GC, if we skip selecting dirty sections whose usage
      is larger than no_fggc_threshold, we can only recycle 80% invalid
      space from four 85% usage sections and two 90% usage sections,
      which results in an out-of-space issue.
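
      The arithmetic of this extreme case can be checked with a short sketch.
      The function `reclaimable_percent()` and the array `sketch_usage` are
      hypothetical names for illustration; values are percentages of one
      section, taken from the example above.

```c
#include <stddef.h>

/* Section usage from the example above, in percent. */
static const int sketch_usage[10] = { 85, 85, 85, 85, 90, 90, 95, 95, 95, 95 };

/* Sum the invalid space (in percent of one section) over all sections
 * whose usage does not exceed the given threshold. */
static int reclaimable_percent(const int *usage, size_t n, int threshold)
{
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        if (usage[i] <= threshold)
            total += 100 - usage[i];
    }
    return total;
}
```

      With a threshold of 90 the four 85% sections contribute 4 x 15 and the
      two 90% sections contribute 2 x 10, giving 80. Without the threshold
      (a limit of 100) the four 95% sections add 4 x 5, giving 100, which is
      one full section.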
      
      This reverts commit e93b9865 to
      fix this issue, besides, we keep the logic that we scan all dirty
      section when searching a victim, so that GC can select victim with
      least valid blocks.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: rename dio_rwsem to i_gc_rwsem · b2532c69
      Chao Yu committed
      The RW semaphore dio_rwsem in struct f2fs_inode_info was introduced to
      avoid a race between dio and data gc, but it is now more widely used to
      avoid races between foreground operations and data gc. So rename it to
      i_gc_rwsem to improve readability.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  16. 03 May, 2018 2 commits
    • f2fs: clear PageError on writepage · 17c50035
      Jaegeuk Kim committed
      This patch clears PageError on pages that were tagged by the read path.
      When we write such pages with valid contents, writepage should clear the
      bit, as ext4 does.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: refactor read path to allow multiple postprocessing steps · 6dbb1796
      Eric Biggers committed
      Currently f2fs's ->readpage() and ->readpages() assume that either the
      data undergoes no postprocessing, or decryption only.  But with
      fs-verity, there will be an additional authenticity verification step,
      and it may be needed either by itself, or combined with decryption.
      
      To support this, store a 'struct bio_post_read_ctx' in ->bi_private
      which contains a work struct, a bitmask of postprocessing steps that are
      enabled, and an indicator of the current step.  The bio completion
      routine, if there was no I/O error, enqueues the first postprocessing
      step.  When that completes, it continues to the next step.  Pages that
      fail any postprocessing step have PageError set.  Once all steps have
      completed, pages without PageError set are set Uptodate, and all pages
      are unlocked.
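
      The "bitmask of enabled steps plus current step" idea can be sketched
      as a small state machine. All names below (`sketch_step`,
      `sketch_post_read_ctx`, `sketch_next_step()`) are hypothetical; the
      work struct and the actual page handling are omitted, so this shows
      only the step sequencing.

```c
#include <stdbool.h>

/* Hypothetical step identifiers; the names are illustrative only. */
enum sketch_step {
    SKETCH_STEP_INITIAL = 0,
    SKETCH_STEP_DECRYPT,
    SKETCH_STEP_VERITY,
    SKETCH_STEP_DONE,
};

struct sketch_post_read_ctx {
    unsigned int enabled_steps;  /* bitmask of 1u << step */
    enum sketch_step cur_step;   /* step most recently started */
};

/* Advance to the next enabled step, or SKETCH_STEP_DONE if none remain. */
static enum sketch_step sketch_next_step(struct sketch_post_read_ctx *ctx)
{
    enum sketch_step s = ctx->cur_step;

    while (s < SKETCH_STEP_DONE) {
        s = (enum sketch_step)(s + 1);
        if (s == SKETCH_STEP_DONE || (ctx->enabled_steps & (1u << s)))
            break;
    }
    ctx->cur_step = s;
    return s;
}
```

      A completion routine built on this would call `sketch_next_step()`
      after each step finishes and mark pages up to date only once it
      returns `SKETCH_STEP_DONE`.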
      
      Also replace f2fs_encrypted_file() with a new function
      f2fs_post_read_required() in places like direct I/O and garbage
      collection that really should be testing whether the file needs special
      I/O processing, not whether it is encrypted specifically.
      
      This may also be useful for other future f2fs features such as
      compression.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  17. 12 Apr, 2018 1 commit
  18. 13 Mar, 2018 3 commits
  19. 23 Jan, 2018 2 commits
    • f2fs: avoid hungtask when GC encrypted block if io_bits is set · a9d572c7
      Sheng Yong committed
      When io_bits is set, GCing an encrypted block may hit the following hung
      task. Since io_bits requires aligned block addresses,
      f2fs_submit_page_write may return -EAGAIN if new_blkaddr does not satisfy
      the io_bits alignment. As a result, the encrypted page will never be
      written back.
      
      This patch makes move_data_block aware of the EAGAIN error and cancels
      the writeback.
      
      [  246.751371] INFO: task kworker/u4:4:797 blocked for more than 90 seconds.
      [  246.752423]       Not tainted 4.15.0-rc4+ #11
      [  246.754176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.755336] kworker/u4:4    D25448   797      2 0x80000000
      [  246.755597] Workqueue: writeback wb_workfn (flush-7:0)
      [  246.755616] Call Trace:
      [  246.755695]  ? __schedule+0x322/0xa90
      [  246.755761]  ? blk_init_request_from_bio+0x120/0x120
      [  246.755773]  ? pci_mmcfg_check_reserved+0xb0/0xb0
      [  246.755801]  ? __radix_tree_create+0x19e/0x200
      [  246.755813]  ? delete_node+0x136/0x370
      [  246.755838]  schedule+0x43/0xc0
      [  246.755904]  io_schedule+0x17/0x40
      [  246.755939]  wait_on_page_bit_common+0x17b/0x240
      [  246.755950]  ? wake_page_function+0xa0/0xa0
      [  246.755961]  ? add_to_page_cache_lru+0x160/0x160
      [  246.755972]  ? page_cache_tree_insert+0x170/0x170
      [  246.755983]  ? __lru_cache_add+0x96/0xb0
      [  246.756086]  __filemap_fdatawait_range+0x14f/0x1c0
      [  246.756097]  ? wait_on_page_bit_common+0x240/0x240
      [  246.756120]  ? __wake_up_locked_key_bookmark+0x20/0x20
      [  246.756167]  ? wait_on_all_pages_writeback+0xc9/0x100
      [  246.756179]  ? __remove_ino_entry+0x120/0x120
      [  246.756192]  ? wait_woken+0x100/0x100
      [  246.756204]  filemap_fdatawait_range+0x9/0x20
      [  246.756216]  write_checkpoint+0x18a1/0x1f00
      [  246.756254]  ? blk_get_request+0x10/0x10
      [  246.756265]  ? cpumask_next_and+0x43/0x60
      [  246.756279]  ? f2fs_sync_inode_meta+0x160/0x160
      [  246.756289]  ? remove_element.isra.4+0xa0/0xa0
      [  246.756300]  ? __put_compound_page+0x40/0x40
      [  246.756310]  ? f2fs_sync_fs+0xec/0x1c0
      [  246.756320]  ? f2fs_sync_fs+0x120/0x1c0
      [  246.756329]  f2fs_sync_fs+0x120/0x1c0
      [  246.756357]  ? trace_event_raw_event_f2fs__page+0x260/0x260
      [  246.756393]  ? ata_build_rw_tf+0x173/0x410
      [  246.756397]  f2fs_balance_fs_bg+0x198/0x390
      [  246.756405]  ? drop_inmem_page+0x230/0x230
      [  246.756415]  ? ahci_qc_prep+0x1bb/0x2e0
      [  246.756418]  ? ahci_qc_issue+0x1df/0x290
      [  246.756422]  ? __accumulate_pelt_segments+0x42/0xd0
      [  246.756426]  ? f2fs_write_node_pages+0xd1/0x380
      [  246.756429]  f2fs_write_node_pages+0xd1/0x380
      [  246.756437]  ? sync_node_pages+0x8f0/0x8f0
      [  246.756440]  ? update_curr+0x53/0x220
      [  246.756444]  ? __accumulate_pelt_segments+0xa2/0xd0
      [  246.756448]  ? __update_load_avg_se.isra.39+0x349/0x360
      [  246.756452]  ? do_writepages+0x2a/0xa0
      [  246.756456]  do_writepages+0x2a/0xa0
      [  246.756460]  __writeback_single_inode+0x70/0x490
      [  246.756463]  ? check_preempt_wakeup+0x199/0x310
      [  246.756467]  writeback_sb_inodes+0x2a2/0x660
      [  246.756471]  ? is_empty_dir_inode+0x40/0x40
      [  246.756474]  ? __writeback_single_inode+0x490/0x490
      [  246.756477]  ? string+0xbf/0xf0
      [  246.756480]  ? down_read_trylock+0x35/0x60
      [  246.756484]  __writeback_inodes_wb+0x9f/0xf0
      [  246.756488]  wb_writeback+0x41d/0x4b0
      [  246.756492]  ? writeback_inodes_wb.constprop.55+0x150/0x150
      [  246.756498]  ? set_worker_desc+0xf7/0x130
      [  246.756502]  ? current_is_workqueue_rescuer+0x60/0x60
      [  246.756511]  ? _find_next_bit+0x2c/0xa0
      [  246.756514]  ? wb_workfn+0x400/0x5d0
      [  246.756518]  wb_workfn+0x400/0x5d0
      [  246.756521]  ? finish_task_switch+0xdf/0x2a0
      [  246.756525]  ? inode_wait_for_writeback+0x30/0x30
      [  246.756529]  process_one_work+0x3a7/0x6f0
      [  246.756533]  worker_thread+0x82/0x750
      [  246.756537]  kthread+0x16f/0x1c0
      [  246.756541]  ? trace_event_raw_event_workqueue_work+0x110/0x110
      [  246.756544]  ? kthread_create_worker_on_cpu+0xb0/0xb0
      [  246.756548]  ret_from_fork+0x1f/0x30
      Signed-off-by: Sheng Yong <shengyong1@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add an ioctl to disable GC for specific file · 1ad71a27
      Jaegeuk Kim committed
      This patch adds a flag to disable GC on a given file, which is useful
      when the user wants to keep its block map. It also performs in-place
      updates for the dontmove file.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  20. 28 Nov, 2017 1 commit
    • Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds committed
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>