1. 06 9月, 2018 2 次提交
    • S
      f2fs: fix unnecessary periodic wakeup of discard thread when dev is busy · abde73c7
      Sahitya Tummala 提交于
      When dev is busy, discard thread wake up timeout can be aligned with the
      exact time that it needs to wait for dev to come out of busy. This helps
      to avoid unnecessary periodic wakeups and thus save some power.
      Signed-off-by: NSahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      abde73c7
    • C
      f2fs: fix to avoid NULL pointer dereference on se->discard_map · 7d20c8ab
      Chao Yu 提交于
      https://bugzilla.kernel.org/show_bug.cgi?id=200951
      
      These is a NULL pointer dereference issue reported in bugzilla:
      
      Hi,
      in the setup there is a SATA SSD connected to a SATA-to-USB bridge.
      
      The disc is "Samsung SSD 850 PRO 256G" which supports TRIM.
      There are four partitions:
       sda1: FAT  /boot
       sda2: F2FS /
       sda3: F2FS /home
       sda4: F2FS
      
      The bridge is ASMT1153e which uses the "uas" driver.
      There is no TRIM pass-through, so, when mounting it reports:
       mounting with "discard" option, but the device does not support discard
      
      The USB host is USB3.0 and UASP capable. It is the one on RK3399.
      
      Given this everything works fine, except there is no TRIM support.
      
      In order to enable TRIM a new UDEV rule is added [1]:
       /etc/udev/rules.d/10-sata-bridge-trim.rules:
       ACTION=="add|change", ATTRS{idVendor}=="174c", ATTRS{idProduct}=="55aa", SUBSYSTEM=="scsi_disk", ATTR{provisioning_mode}="unmap"
      After reboot any F2FS write hangs forever and dmesg reports:
       Unable to handle kernel NULL pointer dereference
      
      Also tested on a x86_64 system: works fine even with TRIM enabled.
       same disc
       same bridge
       different usb host controller
       different cpu architecture
       not root filesystem
      
      Regards,
        Vicenç.
      
      [1] Post #5 in https://bbs.archlinux.org/viewtopic.php?id=236280
      
       Unable to handle kernel NULL pointer dereference at virtual address 000000000000003e
       Mem abort info:
         ESR = 0x96000004
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000004
         CM = 0, WnR = 0
       user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000626e3122
       [000000000000003e] pgd=0000000000000000
       Internal error: Oops: 96000004 [#1] SMP
       Modules linked in: overlay snd_soc_hdmi_codec rc_cec dw_hdmi_i2s_audio dw_hdmi_cec snd_soc_simple_card snd_soc_simple_card_utils snd_soc_rockchip_i2s rockchip_rga snd_soc_rockchip_pcm rockchipdrm videobuf2_dma_sg v4l2_mem2mem rtc_rk808 videobuf2_memops analogix_dp videobuf2_v4l2 videobuf2_common dw_hdmi dw_wdt cec rc_core videodev drm_kms_helper media drm rockchip_thermal rockchip_saradc realtek drm_panel_orientation_quirks syscopyarea sysfillrect sysimgblt fb_sys_fops dwmac_rk stmmac_platform stmmac pwm_bl squashfs loop crypto_user gpio_keys hid_kensington
       CPU: 5 PID: 957 Comm: nvim Not tainted 4.19.0-rc1-1-ARCH #1
       Hardware name: Sapphire-RK3399 Board (DT)
       pstate: 00000005 (nzcv daif -PAN -UAO)
       pc : update_sit_entry+0x304/0x4b0
       lr : update_sit_entry+0x108/0x4b0
       sp : ffff00000ca13bd0
       x29: ffff00000ca13bd0 x28: 000000000000003e
       x27: 0000000000000020 x26: 0000000000080000
       x25: 0000000000000048 x24: ffff8000ebb85cf8
       x23: 0000000000000253 x22: 00000000ffffffff
       x21: 00000000000535f2 x20: 00000000ffffffdf
       x19: ffff8000eb9e6800 x18: ffff8000eb9e6be8
       x17: 0000000007ce6926 x16: 000000001c83ffa8
       x15: 0000000000000000 x14: ffff8000f602df90
       x13: 0000000000000006 x12: 0000000000000040
       x11: 0000000000000228 x10: 0000000000000000
       x9 : 0000000000000000 x8 : 0000000000000000
       x7 : 00000000000535f2 x6 : ffff8000ebff3440
       x5 : ffff8000ebff3440 x4 : ffff8000ebe3a6c8
       x3 : 00000000ffffffff x2 : 0000000000000020
       x1 : 0000000000000000 x0 : ffff8000eb9e5800
       Process nvim (pid: 957, stack limit = 0x0000000063a78320)
       Call trace:
        update_sit_entry+0x304/0x4b0
        f2fs_invalidate_blocks+0x98/0x140
        truncate_node+0x90/0x400
        f2fs_remove_inode_page+0xe8/0x340
        f2fs_evict_inode+0x2b0/0x408
        evict+0xe0/0x1e0
        iput+0x160/0x260
        do_unlinkat+0x214/0x298
        __arm64_sys_unlinkat+0x3c/0x68
        el0_svc_handler+0x94/0x118
        el0_svc+0x8/0xc
       Code: f9400800 b9488400 36080140 f9400f01 (387c4820)
       ---[ end trace a0f21a307118c477 ]---
      
      The reason is it is possible to enable discard flag on block queue via
      UDEV, but during mount, f2fs will initialize se->discard_map only if
      this flag is set, once the flag is set after mount, f2fs may dereference
      NULL pointer on se->discard_map.
      
      So this patch does below changes to fix this issue:
      - initialize and update se->discard_map all the time.
      - don't clear DISCARD option if device has no QUEUE_FLAG_DISCARD flag
      during mount.
      - don't issue small discard on zoned block device.
      - introduce some functions to enhance the readability.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Tested-by: NVicente Bergas <vicencb@gmail.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7d20c8ab
  2. 21 8月, 2018 3 次提交
    • C
      f2fs: readahead encrypted block during GC · 6aa58d8a
      Chao Yu 提交于
      During GC, for each encrypted block, we will read block synchronously
      into meta page, and then submit it into current cold data log area.
      
      So this block read model with 4k granularity can make poor performance,
      like migrating non-encrypted block, let's readahead encrypted block
      as well to improve migration performance.
      
      To implement this, we choose meta page that its index is old block
      address of the encrypted block, and readahead ciphertext into this
      page, later, if readaheaded page is still updated, we will load its
      data into target meta page, and submit the write IO.
      
      Note that for OPU, truncation, deletion, we need to invalid meta
      page after we invalid old block address, to make sure we won't load
      invalid data from target meta page during encrypted block migration.
      
      for ((i = 0; i < 1000; i++))
      do {
              xfs_io -f /mnt/f2fs/dir/$i -c "pwrite 0 128k" -c "fsync";
      } done
      
      for ((i = 0; i < 1000; i+=2))
      do {
              rm /mnt/f2fs/dir/$i;
      } done
      
      ret = ioctl(fd, F2FS_IOC_GARBAGE_COLLECT, 0);
      
      Before:
                    gc-6549  [001] d..1 214682.212797: block_rq_insert: 8,32 RA 32768 () 786400 + 64 [gc]
                    gc-6549  [001] d..1 214682.212802: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.213892: block_bio_queue: 8,32 R 67494144 + 8 [gc]
                    gc-6549  [001] .... 214682.213899: block_getrq: 8,32 R 67494144 + 8 [gc]
                    gc-6549  [001] .... 214682.213902: block_plug: [gc]
                    gc-6549  [001] d..1 214682.213905: block_rq_insert: 8,32 R 4096 () 67494144 + 8 [gc]
                    gc-6549  [001] d..1 214682.213908: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.226405: block_bio_queue: 8,32 R 67494152 + 8 [gc]
                    gc-6549  [001] .... 214682.226412: block_getrq: 8,32 R 67494152 + 8 [gc]
                    gc-6549  [001] .... 214682.226414: block_plug: [gc]
                    gc-6549  [001] d..1 214682.226417: block_rq_insert: 8,32 R 4096 () 67494152 + 8 [gc]
                    gc-6549  [001] d..1 214682.226420: block_unplug: [gc] 1
                    gc-6549  [001] .... 214682.226904: block_bio_queue: 8,32 R 67494160 + 8 [gc]
                    gc-6549  [001] .... 214682.226910: block_getrq: 8,32 R 67494160 + 8 [gc]
                    gc-6549  [001] .... 214682.226911: block_plug: [gc]
                    gc-6549  [001] d..1 214682.226914: block_rq_insert: 8,32 R 4096 () 67494160 + 8 [gc]
                    gc-6549  [001] d..1 214682.226916: block_unplug: [gc] 1
      
      After:
                    gc-5678  [003] .... 214327.025906: block_bio_queue: 8,32 R 67493824 + 8 [gc]
                    gc-5678  [003] .... 214327.025908: block_bio_backmerge: 8,32 R 67493824 + 8 [gc]
                    gc-5678  [003] .... 214327.025915: block_bio_queue: 8,32 R 67493832 + 8 [gc]
                    gc-5678  [003] .... 214327.025917: block_bio_backmerge: 8,32 R 67493832 + 8 [gc]
                    gc-5678  [003] .... 214327.025923: block_bio_queue: 8,32 R 67493840 + 8 [gc]
                    gc-5678  [003] .... 214327.025925: block_bio_backmerge: 8,32 R 67493840 + 8 [gc]
                    gc-5678  [003] .... 214327.025932: block_bio_queue: 8,32 R 67493848 + 8 [gc]
                    gc-5678  [003] .... 214327.025934: block_bio_backmerge: 8,32 R 67493848 + 8 [gc]
                    gc-5678  [003] .... 214327.025941: block_bio_queue: 8,32 R 67493856 + 8 [gc]
                    gc-5678  [003] .... 214327.025943: block_bio_backmerge: 8,32 R 67493856 + 8 [gc]
                    gc-5678  [003] .... 214327.025953: block_bio_queue: 8,32 R 67493864 + 8 [gc]
                    gc-5678  [003] .... 214327.025955: block_bio_backmerge: 8,32 R 67493864 + 8 [gc]
                    gc-5678  [003] .... 214327.025962: block_bio_queue: 8,32 R 67493872 + 8 [gc]
                    gc-5678  [003] .... 214327.025964: block_bio_backmerge: 8,32 R 67493872 + 8 [gc]
                    gc-5678  [003] .... 214327.025970: block_bio_queue: 8,32 R 67493880 + 8 [gc]
                    gc-5678  [003] .... 214327.025972: block_bio_backmerge: 8,32 R 67493880 + 8 [gc]
                    gc-5678  [003] .... 214327.026000: block_bio_queue: 8,32 WS 34123776 + 2048 [gc]
                    gc-5678  [003] .... 214327.026019: block_getrq: 8,32 WS 34123776 + 2048 [gc]
                    gc-5678  [003] d..1 214327.026021: block_rq_insert: 8,32 R 131072 () 67493632 + 256 [gc]
                    gc-5678  [003] d..1 214327.026023: block_unplug: [gc] 1
                    gc-5678  [003] d..1 214327.026026: block_rq_issue: 8,32 R 131072 () 67493632 + 256 [gc]
                    gc-5678  [003] .... 214327.026046: block_plug: [gc]
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6aa58d8a
    • J
      f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc · 6f8d4455
      Jaegeuk Kim 提交于
      The f2fs_gc() called by f2fs_balance_fs() requires to be called outside of
      fi->i_gc_rwsem[WRITE], since f2fs_gc() can try to grab it in a loop.
      
      If it hits the miximum retrials in GC, let's give a chance to release
      gc_mutex for a short time in order not to go into live lock in the worst
      case.
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6f8d4455
    • J
      f2fs: fix performance issue observed with multi-thread sequential read · 853137ce
      Jaegeuk Kim 提交于
      This reverts the commit - "b93f7712 - f2fs: remove writepages lock"
      to fix the drop in sequential read throughput.
      
      Test: ./tiotest -t 32 -d /data/tio_tmp -f 32 -b 524288 -k 1 -k 3 -L
      device: UFS
      
      Before -
      read throughput: 185 MB/s
      total read requests: 85177 (of these ~80000 are 4KB size requests).
      total write requests: 2546 (of these ~2208 requests are written in 512KB).
      
      After -
      read throughput: 758 MB/s
      total read requests: 2417 (of these ~2042 are 512KB reads).
      total write requests: 2701 (of these ~2034 requests are written in 512KB).
      Signed-off-by: NSahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      853137ce
  3. 15 8月, 2018 1 次提交
    • A
      f2fs: rework fault injection handling to avoid a warning · 7fa750a1
      Arnd Bergmann 提交于
      When CONFIG_F2FS_FAULT_INJECTION is disabled, we get a warning about an
      unused label:
      
      fs/f2fs/segment.c: In function '__submit_discard_cmd':
      fs/f2fs/segment.c:1059:1: error: label 'submit' defined but not used [-Werror=unused-label]
      
      This could be fixed by adding another #ifdef around it, but the more
      reliable way of doing this seems to be to remove the other #ifdefs
      where that is easily possible.
      
      By defining time_to_inject() as a trivial stub, most of the checks for
      CONFIG_F2FS_FAULT_INJECTION can go away. This also leads to nicer
      formatting of the code.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7fa750a1
  4. 14 8月, 2018 5 次提交
    • C
      f2fs: fix to return success when trimming meta area · 3f16ecd9
      Chao Yu 提交于
      generic/251
          --- tests/generic/251.out	2016-05-03 20:20:11.381899000 +0800
           QA output created by 251
           Running the test: done.
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          +fstrim: /mnt/scratch_f2fs: FITRIM ioctl failed: Invalid argument
          ...
      Ran: generic/251
      Failures: generic/251
      
      The reason is coverage of fstrim locates in meta area, previously we
      just return -EINVAL for such case, making generic/251 failed, to fix
      this problem, let's relieve restriction to return success with no
      block discarded.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      3f16ecd9
    • C
      f2fs: fix use-after-free of dicard command entry · 6b9cb124
      Chao Yu 提交于
      As Dan Carpenter reported:
      
      The patch 20ee4382: "f2fs: issue small discard by LBA order" from
      Jul 8, 2018, leads to the following Smatch warning:
      
      	fs/f2fs/segment.c:1277 __issue_discard_cmd_orderly()
      	warn: 'dc' was already freed.
      
      See also:
      fs/f2fs/segment.c:2550 __issue_discard_cmd_range() warn: 'dc' was already freed.
      
      In order to fix this issue, let's get error from __submit_discard_cmd(),
      and release current discard command after we referenced next one.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6b9cb124
    • C
      f2fs: support discard submission error injection · b83dcfe6
      Chao Yu 提交于
      This patch adds to support discard submission error injection for testing
      error handling of __submit_discard_cmd().
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      b83dcfe6
    • C
      f2fs: split discard command in prior to block layer · 35ec7d57
      Chao Yu 提交于
      Some devices has small max_{hw,}discard_sectors, so that in
      __blkdev_issue_discard(), one big size discard bio can be split
      into multiple small size discard bios, result in heavy load in IO
      scheduler and device, which can hang other sync IO for long time.
      
      Now, f2fs is trying to control discard commands more elaboratively,
      in order to make less conflict in between discard IO and user IO
      to enhance application's performance, so in this patch, we will
      split discard bio in f2fs in prior to in block layer to reduce
      issuing multiple discard bios in a short time.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      35ec7d57
    • C
      f2fs: fix incorrect range->len in f2fs_trim_fs() · 6eae2694
      Chao Yu 提交于
      generic/260 reported below error:
      
           [+] Default length with start set (should succeed)
           [+] Length beyond the end of fs (should succeed)
           [+] Length beyond the end of fs with start set (should succeed)
          +./tests/generic/260: line 94: [: 18446744073709551615: integer expression expected
          +./tests/generic/260: line 104: [: 18446744073709551615: integer expression expected
           Test done
          ...
      
      In f2fs_trim_fs(), if there is no discard being trimmed, we need to correct
      range->len before return.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      6eae2694
  5. 02 8月, 2018 7 次提交
    • C
      f2fs: let checkpoint flush dnode page of regular · fd8c8caf
      Chao Yu 提交于
      Fsyncer will wait on all dnode pages of regular writeback before flushing,
      if there are async dnode pages blocked by IO scheduler, it may decrease
      fsync's performance.
      
      In this patch, we choose to let f2fs_balance_fs_bg() to trigger checkpoint
      to flush these dnode pages of regular, so async IO of dnode page can be
      elimitnated, making fsyncer only need to wait for sync IO.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fd8c8caf
    • Y
      f2fs: issue discard align to section in LFS mode · ad6672bb
      Yunlong Song 提交于
      For the case when sbi->segs_per_sec > 1 with lfs mode, take
      section:segment = 5 for example, if the section prefree_map is
      ...previous section | current section (1 1 0 1 1) | next section...,
      then the start = x, end = x + 1, after start = start_segno +
      sbi->segs_per_sec, start = x + 5, then it will skip x + 3 and x + 4, but
      their bitmap is still set, which will cause duplicated
      f2fs_issue_discard of this same section in the next write_checkpoint:
      
      round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
      then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
      write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
      prefree of NO.2 is cleared, and no discard issued
      
      round 2: rm data block NO.0, NO.1, NO.3, NO.4
      all invalid, but prefree bit of NO.2 is set and cleared in round 1, then
      prefree_map: 1 1 0 1 1
      write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no
      valid blocks of this section, so discard issued, but this time prefree
      bit of NO.3 and NO.4 is skipped due to start = start_segno + sbi->segs_per_sec;
      
      round 3:
      write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 ->
      0 0 0 0 0, no valid blocks of this section, so discard issued,
      this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
      this section is sent again...
      
      To fix this problem, we can align the start and end value to section
      boundary for fstrim and real-time discard operation, and decide to issue
      discard only when the whole section is invalid, which can issue discard
      aligned to section size as much as possible and avoid redundant discard.
      Signed-off-by: NYunlong Song <yunlong.song@huawei.com>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      ad6672bb
    • C
      f2fs: clean up with f2fs_is_{atomic,volatile}_file() · 2079f115
      Chao Yu 提交于
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2079f115
    • C
      f2fs: fix to propagate error from __get_meta_page() · 7735730d
      Chao Yu 提交于
      If caller of __get_meta_page() can handle error, let's propagate error
      from __get_meta_page().
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7735730d
    • C
      f2fs: issue small discard by LBA order · 20ee4382
      Chao Yu 提交于
      For small granularity discard which size is smaller than 64KB, if we
      issue those kind of discards orderly by size, their IOs will be spread
      into entire logical address, so that in FTL, L2P table will be updated
      randomly, result bad wear rate in the table.
      
      In this patch, we choose to issue small discard by LBA order, by this
      way, we can expect that L2P table updates from adjacent discard IOs can
      be merged in the cache, so it can reduce lifetime wearing of flash.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      20ee4382
    • C
      f2fs: stop issuing discard immediately if there is queued IO · 522d1711
      Chao Yu 提交于
      For background discard policy, even if there is queued user IO, still
      we will check max_requests times for next discard entry, it is unneeded,
      let's just stop this round submission immediately.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      522d1711
    • C
      f2fs: detect bug_on in f2fs_wait_discard_bios · 2482c432
      Chao Yu 提交于
      Add bug_on to detect potential non-empty discard wait list.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2482c432
  6. 27 7月, 2018 4 次提交
  7. 13 6月, 2018 2 次提交
    • K
      treewide: Use array_size in f2fs_kvzalloc() · 9d2a789c
      Kees Cook 提交于
      The f2fs_kvzalloc() function has no 2-factor argument form, so
      multiplication factors need to be wrapped in array_size(). This patch
      replaces cases of:
      
              f2fs_kvzalloc(handle, a * b, gfp)
      
      with:
              f2fs_kvzalloc(handle, array_size(a, b), gfp)
      
      as well as handling cases of:
      
              f2fs_kvzalloc(handle, a * b * c, gfp)
      
      with:
      
              f2fs_kvzalloc(handle, array3_size(a, b, c), gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              f2fs_kvzalloc(handle, 4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      expression HANDLE;
      type TYPE;
      expression THING, E;
      @@
      
      (
        f2fs_kvzalloc(HANDLE,
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression HANDLE;
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        f2fs_kvzalloc(HANDLE,
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      expression HANDLE;
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT_ID)
      +	array_size(COUNT_ID, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT_ID
      +	array_size(COUNT_ID, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT_CONST)
      +	array_size(COUNT_CONST, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT_CONST
      +	array_size(COUNT_CONST, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * (COUNT_ID)
      +	array_size(COUNT_ID, sizeof(THING))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * COUNT_ID
      +	array_size(COUNT_ID, sizeof(THING))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * (COUNT_CONST)
      +	array_size(COUNT_CONST, sizeof(THING))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * COUNT_CONST
      +	array_size(COUNT_CONST, sizeof(THING))
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      expression HANDLE;
      identifier SIZE, COUNT;
      @@
      
        f2fs_kvzalloc(HANDLE,
      -	SIZE * COUNT
      +	array_size(COUNT, SIZE)
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression HANDLE;
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression HANDLE;
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      expression HANDLE;
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        f2fs_kvzalloc(HANDLE,
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products
      // when they're not all constants...
      @@
      expression HANDLE;
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        f2fs_kvzalloc(HANDLE, C1 * C2 * C3, ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants.
      @@
      expression HANDLE;
      expression E1, E2;
      constant C1, C2;
      @@
      
      (
        f2fs_kvzalloc(HANDLE, C1 * C2, ...)
      |
        f2fs_kvzalloc(HANDLE,
      -	E1 * E2
      +	array_size(E1, E2)
        , ...)
      )
      Signed-off-by: NKees Cook <keescook@chromium.org>
      9d2a789c
    • K
      treewide: Use array_size() in f2fs_kzalloc() · 026f0507
      Kees Cook 提交于
      The f2fs_kzalloc() function has no 2-factor argument form, so
      multiplication factors need to be wrapped in array_size(). This patch
      replaces cases of:
      
              f2fs_kzalloc(handle, a * b, gfp)
      
      with:
              f2fs_kzalloc(handle, array_size(a, b), gfp)
      
      as well as handling cases of:
      
              f2fs_kzalloc(handle, a * b * c, gfp)
      
      with:
      
              f2fs_kzalloc(handle, array3_size(a, b, c), gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              f2fs_kzalloc(handle, 4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      expression HANDLE;
      type TYPE;
      expression THING, E;
      @@
      
      (
        f2fs_kzalloc(HANDLE,
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression HANDLE;
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        f2fs_kzalloc(HANDLE,
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      expression HANDLE;
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT_ID)
      +	array_size(COUNT_ID, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT_ID
      +	array_size(COUNT_ID, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT_CONST)
      +	array_size(COUNT_CONST, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT_CONST
      +	array_size(COUNT_CONST, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * (COUNT_ID)
      +	array_size(COUNT_ID, sizeof(THING))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * COUNT_ID
      +	array_size(COUNT_ID, sizeof(THING))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * (COUNT_CONST)
      +	array_size(COUNT_CONST, sizeof(THING))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * COUNT_CONST
      +	array_size(COUNT_CONST, sizeof(THING))
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      expression HANDLE;
      identifier SIZE, COUNT;
      @@
      
        f2fs_kzalloc(HANDLE,
      -	SIZE * COUNT
      +	array_size(COUNT, SIZE)
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression HANDLE;
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression HANDLE;
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      expression HANDLE;
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        f2fs_kzalloc(HANDLE,
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        f2fs_kzalloc(HANDLE,
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products
      // when they're not all constants...
      @@
      expression HANDLE;
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        f2fs_kzalloc(HANDLE, C1 * C2 * C3, ...)
      |
        f2fs_kzalloc(HANDLE,
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants.
      @@
      expression HANDLE;
      expression E1, E2;
      constant C1, C2;
      @@
      
      (
        f2fs_kzalloc(HANDLE, C1 * C2, ...)
      |
        f2fs_kzalloc(HANDLE,
      -	E1 * E2
      +	array_size(E1, E2)
        , ...)
      )
      Signed-off-by: NKees Cook <keescook@chromium.org>
      026f0507
  8. 05 6月, 2018 1 次提交
  9. 01 6月, 2018 15 次提交
    • C
      f2fs: clean up symbol namespace · 4d57b86d
      Chao Yu 提交于
      As Ted reported:
      
      "Hi, I was looking at f2fs's sources recently, and I noticed that there
      is a very large number of non-static symbols which don't have a f2fs
      prefix.  There's well over a hundred (see attached below).
      
      As one example, in fs/f2fs/dir.c there is:
      
      unsigned char get_de_type(struct f2fs_dir_entry *de)
      
      This function is clearly only useful for f2fs, but it has a generic
      name.  This means that if any other file system tries to have the same
      symbol name, there will be a symbol conflict and the kernel would not
      successfully build.  It also means that when someone is looking f2fs
      sources, it's not at all obvious whether a function such as
      read_data_page(), invalidate_blocks(), is a generic kernel function
      found in the fs, mm, or block layers, or a f2fs specific function.
      
      You might want to fix this at some point.  Hopefully Kent's bcachefs
      isn't similarly using genericly named functions, since that might
      cause conflicts with f2fs's functions --- but just as this would be a
      problem that we would rightly insist that Kent fix, this is something
      that we should have rightly insisted that f2fs should have fixed
      before it was integrated into the mainline kernel.
      
      acquire_orphan_inode
      add_ino_entry
      add_orphan_inode
      allocate_data_block
      allocate_new_segments
      alloc_nid
      alloc_nid_done
      alloc_nid_failed
      available_free_memory
      ...."
      
      This patch adds "f2fs_" prefix for all non-static symbols in order to:
      a) avoid conflict with other kernel generic symbols;
      b) to indicate the function is f2fs specific one instead of generic
      one;
      Reported-by: NTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      4d57b86d
    • C
      f2fs: fix to let caller retry allocating block address · fe16efe6
      Chao Yu 提交于
      Configure io_bits with 2 and enable LFS mode, generic/013 reports below dmesg:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000104
      *pdpt = 0000000029b7b001 *pde = 0000000000000000
      Oops: 0002 [#1] PREEMPT SMP
      Modules linked in: crc32_generic zram f2fs(O) rfcomm bnep bluetooth ecdh_generic snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq pcbc joydev snd_seq_device aesni_intel snd_timer aes_i586 snd crypto_simd cryptd soundcore i2c_piix4 serio_raw mac_hid video parport_pc ppdev lp parport hid_generic psmouse usbhid hid e1000
      CPU: 0 PID: 11161 Comm: fsstress Tainted: G           O      4.17.0-rc2 #38
      Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      EIP: f2fs_submit_page_write+0x28d/0x550 [f2fs]
      EFLAGS: 00010206 CPU: 0
      EAX: e863dcd8 EBX: 00000000 ECX: 00000100 EDX: 00000200
      ESI: e863dcf4 EDI: f6f82768 EBP: e863dbb0 ESP: e863db74
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      CR0: 80050033 CR2: 00000104 CR3: 29a62020 CR4: 000406f0
      Call Trace:
       do_write_page+0x6f/0xc0 [f2fs]
       write_data_page+0x4a/0xd0 [f2fs]
       do_write_data_page+0x327/0x630 [f2fs]
       __write_data_page+0x34b/0x820 [f2fs]
       __f2fs_write_data_pages+0x42d/0x8c0 [f2fs]
       f2fs_write_data_pages+0x27/0x30 [f2fs]
       do_writepages+0x1a/0x70
       __filemap_fdatawrite_range+0x94/0xd0
       filemap_write_and_wait_range+0x3d/0xa0
       __generic_file_write_iter+0x11a/0x1f0
       f2fs_file_write_iter+0xdd/0x3b0 [f2fs]
       __vfs_write+0xd2/0x150
       vfs_write+0x9b/0x190
       ksys_write+0x45/0x90
       sys_write+0x16/0x20
       do_fast_syscall_32+0xaa/0x22c
       entry_SYSENTER_32+0x4c/0x7b
      EIP: 0xb7fc8c51
      EFLAGS: 00000246 CPU: 0
      EAX: ffffffda EBX: 00000003 ECX: 09cde000 EDX: 00001000
      ESI: 00000003 EDI: 00001000 EBP: 00000000 ESP: bfbded38
       DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      Code: e8 f9 77 34 c9 8b 45 e0 8b 80 b8 00 00 00 39 45 d8 0f 84 bb 02 00 00 8b 45 e0 8b 80 b8 00 00 00 8d 50 d8 8b 08 89 55 f0 8b 50 04 <89> 51 04 89 0a c7 00 00 01 00 00 c7 40 04 00 02 00 00 8b 45 dc
      EIP: f2fs_submit_page_write+0x28d/0x550 [f2fs] SS:ESP: 0068:e863db74
      CR2: 0000000000000104
      ---[ end trace 4cac79c0d1305ee6 ]---
      
      allocate_data_block will submit all sequential pending IOs sorted by a
      FIFO list, If we failed to submit other user's IO due to unaligned write,
      we will retry to allocate new block address for current IO, then it will
      initialize fio.list again, if fio was in the list before, it can break
      FIFO list, result in above panic.
      
      Thread A			Thread B
      - do_write_page
       - allocate_data_block
        - list_add_tail
        : fioA cached in FIFO list.
      				- do_write_page
      				 - allocate_data_block
      				  - list_add_tail
      				  : fioB cached in FIFO list.
      				 - f2fs_submit_page_write
      				 : fail to submit IO
      				 - allocate_data_block
      				  - INIT_LIST_HEAD
       - f2fs_submit_page_write
        - list_del  <-- NULL pointer dereference
      
      This patch adds fio.retry parameter to indicate failure status for each
      IO, and avoid bailing out if there is still pending IO in FIFO list for
      fixing.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      fe16efe6
    • C
      f2fs: fix to don't trigger writeback during recovery · 64c74a7a
      Chao Yu 提交于
      - f2fs_fill_super
       - recover_fsync_data
        - recover_data
         - del_fsync_inode
          - iput
           - iput_final
            - write_inode_now
             - f2fs_write_inode
              - f2fs_balance_fs
               - f2fs_balance_fs_bg
                - sync_dirty_inodes
      
      With data_flush mount option, during recovery, in order to avoid entering
      above writeback flow, let's detect recovery status and do skip in
      f2fs_balance_fs_bg.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      64c74a7a
    • S
      f2fs: clear discard_wake earlier · 35a9a766
      Sheng Yong 提交于
      If SBI_NEED_FSCK is set, discard_wake will never be cleared. As a
      result, the condition of wait_event_interruptible_timeout() is always
      true, which gets discard thread run too frequently.
      Signed-off-by: NSheng Yong <shengyong1@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      35a9a766
    • Y
      f2fs: let discard thread wait a little longer if dev is busy · f9d1dced
      Yunlei He 提交于
      This patch modify discard thread wait policy as below:
      	issued       io_interrupted     wait time(ms)
      1.        8                 0               50
      2.      (0,8)               1               50
      3.        0                 1              500 (dev is busy)
      4.        0                 0            60000 (no candidates)
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      f9d1dced
    • C
      f2fs: avoid stucking GC due to atomic write · 2ef79ecb
      Chao Yu 提交于
      f2fs doesn't allow abuse on atomic write class interface, so except
      limiting in-mem pages' total memory usage capacity, we need to limit
      atomic-write usage as well when filesystem is seriously fragmented,
      otherwise we may run into infinite loop during foreground GC because
      target blocks in victim segment are belong to atomic opened file for
      long time.
      
      Now, we will detect failure due to atomic write in foreground GC, if
      the count exceeds threshold, we will drop all atomic written data in
      cache, by this, I expect it can keep our system running safely to
      prevent Dos attack.
      
      In addition, his patch adds to show GC skip information in debugfs,
      now it just shows count of skipped caused by atomic write.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      2ef79ecb
    • J
      f2fs: introduce sbi->gc_mode to determine the policy · 5b0e9539
      Jaegeuk Kim 提交于
      This is to avoid sbi->gc_thread pointer access.
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      5b0e9539
    • C
      f2fs: keep migration IO order in LFS mode · 107a805d
      Chao Yu 提交于
      For non-migration IO, we will keep order of data/node blocks' submitting
      as allocation sequence by sorting IOs in per log io_list list, but for
      migration IO, it could be out-of-order.
      
      In LFS mode, we should keep all IOs including migration IO be ordered,
      so that this patch fixes to add an additional lock to keep submitting
      order.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NYunlong Song <yunlong.song@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      107a805d
    • C
      f2fs: fix to wait page writeback during revoking atomic write · e5e5732d
      Chao Yu 提交于
      After revoking atomic write, related LBA can be reused by others, so we
      need to wait page writeback before reusing the LBA, in order to avoid
      interference between old atomic written in-flight IO and new IO.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      e5e5732d
    • C
      f2fs: clean up with is_valid_blkaddr() · 7b525dd0
      Chao Yu 提交于
      - rename is_valid_blkaddr() to is_valid_meta_blkaddr() for readability.
      - introduce is_valid_blkaddr() for cleanup.
      
      No logic change in this patch.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7b525dd0
    • C
      f2fs: fix to initialize min_mtime with ULLONG_MAX · 5ad25442
      Chao Yu 提交于
      Since sit_i.min_mtime's type is unsigned long long, so we should
      initialize it with max value of the type ULLONG_MAX instead of
      LLONG_MAX.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      5ad25442
    • C
      f2fs: treat volatile file's data as hot one · b4c3ca8b
      Chao Yu 提交于
      Volatile file's data will be updated oftenly, so it'd better to place
      its data into hot data segment.
      
      In addition, for atomic file, we change to check FI_ATOMIC_FILE instead
      of FI_HOT_DATA to make code readability better.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      b4c3ca8b
    • C
      f2fs: introduce release_discard_addr() for cleanup · af8ff65b
      Chao Yu 提交于
      Introduce release_discard_addr() to include common codes for cleanup.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      [Fengguang Wu: declare static function, reported by kbuild test robot]
      Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      af8ff65b
    • C
      f2fs: fix potential overflow · a9af3fdc
      Chao Yu 提交于
      In build_sit_entries(), if valid_blocks in SIT block is smaller than
      valid_blocks in journal, for below calculation:
      
      sbi->discard_blks += old_valid_blocks - se->valid_blocks;
      
      There will be two times potential overflow:
      - old_valid_blocks - se->valid_blocks will overflow, and be a very
      large number.
      - sbi->discard_blks += result will overflow again, comes out a correct
      result accidently.
      
      Anyway, it should be fixed.
      
      Fixes: d600af23 ("f2fs: avoid unneeded loop in build_sit_entries")
      Fixes: 1f43e2ad ("f2fs: introduce CP_TRIMMED_FLAG to avoid unneeded discard")
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      a9af3fdc
    • J
      f2fs: sanity check for total valid node blocks · 8a29c126
      Jaegeuk Kim 提交于
      This patch enhances sanity check for SIT entries.
      
      syzbot hit the following crash on upstream commit
      83beed7b (Fri Apr 20 17:56:32 2018 +0000)
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
      syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=bf9253040425feb155ad
      
      syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5692130282438656
      Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5095924598571008
      Kernel config: https://syzkaller.appspot.com/x/.config?id=1808800213120130118
      compiler: gcc (GCC) 8.0.1 20180413 (experimental)
      
      IMPORTANT: if you fix the bug, please add the following tag to the commit:
      Reported-by: syzbot+bf9253040425feb155ad@syzkaller.appspotmail.com
      It will help syzbot understand when the bug is fixed. See footer for details.
      If you forward the report, please keep this part and the footer.
      
      F2FS-fs (loop0): invalid crc value
      F2FS-fs (loop0): Try to recover 1th superblock, ret: 0
      F2FS-fs (loop0): Mounted with checkpoint version = d
      F2FS-fs (loop0): Bitmap was wrongly cleared, blk:9740
      ------------[ cut here ]------------
      kernel BUG at fs/f2fs/segment.c:1884!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 4508 Comm: syz-executor0 Not tainted 4.17.0-rc1+ #10
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:update_sit_entry+0x1215/0x1590 fs/f2fs/segment.c:1882
      RSP: 0018:ffff8801af526708 EFLAGS: 00010282
      RAX: ffffed0035ea4cc0 RBX: ffff8801ad454f90 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff82eeb87e RDI: ffffed0035ea4cb6
      RBP: ffff8801af526760 R08: ffff8801ad4a2480 R09: ffffed003b5e4f90
      R10: ffffed003b5e4f90 R11: ffff8801daf27c87 R12: ffff8801adb8d380
      R13: 0000000000000001 R14: 0000000000000008 R15: 00000000ffffffff
      FS:  00000000014af940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f06bc223000 CR3: 00000001adb02000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       allocate_data_block+0x66f/0x2050 fs/f2fs/segment.c:2663
       do_write_page+0x105/0x1b0 fs/f2fs/segment.c:2727
       write_node_page+0x129/0x350 fs/f2fs/segment.c:2770
       __write_node_page+0x7da/0x1370 fs/f2fs/node.c:1398
       sync_node_pages+0x18cf/0x1eb0 fs/f2fs/node.c:1652
       block_operations+0x429/0xa60 fs/f2fs/checkpoint.c:1088
       write_checkpoint+0x3ba/0x5380 fs/f2fs/checkpoint.c:1405
       f2fs_sync_fs+0x2fb/0x6a0 fs/f2fs/super.c:1077
       __sync_filesystem fs/sync.c:39 [inline]
       sync_filesystem+0x265/0x310 fs/sync.c:67
       generic_shutdown_super+0xd7/0x520 fs/super.c:429
       kill_block_super+0xa4/0x100 fs/super.c:1191
       kill_f2fs_super+0x9f/0xd0 fs/f2fs/super.c:3030
       deactivate_locked_super+0x97/0x100 fs/super.c:316
       deactivate_super+0x188/0x1b0 fs/super.c:347
       cleanup_mnt+0xbf/0x160 fs/namespace.c:1174
       __cleanup_mnt+0x16/0x20 fs/namespace.c:1181
       task_work_run+0x1e4/0x290 kernel/task_work.c:113
       tracehook_notify_resume include/linux/tracehook.h:191 [inline]
       exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166
       prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
       do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457d97
      RSP: 002b:00007ffd46f9c8e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000457d97
      RDX: 00000000014b09a3 RSI: 0000000000000002 RDI: 00007ffd46f9da50
      RBP: 00007ffd46f9da50 R08: 0000000000000000 R09: 0000000000000009
      R10: 0000000000000005 R11: 0000000000000246 R12: 00000000014b0940
      R13: 0000000000000000 R14: 0000000000000002 R15: 000000000000658e
      RIP: update_sit_entry+0x1215/0x1590 fs/f2fs/segment.c:1882 RSP: ffff8801af526708
      ---[ end trace f498328bb02610a2 ]---
      
      Reported-and-tested-by: syzbot+bf9253040425feb155ad@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+7d6d31d3bc702f566ce3@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+0a725420475916460f12@syzkaller.appspotmail.com
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      8a29c126