1. 01 Jun, 2018 5 commits
  2. 03 May, 2018 2 commits
    • f2fs: clear PageError on writepage · 17c50035
      Jaegeuk Kim committed
      This patch clears PageError on pages that were tagged by the read path:
      when we write such pages with valid contents, writepage should clear the
      bit, as ext4 does.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: refactor read path to allow multiple postprocessing steps · 6dbb1796
      Eric Biggers committed
      Currently f2fs's ->readpage() and ->readpages() assume that either the
      data undergoes no postprocessing, or decryption only.  But with
      fs-verity, there will be an additional authenticity verification step,
      and it may be needed either by itself, or combined with decryption.
      
      To support this, store a 'struct bio_post_read_ctx' in ->bi_private
      which contains a work struct, a bitmask of postprocessing steps that are
      enabled, and an indicator of the current step.  The bio completion
      routine, if there was no I/O error, enqueues the first postprocessing
      step.  When that completes, it continues to the next step.  Pages that
      fail any postprocessing step have PageError set.  Once all steps have
      completed, pages without PageError set are set Uptodate, and all pages
      are unlocked.
      
      Also replace f2fs_encrypted_file() with a new function
      f2fs_post_read_required() in places like direct I/O and garbage
      collection that really should be testing whether the file needs special
      I/O processing, not whether it is encrypted specifically.
      
      This may also be useful for other future f2fs features such as
      compression.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
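The step sequencing described in this commit message can be illustrated with a small userspace model. This is a minimal sketch, not the kernel code: the names `post_read_ctx`, `post_read_process`, `run_step`, and the `STEP_*` constants are hypothetical, the steps run synchronously instead of being enqueued as work items, and the two booleans only stand in for the PageError and Uptodate page flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical step identifiers; their order is the order of execution. */
enum post_read_step {
	STEP_INITIAL = 0,
	STEP_DECRYPT,
	STEP_VERITY,
	STEP_DONE,
};

/* Simplified stand-in for the per-bio context described above. */
struct post_read_ctx {
	unsigned int enabled_steps;	/* bitmask of (1u << step) */
	enum post_read_step cur_step;	/* step currently being run */
	bool page_error;		/* models PageError on the page */
	bool page_uptodate;		/* models the Uptodate flag */
};

/* Stub step handler: returns false to simulate a failed verification. */
static bool run_step(enum post_read_step step, bool fail_verity)
{
	if (step == STEP_VERITY && fail_verity)
		return false;
	return true;
}

/* Run every enabled step in order, then finish the read. */
static void post_read_process(struct post_read_ctx *ctx, bool fail_verity)
{
	assert(ctx->cur_step == STEP_INITIAL);

	for (ctx->cur_step = STEP_DECRYPT; ctx->cur_step < STEP_DONE;
	     ctx->cur_step++) {
		if (!(ctx->enabled_steps & (1u << ctx->cur_step)))
			continue;	/* step not enabled for this bio */
		if (!run_step(ctx->cur_step, fail_verity))
			ctx->page_error = true;
	}

	/* Only pages without an error become Uptodate. */
	ctx->page_uptodate = !ctx->page_error;
}
```

In this model a context with only the decryption bit set ends up Uptodate even when verification is told to fail, because the verification step is skipped entirely. That is the point of keeping the enabled steps in a bitmask: each step can be used by itself or combined with the others.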
  3. 12 Apr, 2018 1 commit
  4. 13 Mar, 2018 3 commits
  5. 23 Jan, 2018 2 commits
    • f2fs: avoid hungtask when GC encrypted block if io_bits is set · a9d572c7
      Sheng Yong committed
      When io_bits is set, GCing an encrypted block may hit the following hungtask.
      Since io_bits requires an aligned block address, f2fs_submit_page_write may
      return -EAGAIN if new_blkaddr does not satisfy the io_bits alignment. As a
      result, the encrypted page will never be written back.
      
      This patch makes move_data_block aware of the EAGAIN error so that it
      cancels the writeback.
      
      [  246.751371] INFO: task kworker/u4:4:797 blocked for more than 90 seconds.
      [  246.752423]       Not tainted 4.15.0-rc4+ #11
      [  246.754176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.755336] kworker/u4:4    D25448   797      2 0x80000000
      [  246.755597] Workqueue: writeback wb_workfn (flush-7:0)
      [  246.755616] Call Trace:
      [  246.755695]  ? __schedule+0x322/0xa90
      [  246.755761]  ? blk_init_request_from_bio+0x120/0x120
      [  246.755773]  ? pci_mmcfg_check_reserved+0xb0/0xb0
      [  246.755801]  ? __radix_tree_create+0x19e/0x200
      [  246.755813]  ? delete_node+0x136/0x370
      [  246.755838]  schedule+0x43/0xc0
      [  246.755904]  io_schedule+0x17/0x40
      [  246.755939]  wait_on_page_bit_common+0x17b/0x240
      [  246.755950]  ? wake_page_function+0xa0/0xa0
      [  246.755961]  ? add_to_page_cache_lru+0x160/0x160
      [  246.755972]  ? page_cache_tree_insert+0x170/0x170
      [  246.755983]  ? __lru_cache_add+0x96/0xb0
      [  246.756086]  __filemap_fdatawait_range+0x14f/0x1c0
      [  246.756097]  ? wait_on_page_bit_common+0x240/0x240
      [  246.756120]  ? __wake_up_locked_key_bookmark+0x20/0x20
      [  246.756167]  ? wait_on_all_pages_writeback+0xc9/0x100
      [  246.756179]  ? __remove_ino_entry+0x120/0x120
      [  246.756192]  ? wait_woken+0x100/0x100
      [  246.756204]  filemap_fdatawait_range+0x9/0x20
      [  246.756216]  write_checkpoint+0x18a1/0x1f00
      [  246.756254]  ? blk_get_request+0x10/0x10
      [  246.756265]  ? cpumask_next_and+0x43/0x60
      [  246.756279]  ? f2fs_sync_inode_meta+0x160/0x160
      [  246.756289]  ? remove_element.isra.4+0xa0/0xa0
      [  246.756300]  ? __put_compound_page+0x40/0x40
      [  246.756310]  ? f2fs_sync_fs+0xec/0x1c0
      [  246.756320]  ? f2fs_sync_fs+0x120/0x1c0
      [  246.756329]  f2fs_sync_fs+0x120/0x1c0
      [  246.756357]  ? trace_event_raw_event_f2fs__page+0x260/0x260
      [  246.756393]  ? ata_build_rw_tf+0x173/0x410
      [  246.756397]  f2fs_balance_fs_bg+0x198/0x390
      [  246.756405]  ? drop_inmem_page+0x230/0x230
      [  246.756415]  ? ahci_qc_prep+0x1bb/0x2e0
      [  246.756418]  ? ahci_qc_issue+0x1df/0x290
      [  246.756422]  ? __accumulate_pelt_segments+0x42/0xd0
      [  246.756426]  ? f2fs_write_node_pages+0xd1/0x380
      [  246.756429]  f2fs_write_node_pages+0xd1/0x380
      [  246.756437]  ? sync_node_pages+0x8f0/0x8f0
      [  246.756440]  ? update_curr+0x53/0x220
      [  246.756444]  ? __accumulate_pelt_segments+0xa2/0xd0
      [  246.756448]  ? __update_load_avg_se.isra.39+0x349/0x360
      [  246.756452]  ? do_writepages+0x2a/0xa0
      [  246.756456]  do_writepages+0x2a/0xa0
      [  246.756460]  __writeback_single_inode+0x70/0x490
      [  246.756463]  ? check_preempt_wakeup+0x199/0x310
      [  246.756467]  writeback_sb_inodes+0x2a2/0x660
      [  246.756471]  ? is_empty_dir_inode+0x40/0x40
      [  246.756474]  ? __writeback_single_inode+0x490/0x490
      [  246.756477]  ? string+0xbf/0xf0
      [  246.756480]  ? down_read_trylock+0x35/0x60
      [  246.756484]  __writeback_inodes_wb+0x9f/0xf0
      [  246.756488]  wb_writeback+0x41d/0x4b0
      [  246.756492]  ? writeback_inodes_wb.constprop.55+0x150/0x150
      [  246.756498]  ? set_worker_desc+0xf7/0x130
      [  246.756502]  ? current_is_workqueue_rescuer+0x60/0x60
      [  246.756511]  ? _find_next_bit+0x2c/0xa0
      [  246.756514]  ? wb_workfn+0x400/0x5d0
      [  246.756518]  wb_workfn+0x400/0x5d0
      [  246.756521]  ? finish_task_switch+0xdf/0x2a0
      [  246.756525]  ? inode_wait_for_writeback+0x30/0x30
      [  246.756529]  process_one_work+0x3a7/0x6f0
      [  246.756533]  worker_thread+0x82/0x750
      [  246.756537]  kthread+0x16f/0x1c0
      [  246.756541]  ? trace_event_raw_event_workqueue_work+0x110/0x110
      [  246.756544]  ? kthread_create_worker_on_cpu+0xb0/0xb0
      [  246.756548]  ret_from_fork+0x1f/0x30
      Signed-off-by: Sheng Yong <shengyong1@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
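The failure mode fixed by this commit can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: `submit_page_write`, `move_block`, and `struct page_model` are hypothetical stand-ins, and the alignment rule (a block address must be a multiple of 2^io_bits) is a simplification of what the message calls "aligned block address".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the writeback state of one page. */
struct page_model {
	bool writeback;	/* true while a write is considered in flight */
};

/*
 * Hypothetical submit routine: with io_bits = n, the write must start at
 * a block address that is a multiple of 2^n, otherwise nothing is
 * submitted and -EAGAIN is returned.
 */
static int submit_page_write(uint32_t new_blkaddr, unsigned int io_bits)
{
	uint32_t mask = (1u << io_bits) - 1;

	if (new_blkaddr & mask)
		return -EAGAIN;
	return 0;
}

/* Hypothetical caller that reacts to -EAGAIN instead of ignoring it. */
static int move_block(struct page_model *page, uint32_t new_blkaddr,
		      unsigned int io_bits)
{
	int err;

	assert(io_bits < 32);

	page->writeback = true;
	err = submit_page_write(new_blkaddr, io_bits);
	if (err) {
		/*
		 * No I/O will ever complete for this page, so the
		 * writeback state has to be cleared here. Otherwise a
		 * waiter on the page would block forever.
		 */
		page->writeback = false;
	}
	return err;
}
```

Without the error branch, a page whose submission was rejected would stay marked as under writeback, which corresponds to the checkpoint path waiting indefinitely in the trace above.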
    • f2fs: add an ioctl to disable GC for specific file · 1ad71a27
      Jaegeuk Kim committed
      This patch adds a flag to disable GC on a given file, which is useful when
      the user wants to keep its block map. It also performs in-place updates for
      dontmove files.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 28 Nov, 2017 1 commit
    • Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds committed
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 06 Nov, 2017 3 commits
  8. 11 Oct, 2017 2 commits
    • f2fs: enhance multiple device flush · 39d787be
      Chao Yu committed
      When the multiple device feature is enabled, ->fsync issues a flush to
      every device to make sure the node/data of the file is persisted to
      storage. But some of those flushes may be unnecessary, because the file's
      data may not have been written back to those devices. So this patch adds
      and manages a per-inode bitmap in the global cache that indicates which
      devices are dirty and need a flush during ->fsync, which improves fsync
      performance in multiple device scenarios.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
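The per-inode dirty-device bitmap can be sketched as follows. This is an illustrative userspace model only: `inode_flush_state`, `mark_device_dirty`, `fsync_flush`, and `MAX_DEVICES` are hypothetical names, and the flush itself is a caller-supplied callback instead of a real block-layer flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVICES 8	/* assumption for this sketch */

/* Per-inode state: bit i is set when device i holds unflushed data. */
struct inode_flush_state {
	uint8_t dirty_devices;
};

/* Called on the write path when data for this inode lands on a device. */
static void mark_device_dirty(struct inode_flush_state *st, unsigned int dev)
{
	assert(dev < MAX_DEVICES);
	st->dirty_devices |= (uint8_t)(1u << dev);
}

/*
 * Called from fsync: flush only the devices marked dirty and clear
 * their bits. Returns the number of flushes issued.
 */
static unsigned int fsync_flush(struct inode_flush_state *st,
				unsigned int ndevs,
				void (*flush)(unsigned int dev))
{
	unsigned int issued = 0;

	for (unsigned int dev = 0; dev < ndevs && dev < MAX_DEVICES; dev++) {
		if (!(st->dirty_devices & (1u << dev)))
			continue;	/* nothing written here, skip */
		if (flush != NULL)
			flush(dev);
		st->dirty_devices &= (uint8_t)~(1u << dev);
		issued++;
	}
	return issued;
}
```

With four devices and writes to only two of them, the model issues two flushes instead of four, and a second fsync with no new writes issues none.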
    • Revert "f2fs: node segment is prior to data segment selected victim" · 91f4382b
      Yunlong Song committed
      This reverts commit b9cd2061.
      
      That patch leaves far fewer node segments (which can be used for SSR)
      than before. In a corner case (e.g. creating and deleting *.txt files in
      the same directory leaves very few node segments but many data
      segments), if the reserved free segments are all used up during gc,
      write_checkpoint can still flush dentry pages to data ssr segments,
      but will probably fail to flush node pages to node ssr segments, since
      there are not enough node ssr segments left (the remaining ones are all
      full).
      
      So revert this patch to give a fair chance to let node segments remain
      for SSR, which provides more robustness for corner cases.
      
      Conflicts:
      	fs/f2fs/gc.c
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  9. 06 Sep, 2017 3 commits
  10. 30 Aug, 2017 1 commit
  11. 22 Aug, 2017 1 commit
  12. 16 Aug, 2017 2 commits
  13. 10 Aug, 2017 1 commit
  14. 01 Aug, 2017 2 commits
    • f2fs: enhance on-disk inode structure scalability · 7a2af766
      Chao Yu committed
      This patch adds a new flag, F2FS_EXTRA_ATTR, stored in inode.i_inline,
      to indicate that the on-disk structure of the current inode is extended.
      
      In order to extend it, we changed the inode structure a bit:
      
      Original one:
      
      struct f2fs_inode {
      	...
      	struct f2fs_extent i_ext;
      	__le32 i_addr[DEF_ADDRS_PER_INODE];
      	__le32 i_nid[DEF_NIDS_PER_INODE];
      }
      
      Extended one:
      
      struct f2fs_inode {
      	...
      	struct f2fs_extent i_ext;
      	union {
      		struct {
      			__le16 i_extra_isize;
      			__le16 i_padding;
      			__le32 i_extra_end[0];
      		};
      		__le32 i_addr[DEF_ADDRS_PER_INODE];
      	};
      	__le32 i_nid[DEF_NIDS_PER_INODE];
      }
      
      Once F2FS_EXTRA_ATTR is set, we steal four bytes at the head of the
      i_addr field to store i_extra_isize and i_padding. With i_extra_isize,
      we can calculate the actual size of the reserved space in i_addr. The
      attribute fields available within the extra attribute area of the
      current inode are laid out as below:
      
        +--------------------+
        | .i_mode            |
        | ...                |
        | .i_ext             |
        +--------------------+
        | .i_extra_isize     |-----+
        | .i_padding         |     |
        | .i_prjid           |     |
        | .i_atime_extra     |     |
        | .i_ctime_extra     |     |
        | .i_mtime_extra     |<----+
        | .i_inode_cs        |<----- store blkaddr/inline from here
        | .i_xattr_cs        |
        | ...                |
        +--------------------+
        |                    |
        |    block address   |
        |                    |
        +--------------------+
        | .i_nid             |
        +--------------------+
        |   node_footer      |
        | (nid, ino, offset) |
        +--------------------+
      
      Hence, with this patch, the f2fs inode becomes more scalable and can
      store more newly added attributes.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
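The size calculation mentioned in this commit message can be sketched as follows. This is a simplified host-endian model, not the on-disk definition: `struct inode_tail` and `addrs_per_inode` are hypothetical names, the value 923 for `DEF_ADDRS_PER_INODE` is an assumption for this sketch, and so is the reading that i_extra_isize counts bytes including the four-byte header itself.

```c
#include <assert.h>
#include <stdint.h>

#define DEF_ADDRS_PER_INODE 923	/* value assumed for this sketch */

/* Only the part of the inode that the union overlays. */
struct inode_tail {
	union {
		struct {
			uint16_t i_extra_isize;	/* reserved bytes, header included */
			uint16_t i_padding;
		} extra;
		uint32_t i_addr[DEF_ADDRS_PER_INODE];
	} u;
};

/*
 * Number of i_addr slots still usable for block addresses. Without the
 * extra-attribute flag the whole array is available; with it, the first
 * i_extra_isize bytes are reserved for attributes.
 */
static unsigned int addrs_per_inode(const struct inode_tail *in,
				    int has_extra_attr)
{
	unsigned int reserved_slots;

	if (!has_extra_attr)
		return DEF_ADDRS_PER_INODE;

	/* The reserved area is carved out in whole 32-bit slots. */
	assert(in->u.extra.i_extra_isize % sizeof(uint32_t) == 0);
	reserved_slots = in->u.extra.i_extra_isize / sizeof(uint32_t);
	assert(reserved_slots <= DEF_ADDRS_PER_INODE);

	return DEF_ADDRS_PER_INODE - reserved_slots;
}
```

For example, reserving 24 bytes takes six 32-bit slots away from the address array in this model. Because the reserved area shares storage with i_addr, growing the attribute area later only shifts where block addresses start; the overall inode size stays the same.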
    • f2fs: make background threads of f2fs being aware of freezing · dc6febb6
      Chao Yu committed
      When ->freeze_fs is called from lvm to take a snapshot, it needs to
      make sure there will be no more changes to the filesystem's data.
      Previously, however, background threads such as the GC thread weren't
      aware of freezing, so in an environment with active background threads
      the snapshot data became unstable.
      
      This patch fixes this issue by adding sb_{start,end}_intwrite in
      the background threads below:
      - GC thread
      - flush thread
      - discard thread
      
      Note that we don't use sb_start_intwrite() in gc_thread_func(), because
      
      generic/241 reports the bug below:
      
       ======================================================
       WARNING: possible circular locking dependency detected
       4.13.0-rc1+ #32 Tainted: G           O
       ------------------------------------------------------
       f2fs_gc-250:0/22186 is trying to acquire lock:
        (&sbi->gc_mutex){+.+...}, at: [<f8fa7f0b>] f2fs_sync_fs+0x7b/0x1b0 [f2fs]
      
       but task is already holding lock:
        (sb_internal#2){++++.-}, at: [<f8fb5609>] gc_thread_func+0x159/0x4a0 [f2fs]
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #2 (sb_internal#2){++++.-}:
              __lock_acquire+0x405/0x7b0
              lock_acquire+0xae/0x220
              __sb_start_write+0x11d/0x1f0
              f2fs_evict_inode+0x2d6/0x4e0 [f2fs]
              evict+0xa8/0x170
              iput+0x1fb/0x2c0
              f2fs_sync_inode_meta+0x3f/0xf0 [f2fs]
              write_checkpoint+0x1b1/0x750 [f2fs]
              f2fs_sync_fs+0x85/0x1b0 [f2fs]
              f2fs_do_sync_file.isra.24+0x137/0xa30 [f2fs]
              f2fs_sync_file+0x34/0x40 [f2fs]
              vfs_fsync_range+0x4a/0xa0
              do_fsync+0x3c/0x60
              SyS_fdatasync+0x15/0x20
              do_fast_syscall_32+0xa1/0x1b0
              entry_SYSENTER_32+0x4c/0x7b
      
       -> #1 (&sbi->cp_mutex){+.+...}:
              __lock_acquire+0x405/0x7b0
              lock_acquire+0xae/0x220
              __mutex_lock+0x4f/0x830
              mutex_lock_nested+0x25/0x30
              write_checkpoint+0x2f/0x750 [f2fs]
              f2fs_sync_fs+0x85/0x1b0 [f2fs]
              sync_filesystem+0x67/0x80
              generic_shutdown_super+0x27/0x100
              kill_block_super+0x22/0x50
              kill_f2fs_super+0x3a/0x40 [f2fs]
              deactivate_locked_super+0x3d/0x70
              deactivate_super+0x40/0x60
              cleanup_mnt+0x39/0x70
              __cleanup_mnt+0x10/0x20
              task_work_run+0x69/0x80
              exit_to_usermode_loop+0x57/0x92
              do_fast_syscall_32+0x18c/0x1b0
              entry_SYSENTER_32+0x4c/0x7b
      
       -> #0 (&sbi->gc_mutex){+.+...}:
              validate_chain.isra.36+0xc50/0xdb0
              __lock_acquire+0x405/0x7b0
              lock_acquire+0xae/0x220
              __mutex_lock+0x4f/0x830
              mutex_lock_nested+0x25/0x30
              f2fs_sync_fs+0x7b/0x1b0 [f2fs]
              f2fs_balance_fs_bg+0xb9/0x200 [f2fs]
              gc_thread_func+0x302/0x4a0 [f2fs]
              kthread+0xe9/0x120
              ret_from_fork+0x19/0x24
      
       other info that might help us debug this:
      
       Chain exists of:
         &sbi->gc_mutex --> &sbi->cp_mutex --> sb_internal#2
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(sb_internal#2);
                                      lock(&sbi->cp_mutex);
                                      lock(sb_internal#2);
         lock(&sbi->gc_mutex);
      
        *** DEADLOCK ***
      
       1 lock held by f2fs_gc-250:0/22186:
        #0:  (sb_internal#2){++++.-}, at: [<f8fb5609>] gc_thread_func+0x159/0x4a0 [f2fs]
      
       stack backtrace:
       CPU: 2 PID: 22186 Comm: f2fs_gc-250:0 Tainted: G           O    4.13.0-rc1+ #32
       Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
       Call Trace:
        dump_stack+0x5f/0x92
        print_circular_bug+0x1b3/0x1bd
        validate_chain.isra.36+0xc50/0xdb0
        ? __this_cpu_preempt_check+0xf/0x20
        __lock_acquire+0x405/0x7b0
        lock_acquire+0xae/0x220
        ? f2fs_sync_fs+0x7b/0x1b0 [f2fs]
        __mutex_lock+0x4f/0x830
        ? f2fs_sync_fs+0x7b/0x1b0 [f2fs]
        mutex_lock_nested+0x25/0x30
        ? f2fs_sync_fs+0x7b/0x1b0 [f2fs]
        f2fs_sync_fs+0x7b/0x1b0 [f2fs]
        f2fs_balance_fs_bg+0xb9/0x200 [f2fs]
        gc_thread_func+0x302/0x4a0 [f2fs]
        ? preempt_schedule_common+0x2f/0x4d
        ? f2fs_gc+0x540/0x540 [f2fs]
        kthread+0xe9/0x120
        ? f2fs_gc+0x540/0x540 [f2fs]
        ? kthread_create_on_node+0x30/0x30
        ret_from_fork+0x19/0x24
      
      The deadlock occurs under the following condition:
      GC Thread			Thread B
      - sb_start_intwrite
      				- f2fs_sync_file
      				 - f2fs_sync_fs
      				  - mutex_lock(&sbi->gc_mutex)
      				   - write_checkpoint
      				    - block_operations
      				     - f2fs_sync_inode_meta
      				      - iput
      				       - sb_start_intwrite
       - mutex_lock(&sbi->gc_mutex)
      
      Fix this by changing sb_start_intwrite to sb_start_write_trylock.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
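The idea behind the trylock fix can be shown with a deliberately simple model. This is a single-threaded sketch with no real synchronization: `sb_model`, `start_write_trylock`, `end_write`, and `gc_iteration` are hypothetical names that only mirror the shape of the pattern, in which a background thread never blocks on freeze protection and skips its round instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the freeze protection state of one superblock. */
struct sb_model {
	bool frozen;		/* set while a freeze is in progress */
	unsigned int writers;	/* active internal writers */
};

/* Non-blocking acquire: fails immediately instead of waiting. */
static bool start_write_trylock(struct sb_model *sb)
{
	if (sb->frozen)
		return false;
	sb->writers++;
	return true;
}

/* Release the protection taken by a successful trylock. */
static void end_write(struct sb_model *sb)
{
	assert(sb->writers > 0);
	sb->writers--;
}

/* One round of a background GC loop; returns true if work was done. */
static bool gc_iteration(struct sb_model *sb)
{
	if (!start_write_trylock(sb))
		return false;	/* frozen: skip this round, retry later */

	/* ... garbage collection would run here ... */

	end_write(sb);
	return true;
}
```

Because the background thread never sleeps while waiting for the protection, it cannot become one edge of the circular wait shown in the lockdep report above. The cost is that a round of background work may be skipped while a freeze is in progress.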
  15. 24 May, 2017 7 commits
  16. 04 May, 2017 1 commit
  17. 03 May, 2017 1 commit
  18. 25 Apr, 2017 2 commits