1. 05 Apr, 2016 1 commit
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with chunks bigger than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straightforward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
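
      To illustrate why these substitutions do not change behavior, here is a
      minimal userspace sketch. It assumes 4 KiB pages (the real PAGE_SHIFT is
      architecture-dependent) and that the old PAGE_CACHE_* macros were plain
      aliases of their PAGE_* counterparts. The helper names index_old,
      index_new and scale_old are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Assumption: 4 KiB pages. The real value depends on the architecture. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Assumed shape of the old aliases that the patch removes. */
#define PAGE_CACHE_SHIFT PAGE_SHIFT
#define PAGE_CACHE_SIZE  PAGE_SIZE
#define PAGE_CACHE_MASK  PAGE_MASK

/* The shift distance the first two rules remove is always zero. */
_Static_assert(PAGE_CACHE_SHIFT - PAGE_SHIFT == 0, "aliases must match");

/* Old style: byte offset to page index via the page-cache constant. */
unsigned long index_old(unsigned long long pos)
{
	return (unsigned long)(pos >> PAGE_CACHE_SHIFT);
}

/* New style: the same computation with the plain page constant. */
unsigned long index_new(unsigned long long pos)
{
	return (unsigned long)(pos >> PAGE_SHIFT);
}

/* Old style: "convert" a page count to page-cache units (a no-op shift). */
unsigned long scale_old(unsigned long nr_pages)
{
	return nr_pages << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}
```

      Under these assumptions index_old() and index_new() agree for every
      offset, and scale_old(n) is simply n, which is why the shifts can be
      dropped.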
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 23 Feb, 2016 3 commits
    • C
      f2fs: support revoking atomic written pages · 28bc106b
      Chao Yu committed
      f2fs supports atomic writes with the following semantics:
      1. open db file
      2. ioctl start atomic write
      3. (write db file) * n
      4. ioctl commit atomic write
      5. close db file
      
      With this flow we can avoid the file becoming corrupted by an abnormal
      power cut, because we hold the data of the transaction in referenced pages
      linked in the inmem_pages list of the inode, without marking them dirty,
      so this data won't be persisted unless we commit it in step 4.
      
      But we should still hold the journal db file in memory by using volatile
      write, because our 'atomic write support' semantics are incomplete: in
      step 4 we could fail to submit all the dirty data of the transaction. Once
      part of the dirty data has been committed to storage, then after a
      checkpoint and an abnormal power cut, the db file will be corrupted forever.
      
      So this patch tries to improve the atomic write flow by adding a revoking
      flow: once an internal error occurs during committing, this gives another
      chance to revoke the partially submitted data of the current transaction,
      which makes the commit operation behave more like an atomic one.
      
      If we're unlucky and the revoking operation fails, EAGAIN will be
      reported to the user, suggesting either recovery with the held journal
      file or retrying the current transaction.
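
      The commit-then-revoke idea can be sketched in userspace as follows. This
      is a simplified model, not the f2fs implementation: the "disk" is an int
      array, and struct staged_write, commit_atomic, fail_at and revoke_fails
      are all hypothetical names introduced for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* One staged write of a transaction (hypothetical layout). */
struct staged_write {
	size_t blk;     /* index of the target block */
	int    new_val; /* data to commit */
	int    old_val; /* filled in at commit time so the write can be revoked */
};

/*
 * Commit n staged writes in order.
 * fail_at simulates an internal error just before write number fail_at
 * (pass a value >= n for "no error").
 * revoke_fails simulates the unlucky case where revoking itself fails.
 *
 * Returns 0 on success, -EIO if the commit failed but was fully revoked,
 * and -EAGAIN if the revoke failed and partial data is left behind.
 */
int commit_atomic(int *disk, struct staged_write *w, size_t n,
		  size_t fail_at, int revoke_fails)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto revoke;
		w[i].old_val = disk[w[i].blk]; /* remember the old content */
		disk[w[i].blk] = w[i].new_val;
	}
	return 0;

revoke:
	if (revoke_fails)
		return -EAGAIN;
	/* Undo the writes that were already applied, newest first. */
	while (i-- > 0)
		disk[w[i].blk] = w[i].old_val;
	return -EIO;
}
```

      In this model a failed commit either leaves the disk exactly as it was,
      or reports EAGAIN so the caller knows it has to fall back to its journal
      or retry.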
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • C
      f2fs: remove unneeded pointer conversion · 81ca7350
      Chao Yu committed
      There is a redundant pointer conversion in the following call stack:
       - at position a, inode is converted to f2fs_inode_info.
       - at position b, f2fs_inode_info is converted back to inode.
      
       - truncate_blocks(inode,..)
        - fi = F2FS_I(inode)		---a
        - ADDRS_PER_PAGE(node_page, fi)
         - addrs_per_inode(fi)
          - inode = &fi->vfs_inode	---b
          - f2fs_has_inline_xattr(inode)
           - fi = F2FS_I(inode)
           - is_inode_flag_set(fi,..)
      
      In order to avoid the unneeded conversion, alter ADDRS_PER_PAGE and
      addrs_per_inode to accept a parameter of inode pointer type.
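
      A minimal userspace sketch of the resulting shape, assuming F2FS_I() is
      a container_of-style conversion from the embedded vfs_inode back to its
      enclosing struct f2fs_inode_info. The struct layouts, the
      FI_INLINE_XATTR bit and the two numeric constants are simplified
      stand-ins for illustration, not the kernel definitions.

```c
#include <stddef.h>

struct inode {
	unsigned long i_ino;
};

/* Simplified stand-in for the f2fs per-inode structure. */
struct f2fs_inode_info {
	unsigned long flags;
	struct inode vfs_inode; /* embedded VFS inode */
};

#define FI_INLINE_XATTR     0x1UL /* hypothetical flag bit */
#define DEF_ADDRS_PER_INODE 923   /* illustrative value */
#define INLINE_XATTR_ADDRS  50    /* illustrative value */

/* inode -> f2fs_inode_info, done by subtracting the member offset. */
struct f2fs_inode_info *F2FS_I(struct inode *inode)
{
	return (struct f2fs_inode_info *)
		((char *)inode - offsetof(struct f2fs_inode_info, vfs_inode));
}

int f2fs_has_inline_xattr(struct inode *inode)
{
	return (F2FS_I(inode)->flags & FI_INLINE_XATTR) != 0;
}

/*
 * Taking the inode directly means the caller no longer converts to
 * f2fs_inode_info only for this function to convert straight back.
 */
int addrs_per_inode(struct inode *inode)
{
	if (f2fs_has_inline_xattr(inode))
		return DEF_ADDRS_PER_INODE - INLINE_XATTR_ADDRS;
	return DEF_ADDRS_PER_INODE;
}
```

      With this signature only one conversion remains on the path, inside
      f2fs_has_inline_xattr().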
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: use wait_for_stable_page to avoid contention · fec1d657
      Jaegeuk Kim committed
      In write_begin, if storage supports stable_page, we don't need to wait for
      writeback to update its contents.
      This patch uses wait_for_stable_page instead of
      wait_on_page_writeback.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 31 Dec, 2015 1 commit
  4. 05 Dec, 2015 2 commits
  5. 13 Oct, 2015 2 commits
    • C
      f2fs: support lower priority asynchronous readahead in ra_meta_pages · 26879fb1
      Chao Yu committed
      Now, we use ra_meta_pages to read as many contiguous physical blocks as
      possible to improve the performance of subsequent reads. However,
      ra_meta_pages uses a synchronous readahead approach, submitting bios with
      READ. Because READ has high priority, it is not suitable for preloading
      blocks when it is uncertain when these readahead pages will be used.
      
      This patch supports asynchronous readahead in ra_meta_pages by tagging the
      bio with the READA flag in order to allow preloading.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • C
      f2fs: don't tag REQ_META for temporary non-meta pages · 2b947003
      Chao Yu committed
      In the recovery or checkpoint flow, we temporarily grab pages in the meta
      inode's mapping to cache temporary data. The data in these pages is not
      f2fs metadata, but we still tag them with the REQ_META flag. However,
      lower-level devices such as eMMC may apply optimizations to data of this
      type. So, to avoid incorrect optimization, we'd better remove this flag
      for temporary non-meta pages.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 10 Oct, 2015 2 commits
  7. 20 Aug, 2015 1 commit
  8. 06 Aug, 2015 1 commit
    • C
      f2fs: recover invalid/reserved block address for fsynced file · 12a8343e
      Chao Yu committed
      When testing with generic/101 in xfstests, the following error was output:
      
          --- tests/generic/101.out
          +++ results//generic/101.out.bad
          @@ -10,10 +10,14 @@
           File foo content after log replay:
           0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
           *
          -0200000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          +0200000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
           *
           0372000
          ...
          (Run 'diff -u tests/generic/101.out results/generic/101.out.bad'  to see the entire diff)
      
      The test flow is as follows:
      1. pwrite foo -S 0xaa 0 64K
      2. pwrite foo -S 0xbb 64K 61K
      3. sync
      4. truncate foo 64K
      5. truncate foo 125K
      6. fsync foo
      7. flakey drop writes
      8. umount
      
      After this test, we expect the recovered file to have its first 64k
      filled with 0xaa and the next 61k filled with 0x00, because we fsynced it
      before dropping writes in dm.
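
      The expected content can be checked on any POSIX filesystem with a small
      sketch that replays only the write and truncate steps (1, 2, 4 and 5).
      It does not reproduce the sync, fsync, dm-flakey or log replay parts of
      the test. The helper names fill and byte_after_truncates, and the
      temporary file location under /tmp, are assumptions made for this
      example.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define KIB 1024

/* Write len bytes of value val starting at offset off. Returns 0 on success. */
int fill(int fd, int val, off_t off, size_t len)
{
	char buf[4096];

	memset(buf, val, sizeof(buf));
	while (len > 0) {
		size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t n = pwrite(fd, buf, chunk, off);

		if (n <= 0)
			return -1;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Replay the writes and truncates of the test flow on a temporary file and
 * return the byte found at offset probe, or -1 on any error.
 */
int byte_after_truncates(off_t probe)
{
	char path[] = "/tmp/g101-XXXXXX";
	unsigned char b = 0;
	int ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path); /* the file disappears once fd is closed */

	if (fill(fd, 0xaa, 0, 64 * KIB) == 0 &&
	    fill(fd, 0xbb, 64 * KIB, 61 * KIB) == 0 &&
	    ftruncate(fd, 64 * KIB) == 0 &&   /* drops the 0xbb range */
	    ftruncate(fd, 125 * KIB) == 0 &&  /* extends with zeros */
	    pread(fd, &b, 1, probe) == 1)
		ret = b;

	close(fd);
	return ret;
}
```

      Probing any offset below 64k should give 0xaa, and any offset from 64k
      up to the end of the file should give 0x00, which is the content the
      test expects to see after log replay as well.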
      
      In f2fs, during recovery, we only recover a valid block address in a
      direct node page if it is marked as an fsynced dnode; a block address
      that means invalid/reserved (with value NULL_ADDR/NEW_ADDR) is not
      recovered. So, the recovered file shows the incorrect data 0xbb in the
      range [64k, 125k].
      
      In this patch, we fix this by recovering invalid/reserved blocks during
      the recovery flow.
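
      The difference can be modelled with two arrays of block addresses: the
      addresses known at the last checkpoint and the addresses recorded in the
      fsynced dnode. This is a deliberately simplified sketch, not the f2fs
      recovery code; recover_old and recover_new are hypothetical names, and
      the real flow also reserves a new block for NEW_ADDR, which is not
      modelled here.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t block_t;

#define NULL_ADDR ((block_t)0)  /* invalid address */
#define NEW_ADDR  ((block_t)-1) /* reserved, not yet allocated */

/* Old behavior: invalid/reserved addresses in the fsynced dnode are skipped. */
void recover_old(block_t *cur, const block_t *fsynced, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (fsynced[i] == NULL_ADDR || fsynced[i] == NEW_ADDR)
			continue; /* the stale cur[i] survives */
		cur[i] = fsynced[i];
	}
}

/* New behavior: invalid/reserved addresses are recovered as well. */
void recover_new(block_t *cur, const block_t *fsynced, size_t n)
{
	for (size_t i = 0; i < n; i++)
		cur[i] = fsynced[i];
}
```

      In this model a slot that was truncated before the fsync keeps pointing
      at the old 0xbb block under recover_old(), while recover_new() resets it
      so that reads return zeros.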
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  9. 05 Aug, 2015 1 commit
    • C
      f2fs: invalidate temporary meta page · e90c2d28
      Chao Yu committed
      To avoid encountering garbage data in the next free node block at the end
      of the warm node chain during recovery, we try to zero out that invalid
      block.
      
      If the device does not support discard, we zero out the block by
      grabbing a temporary zeroed page in the meta inode and then issuing a
      write request with this page.
      
      But we forgot to release that temporary page, so memory usage increases
      without any hit-ratio benefit; it's better to free it to save memory.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  10. 03 Jun, 2015 1 commit
  11. 29 May, 2015 2 commits
  12. 08 May, 2015 1 commit
  13. 17 Apr, 2015 1 commit
  14. 11 Apr, 2015 5 commits
  15. 04 Mar, 2015 3 commits
  16. 12 Feb, 2015 2 commits
    • C
      f2fs: merge flags in struct f2fs_sb_info · caf0047e
      Chao Yu committed
      Currently, there are several variables of Boolean type, as below:
      
      struct f2fs_sb_info {
      ...
      	int s_dirty;
      	bool need_fsck;
      	bool s_closing;
      ...
      	bool por_doing;
      ...
      }
      
      This causes some issues:
      1. some space in f2fs_sb_info is wasted due to the padding the compiler
         adds after Boolean-type variables.
      2. if we keep adding new flags to f2fs_sb_info, the structure will become
         messy.
      
      So in this patch, we try to:
      1. switch s_dirty to a Boolean-type variable, since it has only two states, 0/1.
      2. merge the s_dirty/need_fsck/s_closing/por_doing variables into s_flag.
      3. introduce an enum type which indicates the different states of sbi.
      4. use the newly introduced universal interfaces
         is_sbi_flag_set/{set,clear}_sbi_flag to operate on sbi flags.
      
      After that, the above issues will be fixed.
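
      A minimal userspace sketch of what such a merged flag word could look
      like. The helper names and s_flag come from the text above; the enum
      constant names, the unsigned long type of s_flag and the bit layout are
      assumptions made for illustration.

```c
#include <stdbool.h>

/* One enum value per former variable (names are illustrative). */
enum {
	SBI_IS_DIRTY,  /* was s_dirty */
	SBI_IS_CLOSE,  /* was s_closing */
	SBI_NEED_FSCK, /* was need_fsck */
	SBI_POR_DOING, /* was por_doing */
};

/* Reduced stand-in for the superblock info: only the merged flag word. */
struct f2fs_sb_info {
	unsigned long s_flag;
};

bool is_sbi_flag_set(const struct f2fs_sb_info *sbi, unsigned int type)
{
	return (sbi->s_flag & (1UL << type)) != 0;
}

void set_sbi_flag(struct f2fs_sb_info *sbi, unsigned int type)
{
	sbi->s_flag |= 1UL << type;
}

void clear_sbi_flag(struct f2fs_sb_info *sbi, unsigned int type)
{
	sbi->s_flag &= ~(1UL << type);
}
```

      Adding a new state then only needs a new enum value, and all four
      former variables share a single word, so the struct layout stays stable.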
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: leave comment for code readability · bc4a1f87
      Jaegeuk Kim committed
      During recovery, no xattr blocks should be found, since they are
      written into the cold log, not the warm node chain.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  17. 10 Jan, 2015 1 commit
  18. 09 Dec, 2014 1 commit
  19. 24 Nov, 2014 1 commit
  20. 04 Nov, 2014 2 commits
  21. 01 Oct, 2014 2 commits
    • J
      f2fs: check the use of macros on block counts and addresses · 7cd8558b
      Jaegeuk Kim committed
      This patch cleans up the existing and new macros for readability.
      
      The rule is as follows.
      
               ,-----------------------------------------> MAX_BLKADDR -,
               |  ,------------- TOTAL_BLKS ----------------------------,
               |  |                                                     |
               |  ,- seg0_blkaddr   ,----- sit/nat/ssa/main blkaddress  |
      block    |  | (SEG0_BLKADDR)  | | | |   (e.g., MAIN_BLKADDR)      |
      address  0..x................ a b c d .............................
                  |                                                     |
      global seg# 0...................... m .............................
                  |                       |                             |
                  |                       `------- MAIN_SEGS -----------'
                  `-------------- TOTAL_SEGS ---------------------------'
                                          |                             |
       seg#                               0..........xx..................
      
      = Note =
       o GET_SEGNO_FROM_SEG0 : blk address -> global segno
       o GET_SEGNO           : blk address -> segno
       o START_BLOCK         : segno -> starting block address
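
      The three conversions in the note can be sketched as plain functions.
      This is an illustration of the arithmetic implied by the diagram, not
      the kernel macros themselves: struct layout, its field names and the
      lower-case function names are hypothetical, and main_start_segno stands
      for the global segment number marked "m" in the diagram.

```c
#include <stdint.h>

typedef uint32_t block_t;

/* Hypothetical container for the layout parameters used below. */
struct layout {
	block_t      seg0_blkaddr;       /* block address of global segment 0 */
	unsigned int log_blocks_per_seg; /* log2 of blocks per segment */
	unsigned int main_start_segno;   /* global segno where the main area starts */
};

/* blk address -> global segno (counted from segment 0) */
unsigned int get_segno_from_seg0(const struct layout *l, block_t blk)
{
	return (blk - l->seg0_blkaddr) >> l->log_blocks_per_seg;
}

/* blk address -> segno (counted from the start of the main area) */
unsigned int get_segno(const struct layout *l, block_t blk)
{
	return get_segno_from_seg0(l, blk) - l->main_start_segno;
}

/* segno -> starting block address of that main-area segment */
block_t start_block(const struct layout *l, unsigned int segno)
{
	return l->seg0_blkaddr +
		((block_t)(segno + l->main_start_segno) << l->log_blocks_per_seg);
}
```

      For example, with seg0_blkaddr = 512, 512 blocks per segment and the
      main area starting at global segment 10, block address 6661 falls in
      global segment 12, which is main-area segment 2 starting at block 6656.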
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • J
      f2fs: introduce cp_control structure · 75ab4cb8
      Jaegeuk Kim committed
      This patch adds a new data structure to control checkpoint parameters.
      Currently, it carries the reason for the checkpoint, such as is_umount or
      normal sync.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  22. 24 Sep, 2014 3 commits
  23. 16 Sep, 2014 1 commit
    • J
      f2fs: fix double lock for inode page during roll-forward recovery · 60979115
      Jaegeuk Kim committed
      If the inode is the same and its data indices need to be truncated, we can
      fall into a double lock on its inode page via get_dnode_of_data.
      
      The error case is like this.
      
      1. write data 1, 2, 3, 4, 5 in inode #4.
      2. write data 100, 102, 103, 104, 105 in dnode #6 of inode #4.
      3. sync
      4. update data 100->106 in dnode #6.
      5. fsync inode #4.
      6. power-cut
      
      -> Then,
      1. go back to #3's checkpoint
      2. in do_recover_data, get_dnode_of_data() gets inode #4.
      3. detect 100->106 in dnode #6.
      4. check_index_in_prev_nodes tries to truncate 100 in dnode #6.
      5. to trigger truncate_hole, get_dnode_of_data should grab inode #4.
      6. detect *kernel hang*
      
      This patch should resolve that bug.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>