1. 09 5月, 2017 1 次提交
    • M
      mm: introduce kv[mz]alloc helpers · a7c3e901
      Michal Hocko 提交于
      Patch series "kvmalloc", v5.
      
      There are many open coded kmalloc with vmalloc fallback instances in the
      tree.  Most of them are not careful enough or simply do not care about
      the underlying semantic of the kmalloc/page allocator which means that
      a) some vmalloc fallbacks are basically unreachable because the kmalloc
      part will keep retrying until it succeeds b) the page allocator can
      invoke a really disruptive steps like the OOM killer to move forward
      which doesn't sound appropriate when we consider that the vmalloc
      fallback is available.
      
      As it can be seen implementing kvmalloc requires quite an intimate
      knowledge if the page allocator and the memory reclaim internals which
      strongly suggests that a helper should be implemented in the memory
      subsystem proper.
      
      Most callers, I could find, have been converted to use the helper
      instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
      in the networking stack which I have converted as well and Eric Dumazet
      was not opposed [2] to convert them as well.
      
      [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
      [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com
      
      This patch (of 9):
      
      Using kmalloc with the vmalloc fallback for larger allocations is a
      common pattern in the kernel code.  Yet we do not have any common helper
      for that and so users have invented their own helpers.  Some of them are
      really creative when doing so.  Let's just add kv[mz]alloc and make sure
      it is implemented properly.  This implementation makes sure to not make
      a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
      to not warn about allocation failures.  This also rules out the OOM
      killer as the vmalloc is a more approapriate fallback than a disruptive
      user visible action.
      
      This patch also changes some existing users and removes helpers which
      are specific for them.  In some cases this is not possible (e.g.
      ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
      require GFP_NO{FS,IO} context which is not vmalloc compatible in general
      (note that the page table allocation is GFP_KERNEL).  Those need to be
      fixed separately.
      
      While we are at it, document that __vmalloc{_node} about unsupported gfp
      mask because there seems to be a lot of confusion out there.
      kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
      superset) flags to catch new abusers.  Existing ones would have to die
      slowly.
      
      [sfr@canb.auug.org.au: f2fs fixup]
        Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7c3e901
  2. 20 3月, 2017 4 次提交
    • C
      f2fs: combine nat_bits and free_nid_bitmap cache · 7041d5d2
      Chao Yu 提交于
      Both nat_bits cache and free_nid_bitmap cache provide same functionality
      as a intermediate cache between free nid cache and disk, but with
      different granularity of indicating free nid range, and different
      persistence policy. nat_bits cache provides better persistence ability,
      and free_nid_bitmap provides better granularity.
      
      In this patch we combine advantage of both caches, so finally policy of
      the intermediate cache would be:
      - init: load free nid status from nat_bits into free_nid_bitmap
      - lookup: scan free_nid_bitmap before load NAT blocks
      - update: update free_nid_bitmap in real-time
      - persistence: udpate and persist nat_bits in checkpoint
      
      This patch also resolves performance regression reported by lkp-robot.
      
      commit:
        4ac91242 ("f2fs: introduce free nid bitmap")
        d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le")
        1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache")
      
      4ac91242 d00030cf9cd0bb96fdccc41e33 1382c0f3f9d3f936c8bc42ed15
      ---------------- -------------------------- --------------------------
               %stddev     %change         %stddev     %change         %stddev
                   \          |                \          |                \
           77863 ±  0%      +2.1%      79485 ±  1%     +50.8%     117404 ±  0%  aim7.jobs-per-min
          231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time
          231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time.max
          896604 ±  0%      -0.8%     889221 ±  3%     -20.2%     715260 ±  1%  aim7.time.involuntary_context_switches
            2394 ±  1%      +4.6%       2503 ±  1%      +3.7%       2481 ±  2%  aim7.time.maximum_resident_set_size
            6240 ±  0%      -1.5%       6145 ±  1%     -14.1%       5360 ±  1%  aim7.time.system_time
         1111357 ±  3%      +1.9%    1132509 ±  2%      -6.2%    1041932 ±  2%  aim7.time.voluntary_context_switches
      ...
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Tested-by: NXiaolong Ye <xiaolong.ye@intel.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      7041d5d2
    • C
      f2fs: skip scanning free nid bitmap of full NAT blocks · 586d1492
      Chao Yu 提交于
      This patch adds to account free nids for each NAT blocks, and while
      scanning all free nid bitmap, do check count and skip lookuping in
      full NAT block.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      586d1492
    • J
      f2fs: use __set{__clear}_bit_le · 23380b85
      Jaegeuk Kim 提交于
      This patch uses __set{__clear}_bit_le for highter speed.
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      23380b85
    • J
      f2fs: declare static functions · 9f7e4a2c
      Jaegeuk Kim 提交于
      This is to avoid build warning reported by kbuild test robot.
      Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      9f7e4a2c
  3. 01 3月, 2017 1 次提交
  4. 28 2月, 2017 5 次提交
  5. 24 2月, 2017 6 次提交
  6. 23 2月, 2017 1 次提交
  7. 29 1月, 2017 1 次提交
  8. 26 11月, 2016 2 次提交
    • C
      f2fs: fix to account total free nid correctly · 04d47e67
      Chao Yu 提交于
      Thread A		Thread B		Thread C
      - f2fs_create
       - f2fs_new_inode
        - f2fs_lock_op
         - alloc_nid
          alloc last nid
        - f2fs_unlock_op
      			- f2fs_create
      			 - f2fs_new_inode
      			  - f2fs_lock_op
      			   - alloc_nid
      			    as node count still not
      			    be increased, we will
      			    loop in alloc_nid
      						- f2fs_write_node_pages
      						 - f2fs_balance_fs_bg
      						  - f2fs_sync_fs
      						   - write_checkpoint
      						    - block_operations
      						     - f2fs_lock_all
       - f2fs_lock_op
      
      While creating new inode, we do not allocate and account nid atomically,
      so that when there is almost no free nids left, we may encounter deadloop
      like above stack.
      
      In order to avoid that, reuse nm_i::available_nids for accounting free nids
      and make nid allocation and counting being atomical during node creation.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      04d47e67
    • Y
      f2fs: fix an infinite loop when flush nodes in cp · d40a43af
      Yunlei He 提交于
      Thread A			Thread B
      
      - write_checkpoint
       - block_operations
         -blk_start_plug
          -sync_node_pages		- f2fs_do_sync_file
      				 - fsync_node_pages
      				  - f2fs_wait_on_page_writeback
      
      Thread A wait for global F2FS_DIRTY_NODES decreased to zero,
      it start a plug list, some requests have been added to this list.
      Thread B lock one dirty node page, and wait this page write back.
      But this page has been in plug list of thread A with PG_writeback flag.
      Thread A keep on running and its plug list has no chance to finish,
      so it seems a deadlock between cp and fsync path.
      
      This patch add a wait on page write back before set node page dirty
      to avoid this problem.
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NPengyang Hou <houpengyang@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      d40a43af
  9. 24 11月, 2016 9 次提交
  10. 03 11月, 2016 1 次提交
  11. 01 11月, 2016 1 次提交
  12. 01 10月, 2016 3 次提交
    • C
      f2fs: fix to commit bio cache after flushing node pages · 3f5f4959
      Chao Yu 提交于
      In sync_node_pages, we won't check and commit last merged pages in private
      bio cache of f2fs, as these pages were taged as writeback, someone who is
      waiting for writebacking of the page will be blocked until the cache was
      committed by someone else.
      
      We need to commit node type bio cache to avoid potential deadlock or long
      delay of waiting writeback.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      3f5f4959
    • C
      f2fs: support configuring fault injection per superblock · 1ecc0c5c
      Chao Yu 提交于
      Previously, we only support global fault injection configuration, so that
      when we configure type/rate of fault injection through sysfs, mount
      option, it will influence all f2fs partition which is being used.
      
      It is not make sence, since it will be not convenient if developer want
      to test separated partitions with different fault injection rate/type
      simultaneously, also it's not possible to enable fault injection in one
      partition and disable fault injection in other one.
      
      >From now on, we move global configuration of fault injection in module
      into per-superblock, hence injection testing can be more flexible.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      1ecc0c5c
    • W
      f2fs: add customized migrate_page callback · 5b7a487c
      Weichao Guo 提交于
      This patch improves the migration of dirty pages and allows migrating atomic
      written pages that F2FS uses in Page Cache. Instead of the fallback releasing
      page path, it provides better performance for memory compaction, CMA and other
      users of memory page migrating. For dirty pages, there is no need to write back
      first when migrating. For an atomic written page before committing, we can
      migrate the page and update the related 'inmem_pages' list at the same time.
      Signed-off-by: NWeichao Guo <guoweichao@huawei.com>
      Reviewed-by: NChao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix some coding style]
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      5b7a487c
  13. 16 9月, 2016 1 次提交
  14. 14 9月, 2016 1 次提交
  15. 30 8月, 2016 1 次提交
  16. 19 8月, 2016 1 次提交
  17. 21 7月, 2016 1 次提交