Commits · fec1d6576cdf2ce13f84fcdf7b20d02a05f76fc6 · openeuler / Kernel

23 Feb, 2016 14 commits

f2fs: use wait_for_stable_page to avoid contention · fec1d657

Committed by Jaegeuk Kim on Jan 20, 2016

In write_begin, if storage supports stable_page, we don't need to wait for
writeback to update its contents.
This patch uses wait_for_stable_page instead of
wait_on_page_writeback.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: enhance foreground GC · 718e53fa

Committed by Chao Yu on Jan 23, 2016

If a section is configured to consist of multiple segments, foreground GC
does the garbage collection with the following approach:

	for each segment in victim section
		blk_start_plug
		for each valid block in segment
			write out by OPU method
		submit bio cache   <---
		blk_finish_plug   <---

There are two issues:
1) most of the time, 'submit bio cache' prevents writes from the next
segment from merging into the current bio buffer, so smaller bios are
submitted.
2) the block plug only covers IO submission within one segment, which reduces
the opportunity to merge IOs across multiple segments in the plug.

So refactor the code into the structure below to maximize the
opportunity for merging IOs:

	blk_start_plug
	for each segment in victim section
		for each valid block in segment
			write out by OPU method
	submit bio cache
	blk_finish_plug

Test method:
1. mkfs.f2fs -s 8 /dev/sdX
2. touch 32 files
3. write 2M data into each file
4. punch 1.5M data from offset 0 for each file
5. trigger foreground gc through ioctl

Before the patch, a total of 40 bios were submitted.
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65536, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65776, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66016, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66256, size = 122880
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 66496, size = 32768
----repeat for 8 times

After the patch, a total of 35 bios were submitted.
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 65536, size = 122880
----repeat 34 times
f2fs_submit_write_bio: dev = (8,32), WRITE_SYNC, DATA, sector = 73696, size = 16384
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
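The bio counts quoted in this entry can be reproduced with a small userspace model. This is only a sketch, not kernel code: `count_bios` and its parameters are hypothetical names, the 30-block bio limit is read off the 122880-byte bios in the trace, and the figure of 128 valid 4 KiB blocks per segment is inferred from each "before" group (4 x 122880 + 32768 bytes = 512 KiB). It also assumes every valid block can merge with its neighbor.

```python
def count_bios(valid_blocks_per_segment, max_bio_blocks, flush_per_segment):
    """Count the bios submitted while migrating one victim section.

    Hypothetical model: blocks are appended to a single bio buffer that is
    submitted when it is full; 'submit bio cache' submits a partial buffer.
    """
    bios = 0
    pending = 0  # blocks currently held in the bio buffer
    for valid in valid_blocks_per_segment:
        for _ in range(valid):
            pending += 1
            if pending == max_bio_blocks:
                bios += 1
                pending = 0
        # Old structure: 'submit bio cache' ran once per segment.
        if flush_per_segment and pending:
            bios += 1
            pending = 0
    # New structure: 'submit bio cache' runs once, after all segments.
    if pending:
        bios += 1
    return bios


BLOCK_SIZE = 4096
MAX_BIO_BLOCKS = 122880 // BLOCK_SIZE  # 30 blocks, the largest bio in the trace
SEGMENTS = [128] * 8                   # assumed: 8 segments, 512 KiB valid each

before = count_bios(SEGMENTS, MAX_BIO_BLOCKS, flush_per_segment=True)
after = count_bios(SEGMENTS, MAX_BIO_BLOCKS, flush_per_segment=False)
print(before, after)  # 40 35
```

Under these assumptions the per-segment flush costs one short trailing bio per segment (5 bios x 8 segments), while the single flush leaves only one short bio at the very end.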

f2fs: don't need to call set_page_dirty for io error · e3ef1876

Committed by Jaegeuk Kim on Jan 25, 2016

If end_io gets an error, we don't need to set the page as dirty, since we
already set f2fs_stop_checkpoint which will not flush any data.

This will resolve the following warning.

======================================================
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
4.4.0+ #9 Tainted: G           O
------------------------------------------------------
xfs_io/26773 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 (&(&sbi->inode_lock[i])->rlock){+.+...}, at: [<ffffffffc025483f>] update_dirty_page+0x6f/0xd0 [f2fs]

and this task is already holding:
 (&(&q->__queue_lock)->rlock){-.-.-.}, at: [<ffffffff81396ea2>] blk_queue_bio+0x422/0x490
which would create a new lock dependency:
 (&(&q->__queue_lock)->rlock){-.-.-.} -> (&(&sbi->inode_lock[i])->rlock){+.+...}
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid needless sync_inode_page when reading inline_data · ae96e7bd

Committed by Jaegeuk Kim on Jan 25, 2016

In write_begin, if there is inline_data, f2fs loads it into the 0th data page.
Since it's the read path, we don't need to sync its inode page.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: don't need to sync node page at every time · 52f80337

Committed by Jaegeuk Kim on Jan 25, 2016

In write_end, we don't need to sync the inode page every time.
Instead, we can expect f2fs_write_inode to update it later.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid multiple node page writes due to inline_data · 2049d4fc

Committed by Jaegeuk Kim on Jan 25, 2016

The scenario is:
1. create fully node blocks
2. flush node blocks
3. write inline_data for all the node blocks again
4. flush node blocks redundantly

So, this patch tries to flush inline_data when flushing node blocks.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: do f2fs_balance_fs when block is allocated · 3c082b7b

Committed by Jaegeuk Kim on Jan 23, 2016

We should consider data block allocation to trigger f2fs_balance_fs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to overcome inline_data floods · 6e17bfbc

Committed by Jaegeuk Kim on Jan 23, 2016

The scenario is:
1. create lots of node blocks
2. sync
3. write lots of inline_data
-> got panic due to no free space

In that case, we should flush node blocks when writing inline_data in #3,
and trigger gc as well.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use writepages->lock for WB_SYNC_ALL · 25c13551

Committed by Jaegeuk Kim on Jan 20, 2016

If there are many writepages calls by multiple threads in the background, we
don't need to serialize them to merge all the bios, since it's background.
In such a case, it'd be better to run writepages concurrently.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
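As a rough illustration of the locking policy in this entry, the userspace sketch below takes a serializing lock only for WB_SYNC_ALL writeback and lets background writeback run without it. It is a model, not the kernel implementation: the `Sbi` class, `write_data_pages`, and the `write_cache_pages` callback are hypothetical stand-ins, and a `threading.Lock` plays the role of the writepages mutex.

```python
import threading

WB_SYNC_NONE, WB_SYNC_ALL = 0, 1  # modeled after the kernel's sync modes


class Sbi:
    """Hypothetical stand-in for the per-filesystem state."""

    def __init__(self):
        self.writepages = threading.Lock()


def write_data_pages(sbi, sync_mode, write_cache_pages):
    """Serialize only synchronous writeback; background calls run concurrently."""
    locked = False
    if sync_mode == WB_SYNC_ALL:
        sbi.writepages.acquire()
        locked = True
    try:
        return write_cache_pages()
    finally:
        if locked:
            sbi.writepages.release()


sbi = Sbi()
# The callback reports whether the lock is held while pages are written.
print(write_data_pages(sbi, WB_SYNC_ALL, sbi.writepages.locked))   # True
print(write_data_pages(sbi, WB_SYNC_NONE, sbi.writepages.locked))  # False
```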

f2fs: remove needless condition check · b483fadf

Committed by Jaegeuk Kim on Jan 20, 2016

This patch removes a needless condition variable.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: correct search area in get_new_segment · 0ab14356

Committed by Chao Yu on Jan 22, 2016

get_new_segment starts from the current segment position and searches for a
free segment among its right neighbors located in the same section.

But previously our search area was set to [current segment, max segment],
which means we had to search more bits in the free_segmap bitmap in some
worse cases. So here we correct the search area to [current segment, last
segment in section] to avoid unnecessary searching.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
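The narrowed search area can be sketched in userspace as a bounded bitmap scan. This is an illustrative model under stated assumptions: the bitmap is a plain Python list in which a 0 bit is taken to mean "free", `find_next_zero_bit` is a local helper that only mimics the kernel primitive of the same name, and `search_in_section` is a hypothetical name.

```python
def find_next_zero_bit(bitmap, limit, start):
    """Return the first index in [start, limit) whose bit is 0, else limit."""
    for i in range(start, limit):
        if not bitmap[i]:
            return i
    return limit


def search_in_section(free_segmap, segno, segs_per_sec):
    """Look for a free right neighbor of segno inside its own section only."""
    # Exclusive upper bound: one past the last segment of segno's section.
    section_end = (segno // segs_per_sec + 1) * segs_per_sec
    hit = find_next_zero_bit(free_segmap, section_end, segno + 1)
    return hit if hit < section_end else None


# Three sections of four segments; 1 = in use, 0 = free (assumed encoding).
segmap = [1, 1, 0, 1,  1, 1, 1, 1,  0, 0, 0, 0]
print(search_in_section(segmap, 0, 4))  # 2
print(search_in_section(segmap, 5, 4))  # None
```

In the second call the scan stops at the section boundary instead of walking on into the third section, which is the saving the entry describes.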

f2fs: export dirty_nats_ratio in sysfs · 2304cb0c

Committed by Chao Yu on Jan 18, 2016

This patch exports a new sysfs entry 'dirty_nats_ratio' to control the
threshold of dirty nat entries; if the current ratio exceeds the configured
threshold, a checkpoint will be triggered in f2fs_balance_fs_bg to flush
dirty nats.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: flush dirty nat entries when exceeding threshold · 7d768d2c

Committed by Chao Yu on Jan 18, 2016

When testing f2fs with xfstests, generic/251 is stuck for a long time.
The case uses the steps below to obtain freshly released space on the device,
in order to prepare for the following fstrim test.

1. rm -rf /mnt/dir
2. mkdir /mnt/dir/
3. cp -axT `pwd`/ /mnt/dir/
4. goto 1

During the preparation step, all nat entries are cached in the nat cache;
most of them are dirty entries with an invalid blkaddr, which means the
nodes related to these entries have been truncated, and they can
be reused after the dirty entries have been checkpointed.

However, no checkpoint was triggered, so nid allocators
(e.g. mkdir, creat) run into a long journey of iterating over all NAT
pages, looking for free nids in alloc_nid->build_free_nids.

Here, in f2fs_balance_fs_bg we give another chance to do a checkpoint
to flush nat entries so they can be reused in the free nid cache when the
dirty entry count exceeds 10% of the max count.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
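The threshold check described in this entry can be modeled in a few lines. This is a hedged userspace sketch: the function names, the `do_checkpoint` callback, and the strict "greater than" comparison at exactly 10% are assumptions chosen to follow the wording "exceeds 10% of max count"; the kernel's exact boundary behavior is not stated here.

```python
DEF_DIRTY_NAT_RATIO = 10  # percent; the 10% figure comes from the commit message


def excess_dirty_nats(dirty_nat_cnt, max_nat_cnt, ratio=DEF_DIRTY_NAT_RATIO):
    """True when dirty entries exceed `ratio` percent of the maximum count."""
    # Integer arithmetic only; boundary handling is an assumption.
    return dirty_nat_cnt * 100 > max_nat_cnt * ratio


def balance_fs_bg(dirty_nat_cnt, max_nat_cnt, do_checkpoint,
                  ratio=DEF_DIRTY_NAT_RATIO):
    """Background balancing step: checkpoint when too many nats are dirty."""
    if excess_dirty_nats(dirty_nat_cnt, max_nat_cnt, ratio):
        do_checkpoint()  # flushing frees the truncated nids for reuse
        return True
    return False


checkpoints = []
balance_fs_bg(5, 100, lambda: checkpoints.append("cp"))   # below threshold
balance_fs_bg(11, 100, lambda: checkpoints.append("cp"))  # above threshold
print(checkpoints)  # ['cp']
```

Passing a different `ratio` stands in for tuning the threshold through the sysfs entry mentioned in the previous entry.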

f2fs: relocate is_merged_page · 0fd785eb

Committed by Chao Yu on Jan 18, 2016

Operations in is_merged_page are related to the inner bio cache, so move it
to data.c.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

23 Jan, 2016 1 commit

wrappers for ->i_mutex access · 5955102c

Committed by Al Viro on Jan 22, 2016

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

15 Jan, 2016 1 commit

kmemcg: account certain kmem allocations to memcg · 5d097056

Committed by Vladimir Davydov on Jan 14, 2016

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 Jan, 2016 5 commits

f2fs: should unset atomic flag after successful commit · 447135a8

Committed by Jaegeuk Kim on Jan 09, 2016

If there is an error during commit, we should keep the flag in order to
abort it.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix wrong memory condition check · 1663cae4

Committed by Jaegeuk Kim on Jan 09, 2016

This patch fixes a wrong decision in available_free_memory.
The return value is already set to false, so we should consider only the
true condition below.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: monitor the number of background checkpoint · 42190d2a

Committed by Jaegeuk Kim on Jan 09, 2016

This patch shows the number of background checkpoints.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect idle time depending on user behavior · d0239e1b

Committed by Jaegeuk Kim on Jan 08, 2016

This patch records the last time the user requested filesystem operations.
This information is used later to detect whether the system is idle.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce time and interval facility · 6beceb54

Committed by Jaegeuk Kim on Jan 08, 2016

This patch adds time and interval arrays to store some timing variables.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
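One way to picture the time and interval arrays, and the idle detection built on them in the entry above, is the userspace sketch below. Everything here is an assumption made for illustration: the slot names `CP_TIME` and `REQ_TIME`, the class and method names, the interval values, and the use of an injectable clock in place of the kernel's tick counter.

```python
import time

CP_TIME, REQ_TIME, MAX_TIME = 0, 1, 2  # assumed slot names for illustration


class TimeInfo:
    """Per-filesystem timestamps with one expiry interval per slot."""

    def __init__(self, intervals, clock=time.monotonic):
        assert len(intervals) == MAX_TIME
        self.clock = clock
        self.interval_time = list(intervals)   # seconds, one per slot
        self.last_time = [clock()] * MAX_TIME  # last event, one per slot

    def update_time(self, slot):
        """Record that the event tracked by `slot` just happened."""
        self.last_time[slot] = self.clock()

    def time_over(self, slot):
        """True once the slot's interval has fully elapsed."""
        return self.clock() > self.last_time[slot] + self.interval_time[slot]

    def is_idle(self):
        """Idle means no user request within the REQ_TIME interval."""
        return self.time_over(REQ_TIME)


now = [0.0]  # fake clock so the example is deterministic
info = TimeInfo([60, 5], clock=lambda: now[0])
now[0] = 6.0
print(info.is_idle())        # True
info.update_time(REQ_TIME)   # a user request arrives
print(info.is_idle())        # False
```

Keeping the timestamps and intervals in parallel arrays means a new timer only needs a new slot index rather than a new pair of fields.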

09 Jan, 2016 9 commits

f2fs: skip releasing nodes in chindless extent tree · 9b72a388

Committed by Chao Yu on Jan 08, 2016

If there are no nodes in the extent tree, skip the releasing step to avoid
the overhead of grabbing/releasing the extent tree lock.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use atomic type for node count in extent tree · 68e35385

Committed by Chao Yu on Jan 08, 2016

1. rename field in struct extent_tree from count to node_cnt for
   readability.
2. use atomic type for node_cnt.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: recognize encrypted data in f2fs_fiemap · da5af127

Committed by Chao Yu on Jan 08, 2016

This patch teaches f2fs_fiemap to recognize encrypted data.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clean up f2fs_balance_fs · 2c4db1a6

Committed by Jaegeuk Kim on Jan 07, 2016

This patch adds one parameter to clean up all the callers of f2fs_balance_fs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove redundant calls · 2a4b8e9f

Committed by Jaegeuk Kim on Jan 07, 2016

This patch removes redundant calls.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid unnecessary f2fs_balance_fs calls · 12719ae1

Committed by Jaegeuk Kim on Jan 07, 2016

Only when a node page is newly dirtied do we need to check whether to do
f2fs_gc.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: check the page status filled from disk · 7612118a

Committed by Jaegeuk Kim on Jan 01, 2016

After reading a page, we need to check whether there is any error.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce __get_node_page to reuse common code · 0e022ea8

Committed by Chao Yu on Jan 05, 2016

There is duplicated code between get_node_page and get_node_page_ra;
introduce __get_node_page to include the common parts of the two, and
export get_node_page and get_node_page_ra by reusing __get_node_page.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: check node id earily when readaheading node page · e8458725

Committed by Chao Yu on Jan 08, 2016

Add a node id check in ra_node_page and get_node_page_ra, as in get_node_page.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

07 Jan, 2016 4 commits

f2fs: read isize while holding i_mutex in fiemap · de1475cc

Committed by Fan Li on Jan 04, 2016

Make sure the isize we read doesn't change during the process.
Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Revert "f2fs: check the node block address of newly allocated nid" · 957efb0c

Committed by Jaegeuk Kim on Jan 02, 2016

Original issue is fixed by:

  f2fs: cover more area with nat_tree_lock

This reverts commit 24928634.

f2fs: cover more area with nat_tree_lock · a5131193

Committed by Jaegeuk Kim on Jan 02, 2016

There was a subtle bug in nat cache management which causes wrong nid
allocation or wrong block addresses when try_to_free_nats is triggered heavily.
This patch enlarges the previous coverage of nat_tree_lock to avoid the data race.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

fs: use block_device name vsprintf helper · a1c6f057

Committed by Dmitry Monakhov on Apr 13, 2015

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

04 Jan, 2016 1 commit

f2fs: introduce max_file_blocks in sbi · e0afc4d6

Committed by Chao Yu on Dec 31, 2015

Introduce max_file_blocks in sbi to store the max block index of a file in
f2fs; it can be used to avoid unneeded calculation of the max block index at
runtime.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: fix overflow of sbi->max_file_blocks]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
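To give a feel for the value being cached, the sketch below computes a maximum block index from an inode layout with in-inode addresses, two direct, two indirect, and one double-indirect node block. The constants are assumptions about the on-disk format for 4 KiB blocks and do not come from this entry; the function is a userspace illustration, not the kernel helper.

```python
# Assumed layout constants for 4 KiB blocks (not stated in this entry).
ADDRS_PER_INODE = 923 - 50  # in-inode address slots minus inline xattr slots
ADDRS_PER_BLOCK = 1018      # data block addresses per direct node block
NIDS_PER_BLOCK = 1018       # node ids per indirect node block
BLOCK_SIZE = 4096


def max_file_blocks():
    """Sum the data blocks reachable through every level of the node tree."""
    result = ADDRS_PER_INODE
    leaf_count = ADDRS_PER_BLOCK
    result += leaf_count * 2      # two direct node blocks
    leaf_count *= NIDS_PER_BLOCK
    result += leaf_count * 2      # two indirect node blocks
    leaf_count *= NIDS_PER_BLOCK
    result += leaf_count          # one double-indirect node block
    return result


blocks = max_file_blocks()
print(blocks)               # 1057053389
print(blocks * BLOCK_SIZE)  # 4329690681344
```

Under these assumptions the block count still fits in 32 bits, but the byte size does not, so any conversion to bytes needs 64-bit arithmetic. Whether that is the overflow referred to in the bracketed note is not stated in the entry.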

01 Jan, 2016 5 commits

f2fs crypto: check CONFIG_F2FS_FS_XATTR for encrypted symlink · 3a9e6433

Committed by Chao Yu on Dec 31, 2015

Add the missing CONFIG_F2FS_FS_XATTR check for encrypted symlink inodes in
order to avoid unneeded registration of ->{get,set,remove}xattr.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce zombie list for fast shrinking extent trees · 137d09f0

Committed by Jaegeuk Kim on Dec 31, 2015

This patch removes refcount, and instead, adds zombie_list to shrink directly
without radix tree traversal.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: monitor zombie_tree count · c00ba554

Committed by Jaegeuk Kim on Dec 31, 2015

This patch adds an entry to show the number of zombie extent_trees.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use IPU for fdatasync · c46a155b

Committed by Jaegeuk Kim on Dec 31, 2015

This patch fixes the missing IPU condition when fdatasync is called.
With this patch, fdatasync is able to avoid additional node writes for recovery.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: write pending bios when cp_error is set · 8d4ea29b

Committed by Jaegeuk Kim on Dec 31, 2015

When testing ioc_shutdown, put_super can hang while waiting for
pages under writeback, as follows.

INFO: task umount:2723 blocked for more than 120 seconds.
      Tainted: G           O    4.4.0-rc3+ #8
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount          D ffff88000859f9d8     0  2723   2110 0x00000000
 ffff88000859f9d8 0000000000000000 0000000000000000 ffffffff81e11540
 ffff880078c225c0 ffff8800085a0000 ffff88007fc17440 7fffffffffffffff
 ffffffff818239f0 ffff88000859fb48 ffff88000859f9f0 ffffffff8182310c
Call Trace:
 [<ffffffff818239f0>] ? bit_wait+0x50/0x50
 [<ffffffff8182310c>] schedule+0x3c/0x90
 [<ffffffff81827fb9>] schedule_timeout+0x2d9/0x430
 [<ffffffff810e0f8f>] ? mark_held_locks+0x6f/0xa0
 [<ffffffff8111614d>] ? ktime_get+0x7d/0x140
 [<ffffffff818239f0>] ? bit_wait+0x50/0x50
 [<ffffffff8106a655>] ? kvm_clock_get_cycles+0x25/0x30
 [<ffffffff8111617c>] ? ktime_get+0xac/0x140
 [<ffffffff818239f0>] ? bit_wait+0x50/0x50
 [<ffffffff81822564>] io_schedule_timeout+0xa4/0x110
 [<ffffffff81823a25>] bit_wait_io+0x35/0x50
 [<ffffffff818235bd>] __wait_on_bit+0x5d/0x90
 [<ffffffff811b9e8b>] wait_on_page_bit+0xcb/0xf0
 [<ffffffff810d5f90>] ? autoremove_wake_function+0x40/0x40
 [<ffffffff811cf84c>] truncate_inode_pages_range+0x4bc/0x840
 [<ffffffff811cfc3d>] truncate_inode_pages_final+0x4d/0x60
 [<ffffffffc023ced5>] f2fs_evict_inode+0x75/0x400 [f2fs]
 [<ffffffff812639bc>] evict+0xbc/0x190
 [<ffffffff81263d19>] iput+0x229/0x2c0
 [<ffffffffc0241885>] f2fs_put_super+0x105/0x1a0 [f2fs]
 [<ffffffff8124756a>] generic_shutdown_super+0x6a/0xf0
 [<ffffffff812478f7>] kill_block_super+0x27/0x70
 [<ffffffffc0241290>] kill_f2fs_super+0x20/0x30 [f2fs]
 [<ffffffff81247b03>] deactivate_locked_super+0x43/0x70
 [<ffffffff81247f4c>] deactivate_super+0x5c/0x60
 [<ffffffff81268d2f>] cleanup_mnt+0x3f/0x90
 [<ffffffff81268dc2>] __cleanup_mnt+0x12/0x20
 [<ffffffff810ac463>] task_work_run+0x73/0xa0
 [<ffffffff810032ac>] exit_to_usermode_loop+0xcc/0xd0
 [<ffffffff81003e7c>] syscall_return_slowpath+0xcc/0xe0
 [<ffffffff81829ea2>] int_ret_from_sys_call+0x25/0x9f
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

openeuler / Kernel, synced successfully more than 1 year ago