Commits · 21cb1d99bcc77252e6426010bcc6433f75b581bb · openanolis / cloud-kernel

11 Apr, 2015 26 commits

f2fs: fix to cover sentry_lock for block allocation · 21cb1d99

Committed by Jaegeuk Kim on Mar 11, 2015

In the following call stack, f2fs changes the bitmap for dirty segments and the
number of dirty sentries without grabbing sit_i->sentry_lock.
This can result in a mismatch between the bitmap and the number of dirty sentries
if there are some direct_io operations.

In allocate_data_block,
 - __allocate_new_segments
  - mutex_lock(&curseg->curseg_mutex);
  - s_ops->allocate_segment
   - new_curseg/change_curseg
    - reset_curseg
     - __set_sit_entry_type
      - __mark_sit_entry_dirty
       - set_bit(dirty_sentries_bitmap)
       - dirty_sentries++;
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
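
The locking rule behind this fix can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct sit_info_sketch`, `allocate_segment_sketch()` and the other names below are hypothetical, and a pthread mutex stands in for `sit_i->sentry_lock`. The point it shows is that the bitmap bit and the dirty counter are only ever changed together while the lock is held, so the two cannot drift apart.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define NR_SEGS_SKETCH 64u /* hypothetical number of segments */

/* Hypothetical stand-in for the SIT bookkeeping named in the commit. */
struct sit_info_sketch {
    pthread_mutex_t sentry_lock;
    unsigned char dirty_sentries_bitmap[NR_SEGS_SKETCH / 8];
    unsigned int dirty_sentries;
};

static void sit_info_init(struct sit_info_sketch *sit)
{
    pthread_mutex_init(&sit->sentry_lock, NULL);
    memset(sit->dirty_sentries_bitmap, 0, sizeof(sit->dirty_sentries_bitmap));
    sit->dirty_sentries = 0;
}

/* Caller must hold sentry_lock: the bit and the counter change together. */
static void mark_sit_entry_dirty(struct sit_info_sketch *sit, unsigned int segno)
{
    unsigned char mask = (unsigned char)(1u << (segno % 8));

    assert(segno < NR_SEGS_SKETCH);
    if (!(sit->dirty_sentries_bitmap[segno / 8] & mask)) {
        sit->dirty_sentries_bitmap[segno / 8] |= mask;
        sit->dirty_sentries++;
    }
}

/* Allocation path: hold the lock around the whole bitmap + counter update. */
static void allocate_segment_sketch(struct sit_info_sketch *sit, unsigned int segno)
{
    pthread_mutex_lock(&sit->sentry_lock);
    mark_sit_entry_dirty(sit, segno);
    pthread_mutex_unlock(&sit->sentry_lock);
}

/* Count set bits so the invariant "set bits == counter" can be checked. */
static unsigned int count_dirty_bits(const struct sit_info_sketch *sit)
{
    unsigned int n = 0;

    for (unsigned int i = 0; i < NR_SEGS_SKETCH; i++)
        if (sit->dirty_sentries_bitmap[i / 8] & (1u << (i % 8)))
            n++;
    return n;
}
```

Marking the same segment twice leaves the counter unchanged, which mirrors the test-and-set style of update described in the call stack.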


f2fs: fix to check current blkaddr in __allocate_data_blocks · d6d4f1cb

Committed by Chao Yu on Mar 12, 2015

In __allocate_data_blocks, we should check the current blkaddr, which is located at
ofs_in_node of the dnode page, instead of checking the first blkaddr every time.
Otherwise we can only allocate one blkaddr in each dnode page. Fix it.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to truncate inline data past EOF · 0bfcfcca

Committed by Chao Yu on Mar 10, 2015

Previously, if an inode had inline data, we would try to invalidate partial inline
data in page #0 when truncating the inode size in truncate_partial_data_page().
We then set page #0 dirty, after which we can synchronize the inode page with
page #0 at ->writepage().

But sometimes we fail to operate on page #0 in truncate_partial_data_page()
for the reasons below:
a) if offset is zero, we skip setting page #0 dirty.
b) if page #0 is not uptodate, we fail to update it as it has no mapping
data.

So with the following operations, we will meet old data which should have been
truncated.

1.write inline data to file
2.sync first data page to inode page
3.truncate file size to 0
4.truncate file size to max_inline_size
5.echo 1 > /proc/sys/vm/drop_caches
6.read file --> meet original inline data which is remained in inode page.

This patch renames truncate_inline_data() to truncate_inline_inode() for code
readability, then uses truncate_inline_inode() to truncate inline data in the inode
page in truncate_blocks() and truncates page #0 in truncate_partial_data_page()
as the fix.

v2:
 o partially truncate page #0 in truncate_partial_data_page to avoid keeping
   old data in page #0.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix reference leaks in f2fs_acl_create · 83dfe53c

Committed by Chao Yu on Mar 09, 2015

Our f2fs_acl_create is copied and modified from posix_acl_create to avoid
deadlock bug when inline_dentry feature is enabled.

Now, we got reference leaks in posix_acl_create, and this has been fixed in
commit fed0b588 ("posix_acl: fix reference leaks in posix_acl_create")
by Omar Sandoval.
https://lkml.org/lkml/2015/2/9/5

Let's fix this issue in f2fs_acl_create too.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Changman Lee <cm224.lee@ssamsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to calculate max length of contiguous free slots correctly · bda19076

Committed by Chao Yu on Mar 09, 2015

When looking up for creation, we try to record the level of the current dentry
hash table if the current dentry has enough contiguous slots for storing the name of
the new file which will be created later. This saves lookup time when adding a link
into the parent dir.

But currently in find_target_dentry, the length of contiguous free slots
is not calculated correctly. This occasionally leaves holes in the dentry block,
which wastes dentry block space.

Let's refactor the lookup flow for max slots as following to fix this issue:
a) increase max_len if current slot is free;
b) update max_slots with max_len if max_len is larger than max_slots;
c) reset max_len to zero if current slot is not free.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
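
Steps a) to c) above can be sketched as a single pass over the slot map. This is a simplified illustration rather than the f2fs implementation: the function name `max_free_slots()` and the `bool` array standing in for the dentry bitmap are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the length of the longest run of free slots.
 * used[i] == true means slot i is occupied (hypothetical representation
 * of the dentry bitmap).
 */
static int max_free_slots(const bool *used, int nr_slots)
{
    int max_slots = 0; /* longest free run seen so far */
    int max_len = 0;   /* length of the free run ending at the current slot */

    assert(used != NULL);
    for (int i = 0; i < nr_slots; i++) {
        if (!used[i]) {
            max_len++;                /* a) current slot is free */
            if (max_len > max_slots)  /* b) remember the longest run */
                max_slots = max_len;
        } else {
            max_len = 0;              /* c) run is broken, start over */
        }
    }
    return max_slots;
}
```

Resetting `max_len` only when an occupied slot is seen is what keeps a run from being counted across a hole.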


f2fs: fix unlocked nat set cache operation · 57ed1e95

Committed by Wanpeng Li on Mar 09, 2015

nm_i->nat_tree_lock is used to synchronize operations on both the nat entry
cache tree and the nat set cache tree. However, it isn't held when flushing
nat entries during checkpoint, which leads to a potential race. This patch
fixes it by holding the lock during the gang lookup of the nat set cache and
when deleting items from the nat set cache.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: cleanup statement about max orphan inodes calc · e0150392

Committed by Changman Lee on Mar 09, 2015

Through each macro, we can read the meaning easily.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove unnecessary condition judgment · d9f46bb1

Committed by Yuan Zhong on Mar 09, 2015

Remove the unnecessary condition judgment, because
'max_slots' has already been initialized to '0' at the beginning
of the function, as follows:
if (max_slots)
       *max_slots = 0;
Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: set the correct place of initializing *res_page · b1f73b79

Committed by Yuan Zhong on Mar 07, 2015

The function 'find_in_inline_dir()' contains 'res_page'
as an argument. So, we should initialize 'res_page' before
this function.
Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: reduce searching region of segmap when set free section · 7fd97019

Committed by Wanpeng Li on Mar 06, 2015

In __set_free we check whether all segments in a section are free
when freeing one segment, in order to set the section to free status. But the
search region of segmap runs from the start segno to the last segno of the main
area, which is not necessary. So let's only check the segment bitmap
of the target section.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
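
The narrowed search can be sketched as follows. This is an illustrative userspace version, not the kernel's bitmap helpers: `section_is_free()`, the `bool` array used as the segmap, and the assumption that the segment count is a multiple of `segs_per_sec` are all choices made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Check whether every segment of the section containing segno is free,
 * scanning only [start, start + segs_per_sec) instead of the whole main area.
 * in_use[i] == true means segment i is allocated (hypothetical segmap).
 */
static bool section_is_free(const bool *in_use, unsigned int segno,
                            unsigned int segs_per_sec)
{
    assert(segs_per_sec > 0);

    unsigned int start = (segno / segs_per_sec) * segs_per_sec;
    unsigned int end = start + segs_per_sec;

    for (unsigned int i = start; i < end; i++)
        if (in_use[i])
            return false; /* some segment in this section is still in use */
    return true;
}
```

The cost of the check now depends on the section size rather than on the size of the main area.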


f2fs: fix extent cache memory leak · fdf6c8be

Committed by Wanpeng Li on Mar 06, 2015

The extent tree/node slab cache is created during f2fs insmod;
however, it isn't destroyed during f2fs rmmod. This patch fixes
it by destroying the extent tree/node slab cache on f2fs rmmod.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: relocate Kconfig from misc filesystems · d7196c5a

Committed by Jaegeuk Kim on Mar 03, 2015

f2fs has been shipped on many smartphone devices for a couple of years.
So, it is worth relocating its Kconfig entry from misc filesystems into the main
page so that developers can choose it more easily.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: report -ENOENT for unreached data indices · 76629165

Committed by Jaegeuk Kim on Mar 02, 2015

If an inode has inline_data, it should report -ENOENT when accessing an
out-of-bound region.
This is used by f2fs_fiemap, which treats -ENOENT as no error.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clear append/update flags once fsync is done · cff28521

Committed by Jaegeuk Kim on Mar 02, 2015

When fsync is done through checkpoint, f2fs previously failed to clear the append
and update flags. This patch fixes that by clearing them.

This was originally caught by Changman Lee.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid to trigger writepage during POR · d5669f7b

Committed by Jaegeuk Kim on Feb 27, 2015

This patch has no effect on previous behavior, since
f2fs_write_data_page bypasses writing the page during POR.

The difference is that this patch avoids holding the writepages mutex.
This is to avoid the following false warning, since this can happen only
when mount and shutdown are triggered at the same time.

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.0.0-rc1+ #3 Tainted: G           O
 -------------------------------------------------------
 kworker/u8:0/2270 is trying to acquire lock:
  (&sbi->gc_mutex){+.+.+.}, at: [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs]

 but task is already holding lock:
  (&sbi->writepages){+.+...}, at: [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (&sbi->writepages){+.+...}:
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs]
        [<ffffffff811c38c1>] do_writepages+0x21/0x50
        [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0
        [<ffffffff8126e23a>] writeback_single_inode+0xea/0x1c0
        [<ffffffff8126e425>] write_inode_now+0x95/0xa0
        [<ffffffff81259dab>] iput+0x20b/0x3f0
        [<ffffffffa02c1c8b>] recover_data.constprop.14+0x26b/0xa80 [f2fs]
        [<ffffffffa02c2776>] recover_fsync_data+0x2b6/0x5e0 [f2fs]
        [<ffffffffa02a9744>] f2fs_fill_super+0xb24/0xb90 [f2fs]
        [<ffffffff8123d7f4>] mount_bdev+0x1a4/0x1e0
        [<ffffffffa02a3c85>] f2fs_mount+0x15/0x20 [f2fs]
        [<ffffffff8123e159>] mount_fs+0x39/0x180
        [<ffffffff8125e51b>] vfs_kern_mount+0x6b/0x160
        [<ffffffff81261554>] do_mount+0x204/0xbe0
        [<ffffffff8126223b>] SyS_mount+0x8b/0xe0
        [<ffffffff81863e6d>] system_call_fastpath+0x16/0x1b

 -> #1 (&sbi->cp_mutex){+.+...}:
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02acbf2>] write_checkpoint+0x42/0x1230 [f2fs]
        [<ffffffffa02a847d>] f2fs_sync_fs+0x9d/0x2a0 [f2fs]
        [<ffffffff81272f82>] sync_filesystem+0x82/0xb0
        [<ffffffff8123c214>] generic_shutdown_super+0x34/0x100
        [<ffffffff8123c5f7>] kill_block_super+0x27/0x70
        [<ffffffffa02a3c60>] kill_f2fs_super+0x20/0x30 [f2fs]
        [<ffffffff8123ca49>] deactivate_locked_super+0x49/0x80
        [<ffffffff8123d05e>] deactivate_super+0x4e/0x70
        [<ffffffff8125df63>] cleanup_mnt+0x43/0x90
        [<ffffffff8125e002>] __cleanup_mnt+0x12/0x20
        [<ffffffff810a82e4>] task_work_run+0xc4/0xf0
        [<ffffffff8101f0bd>] do_notify_resume+0x8d/0xa0
        [<ffffffff81864141>] int_signal+0x12/0x17

 -> #0 (&sbi->gc_mutex){+.+.+.}:
        [<ffffffff810e2866>] __lock_acquire+0x1ac6/0x1c90
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs]
        [<ffffffffa02b5938>] f2fs_write_data_page+0x348/0x5b0 [f2fs]
        [<ffffffffa02af9da>] __f2fs_writepage+0x1a/0x50 [f2fs]
        [<ffffffff811c1b54>] write_cache_pages+0x274/0x6f0
        [<ffffffffa02b2630>] f2fs_write_data_pages+0xe0/0x3a0 [f2fs]
        [<ffffffff811c38c1>] do_writepages+0x21/0x50
        [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0
        [<ffffffff8126d44a>] writeback_sb_inodes+0x32a/0x710
        [<ffffffff8126d8cf>] __writeback_inodes_wb+0x9f/0xd0
        [<ffffffff8126dcdb>] wb_writeback+0x3db/0x850
        [<ffffffff8126e848>] bdi_writeback_workfn+0x148/0x980
        [<ffffffff810a3782>] process_one_work+0x1e2/0x840
        [<ffffffff810a3f01>] worker_thread+0x121/0x460
        [<ffffffff810a9dc8>] kthread+0xf8/0x110
        [<ffffffff81863dbc>] ret_from_fork+0x7c/0xb0
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add stat info for moved blocks by background gc · e1235983

Committed by Changman Lee on Dec 23, 2014

This patch is for looking into gc performance of f2fs in detail.
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
[Jaegeuk Kim: fix build errors]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to issue small discard in real-time mode discard · b28c3f94

Committed by Chao Yu on Feb 28, 2015

Now in f2fs, we share functions and structures between batch mode and real-time mode
discard. For real-time mode discard, the shared function add_discard_addrs
compares the uninitialized trim_minlen in struct cp_control with the length
of contiguous free blocks to decide whether to skip discarding fragmented free space,
which makes us ignore small discards sometimes. Fix it.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add cond_resched() to sync_dirty_dir_inodes() · 7ecebe5e

Committed by Sebastian Andrzej Siewior on Feb 27, 2015

In a preempt-off environment with a lot of FS activity (write/delete) I run
into a CPU stall:

| NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59]
| Modules linked in:
| CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G        W      3.19.0-00010-g10c11c51ffed #153
| Workqueue: writeback bdi_writeback_workfn (flush-179:0)
| task: df230000 ti: df23e000 task.ti: df23e000
| PC is at __submit_merged_bio+0x6c/0x110
| LR is at f2fs_submit_merged_bio+0x74/0x80
…
| [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c)
| Exception stack(0xdf23fb48 to 0xdf23fb90)
| fb40:                   deef3484 ffff0001 ffff0001 00000027 deef3484 00000000
| fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90
| fb80: c02191f0 c0218fa0 60000013 ffffffff
| [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110)
| [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80)
| [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78)
| [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10)
| [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc)
| [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68)
| [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110)
| [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48)
| [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228)
| [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378)
| [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8)
| [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c)
| [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460)
| [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4)
| [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490)
| [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100)
| [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24)

As it turns out, the code loops in sync_dirty_dir_inodes() and waits for
others to make progress but since it never leaves the CPU there is no
progress made. At the time of this stall, there is also a rm process
blocked:
| rm              R running      0  1989   1774 0x00000000
| [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c)
| [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54)
| [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8)
| [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30)
| [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64)
| [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268)
| [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140)
| [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134)
| [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180)
| [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c)
| [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168)
| [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c)
| [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c)

As explained by Jaegeuk Kim:
|This inode is the directory (c.f., do_rmdir) causing a infinite loop on
|sync_dirty_dir_inodes.
|The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the
|inode is under eviction, it submits bios and do it again until eviction
|is finished.

This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO
is submitted so other threads can make progress.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[Jaegeuk Kim: change fs/f2fs to f2fs in subject as naming convention]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix max orphan inodes calculation · 14b42817

Committed by Wanpeng Li on Feb 27, 2015

cp_payload is introduced for the sit bitmap to support large volumes, and it is
placed just after the block of f2fs_checkpoint + nat bitmap, so the first segment
should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks.
However, the current max orphan inodes calculation doesn't consider cp_payload.
This patch fixes it by subtracting the number of cp_payload blocks from the total
blocks of the first segment when calculating max orphan inodes.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
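
The corrected arithmetic can be written out as a small sketch. The macro names come from the commit message, but the numeric values assigned to them here, the `F2FS_ORPHANS_PER_BLOCK` constant, and the helper name `max_orphan_inodes()` are assumptions for illustration only; consult the f2fs headers of the kernel in question for the real definitions.

```c
#include <assert.h>

/* Values assumed for illustration; not taken from the commit message. */
#define F2FS_CP_PACKS          2u    /* checkpoint pack blocks */
#define NR_CURSEG_TYPE         6u    /* current segment summary blocks */
#define F2FS_ORPHANS_PER_BLOCK 1020u /* orphan inode entries per block */

/*
 * Blocks left for orphan inodes in the first segment are what remains
 * after the checkpoint packs, the curseg summaries and cp_payload.
 */
static unsigned int max_orphan_inodes(unsigned int blocks_per_seg,
                                      unsigned int cp_payload)
{
    unsigned int reserved = F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload;

    assert(blocks_per_seg >= reserved);
    return (blocks_per_seg - reserved) * F2FS_ORPHANS_PER_BLOCK;
}
```

With these assumed values, each cp_payload block lowers the limit by one block's worth of orphan entries, which is the amount the old calculation overestimated.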


f2fs: don't need to collect dirty sit entries and flush journal when there's no dirty sit entries · 2b11a74b

Committed by Wanpeng Li on Feb 27, 2015

There is no need to collect dirty sit entries and flush the sit journal to sit
entries when there are no dirty sit entries. This patch checks dirty_sentries
earlier, just like flush_nat_entries.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix block_ops trace point · 2bda542d

Committed by Wanpeng Li on Feb 27, 2015

Block operations are used to flush all dirty node and dentry blocks in
the page cache and to suspend ordinary writing activities. However, there
are some cases, such as a cp error or a read-only mount, in which
block operations can't be invoked. The current trace point prints the block_ops
start prematurely, even if block_ops doesn't get the opportunity to execute.
This patch fixes it by moving the block_ops trace point to just before block_ops.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: check its block allocation to avoid producing wrong dirty pages · b7f204cc

Committed by Jaegeuk Kim on Feb 25, 2015

If a page is cached but its block was deallocated, we don't need to make
the page dirty again by gc and truncate_partial_data_page.

In that case, it needs to check its block allocation all the time instead
of giving an up-to-date page.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clear page's up-to-date if block was deallocated · 2bca1e23

Committed by Jaegeuk Kim on Feb 25, 2015

If a page's on-disk block was deallocated, let's remove the up-to-date flag to avoid
further access with wrong contents.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix the number of orphan inode blocks · 3c642985

Committed by Wanpeng Li on Feb 26, 2015

cp_pack_start_sum is calculated in do_checkpoint and is equal to
cpu_to_le32(1 + cp_payload_blks + orphan_blocks). The number of
orphan inode blocks is used by recover_orphan_inodes
to readahead meta pages and recover inodes. However, the current code
forgets to subtract the number of cp payload blocks when calculating
the number of orphan inode blocks. This patch fixes it.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce macro __cp_payload · 55141486

Committed by Wanpeng Li on Feb 26, 2015

This patch introduces the macro __cp_payload.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support fs shutdown · 1abff93d

Committed by Jaegeuk Kim on Jan 08, 2015

This patch introduces a generic ioctl for fs shutdown, which was used by xfs.

If this shutdown is triggered, filesystem stops any further IOs according to the
following options.

1. FS_GOING_DOWN_FULLSYNC
 : this will flush all the data and dentry blocks, and do checkpoint before
   shutdown.

2. FS_GOING_DOWN_METASYNC
 : this will do checkpoint before shutdown.

3. FS_GOING_DOWN_NOSYNC
 : this will trigger shutdown as is.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

04 Mar, 2015 14 commits

f2fs: avoid wrong error during recovery · 8fbc418f

Committed by Jaegeuk Kim on Feb 24, 2015

During the roll-forward recovery, -ENOENT for f2fs_iget can be skipped.
So, this error value should not be propagated.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove obsolete code · 1614091d

Committed by Jaegeuk Kim on Feb 23, 2015

This patch removes obsolete code in which the summary variable is not needed.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use extent cache for dir · cb3bc9ee

Committed by Chao Yu on Feb 05, 2015

We update the extent cache for all user inodes of f2fs including dir inodes, so this
patch gives another chance to try to get the physical address of a page from the
extent cache for dir inodes.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: switch to check FI_NO_EXTENT in f2fs_{lookup,update}_extent_cache · 91c5d9bc

Committed by Chao Yu on Feb 05, 2015

This patch switches to checking FI_NO_EXTENT in f2fs_{lookup,update}_extent_cache
instead of f2fs_{lookup,update}_extent_tree or {lookup,update}_extent_info.

No functionality modification in this patch.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support fast lookup in extent cache · 62c8af65

Committed by Chao Yu on Feb 05, 2015

This patch adds a fast lookup path for rb-tree extent cache.

In this patch we add a recently accessed extent node pointer 'cached_en' to the
extent tree. In the lookup path of the extent cache, we first look up the last
accessed extent node, which cached_en points to; if we do not hit in this node,
we try to look up the extent node in the rb-tree.

In this way we can sometimes avoid an unnecessary slow lookup in the rb-tree.

Note that a side effect of this patch is increased memory cost,
because we additionally store a pointer variable in each struct extent tree.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
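
The fast path can be sketched in userspace as below. To keep the example short, a sorted array with binary search stands in for the rb-tree, and all type and function names (`extent_node_sketch`, `extent_tree_sketch`, `lookup_extent()`) as well as the `slow_lookups` counter are hypothetical. Only the `cached_en` idea is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* One cached mapping: file offsets [fofs, fofs + len) start at block blk. */
struct extent_node_sketch {
    unsigned int fofs;
    unsigned int blk;
    unsigned int len;
};

struct extent_tree_sketch {
    struct extent_node_sketch *nodes;     /* sorted by fofs, non-overlapping */
    size_t count;
    struct extent_node_sketch *cached_en; /* most recently hit node */
    unsigned int slow_lookups;            /* illustration only */
};

static int covers(const struct extent_node_sketch *en, unsigned int pgofs)
{
    return en && pgofs >= en->fofs && pgofs - en->fofs < en->len;
}

static struct extent_node_sketch *
lookup_extent(struct extent_tree_sketch *et, unsigned int pgofs)
{
    assert(et != NULL);

    /* Fast path: try the last accessed node before searching. */
    if (covers(et->cached_en, pgofs))
        return et->cached_en;

    /* Slow path: binary search (stand-in for the rb-tree walk). */
    et->slow_lookups++;
    size_t lo = 0, hi = et->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        struct extent_node_sketch *en = &et->nodes[mid];

        if (pgofs < en->fofs) {
            hi = mid;
        } else if (pgofs - en->fofs >= en->len) {
            lo = mid + 1;
        } else {
            et->cached_en = en; /* remember the hit for the next lookup */
            return en;
        }
    }
    return NULL;
}
```

Sequential reads tend to land in the same extent repeatedly, which is the access pattern where the cached pointer avoids the search.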


f2fs: add trace for rb-tree extent cache ops · 1ec4610c

Committed by Chao Yu on Feb 05, 2015

This patch adds trace for lookup/update/shrink/destroy ops in rb-tree extent cache.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: show extent tree, node stat info in debugfs · 4bf6fd9f

Committed by Chao Yu on Feb 05, 2015

This patch adds and shows stat info of the total memory footprint for extent tree
and node in debugfs.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: enable rb-tree extent cache · 1dcc336b

Committed by Chao Yu on Feb 05, 2015

This patch enables rb-tree based extent cache in f2fs.

When we mount with "-o extent_cache", f2fs will try to add recently accessed
page-block mappings into the rb-tree based extent cache as much as possible, instead
of the original single extent info cache.

In this way, f2fs can support a more effective cache between the dnode page cache and
disk. It provides a high hit ratio in the cache with less memory when the dnode
page cache is reclaimed in a low-memory environment.

Storage: Sandisk sd card 64g
1.append write file (offset: 0, size: 128M);
2.override write file (offset: 2M, size: 1M);
3.override write file (offset: 4M, size: 1M);
...
4.override write file (offset: 48M, size: 1M);
...
5.override write file (offset: 112M, size: 1M);
6.sync
7.echo 3 > /proc/sys/vm/drop_caches
8.read file (size:128M, unit: 4k, count: 32768)
(time dd if=/mnt/f2fs/128m bs=4k count=32768)

Extent Hit Ratio:
		before		patched
Hit Ratio	121 / 1071	1071 / 1071

Performance:
		before		patched
real    	0m37.051s	0m35.556s
user    	0m0.040s	0m0.026s
sys     	0m2.990s	0m2.251s

Memory Cost:
		before		patched
Tree Count:	0		1 (size: 24 bytes)
Node Count:	0		45 (size: 1440 bytes)

v3:
 o retest and give more details of the test result.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add a mount option for rb-tree extent cache · 89672159

Committed by Chao Yu on Feb 05, 2015

This patch adds a mount option 'extent_cache' in f2fs.

If this option is set, f2fs tries to use an rb-tree based extent cache to cache more
mapping information with less memory; otherwise we use the original single
extent info cache.
Suggested-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add core functions for rb-tree extent cache · 429511cd

Committed by Chao Yu on Feb 05, 2015

This patch adds core functions including the slab cache init function and
init/lookup/update/shrink/destroy functions for rb-tree based extent cache.

Thanks to Jaegeuk Kim and Changman Lee, who gave many suggestions about the detailed
design and implementation of the extent cache.

Todo:
 * register rb-based extent cache shrink with mm shrink interface.

v2:
 o move set_extent_info and __is_{extent,back,front}_mergeable into f2fs.h.
 o introduce __{attach,detach}_extent_node for code readability.
 o add cond_resched() when fail to invoke kmem_cache_alloc/radix_tree_insert.
 o fix some coding style and typo issues.

v3:
 o fix oops due to using an unassigned pointer.
 o use list_del to remove extent node in shrink list.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
[Jaegeuk Kim: add static for some functions and declare in f2fs.h]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce infra macro and data structure of rb-tree extent cache · 13054c54

Committed by Chao Yu on Feb 05, 2015

Introduce infra macro and data structure for rb-tree based extent cache:

Macros:
 * EXT_TREE_VEC_SIZE: indicate vector size for gang lookup in extent tree.
 * F2FS_MIN_EXTENT_LEN: indicate minimum length of extent managed in cache.
 * EXTENT_CACHE_SHRINK_NUMBER: indicate number of extent in cache will be shrunk.

Basic data structures for extent cache:
 * struct extent_tree: extent tree entry per inode.
 * struct extent_node: extent info node linked in extent tree.

Besides, add new extent cache related fields to f2fs_sb_info.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce universal lookup/update interface for extent cache · 7e4dde79

Committed by Chao Yu on Feb 05, 2015

In this patch, we do these jobs:
1. rename {check,update}_extent_cache to {lookup,update}_extent_info;
2. introduce universal lookup/update interface of extent cache:
f2fs_{lookup,update}_extent_cache including above two real functions, then
export them to function callers.

So after above cleanup, we can add new rb-tree based extent cache into exported
interfaces.

v2:
 o remove "f2fs_" for inner function {lookup,update}_extent_info suggested by
   Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_map_bh to clean codes of check_extent_cache · a2e7d1bf

Committed by Chao Yu on Feb 05, 2015

This patch introduces f2fs_map_bh to clean up the code of check_extent_cache.

v2:
 o cleanup f2fs_map_bh pointed out by Jaegeuk Kim.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: simplfy a field name in struct f2fs_extent,extent_info · 4d0b0bd4

Committed by Chao Yu on Feb 05, 2015

Rename a field from 'blk_addr' to 'blk' in struct {f2fs_extent,extent_info},
as the annotation of this field describes its meaning well.

In this way, we can avoid long statements in the code of the following patches.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

openanolis / cloud-kernel, synced successfully more than 1 year ago
