Commits · b7ad7512b84b26f1c0ec823647a387627c138d32 · openeuler / raspberrypi-kernel

23 Feb, 2016 11 commits

f2fs: split journal cache from curseg cache · b7ad7512

Committed by Chao Yu on Feb 19, 2016

In the curseg cache, f2fs caches two different parts:
 - data of the current summary block, i.e. summary entries and footer info.
 - journal info, i.e. sparse nat/sit entries or io stat info.

This approach has two problems: 1) it may cause higher lock contention when
we access or update both parts of the cache, since the same mutex lock
curseg_mutex protects both. 2) when flushing, the current summary block with
the last journal info is written back to the device as a normal summary
block; however, journal info is treated as valid only in the current
summary block, so most normal summary blocks contain junk journal data,
which wastes the remaining space of the summary block.

So, to fix the above issues, we split the curseg cache into two parts:
a) current summary block, protected by the original mutex lock curseg_mutex
b) journal cache, protected by the newly introduced r/w semaphore journal_rwsem

When loading the curseg cache during ->mount, we store summary info and
journal info in different caches; when doing a checkpoint, we combine
the data of the two caches into the current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

b7ad7512
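
The load/checkpoint split described in b7ad7512 can be pictured with a small userspace model. This is only a sketch, not the kernel code: `sum_block_model`, `curseg_model`, `load_curseg`, `write_current_sum` and the array sizes are hypothetical names and values, and the two locks are mentioned in comments only.

```c
#include <assert.h>
#include <string.h>

#define N_ENTRIES 4   /* hypothetical sizes, far smaller than on disk */
#define N_JOURNAL 2

struct sum_block_model {        /* one summary block as stored on disk */
	int entries[N_ENTRIES];
	int journal[N_JOURNAL];
};

struct curseg_model {           /* in-memory state, kept as two caches */
	int entries[N_ENTRIES];     /* in the kernel: guarded by curseg_mutex */
	int journal[N_JOURNAL];     /* in the kernel: guarded by journal_rwsem */
};

/* Mount time: split one on-disk block into the two separate caches. */
static void load_curseg(struct curseg_model *c,
			const struct sum_block_model *disk)
{
	assert(c != NULL && disk != NULL);
	memcpy(c->entries, disk->entries, sizeof(c->entries));
	memcpy(c->journal, disk->journal, sizeof(c->journal));
}

/* Checkpoint time: combine both caches into the current summary block. */
static void write_current_sum(const struct curseg_model *c,
			      struct sum_block_model *disk)
{
	assert(c != NULL && disk != NULL);
	memcpy(disk->entries, c->entries, sizeof(disk->entries));
	memcpy(disk->journal, c->journal, sizeof(disk->journal));
}
```

Because the two arrays no longer share one lock, a journal lookup in such a model would not have to wait for a writer that only touches summary entries.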

f2fs: enhance IO path with block plug · e9f5b8b8

Committed by Chao Yu on Feb 14, 2016

Use block plug in more places, listed below, to let the process cache bios
as much as possible, in order to reduce the lock overhead of the queue in
the IO scheduler.
1) sync_meta_pages
2) ra_meta_pages
3) f2fs_balance_fs_bg
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e9f5b8b8

f2fs: introduce f2fs_journal struct to wrap journal info · dfc08a12

Committed by Chao Yu on Feb 14, 2016

Introduce a new structure f2fs_journal to wrap journal info in struct
f2fs_summary_block for readability.

struct f2fs_journal {
	union {
		__le16 n_nats;
		__le16 n_sits;
	};
	union {
		struct nat_journal nat_j;
		struct sit_journal sit_j;
		struct f2fs_extra_info info;
	};
} __packed;

struct f2fs_summary_block {
	struct f2fs_summary entries[ENTRIES_IN_SUM];
	struct f2fs_journal journal;
	struct summary_footer footer;
} __packed;
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

dfc08a12

f2fs: support revoking atomic written pages · 28bc106b

Committed by Chao Yu on Feb 06, 2016

f2fs supports atomic write with the following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file

With this flow we can avoid the file becoming corrupted on an abnormal power
cut, because we hold the data of the transaction in referenced pages linked in
the inmem_pages list of the inode, without setting them dirty, so this data
won't be persisted unless we commit it in step 4.

But we should still hold the journal db file in memory by using volatile
write, because our 'atomic write support' semantics are incomplete: in
step 4, we could fail to submit all dirty data of the transaction, and once
partial dirty data has been committed to storage, then after a checkpoint and
an abnormal power cut, the db file will be corrupted forever.

So this patch tries to improve the atomic write flow by adding a revoking flow:
once an inner error occurs during committing, this gives another chance to
revoke the partially submitted data of the current transaction, which makes
the committing operation more like an atomic one.

If we're not lucky and the revoking operation fails, EAGAIN will be
reported to the user, suggesting either doing the recovery with the held
journal file or retrying the current transaction.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

28bc106b
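
The commit-then-revoke flow of 28bc106b can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the f2fs implementation: `store_model`, `pending_write`, `store_write` and `commit_with_revoke` are hypothetical, storage is an int array, and write failures are injected through a bit mask.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NBLOCKS 8

struct pending_write {      /* hypothetical: one page of the transaction */
	int blk;                /* target block index */
	int new_data;           /* value the transaction wants to store */
	int old_data;           /* saved during commit so it can be revoked */
};

struct store_model {        /* hypothetical storage with failure injection */
	int blocks[NBLOCKS];
	unsigned calls;         /* write attempts so far */
	unsigned fail_mask;     /* bit k set: the k-th attempt fails */
};

static int store_write(struct store_model *s, int blk, int data)
{
	unsigned k = s->calls++;

	assert(blk >= 0 && blk < NBLOCKS);
	if (k < 32 && (s->fail_mask & (1u << k)))
		return -EIO;
	s->blocks[blk] = data;
	return 0;
}

/*
 * Returns 0 on success, the original error if the partial commit was
 * revoked, or -EAGAIN if revoking failed too (caller must recover).
 */
static int commit_with_revoke(struct store_model *s,
			      struct pending_write *w, size_t n)
{
	size_t done;
	int err = 0;

	for (done = 0; done < n; done++) {
		w[done].old_data = s->blocks[w[done].blk];
		err = store_write(s, w[done].blk, w[done].new_data);
		if (err)
			break;
	}
	if (!err)
		return 0;

	/* Undo the writes that did reach storage, newest first. */
	while (done > 0) {
		done--;
		if (store_write(s, w[done].blk, w[done].old_data))
			return -EAGAIN;
	}
	return err;
}
```

A caller that sees `-EAGAIN` in this model knows the stored data is a mix of old and new values, which matches the commit message's advice to recover from the held journal file or retry the transaction.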

f2fs: split drop_inmem_pages from commit_inmem_pages · 29b96b54

Committed by Chao Yu on Feb 06, 2016

Split drop_inmem_pages from commit_inmem_pages for code readability,
and to prepare for the following modification.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

29b96b54

f2fs: use correct errno · 60b286c4

Committed by Jaegeuk Kim on Feb 09, 2016

This patch fixes a misused error number.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

60b286c4

f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797

Committed by Chao Yu on Jan 18, 2016

f2fs uses a single bio buffer per data type (META/NODE/DATA) to cache as
many writes located at continuous block addresses as possible. After
submitting, these writes may still be cached in the bio buffer, so we have
to flush the cached writes by calling f2fs_submit_merged_bio.

Unfortunately, under high concurrency, the bio buffer could be
flushed by someone else before we submit it, for the reasons below:
a) there is no space in the bio buffer.
b) a request of a different type (SYNC, ASYNC) is added.
c) a discontinuous block address is added.

In this condition, f2fs_submit_merged_bio is harmful, because
it could break the following merging of writes in the bio buffer and split
one big bio into two smaller ones.

This patch introduces f2fs_submit_merged_bio_cond, which does a
conditional submit of the bio buffer. Before submitting, it checks
whether:
 - a page in the DATA type bio buffer matches the specified page;
 - a page in the DATA type bio buffer belongs to the specified inode;
 - a page in the NODE type bio buffer belongs to the specified inode;
If there is no eligible page in the bio buffer, we skip the submitting step,
which gives more chances to merge consecutive block IOs in the bio cache.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0c3a5797
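
The conditional submit in 0c3a5797 amounts to "flush only if one of my pages is actually cached". The sketch below models that check in userspace; `page_model`, `bio_buf_model`, `has_merged_page` and `submit_merged_cond` are hypothetical stand-ins and the model ignores the per-type (META/NODE/DATA) buffers of the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_PAGES 4

struct page_model {
	unsigned long ino;     /* owning inode number */
	unsigned long index;   /* page index within that inode */
};

struct bio_buf_model {
	struct page_model pages[BUF_PAGES];
	int nr;                /* pages currently cached */
	int submits;           /* how many times the buffer was flushed */
};

/* True if the buffer holds the given page, or any page of inode `ino`. */
static bool has_merged_page(const struct bio_buf_model *b,
			    const struct page_model *page, unsigned long ino)
{
	int i;

	for (i = 0; i < b->nr; i++) {
		const struct page_model *p = &b->pages[i];

		if (page && p->ino == page->ino && p->index == page->index)
			return true;
		if (ino && p->ino == ino)
			return true;
	}
	return false;
}

/* Flush only when an eligible page is cached; otherwise keep merging. */
static bool submit_merged_cond(struct bio_buf_model *b,
			       const struct page_model *page, unsigned long ino)
{
	assert(b != NULL && b->nr >= 0 && b->nr <= BUF_PAGES);
	if (!has_merged_page(b, page, ino))
		return false;
	b->nr = 0;             /* stands in for submitting the bio */
	b->submits++;
	return true;
}
```

In this model, passing `NULL` for the page and `0` for the inode number means "no filter of that kind"; that convention is an assumption of the sketch.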

f2fs: use wait_for_stable_page to avoid contention · fec1d657

Committed by Jaegeuk Kim on Jan 20, 2016

In write_begin, we need to wait for writeback before updating a page's
contents only if the storage requires stable pages.
This patch uses wait_for_stable_page instead of
wait_on_page_writeback.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

fec1d657

f2fs: correct search area in get_new_segment · 0ab14356

Committed by Chao Yu on Jan 22, 2016

get_new_segment starts from the current segment position and tries to find a
free segment among its right neighbors located in the same section.

But previously our search area was set to [current segment, max segment],
which means we had to search more bits in the free_segmap bitmap in some
worse cases. So here we correct the search area to [current segment, last
segment in section] to avoid unnecessary searching.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0ab14356
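
The corrected search window from 0ab14356 can be illustrated with a small userspace function. It is a sketch under assumptions: `find_free_right_neighbor` is a hypothetical name, and a flat `bool` array replaces the kernel's free_segmap bitmap.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * in_use[i] is true when segment i is allocated. Search only the right
 * neighbors of cur_seg that lie in the same section; return the first
 * free one, or -1 if the rest of the section is full.
 */
static int find_free_right_neighbor(const bool *in_use, int total_segs,
				    int segs_per_sec, int cur_seg)
{
	int sec_end, seg;

	assert(segs_per_sec > 0 && cur_seg >= 0 && cur_seg < total_segs);

	/* First segment number past the end of the current section. */
	sec_end = (cur_seg / segs_per_sec + 1) * segs_per_sec;
	if (sec_end > total_segs)
		sec_end = total_segs;

	for (seg = cur_seg + 1; seg < sec_end; seg++)
		if (!in_use[seg])
			return seg;
	return -1;
}
```

With 4 segments per section, a search starting at segment 1 looks only at segments 2 and 3 and never walks into the next section, even if that section has free segments.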

f2fs: flush dirty nat entries when exceeding threshold · 7d768d2c

Committed by Chao Yu on Jan 18, 2016

When testing f2fs with xfstests, generic/251 is stuck for a long time.
The case uses the series of steps below to obtain freshly released space on
the device, in order to prepare for the following fstrim test.

1. rm -rf /mnt/dir
2. mkdir /mnt/dir/
3. cp -axT `pwd`/ /mnt/dir/
4. goto 1

During the preparing step, all nat entries will be cached in the nat cache;
most of them are dirty entries with an invalid blkaddr, which means the
nodes related to these entries have been truncated, and they could
be reused after the dirty entries have been checkpointed.

However, no checkpoint has been triggered, so nid allocators
(e.g. mkdir, creat) will run into a long journey of iterating all NAT
pages, looking for free nids in alloc_nid->build_free_nids.

Here, in f2fs_balance_fs_bg, we give another chance to do a checkpoint
to flush nat entries, so that they can be reused in the free nid cache,
when the dirty entry count exceeds 10% of the max count.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

7d768d2c
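
The 10% trigger in 7d768d2c comes down to a ratio check. A minimal sketch, assuming a strict "exceeds" comparison as worded in the commit message; the function and macro names are hypothetical and the kernel's exact boundary handling may differ.

```c
#include <assert.h>
#include <stdbool.h>

#define DIRTY_NAT_RATIO_PCT 10   /* the 10% from the commit message */

/* True when dirty entries exceed DIRTY_NAT_RATIO_PCT percent of the max. */
static bool excess_dirty_nats(unsigned long dirty_cnt, unsigned long max_cnt)
{
	assert(dirty_cnt <= max_cnt);
	/* Compare cross-multiplied values to avoid integer division. */
	return (unsigned long long)dirty_cnt * 100 >
	       (unsigned long long)max_cnt * DIRTY_NAT_RATIO_PCT;
}
```

A background balancing routine in this model would call the check and start a checkpoint when it returns true.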

f2fs: relocate is_merged_page · 0fd785eb

Committed by Chao Yu on Jan 18, 2016

Operations in is_merged_page are related to the inner bio cache, so move it
to data.c.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

0fd785eb

12 Jan, 2016 3 commits

f2fs: monitor the number of background checkpoint · 42190d2a

Committed by Jaegeuk Kim on Jan 09, 2016

This patch shows the number of background checkpoints.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

42190d2a

f2fs: detect idle time depending on user behavior · d0239e1b

Committed by Jaegeuk Kim on Jan 08, 2016

This patch records the last time the user requested filesystem operations.
This information is used later to detect whether the system is idle.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d0239e1b

f2fs: introduce time and interval facility · 6beceb54

Committed by Jaegeuk Kim on Jan 08, 2016

This patch adds time and interval arrays to store some timing variables.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

6beceb54
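
One way to picture the time and interval arrays of 6beceb54, together with the idle detection of d0239e1b, is the userspace model below. It is a sketch only: the slot names, the helper names and the use of plain seconds instead of jiffies are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { CP_TIME, REQ_TIME, MAX_TIME };   /* slot names are assumptions */

struct timing_model {
	long last_time[MAX_TIME];   /* seconds: when each event last happened */
	long interval[MAX_TIME];    /* seconds allowed between events */
};

/* Record that an event of the given type happened at `now`. */
static void update_time(struct timing_model *t, int type, long now)
{
	assert(type >= 0 && type < MAX_TIME);
	t->last_time[type] = now;
}

/* True once more than interval[type] seconds passed since the last event. */
static bool time_over(const struct timing_model *t, int type, long now)
{
	assert(type >= 0 && type < MAX_TIME);
	return now > t->last_time[type] + t->interval[type];
}

/* The system counts as idle when no user request arrived recently. */
static bool is_idle(const struct timing_model *t, long now)
{
	return time_over(t, REQ_TIME, now);
}
```

Indexing both arrays by the same enum keeps each timer's timestamp and its interval together, so adding a new timer only needs a new enum slot.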

09 Jan, 2016 1 commit

f2fs: clean up f2fs_balance_fs · 2c4db1a6

Committed by Jaegeuk Kim on Jan 07, 2016

This patch adds one parameter to clean up all the callers of f2fs_balance_fs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

2c4db1a6

31 Dec, 2015 1 commit

f2fs: report error of do_checkpoint · c34f42e2

Committed by Chao Yu on Dec 23, 2015

do_checkpoint and write_checkpoint can fail for reasons such as being
triggered on a read-only fs or encountering an IO error on the storage device.

So it's better to report such errors to the user, so that the user is aware
of the checkpoint failure.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

c34f42e2

18 Dec, 2015 1 commit

f2fs: support data flush in background · 36b35a0d

Committed by Chao Yu on Dec 17, 2015

Previously, when finishing a checkpoint, we persisted all fs meta
info including meta inode, node inode, and dentry pages of directory inodes,
so after a sudden power cut, f2fs can recover from the last checkpoint with
the full directory structure.

But during checkpoint, we didn't flush dirty pages of regular and symlink
inodes, so such dirty data still in memory will be lost at the moment of
power off.

In order to reduce the chance of losing data, this patch gives
f2fs_balance_fs_bg the ability to flush data. It will try to flush
user data before starting a checkpoint, so user data written after the last
checkpoint, which may not have been fsynced, could be saved.

When we mount with the data_flush option, after every period of cp_interval
seconds (configurable in sysfs: /sys/fs/f2fs/device/cp_interval),
user data could be flushed to the device once f2fs_balance_fs_bg is called
in a kworker thread or the gc thread.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

36b35a0d

10 Dec, 2015 1 commit

f2fs: enhance the bit operation for SSR · 80609448

Committed by Jaegeuk Kim on Dec 04, 2015

This patch enhances the existing bit operation when f2fs allocates SSR
blocks.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

80609448

05 Dec, 2015 3 commits

f2fs: clean up code with __has_cursum_space · 855639de

Committed by Chao Yu on Dec 01, 2015

Clean up code in lookup_journal_in_cursum() with __has_cursum_space().
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

855639de

f2fs: clear page uptodate when dropping cache for atomic write · f478f43f

Committed by Chao Yu on Nov 13, 2015

We should clear the uptodate flag for all atomically written pages when we
drop them; otherwise, until these cached pages are eventually reclaimed or
invalidated, we will see invalid data when hitting them again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f478f43f

f2fs: optimize __find_rev_next_bit · 692223d1

Committed by Fan Li on Nov 12, 2015

1. Skip __reverse_ulong if the bitmap is empty.
2. Reduce branches and code.
According to my test, the performance of this new version is 5% higher on
an empty bitmap of 64 bytes, and remains about the same in the worst scenario.
Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

692223d1
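
For readers unfamiliar with the reversed bit order behind __find_rev_next_bit, here is a byte-wise userspace sketch. It is not the kernel routine (which works on unsigned long words and uses __reverse_ulong); it assumes that bit `nr` lives in byte `nr / 8` under mask `0x80 >> (nr % 8)`, and it mirrors only the idea of skipping empty chunks early.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first set bit at or after `offset` in an
 * MSB-first bitmap of `size` bits, or `size` if there is none.
 */
static size_t find_rev_next_bit(const unsigned char *map, size_t size,
				size_t offset)
{
	size_t nr = offset;

	assert(map != NULL);
	while (nr < size) {
		unsigned char byte = map[nr / 8];

		/* Ignore the bits that come before `nr` inside this byte. */
		byte &= (unsigned char)(0xff >> (nr % 8));
		if (byte == 0) {
			/* Nothing left in this byte: skip it in one step. */
			nr = (nr / 8 + 1) * 8;
			continue;
		}
		/* A set bit exists at or after nr in this byte; walk to it. */
		while (!(byte & (0x80 >> (nr % 8))))
			nr++;
		return nr < size ? nr : size;
	}
	return size;
}
```

The early `byte == 0` test is the part that corresponds to item 1 of the commit message: an empty chunk is skipped without any per-bit work.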

23 Oct, 2015 1 commit

f2fs: fix to clear GCed flag for atomic written page · 7fee7406

Committed by Chao Yu on Oct 22, 2015

An atomically written page can be GCed; after committing this kind of page,
we should clear its GCed flag.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

7fee7406

22 Oct, 2015 2 commits

f2fs: don't need to submit bio on error case · 2b246fb0

Committed by Jaegeuk Kim on Oct 21, 2015

If commit_atomic_write fails, we don't need to submit any bio.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

2b246fb0

f2fs: refactor __find_rev_next_{zero}_bit · f96999c3

Committed by Jaegeuk Kim on Oct 20, 2015

This patch refactors __find_rev_next_{zero}_bit, which was disabled previously
due to bugs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f96999c3

14 Oct, 2015 1 commit

f2fs crypto: fix racing of accessing encrypted page among different competitors · 08b39fbd

Committed by Chao Yu on Oct 08, 2015

We use different page caches (normally the inode's page cache for R/W
and the meta inode's page cache for GC) to cache the same physical block
which belongs to an encrypted inode. Writeback of these two page
caches should be exclusive, but so far we didn't handle the writeback state
well, so there may be potential races:

a)
kworker:				f2fs_gc:
 - f2fs_write_data_pages
  - f2fs_write_data_page
   - do_write_data_page
    - write_data_page
     - f2fs_submit_page_mbio
(page#1 in inode's page cache was queued
in f2fs bio cache, and be ready to write
to new blkaddr)
					 - gc_data_segment
					  - move_encrypted_block
					   - pagecache_get_page
					(page#2 in meta inode's page cache
					was cached with the invalid data
					of physical block located in new
					blkaddr)
					   - f2fs_submit_page_mbio
					(page#1 was submitted, later, page#2
					with invalid data will be submitted)

b)
f2fs_gc:
 - gc_data_segment
  - move_encrypted_block
   - f2fs_submit_page_mbio
(page#1 in meta inode's page cache was
queued in f2fs bio cache, and be ready
to write to new blkaddr)
					user thread:
					 - f2fs_write_begin
					  - f2fs_submit_page_bio
					(we submit the request to block layer
					to update page#2 in inode's page cache
					with physical block located in new
					blkaddr, so here we may read garbage
					data from new blkaddr since GC hasn't
					written back the page#1 yet)

This patch fixes the above potential races for encrypted inodes.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08b39fbd

13 Oct, 2015 3 commits

f2fs: support lower priority asynchronous readahead in ra_meta_pages · 26879fb1

Committed by Chao Yu on Oct 12, 2015

Currently, we use ra_meta_pages to read as many continuous physical blocks as
possible to improve the performance of following reads. However, ra_meta_pages
uses a synchronous readahead approach by submitting bios with READ. As READ
has high priority, it is not suitable for preloading blocks,
since it's not certain when these readahead pages will be used.

This patch supports asynchronous readahead in ra_meta_pages by tagging bios
with the READA flag in order to allow preloading.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

26879fb1

f2fs: don't tag REQ_META for temporary non-meta pages · 2b947003

Committed by Chao Yu on Oct 12, 2015

In the recovery or checkpoint flow, we temporarily grab pages in the meta
inode's mapping for caching temporary data. The data in these pages is
not f2fs meta data, but we still tag them with the REQ_META flag. However,
lower devices like eMMC may do some optimization for data of this type.
So in order to avoid wrong optimization, we'd better remove this flag
for temporary non-meta pages.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

2b947003

f2fs: fix SSA updates resulting in corruption · 6e2c64ad

Committed by Jaegeuk Kim on Oct 07, 2015

f2fs_collapse_range and f2fs_insert_range change the block addresses
directly. But that can cause uncovered SSA updates.
In that case, we need to give up changing the block addresses and do buffered
writes to keep the filesystem consistent.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

6e2c64ad

10 Oct, 2015 3 commits

f2fs: introduce a periodic checkpoint flow · 60b99b48

Committed by Jaegeuk Kim on Oct 05, 2015

This patch introduces a periodic checkpoint feature.
Note that this does not strictly enforce the trigger timing of checkpoints;
it is only intended to help the user experience.
The default value is 60 seconds.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

60b99b48

f2fs: support synchronous gc in ioctl · d530d4d8

Committed by Chao Yu on Oct 05, 2015

This patch drops batched gc triggered through ioctl, since the user
can easily control gc by designing a loop around the ->ioctl.

We support synchronous gc by forcing FG_GC in f2fs_gc, so that
the user can make sure that all blocks gced in this round are
persistent on the device by the time the ioctl returns.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d530d4d8

f2fs: use vmalloc to handle -ENOMEM error · 39307a8e

Committed by Jaegeuk Kim on Sep 22, 2015

This patch introduces f2fs_kvmalloc to avoid -ENOMEM during mount.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

39307a8e

25 Aug, 2015 1 commit

f2fs: use __GFP_NOFAIL to avoid infinite loop · 80c54505

Committed by Jaegeuk Kim on Aug 20, 2015

__GFP_NOFAIL can avoid retrying the whole path of kmem_cache_alloc and
bio_alloc.
It also fixes the use cases of GFP_ATOMIC.
Suggested-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

80c54505

21 Aug, 2015 2 commits

f2fs: handle failed bio allocation · 740432f8

Committed by Jaegeuk Kim on Aug 14, 2015

f2fs can allocate multiple bios at the same time, so, as the comment of
bio_alloc_bioset below explains, we can't guarantee that a bio is always
allocated.

"
 *   When @bs is not NULL, if %__GFP_WAIT is set then bio_alloc will always be
 *   able to allocate a bio. This is due to the mempool guarantees. To make this
 *   work, callers must never allocate more than 1 bio at a time from this pool.
 *   Callers that need to allocate more than 1 bio must always submit the
 *   previously allocated bio for IO before attempting to allocate a new one.
 *   Failure to do so can cause deadlocks under memory pressure.
"
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

740432f8

f2fs: shrink free_nids entries · 31696580

Committed by Chao Yu on Jul 28, 2015

This patch introduces __count_free_nids/try_to_free_nids and registers
them in the slab shrinker for shrinking under memory pressure.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

31696580

15 Aug, 2015 1 commit

f2fs: do not assign a new segment for dio under space shortage · 47e70ca4

Committed by Jaegeuk Kim on Aug 11, 2015

If there are not enough free segments, we should not assign a new segment
explicitly. Otherwise, we can run out of free segments.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

47e70ca4

12 Aug, 2015 1 commit

f2fs: remove inmem radix tree · decd36b6

Committed by Chao Yu on Aug 07, 2015

Previously, we used a radix tree to index all registered page entries for
an atomic file, but now we only use the radix tree to see whether the current
page is indexed or not, since the other user of the radix tree is gone in
commit 042b7816 ("f2fs: remove unnecessary call to invalidate inmemory pages").

So in this patch, we use a more efficient way:
introduce a macro ATOMIC_WRITTEN_PAGE and set it as the page private
value to indicate the page indexing status. This way, we save
memory and lookup time.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

decd36b6
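
The idea in decd36b6 of marking a page through its private value, instead of indexing it in a radix tree, can be sketched as below. This is a userspace model: `page_model`, the helper names and the sentinel value are all hypothetical, chosen only to show a constant-time membership test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel; the value used by the kernel macro may differ. */
#define ATOMIC_WRITTEN_PAGE_MARK 0x0000ffffUL

struct page_model {
	unsigned long private_data;   /* stands in for the page private value */
};

/* Mark the page as registered for an atomic write. */
static void register_inmem_page(struct page_model *p)
{
	assert(p != NULL);
	p->private_data = ATOMIC_WRITTEN_PAGE_MARK;
}

/* Membership test: one comparison, no tree lookup or extra node memory. */
static bool is_atomic_written_page(const struct page_model *p)
{
	assert(p != NULL);
	return p->private_data == ATOMIC_WRITTEN_PAGE_MARK;
}

/* Clear the mark when the page is committed or dropped. */
static void drop_inmem_page(struct page_model *p)
{
	assert(p != NULL);
	p->private_data = 0;
}
```

The trade-off in such a scheme is that the private field cannot carry other data while the mark is set.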

05 Aug, 2015 4 commits

f2fs: invalidate temporary meta page · e90c2d28

Committed by Chao Yu on Jul 28, 2015

To avoid meeting garbage data in the next free node block at the end of the
warm node chain when doing recovery, we try to zero out that invalid block.

If the device does not support discard, our way of zeroing out the block is:
grab a temporary zeroed page in the meta inode, then issue a write request
with this page.

But we forgot to release that temporary page, so our memory usage
increases without gaining any hit ratio benefit; it's better to free it
to save memory.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

e90c2d28

f2fs: handle error cases in commit_inmem_pages · edb27dee

Committed by Jaegeuk Kim on Jul 25, 2015

This patch handles error cases in commit_inmem_pages.
If an error occurs, it stops writing the pages and returns the error right
away.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

edb27dee

f2fs: shrink extent_cache entries · 554df79e

Committed by Jaegeuk Kim on Jun 19, 2015

This patch registers shrinking extent_caches.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

554df79e

f2fs: shrink nat_cache entries · 1b38dc8e

Committed by Jaegeuk Kim on Jun 19, 2015

This patch registers shrinking nat_cache entries.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

1b38dc8e