- 06 9月, 2017 2 次提交
-
-
由 Jaegeuk Kim 提交于
If another thread already made the page dirtied or writebacked, we must avoid to verify checksum. If we got an error, we need to remove its uptodate as well. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch fixes "f2fs: support inode checksum". The recovered inode page will be rewritten with valid checksum. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 30 8月, 2017 1 次提交
-
-
由 Jaegeuk Kim 提交于
If file offset is insane, we have to return error instead of kernel panic. Reported-by: NEric Zhang <followme999@163.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 10 8月, 2017 1 次提交
-
-
由 Chao Yu 提交于
This patch enables inner app/fs io stats and introduces below virtual fs nodes for exposing stats info: /sys/fs/f2fs/<dev>/iostat_enable /proc/fs/f2fs/<dev>/iostat_info Signed-off-by: NChao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix wrong stat assignment] Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 04 8月, 2017 2 次提交
-
-
由 Chao Yu 提交于
This patch adds to support inode checksum in f2fs. Signed-off-by: NChao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix verification flow] Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Yunlong Song 提交于
Let node writeback also do f2fs_balance_fs to ensure there are always enough free segments. Signed-off-by: NYunlong Song <yunlong.song@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 01 8月, 2017 2 次提交
-
-
由 Chao Yu 提交于
This patch adds to support plain project quota. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
This patch add new flag F2FS_EXTRA_ATTR storing in inode.i_inline to indicate that on-disk structure of current inode is extended. In order to extend, we changed the inode structure a bit: Original one: struct f2fs_inode { ... struct f2fs_extent i_ext; __le32 i_addr[DEF_ADDRS_PER_INODE]; __le32 i_nid[DEF_NIDS_PER_INODE]; } Extended one: struct f2fs_inode { ... struct f2fs_extent i_ext; union { struct { __le16 i_extra_isize; __le16 i_padding; __le32 i_extra_end[0]; }; __le32 i_addr[DEF_ADDRS_PER_INODE]; }; __le32 i_nid[DEF_NIDS_PER_INODE]; } Once F2FS_EXTRA_ATTR is set, we will steal four bytes in the head of i_addr field for storing i_extra_isize and i_padding. with i_extra_isize, we can calculate actual size of reserved space in i_addr, available attribute fields included in total extra attribute fields for current inode can be described as below: +--------------------+ | .i_mode | | ... | | .i_ext | +--------------------+ | .i_extra_isize |-----+ | .i_padding | | | .i_prjid | | | .i_atime_extra | | | .i_ctime_extra | | | .i_mtime_extra |<----+ | .i_inode_cs |<----- store blkaddr/inline from here | .i_xattr_cs | | ... | +--------------------+ | | | block address | | | +--------------------+ | .i_nid | +--------------------+ | node_footer | | (nid, ino, offset) | +--------------------+ Hence, with this patch, we would enhance scalability of f2fs inode for storing more newly added attribute. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 27 7月, 2017 2 次提交
-
-
由 Yunlei He 提交于
recovery file A: recovery file B: -get_dnode_of_data -alloc_nid -recover_xattr_data -set_node_addr(sbi, &ni, NEW_ADDR, false); --->bug_on for nid has been used by file A In recovery process, new allocated node blocks may "reuse" xattr block nids, this patch alloc new nids for xattr blocks in recovery process to avoid this problem. Signed-off-by: NYunlei He <heyunlei@huawei.com> Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Yunlei He 提交于
This patch remove unused input parameter in function new_node_page. Signed-off-by: NYunlei He <heyunlei@huawei.com> Signed-off-by: NYong Sheng <shengyong1@huawei.com> Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 09 7月, 2017 1 次提交
-
-
由 Chao Yu 提交于
This patch adds to support plain user/group quota. Change Note by Jaegeuk Kim. - Use f2fs page cache for quota files in order to consider garbage collection. so, quota files are not tolerable for sudden power-cuts, so user needs to do quotacheck. - setattr() calls dquot_transfer which will transfer inode->i_blocks. We can't reclaim that during f2fs_evict_inode(). So, we need to count node blocks as well in order to match i_blocks with dquot's space. Note that, Chao wrote a patch to count inode->i_blocks without inode block. (f2fs: don't count inode block in in-memory inode.i_blocks) - in f2fs_remount, we need to make RW in prior to dquot_resume. - handle fault_injection case during f2fs_quota_off_umount - TODO: Project quota Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 08 7月, 2017 2 次提交
-
-
由 Chao Yu 提交于
Previously, we count all inode consumed blocks including inode block, xattr block, index block, data block into i_blocks, for other generic filesystems, they won't count inode block into i_blocks, so for userspace applications or quota system, they may detect incorrect block count according to i_blocks value in inode. This patch changes to count all blocks into inode.i_blocks excluding inode block, for on-disk i_blocks, we keep counting inode block for backward compatibility. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
Skip ->writepages in prior to ->writepage for {meta,node}_inode during recovery, hence unneeded loop in ->writepages can be avoided. Moreover, check SBI_POR_DOING earlier while writebacking pages. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 04 7月, 2017 3 次提交
-
-
由 Chao Yu 提交于
Both in memory or on disk, generic filesystems record i_blocks with 512bytes sized sector count, also VFS sub module such as disk quota follows this rule, but f2fs records it with 4096bytes sized block count, this difference leads to that once we use dquota's function which inc/dec iblocks, it will make i_blocks of f2fs being inconsistent between in memory and on disk. In order to resolve this issue, this patch changes to make in-memory i_blocks of f2fs recording sector count instead of block count, meanwhile leaving on-disk i_blocks recording block count. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
With fault_injection option, generic/361 of fstests will complain us with below message: Call Trace: get_node_page+0x12/0x20 [f2fs] f2fs_iget+0x92/0x7d0 [f2fs] f2fs_fill_super+0x10fb/0x15e0 [f2fs] mount_bdev+0x184/0x1c0 f2fs_mount+0x15/0x20 [f2fs] mount_fs+0x39/0x150 vfs_kern_mount+0x67/0x110 do_mount+0x1bb/0xc70 SyS_mount+0x83/0xd0 do_syscall_64+0x6e/0x160 entry_SYSCALL64_slow_path+0x25/0x25 Since mkfs loop device in f2fs partition can be failed silently due to checkpoint error injection, so root inode page can be corrupted, in order to avoid needless panic, in get_node_page, it's better to leave message and return error to caller, and let fsck repaire it later. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
We will never persist newly allocated nat entries during checkpoint(), so we don't need to track such nat entries in nat dirty list in order to avoid: - more latency during traversing dirty list; - sorting nat sets incorrectly due to recording wrong entry_cnt in nat entry set. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 24 5月, 2017 2 次提交
-
-
由 Hou Pengyang 提交于
Signed-off-by: NHou Pengyang <houpengyang@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
Merged IO flow doesn't need to care about read IOs. f2fs_submit_merged_bio -> f2fs_submit_merged_write f2fs_submit_merged_bios -> f2fs_submit_merged_writes f2fs_submit_merged_bio_cond -> f2fs_submit_merged_write_cond Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 09 5月, 2017 1 次提交
-
-
由 Michal Hocko 提交于
Patch series "kvmalloc", v5. There are many open coded kmalloc with vmalloc fallback instances in the tree. Most of them are not careful enough or simply do not care about the underlying semantic of the kmalloc/page allocator which means that a) some vmalloc fallbacks are basically unreachable because the kmalloc part will keep retrying until it succeeds b) the page allocator can invoke a really disruptive steps like the OOM killer to move forward which doesn't sound appropriate when we consider that the vmalloc fallback is available. As it can be seen implementing kvmalloc requires quite an intimate knowledge if the page allocator and the memory reclaim internals which strongly suggests that a helper should be implemented in the memory subsystem proper. Most callers, I could find, have been converted to use the helper instead. This is patch 6. There are some more relying on __GFP_REPEAT in the networking stack which I have converted as well and Eric Dumazet was not opposed [2] to convert them as well. [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com This patch (of 9): Using kmalloc with the vmalloc fallback for larger allocations is a common pattern in the kernel code. Yet we do not have any common helper for that and so users have invented their own helpers. Some of them are really creative when doing so. Let's just add kv[mz]alloc and make sure it is implemented properly. This implementation makes sure to not make a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also to not warn about allocation failures. This also rules out the OOM killer as the vmalloc is a more approapriate fallback than a disruptive user visible action. This patch also changes some existing users and removes helpers which are specific for them. In some cases this is not possible (e.g. ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and require GFP_NO{FS,IO} context which is not vmalloc compatible in general (note that the page table allocation is GFP_KERNEL). Those need to be fixed separately. While we are at it, document that __vmalloc{_node} about unsupported gfp mask because there seems to be a lot of confusion out there. kvmalloc_node will warn about GFP_KERNEL incompatible (which are not superset) flags to catch new abusers. Existing ones would have to die slowly. [sfr@canb.auug.org.au: f2fs fixup] Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Andreas Dilger <adilger@dilger.ca> [ext4 part] Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: John Hubbard <jhubbard@nvidia.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 5月, 2017 1 次提交
-
-
由 Yunlei He 提交于
-write_checkpoint -do_checkpoint -next_free_nid <--- something wrong with next free nid -f2fs_fill_super -build_node_manager -build_free_nids -get_current_nat_page -__get_meta_page <--- attempt to access beyond end of device Signed-off-by: NYunlei He <heyunlei@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 26 4月, 2017 1 次提交
-
-
由 Yunlei He 提交于
This patch seperate nat page read io from nat_tree_lock. -lock_page -get_node_info() -current_nat_addr ...... -> write_checkpoint -get_meta_page Because we lock node page, we can make sure no other threads modify this nid concurrently. So we just obtain current_nat_addr under nat_tree_lock, node info is always same in both nat pack. Signed-off-by: NYunlei He <heyunlei@huawei.com> Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 13 4月, 2017 1 次提交
-
-
由 Jaegeuk Kim 提交于
Otherwise, we can see stale fsync/dentry mark given by previous calls, resulting in giving up roll-forward recovery due to wrong dentry mark. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 11 4月, 2017 1 次提交
-
-
由 Tomohiro Kusumi 提交于
Add braces around variables used within macros for those make sense to do it. Many of the macros in f2fs already do this. What this commit doesn't do is anything that changes line# as a result of adding braces, which usually affects the binary via __LINE__. Confirmed no diff in fs/f2fs/f2fs.ko before/after this commit on x86_64, to make sure this has no functional change as well as there's been no unexpected side effect due to callers' arithmetics within the existing code. Signed-off-by: NTomohiro Kusumi <tkusumi@tuxera.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 25 3月, 2017 2 次提交
-
-
由 Yunlei He 提交于
This patch allow write data to normal file when writting new checkpoint. We relax three limitations for write_begin path: 1. data allocation 2. node allocation 3. variables in checkpoint Signed-off-by: NYunlei He <heyunlei@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In below concurrent case, allocated nid can be loaded into free nid cache and be allocated again. Thread A Thread B - f2fs_create - f2fs_new_inode - alloc_nid - __insert_nid_to_list(ALLOC_NID_LIST) - f2fs_balance_fs_bg - build_free_nids - __build_free_nids - scan_nat_page - add_free_nid - __lookup_nat_cache - f2fs_add_link - init_inode_metadata - new_inode_page - new_node_page - set_node_addr - alloc_nid_done - __remove_nid_from_list(ALLOC_NID_LIST) - __insert_nid_to_list(FREE_NID_LIST) This patch makes nat cache lookup and free nid list operation being atomical to avoid this race condition. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 22 3月, 2017 3 次提交
-
-
由 Kinglong Mee 提交于
Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
free_nid_bitmap and free_nid_count in update_free_nid_bitmap should be updated atomically, use nid_list_lock cover them to avoid race in concurrent scenario. Signed-off-by: NChao Yu <yuchao0@huawei.com> Reviewed-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Kinglong Mee 提交于
The nat entry is listed from the set list for freeing, it's duplicate to do radix tree lookup again. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> [Jaegeuk Kim: remove unnecessary f2fs_bug_on] Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 20 3月, 2017 4 次提交
-
-
由 Chao Yu 提交于
Both nat_bits cache and free_nid_bitmap cache provide same functionality as a intermediate cache between free nid cache and disk, but with different granularity of indicating free nid range, and different persistence policy. nat_bits cache provides better persistence ability, and free_nid_bitmap provides better granularity. In this patch we combine advantage of both caches, so finally policy of the intermediate cache would be: - init: load free nid status from nat_bits into free_nid_bitmap - lookup: scan free_nid_bitmap before load NAT blocks - update: update free_nid_bitmap in real-time - persistence: udpate and persist nat_bits in checkpoint This patch also resolves performance regression reported by lkp-robot. commit: 4ac91242 ("f2fs: introduce free nid bitmap") d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le") 1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache") 4ac91242 d00030cf9cd0bb96fdccc41e33 1382c0f3f9d3f936c8bc42ed15 ---------------- -------------------------- -------------------------- %stddev %change %stddev %change %stddev \ | \ | \ 77863 ± 0% +2.1% 79485 ± 1% +50.8% 117404 ± 0% aim7.jobs-per-min 231.63 ± 0% -2.0% 227.01 ± 1% -33.6% 153.80 ± 0% aim7.time.elapsed_time 231.63 ± 0% -2.0% 227.01 ± 1% -33.6% 153.80 ± 0% aim7.time.elapsed_time.max 896604 ± 0% -0.8% 889221 ± 3% -20.2% 715260 ± 1% aim7.time.involuntary_context_switches 2394 ± 1% +4.6% 2503 ± 1% +3.7% 2481 ± 2% aim7.time.maximum_resident_set_size 6240 ± 0% -1.5% 6145 ± 1% -14.1% 5360 ± 1% aim7.time.system_time 1111357 ± 3% +1.9% 1132509 ± 2% -6.2% 1041932 ± 2% aim7.time.voluntary_context_switches ... Signed-off-by: NChao Yu <yuchao0@huawei.com> Tested-by: NXiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
This patch adds to account free nids for each NAT blocks, and while scanning all free nid bitmap, do check count and skip lookuping in full NAT block. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch uses __set{__clear}_bit_le for highter speed. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This is to avoid build warning reported by kbuild test robot. Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 01 3月, 2017 1 次提交
-
-
由 Jaegeuk Kim 提交于
This patch adds a missing condition which flushes nat journal entries unnecessarily introduced by: f2fs: add bitmaps for empty or full NAT blocks Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 28 2月, 2017 5 次提交
-
-
由 Kinglong Mee 提交于
F2FS has define MAX_FREE_NIDS for maximum of cached free nids target. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In scenario of intensively node allocation, free nids will be ran out soon, then it needs to stop to load free nids by traversing NAT blocks, in worse case, if NAT blocks does not be cached in memory, it generates IOs which slows down our foreground operations. In order to speed up node allocation, in this patch we introduce a new free_nid_bitmap array, so there is an bitmap table for each NAT block, Once the NAT block is loaded, related bitmap cache will be switched on, and bitmap will be set during traversing nat entries in NAT block, later we can query and update nid usage status in memory completely. With such implementation, I expect performance of node allocation can be improved in the long-term after filesystem image is mounted. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Kinglong Mee 提交于
There are four places that getting the crc value in f2fs_checkpoint, just add a new helper cur_cp_crc for them. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
Previously kernel message can show that in which function we do the injection, but unfortunately, most of the caller are the same, for tracking more information of injection path, it needs to show upper caller's name. This patch supports that ability. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patches adds bitmaps to represent empty or full NAT blocks containing free nid entries. If we can find valid crc|cp_ver in the last block of checkpoint pack, we'll use these bitmaps when building free nids. In order to avoid checkpointing burden, up-to-date bitmaps will be flushed only during umount time. So, normally we can get this gain, but when power-cut happens, we rely on fsck.f2fs which recovers this bitmap again. After this patch, we build free nids from nid #0 at mount time to make more full NAT blocks, but in runtime, we check empty NAT blocks to load free nids without loading any NAT pages from disk. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 24 2月, 2017 2 次提交
-
-
由 Jaegeuk Kim 提交于
We've not seen this buggy case for a long time, so it's time to avoid this unnecessary get_node_info() call which reading NAT page to cache nat entry. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
Currently, if we call fsync after updating the xattr date belongs to the file, f2fs needs to trigger checkpoint to keep xattr data consistent. But, this policy cause low performance as checkpoint will block most foreground operations and cause unneeded and unrelated IOs around checkpoint. This patch will reuse regular file recovery policy for xattr node block, so, we change to write xattr node block tagged with fsync flag to warm area instead of cold area, and during recovery, we search warm node chain for fsynced xattr block, and do the recovery. So, for below application IO pattern, performance can be improved obviously: - touch file - create/update/delete xattr entry in file - fsync file Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-