- 24 11月, 2016 31 次提交
-
-
由 Damien Le Moal 提交于
The LFS mode is mandatory for host-managed zoned block devices as update in place optimizations are not possible for segments in sequential zones. Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Damien Le Moal 提交于
Zone write pointer reset acts as discard for zoned block devices. So if the zoned block device feature is enabled, always declare that discard is enabled, even if the device does not actually support the command. For the same reason, prevent the use the "nodicard" mount option. Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Damien Le Moal 提交于
For zoned block devices, discard is replaced by zone reset. So do not warn if the device does not supports discard. Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Damien Le Moal 提交于
The F2FS_FEATURE_BLKZONED feature indicates that the drive was formatted with zone alignment optimization. This is optional for host-aware devices, but mandatory for host-managed zoned block devices. So check that the feature is set in this latter case. Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Damien Le Moal 提交于
SMR stands for "Shingled Magnetic Recording" which makes sense only for hard disk drives (spinning rust). The ZBC/ZAC standards enable management of SMR disks, but solid state drives may also support those standards. So rename the HMSMR feature to BLKZONED to avoid a HDD centric terminology. For the same reason, rename f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned. Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Damien Le Moal 提交于
Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch should fix an infinite loop case below. F2FS-fs : inject IO error in f2fs_read_end_io+0xf3/0x120 [f2fs] F2FS-fs (nvme0n1p1): recover_orphan_inode: orphan failed (ino=39ac1a), run fsck to fix. ... [<ffffffffc0b11ede>] sync_meta_pages+0xae/0x270 [f2fs] [<ffffffffc0b288dd>] ? flush_sit_entries+0x8d/0x960 [f2fs] [<ffffffffc0b13801>] write_checkpoint+0x361/0xf20 [f2fs] [<ffffffffb40e979d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0b0a199>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0b0a1a5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc0b2560e>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0b216c4>] f2fs_write_node_pages+0x34/0x320 [f2fs] [<ffffffffb41dff21>] do_writepages+0x21/0x30 [<ffffffffb429edb1>] __writeback_single_inode+0x61/0x760 [<ffffffffb490a937>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffb42a0805>] writeback_single_inode+0xd5/0x190 [<ffffffffb42a0959>] write_inode_now+0x99/0xc0 [<ffffffffb4289a16>] iput+0x1f6/0x2c0 [<ffffffffc0b0e3be>] f2fs_fill_super+0xe0e/0x1300 [f2fs] [<ffffffffb426c394>] ? sget_userns+0x4f4/0x530 [<ffffffffb426c692>] mount_bdev+0x182/0x1b0 [<ffffffffc0b0d5b0>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0b0a375>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffffb426d038>] mount_fs+0x38/0x170 [<ffffffffb428ec9b>] vfs_kern_mount+0x6b/0x160 [<ffffffffb4291d9e>] do_mount+0x1be/0xd60 [<ffffffffb4291a57>] ? copy_mount_options+0xb7/0x220 [<ffffffffb4292c54>] SyS_mount+0x94/0xd0 [<ffffffffb490b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
Report error of f2fs_fill_dentries to ->iterate_shared, otherwise when error ocurrs, user may just list part of dirents in target directory without any hints. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Arnd Bergmann 提交于
gcc is unsure about the use of last_ofs_in_node, which might happen without a prior initialization: fs/f2fs//git/arm-soc/fs/f2fs/data.c: In function ‘f2fs_map_blocks’: fs/f2fs/data.c:799:54: warning: ‘last_ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (prealloc && dn.ofs_in_node != last_ofs_in_node + 1) { As pointed out by Chao Yu, the code is actually correct as 'prealloc' is only set if the last_ofs_in_node has been set, the two always get updated together. This initializes last_ofs_in_node to dn.ofs_in_node for each new dnode at the start of the 'next_block' loop, which at that point is a correct initialization as well. I assume that compilers that correctly track the contents of the variables and do not warn about the condition also figure out that they can eliminate the extra assignment here. Fixes: 46008c6d ("f2fs: support in batch multi blocks preallocation") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch removes percpu_count usage due to performance regression in iozone. Fixes: 523be8a6 ("f2fs: use percpu_counter for page counters") Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch tries to make more clean inodes when flushing dirty inodes in checkpoint. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This is to avoid no free segment bug during checkpoint caused by a number of dirty inodes. The case was reported by Chao like this. 1. mount with lazytime option 2. fill 4k file until disk is full 3. sync filesystem 4. read all files in the image 5. umount In this case, we actually don't need to flush dirty inode to inode page during checkpoint. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
We don't need to allocate bio partially in order to maximize sequential writes. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch avoids build warning. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
If inode becomes dirty, we need to check the # of dirty inodes whether or not further checkpoint would be required. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
If there are a lot of dirty inodes, we need to flush all of them when doing checkpoint. So, we need to count this for enough free space. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
This patch makes sure it returns a positive value instead of a probable casted negative value as shrink count. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
Let build_free_nids support sync/async methods, in allocation flow of nids, we use synchronuous method, so that we can avoid looping in alloc_nid when free memory is low; in unblock_operations and f2fs_balance_fs_bg we use asynchronuous method in where low memory condition can interrupt us. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch cleans up to use consistent free nid list ops. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
During free nid allocation, in order to do preallocation, we will tag free nid entry as allocated one and still leave it in free nid list, for other allocators who want to grab free nids, it needs to traverse the free nid list for lookup. It becomes overhead in scenario of allocating free nid intensively by multithreads. This patch splits free nid list to two list: {free,alloc}_nid_list, to keep free nids and preallocated free nids separately, after that, traverse latency will be gone, besides split nid_cnt for separate statistic. Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for cleanup. Signed-off-by: NChao Yu <yuchao0@huawei.com> [Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches] Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
We don't need to keep incomplete created inode in cache, so if we fail to add link into directory during new inode creation, it's better to set nlink of inode to zero, then we can evict inode immediately. Otherwise release of nid belong to inode will be delayed until inode cache is being shrunk, it may cause a seemingly endless loop while allocating free nids in time of testing generic/269 case of fstest suit. Signed-off-by: NChao Yu <yuchao0@huawei.com> [Jaegeuk Kim: add update_inode_page to fix kernel panic] Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Eric Biggers 提交于
f2fs contained a number of endianness conversion bugs. Also, one function should have been 'static'. Found with sparse by running 'make C=2 CF=-D__CHECK_ENDIAN__ fs/f2fs/' Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In fsync_node_pages, if f2fs was taged with CP_ERROR_FLAG, make sure bio cache was flushed before return. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In order to avoid racing problem, make largest extent cache being updated under lock. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
f2fs can support fallocating blocks beyond file size without changing the size, but ->fiemap of f2fs was restricted and can't detect these extents fallocated past EOF, now relieve the restriction. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In f2fs_map_blocks, let f2fs_balance_fs detects node page modification with dn.node_changed to avoid miss some corner cases. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
f2fs_balance_fs should be called in between node page updating, otherwise node page count will exceeded far beyond watermark of triggering foreground garbage collection, result in facing high risk of hitting LFS allocation failure. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
If there is no dirty pages in inode, we should give a chance to detach the inode from global dirty list, otherwise it needs to call another unnecessary .writepages for detaching. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In f2fs_fill_super, if there is any IO error occurs during recovery, cached discard entries will be leaked, in order to avoid this, make write_checkpoint() handle memory release by itself, besides, move clear_prefree_segments to write_checkpoint for readability. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
During nid allocation, it needs to exclude building and allocating flow of free nids, this is because while building free nid cache, there are two steps: a) load free nids from unused nat entries in NAT pages, b) update free nid cache by checking nat journal. The two steps should be atomical, otherwise an used nid can be allocated as free one after a) and before b). This patch adds missing lock which covers build_free_nids in unlock_operation and f2fs_balance_fs_bg to avoid that. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
In the last ilen case, i was already increased, resulting in accessing out- of-boundary entry of do_replace and blkaddr. Fix to check ilen first to exit the loop. Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up") Cc: stable@vger.kernel.org # 4.8+ Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 20 11月, 2016 3 次提交
-
-
由 Theodore Ts'o 提交于
If the block size or cluster size is insane, reject the mount. This is important for security reasons (although we shouldn't be just depending on this check). Ref: http://www.securityfocus.com/archive/1/539661 Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506Reported-by: NBorislav Petkov <bp@alien8.de> Reported-by: NNikolay Borisov <kernel@kyup.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Eric Biggers 提交于
With the new (in 4.9) option to use a virtually-mapped stack (CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for the scatterlist crypto API because they may not be directly mappable to struct page. get_crypt_info() was using a stack buffer to hold the output from the encryption operation used to derive the per-file key. Fix it by using a heap buffer. This bug could most easily be observed in a CONFIG_DEBUG_SG kernel because this allowed the BUG in sg_set_buf() to be triggered. Cc: stable@vger.kernel.org Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
With the new (in 4.9) option to use a virtually-mapped stack (CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for the scatterlist crypto API because they may not be directly mappable to struct page. For short filenames, fname_encrypt() was encrypting a stack buffer holding the padded filename. Fix it by encrypting the filename in-place in the output buffer, thereby making the temporary buffer unnecessary. This bug could most easily be observed in a CONFIG_DEBUG_SG kernel because this allowed the BUG in sg_set_buf() to be triggered. Cc: stable@vger.kernel.org Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 17 11月, 2016 2 次提交
-
-
由 Andreas Gruenbacher 提交于
The IOP_XATTR flag is set on sockfs because sockfs supports getting the "system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for setxattr support as well. This is wrong on sockfs because security xattr support there is supposed to be provided by security_inode_setsecurity. The smack security module relies on socket labels (xattrs). Fix this by adding a security xattr handler on sockfs that returns -EAGAIN, and by checking for -EAGAIN in setxattr. We cannot simply check for -EOPNOTSUPP in setxattr because there are filesystems that neither have direct security xattr support nor support via security_inode_setsecurity. A more proper fix might be to move the call to security_inode_setsecurity into sockfs, but it's not clear to me if that is safe: we would end up calling security_inode_post_setxattr after that as well. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Mike Marshall 提交于
Without ".owner = THIS_MODULE" it is possible to crash the kernel by unloading the Orangefs module while someone is reading debugfs files. Signed-off-by: NMike Marshall <hubcap@omnibond.com>
-
- 15 11月, 2016 1 次提交
-
-
由 Miklos Szeredi 提交于
If pos is at the beginning of a page and copied is zero then page is not zeroed but is marked uptodate. Fix by skipping everything except unlock/put of page if zero bytes were copied. Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Fixes: 6b12c1b3 ("fuse: Implement write_begin/write_end callbacks") Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
- 12 11月, 2016 3 次提交
-
-
由 Arnd Bergmann 提交于
A bugfix introduced a harmless gcc warning in nfs4_slot_seqid_in_use if we enable -Wmaybe-uninitialized again: fs/nfs/nfs4session.c:203:54: error: 'cur_seq' may be used uninitialized in this function [-Werror=maybe-uninitialized] gcc is not smart enough to conclude that the IS_ERR/PTR_ERR pair results in a nonzero return value here. Using PTR_ERR_OR_ZERO() instead makes this clear to the compiler. Fixes: e09c978a ("NFSv4.1: Fix Oopsable condition in server callback races") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrey Ryabinin 提交于
It could be not possible to freeze coredumping task when it waits for 'core_state->startup' completion, because threads are frozen in get_signal() before they got a chance to complete 'core_state->startup'. Inability to freeze a task during suspend will cause suspend to fail. Also CRIU uses cgroup freezer during dump operation. So with an unfreezable task the CRIU dump will fail because it waits for a transition from 'FREEZING' to 'FROZEN' state which will never happen. Use freezer_do_not_count() to tell freezer to ignore coredumping task while it waits for core_state->startup completion. Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: NPavel Machek <pavel@ucw.cz> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Junxiao Bi 提交于
The following panic was caught when run ocfs2 disconfig single test (block size 512 and cluster size 8192). ocfs2_journal_dirty() return -ENOSPC, that means credits were used up. The total credit should include 3 times of "num_dx_leaves" from ocfs2_dx_dir_rebalance(), because 2 times will be consumed in ocfs2_dx_dir_transfer_leaf() and 1 time will be consumed in ocfs2_dx_dir_new_cluster() -> __ocfs2_dx_dir_new_cluster() -> ocfs2_dx_dir_format_cluster(). But only two times is included in ocfs2_dx_dir_rebalance_credits(), fix it. This can cause read-only fs(v4.1+) or panic for mainline linux depending on mount option. ------------[ cut here ]------------ kernel BUG at fs/ocfs2/journal.c:775! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016 task: ffff8800b6de6200 ti: ffff8800a7d48000 task.ti: ffff8800a7d48000 RIP: ocfs2_journal_dirty+0xa7/0xb0 [ocfs2] RSP: 0018:ffff8800a7d4b6d8 EFLAGS: 00010286 RAX: 00000000ffffffe4 RBX: 00000000814d0a9c RCX: 00000000000004f9 RDX: ffffffffa008e990 RSI: ffffffffa008f1ee RDI: ffff8800622b6460 RBP: ffff8800a7d4b6f8 R08: ffffffffa008f288 R09: ffff8800622b6460 R10: 0000000000000000 R11: 0000000000000282 R12: 0000000002c8421e R13: ffff88006d0cad00 R14: ffff880092beef60 R15: 0000000000000070 FS: 00007f9b83e92700(0000) GS:ffff8800be880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb2c0d1a000 CR3: 0000000008f80000 CR4: 00000000000406e0 Call Trace: ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2] ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2] ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2] ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2] ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2] ocfs2_mknod+0x5a2/0x1400 [ocfs2] ocfs2_create+0x73/0x180 [ocfs2] vfs_create+0xd8/0x100 lookup_open+0x185/0x1c0 do_last+0x36d/0x780 path_openat+0x92/0x470 do_filp_open+0x4a/0xa0 do_sys_open+0x11a/0x230 SyS_open+0x1e/0x20 system_call_fastpath+0x12/0x71 Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 <0f> 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 RIP ocfs2_journal_dirty+0xa7/0xb0 [ocfs2] ---[ end trace 91ac5312a6ee1288 ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled Link: http://lkml.kernel.org/r/1478248135-31963-1-git-send-email-junxiao.bi@oracle.comSigned-off-by: NJunxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-