- 14 2月, 2020 4 次提交
-
-
由 Jan Kara 提交于
DIR_INDEX has been introduced as a compat ext4 feature. That means that even kernels / tools that don't understand the feature may modify the filesystem. This works because for kernels not understanding indexed dir format, internal htree nodes appear just as empty directory entries. Index dir aware kernels then check the htree structure is still consistent before using the data. This all worked reasonably well until metadata checksums were introduced. The problem is that these effectively made DIR_INDEX only ro-compatible because internal htree nodes store checksums in a different place than normal directory blocks. Thus any modification ignorant to DIR_INDEX (or just clearing EXT4_INDEX_FL from the inode) will effectively cause checksum mismatch and trigger kernel errors. So we have to be more careful when dealing with indexed directories on filesystems with checksumming enabled. 1) We just disallow loading any directory inodes with EXT4_INDEX_FL when DIR_INDEX is not enabled. This is harsh but it should be very rare (it means someone disabled DIR_INDEX on existing filesystem and didn't run e2fsck), e2fsck can fix the problem, and we don't want to answer the difficult question: "Should we rather corrupt the directory more or should we ignore that DIR_INDEX feature is not set?" 2) When we find out htree structure is corrupted (but the filesystem and the directory should in support htrees), we continue just ignoring htree information for reading but we refuse to add new entries to the directory to avoid corrupting it more. Link: https://lore.kernel.org/r/20200210144316.22081-1-jack@suse.cz Fixes: dbe89444 ("ext4: Calculate and verify checksums for htree nodes") Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
-
由 Theodore Ts'o 提交于
A recent commit, 9803387c ("ext4: validate the debug_want_extra_isize mount option at parse time"), moved mount-time checks around. One of those changes moved the inode size check before the blocksize variable was set to the blocksize of the file system. After 9803387c was set to the minimum allowable blocksize, which in practice on most systems would be 1024 bytes. This cuased file systems with inode sizes larger than 1024 bytes to be rejected with a message: EXT4-fs (sdXX): unsupported inode size: 4096 Fixes: 9803387c ("ext4: validate the debug_want_extra_isize mount option at parse time") Link: https://lore.kernel.org/r/20200206225252.GA3673@mit.eduReported-by: NHerbert Poetzl <herbert@13thfloor.at> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
-
由 Jan Kara 提交于
Coverity reports that conditions checking quota limits in ext4_statfs() contain dead code. Indeed it is right and current conditions can be simplified. Link: https://lore.kernel.org/r/20200130111148.10766-1-jack@suse.czReported-by: NCoverity <scan-admin@coverity.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
-
由 Andreas Dilger 提交于
Don't assume that the mmp_nodename and mmp_bdevname strings are NUL terminated, since they are filled in by snprintf(), which is not guaranteed to do so. Link: https://lore.kernel.org/r/1580076215-1048-1-git-send-email-adilger@dilger.caSigned-off-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
-
- 25 1月, 2020 5 次提交
-
-
由 Shijie Luo 提交于
Fix comment and remove unneccessary blank. Signed-off-by: NShijie Luo <luoshijie1@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200123064325.36358-1-luoshijie1@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Dmitry Monakhov 提交于
Show pblock only if it has meaningful value. # before ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 576460752303423487 H ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 576460752303423487 HR # after ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 0 H ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 0 HR Signed-off-by: NDmitry Monakhov <dmonakhov@gmail.com> Link: https://lore.kernel.org/r/20191114200147.1073-2-dmonakhov@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Chengguang Xu 提交于
Setting softlimit larger than hardlimit seems meaningless for disk quota but currently it is allowed. In this case, there may be a bit of comfusion for users when they run df comamnd to directory which has project quota. For example, we set 20M softlimit and 10M hardlimit of block usage limit for project quota of test_dir(project id 123). [root@hades mnt_ext4]# repquota -P -a *** Report for project quotas on device /dev/loop0 Block grace time: 7days; Inode grace time: 7days Block limits File limits Project used soft hard grace used soft hard grace ---------------------------------------------------------------------- 0 -- 13 0 0 2 0 0 123 -- 10237 20480 10240 5 200 100 The result of df command as below: [root@hades mnt_ext4]# df -h test_dir Filesystem Size Used Avail Use% Mounted on /dev/loop0 20M 10M 10M 50% /home/cgxu/test/mnt_ext4 Even though it looks like there is another 10M free space to use, if we write new data to diretory test_dir(inherit project id), the write will fail with errno(-EDQUOT). After this patch, the df result looks like below. [root@hades mnt_ext4]# df -h test_dir Filesystem Size Used Avail Use% Mounted on /dev/loop0 10M 10M 3.0K 100% /home/cgxu/test/mnt_ext4 Signed-off-by: NChengguang Xu <cgxu519@mykernel.net> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20191016022501.760-1-cgxu519@mykernel.netSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
Since ->d_compare() and ->d_hash() can be called in RCU-walk mode, ->d_parent and ->d_inode can be concurrently modified, and in particular, ->d_inode may be changed to NULL. For ext4_d_hash() this resulted in a reproducible NULL dereference if a lookup is done in a directory being deleted, e.g. with: int main() { if (fork()) { for (;;) { mkdir("subdir", 0700); rmdir("subdir"); } } else { for (;;) access("subdir/file", 0); } } ... or by running the 't_encrypted_d_revalidate' program from xfstests. Both repros work in any directory on a filesystem with the encoding feature, even if the directory doesn't actually have the casefold flag. I couldn't reproduce a crash in ext4_d_compare(), but it appears that a similar crash is possible there. Fix these bugs by reading ->d_parent and ->d_inode using READ_ONCE() and falling back to the case sensitive behavior if the inode is NULL. Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Fixes: b886ee3e ("ext4: Support case-insensitive file name lookups") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20200124041234.159740-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This fixes the direct I/O versus writeback race which can reveal stale data, and it improves the tail latency of commits on slow devices. Link: https://lore.kernel.org/r/20200125022254.1101588-1-tytso@mit.eduSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 24 1月, 2020 1 次提交
-
-
由 Dmitry Monakhov 提交于
Extents are cached in read_extent_tree_block(); as a result, extents are not cached for inodes with depth == 0 when we try to find the extent using ext4_find_extent(). The result of the lookup is cached in ext4_map_blocks() but is only a subset of the extent on disk. As a result, the contents of extents status cache can get very badly fragmented for certain workloads, such as a random 4k read workload. File size of /mnt/test is 33554432 (8192 blocks of 4096 bytes) ext: logical_offset: physical_offset: length: expected: flags: 0: 0.. 8191: 40960.. 49151: 8192: last,eof $ perf record -e 'ext4:ext4_es_*' /root/bin/fio --name=t --direct=0 --rw=randread --bs=4k --filesize=32M --size=32M --filename=/mnt/test $ perf script | grep ext4_es_insert_extent | head -n 10 fio 131 [000] 13.975421: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [494/1) mapped 41454 status W fio 131 [000] 13.975939: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [6064/1) mapped 47024 status W fio 131 [000] 13.976467: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [6907/1) mapped 47867 status W fio 131 [000] 13.976937: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [3850/1) mapped 44810 status W fio 131 [000] 13.977440: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [3292/1) mapped 44252 status W fio 131 [000] 13.977931: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [6882/1) mapped 47842 status W fio 131 [000] 13.978376: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [3117/1) mapped 44077 status W fio 131 [000] 13.978957: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [2896/1) mapped 43856 status W fio 131 [000] 13.979474: ext4:ext4_es_insert_extent: dev 253,0 ino 12 es [7479/1) mapped 48439 status W Fix this by caching the extents for inodes with depth == 0 in ext4_find_extent(). [ Renamed ext4_es_cache_extents() to ext4_cache_extents() since this newly added function is not in extents_cache.c, and to avoid potential visual confusion with ext4_es_cache_extent(). -TYT ] Signed-off-by: NDmitry Monakhov <dmonakhov@gmail.com> Link: https://lore.kernel.org/r/20191106122502.19986-1-dmonakhov@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 18 1月, 2020 25 次提交
-
-
由 Theodore Ts'o 提交于
As Jan pointed out[1], as of commit 81378da6 ("jbd2: mark the transaction context with the scope GFP_NOFS context") we use memalloc_nofs_{save,restore}() while a jbd2 handle is active. So ext4_kvmalloc() so we can call allocate using GFP_NOFS is no longer necessary. [1] https://lore.kernel.org/r/20200109100007.GC27035@quack2.suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20200116155031.266620-1-tytso@mit.eduReviewed-by: NJan Kara <jack@suse.cz>
-
由 Martijn Coenen 提交于
These are backed by 'struct fsxattr' which has the same size on all architectures. Signed-off-by: NMartijn Coenen <maco@android.com> Link: https://lore.kernel.org/r/20191227134639.35869-1-maco@android.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Ritesh Harjani 提交于
Remove unused macro MPAGE_DA_EXTENT_TAIL which is no more used after below commit 4e7ea81d ("ext4: restructure writeback path") Signed-off-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200101095137.25656-1-riteshh@linux.ibm.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
For clarity, add braces to the loop in ext4_ext_drop_refs(). Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-9-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
Clean up some code that was using 2-character indents. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-8-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
Support for unwritten extents was added to ext4 a long time ago, so remove a misleading comment that says they're a future feature. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-7-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
Don't mention the nonexistent return value, and mention both types of merges that are attempted. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-6-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
Make the following functions static since they're only used in extents.c: __ext4_ext_dirty() ext4_can_extents_be_merged() ext4_collapse_range() ext4_insert_range() Also remove the prototype for ext4_ext_writepage_trans_blocks(), as this function is not defined anywhere. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-5-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
ext4_fallocate() is only used in the file_operations for regular files. Also, the VFS only allows fallocate() on regular files and block devices, but block devices always use blkdev_fallocate(). For both of these reasons, S_ISREG() is always true in ext4_fallocate(). Therefore the S_ISREG() checks in ext4_zero_range(), ext4_collapse_range(), ext4_insert_range(), and ext4_punch_hole() are redundant. Remove them. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-4-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
- Fix some comments. - Consistently access i_size directly rather than using i_size_read(), since in all relevant cases we're under inode_lock(). - Simplify the alignment checks by using the IS_ALIGNED() macro. - In ext4_insert_range(), do the check against s_maxbytes in a way that is safe against signed overflow. (This doesn't currently matter for ext4 due to ext4's limited max file size, but this is something other filesystems have gotten wrong. We might as well do it safely.) Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-3-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
Remove the ext4_ind_calc_metadata_amount() and ext4_ext_calc_metadata_amount() functions, which have been unused since commit 71d4f7d0 ("ext4: remove metadata reservation checks"). Also remove the i_da_metadata_calc_last_lblock and i_da_metadata_calc_len fields from struct ext4_inode_info, as these were only used by these removed functions. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231180444.46586-2-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
Since allocating an object from a mempool never fails when __GFP_DIRECT_RECLAIM (which is included in GFP_NOFS) is set, the check for failure to allocate a bio_post_read_ctx is unnecessary. Remove it. Also remove the redundant assignment to ->bi_private. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231181256.47770-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
Without any form of coordination, any case where multiple allocations from the same mempool are needed at a time to make forward progress can deadlock under memory pressure. This is the case for struct bio_post_read_ctx, as one can be allocated to decrypt a Merkle tree page during fsverity_verify_bio(), which itself is running from a post-read callback for a data bio which has its own struct bio_post_read_ctx. Fix this by freeing the first bio_post_read_ctx before calling fsverity_verify_bio(). This works because verity (if enabled) is always the last post-read step. This deadlock can be reproduced by trying to read from an encrypted verity file after reducing NUM_PREALLOC_POST_READ_CTXS to 1 and patching mempool_alloc() to pretend that pool->alloc() always fails. Note that since NUM_PREALLOC_POST_READ_CTXS is actually 128, to actually hit this bug in practice would require reading from lots of encrypted verity files at the same time. But it's theoretically possible, as N available objects isn't enough to guarantee forward progress when > N/2 threads each need 2 objects at a time. Fixes: 22cfe4b4 ("ext4: add fs-verity read support") Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231181222.47684-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
ext4_writepages() on an encrypted file has to encrypt the data, but it can't modify the pagecache pages in-place, so it encrypts the data into bounce pages and writes those instead. All bounce pages are allocated from a mempool using GFP_NOFS. This is not correct use of a mempool, and it can deadlock. This is because GFP_NOFS includes __GFP_DIRECT_RECLAIM, which enables the "never fail" mode for mempool_alloc() where a failed allocation will fall back to waiting for one of the preallocated elements in the pool. But since this mode is used for all a bio's pages and not just the first, it can deadlock waiting for pages already in the bio to be freed. This deadlock can be reproduced by patching mempool_alloc() to pretend that pool->alloc() always fails (so that it always falls back to the preallocations), and then creating an encrypted file of size > 128 KiB. Fix it by only using GFP_NOFS for the first page in the bio. For subsequent pages just use GFP_NOWAIT, and if any of those fail, just submit the bio and start a new one. This will need to be fixed in f2fs too, but that's less straightforward. Fixes: c9af28fd ("ext4 crypto: don't let data integrity writebacks fail with ENOMEM") Cc: stable@vger.kernel.org Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191231181149.47619-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Naoto Kobayashi 提交于
Since we're not using ext4_kvzalloc(), delete this function. Signed-off-by: NNaoto Kobayashi <naoto.kobayashi4c@gmail.com> Link: https://lore.kernel.org/r/20191227080523.31808-2-naoto.kobayashi4c@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
For encrypted files, commit 36086d43 ("ext4 crypto: fix bugs in ext4_encrypted_zeroout()") disabled the optimization where when a write occurs to the middle of an unwritten extent, the head and/or tail of the extent (when they aren't too large) are zeroed out, turned into an initialized extent, and merged with the part being written to. This optimization helps prevent fragmentation of the extent tree. However, disabling this optimization also made fscrypt_zeroout_range() nearly impossible to test, as now it's only reachable via the very rare case in ext4_split_extent_at() where allocating a new extent tree block fails due to ENOSPC. 'gce-xfstests -c ext4/encrypt -g auto' doesn't even hit this at all. It's preferable to avoid really rare cases that are hard to test. That commit also cited data corruption in xfstest generic/127 as a reason to disable the extent zeroout optimization, but that's no longer reproducible anymore. It also cited fscrypt_zeroout_range() having poor performance, but I've written a patch to fix that. Therefore, re-enable the extent zeroout optimization on encrypted files. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191226161114.53606-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
fscrypt_zeroout_range() is only for encrypted regular files, not for encrypted directories or symlinks. Fortunately, currently it seems it's never called on non-regular files. But to be safe ext4 should explicitly check S_ISREG() before calling it. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191226161022.53490-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
When ext4 encryption support was first added, ZERO_RANGE was disallowed, supposedly because test failures (e.g. ext4/001) were seen when enabling it, and at the time there wasn't enough time/interest to debug it. However, there's actually no reason why ZERO_RANGE can't work on encrypted files. And it fact it *does* work now. Whole blocks in the zeroed range are converted to unwritten extents, as usual; encryption makes no difference for that part. Partial blocks are zeroed in the pagecache and then ->writepages() encrypts those blocks as usual. ext4_block_zero_page_range() handles reading and decrypting the block if needed before actually doing the pagecache write. Also, f2fs has always supported ZERO_RANGE on encrypted files. As far as I can tell, the reason that ext4/001 was failing in v4.1 was actually because of one of the bugs fixed by commit 36086d43 ("ext4 crypto: fix bugs in ext4_encrypted_zeroout()"). The bug made ext4_encrypted_zeroout() always return a positive value, which caused unwritten extents in encrypted files to sometimes not be marked as initialized after being written to. This bug was not actually in ZERO_RANGE; it just happened to trigger during the extents manipulation done in ext4/001 (and probably other tests too). So, let's enable ZERO_RANGE on encrypted files on ext4. Tested with: gce-xfstests -c ext4/encrypt -g auto gce-xfstests -c ext4/encrypt_1k -g auto Got the same set of test failures both with and without this patch. But with this patch 6 fewer tests are skipped: ext4/001, generic/008, generic/009, generic/033, generic/096, and generic/511. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191226154216.4808-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
fscrypt_decrypt_pagecache_blocks() can fail, because it uses skcipher_request_alloc(), which uses kmalloc(), which can fail; and also because it calls crypto_skcipher_decrypt(), which can fail depending on the driver that actually implements the crypto. Therefore it's not appropriate to WARN on decryption error in __ext4_block_zero_page_range(). Remove the WARN and just handle the error instead. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191226154105.4704-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
Since EXT3_FS already selects EXT4_FS, there's no reason for it to redundantly select all the selections of EXT4_FS -- notwithstanding the comments that claim otherwise. Remove these redundant selections to avoid confusion. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191226153920.4466-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 zhengbin 提交于
Fixes coccicheck warning: fs/ext4/extents.c:5271:6-12: WARNING: Assignment of 0/1 to bool variable fs/ext4/extents.c:5287:4-10: WARNING: Assignment of 0/1 to bool variable Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: Nzhengbin <zhengbin13@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/1577241959-138695-1-git-send-email-zhengbin13@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Biggers 提交于
Determining an inode's journaling mode has gotten more complicated over time. Move ext4_inode_journal_mode() from an inline function into ext4_jbd2.c to reduce the compiled code size. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191209233602.117778-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Eric Biggers 提交于
The ifdefs for CONFIG_FS_ENCRYPTION in htree_dirblock_to_tree() are unnecessary, as the called functions are already stubbed out when !CONFIG_FS_ENCRYPTION. Remove them. Signed-off-by: NEric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191209213225.18477-1-ebiggers@kernel.orgSigned-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Chengguang Xu 提交于
We have allocated memory using kzalloc() so don't have to set 0 again in last byte. Signed-off-by: NChengguang Xu <cgxu519@mykernel.net> Link: https://lore.kernel.org/r/20191206054317.3107-1-cgxu519@mykernel.netSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Linus observed that an allmodconfig build which does a lot of stat(2) calls that ext4_getattr() was a noticeable (1%) amount of CPU time, due to the cache line for i_extra_isize getting pulled in. Since the normal stat system call doesn't return btime, it's a complete waste. So only calculate btime when it is explicitly requested. [ Fixed to check against request_mask instead of query_flags. ] Link: https://lore.kernel.org/r/CAHk-=wivmk_j6KbTX+Er64mLrG8abXZo0M10PNdAnHc8fWXfsQ@mail.gmail.comReported-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 15 1月, 2020 1 次提交
-
-
由 Eric Biggers 提交于
When fs-verity verifies data pages, currently it reads each Merkle tree page synchronously using read_mapping_page(). Therefore, when the Merkle tree pages aren't already cached, fs-verity causes an extra 4 KiB I/O request for every 512 KiB of data (assuming that the Merkle tree uses SHA-256 and 4 KiB blocks). This results in more I/O requests and performance loss than is strictly necessary. Therefore, implement readahead of the Merkle tree pages. For simplicity, we take advantage of the fact that the kernel already does readahead of the file's *data*, just like it does for any other file. Due to this, we don't really need a separate readahead state (struct file_ra_state) just for the Merkle tree, but rather we just need to piggy-back on the existing data readahead requests. We also only really need to bother with the first level of the Merkle tree, since the usual fan-out factor is 128, so normally over 99% of Merkle tree I/O requests are for the first level. Therefore, make fsverity_verify_bio() enable readahead of the first Merkle tree level, for up to 1/4 the number of pages in the bio, when it sees that the REQ_RAHEAD flag is set on the bio. The readahead size is then passed down to ->read_merkle_tree_page() for the filesystem to (optionally) implement if it sees that the requested page is uncached. While we're at it, also make build_merkle_tree_level() set the Merkle tree readahead size, since it's easy to do there. However, for now don't set the readahead size in fsverity_verify_page(), since currently it's only used to verify holes on ext4 and f2fs, and it would need parameters added to know how much to read ahead. This patch significantly improves fs-verity sequential read performance. Some quick benchmarks with 'cat'-ing a 250MB file after dropping caches: On an ARM64 phone (using sha256-ce): Before: 217 MB/s After: 263 MB/s (compare to sha256sum of non-verity file: 357 MB/s) In an x86_64 VM (using sha256-avx2): Before: 173 MB/s After: 215 MB/s (compare to sha256sum of non-verity file: 223 MB/s) Link: https://lore.kernel.org/r/20200106205533.137005-1-ebiggers@kernel.orgReviewed-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NEric Biggers <ebiggers@google.com>
-
- 10 1月, 2020 1 次提交
-
-
由 Alan Maguire 提交于
As tests are added to kunit, it will become less feasible to execute all built tests together. By supporting modular tests we provide a simple way to do selective execution on a running system; specifying CONFIG_KUNIT=y CONFIG_KUNIT_EXAMPLE_TEST=m ...means we can simply "insmod example-test.ko" to run the tests. To achieve this we need to do the following: o export the required symbols in kunit o string-stream tests utilize non-exported symbols so for now we skip building them when CONFIG_KUNIT_TEST=m. o drivers/base/power/qos-test.c contains a few unexported interface references, namely freq_qos_read_value() and freq_constraints_init(). Both of these could be potentially defined as static inline functions in include/linux/pm_qos.h, but for now we simply avoid supporting module build for that test suite. o support a new way of declaring test suites. Because a module cannot do multiple late_initcall()s, we provide a kunit_test_suites() macro to declare multiple suites within the same module at once. o some test module names would have been too general ("test-test" and "example-test" for kunit tests, "inode-test" for ext4 tests); rename these as appropriate ("kunit-test", "kunit-example-test" and "ext4-inode-test" respectively). Also define kunit_test_suite() via kunit_test_suites() as callers in other trees may need the old definition. Co-developed-by: NKnut Omang <knut.omang@oracle.com> Signed-off-by: NKnut Omang <knut.omang@oracle.com> Signed-off-by: NAlan Maguire <alan.maguire@oracle.com> Reviewed-by: NBrendan Higgins <brendanhiggins@google.com> Acked-by: Theodore Ts'o <tytso@mit.edu> # for ext4 bits Acked-by: David Gow <davidgow@google.com> # For list-test Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NShuah Khan <skhan@linuxfoundation.org>
-
- 01 1月, 2020 2 次提交
-
-
由 Herbert Xu 提交于
The commit 643fa961 ("fscrypt: remove filesystem specific build config option") removed modular support for fs/crypto. This causes the Crypto API to be built-in whenever fscrypt is enabled. This makes it very difficult for me to test modular builds of the Crypto API without disabling fscrypt which is a pain. As fscrypt is still evolving and it's developing new ties with the fs layer, it's hard to build it as a module for now. However, the actual algorithms are not required until a filesystem is mounted. Therefore we can allow them to be built as modules. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/20191227024700.7vrzuux32uyfdgum@gondor.apana.org.auSigned-off-by: NEric Biggers <ebiggers@google.com>
-
由 Eric Biggers 提交于
fscrypt_get_encryption_info() returns 0 if the encryption key is unavailable; it never returns ENOKEY. So remove checks for ENOKEY. Link: https://lore.kernel.org/r/20191209212348.243331-1-ebiggers@kernel.orgSigned-off-by: NEric Biggers <ebiggers@google.com>
-
- 27 12月, 2019 1 次提交
-
-
由 Jan Kara 提交于
Currently we start transaction for mapping every extent for writing using direct IO. This is unnecessary when we know we are overwriting already allocated blocks and the overhead of starting a transaction can be significant especially for multithreaded workloads doing small writes. Use iomap operations that avoid starting a transaction for direct IO overwrites. This improves throughput of 4k random writes - fio jobfile: [global] rw=randrw norandommap=1 invalidate=0 bs=4k numjobs=16 time_based=1 ramp_time=30 runtime=120 group_reporting=1 ioengine=psync direct=1 size=16G filename=file1.0.0:file1.0.1:file1.0.2:file1.0.3:file1.0.4:file1.0.5:file1.0.6:file1.0.7:file1.0.8:file1.0.9:file1.0.10:file1.0.11:file1.0.12:file1.0.13:file1.0.14:file1.0.15:file1.0.16:file1.0.17:file1.0.18:file1.0.19:file1.0.20:file1.0.21:file1.0.22:file1.0.23:file1.0.24:file1.0.25:file1.0.26:file1.0.27:file1.0.28:file1.0.29:file1.0.30:file1.0.31 file_service_type=random nrfiles=32 from 3018MB/s to 4059MB/s in my test VM running test against simulated pmem device (note that before iomap conversion, this workload was able to achieve 3708MB/s because old direct IO path avoided transaction start for overwrites as well). For dax, the win is even larger improving throughput from 3042MB/s to 4311MB/s. Reported-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20191218174433.19380-1-jack@suse.czSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-