- 15 10月, 2021 2 次提交
-
-
由 Darrick J. Wong 提交于
When log recovery tries to recover a transaction that had log intent items attached to it, it has to save certain parts of the transaction state (reservation, dfops chain, inodes with no automatic unlock) so that it can finish single-stepping the recovered transactions before finishing the chains. This is done with the xfs_defer_ops_capture and xfs_defer_ops_continue functions. Right now they open-code this functionality, so let's port this to the formalized resource capture structure that we introduced in the previous patch. This enables us to hold up to two inodes and two buffers during log recovery, the same way we do for regular runtime. With this patch applied, we'll be ready to support atomic extent swap which holds two inodes; and logged xattrs which holds one inode and one xattr leaf buffer. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
-
由 Darrick J. Wong 提交于
Transaction users are allowed to flag up to two buffers and two inodes for ownership preservation across a deferred transaction roll. Hoist the variables and code responsible for this out of xfs_defer_trans_roll so that we can use it for the defer capture mechanism. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
-
- 12 10月, 2021 2 次提交
-
-
由 Rustam Kovhaev 提交于
For kmalloc() allocations SLOB prepends the blocks with a 4-byte header, and it puts the size of the allocated blocks in that header. Blocks allocated with kmem_cache_alloc() allocations do not have that header. SLOB explodes when you allocate memory with kmem_cache_alloc() and then try to free it with kfree() instead of kmem_cache_free(). SLOB will assume that there is a header when there is none, read some garbage to size variable and corrupt the adjacent objects, which eventually leads to hang or panic. Let's make XFS work with SLOB by using proper free function. Fixes: 9749fee8 ("xfs: enable the xfs_defer mechanism to process extents to free") Signed-off-by: NRustam Kovhaev <rkovhaev@gmail.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Gustavo A. R. Silva 提交于
Use 2-factor argument multiplication form kvcalloc() instead of kvzalloc(). Link: https://github.com/KSPP/linux/issues/162Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
- 04 10月, 2021 1 次提交
-
-
由 Chen Jingwen 提交于
In commit b212921b ("elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings") we still leave MAP_FIXED_NOREPLACE in place for load_elf_interp. Unfortunately, this will cause kernel to fail to start with: 1 (init): Uhuuh, elf segment at 00003ffff7ffd000 requested but the memory is mapped already Failed to execute /init (error -17) The reason is that the elf interpreter (ld.so) has overlapping segments. readelf -l ld-2.31.so Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000002c94c 0x000000000002c94c R E 0x10000 LOAD 0x000000000002dae0 0x000000000003dae0 0x000000000003dae0 0x00000000000021e8 0x0000000000002320 RW 0x10000 LOAD 0x000000000002fe00 0x000000000003fe00 0x000000000003fe00 0x00000000000011ac 0x0000000000001328 RW 0x10000 The reason for this problem is the same as described in commit ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments"). Not only executable binaries, elf interpreters (e.g. ld.so) can have overlapping elf segments, so we better drop MAP_FIXED_NOREPLACE and go back to MAP_FIXED in load_elf_interp. Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") Cc: <stable@vger.kernel.org> # v4.19 Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: NChen Jingwen <chenjingwen6@huawei.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 10月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
We have never supported fasync properly, it would only fire when there is something polling io_uring making it useless. The original support came in through the initial io_uring merge for 5.1. Since it's broken and nobody has reported it, get rid of the fasync bits. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2f7ca3d344d406d34fa6713824198915c41cea86.1633080236.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 10月, 2021 7 次提交
-
-
由 Zhang Yi 提交于
Commit 8e33fadf ("ext4: remove an unnecessary if statement in __ext4_get_inode_loc()") forget to recheck buffer's uptodate bit again under buffer lock, which may overwrite the buffer if someone else have already brought it uptodate and changed it. Fixes: 8e33fadf ("ext4: remove an unnecessary if statement in __ext4_get_inode_loc()") Cc: stable@kernel.org Signed-off-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210910080316.70421-1-yi.zhang@huawei.com
-
由 yangerkun 提交于
When ext4_htree_fill_tree() fails, ext4_dx_readdir() can run into an infinite loop since if info->last_pos != ctx->pos this will reset the directory scan and reread the failing entry. For example: 1. a dx_dir which has 3 block, block 0 as dx_root block, block 1/2 as leaf block which own the ext4_dir_entry_2 2. block 1 read ok and call_filldir which will fill the dirent and update the ctx->pos 3. block 2 read fail, but we has already fill some dirent, so we will return back to userspace will a positive return val(see ksys_getdents64) 4. the second ext4_dx_readdir will reset the world since info->last_pos != ctx->pos, and will also init the curr_hash which pos to block 1 5. So we will read block1 too, and once block2 still read fail, we can only fill one dirent because the hash of the entry in block1(besides the last one) won't greater than curr_hash 6. this time, we forget update last_pos too since the read for block2 will fail, and since we has got the one entry, ksys_getdents64 can return success 7. Latter we will trapped in a loop with step 4~6 Cc: stable@kernel.org Signed-off-by: Nyangerkun <yangerkun@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210914111415.3921954-1-yangerkun@huawei.com
-
由 yangerkun 提交于
The error path in ext4_fill_super forget to flush s_error_work before journal destroy, and it may trigger the follow bug since flush_stashed_error_work can run concurrently with journal destroy without any protection for sbi->s_journal. [32031.740193] EXT4-fs (loop66): get root inode failed [32031.740484] EXT4-fs (loop66): mount failed [32031.759805] ------------[ cut here ]------------ [32031.759807] kernel BUG at fs/jbd2/transaction.c:373! [32031.760075] invalid opcode: 0000 [#1] SMP PTI [32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded 4.18.0 [32031.765112] Call Trace: [32031.765375] ? __switch_to_asm+0x35/0x70 [32031.765635] ? __switch_to_asm+0x41/0x70 [32031.765893] ? __switch_to_asm+0x35/0x70 [32031.766148] ? __switch_to_asm+0x41/0x70 [32031.766405] ? _cond_resched+0x15/0x40 [32031.766665] jbd2__journal_start+0xf1/0x1f0 [jbd2] [32031.766934] jbd2_journal_start+0x19/0x20 [jbd2] [32031.767218] flush_stashed_error_work+0x30/0x90 [ext4] [32031.767487] process_one_work+0x195/0x390 [32031.767747] worker_thread+0x30/0x390 [32031.768007] ? process_one_work+0x390/0x390 [32031.768265] kthread+0x10d/0x130 [32031.768521] ? kthread_flush_work_fn+0x10/0x10 [32031.768778] ret_from_fork+0x35/0x40 static int start_this_handle(...) BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this Besides, after we enable fast commit, ext4_fc_replay can add work to s_error_work but return success, so the latter journal destroy in ext4_load_journal can trigger this problem too. Fix this problem with two steps: 1. Call ext4_commit_super directly in ext4_handle_error for the case that called from ext4_fc_replay 2. Since it's hard to pair the init and flush for s_error_work, we'd better add a extras flush_work before journal destroy in ext4_fill_super Besides, this patch will call ext4_commit_super in ext4_handle_error for any nojournal case too. But it seems safe since the reason we call schedule_work was that we should save error info to sb through journal if available. Conversely, for the nojournal case, it seems useless delay commit superblock to s_error_work. Fixes: c92dc856 ("ext4: defer saving error info from atomic context") Fixes: 2d01ddc8 ("ext4: save error info to sb through journal if available") Cc: stable@kernel.org Signed-off-by: Nyangerkun <yangerkun@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.com
-
由 Ritesh Harjani 提交于
We should use unsigned long long rather than loff_t to avoid overflow in ext4_max_bitmap_size() for comparison before returning. w/o this patch sbi->s_bitmap_maxbytes was becoming a negative value due to overflow of upper_limit (with has_huge_files as true) Below is a quick test to trigger it on a 64KB pagesize system. sudo mkfs.ext4 -b 65536 -O ^has_extents,^64bit /dev/loop2 sudo mount /dev/loop2 /mnt sudo echo "hello" > /mnt/hello -> This will error out with "echo: write error: File too large" Signed-off-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Link: https://lore.kernel.org/r/594f409e2c543e90fd836b78188dfa5c575065ba.1622867594.git.riteshh@linux.ibm.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Jeffle Xu 提交于
When ext4_insert_delayed block receives and recovers from an error from ext4_es_insert_delayed_block(), e.g., ENOMEM, it does not release the space it has reserved for that block insertion as it should. One effect of this bug is that s_dirtyclusters_counter is not decremented and remains incorrectly elevated until the file system has been unmounted. This can result in premature ENOSPC returns and apparent loss of free space. Another effect of this bug is that /sys/fs/ext4/<dev>/delayed_allocation_blocks can remain non-zero even after syncfs has been executed on the filesystem. Besides, add check for s_dirtyclusters_counter when inode is going to be evicted and freed. s_dirtyclusters_counter can still keep non-zero until inode is written back in .evict_inode(), and thus the check is delayed to .destroy_inode(). Fixes: 51865fda ("ext4: let ext4 maintain extent status tree") Cc: stable@kernel.org Suggested-by: NGao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210823061358.84473-1-jefflexu@linux.alibaba.com
-
由 Hou Tao 提交于
Now EXT4_FC_TAG_ADD_RANGE uses ext4_extent to track the newly-added blocks, but the limit on the max value of ee_len field is ignored, and it can lead to BUG_ON as shown below when running command "fallocate -l 128M file" on a fast_commit-enabled fs: kernel BUG at fs/ext4/ext4_extents.h:199! invalid opcode: 0000 [#1] SMP PTI CPU: 3 PID: 624 Comm: fallocate Not tainted 5.14.0-rc6+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:ext4_fc_write_inode_data+0x1f3/0x200 Call Trace: ? ext4_fc_write_inode+0xf2/0x150 ext4_fc_commit+0x93b/0xa00 ? ext4_fallocate+0x1ad/0x10d0 ext4_sync_file+0x157/0x340 ? ext4_sync_file+0x157/0x340 vfs_fsync_range+0x49/0x80 do_fsync+0x3d/0x70 __x64_sys_fsync+0x14/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Simply fixing it by limiting the number of blocks in one EXT4_FC_TAG_ADD_RANGE TLV. Fixes: aa75f4d3 ("ext4: main fast-commit commit path") Cc: stable@kernel.org Signed-off-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210820044505.474318-1-houtao1@huawei.com
-
由 Dan Carpenter 提交于
The kmalloc() does not have a NULL check. This code can be re-written slightly cleaner to just use the kstrdup(). Fixes: 265fd199 ("ksmbd: use LOOKUP_BENEATH to prevent the out of share access") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NNamjae Jeon <linkinjeon@kernel.org> Acked-by: NHyunchul Lee <hyc.lee@gmail.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
- 30 9月, 2021 6 次提交
-
-
由 Namjae Jeon 提交于
Validate that the transform and smb request headers are present before checking OriginalMessageSize and SessionId fields. Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: NTom Talpey <tom@talpey.com> Acked-by: NHyunchul Lee <hyc.lee@gmail.com> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Hyunchul Lee 提交于
Add buffer validation for SMB2_CREATE_CONTEXT. Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Reviewed-by: NRalph Boehme <slow@samba.org> Signed-off-by: NHyunchul Lee <hyc.lee@gmail.com> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Namjae Jeon 提交于
This patch add validation to check request buffer check in smb2 negotiate and fix null pointer deferencing oops in smb3_preauth_hash_rsp() that found from manual test. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Hyunchul Lee <hyc.lee@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: NRalph Boehme <slow@samba.org> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Namjae Jeon 提交于
Add buffer validation in smb2_set_info, and remove unused variable in set_file_basic_info. and smb2_set_info infolevel functions take structure pointer argument. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: NHyunchul Lee <hyc.lee@gmail.com> Reviewed-by: NRalph Boehme <slow@samba.org> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Namjae Jeon 提交于
Use correct basic info level in set/get_file_basic_info(). Reviewed-by: NRalph Boehme <slow@samba.org> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Namjae Jeon 提交于
Remove insecure NTLMv1 authentication. Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Reviewed-by: NTom Talpey <tom@talpey.com> Acked-by: NSteve French <smfrench@gmail.com> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
- 29 9月, 2021 2 次提交
-
-
由 Enzo Matsumiya 提交于
ksmbd_kthread_fn() and create_socket() returns 0 or error code, and not task_struct/ERR_PTR. Signed-off-by: NEnzo Matsumiya <ematsumiya@suse.de> Acked-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Hou Tao 提交于
A KMSAN warning is reported by Alexander Potapenko: BUG: KMSAN: uninit-value in kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053 kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053 d_revalidate fs/namei.c:854 lookup_dcache fs/namei.c:1522 __lookup_hash+0x3a6/0x590 fs/namei.c:1543 filename_create+0x312/0x7c0 fs/namei.c:3657 do_mkdirat+0x103/0x930 fs/namei.c:3900 __do_sys_mkdir fs/namei.c:3931 __se_sys_mkdir fs/namei.c:3929 __x64_sys_mkdir+0xda/0x120 fs/namei.c:3929 do_syscall_x64 arch/x86/entry/common.c:51 It seems a positive dentry in kernfs becomes a negative dentry directly through d_delete() in vfs_rmdir(). dentry->d_time is uninitialized when accessing it in kernfs_dop_revalidate(), because it is only initialized when created as negative dentry in kernfs_iop_lookup(). The problem can be reproduced by the following command: cd /sys/fs/cgroup/pids && mkdir hi && stat hi && rmdir hi && stat hi A simple fixes seems to be initializing d->d_time for positive dentry in kernfs_iop_lookup() as well. The downside is the negative dentry will be revalidated again after it becomes negative in d_delete(), because the revison of its parent must have been increased due to its removal. Alternative solution is implement .d_iput for kernfs, and assign d_time for the newly-generated negative dentry in it. But we may need to take kernfs_rwsem to protect again the concurrent kernfs_link_sibling() on the parent directory, it is a little over-killing. Now the simple fix is chosen. Link: https://marc.info/?l=linux-fsdevel&m=163249838610499 Fixes: c7e7c042 ("kernfs: use VFS negative dentry caching") Reported-by: NAlexander Potapenko <glider@google.com> Signed-off-by: NHou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20210928140750.1274441-1-houtao1@huawei.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 28 9月, 2021 2 次提交
-
-
由 Linus Torvalds 提交于
Commit 9d682ea6 ("vboxsf: Fix the check for the old binary mount-arguments struct") was meant to fix a build error due to sign mismatch in 'char' and the use of character constants, but it just moved the error elsewhere, in that on some architectures characters and signed and on others they are unsigned, and that's just how the C standard works. The proper fix is a simple "don't do that then". The code was just being silly and odd, and it should never have cared about signed vs unsigned characters in the first place, since what it is testing is not four "characters", but four bytes. And the way to compare four bytes is by using "memcmp()". Which compilers will know to just turn into a single 32-bit compare with a constant, as long as you don't have crazy debug options enabled. Link: https://lore.kernel.org/lkml/20210927094123.576521-1-arnd@kernel.org/ Cc: Arnd Bergmann <arnd@kernel.org> Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jens Axboe 提交于
io-wq threads block all signals, except SIGKILL and SIGSTOP. We should not need any extra checking of signal_pending or fatal_signal_pending, rely exclusively on whether or not get_signal() tells us to exit. The original debugging of this issue led to the false positive that we were exiting on non-fatal signals, but that is not the case. The issue was around races with nr_workers accounting. Fixes: 87c16966 ("io-wq: ensure we exit if thread group is exiting") Fixes: 15e20db2 ("io-wq: only exit on fatal signals") Reported-by: NEric W. Biederman <ebiederm@xmission.com> Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 9月, 2021 2 次提交
-
-
由 Namjae Jeon 提交于
Ronnie reported invalid request buffer access in chained command when inserting garbage value to NextCommand of compound request. This patch add validation check to avoid this issue. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Tested-by: NSteve French <smfrench@gmail.com> Reviewed-by: NSteve French <smfrench@gmail.com> Acked-by: NHyunchul Lee <hyc.lee@gmail.com> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Ronnie Sahlberg 提交于
In smb_common.c you have this function : ksmbd_smb_request() which is called from connection.c once you have read the initial 4 bytes for the next length+smb2 blob. It checks the first byte of this 4 byte preamble for valid values, i.e. a NETBIOSoverTCP SESSION_MESSAGE or a SESSION_KEEP_ALIVE. We don't need to check this for ksmbd since it only implements SMB2 over TCP port 445. The netbios stuff was only used in very old servers when SMB ran over TCP port 139. Now that we run over TCP port 445, this is actually not a NB header anymore and you can just treat it as a 4 byte length field that must be less than 16Mbyte. and remove the references to the RFC1002 constants that no longer applies. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: NHyunchul Lee <hyc.lee@gmail.com> Signed-off-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
- 25 9月, 2021 12 次提交
-
-
由 Hyunchul Lee 提交于
instead of removing '..' in a given path, call kern_path with LOOKUP_BENEATH flag to prevent the out of share access. ran various test on this: smb2-cat-async smb://127.0.0.1/homes/../out_of_share smb2-cat-async smb://127.0.0.1/homes/foo/../../out_of_share smbclient //127.0.0.1/homes -c "mkdir ../foo2" smbclient //127.0.0.1/homes -c "rename bar ../bar" Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Boehme <slow@samba.org> Tested-by: NSteve French <smfrench@gmail.com> Tested-by: NNamjae Jeon <linkinjeon@kernel.org> Acked-by: NNamjae Jeon <linkinjeon@kernel.org> Signed-off-by: NHyunchul Lee <hyc.lee@gmail.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Minchan Kim 提交于
The kernel test robot reported the regression of fio.write_iops[1] with commit 8cc621d2 ("mm: fs: invalidate BH LRU during page migration"). Since lru_add_drain is called frequently, invalidate bh_lrus there could increase bh_lrus cache miss ratio, which needs more IO in the end. This patch moves the bh_lrus invalidation from the hot path( e.g., zap_page_range, pagevec_release) to cold path(i.e., lru_add_drain_all, lru_cache_disable). Zhengjun Xing confirmed "I test the patch, the regression reduced to -2.9%" [1] https://lore.kernel.org/lkml/20210520083144.GD14190@xsang-OptiPlex-9020/ [2] 8cc621d2, mm: fs: invalidate BH LRU during page migration Link: https://lkml.kernel.org/r/20210907212347.1977686-1-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org> Reported-by: Nkernel test robot <oliver.sang@intel.com> Reviewed-by: NChris Goldsworthy <cgoldswo@codeaurora.org> Tested-by: N"Xing, Zhengjun" <zhengjun.xing@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wengang Wang 提交于
ocfs2_data_convert_worker() is currently dropping any cached acl info for FILE before down-converting meta lock. It should also drop for DIRECTORY. Otherwise the second acl lookup returns the cached one (from VFS layer) which could be already stale. The problem we are seeing is that the acl changes on one node doesn't get refreshed on other nodes in the following case: Node 1 Node 2 -------------- ---------------- getfacl dir1 getfacl dir1 <-- this is OK setfacl -m u:user1:rwX dir1 getfacl dir1 <-- see the change for user1 getfacl dir1 <-- can't see change for user1 Link: https://lkml.kernel.org/r/20210903012631.6099-1-wen.gang.wang@oracle.comSigned-off-by: NWengang Wang <wen.gang.wang@oracle.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pavel Begunkov 提交于
From recently open/accept are now able to manipulate fixed file table, but it's inconsistent that close can't. Close the gap, keep API same as with open/accept, i.e. via sqe->file_slot. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
We don't retry short writes and so we would never get to async setup in io_write() in that case. Thus ret2 > 0 is always false and iov_iter_advance() is never used. Apparently, the same is found by Coverity, which complains on the code. Fixes: cd658695 ("io_uring: use iov_iter state save/restore helpers") Reported-by: NDave Jones <davej@codemonkey.org.uk> Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5b33e61034748ef1022766efc0fb8854cfcf749c.1632500058.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
There's no reason to punt it unconditionally, we just need to ensure that the submit lock grabbing is conditional. Fixes: 05f3fb3c ("io_uring: avoid ring quiesce for fixed file set unregister and update") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
For each provided buffer, we allocate a struct io_buffer to hold the data associated with it. As a large number of buffers can be provided, account that data with memcg. Fixes: ddf0322d ("io_uring: add IORING_OP_PROVIDE_BUFFERS") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
If we have a lot of threads and rings, the tctx list can get quite big. This is especially true if we keep creating new threads and rings. Likewise for the provided buffers list. Be nice and insert a conditional reschedule point while iterating the nodes for deletion. Link: https://lore.kernel.org/io-uring/00000000000064b6b405ccb41113@google.com/ Reported-by: syzbot+111d2a03f51f5ae73775@syzkaller.appspotmail.com Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Hao Xu 提交于
For multishot mode, there may be cases like: iowq original context io_poll_add _arm_poll() mask = vfs_poll() is not 0 if mask (2) io_poll_complete() compl_unlock (interruption happens tw queued to original context) io_poll_task_func() compl_lock (3) done = io_poll_complete() is true compl_unlock put req ref (1) if (poll->flags & EPOLLONESHOT) put req ref EPOLLONESHOT flag in (1) may be from (2) or (3), so there are multiple combinations that can cause ref underfow. Let's address it by: - check the return value in (2) as done - change (1) to if (done) in this way, we only do ref put in (1) if 'oneshot flag' is from (2) - do poll.done check in io_poll_task_func(), so that we won't put ref for the second time. Signed-off-by: NHao Xu <haoxu@linux.alibaba.com> Link: https://lore.kernel.org/r/20210922101238.7177-4-haoxu@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Hao Xu 提交于
We should set EPOLLONESHOT if cqring_fill_event() returns false since io_poll_add() decides to put req or not by it. Fixes: 5082620f ("io_uring: terminate multishot poll for CQ ring overflow") Signed-off-by: NHao Xu <haoxu@linux.alibaba.com> Link: https://lore.kernel.org/r/20210922101238.7177-3-haoxu@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Hao Xu 提交于
If poll arming and poll completion runs in parallel, there maybe races. For instance, run io_poll_add in iowq and io_poll_task_func in original context, then: iowq original context io_poll_add vfs_poll (interruption happens tw queued to original context) io_poll_task_func generate cqe del from cancel_hash[] if !poll.done insert to cancel_hash[] The entry left in cancel_hash[], similar case for fast poll. Fix it by set poll.done = true when del from cancel_hash[]. Fixes: 5082620f ("io_uring: terminate multishot poll for CQ ring overflow") Signed-off-by: NHao Xu <haoxu@linux.alibaba.com> Link: https://lore.kernel.org/r/20210922101238.7177-2-haoxu@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Dave reports that a coredumping workload gets stuck in 5.15-rc2, and identified the culprit in the Fixes line below. The problem is that relying solely on fatal_signal_pending() to gate whether to exit or not fails miserably if a process gets eg SIGILL sent. Don't exclusively rely on fatal signals, also check if the thread group is exiting. Fixes: 15e20db2 ("io-wq: only exit on fatal signals") Reported-by: NDave Chinner <david@fromorbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 9月, 2021 3 次提交
-
-
由 Steve French 提交于
Although very unlikely that the tlink pointer would be null in this case, get_next_mid function can in theory return null (but not an error) so need to check for null (not for IS_ERR, which can not be returned here). Address warning: fs/smbfs_client/connect.c:2392 cifs_match_super() warn: 'tlink' isn't an ERR_PTR Pointed out by Dan Carpenter via smatch code analysis tool CC: stable@vger.kernel.org Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Steve French 提交于
Address warning: fs/smbfs_client/misc.c:273 header_assemble() warn: variable dereferenced before check 'treeCon->ses->server' Pointed out by Dan Carpenter via smatch code analysis tool Although the check is likely unneeded, adding it makes the code more consistent and easier to read, as the same check is done elsewhere in the function. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Steve French 提交于
Address warning: fs/smbfs_client/smb2pdu.c:2425 create_sd_buf() warn: struct type mismatch 'smb3_acl vs cifs_acl' Pointed out by Dan Carpenter via smatch code analysis tool Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-