- 17 5月, 2020 3 次提交
-
-
由 Pavel Begunkov 提交于
As for other not inlined requests, alloc req->io for FORCE_ASYNC reqs, so they can be prepared properly. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
If req->io is not NULL, it's already prepared. Don't do it again, it's dangerous. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Ensure that ctx->sqo_wait is initialized as soon as the ctx is allocated, instead of deferring it to the offload setup. This fixes a syzbot reported lockdep complaint, which is really due to trying to wake_up on an uninitialized wait queue: RSP: 002b:00007fffb1fb9aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441319 RDX: 0000000000000001 RSI: 0000000020000140 RDI: 000000000000047b RBP: 0000000000010475 R08: 0000000000000001 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402260 R13: 00000000004022f0 R14: 0000000000000000 R15: 0000000000000000 INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 7090 Comm: syz-executor222 Not tainted 5.7.0-rc1-next-20200415-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 assign_lock_key kernel/locking/lockdep.c:913 [inline] register_lock_class+0x1664/0x1760 kernel/locking/lockdep.c:1225 __lock_acquire+0x104/0x4c50 kernel/locking/lockdep.c:4234 lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4934 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x8c/0xbf kernel/locking/spinlock.c:159 __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122 io_cqring_ev_posted+0xa5/0x1e0 fs/io_uring.c:1160 io_poll_remove_all fs/io_uring.c:4357 [inline] io_ring_ctx_wait_and_kill+0x2bc/0x5a0 fs/io_uring.c:7305 io_uring_create fs/io_uring.c:7843 [inline] io_uring_setup+0x115e/0x22b0 fs/io_uring.c:7870 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x441319 Code: e8 5c ae 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffb1fb9aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 Reported-by: syzbot+8c91f5d054e998721c57@syzkaller.appspotmail.com Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 5月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
When we changed the file registration handling, it became important to iterate the bulk request freeing list for fixed files as well, or we miss dropping the fixed file reference. If not, we're leaking references, and we'll get a kworker stuck waiting for file references to disappear. This also means we can remove the special casing of fixed vs non-fixed files, we need to iterate for both and we can just rely on __io_req_aux_free() doing io_put_file() instead of doing it manually. Fixes: 05589553 ("io_uring: refactor file register/unregister/update handling") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 10 5月, 2020 1 次提交
-
-
由 Pavel Begunkov 提交于
do_splice() doesn't expect len to be 0. Just always return 0 in this case as splice(2) does. Fixes: 7d67af2c ("io_uring: add splice(2) support") Reported-by: NJann Horn <jannh@google.com> Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 5月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
We currently make some guesses as when to open this fd, but in reality we have no business (or need) to do so at all. In fact, it makes certain things fail, like O_PATH. Remove the fd lookup from these opcodes, we're just passing the 'fd' to generic helpers anyway. With that, we can also remove the special casing of fd values in io_req_needs_file(), and the 'fd_non_neg' check that we have. And we can ensure that we only read sqe->fd once. This fixes O_PATH usage with openat/openat2, and ditto statx path side oddities. Cc: stable@vger.kernel.org: # v5.6 Reported-by: NMax Kellermann <mk@cm4all.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 07 5月, 2020 1 次提交
-
-
由 Pavel Begunkov 提交于
do_splice() is used by io_uring, as will be do_tee(). Move f_mode checks from sys_{splice,tee}() to do_{splice,tee}(), so they're enforced for io_uring as well. Fixes: 7d67af2c ("io_uring: add splice(2) support") Reported-by: NJann Horn <jannh@google.com> Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 5月, 2020 1 次提交
-
-
由 Xiaoguang Wang 提交于
If copy_to_user() in io_uring_setup() failed, we'll leak many kernel resources, which will be recycled until process terminates. This bug can be reproduced by using mprotect to set params to PROT_READ. To fix this issue, refactor io_uring_create() a bit to add a new 'struct io_uring_params __user *params' parameter and move the copy_to_user() in io_uring_setup() to io_uring_setup(), if copy_to_user() failed, we can free kernel resource properly. Suggested-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 04 5月, 2020 1 次提交
-
-
由 Xiaoguang Wang 提交于
The prepare_to_wait() and finish_wait() calls in io_uring_cancel_files() are mismatched. Currently I don't see any issues related this bug, just find it by learning codes. Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 5月, 2020 8 次提交
-
-
由 Pavel Begunkov 提交于
Nonblocking do_splice() still may wait for some time on an inode mutex. Let's play safe and always punt it async. Reported-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
io_req_defer() do double-checked locking. Use proper helpers for that, i.e. list_empty_careful(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
[ 40.179474] refcount_t: underflow; use-after-free. [ 40.179499] WARNING: CPU: 6 PID: 1848 at lib/refcount.c:28 refcount_warn_saturate+0xae/0xf0 ... [ 40.179612] RIP: 0010:refcount_warn_saturate+0xae/0xf0 [ 40.179617] Code: 28 44 0a 01 01 e8 d7 01 c2 ff 0f 0b 5d c3 80 3d 15 44 0a 01 00 75 91 48 c7 c7 b8 f5 75 be c6 05 05 44 0a 01 01 e8 b7 01 c2 ff <0f> 0b 5d c3 80 3d f3 43 0a 01 00 0f 85 6d ff ff ff 48 c7 c7 10 f6 [ 40.179619] RSP: 0018:ffffb252423ebe18 EFLAGS: 00010286 [ 40.179623] RAX: 0000000000000000 RBX: ffff98d65e929400 RCX: 0000000000000000 [ 40.179625] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 00000000ffffffff [ 40.179627] RBP: ffffb252423ebe18 R08: 0000000000000001 R09: 000000000000055d [ 40.179629] R10: 0000000000000c8c R11: 0000000000000001 R12: 0000000000000000 [ 40.179631] R13: ffff98d68c434400 R14: ffff98d6a9cbaa20 R15: ffff98d6a609ccb8 [ 40.179634] FS: 0000000000000000(0000) GS:ffff98d6af580000(0000) knlGS:0000000000000000 [ 40.179636] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 40.179638] CR2: 00000000033e3194 CR3: 000000006480a003 CR4: 00000000003606e0 [ 40.179641] Call Trace: [ 40.179652] io_put_req+0x36/0x40 [ 40.179657] io_free_work+0x15/0x20 [ 40.179661] io_worker_handle_work+0x2f5/0x480 [ 40.179667] io_wqe_worker+0x2a9/0x360 [ 40.179674] ? _raw_spin_unlock_irqrestore+0x24/0x40 [ 40.179681] kthread+0x12c/0x170 [ 40.179685] ? io_worker_handle_work+0x480/0x480 [ 40.179690] ? kthread_park+0x90/0x90 [ 40.179695] ret_from_fork+0x35/0x40 [ 40.179702] ---[ end trace 85027405f00110aa ]--- Opcode handler must never put submission ref, but that's what io_sync_file_range_finish() do. use io_steal_work() there. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Xiaoguang Wang 提交于
While working on to make io_uring sqpoll mode support syscalls that need struct files_struct, I got cpu soft lockup in io_ring_ctx_wait_and_kill(), while (ctx->sqo_thread && !wq_has_sleeper(&ctx->sqo_wait)) cpu_relax(); above loop never has an chance to exit, it's because preempt isn't enabled in the kernel, and the context calling io_ring_ctx_wait_and_kill() and io_sq_thread() run in the same cpu, if io_sq_thread calls a cond_resched() yield cpu and another context enters above loop, then io_sq_thread() will always in runqueue and never exit. Use cond_resched() can fix this issue. Reported-by: syzbot+66243bb7126c410cefe6@syzkaller.appspotmail.com Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Bijan Mottahedeh 提交于
Use ctx->fallback_req address for test_and_set_bit_lock() and clear_bit_unlock(). Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We do blocking retry from our poll handler, if the file supports polled notifications. Only mark the request as needing an async worker if we can't poll for it. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We can have files like eventfd where it's perfectly fine to do poll based retry on them, right now io_file_supports_async() doesn't take that into account. Pass in data direction and check the f_op instead of just always needing an async worker. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Trond Myklebust 提交于
The struct nfs_server gets put on the cl_superblocks list before the server->super field has been initialised, in which case the call to nfs_sb_active() will Oops. Add a check to ensure that we skip such a list entry. Fixes: 3c9e502b ("NFS: Add a helper nfs_client_for_each_server()") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 30 4月, 2020 2 次提交
-
-
由 Ritesh Harjani 提交于
We better warn the fibmap user and not return a truncated and therefore an incorrect block map address if the bmap() returned block address is greater than INT_MAX (since user supplied integer pointer). It's better to pr_warn() all user of ioctl_fibmap() and return a proper error code rather than silently letting a FS corruption happen if the user tries to fiddle around with the returned block map address. We fix this by returning an error code of -ERANGE and returning 0 as the block mapping address in case if it is > INT_MAX. Now iomap_bmap() could be called from either of these two paths. Either when a user is calling an ioctl_fibmap() interface to get the block mapping address or by some filesystem via use of bmap() internal kernel API. bmap() kernel API is well equipped with handling of u64 addresses. WARN condition in iomap_bmap_actor() was mainly added to warn all the fibmap users. But now that we have directly added this warning for all fibmap users and also made sure to return 0 as block map address in case if addr > INT_MAX. So we can now remove this logic from iomap_bmap_actor(). Signed-off-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Arnd Bergmann 提交于
Some older compilers like gcc-4.8 warn about mismatched curly braces in a initializer: fs/btrfs/backref.c: In function 'is_shared_data_backref': fs/btrfs/backref.c:394:9: error: missing braces around initializer [-Werror=missing-braces] struct prelim_ref target = {0}; ^ fs/btrfs/backref.c:394:9: error: (near initialization for 'target.rbnode') [-Werror=missing-braces] Use the GNU empty initializer extension to avoid this. Fixes: ed58f2e6 ("btrfs: backref, don't add refs from shared block when resolving normal backref") Reviewed-by: NQu Wenruo <wqu@suse.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 29 4月, 2020 2 次提交
-
-
由 David Howells 提交于
Commit 6fcf0c72, a fix to get_tree_bdev() put a missing blkdev_put() in the wrong place, before a warnf() that displays the bdev under consideration rather after it. This results in a silent lockup in printk("%pg") called via warnf() from get_tree_bdev() under some circumstances when there's a race with the blockdev being frozen. This can be caused by xfstests/tests/generic/085 in combination with Lukas Czerner's ext4 mount API conversion patchset. It looks like it ought to occur with other users of get_tree_bdev() such as XFS, but apparently doesn't. Fix this by switching the order of the lines. Fixes: 6fcf0c72 ("vfs: add missing blkdev_put() in get_tree_bdev()") Reported-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> cc: Ian Kent <raven@themaw.net> cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Olga Kornievskaia 提交于
Currently, if the client sends BIND_CONN_TO_SESSION with NFS4_CDFC4_FORE_OR_BOTH but only gets NFS4_CDFS4_FORE back it ignores that it wasn't able to enable a backchannel. To make sure, the client sends BIND_CONN_TO_SESSION as the first operation on the connections (ie., no other session compounds haven't been sent before), and if the client's request to bind the backchannel is not satisfied, then reset the connection and retry. Cc: stable@vger.kernel.org Signed-off-by: NOlga Kornievskaia <kolga@netapp.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 28 4月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
Clay reports that OP_STATX fails for a test case with a valid fd and empty path: -- Test 0: statx:fd 3: SUCCEED, file mode 100755 -- Test 1: statx:path ./uring_statx: SUCCEED, file mode 100755 -- Test 2: io_uring_statx:fd 3: FAIL, errno 9: Bad file descriptor -- Test 3: io_uring_statx:path ./uring_statx: SUCCEED, file mode 100755 This is due to statx not grabbing the process file table, hence we can't lookup the fd in async context. If the fd is valid, ensure that we grab the file table so we can grab the file from async context. Cc: stable@vger.kernel.org # v5.6 Reported-by: NClay Harris <bugs@claycon.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 4月, 2020 3 次提交
-
-
由 Qu Wenruo 提交于
[BUG] One run of btrfs/063 triggered the following lockdep warning: ============================================ WARNING: possible recursive locking detected 5.6.0-rc7-custom+ #48 Not tainted -------------------------------------------- kworker/u24:0/7 is trying to acquire lock: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs] but task is already holding lock: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(sb_internal#2); lock(sb_internal#2); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by kworker/u24:0/7: #0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80 #1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80 #2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs] #3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs] stack backtrace: CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ #48 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: btrfs-endio-write btrfs_work_helper [btrfs] Call Trace: dump_stack+0xc2/0x11a __lock_acquire.cold+0xce/0x214 lock_acquire+0xe6/0x210 __sb_start_write+0x14e/0x290 start_transaction+0x66c/0x890 [btrfs] btrfs_join_transaction+0x1d/0x20 [btrfs] find_free_extent+0x1504/0x1a50 [btrfs] btrfs_reserve_extent+0xd5/0x1f0 [btrfs] btrfs_alloc_tree_block+0x1ac/0x570 [btrfs] btrfs_copy_root+0x213/0x580 [btrfs] create_reloc_root+0x3bd/0x470 [btrfs] btrfs_init_reloc_root+0x2d2/0x310 [btrfs] record_root_in_trans+0x191/0x1d0 [btrfs] btrfs_record_root_in_trans+0x90/0xd0 [btrfs] start_transaction+0x16e/0x890 [btrfs] btrfs_join_transaction+0x1d/0x20 [btrfs] btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs] finish_ordered_fn+0x15/0x20 [btrfs] btrfs_work_helper+0x116/0x9a0 [btrfs] process_one_work+0x632/0xb80 worker_thread+0x80/0x690 kthread+0x1a3/0x1f0 ret_from_fork+0x27/0x50 It's pretty hard to reproduce, only one hit so far. [CAUSE] This is because we're calling btrfs_join_transaction() without re-using the current running one: btrfs_finish_ordered_io() |- btrfs_join_transaction() <<< Call #1 |- btrfs_record_root_in_trans() |- btrfs_reserve_extent() |- btrfs_join_transaction() <<< Call #2 Normally such btrfs_join_transaction() call should re-use the existing one, without trying to re-start a transaction. But the problem is, in btrfs_join_transaction() call #1, we call btrfs_record_root_in_trans() before initializing current::journal_info. And in btrfs_join_transaction() call #2, we're relying on current::journal_info to avoid such deadlock. [FIX] Call btrfs_record_root_in_trans() after we have initialized current::journal_info. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
When we have an inode with a prealloc extent that starts at an offset lower than the i_size and there is another prealloc extent that starts at an offset beyond i_size, we can end up losing part of the first prealloc extent (the part that starts at i_size) and have an implicit hole if we fsync the file and then have a power failure. Consider the following example with comments explaining how and why it happens. $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt # Create our test file with 2 consecutive prealloc extents, each with a # size of 128Kb, and covering the range from 0 to 256Kb, with a file # size of 0. $ xfs_io -f -c "falloc -k 0 128K" /mnt/foo $ xfs_io -c "falloc -k 128K 128K" /mnt/foo # Fsync the file to record both extents in the log tree. $ xfs_io -c "fsync" /mnt/foo # Now do a redudant extent allocation for the range from 0 to 64Kb. # This will merely increase the file size from 0 to 64Kb. Instead we # could also do a truncate to set the file size to 64Kb. $ xfs_io -c "falloc 0 64K" /mnt/foo # Fsync the file, so we update the inode item in the log tree with the # new file size (64Kb). This also ends up setting the number of bytes # for the first prealloc extent to 64Kb. This is done by the truncation # at btrfs_log_prealloc_extents(). # This means that if a power failure happens after this, a write into # the file range 64Kb to 128Kb will not use the prealloc extent and # will result in allocation of a new extent. $ xfs_io -c "fsync" /mnt/foo # Now set the file size to 256K with a truncate and then fsync the file. # Since no changes happened to the extents, the fsync only updates the # i_size in the inode item at the log tree. This results in an implicit # hole for the file range from 64Kb to 128Kb, something which fsck will # complain when not using the NO_HOLES feature if we replay the log # after a power failure. $ xfs_io -c "truncate 256K" -c "fsync" /mnt/foo So instead of always truncating the log to the inode's current i_size at btrfs_log_prealloc_extents(), check first if there's a prealloc extent that starts at an offset lower than the i_size and with a length that crosses the i_size - if there is one, just make sure we truncate to a size that corresponds to the end offset of that prealloc extent, so that we don't lose the part of that extent that starts at i_size if a power failure happens. A test case for fstests follows soon. Fixes: 31d11b83 ("Btrfs: fix duplicate extents after fsync of file with prealloc extents") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Al Viro 提交于
... to protect the modification of mp->m_count done by it. Most of the places that modify that thing also have namespace_lock held, but not all of them can do so, so we really need mount_lock here. Kudos to Piotr Krysiuk <piotras@gmail.com>, who'd spotted a related bug in pivot_root(2) (fixed unnoticed in 5.3); search for other similar turds has caught out this one. Cc: stable@kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 25 4月, 2020 2 次提交
-
-
由 Xiyu Yang 提交于
nfs4_proc_layoutget() invokes rpc_run_task(), which return the value to "task". Since rpc_run_task() is impossible to return an ERR pointer, there is no need to add the IS_ERR() condition on "task" here. So we need to remove it. Signed-off-by: NXiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: NXin Tan <tanxin.ctf@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Eric W. Biederman 提交于
Oleg pointed out that in the unlikely event the kernel is compiled with CONFIG_PROC_FS unset that release_task will now leak the pid. Move the put_pid out of proc_flush_pid into release_task to fix this and to guarantee I don't make that mistake again. When possible it makes sense to keep get and put in the same function so it can easily been seen how they pair up. Fixes: 7bc3e6e5 ("proc: Use a list of inodes to flush from proc") Reported-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 24 4月, 2020 4 次提交
-
-
由 David Howells 提交于
When an operation is meant to be done uninterruptibly (such as FS.StoreData), we should not be allowing volume and server record checking to be interrupted. Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
AFS keeps track of the epoch value from the rxrpc protocol to note (a) when a fileserver appears to have restarted and (b) when different endpoints of a fileserver do not appear to be associated with the same fileserver (ie. all probes back from a fileserver from all of its interfaces should carry the same epoch). However, the AFS_SERVER_FL_HAVE_EPOCH flag that indicates that we've received the server's epoch is never set, though it is used. Fix this to set the flag when we first receive an epoch value from a probe sent to the filesystem client from the fileserver. Fixes: 3bf0fb6f ("afs: Probe multiple fileservers simultaneously") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Remove three bits: (1) afs_server::no_epoch is neither set nor used. (2) afs_server::have_result is set and a wakeup is applied to it, but nothing looks at it or waits on it. (3) afs_vl_dump_edestaddrreq() prints afs_addr_list::probed, but nothing sets it for VL servers. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 Al Viro 提交于
'count' is how much you want written, not the final position. Moreover, it can legitimately be less than the current position... Cc: stable@vger.kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 23 4月, 2020 7 次提交
-
-
由 Xiyu Yang 提交于
btrfs_recover_relocation() invokes btrfs_join_transaction(), which joins a btrfs_trans_handle object into transactions and returns a reference of it with increased refcount to "trans". When btrfs_recover_relocation() returns, "trans" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of btrfs_recover_relocation(). When read_fs_root() failed, the refcnt increased by btrfs_join_transaction() is not decreased, causing a refcnt leak. Fix this issue by calling btrfs_end_transaction() on this error path when read_fs_root() failed. Fixes: 79787eaa ("btrfs: replace many BUG_ONs with proper error handling") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NXiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: NXin Tan <tanxin.ctf@gmail.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Xiyu Yang 提交于
btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which returns a local reference of the block group that contains the given bytenr to "block_group" with increased refcount. When btrfs_remove_block_group() returns, "block_group" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in several exception handling paths of btrfs_remove_block_group(). When those error scenarios occur such as btrfs_alloc_path() returns NULL, the function forgets to decrease its refcnt increased by btrfs_lookup_block_group() and will cause a refcnt leak. Fix this issue by jumping to "out_put_group" label and calling btrfs_put_block_group() when those error scenarios occur. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: NXiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: NXin Tan <tanxin.ctf@gmail.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Josef Bacik 提交于
Dave reported a problem where we were panicing with generic/475 with misc-5.7. This is because we were doing IO after we had stopped all of the worker threads, because we do the log tree cleanup on roots at drop time. Cleaning up the log tree will always need to do reads if we happened to have evicted the blocks from memory. Because of this simply add a helper to btrfs_cleanup_transaction() that will go through and drop all of the log roots. This gets run before we do the close_ctree() work, and thus we are allowed to do any reads that we would need. I ran this through many iterations of generic/475 with constrained memory and I did not see the issue. general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 2 PID: 12359 Comm: umount Tainted: G W 5.6.0-rc7-btrfs-next-58 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 RIP: 0010:btrfs_queue_work+0x33/0x1c0 [btrfs] RSP: 0018:ffff9cfb015937d8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8eb5e339ed80 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8eb5eb33b770 RDI: ffff8eb5e37a0460 RBP: ffff8eb5eb33b770 R08: 000000000000020c R09: ffffffff9fc09ac0 R10: 0000000000000007 R11: 0000000000000000 R12: 6b6b6b6b6b6b6b6b R13: ffff9cfb00229040 R14: 0000000000000008 R15: ffff8eb5d3868000 FS: 00007f167ea022c0(0000) GS:ffff8eb5fae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f167e5e0cb1 CR3: 0000000138c18004 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: btrfs_end_bio+0x81/0x130 [btrfs] __split_and_process_bio+0xaf/0x4e0 [dm_mod] ? percpu_counter_add_batch+0xa3/0x120 dm_process_bio+0x98/0x290 [dm_mod] ? generic_make_request+0xfb/0x410 dm_make_request+0x4d/0x120 [dm_mod] ? generic_make_request+0xfb/0x410 generic_make_request+0x12a/0x410 ? submit_bio+0x38/0x160 submit_bio+0x38/0x160 ? percpu_counter_add_batch+0xa3/0x120 btrfs_map_bio+0x289/0x570 [btrfs] ? kmem_cache_alloc+0x24d/0x300 btree_submit_bio_hook+0x79/0xc0 [btrfs] submit_one_bio+0x31/0x50 [btrfs] read_extent_buffer_pages+0x2fe/0x450 [btrfs] btree_read_extent_buffer_pages+0x7e/0x170 [btrfs] walk_down_log_tree+0x343/0x690 [btrfs] ? walk_log_tree+0x3d/0x380 [btrfs] walk_log_tree+0xf7/0x380 [btrfs] ? plist_requeue+0xf0/0xf0 ? delete_node+0x4b/0x230 free_log_tree+0x4c/0x130 [btrfs] ? wait_log_commit+0x140/0x140 [btrfs] btrfs_free_log+0x17/0x30 [btrfs] btrfs_drop_and_free_fs_root+0xb0/0xd0 [btrfs] btrfs_free_fs_roots+0x10c/0x190 [btrfs] ? do_raw_spin_unlock+0x49/0xc0 ? _raw_spin_unlock+0x29/0x40 ? release_extent_buffer+0x121/0x170 [btrfs] close_ctree+0x289/0x2e6 [btrfs] generic_shutdown_super+0x6c/0x110 kill_anon_super+0xe/0x30 btrfs_kill_super+0x12/0x20 [btrfs] deactivate_locked_super+0x3a/0x70 Reported-by: NDavid Sterba <dsterba@suse.com> Fixes: 8c38938c ("btrfs: move the root freeing stuff into btrfs_put_root") Reviewed-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
When cleaning pinned extents right before deleting an unused block group, we check if there's still a previous transaction running and if so we increment its reference count before using it for cleaning pinned ranges in its pinned extents iotree. However we ended up never decrementing the reference count after using the transaction, resulting in a memory leak. Fix it by decrementing the reference count. Fixes: fe119a6e ("btrfs: switch to per-transaction pinned extents") Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Paulo Alcantara 提交于
SMB2_open_init() expects a pre-initialised lease_key when opening a file with a lease, so set pfid->lease_key prior to calling it in open_shroot(). This issue was observed when performing some DFS failover tests and the lease key was never randomly generated. Signed-off-by: NPaulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org>
-
由 Paulo Alcantara 提交于
This patch is basically fixing the lookup of tcons (DFS specific) during reconnect (smb2pdu.c:__smb2_reconnect) to update their prefix paths. Previously, we relied on the TCP_Server_Info pointer (misc.c:tcp_super_cb) to determine which tcon to update the prefix path We could not rely on TCP server pointer to determine which super block to update the prefix path when reconnecting tcons since it might map to different tcons that share same TCP connection. Instead, walk through all cifs super blocks and compare their DFS full paths with the tcon being updated to. Signed-off-by: NPaulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
-
由 Paulo Alcantara 提交于
This disables tcon re-use for DFS shares. tcon->dfs_path stores the path that the tcon should connect to when doing failing over. If that tcon is used multiple times e.g. 2 mounts using it with different prefixpath, each will need a different dfs_path but there is only one tcon. The other solution would be to split the tcon in 2 tcons during failover but that is much harder. tcons could not be shared with DFS in cifs.ko because in a DFS namespace like: //domain/dfsroot -> /serverA/dfsroot, /serverB/dfsroot //serverA/dfsroot/link -> /serverA/target1/aa/bb //serverA/dfsroot/link2 -> /serverA/target1/cc/dd you can see that link and link2 are two DFS links that both resolve to the same target share (/serverA/target1), so cifs.ko will only contain a single tcon for both link and link2. The problem with that is, if we (auto)mount "link" and "link2", cifs.ko will only contain a single tcon for both DFS links so we couldn't perform failover or refresh the DFS cache for both links because tcon->dfs_path was set to either "link" or "link2", but not both -- which is wrong. Signed-off-by: NPaulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: NAurelien Aptel <aaptel@suse.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
- 22 4月, 2020 2 次提交
-
-
由 Eric Sandeen 提交于
The timestamp for access_time has double seconds granularity(There is no 10msIncrement field for access_time unlike create/modify_time). exfat's atimes are restricted to only 2s granularity so after we set an atime, round it down to the nearest 2s and set the sub-second component of the timestamp to 0. Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
-
由 Eric Sandeen 提交于
The s_time_gran superblock field indicates the on-disk nanosecond granularity of timestamps, and for exfat that seems to be 10ms, so set s_time_gran to 10000000ns. Without this, in-memory timestamps change when they get re-read from disk. Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
-