- 02 6月, 2021 17 次提交
-
-
由 Dave Chinner 提交于
The only caller of xfs_dialloc_select_ag() will always return -ENOSPC to it's caller if the agbp returned from xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate AGI we can allocate inodes from is always an ENOSPC condition, so move this logic up into xfs_dialloc_select_ag() so we can simplify the return logic in this function. xfs_dialloc_select_ag() now only ever returns 0 with a locked agbp, or an error with no agbp. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Now that everything passes a perag, the agno is not needed anymore. Convert all the users to use pag->pag_agno instead and remove the agno from the cursor. This was largely done as an automated search and replace. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Which will eventually completely replace the agno in it. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Dave Chinner 提交于
Needs a [from, to] ranged AG walk, and the perag to be stuffed into the info structure for callouts to use. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
We currently pass an agno from the AG reservation functions to the individual feature accounting functions, which in future may have to do perag lookups to access per-AG state. Instead, pre-emptively plumb the perag through from the highest AG reservation layer to the feature callouts so they won't have to look it up again. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Dave Chinner 提交于
All of the callers of the busy extent API either have perag references available to use so we can pass a perag to the busy extent functions rather than having them have to do unnecessary lookups. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Clean up the last external manual AG walk. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Rather than manually walking the ags and passing agnunbers around, pass the perag for the AG we are currently working on around in the iwalk structure. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Convert the raw walks to an iterator, pulling the current AG out of pag->pag_agno instead of the loop iterator variable. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
for_each_perag_tag() is defined in xfs_icache.c for local use. Promote this to xfs_ag.h and define equivalent iteration functions so that we can use them to iterate AGs instead to replace open coded perag walks and perag lookups. We also convert as many of the straight forward open coded AG walks to use these iterators as possible. Anything that is not a direct conversion to an iterator is ignored and will be updated in future commits. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Move the xfs_perag infrastructure to the libxfs files that contain all the per AG infrastructure. This helps set up for passing perags around all the code instead of bare agnos with minimal extra includes for existing files. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
The perag structures really need to be defined with the rest of the AG support infrastructure. The struct xfs_perag and init/teardown has been placed in xfs_mount.[ch] because there are differences in the structure between kernel and userspace. Mainly that userspace doesn't have a lot of the internal stuff that the kernel has for caches and discard and other such structures. However, it makes more sense to move this to libxfs than to keep this separation because we are now moving to use struct perags everywhere in the code instead of passing raw agnumber_t values about. Hence we shoudl really move the support infrastructure to libxfs/xfs_ag.[ch]. To do this without breaking userspace, first we need to rearrange the structures and code so that all the kernel specific code is located together. This makes it simple for userspace to ifdef out the all the parts it does not need, minimising the code differences between kernel and userspace. The next commit will do the move... Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
They are AG functions, not superblock functions, so move them to the appropriate location. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
- 15 5月, 2021 4 次提交
-
-
由 Jouni Roivas 提交于
I believe there are some issues introduced by commit 31651c60 ("hfsplus: avoid deadlock on file truncation") HFS+ has extent records which always contains 8 extents. In case the first extent record in catalog file gets full, new ones are allocated from extents overflow file. In case shrinking truncate happens to middle of an extent record which locates in extents overflow file, the logic in hfsplus_file_truncate() was changed so that call to hfs_brec_remove() is not guarded any more. Right action would be just freeing the extents that exceed the new size inside extent record by calling hfsplus_free_extents(), and then check if the whole extent record should be removed. However since the guard (blk_cnt > start) is now after the call to hfs_brec_remove(), this has unfortunate effect that the last matching extent record is removed unconditionally. To reproduce this issue, create a file which has at least 10 extents, and then perform shrinking truncate into middle of the last extent record, so that the number of remaining extents is not under or divisible by 8. This causes the last extent record (8 extents) to be removed totally instead of truncating into middle of it. Thus this causes corruption, and lost data. Fix for this is simply checking if the new truncated end is below the start of this extent record, making it safe to remove the full extent record. However call to hfs_brec_remove() can't be moved to it's previous place since we're dropping ->tree_lock and it can cause a race condition and the cached info being invalidated possibly corrupting the node data. Another issue is related to this one. When entering into the block (blk_cnt > start) we are not holding the ->tree_lock. We break out from the loop not holding the lock, but hfs_find_exit() does unlock it. Not sure if it's possible for someone else to take the lock under our feet, but it can cause hard to debug errors and premature unlocking. Even if there's no real risk of it, the locking should still always be kept in balance. Thus taking the lock now just before the check. Link: https://lkml.kernel.org/r/20210429165139.3082828-1-jouni.roivas@tuxera.com Fixes: 31651c60 ("hfsplus: avoid deadlock on file truncation") Signed-off-by: NJouni Roivas <jouni.roivas@tuxera.com> Reviewed-by: NAnton Altaparmakov <anton@tuxera.com> Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Cc: Viacheslav Dubeyko <slava@dubeyko.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
A readahead request will not allocate more memory than can be represented by a size_t, even on systems that have HIGHMEM available. Change the length functions from returning an loff_t to a size_t. Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org Fixes: 32c0a6bc ("btrfs: add and use readahead_batch_length") Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Phillip Lougher 提交于
Sysbot has reported a "divide error" which has been identified as being caused by a corrupted file_size value within the file inode. This value has been corrupted to a much larger value than expected. Calculate_skip() is passed i_size_read(inode) >> msblk->block_log. Due to the file_size value corruption this overflows the int argument/variable in that function, leading to the divide error. This patch changes the function to use u64. This will accommodate any unexpectedly large values due to corruption. The value returned from calculate_skip() is clamped to be never more than SQUASHFS_CACHED_BLKS - 1, or 7. So file_size corruption does not lead to an unexpectedly large return result here. Link: https://lkml.kernel.org/r/20210507152618.9447-1-phillip@squashfs.org.ukSigned-off-by: NPhillip Lougher <phillip@squashfs.org.uk> Reported-by: <syzbot+e8f781243ce16ac2f962@syzkaller.appspotmail.com> Reported-by: <syzbot+7b98870d4fec9447b951@syzkaller.appspotmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Peter Xu 提交于
Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2. Hugh reported issue with F_SEAL_FUTURE_WRITE not applied correctly to hugetlbfs, which I can easily verify using the memfd_test program, which seems that the program is hardly run with hugetlbfs pages (as by default shmem). Meanwhile I found another probably even more severe issue on that hugetlb fork won't wr-protect child cow pages, so child can potentially write to parent private pages. Patch 2 addresses that. After this series applied, "memfd_test hugetlbfs" should start to pass. This patch (of 2): F_SEAL_FUTURE_WRITE is missing for hugetlb starting from the first day. There is a test program for that and it fails constantly. $ ./memfd_test hugetlbfs memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE mmap() didn't fail as expected Aborted (core dumped) I think it's probably because no one is really running the hugetlbfs test. Fix it by checking FUTURE_WRITE also in hugetlbfs_file_mmap() as what we do in shmem_mmap(). Generalize a helper for that. Link: https://lkml.kernel.org/r/20210503234356.9097-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210503234356.9097-2-peterx@redhat.com Fixes: ab3948f5 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Signed-off-by: NPeter Xu <peterx@redhat.com> Reported-by: NHugh Dickins <hughd@google.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 5月, 2021 3 次提交
-
-
由 Pavel Begunkov 提交于
Since recent changes instead of storing a large array of struct io_mapped_ubuf, we store pointers to them, that is 4 times slimmer and we should not to so worry about restricting max number of registererd buffer slots, increase the limit 4 times. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d3dee1da37f46da416aa96a16bf9e5094e10584d.1620990371.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
There are three types of requests that left disabled for sqpoll, namely epoll ctx, statx, and resources update. Since SQPOLL task is now closely mimics a userspace thread, remove the restrictions. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/909b52d70c45636d8d7897582474ea5aab5eed34.1620990306.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Always remove linked timeout on io_link_timeout_fn() from the master request link list, otherwise we may get use-after-free when first io_link_timeout_fn() puts linked timeout in the fail path, and then will be found and put on master's free. Cc: stable@vger.kernel.org # 5.10+ Fixes: 90cd7e42 ("io_uring: track link timeout's master explicitly") Reported-and-tested-by: syzbot+5a864149dd970b546223@syzkaller.appspotmail.com Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.1620990121.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 13 5月, 2021 1 次提交
-
-
由 Gao Xiang 提交于
If the 1st NONHEAD lcluster of a pcluster isn't CBLKCNT lcluster type rather than a HEAD or PLAIN type instead, which means its pclustersize _must_ be 1 lcluster (since its uncompressed size < 2 lclusters), as illustrated below: HEAD HEAD / PLAIN lcluster type ____________ ____________ |_:__________|_________:__| file data (uncompressed) . . .____________. |____________| pcluster data (compressed) Such on-disk case was explained before [1] but missed to be handled properly in the runtime implementation. It can be observed if manually generating 1 lcluster-sized pcluster with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now. [1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org Fixes: cec6e93b ("erofs: support parsing big pcluster compress indexes") Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NGao Xiang <xiang@kernel.org>
-
- 12 5月, 2021 7 次提交
-
-
由 Jaegeuk Kim 提交于
This tries to fix xfstests/generic/495. Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
The final solution can be migrating blocks to form a section-aligned file internally. Meanwhile, let's ask users to do that when preparing the swap file initially like: 1) create() 2) ioctl(F2FS_IOC_SET_PIN_FILE) 3) fallocate() Reported-by: Nkernel test robot <oliver.sang@intel.com> Fixes: 36e4d958 ("f2fs: check if swapfile is section-alligned") Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(), cc.cluster_idx will be cleared w/ NULL_CLUSTER, f2fs_cluster_blocks() may check wrong cluster metadata, fix it. Fixes: 4c8ff709 ("f2fs: support data compression") Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
pos_fsstress testcase complains a panic as belew: ------------[ cut here ]------------ kernel BUG at fs/f2fs/compress.c:1082! invalid opcode: 0000 [#1] SMP PTI CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G OE 5.12.0-rc1-custom #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: writeback wb_workfn (flush-252:16) RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs] Call Trace: f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs] f2fs_write_cache_pages+0x468/0x8a0 [f2fs] f2fs_write_data_pages+0x2a4/0x2f0 [f2fs] do_writepages+0x38/0xc0 __writeback_single_inode+0x44/0x2a0 writeback_sb_inodes+0x223/0x4d0 __writeback_inodes_wb+0x56/0xf0 wb_writeback+0x1dd/0x290 wb_workfn+0x309/0x500 process_one_work+0x220/0x3c0 worker_thread+0x53/0x420 kthread+0x12f/0x150 ret_from_fork+0x22/0x30 The root cause is truncate() may race with overwrite as below, so that one reference count left in page can not guarantee the page attaching in mapping tree all the time, after truncation, later find_lock_page() may return NULL pointer. - prepare_compress_overwrite - f2fs_pagecache_get_page - unlock_page - f2fs_setattr - truncate_setsize - truncate_inode_page - delete_from_page_cache - find_lock_page Fix this by avoiding referencing updated page. Fixes: 4c8ff709 ("f2fs: support data compression") Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Chao Yu 提交于
In error path of f2fs_write_compressed_pages(), it needs to call f2fs_compress_free_page() to release temporary page. Fixes: 5e6bbde9 ("f2fs: introduce mempool for {,de}compress intermediate page allocation") Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
In f2fs_fileattr_set(), if (!fa->flags_valid) mask &= FS_COMMON_FL; In this case, we can set supported flags by mask only instead of BUG_ON. /* Flags shared betwen flags/xflags */ (FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \ FS_NODUMP_FL | FS_NOATIME_FL | FS_DAX_FL | \ FS_PROJINHERIT_FL) Fixes: 9b1bb01c ("f2fs: convert to fileattr") Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a pc : f2fs_inplace_write_data+0x144/0x208 lr : f2fs_inplace_write_data+0x134/0x208 Call trace: f2fs_inplace_write_data+0x144/0x208 f2fs_do_write_data_page+0x270/0x770 f2fs_write_single_data_page+0x47c/0x830 __f2fs_write_data_pages+0x444/0x98c f2fs_write_data_pages.llvm.16514453770497736882+0x2c/0x38 do_writepages+0x58/0x118 __writeback_single_inode+0x44/0x300 writeback_sb_inodes+0x4b8/0x9c8 wb_writeback+0x148/0x42c wb_do_writeback+0xc8/0x390 wb_workfn+0xb0/0x2f4 process_one_work+0x1fc/0x444 worker_thread+0x268/0x4b4 kthread+0x13c/0x158 ret_from_fork+0x10/0x18 Fixes: 95577278 ("f2fs: drop inplace IO if fs status is abnormal") Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 11 5月, 2021 1 次提交
-
-
由 Ritesh Harjani 提交于
Add error handling in btrfs_fileattr_set in case of an error while starting a transaction. This fixes btrfs/232 which otherwise used to fail with below signature on Power. btrfs/232 [ 1119.474650] run fstests btrfs/232 at 2021-04-21 02:21:22 <...> [ 1366.638585] BUG: Unable to handle kernel data access on read at 0xffffffffffffff86 [ 1366.638768] Faulting instruction address: 0xc0000000009a5c88 cpu 0x0: Vector: 380 (Data SLB Access) at [c000000014f177b0] pc: c0000000009a5c88: btrfs_update_root_times+0x58/0xc0 lr: c0000000009a5c84: btrfs_update_root_times+0x54/0xc0 <...> pid = 24881, comm = fsstress btrfs_update_inode+0xa0/0x140 btrfs_fileattr_set+0x5d0/0x6f0 vfs_fileattr_set+0x2a8/0x390 do_vfs_ioctl+0x1290/0x1ac0 sys_ioctl+0x6c/0x120 system_call_exception+0x3d4/0x410 system_call_common+0xec/0x278 Fixes: 97fc2977 ("btrfs: convert to fileattr") Signed-off-by: NRitesh Harjani <riteshh@linux.ibm.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 09 5月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
WARNING: CPU: 0 PID: 10242 at lib/refcount.c:28 refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28 RIP: 0010:refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] io_put_req fs/io_uring.c:2140 [inline] io_queue_linked_timeout fs/io_uring.c:6300 [inline] __io_queue_sqe+0xbef/0xec0 fs/io_uring.c:6354 io_submit_sqe fs/io_uring.c:6534 [inline] io_submit_sqes+0x2bbd/0x7c50 fs/io_uring.c:6660 __do_sys_io_uring_enter fs/io_uring.c:9240 [inline] __se_sys_io_uring_enter+0x256/0x1d60 fs/io_uring.c:9182 io_link_timeout_fn() should put only one reference of the linked timeout request, however in case of racing with the master request's completion first io_req_complete() puts one and then io_put_req_deferred() is called. Cc: stable@vger.kernel.org # 5.12+ Fixes: 9ae1f8dd ("io_uring: fix inconsistent lock state") Reported-by: syzbot+a2910119328ce8e7996f@syzkaller.appspotmail.com Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ff51018ff29de5ffa76f09273ef48cb24c720368.1620417627.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 5月, 2021 6 次提交
-
-
由 Steve French 提交于
Mounting with "multichannel" is obviously implied if user requested more than one channel on mount (ie mount parm max_channels>1). Currently both have to be specified. Fix that so that if max_channels is greater than 1 on mount, enable multichannel rather than silently falling back to non-multichannel. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-By: NTom Talpey <tom@talpey.com> Cc: <stable@vger.kernel.org> # v5.11+ Reviewed-by: NShyam Prasad N <sprasad@microsoft.com>
-
由 Steve French 提交于
We were ignoring CAP_MULTI_CHANNEL in the server response - if the server doesn't support multichannel we should not be attempting it. See MS-SMB2 section 3.2.5.2 Reviewed-by: NShyam Prasad N <sprasad@microsoft.com> Reviewed-By: NTom Talpey <tom@talpey.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Steve French 提交于
In the SMB3/SMB3.1.1 negotiate protocol request, we are supposed to advertise CAP_MULTICHANNEL capability when establishing multiple channels has been requested by the user doing the mount. See MS-SMB2 sections 2.2.3 and 3.2.5.2 Without setting it there is some risk that multichannel could fail if the server interpreted the field strictly. Reviewed-By: NTom Talpey <tom@talpey.com> Reviewed-by: NShyam Prasad N <sprasad@microsoft.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Vivek Goyal 提交于
I am seeing missed wakeups which ultimately lead to a deadlock when I am using virtiofs with DAX enabled and running "make -j". I had to mount virtiofs as rootfs and also reduce to dax window size to 256M to reproduce the problem consistently. So here is the problem. put_unlocked_entry() wakes up waiters only if entry is not null as well as !dax_is_conflict(entry). But if I call multiple instances of invalidate_inode_pages2() in parallel, then I can run into a situation where there are waiters on this index but nobody will wake these waiters. invalidate_inode_pages2() invalidate_inode_pages2_range() invalidate_exceptional_entry2() dax_invalidate_mapping_entry_sync() __dax_invalidate_entry() { xas_lock_irq(&xas); entry = get_unlocked_entry(&xas, 0); ... ... dax_disassociate_entry(entry, mapping, trunc); xas_store(&xas, NULL); ... ... put_unlocked_entry(&xas, entry); xas_unlock_irq(&xas); } Say a fault in in progress and it has locked entry at offset say "0x1c". Now say three instances of invalidate_inode_pages2() are in progress (A, B, C) and they all try to invalidate entry at offset "0x1c". Given dax entry is locked, all tree instances A, B, C will wait in wait queue. When dax fault finishes, say A is woken up. It will store NULL entry at index "0x1c" and wake up B. When B comes along it will find "entry=0" at page offset 0x1c and it will call put_unlocked_entry(&xas, 0). And this means put_unlocked_entry() will not wake up next waiter, given the current code. And that means C continues to wait and is not woken up. This patch fixes the issue by waking up all waiters when a dax entry has been invalidated. This seems to fix the deadlock I am facing and I can make forward progress. Reported-by: NSergio Lopez <slp@redhat.com> Fixes: ac401cc7 ("dax: New fault locking") Reviewed-by: NJan Kara <jack@suse.cz> Suggested-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20210428190314.1865312-4-vgoyal@redhat.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Vivek Goyal 提交于
As of now put_unlocked_entry() always wakes up next waiter. In next patches we want to wake up all waiters at one callsite. Hence, add a parameter to the function. This patch does not introduce any change of behavior. Reviewed-by: NGreg Kurz <groug@kaod.org> Reviewed-by: NJan Kara <jack@suse.cz> Suggested-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20210428190314.1865312-3-vgoyal@redhat.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Vivek Goyal 提交于
Dan mentioned that he is not very fond of passing around a boolean true/false to specify if only next waiter should be woken up or all waiters should be woken up. He instead prefers that we introduce an enum and make it very explicity at the callsite itself. Easier to read code. This patch should not introduce any change of behavior. Reviewed-by: NGreg Kurz <groug@kaod.org> Reviewed-by: NJan Kara <jack@suse.cz> Suggested-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20210428190314.1865312-2-vgoyal@redhat.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>
-