- 17 9月, 2019 3 次提交
-
-
由 Steve French 提交于
When mounting with "modefromsid" retrieve mode bits from special SID (S-1-5-88-3) on stat. Subsequent patch will fix setattr (chmod) to save mode bits in S-1-5-88-3-<mode> Note that when an ACE matching S-1-5-88-3 is not found, we default the mode to an approximation based on the owner, group and everyone permissions (as with the "cifsacl" mount option). See See e.g. https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/hh509017(v=ws.10)Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Colin Ian King 提交于
The variable ret is being initialized however this is never read and later it is being reassigned to a new value. The initialization is redundant and hence can be removed. Addresses-Coverity: ("Unused Value") Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Ronnie Sahlberg 提交于
Clarify a trivial comment Signed-off-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
- 16 9月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit b03755ad. This is sad, and done for all the wrong reasons. Because that commit is good, and does exactly what it says: avoids a lot of small disk requests for the inode table read-ahead. However, it turns out that it causes an entirely unrelated problem: the getrandom() system call was introduced back in 2014 by commit c6e9d6f3 ("random: introduce getrandom(2) system call"), and people use it as a convenient source of good random numbers. But part of the current semantics for getrandom() is that it waits for the entropy pool to fill at least partially (unlike /dev/urandom). And at least ArchLinux apparently has a systemd that uses getrandom() at boot time, and the improvements in IO patterns means that existing installations suddenly start hanging, waiting for entropy that will never happen. It seems to be an unlucky combination of not _quite_ enough entropy, together with a particular systemd version and configuration. Lennart says that the systemd-random-seed process (which is what does this early access) is supposed to not block any other boot activity, but sadly that doesn't actually seem to be the case (possibly due bogus dependencies on cryptsetup for encrypted swapspace). The correct fix is to fix getrandom() to not block when it's not appropriate, but that fix is going to take a lot more discussion. Do we just make it act like /dev/urandom by default, and add a new flag for "wait for entropy"? Do we add a boot-time option? Or do we just limit the amount of time it will wait for entropy? So in the meantime, we do the revert to give us time to discuss the eventual fix for the fundamental problem, at which point we can re-apply the ext4 inode table access optimization. Reported-by: NAhmed S. Darwish <darwish.07@gmail.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Willy Tarreau <w@1wt.eu> Cc: Alexander E. Patrakov <patrakov@gmail.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 9月, 2019 2 次提交
-
-
由 Filipe Manana 提交于
The lock_extent_buffer_io() returns 1 to the caller to tell it everything went fine and the callers needs to start writeback for the extent buffer (submit a bio, etc), 0 to tell the caller everything went fine but it does not need to start writeback for the extent buffer, and a negative value if some error happened. When it's about to return 1 it tries to lock all pages, and if a try lock on a page fails, and we didn't flush any existing bio in our "epd", it calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or an error. The page might have been locked elsewhere, not with the goal of starting writeback of the extent buffer, and even by some code other than btrfs, like page migration for example, so it does not mean the writeback of the extent buffer was already started by some other task, so returning a 0 tells the caller (btree_write_cache_pages()) to not start writeback for the extent buffer. Note that epd might currently have either no bio, so flush_write_bio() returns 0 (success) or it might have a bio for another extent buffer with a lower index (logical address). Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the extent buffer and writeback is never started for the extent buffer, future attempts to writeback the extent buffer will hang forever waiting on that bit to be cleared, since it can only be cleared after writeback completes. Such hang is reported with a trace like the following: [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds. [49887.347059] Not tainted 5.2.13-gentoo #2 [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [49887.347062] btrfs-transacti D 0 1752 2 0x80004000 [49887.347064] Call Trace: [49887.347069] ? __schedule+0x265/0x830 [49887.347071] ? bit_wait+0x50/0x50 [49887.347072] ? bit_wait+0x50/0x50 [49887.347074] schedule+0x24/0x90 [49887.347075] io_schedule+0x3c/0x60 [49887.347077] bit_wait_io+0x8/0x50 [49887.347079] __wait_on_bit+0x6c/0x80 [49887.347081] ? __lock_release.isra.29+0x155/0x2d0 [49887.347083] out_of_line_wait_on_bit+0x7b/0x80 [49887.347084] ? var_wake_function+0x20/0x20 [49887.347087] lock_extent_buffer_for_io+0x28c/0x390 [49887.347089] btree_write_cache_pages+0x18e/0x340 [49887.347091] do_writepages+0x29/0xb0 [49887.347093] ? kmem_cache_free+0x132/0x160 [49887.347095] ? convert_extent_bit+0x544/0x680 [49887.347097] filemap_fdatawrite_range+0x70/0x90 [49887.347099] btrfs_write_marked_extents+0x53/0x120 [49887.347100] btrfs_write_and_wait_transaction.isra.4+0x38/0xa0 [49887.347102] btrfs_commit_transaction+0x6bb/0x990 [49887.347103] ? start_transaction+0x33e/0x500 [49887.347105] transaction_kthread+0x139/0x15c So fix this by not overwriting the return value (ret) with the result from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK bit in case flush_write_bio() returns an error, otherwise it will hang any future attempts to writeback the extent buffer, and undo all work done before (set back EXTENT_BUFFER_DIRTY, etc). This is a regression introduced in the 5.2 kernel. Fixes: 2e3c2513 ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()") Fixes: f4340622 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up") Reported-by: NZdenek Sojka <zsojka@seznam.cz> Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#uReported-by: NStefan Priebe - Profihost AG <s.priebe@profihost.ag> Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#tReported-by: NDrazen Kacar <drazen.kacar@oradian.com> Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/ Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
Sometimes when fsync'ing a file we need to log that other inodes exist and when we need to do that we acquire a reference on the inodes and then drop that reference using iput() after logging them. That generally is not a problem except if we end up doing the final iput() (dropping the last reference) on the inode and that inode has a link count of 0, which can happen in a very short time window if the logging path gets a reference on the inode while it's being unlinked. In that case we end up getting the eviction callback, btrfs_evict_inode(), invoked through the iput() call chain which needs to drop all of the inode's items from its subvolume btree, and in order to do that, it needs to join a transaction at the helper function evict_refill_and_join(). However because the task previously started a transaction at the fsync handler, btrfs_sync_file(), it has current->journal_info already pointing to a transaction handle and therefore evict_refill_and_join() will get that transaction handle from btrfs_join_transaction(). From this point on, two different problems can happen: 1) evict_refill_and_join() will often change the transaction handle's block reserve (->block_rsv) and set its ->bytes_reserved field to a value greater than 0. If evict_refill_and_join() never commits the transaction, the eviction handler ends up decreasing the reference count (->use_count) of the transaction handle through the call to btrfs_end_transaction(), and after that point we have a transaction handle with a NULL ->block_rsv (which is the value prior to the transaction join from evict_refill_and_join()) and a ->bytes_reserved value greater than 0. If after the eviction/iput completes the inode logging path hits an error or it decides that it must fallback to a transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a non-zero value from btrfs_log_dentry_safe(), and because of that non-zero value it tries to commit the transaction using a handle with a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes the transaction commit hit an assertion failure at btrfs_trans_release_metadata() because ->bytes_reserved is not zero but the ->block_rsv is NULL. The produced stack trace for that is like the following: [192922.917158] assertion failed: !trans->bytes_reserved, file: fs/btrfs/transaction.c, line: 816 [192922.917553] ------------[ cut here ]------------ [192922.917922] kernel BUG at fs/btrfs/ctree.h:3532! [192922.918310] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI [192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: G W 5.1.4-btrfs-next-47 #1 [192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs] (...) [192922.920925] RSP: 0018:ffffaebdc8a27da8 EFLAGS: 00010286 [192922.921315] RAX: 0000000000000051 RBX: ffff95c9c16a41c0 RCX: 0000000000000000 [192922.921692] RDX: 0000000000000000 RSI: ffff95cab6b16838 RDI: ffff95cab6b16838 [192922.922066] RBP: ffff95c9c16a41c0 R08: 0000000000000000 R09: 0000000000000000 [192922.922442] R10: ffffaebdc8a27e70 R11: 0000000000000000 R12: ffff95ca731a0980 [192922.922820] R13: 0000000000000000 R14: ffff95ca84c73338 R15: ffff95ca731a0ea8 [192922.923200] FS: 00007f337eda4e80(0000) GS:ffff95cab6b00000(0000) knlGS:0000000000000000 [192922.923579] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [192922.923948] CR2: 00007f337edad000 CR3: 00000001e00f6002 CR4: 00000000003606e0 [192922.924329] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [192922.924711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [192922.925105] Call Trace: [192922.925505] btrfs_trans_release_metadata+0x10c/0x170 [btrfs] [192922.925911] btrfs_commit_transaction+0x3e/0xaf0 [btrfs] [192922.926324] btrfs_sync_file+0x44c/0x490 [btrfs] [192922.926731] do_fsync+0x38/0x60 [192922.927138] __x64_sys_fdatasync+0x13/0x20 [192922.927543] do_syscall_64+0x60/0x1c0 [192922.927939] entry_SYSCALL_64_after_hwframe+0x49/0xbe (...) [192922.934077] ---[ end trace f00808b12068168f ]--- 2) If evict_refill_and_join() decides to commit the transaction, it will be able to do it, since the nested transaction join only increments the transaction handle's ->use_count reference counter and it does not prevent the transaction from getting committed. This means that after eviction completes, the fsync logging path will be using a transaction handle that refers to an already committed transaction. What happens when using such a stale transaction can be unpredictable, we are at least having a use-after-free on the transaction handle itself, since the transaction commit will call kmem_cache_free() against the handle regardless of its ->use_count value, or we can end up silently losing all the updates to the log tree after that iput() in the logging path, or using a transaction handle that in the meanwhile was allocated to another task for a new transaction, etc, pretty much unpredictable what can happen. In order to fix both of them, instead of using iput() during logging, use btrfs_add_delayed_iput(), so that the logging path of fsync never drops the last reference on an inode, that step is offloaded to a safe context (usually the cleaner kthread). The assertion failure issue was sporadically triggered by the test case generic/475 from fstests, which loads the dm error target while fsstress is running, which lead to fsync failing while logging inodes with -EIO errors and then trying later to commit the transaction, triggering the assertion failure. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 05 9月, 2019 1 次提交
-
-
由 Al Viro 提交于
Make sure that attribute methods are not called after the item has been removed from the tree. To do so, we * at the point of no return in removals, grab ->frag_sem exclusive and mark the fragment dead. * call the methods of attributes with ->frag_sem taken shared and only after having verified that the fragment is still alive. The main benefit is for method instances - they are guaranteed that the objects they are accessing *and* all ancestors are still there. Another win is that we don't need to bother with extra refcount on config_item when opening a file - the item will be alive for as long as it stays in the tree, and we won't touch it/attributes/any associated data after it's been removed from the tree. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 03 9月, 2019 4 次提交
-
-
由 Al Viro 提交于
Refcounted, hangs of configfs_dirent, created by operations that add fragments to configfs tree (mkdir and configfs_register_{subsystem,group}). Will be used in the next commit to provide exclusion between fragment removal and ->show/->store calls. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Al Viro 提交于
revert cc57c073 "configfs: fix registered group removal" It was an attempt to handle something that fundamentally doesn't work - configfs_register_group() should never be done in a part of tree that can be rmdir'ed. And in mainline it never had been, so let's not borrow trouble; the fix was racy anyway, it would take a lot more to make that work and desired semantics is not clear. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Al Viro 提交于
simplifies the ->read()/->write()/->release() instances nicely Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Trond Myklebust 提交于
We want to throw out the attrbute if it refers to the mounted on fileid, and not the real fileid. However we do not want to block cache consistency updates from NFSv4 writes. Reported-by: NMurphy Zhou <jencce.kernel@gmail.com> Fixes: 7e10cc25 ("NFS: Don't refresh attributes with mounted-on-file...") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 28 8月, 2019 4 次提交
-
-
由 Steve French 提交于
To 2.22 Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Ronnie Sahlberg 提交于
Using strscpy is cleaner, and avoids some problems with handling maximum length strings. Linus noticed the original problem and Aurelien pointed out some additional problems. Fortunately most of this is SMB1 code (and in particular the ASCII string handling older, which is less common). Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NAurelien Aptel <aaptel@suse.com> Signed-off-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Dan Carpenter 提交于
It's safer to zero out the password so that it can never be disclosed. Fixes: 0c219f5799c7 ("cifs: set domainName when a domain-key is used in multiuser") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Ronnie Sahlberg 提交于
RHBZ: 1710429 When we use a domain-key to authenticate using multiuser we must also set the domainnmame for the new volume as it will be used and passed to the server in the NTLMSSP Domain-name. Signed-off-by: NRonnie Sahlberg <lsahlber@redhat.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
- 27 8月, 2019 8 次提交
-
-
由 YueHaibing 提交于
Fixes gcc '-Wunused-but-set-variable' warning: fs/nfs/write.c: In function nfs_page_async_flush: fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable] It is not use since commit aefb623c422e ("NFS: Fix writepage(s) error handling to not report errors twice") Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Ensure we update the write result count on success, since the RPC call itself does not do so. Reported-by: NJan Stancek <jstancek@redhat.com> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Tested-by: NJan Stancek <jstancek@redhat.com>
-
由 Trond Myklebust 提交于
If we received a reply from the server with a zero length read and no error, then that implies we are at eof. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If writepage()/writepages() saw an error, but handled it without reporting it, we should not be re-reporting that error on exit. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If the client attempts to read a page, but the read fails due to some spurious error (e.g. an ACCESS error or a timeout, ...) then we need to allow other processes to retry. Also try to report errors correctly when doing a synchronous readpage. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If the mount is hard, we should ignore the 'io_maxretrans' module parameter so that we always keep retrying. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
This reverts commit a79f194a. The mechanism for aborting I/O is racy, since we are not guaranteed that the request is asleep while we're changing both task->tk_status and task->tk_action. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v5.1
-
由 Trond Myklebust 提交于
The pNFS/flexfiles I/O requests are sent with the SOFTCONN flag set, so they automatically time out if the connection breaks. It should therefore not be necessary to have the soft flag set in addition. Fixes: 5f01d953 ("nfs41: create NFSv3 DS connection if specified") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 25 8月, 2019 1 次提交
-
-
由 Oleg Nesterov 提交于
userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if mm->core_state != NULL. Otherwise a page fault can see userfaultfd_missing() == T and use an already freed userfaultfd_ctx. Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com Fixes: 04f5866e ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping") Signed-off-by: NOleg Nesterov <oleg@redhat.com> Reported-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Tested-by: NKefeng Wang <wangkefeng.wang@huawei.com> Cc: Peter Xu <peterx@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 8月, 2019 2 次提交
-
-
由 Darrick J. Wong 提交于
Benjamin Moody reported to Debian that XFS partially wedges when a chgrp fails on account of being out of disk quota. I ran his reproducer script: # adduser dummy # adduser dummy plugdev # dd if=/dev/zero bs=1M count=100 of=test.img # mkfs.xfs test.img # mount -t xfs -o gquota test.img /mnt # mkdir -p /mnt/dummy # chown -c dummy /mnt/dummy # xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt (and then as user dummy) $ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo $ chgrp plugdev /mnt/dummy/foo and saw: ================================================ WARNING: lock held when returning to user space! 5.3.0-rc5 #rc5 Tainted: G W ------------------------------------------------ chgrp/47006 is leaving the kernel with locks still held! 1 lock held by chgrp/47006: #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs] ...which is clearly caused by xfs_setattr_nonsize failing to unlock the ILOCK after the xfs_qm_vop_chown_reserve call fails. Add the missing unlock. Reported-by: benjamin.moody@gmail.com Fixes: 253f4911 ("xfs: better xfs_trans_alloc interface") Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Tested-by: NSalvatore Bonaccorso <carnil@debian.org>
-
由 Jens Axboe 提交于
The outer poll loop checks for whether we need to reschedule, and returns to userspace if we do. However, it's possible to get stuck in the inner loop as well, if the CPU we are running on needs to reschedule to finish the IO work. Add the need_resched() check in the inner loop as well. This fixes a potential hang if the kernel is configured with CONFIG_PREEMPT_VOLUNTARY=y. Reported-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Tested-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 8月, 2019 11 次提交
-
-
由 Liu Song 提交于
If the number of dirty pages to be written back is large, then writeback_inodes_sb will block waiting for a long time, causing hung task detection alarm. Therefore, we should limit the maximum number of pages written back this time, which let the budget be completed faster. The remaining dirty pages tend to rely on the writeback mechanism to complete the synchronization. Fixes: b6e51316 ("writeback: separate starting of sync vs opportunistic writeback") Signed-off-by: NLiu Song <liu.song11@zte.com.cn> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
由 Richard Weinberger 提交于
Currently on a freshly mounted UBIFS, c->min_log_bytes is 0. This can lead to a log overrun and make commits fail. Recent kernels will report the following assert: UBIFS assert failed: c->lhead_lnum != c->ltail_lnum, in fs/ubifs/log.c:412 c->min_log_bytes can have two states, 0 and c->leb_size. It controls how much bytes of the log area are reserved for non-bud nodes such as commit nodes. After a commit it has to be set to c->leb_size such that we have always enough space for a commit. While a commit runs it can be 0 to make the remaining bytes of the log available to writers. Having it set to 0 right after mount is wrong since no space for commits is reserved. Fixes: 1e51764a ("UBIFS: add new flash file system") Reported-and-tested-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: NRichard Weinberger <richard@nod.at>
-
由 Richard Weinberger 提交于
We unlock after orphan_delete(), so no need to unlock in the function too. Reported-by: NHan Xu <han.xu@nxp.com> Fixes: 8009ce95 ("ubifs: Don't leak orphans on memory during commit") Signed-off-by: NRichard Weinberger <richard@nod.at>
-
由 YueHaibing 提交于
It seems that 'yfs_RXYFSStoreOpaqueACL2' should be use in yfs_fs_store_opaque_acl2(). Fixes: f5e45463 ("afs: Implement YFS ACL setting") Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 Marc Dionne 提交于
The afs_lookup trace event can cause the following: [ 216.576777] BUG: kernel NULL pointer dereference, address: 000000000000023b [ 216.576803] #PF: supervisor read access in kernel mode [ 216.576813] #PF: error_code(0x0000) - not-present page ... [ 216.576913] RIP: 0010:trace_event_raw_event_afs_lookup+0x9e/0x1c0 [kafs] If the inode from afs_do_lookup() is an error other than ENOENT, or if it is ENOENT and afs_try_auto_mntpt() returns an error, the trace event will try to dereference the error pointer as a valid pointer. Use IS_ERR_OR_NULL to only pass a valid pointer for the trace, or NULL. Ideally the trace would include the error value, but for now just avoid the oops. Fixes: 80548b03 ("afs: Add more tracepoints") Signed-off-by: NMarc Dionne <marc.dionne@auristor.com> Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix a leak on the cell refcount in afs_lookup_cell_rcu() due to non-clearance of the default error in the case a NULL cell name is passed and the workstation default cell is used. Also put a bit at the end to make sure we don't leak a cell ref if we're going to be returning an error. This leak results in an assertion like the following when the kafs module is unloaded: AFS: Assertion failed 2 == 1 is false 0x2 == 0x1 is false ------------[ cut here ]------------ kernel BUG at fs/afs/cell.c:770! ... RIP: 0010:afs_manage_cells+0x220/0x42f [kafs] ... process_one_work+0x4c2/0x82c ? pool_mayday_timeout+0x1e1/0x1e1 ? do_raw_spin_lock+0x134/0x175 worker_thread+0x336/0x4a6 ? rescuer_thread+0x4af/0x4af kthread+0x1de/0x1ee ? kthread_park+0xd4/0xd4 ret_from_fork+0x24/0x30 Fixes: 989782dc ("afs: Overhaul cell database management") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 Jeff Layton 提交于
When ceph_mdsc_do_request returns an error, we can't assume that the filelock_reply pointer will be set. Only try to fetch fields out of the r_reply_info when it returns success. Cc: stable@vger.kernel.org Reported-by: NHector Martin <hector@marcansoft.com> Signed-off-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Erqi Chen 提交于
clear_page_dirty_for_io(page) before mapping->a_ops->invalidatepage(). invalidatepage() clears page's private flag, if dirty flag is not cleared, the page may cause BUG_ON failure in ceph_set_page_dirty(). Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/40862Signed-off-by: NErqi Chen <chenerqi@gmail.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Luis Henriques 提交于
Calling ceph_buffer_put() in fill_inode() may result in freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/070. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4 6 locks held by kworker/0:4/3852: #0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0 #1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0 #2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476 #3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476 #4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476 #5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70 CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Workqueue: ceph-msgr ceph_con_workfn Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 fill_inode.isra.0+0xa9b/0xf70 ceph_fill_trace+0x13b/0xc70 ? dispatch+0x2eb/0x1476 dispatch+0x320/0x1476 ? __mutex_unlock_slowpath+0x4d/0x2a0 ceph_con_workfn+0xc97/0x2ec0 ? process_one_work+0x1b8/0x5f0 process_one_work+0x244/0x5f0 worker_thread+0x4d/0x3e0 kthread+0x105/0x140 ? process_one_work+0x5f0/0x5f0 ? kthread_park+0x90/0x90 ret_from_fork+0x3a/0x50 Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Luis Henriques 提交于
Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by having this function returning the old blob buffer and have the callers of this function freeing it when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress 4 locks held by fsstress/649: #0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0 #1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60 #2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60 #3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60 CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_build_xattrs_blob+0x12b/0x170 __send_cap+0x302/0x540 ? __lock_acquire+0x23c/0x1e40 ? __mark_caps_flushing+0x15c/0x280 ? _raw_spin_unlock+0x24/0x30 ceph_check_caps+0x5f0/0xc60 ceph_flush_dirty_caps+0x7c/0x150 ? __ia32_sys_fdatasync+0x20/0x20 ceph_sync_fs+0x5a/0x130 iterate_supers+0x8f/0xf0 ksys_sync+0x4f/0xb0 __ia32_sys_sync+0xa/0x10 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fc6409ab617 Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Luis Henriques 提交于
Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress 3 locks held by fsstress/650: #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0 #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810 CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_setxattr+0x2b4/0x810 __vfs_setxattr+0x66/0x80 __vfs_setxattr_noperm+0x59/0xf0 vfs_setxattr+0x81/0xa0 setxattr+0x115/0x230 ? filename_lookup+0xc9/0x140 ? rcu_read_lock_sched_held+0x74/0x80 ? rcu_sync_lockdep_assert+0x2e/0x60 ? __sb_start_write+0x142/0x1a0 ? mnt_want_write+0x20/0x50 path_setxattr+0xba/0xd0 __x64_sys_lsetxattr+0x24/0x30 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7ff23514359a Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 21 8月, 2019 2 次提交
-
-
由 Jens Axboe 提交于
We need to check if we have CQEs pending before starting a poll loop, as those could be the events we will be spinning for (and hence we'll find none). This can happen if a CQE triggers an error, or if it is found by eg an IRQ before we get a chance to find it through polling. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
If a request issue ends up being punted to async context to avoid blocking, we can get into a situation where the original application enters the poll loop for that very request before it has been issued. This should not be an issue, except that the polling will hold the io_uring uring_ctx mutex for the duration of the poll. When the async worker has actually issued the request, it needs to acquire this mutex to add the request to the poll issued list. Since the application polling is already holding this mutex, the workqueue sleeps on the mutex forever, and the application thus never gets a chance to poll for the very request it was interested in. Fix this by ensuring that the polling drops the uring_ctx occasionally if it's not making any progress. Reported-by: NJeffrey M. Birnbaum <jmbnyc@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 8月, 2019 1 次提交
-
-
由 Ira Weiny 提交于
The parens used in the while loop would result in error being assigned the value 1 rather than the intended errno value. This is required to return -ETXTBSY from follow on break_layout() changes. Signed-off-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-