- 16 6月, 2020 3 次提交
-
-
由 David Howells 提交于
Remove afs_operation::abort_code as it's read but never set. Use ac.abort_code instead. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix yfs_fs_fetch_status() to honour the vnode selector in op->fetch_status.which as does afs_fs_fetch_status() that allows afs_do_lookup() to use this as an alternative to the InlineBulkStatus RPC call if not implemented by the server. This doesn't matter in the current code as YFS servers always implement InlineBulkStatus, but a subsequent will call it on YFS servers too in some circumstances. Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Remove yfs_fs_fetch_file_status() as it's no longer used. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 15 6月, 2020 6 次提交
-
-
由 David Howells 提交于
Abort code UAEOVERFLOW is returned when we try and set a time that's out of range, but it's currently mapped to EREMOTEIO by the default case. Fix UAEOVERFLOW to map instead to EOVERFLOW. Found with the generic/258 xfstest. Note that the test is wrong as it assumes that the filesystem will support a pre-UNIX-epoch date. Fixes: 1eda8bab ("afs: Add support for the UAE error table") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix the following issues: (1) Fix writeback to reduce the size of a store operation to i_size, effectively discarding the extra data. The problem comes when afs_page_mkwrite() records that a page is about to be modified by mmap(). It doesn't know what bits of the page are going to be modified, so it records the whole page as being dirty (this is stored in page->private as start and end offsets). Without this, the marshalling for the store to the server extends the size of the file to the end of the page (in afs_fs_store_data() and yfs_fs_store_data()). (2) Fix setattr to actually truncate the pagecache, thereby clearing the discarded part of a file. (3) Fix setattr to check that the new size is okay and to disable ATTR_SIZE if i_size wouldn't change. (4) Force i_size to be updated as the result of a truncate. (5) Don't truncate if ATTR_SIZE is not set. (6) Call pagecache_isize_extended() if the file was enlarged. Note that truncate_set_size() isn't used because the setting of i_size is done inside afs_vnode_commit_status() under the vnode->cb_lock. Found with the generic/029 and generic/393 xfstests. Fixes: 31143d5d ("AFS: implement basic file write support") Fixes: 4343d008 ("afs: Get rid of the afs_writeback record") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
The in-kernel afs filesystem ignores ctime because the AFS fileserver protocol doesn't support ctimes. This, however, causes various xfstests to fail. Work around this by: (1) Setting ctime to attr->ia_ctime in afs_setattr(). (2) Not ignoring ATTR_MTIME_SET, ATTR_TIMES_SET and ATTR_TOUCH settings. (3) Setting the ctime from the server mtime when on the target file when creating a hard link to it. (4) Setting the ctime on directories from their revised mtimes when renaming/moving a file. Found by the generic/221 and generic/309 xfstests. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
When doing a partial writeback, afs_write_back_from_locked_page() may generate an FS.StoreData RPC request that writes out part of a file when a file has been constructed from pieces by doing seek, write, seek, write, ... as is done by ld. The FS.StoreData RPC is given the current i_size as the file length, but the server basically ignores it unless the data length is 0 (in which case it's just a truncate operation). The revised file length returned in the result of the RPC may then not reflect what we suggested - and this leads to i_size getting moved backwards - which causes issues later. Fix the client to take account of this by ignoring the returned file size unless the data version number jumped unexpectedly - in which case we're going to have to clear the pagecache and reload anyway. This can be observed when doing a kernel build on an AFS mount. The following pair of commands produce the issue: ld -m elf_x86_64 -z max-page-size=0x200000 --emit-relocs \ -T arch/x86/realmode/rm/realmode.lds \ arch/x86/realmode/rm/header.o \ arch/x86/realmode/rm/trampoline_64.o \ arch/x86/realmode/rm/stack.o \ arch/x86/realmode/rm/reboot.o \ -o arch/x86/realmode/rm/realmode.elf arch/x86/tools/relocs --realmode \ arch/x86/realmode/rm/realmode.elf \ >arch/x86/realmode/rm/realmode.relocs This results in the latter giving: Cannot read ELF section headers 0/18: Success as the realmode.elf file got corrupted. The sequence of events can also be driven with: xfs_io -t -f \ -c "pwrite -S 0x58 0 0x58" \ -c "pwrite -S 0x59 10000 1000" \ -c "close" \ /afs/example.com/scratch/a Fixes: 31143d5d ("AFS: implement basic file write support") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix afs_write_end() to change i_size under vnode->cb_lock rather than ->wb_lock so that it doesn't race with afs_vnode_commit_status() and afs_getattr(). The ->wb_lock is only meant to guard access to ->wb_keys which isn't accessed by that piece of code. Fixes: 4343d008 ("afs: Get rid of the afs_writeback record") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
The mtime on an inode needs to be updated when a write is made into an mmap'ed section. There are three ways in which this could be done: update it when page_mkwrite is called, update it when a page is changed from dirty to writeback or leave it to the server and fix the mtime up from the reply to the StoreData RPC. Found with the generic/215 xfstest. Fixes: 1cf7a151 ("afs: Implement shared-writeable mmap") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 14 6月, 2020 2 次提交
-
-
由 David Sterba 提交于
This reverts commit a43a67a2. This patch reverts the main part of switching direct io implementation to iomap infrastructure. There's a problem in invalidate page that couldn't be solved as regression in this development cycle. The problem occurs when buffered and direct io are mixed, and the ranges overlap. Although this is not recommended, filesystems implement measures or fallbacks to make it somehow work. In this case, fallback to buffered IO would be an option for btrfs (this already happens when direct io is done on compressed data), but the change would be needed in the iomap code, bringing new semantics to other filesystems. Another problem arises when again the buffered and direct ios are mixed, invalidation fails, then -EIO is set on the mapping and fsync will fail, though there's no real error. There have been discussions how to fix that, but revert seems to be the least intrusive option. Link: https://lore.kernel.org/linux-btrfs/20200528192103.xm45qoxqmkw7i5yl@fiona/Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Masahiro Yamada 提交于
Since commit 84af7a61 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
- 13 6月, 2020 4 次提交
-
-
由 Steve French 提交于
Pavel noticed that a debug message (disabled by default) in creating the security descriptor context could be useful for new file creation owner fields (as we already have for the mode) when using mount parm idsfromsid. [38120.392272] CIFS: FYI: owner S-1-5-88-1-0, group S-1-5-88-2-0 [38125.792637] CIFS: FYI: owner S-1-5-88-1-1000, group S-1-5-88-2-1000 Also cleans up a typo in a comment Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
-
由 Eric W. Biederman 提交于
Recently syzbot reported that unmounting proc when there is an ongoing inotify watch on the root directory of proc could result in a use after free when the watch is removed after the unmount of proc when the watcher exits. Commit 69879c01 ("proc: Remove the now unnecessary internal mount of proc") made it easier to unmount proc and allowed syzbot to see the problem, but looking at the code it has been around for a long time. Looking at the code the fsnotify watch should have been removed by fsnotify_sb_delete in generic_shutdown_super. Unfortunately the inode was allocated with new_inode_pseudo instead of new_inode so the inode was not on the sb->s_inodes list. Which prevented fsnotify_unmount_inodes from finding the inode and removing the watch as well as made it so the "VFS: Busy inodes after unmount" warning could not find the inodes to warn about them. Make all of the inodes in proc visible to generic_shutdown_super, and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo. The only functional difference is that new_inode places the inodes on the sb->s_inodes list. I wrote a small test program and I can verify that without changes it can trigger this issue, and by replacing new_inode_pseudo with new_inode the issues goes away. Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/000000000000d788c905a7dfa3f4@google.com Reported-by: syzbot+7d2debdcdb3cb93c1e5e@syzkaller.appspotmail.com Fixes: 0097875b ("proc: Implement /proc/thread-self to point at the directory of the current thread") Fixes: 021ada7d ("procfs: switch /proc/self away from proc_dir_entry") Fixes: 51f0885e ("vfs,proc: guarantee unique inodes in /proc") Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Steve French 提交于
idsfromsid was ignored in chown and chgrp causing it to fail when upcalls were not configured for lookup. idsfromsid allows mapping users when setting user or group ownership using "special SID" (reserved for this). Add support for chmod and chgrp when idsfromsid mount option is enabled. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
-
由 Steve French 提交于
Currently idsfromsid mount option allows querying owner information from the special sids used to represent POSIX uids and gids but needed changes to populate the security descriptor context with the owner information when idsfromsid mount option was used. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
-
- 12 6月, 2020 14 次提交
-
-
由 Steve French 提交于
Add dynamic tracepoints for new SMB3.1.1. posix extensions query info level (100) Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com>
-
由 Steve French 提交于
Adds calls to the newer info level for query info using SMB3.1.1 posix extensions. The remaining two places that call the older query info (non-SMB3.1.1 POSIX) require passing in the fid and can be updated in a later patch. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com>
-
由 Steve French 提交于
Improve support for lookup when using SMB3.1.1 posix mounts. Use new info level 100 (posix query info) Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com>
-
由 Steve French 提交于
Add worker function for non-compounded SMB3.1.1 POSIX Extensions query info. This is needed for revalidate of root (cached) directory for example. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com>
-
由 Steve French 提交于
Adds support for better query info on dentry revalidation (using the SMB3.1.1 POSIX extensions level 100). Followon patch will add support for translating the UID/GID from the SID and also will add support for using the posix query info on lookup. Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com> Reviewed-by: NAurelien Aptel <aaptel@suse.com>
-
由 Namjae Jeon 提交于
Some of tests in xfstests failed with cifsd kernel server since commit e80ddeb2. cifsd kernel server validates credit charge from client by calculating it base on max((InputCount + OutputCount) and (MaxInputResponse + MaxOutputResponse)) according to specification. MS-SMB2 specification describe credit charge calculation of smb2 ioctl : If Connection.SupportsMultiCredit is TRUE, the server MUST validate CreditCharge based on the maximum of (InputCount + OutputCount) and (MaxInputResponse + MaxOutputResponse), as specified in section 3.3.5.2.5. If the validation fails, it MUST fail the IOCTL request with STATUS_INVALID_PARAMETER. This patch add indatalen that can be a non-zero value to calculation of credit charge in SMB2_ioctl_init(). Fixes: e80ddeb2 ("smb3: fix incorrect number of credits when ioctl MaxOutputResponse > 64K") Cc: Stable <stable@vger.kernel.org> Reviewed-by: NAurelien Aptel <aaptel@suse.com> Cc: Steve French <smfrench@gmail.com> Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NSteve French <stfrench@microsoft.com>
-
由 Tom Seewald 提交于
After commit 12abc5ee ("tcp: add tcp_sock_set_nodelay") and commit c488aead ("tcp: add tcp_sock_set_user_timeout"), building the kernel with OCFS2_FS=y but without INET=y causes it to fail with: ld: fs/ocfs2/cluster/tcp.o: in function `o2net_accept_many': tcp.c:(.text+0x21b1): undefined reference to `tcp_sock_set_nodelay' ld: tcp.c:(.text+0x21c1): undefined reference to `tcp_sock_set_user_timeout' ld: fs/ocfs2/cluster/tcp.o: in function `o2net_start_connect': tcp.c:(.text+0x2633): undefined reference to `tcp_sock_set_nodelay' ld: tcp.c:(.text+0x2643): undefined reference to `tcp_sock_set_user_timeout' This is due to tcp_sock_set_nodelay() and tcp_sock_set_user_timeout() being declared in linux/tcp.h and defined in net/ipv4/tcp.c, which depend on TCP/IP being enabled. To fix this, make OCFS2_FS depend on INET=y which already requires NET=y. Fixes: 12abc5ee ("tcp: add tcp_sock_set_nodelay") Fixes: c488aead ("tcp: add tcp_sock_set_user_timeout") Signed-off-by: NTom Seewald <tseewald@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Acked-by: NChristoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: David S. Miller <davem@davemloft.net> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200606190827.23954-1-tseewald@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Fix afs_store_data() so that it sets the mtime in the new operation descriptor otherwise the mtime on the server gets set to 0 when a write is stored to the server. Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept") Reported-by: NDave Botsch <botsch@cnf.cornell.edu> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chuck Lever 提交于
I measured a 50% throughput regression for large direct writes. The observed on-the-wire behavior is that the client sends every NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once as a FILE_SYNC WRITE. This is because the nfs_write_match_verf() check in nfs_direct_commit_complete() fails for every WRITE. Buffered writes use nfs_write_completion(), which sets req->wb_verf correctly. Direct writes use nfs_direct_write_completion(), which does not set req->wb_verf at all. This leaves req->wb_verf set to all zeroes for every direct WRITE, and thus nfs_direct_commit_completion() always sets NFS_ODIRECT_RESCHED_WRITES. This fix appears to restore nearly all of the lost performance. Fixes: 1f28476d ("NFS: Fix O_DIRECT commit verifier handling") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Zheng Bin 提交于
Use the following command to test nfsv4(size of file1M is 1MB): mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1/dir1 /mnt cp file1M /mnt du -h /mnt/file1M -->0 within 60s, then 1M When write is done(cp file1M /mnt), will call this: nfs_writeback_done nfs4_write_done nfs4_write_done_cb nfs_writeback_update_inode nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime nfs_post_op_update_inode_force_wcc_locked nfs_set_cache_invalid nfs_refresh_inode_locked nfs_update_inode nfsd write response contains change, ctime, mtime, the flag will be clear after nfs_update_inode. Howerver, write response does not contain space_used, previous open response contains space_used whose value is 0, so inode->i_blocks is still 0. nfs_getattr -->called by "du -h" do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s cache_validity = READ_ONCE(NFS_I(inode)->cache_validity) do_update |= cache_validity & (NFS_INO_INVALID_ATTR -->false if (do_update) { __nfs_revalidate_inode } Within 60s, does not send getattr request to nfsd, thus "du -h /mnt/file1M" is 0. Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done. Fixes: 16e14375 ("NFS: More fine grained attribute tracking") Signed-off-by: NZheng Bin <zhengbin13@huawei.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Colin Ian King 提交于
The variable result is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
A short read can generate an -EIO error without there being an error on the wire. This tracepoint acts as an eyecatcher when there is no obvious I/O error. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
When sunrpc trace points are not enabled, the recorded task ID information alone is not helpful. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 11 6月, 2020 11 次提交
-
-
由 Xiaoguang Wang 提交于
While testing io_uring in arm, we found sometimes io_sq_thread() keeps polling io requests even though there are not inflight io requests in block layer. After some investigations, found a possible race about io_kiocb.flags, see below race codes: 1) in the end of io_write() or io_read() req->flags &= ~REQ_F_NEED_CLEANUP; kfree(iovec); return ret; 2) in io_complete_rw_iopoll() if (res != -EAGAIN) req->flags |= REQ_F_IOPOLL_COMPLETED; In IOPOLL mode, io requests still maybe completed by interrupt, then above codes are not safe, concurrent modifications to req->flags, which is not protected by lock or is not atomic modifications. I also had disassemble io_complete_rw_iopoll() in arm: req->flags |= REQ_F_IOPOLL_COMPLETED; 0xffff000008387b18 <+76>: ldr w0, [x19,#104] 0xffff000008387b1c <+80>: orr w0, w0, #0x1000 0xffff000008387b20 <+84>: str w0, [x19,#104] Seems that the "req->flags |= REQ_F_IOPOLL_COMPLETED;" is load and modification, two instructions, which obviously is not atomic. To fix this issue, add a new iopoll_completed in io_kiocb to indicate whether io request is completed. Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Some architectures like arm64 and s390 require USER_DS to be set for kernel threads to access user address space, which is the whole purpose of kthread_use_mm, but other like x86 don't. That has lead to a huge mess where some callers are fixed up once they are tested on said architectures, while others linger around and yet other like io_uring try to do "clever" optimizations for what usually is just a trivial asignment to a member in the thread_struct for most architectures. Make kthread_use_mm set USER_DS, and kthread_unuse_mm restore to the previous value instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NJens Axboe <axboe@kernel.dk> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Felipe Balbi <balbi@kernel.org> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20200404094101.672954-7-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
Switch the function documentation to kerneldoc comments, and add WARN_ON_ONCE asserts that the calling thread is a kernel thread and does not have ->mm set (or has ->mm set in the case of unuse_mm). Also give the functions a kthread_ prefix to better document the use case. [hch@lst.de: fix a comment typo, cover the newly merged use_mm/unuse_mm caller in vfio] Link: http://lkml.kernel.org/r/20200416053158.586887-3-hch@lst.de [sfr@canb.auug.org.au: powerpc/vas: fix up for {un}use_mm() rename] Link: http://lkml.kernel.org/r/20200422163935.5aa93ba5@canb.auug.org.auSigned-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NJens Axboe <axboe@kernel.dk> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [usb] Acked-by: NHaren Myneni <haren@linux.ibm.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Felipe Balbi <balbi@kernel.org> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Link: http://lkml.kernel.org/r/20200404094101.672954-6-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
Patch series "improve use_mm / unuse_mm", v2. This series improves the use_mm / unuse_mm interface by better documenting the assumptions, and my taking the set_fs manipulations spread over the callers into the core API. This patch (of 3): Use the proper API instead. Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de These helpers are only for use with kernel threads, and I will tie them more into the kthread infrastructure going forward. Also move the prototypes to kthread.h - mmu_context.h was a little weird to start with as it otherwise contains very low-level MM bits. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NJens Axboe <axboe@kernel.dk> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Felipe Balbi <balbi@kernel.org> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de Link: http://lkml.kernel.org/r/20200416053158.586887-1-hch@lst.de Link: http://lkml.kernel.org/r/20200404094101.672954-5-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Keyur Patel 提交于
./ocfs2/mmap.c:65: bebongs ==> belonging Signed-off-by: NKeyur Patel <iamkeyur96@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200608014818.102358-1-iamkeyur96@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ryusuke Konishi 提交于
After commit c3aab9a0 ("mm/filemap.c: don't initiate writeback if mapping has no dirty pages"), the following null pointer dereference has been reported on nilfs2: BUG: kernel NULL pointer dereference, address: 00000000000000a8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI ... RIP: 0010:percpu_counter_add_batch+0xa/0x60 ... Call Trace: __test_set_page_writeback+0x2d3/0x330 nilfs_segctor_do_construct+0x10d3/0x2110 [nilfs2] nilfs_segctor_construct+0x168/0x260 [nilfs2] nilfs_segctor_thread+0x127/0x3b0 [nilfs2] kthread+0xf8/0x130 ... This crash turned out to be caused by set_page_writeback() call for segment summary buffers at nilfs_segctor_prepare_write(). set_page_writeback() can call inc_wb_stat(inode_to_wb(inode), WB_WRITEBACK) where inode_to_wb(inode) is NULL if the inode of underlying block device does not have an associated wb. This fixes the issue by calling inode_attach_wb() in advance to ensure to associate the bdev inode with its wb. Fixes: c3aab9a0 ("mm/filemap.c: don't initiate writeback if mapping has no dirty pages") Reported-by: NWalton Hoops <me@waltonhoops.com> Reported-by: NTomas Hlavaty <tom@logand.com> Reported-by: NARAI Shun-ichi <hermes@ceres.dti.ne.jp> Reported-by: NHideki EIRAKU <hdk1983@gmail.com> Signed-off-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> [5.4+] Link: http://lkml.kernel.org/r/20200608.011819.1399059588922299158.konishi.ryusuke@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jiufei Xue 提交于
If the socket is O_NONBLOCK, we should complete the accept request with -EAGAIN when data is not ready. Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Xiaoguang Wang 提交于
Basically IORING_OP_POLL_ADD command and async armed poll handlers for regular commands don't touch io_wq_work, so only REQ_F_WORK_INITIALIZED is set, can we do io_wq_work copy and restore. Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Xiaoguang Wang 提交于
If requests can be submitted and completed inline, we don't need to initialize whole io_wq_work in io_init_req(), which is an expensive operation, add a new 'REQ_F_WORK_INITIALIZED' to determine whether io_wq_work is initialized and add a helper io_req_init_async(), users must call io_req_init_async() for the first time touching any members of io_wq_work. I use /dev/nullb0 to evaluate performance improvement in my physical machine: modprobe null_blk nr_devices=1 completion_nsec=0 sudo taskset -c 60 fio -name=fiotest -filename=/dev/nullb0 -iodepth=128 -thread -rw=read -ioengine=io_uring -direct=1 -bs=4k -size=100G -numjobs=1 -time_based -runtime=120 before this patch: Run status group 0 (all jobs): READ: bw=724MiB/s (759MB/s), 724MiB/s-724MiB/s (759MB/s-759MB/s), io=84.8GiB (91.1GB), run=120001-120001msec With this patch: Run status group 0 (all jobs): READ: bw=761MiB/s (798MB/s), 761MiB/s-761MiB/s (798MB/s-798MB/s), io=89.2GiB (95.8GB), run=120001-120001msec About 5% improvement. Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Alexey Gladkov 提交于
syzbot found that proc_fill_super() fails before filling up sb->s_fs_info, deactivate_locked_super() will be called and sb->s_fs_info will be NULL. The proc_kill_sb() does not expect fs_info to be NULL which is wrong. Link: https://lore.kernel.org/lkml/0000000000002d7ca605a7b8b1c5@google.com Reported-by: syzbot+4abac52934a48af5ff19@syzkaller.appspotmail.com Fixes: fa10fed3 ("proc: allow to mount many instances of proc in one pid namespace") Signed-off-by: NAlexey Gladkov <gladkov.alexey@gmail.com> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
-
由 Christoph Hellwig 提交于
Instead of triggering a WARN_ON deep down in the page allocator just give up early on allocations that are way larger than the usual sysctl values. Fixes: 32927393 ("sysctl: pass kernel pointers to ->proc_handler") Reported-by: NVegard Nossum <vegard.nossum@oracle.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-