- 10 9月, 2020 2 次提交
-
-
由 Vivek Goyal 提交于
This reduces code duplication and make it little easier to read code. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
由 Vivek Goyal 提交于
virtiofs device has a range of memory which is mapped into file inodes using dax. This memory is mapped in qemu on host and maps different sections of real file on host. Size of this memory is limited (determined by administrator) and depending on filesystem size, we will soon reach a situation where all the memory is in use and we need to reclaim some. As part of reclaim process, we will need to make sure that there are no active references to pages (taken by get_user_pages()) on the memory range we are trying to reclaim. I am planning to use dax_layout_busy_page() for this. But in current form this is per inode and scans through all the pages of the inode. We want to reclaim only a portion of memory (say 2MB page). So we want to make sure that only that 2MB range of pages do not have any references (and don't want to unmap all the pages of inode). Hence, create a range version of this function named dax_layout_busy_page_range() which can be used to pass a range which needs to be unmapped. Cc: Dan Williams <dan.j.williams@intel.com> Cc: linux-nvdimm@lists.01.org Cc: Jan Kara <jack@suse.cz> Cc: Vishal L Verma <vishal.l.verma@intel.com> Cc: "Weiny, Ira" <ira.weiny@intel.com> Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
- 04 9月, 2020 1 次提交
-
-
由 André Almeida 提交于
As stated in https://sourceforge.net/projects/fuse/, "the FUSE project has moved to https://github.com/libfuse/" in 22-Dec-2015. Update URLs to reflect this. Signed-off-by: NAndré Almeida <andrealmeid@collabora.com> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
- 29 8月, 2020 1 次提交
-
-
由 Paulo Alcantara 提交于
For SMB1, the DFS flag should be checked against tcon->Flags rather than tcon->share_flags. While at it, add an is_tcon_dfs() helper to check for DFS capability in a more generic way. Signed-off-by: NPaulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: NSteve French <stfrench@microsoft.com> Reviewed-by: NShyam Prasad N <nspmangalore@gmail.com>
-
- 28 8月, 2020 3 次提交
-
-
由 Jens Axboe 提交于
These events happen inline from submission, so there's no need to bounce them through the original task. Just set them up for retry and issue retry directly instead of going over task_work. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
This normally isn't hit, as polling is mostly done on NVMe with deep queue depths. But if we do run into request starvation, we need to ensure that retries are properly serialized. Reported-by: NAndres Freund <andres@anarazel.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Dan Carpenter 提交于
The fall through annotation comes after a return statement so it's not reachable. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 27 8月, 2020 2 次提交
-
-
由 Jens Axboe 提交于
Make sure we clear req->result, which was set to -EAGAIN for retry purposes, when moving it to the reissue list. Otherwise we can end up retrying a request more than once, which leads to weird results in the io-wq handling (and other spots). Cc: stable@vger.kernel.org Reported-by: NAndres Freund <andres@anarazel.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
The man page for io_uring generally claims were consistent with what preadv2 and pwritev2 accept, but turns out there's a slight discrepancy in how offset == -1 is handled for pipes/streams. preadv doesn't allow it, but preadv2 does. This currently causes io_uring to return -EINVAL if that is attempted, but we should allow that as documented. This change makes us consistent with preadv2/pwritev2 for just passing in a NULL ppos for streams if the offset is -1. Cc: stable@vger.kernel.org # v5.7+ Reported-by: NBenedikt Ames <wisp3rwind@posteo.eu> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 26 8月, 2020 3 次提交
-
-
由 Jens Axboe 提交于
We need to call kiocb_done() for any ret < 0 to ensure that we always get the proper -ERESTARTSYS (and friends) transformation done. At some point this should be tied into general error handling, so we can get rid of the various (mostly network) related commands that check and perform this substitution. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
There's no point in using the poll handler if we can't do a nonblocking IO attempt of the operation, since we'll need to go async anyway. In fact this is actively harmful, as reading from eg pipes won't return 0 to indicate EOF. Cc: stable@vger.kernel.org # v5.7+ Reported-by: NBenedikt Ames <wisp3rwind@posteo.eu> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We do the initial accounting of locked_vm and pinned_vm before we have setup ctx->sqo_mm, which means we can end up having not accounted the memory at setup time, but still decrement it when we exit. This causes an imbalance in the accounting. Setup ctx->sqo_mm earlier in io_uring_create(), before we do the first accounting of mm->{locked,pinned}_vm. This also unifies the state grabbing for the ctx, and eliminates a failure case in io_sq_offload_start(). Fixes: f74441e6 ("io_uring: account locked memory before potential error case") Reported-by: NRobert M. Muncrief <rmuncrief@humanavance.com> Reported-by: NNiklas Schnelle <schnelle@linux.ibm.com> Tested-by: NNiklas Schnelle <schnelle@linux.ibm.com> Tested-by: NRobert M. Muncrief <rmuncrief@humanavance.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 8月, 2020 2 次提交
-
-
由 Jens Axboe 提交于
Some consumers of the iov_iter will return an error, but still have bytes consumed in the iterator. This is an issue for -EAGAIN, since we rely on a sane iov_iter state across retries. Fix this by ensuring that we revert consumed bytes, if any, if the file operations have consumed any bytes from iterator. This is similar to what generic_file_read_iter() does, and is always safe as we have the previous bytes count handy already. Fixes: ff6165b2 ("io_uring: retain iov_iter state over io_read/io_write calls") Reported-by: NDmitry Shulyak <yashulyak@gmail.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jeff Layton 提交于
Leases don't currently work correctly on kcephfs, as they are not broken when caps are revoked. They could eventually be implemented similarly to how we did them in libcephfs, but for now don't allow them. [ idryomov: no need for simple_nosetlease() in ceph_dir_fops and ceph_snapdir_fops ] Signed-off-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 24 8月, 2020 6 次提交
-
-
由 Jeff Layton 提交于
Tuan and Ulrich mentioned that they were hitting a problem on s390x, which has a 32-bit ino_t value, even though it's a 64-bit arch (for historical reasons). I think the current handling of inode numbers in the ceph driver is wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's actually not a problem. 32-bit arches can deal with 64-bit inode numbers just fine when userland code is compiled with LFS support (the common case these days). What we really want to do is just use 64-bit numbers everywhere, unless someone has mounted with the ino32 mount option. In that case, we want to ensure that we hash the inode number down to something that will fit in 32 bits before presenting the value to userland. Add new helper functions that do this, and only do the conversion before presenting these values to userland in getattr and readdir. The inode table hashvalue is changed to just cast the inode number to unsigned long, as low-order bits are the most likely to vary anyway. While it's not strictly required, we do want to put something in inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on the size of the ino_t type. NOTE: This is a user-visible change on 32-bit arches: 1/ inode numbers will be seen to have changed between kernel versions. 32-bit arches will see large inode numbers now instead of the hashed ones they saw before. 2/ any really old software not built with LFS support may start failing stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we can do about these, but hopefully the intersection of people running such code on ceph will be very small. The workaround for both problems is to mount with "-o ino32". [ idryomov: changelog tweak ] URL: https://tracker.ceph.com/issues/46828Reported-by: NUlrich Weigand <Ulrich.Weigand@de.ibm.com> Reported-and-Tested-by: NTuan Hoang1 <Tuan.Hoang1@ibm.com> Signed-off-by: NJeff Layton <jlayton@kernel.org> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Bob Peterson 提交于
When a log flush fails due to io errors, it signals the failure but does not clean up after itself very well. This is because buffers are added to the transaction tr_buf and tr_databuf queue, but the io error causes gfs2_log_flush to bypass the "after_commit" functions responsible for dequeueing the bd elements. If the bd elements are added to the ail list before the error, function ail_drain takes care of dequeueing them. But if they haven't gotten that far, the elements are forgotten and make the transactions unable to be freed. This patch introduces new function trans_drain which drains the bd elements from the transaction so they can be freed properly. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
-
由 Max Filippov 提交于
binfmt_flat loader uses the gap between text and data to store data segment pointers for the libraries. Even in the absence of shared libraries it stores at least one pointer to the executable's own data segment. Text and data can go back to back in the flat binary image and without offsetting data segment last few instructions in the text segment may get corrupted by the data segment pointer. Fix it by reverting commit a2357223 ("binfmt_flat: don't offset the data start"). Cc: stable@vger.kernel.org Fixes: a2357223 ("binfmt_flat: don't offset the data start") Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com> Signed-off-by: NGreg Ungerer <gerg@linux-m68k.org>
-
由 Gustavo A. R. Silva 提交于
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
由 Pavel Begunkov 提交于
Don't forget to update wqe->hash_tail after cancelling a pending work item, if it was hashed. Cc: stable@vger.kernel.org # 5.7+ Reported-by: NDmitry Shulyak <yashulyak@gmail.com> Fixes: 86f3cd1b ("io-wq: handle hashed writes in chains") Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
If an application is doing reads on signalfd, and we arm the poll handler because there's no data available, then the wakeup can recurse on the tasks sighand->siglock as the signal delivery from task_work_add() will use TWA_SIGNAL and that attempts to lock it again. We can detect the signalfd case pretty easily by comparing the poll->head wait_queue_head_t with the target task signalfd wait queue. Just use normal task wakeup for this case. Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 23 8月, 2020 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Marc Zyngier 提交于
When adding a new fd to an epoll, and that this new fd is an epoll fd itself, we recursively scan the fds attached to it to detect cycles, and add non-epool files to a "check list" that gets subsequently parsed. However, this check list isn't completely safe when deletions can happen concurrently. To sidestep the issue, make sure that a struct file placed on the check list sees its f_count increased, ensuring that a concurrent deletion won't result in the file disapearing from under our feet. Cc: stable@vger.kernel.org Signed-off-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 22 8月, 2020 3 次提交
-
-
由 David Howells 提交于
If an error occurs during the construction of an afs superblock, it's possible that an error occurs after a superblock is created, but before we've created the root dentry. If the superblock has a dynamic root (ie. what's normally mounted on /afs), the afs_kill_super() will call afs_dynroot_depopulate() to unpin any created dentries - but this will oops if the root hasn't been created yet. Fix this by skipping that bit of code if there is no root dentry. This leads to an oops looking like: general protection fault, ... KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f] ... RIP: 0010:afs_dynroot_depopulate+0x25f/0x529 fs/afs/dynroot.c:385 ... Call Trace: afs_kill_super+0x13b/0x180 fs/afs/super.c:535 deactivate_locked_super+0x94/0x160 fs/super.c:335 afs_get_tree+0x1124/0x1460 fs/afs/super.c:598 vfs_get_tree+0x89/0x2f0 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x1387/0x2070 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount fs/namespace.c:3390 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3390 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 which is oopsing on this line: inode_lock(root->d_inode); presumably because sb->s_root was NULL. Fixes: 0da0b7fd ("afs: Display manually added cells in dynamic root mount") Reported-by: syzbot+c1eff8205244ae7e11a6@syzkaller.appspotmail.com Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Phillip Lougher 提交于
This is a regression introduced by the patch "migrate from ll_rw_block usage to BIO". Bio_alloc() is limited to 256 pages (1 Mbyte). This can cause a failure when reading 1 Mbyte block filesystems. The problem is a datablock can be fully (or almost uncompressed), requiring 256 pages, but, because blocks are not aligned to page boundaries, it may require 257 pages to read. Bio_kmalloc() can handle 1024 pages, and so use this for the edge condition. Fixes: 93e72b3c ("squashfs: migrate from ll_rw_block usage to BIO") Reported-by: NNicolas Prochazka <nicolas.prochazka@gmail.com> Reported-by: NTomoatsu Shimada <shimada@walbrix.com> Signed-off-by: NPhillip Lougher <phillip@squashfs.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NGuenter Roeck <groeck@chromium.org> Cc: Philippe Liard <pliard@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Adrien Schildknecht <adrien+dev@schischi.me> Cc: Daniel Rosenberg <drosen@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200815035637.15319-1-phillip@squashfs.org.ukSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jann Horn 提交于
romfs has a superblock field that limits the size of the filesystem; data beyond that limit is never accessed. romfs_dev_read() fetches a caller-supplied number of bytes from the backing device. It returns 0 on success or an error code on failure; therefore, its API can't represent short reads, it's all-or-nothing. However, when romfs_dev_read() detects that the requested operation would cross the filesystem size limit, it currently silently truncates the requested number of bytes. This e.g. means that when the content of a file with size 0x1000 starts one byte before the filesystem size limit, ->readpage() will only fill a single byte of the supplied page while leaving the rest uninitialized, leaking that uninitialized memory to userspace. Fix it by returning an error code instead of truncating the read when the requested read operation would go beyond the end of the filesystem. Fixes: da4458bd ("NOMMU: Make it possible for RomFS to use MTD devices directly") Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 8月, 2020 3 次提交
-
-
由 Boris Burkov 提交于
can_nocow_extent and btrfs_cross_ref_exist both rely on a heuristic for detecting a must cow condition which is not exactly accurate, but saves unnecessary tree traversal. The incorrect assumption is that if the extent was created in a generation smaller than the last snapshot generation, it must be referenced by that snapshot. That is true, except the snapshot could have since been deleted, without affecting the last snapshot generation. The original patch claimed a performance win from this check, but it also leads to a bug where you are unable to use a swapfile if you ever snapshotted the subvolume it's in. Make the check slower and more strict for the swapon case, without modifying the general cow checks as a compromise. Turning swap on does not seem to be a particularly performance sensitive operation, so incurring a possibly unnecessary btrfs_search_slot seems worthwhile for the added usability. Note: Until the snapshot is competely cleaned after deletion, check_committed_refs will still cause the logic to think that cow is necessary, so the user must until 'btrfs subvolu sync' finished before activating the swapfile swapon. CC: stable@vger.kernel.org # 5.4+ Suggested-by: NOmar Sandoval <osandov@osandov.com> Signed-off-by: NBoris Burkov <boris@bur.io> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Josef Bacik 提交于
With my new locking code dbench is so much faster that I tripped over a transaction abort from ENOSPC. This turned out to be because btrfs_del_dir_entries_in_log was checking for ret == -ENOSPC, but this function sets err on error, and returns err. So instead of properly marking the inode as needing a full commit, we were returning -ENOSPC and aborting in __btrfs_unlink_inode. Fix this by checking the proper variable so that we return the correct thing in the case of ENOSPC. The ENOENT needs to be checked, because btrfs_lookup_dir_item_index() can return -ENOENT if the dir item isn't in the tree log (which would happen if we hadn't fsync'ed this guy). We actually handle that case in __btrfs_unlink_inode, so it's an expected error to get back. Fixes: 4a500fd1 ("Btrfs: Metadata ENOSPC handling for tree log") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NJosef Bacik <josef@toxicpanda.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> [ add note and comment about ENOENT ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Howells 提交于
The afs_put_operation() function needs to put the reference to the key that's authenticating the operation. Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept") Reported-by: NDave Botsch <botsch@cnf.cornell.edu> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 8月, 2020 12 次提交
-
-
由 Pavel Begunkov 提交于
If io_import_iovec() returns an error, return iovec is undefined and must not be used, so don't set it to NULL when failing. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
kfree() handles NULL pointers well, but io_{read,write}() checks it because of performance reasons. Leave a comment there for those who are tempted to patch it. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Setting and clearing REQ_F_OVERFLOW in io_uring_cancel_files() and io_cqring_overflow_flush() are racy, because they might be called asynchronously. REQ_F_OVERFLOW flag in only needed for files cancellation, so if it can be guaranteed that requests _currently_ marked inflight can't be overflown, the problem will be solved with removing the flag altogether. That's how the patch works, it removes inflight status of a request in io_cqring_fill_event() whenever it should be thrown into CQ-overflow list. That's Ok to do, because no opcode specific handling can be done after io_cqring_fill_event(), the same assumption as with "struct io_completion" patches. And it already have a good place for such cleanups, which is io_clean_op(). A nice side effect of this is removing this inflight check from the hot path. note on synchronisation: now __io_cqring_fill_event() may be taking two spinlocks simultaneously, completion_lock and inflight_lock. It's fine, because we never do that in reverse order, and CQ-overflow of inflight requests shouldn't happen often. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We currently use system_wq, which is unbounded in terms of number of workers. This means that if we're exiting tons of rings at the same time, then we'll briefly spawn tons of event kworkers just for a very short blocking time as the rings exit. Use system_unbound_wq instead, which has a sane cap on the concurrency level. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Filipe Manana 提交于
If a transaction aborts it can cause a memory leak of the pages array of a block group's io_ctl structure. The following steps explain how that can happen: 1) Transaction N is committing, currently in state TRANS_STATE_UNBLOCKED and it's about to start writing out dirty extent buffers; 2) Transaction N + 1 already started and another task, task A, just called btrfs_commit_transaction() on it; 3) Block group B was dirtied (extents allocated from it) by transaction N + 1, so when task A calls btrfs_start_dirty_block_groups(), at the very beginning of the transaction commit, it starts writeback for the block group's space cache by calling btrfs_write_out_cache(), which allocates the pages array for the block group's io_ctl with a call to io_ctl_init(). Block group A is added to the io_list of transaction N + 1 by btrfs_start_dirty_block_groups(); 4) While transaction N's commit is writing out the extent buffers, it gets an IO error and aborts transaction N, also setting the file system to RO mode; 5) Task A has already returned from btrfs_start_dirty_block_groups(), is at btrfs_commit_transaction() and has set transaction N + 1 state to TRANS_STATE_COMMIT_START. Immediately after that it checks that the filesystem was turned to RO mode, due to transaction N's abort, and jumps to the "cleanup_transaction" label. After that we end up at btrfs_cleanup_one_transaction() which calls btrfs_cleanup_dirty_bgs(). That helper finds block group B in the transaction's io_list but it never releases the pages array of the block group's io_ctl, resulting in a memory leak. In fact at the point when we are at btrfs_cleanup_dirty_bgs(), the pages array points to pages that were already released by us at __btrfs_write_out_cache() through the call to io_ctl_drop_pages(). We end up freeing the pages array only after waiting for the ordered extent to complete through btrfs_wait_cache_io(), which calls io_ctl_free() to do that. But in the transaction abort case we don't wait for the space cache's ordered extent to complete through a call to btrfs_wait_cache_io(), so that's why we end up with a memory leak - we wait for the ordered extent to complete indirectly by shutting down the work queues and waiting for any jobs in them to complete before returning from close_ctree(). We can solve the leak simply by freeing the pages array right after releasing the pages (with the call to io_ctl_drop_pages()) at __btrfs_write_out_cache(), since we will never use it anymore after that and the pages array points to already released pages at that point, which is currently not a problem since no one will use it after that, but not a good practice anyway since it can easily lead to use-after-free issues. So fix this by freeing the pages array right after releasing the pages at __btrfs_write_out_cache(). This issue can often be reproduced with test case generic/475 from fstests and kmemleak can detect it and reports it with the following trace: unreferenced object 0xffff9bbf009fa600 (size 512): comm "fsstress", pid 38807, jiffies 4298504428 (age 22.028s) hex dump (first 32 bytes): 00 a0 7c 4d 3d ed ff ff 40 a0 7c 4d 3d ed ff ff ..|M=...@.|M=... 80 a0 7c 4d 3d ed ff ff c0 a0 7c 4d 3d ed ff ff ..|M=.....|M=... backtrace: [<00000000f4b5cfe2>] __kmalloc+0x1a8/0x3e0 [<0000000028665e7f>] io_ctl_init+0xa7/0x120 [btrfs] [<00000000a1f95b2d>] __btrfs_write_out_cache+0x86/0x4a0 [btrfs] [<00000000207ea1b0>] btrfs_write_out_cache+0x7f/0xf0 [btrfs] [<00000000af21f534>] btrfs_start_dirty_block_groups+0x27b/0x580 [btrfs] [<00000000c3c23d44>] btrfs_commit_transaction+0xa6f/0xe70 [btrfs] [<000000009588930c>] create_subvol+0x581/0x9a0 [btrfs] [<000000009ef2fd7f>] btrfs_mksubvol+0x3fb/0x4a0 [btrfs] [<00000000474e5187>] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs] [<00000000708ee349>] btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs] [<00000000ea60106f>] btrfs_ioctl+0x12c/0x3130 [btrfs] [<000000005c923d6d>] __x64_sys_ioctl+0x83/0xb0 [<0000000043ace2c9>] do_syscall_64+0x33/0x80 [<00000000904efbce>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 CC: stable@vger.kernel.org # 4.9+ Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
The build robot reports compiler: h8300-linux-gcc (GCC) 9.3.0 In file included from fs/btrfs/tests/extent-map-tests.c:8: >> fs/btrfs/tests/../ctree.h:2166:8: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] 2166 | size_t __const btrfs_get_num_csums(void); | ^~~~~~~ The function attribute for const does not follow the expected scheme and in this case is confused with a const type qualifier. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Marcos Paulo de Souza 提交于
Currently a user can set mount "-o compress" which will set the compression algorithm to zlib, and use the default compress level for zlib (3): relatime,compress=zlib:3,space_cache If the user remounts the fs using "-o compress=lzo", then the old compress_level is used: relatime,compress=lzo:3,space_cache But lzo does not expose any tunable compression level. The same happens if we set any compress argument with different level, also with zstd. Fix this by resetting the compress_level when compress=lzo is specified. With the fix applied, lzo is shown without compress level: relatime,compress=lzo,space_cache CC: stable@vger.kernel.org # 4.4+ Signed-off-by: NMarcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Johannes Thumshirn 提交于
Btrfs' async submit mechanism is able to handle errors in the submission path and the meta-data async submit function correctly passes the error code to the caller. In btrfs_submit_bio_start() and btrfs_submit_bio_start_direct_io() we're not handling the errors returned by btrfs_csum_one_bio() correctly though and simply call BUG_ON(). This is unnecessary as the caller of these two functions - run_one_async_start - correctly checks for the return values and sets the status of the async_submit_bio. The actual bio submission will be handled later on by run_one_async_done only if async_submit_bio::status is 0, so the data won't be written if we encountered an error in the checksum process. Simply return the error from btrfs_csum_one_bio() to the async submitters, like it's done in btree_submit_bio_start(). Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 brookxu 提交于
In the scenario of writing sparse files, the per-inode prealloc list may be very long, resulting in high overhead for ext4_mb_use_preallocated(). To circumvent this problem, we limit the maximum length of per-inode prealloc list to 512 and allow users to modify it. After patching, we observed that the sys ratio of cpu has dropped, and the system throughput has increased significantly. We created a process to write the sparse file, and the running time of the process on the fixed kernel was significantly reduced, as follows: Running time on unfixed kernel: [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat real 0m2.051s user 0m0.008s sys 0m2.026s Running time on fixed kernel: [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat real 0m0.471s user 0m0.004s sys 0m0.395s Signed-off-by: NChunguang Xu <brookxu@tencent.com> Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 brookxu 提交于
Reorganize the if statement of ext4_mb_release_context(), make it easier to read. Signed-off-by: NChunguang Xu <brookxu@tencent.com> Link: https://lore.kernel.org/r/5439ac6f-db79-ad68-76c1-a4dda9aa0cc3@gmail.comReviewed-by: NRitesh Harjani <riteshh@linux.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 brookxu 提交于
Lost chunks are when some other process raced with the current thread to grab a particular block allocation. Add mb_debug log for developers who wants to see how often this is happening for a particular workload. Signed-off-by: NChunguang Xu <brookxu@tencent.com> Link: https://lore.kernel.org/r/0a165ac0-1912-aebd-8a0d-b42e7cd1aea1@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 kyoungho koo 提交于
I have found double typed comments "the the". So i modified it to one "the" Signed-off-by: Nkyoungho koo <rnrudgh@gmail.com> Link: https://lore.kernel.org/r/20200424171620.GA11943@koo-Z370-HD3Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-