- 26 Jul, 2016 7 commits
-
-
Committed by Salah Triki
size contains the value returned by posix_acl_from_xattr(), which returns -ERANGE, -ENODATA, zero, or an integer greater than zero. So replace -ENOENT with -ERANGE.

Signed-off-by: Salah Triki <salah.triki@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Nikolay Borisov
The code flow in btrfs_new_inode allows btrfs_evict_inode to be called with an inode that is not fully initialised (e.g. the ->root member is not set). This can happen when btrfs_set_inode_index in btrfs_new_inode fails, which in turn calls iput for the newly allocated inode. That leads to the VFS calling into btrfs_evict_inode, which results in a null pointer dereference. To handle this situation, check whether the passed inode has root set and just free it if it doesn't.

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
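The guard described above can be sketched in userspace C. This is only an illustration under stated assumptions: `demo_inode`, `demo_root`, and `demo_evict_inode` are hypothetical stand-ins, not the kernel's structures or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct demo_root {
    int id;
};

struct demo_inode {
    struct demo_root *root; /* NULL until initialisation has completed */
    bool freed;             /* set once the in-memory inode is released */
    bool items_truncated;   /* set when root-dependent cleanup has run */
};

/* Evict an inode, tolerating one whose setup failed half way. */
static void demo_evict_inode(struct demo_inode *inode)
{
    assert(inode != NULL);

    if (inode->root == NULL) {
        /* Never fully set up: skip root-dependent cleanup, just free. */
        inode->freed = true;
        return;
    }

    /* Stands in for the real cleanup, which dereferences inode->root. */
    inode->items_truncated = true;
    inode->freed = true;
}
```

The early return is the whole point: every later step would dereference `root`, so the check has to come first.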
-
Committed by Liu Bo
We use read_node_slot() to read a btree node, and it can fail in two ways:
a) the slot is out of range, which means 'no such entry';
b) we fail to read the block because of a checksum failure, corrupted content, or a missing uptodate flag.

But we return NULL in both cases. This patch makes it return -ENOENT in case a) and -EIO in case b), and fixes its callers, as well as btrfs_search_forward()'s caller, to catch the new errors.

The problem was reported by Peter Becker, and I managed to hit the same BUG_ON by mounting my fuzz image.

Reported-by: Peter Becker <floyd.net@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
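As a rough illustration of the two failure cases, here is a userspace sketch that reports them with distinct error codes. `demo_node` and `demo_read_node_slot` are hypothetical names; the kernel function has a different signature and operates on extent buffers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a btree node. */
struct demo_node {
    int nritems;                 /* number of valid child slots */
    struct demo_node **children; /* a NULL entry models an unreadable block */
};

/*
 * Look up child `slot` of `parent`.
 * Returns 0 and stores the child in *out on success,
 * -ENOENT if the slot is out of range ("no such entry"),
 * -EIO if the child block could not be read.
 */
static int demo_read_node_slot(const struct demo_node *parent, int slot,
                               struct demo_node **out)
{
    assert(parent != NULL && out != NULL);

    if (slot < 0 || slot >= parent->nritems)
        return -ENOENT;
    if (parent->children[slot] == NULL)
        return -EIO;

    *out = parent->children[slot];
    return 0;
}
```

With separate codes, a caller can treat "nothing there" as a normal end of iteration and treat a read failure as an error to propagate, instead of guessing from a bare NULL.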
-
Committed by Liu Bo
I got this warning while mounting a btrfs image:

[ 3020.509606] ------------[ cut here ]------------
[ 3020.510107] WARNING: CPU: 3 PID: 5581 at lib/idr.c:1051 ida_remove+0xca/0x190
[ 3020.510853] ida_remove called for id=42 which is not allocated.
[ 3020.511466] Modules linked in:
[ 3020.511802] CPU: 3 PID: 5581 Comm: mount Not tainted 4.7.0-rc5+ #274
[ 3020.512438] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[ 3020.513385] 0000000000000286 0000000021295d86 ffff88006c66b8f0 ffffffff8182ba5a
[ 3020.514153] 0000000000000000 0000000000000009 ffff88006c66b930 ffffffff810e0ed7
[ 3020.514928] 0000041b00000000 ffffffff8289a8c0 ffff88007f437880 0000000000000000
[ 3020.515717] Call Trace:
[ 3020.515965] [<ffffffff8182ba5a>] dump_stack+0xc9/0x13f
[ 3020.516487] [<ffffffff810e0ed7>] __warn+0x147/0x160
[ 3020.517005] [<ffffffff810e0f4f>] warn_slowpath_fmt+0x5f/0x80
[ 3020.517572] [<ffffffff8182e6ca>] ida_remove+0xca/0x190
[ 3020.518075] [<ffffffff813a2bcc>] free_anon_bdev+0x2c/0x60
[ 3020.518609] [<ffffffff81657a9f>] free_fs_root+0x13f/0x160
[ 3020.519138] [<ffffffff8165c679>] btrfs_get_fs_root+0x379/0x3d0
[ 3020.519710] [<ffffffff81e6e975>] ? __mutex_unlock_slowpath+0x155/0x2c0
[ 3020.520366] [<ffffffff816615b1>] open_ctree+0x2e91/0x3200
[ 3020.520965] [<ffffffff8161ede2>] btrfs_mount+0x1322/0x15b0
[ 3020.521536] [<ffffffff81e60e74>] ? kmemleak_alloc_percpu+0x44/0x170
[ 3020.522167] [<ffffffff8115f5e1>] ? lockdep_init_map+0x61/0x210
[ 3020.522780] [<ffffffff813a4f59>] mount_fs+0x49/0x2c0
[ 3020.523305] [<ffffffff813d840c>] vfs_kern_mount+0xac/0x1b0
[ 3020.523872] [<ffffffff8161dee1>] btrfs_mount+0x421/0x15b0
[ 3020.524402] [<ffffffff81e60e74>] ? kmemleak_alloc_percpu+0x44/0x170
[ 3020.525045] [<ffffffff8115f5e1>] ? lockdep_init_map+0x61/0x210
[ 3020.525657] [<ffffffff8115f5e1>] ? lockdep_init_map+0x61/0x210
[ 3020.526289] [<ffffffff813a4f59>] mount_fs+0x49/0x2c0
[ 3020.526803] [<ffffffff813d840c>] vfs_kern_mount+0xac/0x1b0
[ 3020.527365] [<ffffffff813dc27a>] do_mount+0x41a/0x1770
[ 3020.527899] [<ffffffff812e800d>] ? strndup_user+0x6d/0xc0
[ 3020.528447] [<ffffffff812e7f68>] ? memdup_user+0x78/0xb0
[ 3020.528987] [<ffffffff813ddad0>] SyS_mount+0x150/0x160
[ 3020.529493] [<ffffffff81e72b7c>] entry_SYSCALL_64_fastpath+0x1f/0xbd

It turns out that we free the fs root twice: btrfs_init_fs_root() calls free_anon_bdev(root->anon_dev), and later btrfs_get_fs_root() calls free_fs_root(), which does another free_anon_bdev() and ends up with the above warning.

Instead of resetting root->anon_dev to 0 after free_anon_bdev(), we can let btrfs_init_fs_root() return directly, since its callers already do the freeing by calling free_fs_root().

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
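The ownership rule in this fix (the init function returns directly on failure, and the caller performs the one and only release) might look like the following userspace sketch. All `demo_*` names and the `fail_late` switch are hypothetical and exist only to exercise the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_fs_root {
    int anon_dev;       /* 0 means "no id allocated" */
    int anon_dev_frees; /* counts release calls, for illustration only */
};

/* Stand-in for releasing the anonymous device id. */
static void demo_free_anon_dev(struct demo_fs_root *root)
{
    root->anon_dev_frees++;
}

/* Allocates the id; on failure it returns directly and releases nothing. */
static int demo_init_fs_root(struct demo_fs_root *root, int fail_late)
{
    root->anon_dev = 42;
    if (fail_late)
        return -ENOMEM; /* the caller owns the cleanup */
    return 0;
}

/* The single place where the id is released. */
static void demo_free_fs_root(struct demo_fs_root *root)
{
    if (root->anon_dev)
        demo_free_anon_dev(root);
}

static int demo_get_fs_root(struct demo_fs_root *root, int fail_late)
{
    int ret;

    assert(root != NULL);
    ret = demo_init_fs_root(root, fail_late);
    if (ret)
        demo_free_fs_root(root);
    return ret;
}
```

If `demo_init_fs_root` also released the id on its error path, the counter would reach 2, which is the userspace analogue of the `ida_remove` warning above.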
-
Committed by Liu Bo
With btrfs-corrupt-block, one can set the fields of a btree node/leaf. If we assign a negative value to such a field, we can get various hangs, e.g. if extent_root's nritems is -2ULL, then we get stuck in btrfs_read_block_groups() because it has a while loop and btrfs_search_slot() on extent_root will always return the first child.

This lets us know what's happening and returns -EINVAL to callers instead of returning the first item.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
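One way to picture the check is a binary search that rejects inconsistent bounds up front rather than silently answering "slot 0". The sketch below is an assumption about the general shape, not the kernel code; `demo_bin_search` and its return convention are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Search a sorted key array.
 * Returns 0 if `key` is found (*slot is its index),
 * 1 if it is absent (*slot is where it would be inserted),
 * -EINVAL if the item count is nonsensical, e.g. a corrupted
 * nritems that turns negative when interpreted as an int.
 */
static int demo_bin_search(const unsigned long long *keys, int nritems,
                           unsigned long long key, int *slot)
{
    int low = 0;
    int high = nritems;

    assert(slot != NULL);

    if (low > high)
        return -EINVAL; /* refuse instead of reporting the first item */

    while (low < high) {
        int mid = low + (high - low) / 2;

        if (keys[mid] < key) {
            low = mid + 1;
        } else if (keys[mid] > key) {
            high = mid;
        } else {
            *slot = mid;
            return 0;
        }
    }
    *slot = low;
    return 1;
}
```

Without the bounds check the loop body never runs for a negative count, so the function would report slot 0 as the insertion point every time, which is how a caller's while loop can spin forever on the first child.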
-
Committed by Liu Bo
With btrfs-corrupt-block, one can drop one chunk item, and mounting will end up with a panic in btrfs_full_stripe_len().

This doesn't remove the BUG_ON, but instead checks for the problem a bit earlier, when we find the block group item.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Wang Xiaoguang
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
- 21 Jul, 2016 2 commits
-
-
Committed by Chris Mason
Commit 56244ef1 was almost but not quite enough to fix the reservation math after btrfs_copy_from_user returned partial copies.

Some users are still seeing warnings in btrfs_destroy_inode, and with a long enough test run I'm able to trigger them as well.

This patch fixes the accounting math again, bringing it much closer to the way it was before the sectorsize conversion Chandan did. The problem is accounting for the offset into the page/sector when we do a partial copy. This one just uses the dirty_sectors variable, which should already be updated properly.

Signed-off-by: Chris Mason <clm@fb.com>
cc: stable@vger.kernel.org # v4.6+
-
Committed by Josef Bacik
The new enospc code makes it possible to deadlock if we don't use FLUSH_LIMIT during reservations inside a transaction. This enforces the correct flush type to avoid both deadlocks and assertions.

Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
-
- 08 Jul, 2016 17 commits
-
-
Committed by Josef Bacik
We used to allow you to set FLUSH_ALL and then just wouldn't do things like commit transactions or wait on ordered extents if we noticed you were in a transaction. However, now that all the flushing for FLUSH_ALL is asynchronous, we've lost the ability to tell, and we could end up deadlocking. So instead use FLUSH_LIMIT in reserve_metadata_bytes in relocation and then return -EAGAIN if we error out, to preserve the previous behavior. I've also added an ASSERT() to catch anybody else who tries to do this. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
Since we set the reloc control before we've reserved our space for relocation, we could race with a root being dirtied and not actually have space to do our init reloc root. So once we've allocated it and set it up, go ahead and make our reservation before setting the relocate control; that way anybody who tries to do the reloc root init has space to use. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
This is the case all the time anyway, except for relocation, which could be doing a reloc root for a non-ref-counted root, in which case we'd end up with some random block rsv rather than the one we have our reservation in. If there isn't enough space in the block rsv we are trying to steal from, we'll BUG() because we expect there to be space for the orphan to make its reservation. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
Traditionally we've calculated the global block rsv by guessing how much of the metadata used amount was the extent tree, and then taking the data size and figuring out how large the csum tree would have to be to hold that much data. This is imprecise and falls down on MIXED file systems, as we can't trust the data used amount. This resulted in failures for xfstests generic/333 because it creates lots of clones, which explodes out the extent tree. Our global reserve calculations were woefully inaccurate in this case, which meant we got into a situation where we did not have enough reserved to do our work.

We know we only use the global block rsv for the extent, csum, and root trees, so just get the bytes used for these trees and use that as the basis of our global reserve. Since these are not reference counted trees, the bytes_used value will be accurate. This fixed the transaction aborts seen with generic/333. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
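The new basis is just a sum over three trees. A minimal sketch follows, assuming a hypothetical `demo_tree` type with a `bytes_used` counter; any minimum or maximum clamping the kernel may apply afterwards is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a tree root that tracks its own usage. */
struct demo_tree {
    uint64_t bytes_used;
};

/* Basis for the global reserve: bytes used by the three trees it serves. */
static uint64_t demo_global_rsv_basis(const struct demo_tree *extent_tree,
                                      const struct demo_tree *csum_tree,
                                      const struct demo_tree *root_tree)
{
    assert(extent_tree != NULL && csum_tree != NULL && root_tree != NULL);

    return extent_tree->bytes_used + csum_tree->bytes_used +
           root_tree->bytes_used;
}
```

Because the inputs are measured rather than estimated from data usage, a clone-heavy workload that inflates the extent tree is reflected directly in the result.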
-
Committed by Josef Bacik
Instead of doing fs_info->fs_root in need_async_flush, which may not be set during recovery when mounting, just pass the root itself in, which makes more sense as that's what btrfs_calc_reclaim_metadata_size takes.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reported-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
We do this check when we start the async reclaimer thread, so we might as well check before we kick it off to save us some cycles. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
We were doing trace_btrfs_release_reserved_extent() in pin_down_extent, which isn't quite right because we will go through and free that extent later when we unpin, so it messes up apps that are accounting for the reservation space. We were also unconditionally doing it in __btrfs_free_reserved_extent(), when we only actually free the reservation instead of pinning the extent. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
We want to track when we're triggering flushing from our reservation code and what flushing is being done when we start flushing. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
We can sometimes drop the reservation we had for our inode, so we need to remove that amount from to_reserve so that our tracepoint reports a valid amount of space.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
Pinned extents are an important metric to keep track of for enospc.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
Our enospc flushing sucks. It is born from a time where we were early enospc'ing constantly because multiple threads would race in for the same reservation and randomly starve other ones out. So I came up with this solution to block any other reservations from happening while one guy tried to flush stuff to satisfy his reservation. This gives us pretty good correctness, but completely crap latency.

The solution I've come up with is ticketed reservations. Basically we try to make our reservation, and if we can't, we put a ticket on a list in order and kick off an async flusher thread. This async flusher thread does the same old flushing we always did, just asynchronously. As space is freed and added back to the space_info, it checks and sees if we have any tickets that need satisfying, adds space to the tickets, and wakes up anything we've satisfied.

Once the flusher thread stops making progress, it wakes up all the current tickets and tells them to take a hike.

There is a priority list for things that can't flush, since the async flusher could do anything and we need to avoid deadlocks. These guys get priority for having their reservation made, and will still do manual flushing themselves in case the async flusher isn't running.

This patch gives us significantly better latencies. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
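The ticket idea can be sketched as a small FIFO in userspace C. This is a simplified model under several assumptions: there is no locking, no flusher thread, and no priority list, and every `demo_*` name is hypothetical. It only shows how freed space is handed to waiters in arrival order.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_TICKETS 8

struct demo_ticket {
    uint64_t bytes; /* bytes this waiter still needs */
    int satisfied;  /* set once the reservation is complete */
};

struct demo_space_info {
    uint64_t free_bytes;
    struct demo_ticket *queue[DEMO_MAX_TICKETS]; /* ring buffer, oldest at head */
    int head;
    int count;
};

/*
 * Try to reserve `bytes`.
 * Returns 0 if reserved immediately, 1 if a ticket was queued,
 * -ENOSPC if the queue itself is full.
 */
static int demo_reserve(struct demo_space_info *si, struct demo_ticket *t,
                        uint64_t bytes)
{
    assert(si != NULL && t != NULL);

    t->bytes = bytes;
    t->satisfied = 0;

    /* Only skip the queue when nobody is already waiting. */
    if (si->count == 0 && si->free_bytes >= bytes) {
        si->free_bytes -= bytes;
        t->bytes = 0;
        t->satisfied = 1;
        return 0;
    }
    if (si->count == DEMO_MAX_TICKETS)
        return -ENOSPC;

    si->queue[(si->head + si->count) % DEMO_MAX_TICKETS] = t;
    si->count++;
    return 1;
}

/* Called when flushing frees space: feed tickets strictly in order. */
static void demo_add_space(struct demo_space_info *si, uint64_t bytes)
{
    assert(si != NULL);

    si->free_bytes += bytes;
    while (si->count > 0) {
        struct demo_ticket *t = si->queue[si->head];
        uint64_t give = t->bytes < si->free_bytes ? t->bytes : si->free_bytes;

        t->bytes -= give;
        si->free_bytes -= give;
        if (t->bytes > 0)
            break; /* oldest ticket still short: later ones keep waiting */

        t->satisfied = 1; /* the real code would wake the waiter here */
        si->head = (si->head + 1) % DEMO_MAX_TICKETS;
        si->count--;
    }
}
```

Serving the head ticket first is what prevents the starvation the message describes: a small late request cannot jump ahead of a large early one.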
-
Committed by Josef Bacik
I'm writing a tool to visualize the enospc system inside btrfs, and I need this tracepoint in order to keep track of the block groups in the system. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
These were hidden behind enospc_debug, which isn't helpful as they indicate actual bugs, unlike the rest of the enospc_debug stuff, which is really debug information. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
We reserve space for the inode update when we first reserve space for writing to a file. However, there are lots of ways that we can use this reservation and not have it for subsequent ordered extents. Previously we'd fall through and try to reserve metadata bytes for this, then we'd just steal the full reservation from the delalloc_block_rsv, and if that didn't have enough space we'd steal the full reservation from the global reserve. The problem with this is that we can easily just return ENOSPC and fall back to updating the inode item directly. In the worst case (assuming 4k nodesize) we'd steal 64kib from the global reserve if we fall all the way through; however, if we just fall back and update the inode directly, we'd only steal 4k * BTRFS_PATH_MAX in the worst case, which is 32kib. We would have also just added the extent item for the inode, so we likely will have already cow'ed down most of the way to the leaf containing the inode item, so more often than not we only need one or two nodesizes' worth of reservations. Given that the reservation for the extent itself is also a worst case, we will likely already have space to cover the inode update.

This change will make us behave better in the theoretical worst case, and much better in the case that we don't have our reservation and cannot reserve more metadata. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
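The 64kib and 32kib figures can be reproduced with a short calculation. The constants below are assumptions chosen so the arithmetic matches the message: a maximum tree depth of 8 (the value the message calls BTRFS_PATH_MAX) and a factor of 2 in the full per-item reservation. The `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed maximum tree depth, so that 4k * depth gives the quoted 32kib. */
enum { DEMO_MAX_LEVEL = 8 };

/* Full worst-case reservation per item: assumed to be twice one full path. */
static uint64_t demo_full_reservation(uint64_t nodesize, unsigned int items)
{
    assert(nodesize != 0);
    return nodesize * DEMO_MAX_LEVEL * 2 * items;
}

/* Worst case when updating the inode item directly: one full path. */
static uint64_t demo_direct_update_worst_case(uint64_t nodesize)
{
    assert(nodesize != 0);
    return nodesize * DEMO_MAX_LEVEL;
}
```

With a 4096-byte nodesize these give 65536 and 32768 bytes, so the fallback path takes at most half as much from the global reserve, and usually far less since most of the path has already been CoW'd.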
-
Committed by Josef Bacik
There are a few races in the metadata reservation stuff. First, we add the bytes to the block_rsv well after we've set the bit on the inode saying that we have space for it and after we've reserved the bytes. So use the normal btrfs_block_rsv_add helper for this case. Second, we can flush delalloc extents when we try to reserve space for our write, which means that we could have used up the space for the inode and we wouldn't know, because we only check before the reservation. So instead make sure we are always reserving space for the inode update, and then if we don't need it, release those bytes afterward. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
So btrfs_block_rsv_migrate just unconditionally calls block_rsv_migrate_bytes. Not only this, but it unconditionally changes the size of the block_rsv. This isn't a bug strictly speaking, but it makes truncate block rsv's look funny because every time we migrate bytes over, its size grows, even though we only want it to be a specific size. So collapse this into one function that takes an update_size argument and make truncate and evict not update the size, for consistency's sake. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
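A collapsed migrate helper with an update_size flag might look roughly like this. It is a sketch under assumptions: `demo_block_rsv` has only the two counters needed here, and `demo_block_rsv_migrate` is a hypothetical name rather than the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced block reservation: target size and bytes held. */
struct demo_block_rsv {
    uint64_t size;     /* how much this rsv is meant to hold */
    uint64_t reserved; /* how much it currently holds */
};

/*
 * Move `bytes` of reservation from src to dst.
 * When update_size is zero, dst keeps its fixed target size
 * (the behaviour wanted for truncate and evict).
 */
static int demo_block_rsv_migrate(struct demo_block_rsv *src,
                                  struct demo_block_rsv *dst,
                                  uint64_t bytes, int update_size)
{
    assert(src != NULL && dst != NULL);

    if (src->reserved < bytes)
        return -ENOSPC;

    src->reserved -= bytes;
    dst->reserved += bytes;
    if (update_size)
        dst->size += bytes;
    return 0;
}
```

Passing `update_size = 0` keeps a fixed-size rsv at its intended size however many times bytes are migrated into it.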
-
Committed by Josef Bacik
For some reason we're adding bytes_readonly to the space info after we update the space info with the block group info. This creates a tiny race where we could over-reserve space because we haven't yet taken out the bytes_readonly bit. Since we already know this information at the time we call update_space_info, just pass it along so it can be updated all at once. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
- 25 Jun, 2016 1 commit
-
-
Committed by Omar Sandoval
Commit fe742fd4 ("Revert "btrfs: switch to ->iterate_shared()"") backed out the conversion to ->iterate_shared() for Btrfs because the delayed inode handling in btrfs_real_readdir() is racy. However, we can still do readdir in parallel if there are no delayed nodes.

This is a temporary fix which upgrades the shared inode lock to an exclusive lock only when we have delayed items, until we come up with a more complete solution. While we're here, rename the btrfs_{get,put}_delayed_items functions to make it very clear that they're just for readdir.

Tested with xfstests and by doing a parallel kernel build:

while make tinyconfig && make -j4 && git clean dqfx; do
        :
done

along with a bunch of parallel finds in another shell:

while true; do
        for ((i=0; i<4; i++)); do
                find . >/dev/null &
        done
        wait
done

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
- 24 Jun, 2016 4 commits
-
-
Committed by Chandan Rajendra
Btrfs code currently assumes stripesize to be the same as sectorsize. However, Btrfs-progs (until commit df05c7ed455f519e6e15e46196392e4757257305) has been setting btrfs_super_block->stripesize to a value of 4096.

This commit makes sure that the value of btrfs_super_block->stripesize is a power of 2. Later, it unconditionally sets btrfs_root->stripesize to sectorsize.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
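The two steps described here (accept any power-of-2 stripesize from disk, then ignore it in favour of sectorsize at runtime) can be sketched as below. The `demo_*` helpers are hypothetical and only mirror the logic stated in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* A power of 2 has exactly one bit set; zero is excluded explicitly. */
static bool demo_is_power_of_2(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Superblock validation step: any power of 2 is accepted. */
static int demo_validate_stripesize(uint32_t stripesize)
{
    return demo_is_power_of_2(stripesize) ? 0 : -EINVAL;
}

/* Runtime value: the on-disk stripesize is ignored, sectorsize is used. */
static uint32_t demo_runtime_stripesize(uint32_t on_disk_stripesize,
                                        uint32_t sectorsize)
{
    (void)on_disk_stripesize; /* intentionally unused */
    assert(demo_is_power_of_2(sectorsize));
    return sectorsize;
}
```

This keeps filesystems created by older mkfs.btrfs mountable on machines with larger pages, while the code that assumes stripesize equals sectorsize continues to hold.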
-
Committed by Wang Xiaoguang
When doing a truncate operation, btrfs_setsize() first calls truncate_setsize() to set the new inode->i_size, but if btrfs_truncate() later fails, btrfs_setsize() calls "i_size_write(inode, BTRFS_I(inode)->disk_i_size)" to reset the in-memory inode size, and this is where the bug occurs. That's because in the truncate case btrfs_ordered_update_i_size() directly uses inode->i_size to update BTRFS_I(inode)->disk_i_size, whereas we should use the "offset" argument to update disk_i_size. Here is the call graph:

==>btrfs_truncate()
====>btrfs_truncate_inode_items()
======>btrfs_ordered_update_i_size(inode, last_size, NULL);

Here btrfs_ordered_update_i_size()'s offset argument is last_size.

The following test case reveals this bug:

dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=100
dev=$(losetup --show -f fs.img)
mkdir -p /mnt/mntpoint
mkfs.btrfs -f $dev
mount $dev /mnt/mntpoint
cd /mnt/mntpoint

echo "workdir is: /mnt/mntpoint"
blocksize=$((128 * 1024))
dd if=/dev/zero of=testfile bs=$blocksize count=1
sync
count=$((17*1024*1024*1024/blocksize))
echo "file size is:" $((count*blocksize))
for ((i = 1; i <= $count; i++)); do
	i=$((i + 1))
	dst_offset=$((blocksize * i))
	xfs_io -f -c "reflink testfile 0 $dst_offset $blocksize"\
		testfile > /dev/null
done
sync

truncate --size 0 testfile
ls -l testfile
du -sh testfile
exit

In this case, the truncate operation fails because of ENOSPC and "du -sh testfile" reports a value greater than 0, yet testfile's size is 0. We need to reflect the correct inode->i_size.

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Liu Bo
map_private_extent_buffer() can return -EINVAL in two different cases:
1. when the requested contents span two pages if nodesize is larger than pagesize,
2. when it detects something insane.

The 2nd one used to be only a WARN_ON(1), and we decided to return an error to callers, but we didn't fix up all its callers, which is what this patch addresses.

Without this, btrfs may end up with a 'general protection' fault, i.e. reading invalid memory.

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Wei Yongjun
Fix to return a negative error code from the kern_mount() error handling case instead of 0 (ret is set to 0 by register_filesystem), as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
- 23 Jun, 2016 3 commits
-
-
Committed by Josef Bacik
Before we write into prealloc/nocow space, we have to make sure that there are no references to the extents we are writing into, which means checking the extent tree and csum tree in the case of nocow. So we don't want to do the nocow dance unless we can't reserve data space, since it's a serious drag on performance. With the following sequence

fallocate -l10737418240 /mnt/btrfs-test/file
cp --reflink /mnt/btrfs-test/file /mnt/btrfs-test/link
fio --name=randwrite --rw=randwrite --bs=4k --filename=/mnt/btrfs-test/file \
	--end_fsync=1

we get the worst case scenario where we have to fall back on to doing the check anyway.

Without this patch
lat (usec): min=5, max=111598, avg=27.65, stdev=124.51
write: io=10240MB, bw=126876KB/s, iops=31718, runt= 82646msec

With this patch
lat (usec): min=3, max=91210, avg=14.09, stdev=110.62
write: io=10240MB, bw=212753KB/s, iops=53188, runt= 49286msec

We get roughly 1.7 times the throughput, about 60% of the runtime, and about half of the average latency. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
[ PAGE_CACHE_ removal related fixups ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Chris Mason
"Btrfs: track transid for delayed ref flushing" was deadlocking on btrfs_attach_transaction because it's not safe to call from the async delayed ref start code. This commit brings back btrfs_join_transaction instead and checks for a blocked commit.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Josef Bacik
Using the offwakecputime bpf script, I noticed most of our time was spent waiting on the delayed ref throttling. This is what is supposed to happen, but sometimes the transaction can commit and then we're waiting for throttling that doesn't matter anymore. So change this stuff to be a little smarter by tracking the transid we were in when we initiated the throttling. If the transaction we get is different, then we can just bail out. This resulted in a 50% speedup in my fs_mark test, and reduced the amount of time spent throttling by 60 seconds over the entire run (which is about 30 minutes). Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
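The bail-out can be pictured as comparing the transid recorded when throttling was requested with the one that is running now. The following is a hypothetical userspace sketch; `demo_fs` and `demo_throttle` are invented names, and the real code joins a transaction rather than reading a plain field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced filesystem state. */
struct demo_fs {
    uint64_t running_transid; /* id of the transaction currently running */
    uint64_t pending_refs;    /* delayed refs waiting to be processed */
};

/*
 * Process up to `want` delayed refs on behalf of a throttled task.
 * `start_transid` is the transaction the task was in when it asked.
 * Returns the number of refs processed.
 */
static uint64_t demo_throttle(struct demo_fs *fs, uint64_t start_transid,
                              uint64_t want)
{
    uint64_t done;

    assert(fs != NULL);

    /* That transaction already committed: the work no longer matters. */
    if (fs->running_transid != start_transid)
        return 0;

    done = want < fs->pending_refs ? want : fs->pending_refs;
    fs->pending_refs -= done;
    return done;
}
```

The comparison costs almost nothing and lets a task skip waiting entirely once its transaction has been committed by someone else.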
-
- 18 Jun, 2016 6 commits
-
-
Committed by Chandan Rajendra
Older btrfs-progs/mkfs.btrfs sets 4096 as the stripesize. Hence restricting stripesize to be equal to sectorsize would cause super block validation to return an error on architectures where PAGE_SIZE is not equal to 4096.

As a workaround, this commit allows stripesize to be set to 4096 bytes.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by David Sterba
Introduced in 2c1984f2 ("btrfs: build fixup for qgroup_account_snapshot") as a temporary bisectability build fixup.

Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by David Sterba
We've renamed btrfs_std_error; this one is left over from the last merge.

Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Zygo Blaxell
This fixes a problem introduced in commit 2f3165ec "btrfs: don't force mounts to wait for cleaner_kthread to delete one or more subvolumes".

open_ctree eventually calls btrfs_replay_log, which in turn calls btrfs_commit_super, which tries to lock the cleaner_mutex, causing a recursive mutex deadlock during mount.

Instead of playing whack-a-mole trying to keep up with all the functions that may want to lock cleaner_mutex, put all the cleaner_mutex lockers back where they were, and attack the problem more directly: keep cleaner_kthread asleep until the filesystem is mounted.

When filesystems are mounted read-only and later remounted read-write, open_ctree did not set fs_info->open and neither does anything else. Set this flag in btrfs_remount so that neither btrfs_delete_unused_bgs nor cleaner_kthread get confused by the common case of a "/" filesystem read-only mount followed by a read-write remount.

Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Josef Bacik
This is just a screwup for developers, so change it to an ASSERT() so developers notice when things go wrong, and deal with the error appropriately if ASSERT() isn't enabled. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Jeff Mahoney
The test for !trans->blocks_used in btrfs_abort_transaction is insufficient to determine whether it's safe to drop the transaction handle on the floor. btrfs_cow_block, informed by should_cow_block, can return blocks that have already been CoW'd in the current transaction. trans->blocks_used is only incremented for new block allocations. If an operation overlaps the blocks in the current transaction entirely and must abort the transaction, we'll happily let it clean up the trans handle even though it may have modified the blocks and will commit an incomplete operation.

In the long term, I'd like to do closer tracking of when the fs is actually modified so we can still recover as gracefully as possible, but that approach will need some discussion. In the short term, since this is the only code using trans->blocks_used, let's just switch it to a bool indicating whether any blocks were used, and set it when should_cow_block returns false.

Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
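To see why a count of new allocations is not enough, the sketch below keeps both the old counter and a boolean flag side by side. Everything here is hypothetical (`demo_trans`, `demo_cow_block`, and the two `demo_safe_to_drop_*` predicates); it models the reasoning only, not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_trans {
    unsigned long blocks_used; /* old scheme: counts new allocations only */
    bool dirty;                /* new scheme: any block touched at all */
};

/*
 * Model one CoW request. If the block was already CoW'd in this
 * transaction it is reused in place, so nothing new is allocated,
 * but the transaction has still modified the filesystem.
 */
static void demo_cow_block(struct demo_trans *trans, bool already_cowed)
{
    assert(trans != NULL);

    if (!already_cowed)
        trans->blocks_used++;
    trans->dirty = true;
}

/* Old test: misses in-place reuse of already-CoW'd blocks. */
static bool demo_safe_to_drop_old(const struct demo_trans *trans)
{
    return trans->blocks_used == 0;
}

/* New test: safe only if nothing was touched. */
static bool demo_safe_to_drop_new(const struct demo_trans *trans)
{
    return !trans->dirty;
}
```

After a single reuse of an already-CoW'd block, the old predicate still reports the handle as safe to drop while the flag-based one does not, which is the gap the message describes.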
-