- 30 11月, 2016 2 次提交
-
-
由 David Sterba 提交于
Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Omar Sandoval 提交于
My QEMU VM was seeing inexplicable I/O errors that I tracked down to errors coming from the qcow2 virtual drive in the host system. The qcow2 file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT. Every once in awhile, pread() or pwrite() would return EEXIST, which makes no sense. This turned out to be a bug in btrfs_get_extent(). Commit 8dff9c85 ("Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where two threads race on adding the same extent map to an inode's extent map tree. However, if the added em is merged with an adjacent em in the extent tree, then we'll end up with an existing extent that is not identical to but instead encompasses the extent we tried to add. When we call merge_extent_mapping() to find the nonoverlapping part of the new em, the arithmetic overflows because there is no such thing. We then end up trying to add a bogus em to the em_tree, which results in a EEXIST that can bubble all the way up to userspace. Fix it by extending the identical extent map special case. Signed-off-by: NOmar Sandoval <osandov@fb.com> Reviewed-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 29 11月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
btrfs_map_block supports different types of mappings, which to a large extent resemble block layer operations. But they don't always do, and currently btrfs dangerously overlays it's own flag over the block layer flags. This is just asking for a conflict, so introduce a different map flags enum inside of btrfs instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 25 10月, 2016 2 次提交
-
-
由 Wang Xiaoguang 提交于
In btrfs_truncate_inode_items()->btrfs_async_run_delayed_refs(), we swap the arg2 and arg3 wrongly, fix this. This bug just impacts asynchronous delayed refs handle when we truncate inodes. In delayed_ref_async_start(), there is such codes: trans = btrfs_join_transaction(async->root); if (trans->transid > async->transid) goto end; ret = btrfs_run_delayed_refs(trans, async->root, async->count); From this codes, we can see that this just influence whether can we handle delayed refs or the number of delayed refs to handle, this may impact performance, but will not result in missing delayed refs, all delayed refs will be handled in btrfs_commit_transaction(). Signed-off-by: NWang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: NHolger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Goldwyn Rodrigues 提交于
While free'ing qgroup->reserved resources, we much check if the page has not been invalidated by a truncate operation by checking if the page is still dirty before reducing the qgroup resources. Resources in such a case are free'd when the entire extent is released by delayed_ref. This fixes a double accounting while releasing resources in case of truncating a file, reproduced by the following testcase. SCRATCH_DEV=/dev/vdb SCRATCH_MNT=/mnt mkfs.btrfs -f $SCRATCH_DEV mount -t btrfs $SCRATCH_DEV $SCRATCH_MNT cd $SCRATCH_MNT btrfs quota enable $SCRATCH_MNT btrfs subvolume create a btrfs qgroup limit 500m a $SCRATCH_MNT sync for c in {1..15}; do dd if=/dev/zero bs=1M count=40 of=$SCRATCH_MNT/a/file; done sleep 10 sync sleep 5 touch $SCRATCH_MNT/a/newfile echo "Removing file" rm $SCRATCH_MNT/a/file Fixes: b9d0b389 ("btrfs: Add handler for invalidate page") Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 11 10月, 2016 1 次提交
-
-
由 Al Viro 提交于
looking for duplicate ->iov_base makes sense only for iovec-backed iterators; for kvec-backed ones it's pointless, for bvec-backed ones it's pointless and broken on 32bit (we walk through an array of struct bio_vec accessing them as if they were struct iovec; works by accident on 64bit, but on 32bit it'll blow up) and for pipe-backed ones it's pointless and ends up oopsing. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 08 10月, 2016 1 次提交
-
-
由 Andreas Gruenbacher 提交于
These inode operations are no longer used; remove them. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 28 9月, 2016 1 次提交
-
-
由 Deepa Dinamani 提交于
current_fs_time() uses struct super_block* as an argument. As per Linus's suggestion, this is changed to take struct inode* as a parameter instead. This is because the function is primarily meant for vfs inode timestamps. Also the function was renamed as per Arnd's suggestion. Change all calls to current_fs_time() to use the new current_time() function instead. current_fs_time() will be deleted. Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 27 9月, 2016 3 次提交
-
-
由 Miklos Szeredi 提交于
Generated patch: sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2` sed -i "s/\brename2\b/rename/g" `git grep -wl rename2` Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
由 Jeff Mahoney 提交于
For many printks, we want to know which file system issued the message. This patch converts most pr_* calls to use the btrfs_* versions instead. In some cases, this means adding plumbing to allow call sites access to an fs_info pointer. fs/btrfs/check-integrity.c is left alone for another day. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Jeff Mahoney 提交于
CodingStyle chapter 2: "[...] never break user-visible strings such as printk messages, because that breaks the ability to grep for them." This patch unsplits user-visible strings. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 26 9月, 2016 2 次提交
-
-
由 Josef Bacik 提交于
We have a lot of random ints in btrfs_fs_info that can be put into flags. This is mostly equivalent with the exception of how we deal with quota going on or off, now instead we set a flag when we are turning it on or off and deal with that appropriately, rather than just having a pending state that the current quota_enabled gets set to. Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Qu Wenruo 提交于
btrfs: extend btrfs_set_extent_delalloc and its friends to support in-band dedupe and subpage size patchset Extend btrfs_set_extent_delalloc() and extent_clear_unlock_delalloc() parameters for both in-band dedupe and subpage sector size patchset. This should reduce conflict of both patchset and the effort to rebase them. Cc: Chandan Rajendra <chandan@linux.vnet.ibm.com> Cc: David Sterba <dsterba@suse.cz> Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 22 9月, 2016 1 次提交
-
-
由 Jan Kara 提交于
inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 16 9月, 2016 1 次提交
-
-
由 Miklos Szeredi 提交于
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Cc: Chris Mason <clm@fb.com>
-
- 14 9月, 2016 1 次提交
-
-
由 Bart Van Assche 提交于
Introduce the bio_flags() macro. Ensure that the second argument of bio_set_op_attrs() only contains flags and no operation. This patch does not change any functionality. Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Chris Mason <clm@fb.com> (maintainer:BTRFS FILE SYSTEM) Cc: Josef Bacik <jbacik@fb.com> (maintainer:BTRFS FILE SYSTEM) Cc: Mike Snitzer <snitzer@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@hgst.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 25 8月, 2016 1 次提交
-
-
由 Wang Xiaoguang 提交于
This patch can fix some false ENOSPC errors, below test script can reproduce one false ENOSPC error: #!/bin/bash dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=128 dev=$(losetup --show -f fs.img) mkfs.btrfs -f -M $dev mkdir /tmp/mntpoint mount $dev /tmp/mntpoint cd /tmp/mntpoint xfs_io -f -c "falloc 0 $((64*1024*1024))" testfile Above script will fail for ENOSPC reason, but indeed fs still has free space to satisfy this request. Please see call graph: btrfs_fallocate() |-> btrfs_alloc_data_chunk_ondemand() | bytes_may_use += 64M |-> btrfs_prealloc_file_range() |-> btrfs_reserve_extent() |-> btrfs_add_reserved_bytes() | alloc_type is RESERVE_ALLOC_NO_ACCOUNT, so it does not | change bytes_may_use, and bytes_reserved += 64M. Now | bytes_may_use + bytes_reserved == 128M, which is greater | than btrfs_space_info's total_bytes, false enospc occurs. | Note, the bytes_may_use decrease operation will be done in | end of btrfs_fallocate(), which is too late. Here is another simple case for buffered write: CPU 1 | CPU 2 | |-> cow_file_range() |-> __btrfs_buffered_write() |-> btrfs_reserve_extent() | | | | | | | | | ..... | |-> btrfs_check_data_free_space() | | | | |-> extent_clear_unlock_delalloc() | In CPU 1, btrfs_reserve_extent()->find_free_extent()-> btrfs_add_reserved_bytes() do not decrease bytes_may_use, the decrease operation will be delayed to be done in extent_clear_unlock_delalloc(). Assume in this case, btrfs_reserve_extent() reserved 128MB data, CPU2's btrfs_check_data_free_space() tries to reserve 100MB data space. If 100MB > data_sinfo->total_bytes - data_sinfo->bytes_used - data_sinfo->bytes_reserved - data_sinfo->bytes_pinned - data_sinfo->bytes_readonly - data_sinfo->bytes_may_use btrfs_check_data_free_space() will try to allcate new data chunk or call btrfs_start_delalloc_roots(), or commit current transaction in order to reserve some free space, obviously a lot of work. But indeed it's not necessary as long as decreasing bytes_may_use timely, we still have free space, decreasing 128M from bytes_may_use. To fix this issue, this patch chooses to update bytes_may_use for both data and metadata in btrfs_add_reserved_bytes(). For compress path, real extent length may not be equal to file content length, so introduce a ram_bytes argument for btrfs_reserve_extent(), find_free_extent() and btrfs_add_reserved_bytes(), it's becasue bytes_may_use is increased by file content length. Then compress path can update bytes_may_use correctly. Also now we can discard RESERVE_ALLOC_NO_ACCOUNT, RESERVE_ALLOC and RESERVE_FREE. As we know, usually EXTENT_DO_ACCOUNTING is used for error path. In run_delalloc_nocow(), for inode marked as NODATACOW or extent marked as PREALLOC, we also need to update bytes_may_use, but can not pass EXTENT_DO_ACCOUNTING, because it also clears metadata reservation, so here we introduce EXTENT_CLEAR_DATA_RESV flag to indicate btrfs_clear_bit_hook() to update btrfs_space_info's bytes_may_use. Meanwhile __btrfs_prealloc_file_range() will call btrfs_free_reserved_data_space() internally for both sucessful and failed path, btrfs_prealloc_file_range()'s callers does not need to call btrfs_free_reserved_data_space() any more. Signed-off-by: NWang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 08 8月, 2016 1 次提交
-
-
由 Jens Axboe 提交于
Since commit 63a4cc24, bio->bi_rw contains flags in the lower portion and the op code in the higher portions. This means that old code that relies on manually setting bi_rw is most likely going to be broken. Instead of letting that brokeness linger, rename the member, to force old and out-of-tree code to break at compile time instead of at runtime. No intended functional changes in this commit. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 01 8月, 2016 2 次提交
-
-
由 Filipe Manana 提交于
With commit 56f23fdb ("Btrfs: fix file/data loss caused by fsync after rename and new inode") we got simple fix for a functional issue when the following sequence of actions is done: at transaction N create file A at directory D at transaction N + M (where M >= 1) move/rename existing file A from directory D to directory E create a new file named A at directory D fsync the new file power fail The solution was to simply detect such scenario and fallback to a full transaction commit when we detect it. However this turned out to had a significant impact on throughput (and a bit on latency too) for benchmarks using the dbench tool, which simulates real workloads from smbd (Samba) servers. For example on a test vm (with a debug kernel): Unpatched: Throughput 19.1572 MB/sec 32 clients 32 procs max_latency=1005.229 ms Patched: Throughput 23.7015 MB/sec 32 clients 32 procs max_latency=809.206 ms The patched results (this patch is applied) are similar to the results of a kernel with the commit 56f23fdb ("Btrfs: fix file/data loss caused by fsync after rename and new inode") reverted. This change avoids the fallback to a transaction commit and instead makes sure all the names of the conflicting inode (the one that had a name in a past transaction that matches the name of the new file in the same parent directory) are logged so that at log replay time we don't lose neither the new file nor the old file, and the old file gets the name it was renamed to. This also ends up avoiding a full transaction commit for a similar case that involves an unlink instead of a rename of the old file: at transaction N create file A at directory D at transaction N + M (where M >= 1) remove file A create a new file named A at directory D fsync the new file power fail Signed-off-by: NFilipe Manana <fdmanana@suse.com>
-
由 Filipe Manana 提交于
When we attempt to read an inode from disk, we end up always returning an -ESTALE error to the caller regardless of the actual failure reason, which can be an out of memory problem (when allocating a path), some error found when reading from the fs/subvolume btree (like a genuine IO error) or the inode does not exists. So lets start returning the real error code to the callers so that they don't treat all -ESTALE errors as meaning that the inode does not exists (such as during orphan cleanup). This will also be needed for a subsequent patch in the same series dealing with a special fsync case. Signed-off-by: NFilipe Manana <fdmanana@suse.com>
-
- 26 7月, 2016 8 次提交
-
-
由 Jeff Mahoney 提交于
__btrfs_abort_transaction doesn't use its root parameter except to obtain an fs_info pointer. We can obtain that from trans->root->fs_info for now and from trans->fs_info in a later patch. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Jeff Mahoney 提交于
Now that we have a dummy fs_info associated with each test that uses a root, we don't need the DUMMY_ROOT bit anymore. This lets us make choices without needing an actual root like in e.g. btrfs_find_create_tree_block. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Jeff Mahoney 提交于
btrfs_test_opt and friends only use the root pointer to access the fs_info. Let's pass the fs_info directly in preparation to eliminate similar patterns all over btrfs. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Wang Xiaoguang 提交于
Extract cow_file_range() new parameters for both in-band dedupe and subpage sector size patchset. This should make conflict of both patchset to minimal, and reduce the effort needed to rebase them. Cc: Chandan Rajendra <chandan@linux.vnet.ibm.com> Cc: David Sterba <dsterba@suse.cz> Signed-off-by: NWang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Ashish Samant 提交于
Remove unnecessary checks in compress_file_range(). Signed-off-by: NAshish Samant <ashish.samant@oracle.com> [ minor coding style fixups ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Liu Bo 提交于
One can use btrfs-corrupt-block to hit BUG_ON() in merge_bio(), thus this aims to stop anyone to panic the whole system by using their btrfs. Since the error in merge_bio can only come from __btrfs_map_block() when chunk tree mapping has something insane and __btrfs_map_block() has already had printed the reason, we can just return errors in merge_bio. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
BTRFS is using a variety of slab caches to satisfy internal needs. Those slab caches are always allocated with the SLAB_RECLAIM_ACCOUNT, meaning allocations from the caches are going to be accounted as SReclaimable. At the same time btrfs is not registering any shrinkers whatsoever, thus preventing memory from the slabs to be shrunk. This means those caches are not in fact reclaimable. To fix this remove the SLAB_RECLAIM_ACCOUNT on all caches apart from the inode cache, since this one is being freed by the generic VFS super_block shrinker. Also set the transaction related caches as SLAB_TEMPORARY, to better document the lifetime of the objects (it just translates to SLAB_RECLAIM_ACCOUNT). Signed-off-by: NNikolay Borisov <n.borisov.lkml@gmail.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
The code flow in btrfs_new_inode allows for btrfs_evict_inode to be called with not fully initialised inode (e.g. ->root member not being set). This can happen when btrfs_set_inode_index in btrfs_new_inode fails, which in turn would call iput for the newly allocated inode. This in turn leads to vfs calling into btrfs_evict_inode. This leads to null pointer dereference. To handle this situation check whether the passed inode has root set and just free it in case it doesn't. Signed-off-by: NNikolay Borisov <kernel@kyup.com> Reviewed-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 08 7月, 2016 1 次提交
-
-
由 Josef Bacik 提交于
So btrfs_block_rsv_migrate just unconditionally calls block_rsv_migrate_bytes. Not only this but it unconditionally changes the size of the block_rsv. This isn't a bug strictly speaking, but it makes truncate block rsv's look funny because every time we migrate bytes over its size grows, even though we only want it to be a specific size. So collapse this into one function that takes an update_size argument and make truncate and evict not update the size for consistency sake. Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 25 6月, 2016 1 次提交
-
-
由 Omar Sandoval 提交于
Commit fe742fd4 ("Revert "btrfs: switch to ->iterate_shared()"") backed out the conversion to ->iterate_shared() for Btrfs because the delayed inode handling in btrfs_real_readdir() is racy. However, we can still do readdir in parallel if there are no delayed nodes. This is a temporary fix which upgrades the shared inode lock to an exclusive lock only when we have delayed items until we come up with a more complete solution. While we're here, rename the btrfs_{get,put}_delayed_items functions to make it very clear that they're just for readdir. Tested with xfstests and by doing a parallel kernel build: while make tinyconfig && make -j4 && git clean dqfx; do : done along with a bunch of parallel finds in another shell: while true; do for ((i=0; i<4; i++)); do find . >/dev/null & done wait done Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 23 6月, 2016 1 次提交
-
-
由 Josef Bacik 提交于
Using the offwakecputime bpf script I noticed most of our time was spent waiting on the delayed ref throttling. This is what is supposed to happen, but sometimes the transaction can commit and then we're waiting for throttling that doesn't matter anymore. So change this stuff to be a little smarter by tracking the transid we were in when we initiated the throttling. If the transaction we get is different then we can just bail out. This resulted in a 50% speedup in my fs_mark test, and reduced the amount of time spent throttling by 60 seconds over the entire run (which is about 30 minutes). Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 18 6月, 2016 1 次提交
-
-
由 Josef Bacik 提交于
This is just a screwup for developers, so change it to an ASSERT() so developers notice when things go wrong and deal with the error appropriately if ASSERT() isn't enabled. Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 08 6月, 2016 5 次提交
-
-
由 Mike Christie 提交于
We don't need bi_rw to be so large on 64 bit archs, so reduce it to unsigned int. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
The bio REQ_OP and bi_rw rq_flag_bits are now always setup, so there is no need to pass around the rq_flag_bits bits too. btrfs users should should access the bio insead. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
We no longer pass in a bitmap of rq_flag_bits bits to __btrfs_map_block. It will always be a REQ_OP, or the btrfs specific REQ_GET_READ_MIRRORS, so this drops the bit tests. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This should be the easier cases to convert btrfs to bio_set_op_attrs/bio_op. They are mostly just cut and replace type of changes. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This patch has the dio code use a REQ_OP for the op and rq_flag_bits for bi_rw flags. To set/get the op it uses the bio_set_op_attrs/bio_op accssors. It also begins to convert btrfs's dio_submit_t because of the dio submit_io callout use. The next patches will completely convert this code and the reset of the btrfs code paths. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 04 6月, 2016 1 次提交
-
-
由 Chris Mason 提交于
When dealing with inline extents, btrfs_get_extent will incorrectly try to insert a duplicate extent_map. The dup hits -EEXIST from add_extent_map, but then we try to merge with the existing one and end up trying to insert a zero length extent_map. This actually works most of the time, except when there are extent maps past the end of the inline extent. rocksdb will trigger this sometimes because it preallocates an extent and then truncates down. Josef made a script to trigger with xfs_io: #!/bin/bash xfs_io -f -c "pwrite 0 1000" inline xfs_io -c "falloc -k 4k 1M" inline xfs_io -c "pread 0 1000" -c "fadvise -d 0 1000" -c "pread 0 1000" inline xfs_io -c "fadvise -d 0 1000" inline cat inline You'll get EIOs trying to read inline after this because add_extent_map is returning EEXIST Signed-off-by: NChris Mason <clm@fb.com>
-
- 26 5月, 2016 1 次提交
-
-
由 Nicholas D Steeves 提交于
Signed-off-by: NNicholas D Steeves <nsteeves@gmail.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 19 5月, 2016 1 次提交
-
-
由 Al Viro 提交于
This reverts commit 972b241f. Quoth Chris: didn't take the delayed inode stuff into account it got an rbtree of items and it pulls things out so in shared mode, its hugely racey sorry, lets revert and fix it for real inside of btrfs Signed-off-by: NChris Mason <clm@fb.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-