- 24 Jun, 2016 1 commit
-
-
Committed by Liu Bo
map_private_extent_buffer() can return -EINVAL in two different cases: 1. when the requested contents span two pages if nodesize is larger than pagesize, 2. when it detects something insane. The second case used to be only a WARN_ON(1), and we decided to return an error to callers, but we didn't fix up all its callers, which this patch addresses. Without this, btrfs may end up with a 'general protection' fault, i.e. reading invalid memory. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
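The caller-side fix can be sketched in userspace C. This is only an illustration under assumptions: `struct buf`, `map_range()` and `read_u32_at()` are hypothetical stand-ins, not the kernel functions, and the page size is fixed at 4096. The point is that both -EINVAL cases are propagated and the mapped pointer is never used after a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096u  /* assumed page size for this sketch */

/* Hypothetical stand-in for a buffer backed by separately mapped pages. */
struct buf {
    uint8_t *pages[4];
    size_t len;            /* total length in bytes */
};

/*
 * Hypothetical analogue of a "map" helper: yields a pointer to `len` bytes
 * at `start`, or -EINVAL if the range is out of bounds or spans two pages.
 */
static int map_range(const struct buf *b, size_t start, size_t len,
                     const uint8_t **out)
{
    if (len == 0 || start > b->len || len > b->len - start)
        return -EINVAL;                       /* insane request */
    if (start / PAGE_SZ != (start + len - 1) / PAGE_SZ)
        return -EINVAL;                       /* crosses a page boundary */
    *out = b->pages[start / PAGE_SZ] + start % PAGE_SZ;
    return 0;
}

/* Caller that checks the result instead of dereferencing blindly. */
static int read_u32_at(const struct buf *b, size_t start, uint32_t *val)
{
    const uint8_t *p;
    int ret = map_range(b, start, sizeof(*val), &p);

    if (ret)
        return ret;        /* propagate the error, never touch p */
    memcpy(val, p, sizeof(*val));
    return 0;
}
```

A caller that ignored `ret` here would read through an uninitialized pointer, which is the userspace equivalent of the invalid memory access the message describes.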
-
- 18 Jun, 2016 2 commits
-
-
Committed by Jeff Mahoney
The test for !trans->blocks_used in btrfs_abort_transaction is insufficient to determine whether it's safe to drop the transaction handle on the floor. btrfs_cow_block, informed by should_cow_block, can return blocks that have already been CoW'd in the current transaction. trans->blocks_used is only incremented for new block allocations. If an operation overlaps the blocks in the current transaction entirely and must abort the transaction, we'll happily let it clean up the trans handle even though it may have modified the blocks and will commit an incomplete operation. In the long term, I'd like to do closer tracking of when the fs is actually modified so we can still recover as gracefully as possible, but that approach will need some discussion. In the short term, since this is the only code using trans->blocks_used, let's just switch it to a bool indicating whether any blocks were used and set it when should_cow_block returns false. Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
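The difference between the counter and the flag can be shown with a small model. All names below (`struct trans`, `struct block`, `cow_block()`, the `safe_to_drop_*` helpers) are hypothetical simplifications, and the assumption is that the flag is set both for new allocations and for blocks reused from the current transaction.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of a transaction handle. */
struct trans {
    unsigned long blocks_used;  /* old scheme: counts new allocations only */
    bool dirty;                 /* new scheme: any block may have been modified */
};

/* Simplified model of a tree block. */
struct block {
    bool cowed_in_this_trans;
};

/* Model of the CoW step: reuse an already-CoW'd block or allocate a new one. */
static void cow_block(struct trans *t, struct block *b)
{
    if (b->cowed_in_this_trans) {
        /* No allocation happens, but the caller may still modify the block. */
        t->dirty = true;
        return;
    }
    b->cowed_in_this_trans = true;
    t->blocks_used++;
    t->dirty = true;
}

/* Old check: misses operations that only touched reused blocks. */
static bool safe_to_drop_old(const struct trans *t)
{
    return t->blocks_used == 0;
}

/* New check: false as soon as any block was handed out for modification. */
static bool safe_to_drop_new(const struct trans *t)
{
    return !t->dirty;
}
```

For an operation that only touches a block already CoW'd in this transaction, `safe_to_drop_old()` still reports true while `safe_to_drop_new()` reports false, which is the gap the message describes.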
-
Committed by Liu Bo
Thanks to fuzz testing, we found that an invalid bytenr can be passed to an extent buffer via alloc_extent_buffer(). An unaligned eb can have more pages than it should have, which ends up leaking the extent buffer or corrupting some of its content. This adds a warning to let us quickly know what was happening. Now that alloc_extent_buffer() no longer returns NULL, this changes its caller and the callers of its caller to match the new error handling. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
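Why an unaligned start address yields an extra page is simple arithmetic, sketched below. `check_eb_start()` and `pages_spanned()` are hypothetical helpers written for this illustration, and the sketch assumes power-of-two sector and page sizes.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Reject a start address that is not aligned to sectorsize (a power of two). */
static int check_eb_start(uint64_t start, uint32_t sectorsize)
{
    if (start & (uint64_t)(sectorsize - 1))
        return -EINVAL;
    return 0;
}

/* Number of pages touched by the byte range [start, start + len), len > 0. */
static uint64_t pages_spanned(uint64_t start, uint64_t len, uint64_t page_size)
{
    uint64_t first = start / page_size;
    uint64_t last = (start + len - 1) / page_size;

    return last - first + 1;
}
```

With a 16 KiB buffer and 4 KiB pages, a start of 4096 touches 4 pages while a start of 4097 touches 5, so code sized for 4 pages would lose track of the fifth.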
-
- 03 Jun, 2016 1 commit
-
-
Committed by Feifei Xu
The self-tests code assumes 4k as the sectorsize and nodesize. This commit fixes the hardcoded 4K and enables the self-tests code to be executed on non-4k page sized systems (e.g. ppc64). Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
- 26 May, 2016 1 commit
-
-
Committed by Nicholas D Steeves
Signed-off-by: Nicholas D Steeves <nsteeves@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
- 28 Apr, 2016 1 commit
-
-
Committed by Anand Jain
btrfs_std_error() handles errors and puts the FS into readonly mode (as of now), so it's a good idea to rename it to btrfs_handle_fs_error(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ edit changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
-
- 04 Apr, 2016 1 commit
-
-
Committed by David Sterba
The allocation of a node could fail if the memory is too fragmented for a given node size, practically observed with 64k. http://article.gmane.org/gmane.comp.file-systems.btrfs/54689 Reported-and-tested-by: Jean-Denis Girard <jd.girard@sysnux.pf> Signed-off-by: David Sterba <dsterba@suse.com>
-
- 11 Feb, 2016 1 commit
-
-
Committed by David Sterba
The send operation is not on the critical writeback path, so we don't need to use GFP_NOFS for allocations. All error paths are handled and the whole operation is restartable. Signed-off-by: David Sterba <dsterba@suse.com>
-
- 02 Feb, 2016 1 commit
-
-
Committed by Chandan Rajendra
In subpagesize-blocksize a page can map multiple extent buffers and hence using (page index, seq) as the search key is incorrect. For example, searching through the tree modification log tree can return an entry associated with the first extent buffer mapped by the page (if such an entry exists), when we are actually searching for entries associated with extent buffers that are mapped at position 2 or more in the page. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
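The key collision can be reproduced with plain integer arithmetic. The sketch below is hypothetical: `struct tm_key` and the helper names are invented for illustration, and it assumes the replacement key is the buffer's logical start address, which the message itself does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical search key: an index plus a sequence number. */
struct tm_key {
    uint64_t index;
    uint64_t seq;
};

/* Key derived from the page index: logical address shifted by the page size. */
static struct tm_key key_by_page(uint64_t logical, unsigned page_shift,
                                 uint64_t seq)
{
    struct tm_key k = { logical >> page_shift, seq };
    return k;
}

/* Key derived from the logical address itself (assumed replacement). */
static struct tm_key key_by_logical(uint64_t logical, uint64_t seq)
{
    struct tm_key k = { logical, seq };
    return k;
}

static int key_equal(struct tm_key a, struct tm_key b)
{
    return a.index == b.index && a.seq == b.seq;
}
```

With 64 KiB pages (`page_shift` of 16) and 4 KiB tree blocks, the blocks at logical 0x10000 and 0x11000 share one page, so their page-index keys are equal while their logical-address keys are not.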
-
- 07 Jan, 2016 2 commits
-
-
Committed by David Sterba
Replace the integers by enums for better readability. The value 2 does not have any meaning since a7175319 "Btrfs: do less aggressive btree readahead" (2009-01-22). Signed-off-by: David Sterba <dsterba@suse.com>
-
Committed by Byongho Lee
We use many constants to represent size and offset values. To make the code readable we use '256 * 1024 * 1024' instead of '268435456' to represent '256MB'. However, we can make it far more readable with 'SZ_256MB', which is defined in 'linux/sizes.h'. So this patch replaces expressions of the form 'xxx * 1024 * 1024' with a single 'SZ_xxxMB' if 'xxx' is a power of 2, and with 'xxx * SZ_1M' if 'xxx' is not a power of 2. I haven't touched '4096' & '8192' because they are more intuitive than 'SZ_4KB' & 'SZ_8KB'. Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
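The substitution can be illustrated in userspace by defining the two constants locally. Note that the macros in 'linux/sizes.h' are spelled `SZ_1M` and `SZ_256M` (without the trailing B used in the message); the local definitions below copy those values, and `mib()` is a hypothetical helper added only for the example.

```c
#include <assert.h>

/* Local copies of two constants as spelled in linux/sizes.h. */
#define SZ_1M   0x00100000
#define SZ_256M 0x10000000

/* Hypothetical helper for sizes that are not a power of two, e.g. 5 MiB. */
static unsigned long long mib(unsigned int n)
{
    return (unsigned long long)n * SZ_1M;
}
```

So `256 * 1024 * 1024` becomes `SZ_256M`, while a size such as 5 MiB is written `5 * SZ_1M`.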
-
- 22 Oct, 2015 1 commit
-
-
Committed by Alexandru Moise
The return values of btrfs_item_offset_nr and btrfs_item_size_nr are of type u32. To avoid mixing signed and unsigned integers we should also declare dsize and last_off to be of type u32. Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
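The hazard of mixing signed and unsigned values is easy to demonstrate. The functions below are hypothetical and unrelated to the btrfs helpers; the sketch assumes a platform where `int` is 32 bits, so that a comparison between `int` and `uint32_t` converts the signed operand to unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Mixed comparison: the cast mirrors the implicit conversion the compiler applies. */
static int less_than_mixed(int a, uint32_t b)
{
    return (uint32_t)a < b;
}

/* Same comparison with both operands declared unsigned from the start. */
static int less_than_u32(uint32_t a, uint32_t b)
{
    return a < b;
}
```

`less_than_mixed(-1, 1)` is false because -1 becomes 0xFFFFFFFF; declaring every variable as u32 makes that behaviour explicit rather than hidden in a conversion.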
-
- 29 Sep, 2015 1 commit
-
-
Committed by Anand Jain
btrfs_error() and btrfs_std_error() do the same thing and both call _btrfs_std_error(), so consolidate them. The main motivation is that btrfs_error() is named very similarly to btrfs_err(): one handles the error action, the other logs the error, so they should not have such close names. Signed-off-by: Anand Jain <anand.jain@oracle.com> Suggested-by: David Sterba <dsterba@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
- 09 Aug, 2015 1 commit
-
-
Committed by Zhaolei
When btrfs_reloc_cow_block() fails in __btrfs_cow_block(), the current code just returns an error value to the caller but leaves the newly created extent buffer in existence and locked. Subsequent code (in relocate) then tries to lock that eb again, causing a deadlock without any dmesg output (the eb lock uses wait_event(), so there is no lockdep message). It is hard to do recovery work in __btrfs_cow_block() at this error point, but we can abort the transaction to avoid the deadlock and operating on unstable state. It also helps developers find the faulty place quickly (better than a frozen fs without any dmesg, as before this patch). Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
-
- 03 Jun, 2015 1 commit
-
-
Committed by Liu Bo
The return value of read_tree_block() can confuse callers, as it always returns NULL for either -ENOMEM or -EIO, so it's likely that callers translate it into the wrong error, for instance in btrfs_read_tree_root(). This fixes the above issue. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
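One common kernel idiom for returning a distinguishable error through a pointer is the ERR_PTR family from 'linux/err.h'. The message does not say which mechanism the patch uses, so treat the following as an assumption: a userspace re-implementation of the idiom with a hypothetical `read_block()` standing in for the real function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-implementation of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    /* Error codes occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical reader: mode 1 simulates -ENOMEM, mode 2 simulates -EIO. */
static void *read_block(int mode)
{
    static char block[16];

    if (mode == 1)
        return ERR_PTR(-ENOMEM);
    if (mode == 2)
        return ERR_PTR(-EIO);
    return block;
}
```

A caller can then write `if (IS_ERR(p)) return PTR_ERR(p);` and forward the real cause instead of guessing between -ENOMEM and -EIO from a bare NULL.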
-
- 04 Mar, 2015 1 commit
-
-
Committed by David Sterba
Convert kmalloc(nr * size, ..) to kmalloc_array, which does additional overflow checks; the zeroing variant is kcalloc. Signed-off-by: David Sterba <dsterba@suse.cz>
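The overflow check that motivates the conversion can be sketched in userspace. `alloc_array()` is a hypothetical analogue built on `malloc`, not the kernel function; it only shows the kind of guard that a bare `nr * size` multiplication lacks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of `size` bytes each, refusing requests
 * whose total byte count would overflow size_t.
 */
static void *alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;            /* n * size would wrap around */
    return malloc(n * size);
}
```

Without the guard, a huge `n` could wrap the product to a small number and the allocation would succeed with far less memory than the caller expects.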
-
- 03 Mar, 2015 1 commit
-
-
Committed by Filipe Manana
The end_slot variable actually matches the number of pointers in the node and not the last slot (which is 'nritems - 1'). Therefore in order to check that the current slot in the for loop doesn't match the last one, the correct logic is to check if 'i' is less than 'end_slot - 1' and not 'end_slot - 2'. Fix this and set end_slot to be 'nritems - 1', as it's less confusing since the variable name implies it's inclusive rather than exclusive. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
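The off-by-one can be checked with a toy loop. `count_not_last()` is a hypothetical reduction of the logic described in the message, not the function the patch touches: it counts how many slots the loop treats as "not the last one" under the old and the new bounds.

```c
#include <assert.h>

/* Count slots in [0, nritems) that the loop considers "not the last slot". */
static int count_not_last(int nritems, int buggy)
{
    int count = 0;

    if (buggy) {
        int end_slot = nritems;          /* exclusive: number of pointers */
        for (int i = 0; i < end_slot; i++)
            if (i < end_slot - 2)        /* off by one: skips a valid slot */
                count++;
    } else {
        int end_slot = nritems - 1;      /* inclusive: index of the last slot */
        for (int i = 0; i <= end_slot; i++)
            if (i < end_slot)
                count++;
    }
    return count;
}
```

For a node with 5 items, the old bounds treat only 3 slots as non-last, while the corrected bounds treat 4, which matches the expected `nritems - 1`.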
-
- 17 Feb, 2015 2 commits
-
-
Committed by Daniel Dressler
This is the 3rd independent patch of a larger project to clean up btrfs's internal usage of btrfs_root. Many functions take btrfs_root only to grab the fs_info struct. By requiring a root these functions cause programmer overhead. That these functions can accept any valid root is not obvious until inspection. This patch reduces the specificity of such functions to accept the fs_info directly. These patches can be applied independently and thus are not being submitted as a patch series. There should be about 26 patches by the project's completion. Each patch will clean up between 1 and 34 functions. Each patch covers a single file's functions. This patch affects the following function(s): 1) csum_tree_block 2) csum_dirty_buffer 3) check_tree_block_fsid 4) btrfs_find_tree_block 5) clean_tree_block Signed-off-by: Daniel Dressler <danieru.dressler@gmail.com> Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by Daniel Dressler
This patch is part of a larger project to clean up btrfs's internal usage of struct btrfs_root. Many functions take btrfs_root only to grab a pointer to fs_info. This causes programmers to ponder which root can be passed. Since only the fs_info is read, the affected functions can accept any root, but this is only obvious upon inspection. This patch reduces the specificity of such functions to accept the fs_info directly. This patch does not address the two functions in ctree.c (insert_ptr and split_item) which only use root for BUG_ONs. This patch affects the following functions: 1) fixup_low_keys 2) btrfs_set_item_key_safe Signed-off-by: Daniel Dressler <danieru.dressler@gmail.com> Signed-off-by: David Sterba <dsterba@suse.cz>
-
- 22 Jan, 2015 3 commits
-
-
Committed by chandan
btrfs_alloc_tree_block() returns an extent buffer on which a blocked lock has been taken. Hence assign the appropriate value to path->locks[level]. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Filipe Manana
We were incorrectly detecting when the target key didn't exist anymore after releasing the path and re-searching the tree. This could make us split or duplicate (btrfs_split_item() and btrfs_duplicate_item() are its only callers at the moment) an item when we should not. For the case of duplicating an item, we currently only duplicate checksum items (csum tree) and file extent items (fs/subvol trees). For the checksum items we end up overriding the item completely, but for file extent items we update only some of their fields in the copy (done in __btrfs_drop_extents), which means we can end up having a logical corruption for some values. Also for the case where we duplicate a file extent item it will make us produce a leaf with a wrong key order, as btrfs_duplicate_item() advances us to the next slot and then its caller sets a smaller key on the new item at that slot (like in __btrfs_drop_extents() e.g.). Alternatively if the tree search in setup_leaf_for_split() leaves with path->slots[0] == btrfs_header_nritems(path->nodes[0]), we end up accessing beyond the leaf's end (when we check if the item's size has changed) and make our caller insert an item at the invalid slot btrfs_header_nritems(path->nodes[0]) + 1, causing an invalid memory access if the leaf is full or nearly full. This issue has been present since the introduction of this function in 2009: Btrfs: Add btrfs_duplicate_item commit ad48fd75 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Josef Bacik
I've been overloading root->dirty_list to keep track of dirty roots and which roots need to have their commit roots switched at transaction commit time. This could cause us to lose an update to the root which could corrupt the file system. To fix this use a state bit to know if the root is dirty, and if it isn't set we go ahead and move the root to the dirty list. This way if we re-dirty the root after adding it to the switch_commit list we make sure to update it. This also makes it so that the extent root is always the last root on the dirty list to try and keep the amount of churn down at this point in the commit. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
-
- 15 Jan, 2015 2 commits
-
-
Committed by David Sterba
If the found_key is NULL, then btrfs_find_item becomes a verbose wrapper for a simple btrfs_search_slot. After we've removed all such callers, passing a NULL key is not valid anymore. Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
If btrfs_find_item is called with a NULL path it allocates one locally but does not free it. Affected paths are inserting an orphan item for a file and for a subvol root. Move the path allocation to the callers. CC: <stable@vger.kernel.org> # 3.14+ Fixes: 3f870c28 ("btrfs: expand btrfs_find_item() to include find_orphan_item functionality") Signed-off-by: David Sterba <dsterba@suse.cz>
-
- 13 Dec, 2014 2 commits
-
-
Committed by David Sterba
Make the extent buffer allocation interface consistent. Cloned eb will set a valid fs_info. For dummy eb, we can drop the length parameter and set it from fs_info. The built-in sanity checks may pass a NULL fs_info that's queried for nodesize, but we know it's 4096. Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
All callers pass nodesize. Signed-off-by: David Sterba <dsterba@suse.cz>
-
- 21 Nov, 2014 1 commit
-
-
Committed by Filipe Manana
Replacing a xattr consists of doing a lookup for its existing value, deleting the current value from the respective leaf, releasing the search path and then finally inserting the new value. This leaves a time window where readers (getxattr, listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs, so this has security implications. This change also fixes 2 other existing issues which were: *) Deleting the old xattr value without verifying first if the new xattr will fit in the existing leaf item (in case multiple xattrs are packed in the same item due to name hash collision); *) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't exist but we have an existing item that packs multiple xattrs with the same name hash as the input xattr. In this case we should return ENOSPC. A test case for xfstests follows soon. Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace implementation. Reported-by: Alexandre Oliva <oliva@gnu.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
- 20 Nov, 2014 1 commit
-
-
Committed by Chris Mason
The fair reader/writer locks mean that btrfs_clear_path_blocking needs to strictly follow lock ordering rules even when we already have blocking locks on a given path. Before we can clear a blocking lock on the path, we need to make sure all of the locks have been converted to blocking. This will remove lock inversions against anyone spinning in write_lock() against the buffers we're trying to get read locks on. These inversions didn't exist before the fair read/writer locks, but now we need to be more careful. We papered over this deadlock in the past by changing btrfs_try_read_lock() to be a true trylock against both the spinlock and the blocking lock. This was slower, and not sufficient to fix all the deadlocks. This patch adds a btrfs_tree_read_lock_atomic(), which basically means get the spinlock but trylock on the blocking lock. Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Reported-by: Patrick Schmid <schmid@phys.ethz.ch> cc: stable@vger.kernel.org #v3.15+
-
- 04 Oct, 2014 1 commit
-
-
Committed by Fabian Frederick
cmp was declared twice in btrfs_compare_trees, resulting in a shadow warning. This patch renames the second, internal variable. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Chris Mason <clm@fb.com>
-
- 02 Oct, 2014 6 commits
-
-
Committed by David Sterba
Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
Use a common definition for the inline data start so we don't have to open-code it and introduce bugs like the one that "Btrfs: fix wrong max inline data size limit" fixed. Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
Rename to btrfs_alloc_tree_block as it fits the alloc/find/free + _tree_block family. The parameter blocksize was set to the metadata block size, directly or indirectly. Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
We know the tree block size, no need to pass it around. Signed-off-by: David Sterba <dsterba@suse.cz>
-
Committed by David Sterba
The parent_transid parameter has been unused since its introduction in ca7a79ad ("Pass down the expected generation number when reading tree blocks"). In reada_tree_block, it was even wrongly set to leafsize. Transid check is done in the proper read and readahead ignores errors. Signed-off-by: David Sterba <dsterba@suse.cz>
-
- 18 Sep, 2014 4 commits
-
-
Committed by Filipe Manana
None of the uses of btrfs_search_forward() need to have the path nodes (level >= 1) read locked, only the leaf needs to be locked while the caller processes it. Therefore make it return a path with all nodes unlocked, except for the leaf. This change is motivated by the observation that during a file fsync we repeatedly call btrfs_search_forward() and process the returned leaf while upper nodes of the returned path (level >= 1) are read locked, which unnecessarily blocks other tasks that want to write to the same fs/subvol btree. Therefore instead of modifying the fsync code to unlock all nodes with level >= 1 immediately after calling btrfs_search_forward(), change btrfs_search_forward() to do it, so that it benefits all callers. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Filipe Manana
If we need to cow a node, increase the write lock level and retry the tree search, there's no point in changing the node locks in our path to blocking mode, as we only waste time and unnecessarily wake up other tasks waiting on the spinning locks (just to block them again shortly after) because we release our path before repeating the tree search. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by Filipe Manana
In ctree.c:setup_items_for_insert(), we can unlock all nodes in our path before we process the leaf (shift items and data, adjust data offsets, etc). This allows for better btree concurrency, as we're often holding a write lock on at least the node at level 1. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
-
Committed by David Sterba
The nodesize and leafsize were never of different values. Unify the usage and make nodesize the one. Cleanup the redundant checks and helpers. Shaves a few bytes from .text:

   text    data     bss     dec     hex filename
 852418   24560   23112  900090   dbbfa btrfs.ko.before
 851074   24584   23112  898770   db6d2 btrfs.ko.after

Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
-
- 15 Aug, 2014 1 commit
-
-
Committed by Josef Bacik
Previously I extended the no_quota arg to btrfs_dec/inc_ref because I didn't understand how snapshot delete was using it and assumed that we needed the quota operations there. With Mark's work this has turned out not to be the case: we _always_ need to use no_quota for btrfs_dec/inc_ref, so just drop the argument and make __btrfs_mod_ref call its process function with no_quota always set. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
-