- 25 9月, 2008 40 次提交
-
-
由 Chris Mason 提交于
Drop i_mutex during the commit Don't bother doing the fsync at all unless the dir is marked as dirtied and needing fsync in this transaction. For directories, this means that someone has unlinked a file from the dir without fsyncing the file. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Christoph Hellwig 提交于
btrfs_ilookup is unused, which is good because a normal filesystem should never have to use ilookup anyway. Remove it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
File syncs and directory syncs are optimized by copying their items into a special (copy-on-write) log tree. There is one log tree per subvolume and the btrfs super block points to a tree of log tree roots. After a crash, items are copied out of the log tree and back into the subvolume. See tree-log.c for all the details. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Christoph Hellwig 提交于
btrfs actually stores the whole xattr name, including the prefix ondisk, so using the generic resolver that strips off the prefix is not very helpful. Instead do the real ondisk xattrs manually and only use the generic resolver for synthetic xattrs like ACLs. (Sorry Josef for guiding you towards the wrong direction here intially) Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 David Woodhouse 提交于
Date: Sun, 17 Aug 2008 17:12:56 +0100 Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 David Woodhouse 提交于
Date: Sun, 17 Aug 2008 17:08:36 +0100 Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 David Woodhouse 提交于
Date: Sun, 17 Aug 2008 15:14:48 +0100 We never get asked by the VFS to lookup either of them, and we can handle the readdir() case a lot more simply, too. Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 David Woodhouse 提交于
Date: Wed, 6 Aug 2008 19:42:33 +0100 Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Balaji Rao 提交于
Date: Mon, 21 Jul 2008 02:01:04 +0530 This patch introduces a btrfs_iget helper to be used in NFS support. Signed-off-by: NBalaji Rao <balajirrao@gmail.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
This optimization had been removed because I thought it was triggering csum errors. The real cause of the errors was elsewhere, and so this optimization is back. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
add_extent_mapping was allowing the insertion of overlapping extents. This never used to happen because it only inserted the extents from disk and those were never overlapping. But, with the data=ordered code, the disk and memory representations of the file are not the same. add_extent_mapping needs to ensure a new extent does not overlap before it inserts. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
This takes the csum mutex deeper in the call chain and releases it more often. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
The writeback_index field is used by write_cache_pages to pick up where writeback on a given inode left off. But, it is never set to a sane value, so writeback can often start at a random offset in the file. Kernels 2.6.28 and higher will have this fixed, but for everyone else, we also fill in the value in btrfs. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Yan Zheng 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
rename and link don't always have a lock on the source inode, and our use of a per-inode index variable was racy. This changes things to store the index in a local variable instead. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Large streaming reads make for large bios, which means each entry on the list async work queues represents a large amount of data. IO congestion throttling on the device was kicking in before the async worker threads decided a single thread was busy and needed some help. The end result was that a streaming read would result in a single CPU running at 100% instead of balancing the work off to other CPUs. This patch also changes the pre-IO checksum lookup done by reads to work on a per-bio basis instead of a per-page. This results in many extra btree lookups on large streaming reads. Doing the checksum lookup right before bio submit allows us to reuse searches while processing adjacent offsets. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Sven Wegener 提交于
Add a couple of #if's to follow API changes. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Yan 提交于
The memory reclaiming issue happens when snapshot exists. In that case, some cache entries may not be used during old snapshot dropping, so they will remain in the cache until umount. The patch adds a field to struct btrfs_leaf_ref to record create time. Besides, the patch makes all dead roots of a given snapshot linked together in order of create time. After a old snapshot was completely dropped, we check the dead root list and remove all cache entries created before the oldest dead root in the list. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Yan Zheng 提交于
To check whether a given file extent is referenced by multiple snapshots, the checker walks down the fs tree through dead root and checks all tree blocks in the path. We can easily detect whether a given tree block is directly referenced by other snapshot. We can also detect any indirect reference from other snapshot by checking reference's generation. The checker can always detect multiple references, but can't reliably detect cases of single reference. So btrfs may do file data cow even there is only one reference. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
A large reference cache is directly related to a lot of work pending for the cleaner thread. This throttles back new operations based on the size of the reference cache so the cleaner thread will be able to keep up. Overall, this actually makes the FS faster because the cleaner thread will be more likely to find things in cache. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
This changes the reference cache to make a single cache per root instead of one cache per transaction, and to key by the byte number of the disk block instead of the keys inside. This makes it much less likely to have cache misses if a snapshot or something has an extra reference on a higher node or a leaf while the first transaction that added the leaf into the cache is dropping. Some throttling is added to functions that free blocks heavily so they wait for old transactions to drop. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Yan 提交于
Inode ref item can be in the next leaf when we find "path->slots[0] == btrfs_header_nritems(...)". Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Balaji Rao 提交于
Remove a unused variable 'path' in fixup_tree_root_location. Signed-off-by: NBalaji Rao <balajirrao@gmail.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Josef Bacik 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Josef Bacik 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Josef Bacik 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Checksum items are not inserted into the tree until all of the io from a given extent is complete. This means one dirty page from an extent may be written, freed, and then read again before the entire extent is on disk and the checksum item is inserted. The checksums themselves are stored in the ordered extent so they can be inserted in bulk when IO is complete. On read, if a checksum item isn't found, the ordered extents were being searched for a checksum record. This all worked most of the time, but the checksum insertion code tries to reduce the number of tree operations by pre-inserting checksum items based on i_size and a few other factors. This means the read code might find a checksum item that hasn't yet really been filled in. This commit changes things to check the ordered extents first and only dive into the btree if nothing was found. This removes the need for extra locking and is more reliable. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Stress testing was showing data checksum errors, most of which were caused by a lookup bug in the extent_map tree. The tree was caching the last pointer returned, and searches would check the last pointer first. But, search callers also expect the search to return the very first matching extent in the range, which wasn't always true with the last pointer usage. For now, the code to cache the last return value is just removed. It is easy to fix, but I think lookups are rare enough that it isn't required anymore. This commit also replaces do_sync_mapping_range with a local copy of the related functions. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Before, extent buffers were a temporary object, meant to map a number of pages at once and collect operations on them. But, a few extra fields have crept in, and they are also the best place to store a per-tree block lock field as well. This commit puts the extent buffers into an rbtree, and ensures a single extent buffer for each tree block. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
* In btrfs_delete_inode, wait for ordered extents after calling truncate_inode_pages. This is much faster, and more correct * Properly clear our the PageChecked bit everywhere we redirty the page. * Change the writepage fixup handler to lock the page range and check to see if an ordered extent had been inserted since the improperly dirtied page was discovered * Wait for ordered extents outside the transaction. This isn't required for locking rules but does improve transaction latencies * Reduce contention on the alloc_mutex by dropping it while incrementing refs on a node/leaf and while dropping refs on a leaf. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Chris Mason 提交于
It was possible for stale mappings from disk to be used instead of the new pending ordered extent. This adds a flag to the extent map struct to keep it pinned until the pending ordered extent is actually on disk. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-