- 06 1月, 2009 9 次提交
-
-
由 Joel Becker 提交于
Add block check calls to the read_block validate functions. This is the almost all of the read-side checking of metaecc. xattr buckets are not checked yet. Writes are also unchecked, and so a read-write mount will quickly fail. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Jan Kara 提交于
Add quota calls for allocation and freeing of inodes and space, also update estimates on number of needed credits for a transaction. Move out inode allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called outside of a transaction. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Mark Fasheh 提交于
JBD2 is fully backwards compatible with JBD and it's been tested enough with Ocfs2 that we can clean this code up now. Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
Add an optional validation hook to ocfs2_read_blocks(). Now the validation function is only called when a block was actually read off of disk. It is not called when the buffer was in cache. We add a buffer state bit BH_NeedsValidate to flag these buffers. It must always be one higher than the last JBD2 buffer state bit. The dinode, dirblock, extent_block, and xattr_block validators are lifted to this scheme directly. The group_descriptor validator needs to be split into two pieces. The first part only needs the gd buffer and is passed to ocfs2_read_block(). The second part requires the dinode as well, and is called every time. It's only 3 compares, so it's tiny. This also allows us to clean up the non-fatal gd check used by resize.c. It now has no magic argument. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
We weren't consistently checking extent blocks after we read them. Most places checked the signature, but none checked h_blkno or h_fs_signature. Create a toplevel ocfs2_read_extent_block() that does the read and the validation. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
Random places in the code would check a dinode bh to see if it was valid. Not only did they do different levels of validation, they handled errors in different ways. The previous commit unified inode block reads, validating all block reads in the same place. Thus, these haphazard checks are no longer necessary. Rather than eliminate them, however, we change them to BUG_ON() checks. This ensures the assumptions remain true. All of the code paths to these checks have been audited to ensure they come from a validated inode read. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The ocfs2 code currently reads inodes off disk with a simple ocfs2_read_block() call. Each place that does this has a different set of sanity checks it performs. Some check only the signature. A couple validate the block number (the block read vs di->i_blkno). A couple others check for VALID_FL. Only one place validates i_fs_generation. A couple check nothing. Even when an error is found, they don't all do the same thing. We wrap inode reading into ocfs2_read_inode_block(). This will validate all the above fields, going readonly if they are invalid (they never should be). ocfs2_read_inode_block_full() is provided for the places that want to pass read_block flags. Every caller is passing a struct inode with a valid ip_blkno, so we don't need a separate blkno argument either. We will remove the validation checks from the rest of the code in a later commit, as they are no longer necessary. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Mark Fasheh 提交于
This patch genericizes the high level handling of extent removal. ocfs2_remove_btree_range() is nearly identical to __ocfs2_remove_inode_range(), except that extent tree operations have been used where necessary. We update ocfs2_remove_inode_range() to use the generic helper. Now extent tree based structures have an easy way to truncate ranges. Signed-off-by: NMark Fasheh <mfasheh@suse.com> Acked-by: NJoel Becker <joel.becker@oracle.com>
-
由 Tao Ma 提交于
Now in ocfs2 xattr set, the whole process are divided into many small parts and they are wrapped into diffrent transactions and it make the set doesn't look like a real transaction. So we want to integrate it into a real one. In some cases we will allocate some clusters and free some in just one transaction. e.g, one xattr is larger than inline size, so it and its value root is stored within the inode while the value is outside in a cluster. Then we try to update it with a smaller value(larger than the size of root but smaller than inline size), we may need to free the outside cluster while allocate a new bucket(one cluster) since now the inode may be full. The old solution will lock the global_bitmap(if the local alloc failed in stress test) and then the truncate log. This will cause a ABBA lock with truncate log flush. This patch add the clusters free in dealloc_ctxt, so that we can record the free clusters during the transaction and then free it after we release the global_bitmap in xattr set. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 15 10月, 2008 2 次提交
-
-
由 Joel Becker 提交于
More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED. Only six pass a different flag set. Rather than have every caller care, let's make ocfs2_read_block() take no flags and always do a cached read. The remaining six places can call ocfs2_read_blocks() directly. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
Now that synchronous readers are using ocfs2_read_blocks_sync(), all callers of ocfs2_read_blocks() are passing an inode. Use it unconditionally. Since it's there, we don't need to pass the ocfs2_super either. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 14 10月, 2008 22 次提交
-
-
由 Mark Fasheh 提交于
This is pointless as brelse() already does the check. Signed-off-by: Mark Fasheh
-
由 Joel Becker 提交于
ocfs2 wants JBD2 for many reasons, not the least of which is that JBD is limiting our maximum filesystem size. It's a pretty trivial change. Most functions are just renamed. The only functional change is moving to Jan's inode-based ordered data mode. It's better, too. Because JBD2 reads and writes JBD journals, this is compatible with any existing filesystem. It can even interact with JBD-based ocfs2 as long as the journal is formated for JBD. We provide a compatibility option so that paranoid people can still use JBD for the time being. This will go away shortly. [ Moved call of ocfs2_begin_ordered_truncate() from ocfs2_delete_inode() to ocfs2_truncate_for_delete(). --Mark ] Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The original get/put_extent_tree() functions held a reference on et_root_bh. However, every single caller already has a safe reference, making the get/put cycle irrelevant. We change ocfs2_get_*_extent_tree() to ocfs2_init_*_extent_tree(). It no longer gets a reference on et_root_bh. ocfs2_put_extent_tree() is removed. Callers now have a simpler init+use pattern. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
struct ocfs2_extent_tree_operations provides methods for the different on-disk btrees in ocfs2. Describing what those methods do is probably a good idea. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
We now have three different kinds of extent trees in ocfs2: inode data (dinode), extended attributes (xattr_tree), and extended attribute values (xattr_value). There is a nice abstraction for them, ocfs2_extent_tree, but it is hidden in alloc.c. All the calling functions have to pick amongst a varied API and pass in type bits and often extraneous pointers. A better way is to make ocfs2_extent_tree a first-class object. Everyone converts their object to an ocfs2_extent_tree() via the ocfs2_get_*_extent_tree() calls, then uses the ocfs2_extent_tree for all tree calls to alloc.c. This simplifies a lot of callers, making for readability. It also provides an easy way to add additional extent tree types, as they only need to be defined in alloc.c with a ocfs2_get_<new>_extent_tree() function. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
A couple places check an extent_tree for a valid inode. We move that out to add an eo_insert_check() operation. It can be called from ocfs2_insert_extent() and elsewhere. We also have the wrapper calls ocfs2_et_insert_check() and ocfs2_et_sanity_check() ignore NULL ops. That way we don't have to provide useless operations for xattr types. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
A caller knows what kind of extent tree they have. There's no reason they have to call ocfs2_get_extent_tree() with a NULL when they could just as easily call a specific function to their type of extent tree. Introduce ocfs2_dinode_get_extent_tree(), ocfs2_xattr_tree_get_extent_tree(), and ocfs2_xattr_value_get_extent_tree(). They only take the necessary arguments, calling into the underlying __ocfs2_get_extent_tree() to do the real work. __ocfs2_get_extent_tree() is the old ocfs2_get_extent_tree(), but without needing any switch-by-type logic. ocfs2_get_extent_tree() is now a wrapper around the specific calls. It exists because a couple alloc.c functions can take et_type. This will go later. Another benefit is that ocfs2_xattr_value_get_extent_tree() can take a struct ocfs2_xattr_value_root* instead of void*. This gives us typechecking where we didn't have it before. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
Provide an optional extent_tree_operation to specify the max_leaf_clusters of an ocfs2_extent_tree. If not provided, the value is 0 (unlimited). Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
ocfs2_num_free_extents() re-implements the logic of ocfs2_get_extent_tree(). Now that ocfs2_get_extent_tree() does not allocate, let's use it in ocfs2_num_free_extents() to simplify the code. The inode validation code in ocfs2_num_free_extents() is not needed. All callers are passing in pre-validated inodes. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The root_el of an ocfs2_extent_tree needs to be calculated from et->et_object. Make it an operation on et->et_ops. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The 'private' pointer was a way to store off xattr values, which don't live at a set place in the bh. But the concept of "the object containing the extent tree" is much more generic. For an inode it's the struct ocfs2_dinode, for an xattr value its the value. Let's save off the 'object' at all times. If NULL is passed to ocfs2_get_extent_tree(), 'object' is set to bh->b_data; Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
Rather than allocating a struct ocfs2_extent_tree, just put it on the stack. Fill it with ocfs2_get_extent_tree() and drop it with ocfs2_put_extent_tree(). Now the callers don't have to ENOMEM, yet still safely ref the root_bh. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The members of the ocfs2_extent_tree structure gain a prefix of 'et_'. All users are updated. Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Joel Becker 提交于
The ocfs2_extent_tree_operations structure gains a field prefix on its members. The ->eo_sanity_check() operation gains a wrapper function for completeness. All of the extent tree operation wrappers gain a consistent name (ocfs2_et_*()). Signed-off-by: NJoel Becker <joel.becker@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
In xattr bucket, we want to limit the maximum size of a btree leaf, otherwise we'll lose the benefits of hashing because we'll have to search large leaves. So add a new field in ocfs2_extent_tree which indicates the maximum leaf cluster size we want so that we can prevent ocfs2_insert_extent() from merging the leaf record even if it is contiguous with an adjacent record. Other btree types are not affected by this change. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
When necessary, an ocfs2_xattr_block will embed an ocfs2_extent_list to store large numbers of EAs. This patch adds a new type in ocfs2_extent_tree_type and adds the implementation so that we can re-use the b-tree code to handle the storage of many EAs. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tiger Yang 提交于
Add the structures and helper functions we want for handling inline extended attributes. We also update the inline-data handlers so that they properly function in the event that we have both inline data and inline attributes sharing an inode block. Signed-off-by: NTiger Yang <tiger.yang@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
Add some thin wrappers around ocfs2_insert_extent() for each of the 3 different btree types, ocfs2_inode_insert_extent(), ocfs2_xattr_value_insert_extent() and ocfs2_xattr_tree_insert_extent(). The last is for the xattr index btree, which will be used in a followup patch. All the old callers in file.c etc will call ocfs2_dinode_insert_extent(), while the other two handle the xattr issue. And the init of extent tree are handled by these functions. When storing xattr value which is too large, we will allocate some clusters for it and here ocfs2_extent_list and ocfs2_extent_rec will also be used. In order to re-use the b-tree operation code, a new parameter named "private" is added into ocfs2_extent_tree and it is used to indicate the root of ocfs2_exent_list. The reason is that we can't deduce the root from the buffer_head now. It may be in an inode, an ocfs2_xattr_block or even worse, in any place in an ocfs2_xattr_bucket. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
Factor out the non-inode specifics of ocfs2_do_extend_allocation() into a more generic function, ocfs2_do_cluster_allocation(). ocfs2_do_extend_allocation calls ocfs2_do_cluster_allocation() now, but the latter can be used for other btree types as well. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
In the old extent tree operation, we take the hypothesis that we are using the ocfs2_extent_list in ocfs2_dinode as the tree root. As xattr will also use ocfs2_extent_list to store large value for a xattr entry, we refactor the tree operation so that xattr can use it directly. The refactoring includes 4 steps: 1. Abstract set/get of last_eb_blk and update_clusters since they may be stored in different location for dinode and xattr. 2. Add a new structure named ocfs2_extent_tree to indicate the extent tree the operation will work on. 3. Remove all the use of fe_bh and di, use root_bh and root_el in extent tree instead. So now all the fe_bh is replaced with et->root_bh, el with root_el accordingly. 4. Make ocfs2_lock_allocators generic. Now it is limited to be only used in file extend allocation. But the whole function is useful when we want to store large EAs. Note: This patch doesn't touch ocfs2_commit_truncate() since it is not used for anything other than truncate inode data btrees. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
ocfs2_extend_meta_needed(), ocfs2_calc_extend_credits() and ocfs2_reserve_new_metadata() are all useful for extent tree operations. But they are all limited to an inode btree because they use a struct ocfs2_dinode parameter. Change their parameter to struct ocfs2_extent_list (the part of an ocfs2_dinode they actually use) so that the xattr btree code can use these functions. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
ocfs2_num_free_extents() is used to find the number of free extent records in an inode btree. Hence, it takes an "ocfs2_dinode" parameter. We want to use this for extended attribute trees in the future, so genericize the interface the take a buffer head. A future patch will allow that buffer_head to contain any structure rooting an ocfs2 btree. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 04 10月, 2008 1 次提交
-
-
由 Mark Fasheh 提交于
Plug ocfs2 into ->fiemap. Some portions of ocfs2_get_clusters() had to be refactored so that the extent cache can be skipped in favor of going directly to the on-disk records. This makes it easier for us to determine which extent is the last one in the btree. Also, I'm not sure we want to be caching fiemap lookups anyway as they're not directly related to data read/write. Signed-off-by: NMark Fasheh <mfasheh@suse.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: ocfs2-devel@oss.oracle.com Cc: linux-fsdevel@vger.kernel.org
-
- 22 5月, 2008 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 4月, 2008 4 次提交
-
-
由 Julia Lawall 提交于
if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by: NJulia Lawall <julia@diku.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
Inode allocation is modified to look in other nodes allocators during extreme out of space situations. We retry our own slot when space is freed back to the global bitmap, or whenever we've allocated more than 1024 inodes from another slot. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
In ocfs2_figure_merge_contig_type, we judge whether there exists a cross extent block merge and enable it by setting CONTIG_LEFT and CONTIG_RIGHT accordingly. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
由 Tao Ma 提交于
In ocfs2_merge_rec_left, when we find the merge extent is "CONTIG_RIGHT" with the first extent record of the next extent block, we will merge it to the next extent block and change all the related extent blocks accordingly. In ocfs2_merge_rec_right, when we find the merge extent is "CONTIG_LEFT" with the last extent record of the previous extent block, we will merge it to the prevoius extent block and change all the related extent blocks accordingly. As for CONTIG_LEFTRIGHT, we will handle CONTIG_RIGHT first so that when the index is zero, the merge process will be more efficient and easier. Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NMark Fasheh <mfasheh@suse.com>
-
- 06 2月, 2008 1 次提交
-
-
由 Christoph Lameter 提交于
Simplify page cache zeroing of segments of pages through 3 functions zero_user_segments(page, start1, end1, start2, end2) Zeros two segments of the page. It takes the position where to start and end the zeroing which avoids length calculations and makes code clearer. zero_user_segment(page, start, end) Same for a single segment. zero_user(page, start, length) Length variant for the case where we know the length. We remove the zero_user_page macro. Issues: 1. Its a macro. Inline functions are preferable. 2. The KM_USER0 macro is only defined for HIGHMEM. Having to treat this special case everywhere makes the code needlessly complex. The parameter for zeroing is always KM_USER0 except in one single case that we open code. Avoiding KM_USER0 makes a lot of code not having to be dealing with the special casing for HIGHMEM anymore. Dealing with kmap is only necessary for HIGHMEM configurations. In those configurations we use KM_USER0 like we do for a series of other functions defined in highmem.h. Since KM_USER0 is depends on HIGHMEM the existing zero_user_page function could not be a macro. zero_user_* functions introduced here can be be inline because that constant is not used when these functions are called. Also extract the flushing of the caches to be outside of the kmap. [akpm@linux-foundation.org: fix nfs and ntfs build] [akpm@linux-foundation.org: fix ntfs build some more] Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-