Commits · cbee7e1a6a1a2a3d6eda1f76ffc38a3ed3eeb6cc · openanolis / cloud-kernel

05 Sep, 2009 5 commits

ocfs2: ocfs2_add_clusters_in_btree() no longer needs struct inode. · cbee7e1a

Authored by Joel Becker on Feb 13, 2009

One more function that doesn't need a struct inode to pass to its
children.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

cbee7e1a

ocfs2: ocfs2_insert_extent() no longer needs struct inode. · cc79d8c1

Authored by Joel Becker on Feb 13, 2009

One more function down, no inode in the entire insert-extent chain.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

cc79d8c1

ocfs2: ocfs2_find_path() only needs the caching info · facdb77f

Authored by Joel Becker on Feb 12, 2009

ocfs2_find_path() and ocfs2_find_leaf() walk our btrees, reading extent
blocks.  They need struct ocfs2_caching_info for that, but not struct
inode.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

facdb77f

ocfs2: Pass ocfs2_caching_info to ocfs2_read_extent_block(). · 3d03a305

Authored by Joel Becker on Feb 12, 2009

Extent blocks belong to btrees on more than just inodes, so we want to
pass the ocfs2_caching_info structure directly to
ocfs2_read_extent_block().  A number of places in alloc.c can now drop
struct inode from their argument list.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

3d03a305

ocfs2: Store the ocfs2_caching_info on ocfs2_extent_tree. · d9a0a1f8

Authored by Joel Becker on Feb 12, 2009

What do we cache?  Metadata blocks.  What are most of our non-inode metadata
blocks?  Extent blocks for our btrees.  struct ocfs2_extent_tree is the
main structure for managing those.  So let's store the associated
ocfs2_caching_info there.

This means that ocfs2_et_root_journal_access() doesn't need struct inode
anymore, and any place that has an et can refer to et->et_ci instead of
INODE_CACHE(inode).
Signed-off-by: Joel Becker <joel.becker@oracle.com>

d9a0a1f8
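
The idea in d9a0a1f8 can be illustrated with a small userspace sketch. It is not the kernel code: the struct and function names below are hypothetical stand-ins, and only the field name et_ci and the INODE_CACHE(inode) idea come from the commit message. The point shown is that once the tree records its caching structure at init time, tree helpers no longer need an inode argument.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the shape matters. */
struct caching_info {
	unsigned long ci_flags;
};

struct inode_info {
	struct caching_info ip_metadata_cache;
};

struct extent_tree {
	struct caching_info *et_ci;  /* cache for this tree's metadata blocks */
	void *et_object;             /* whatever structure roots the tree */
};

/* Plays the role of INODE_CACHE(inode): reach the cache via the inode. */
static struct caching_info *inode_cache(struct inode_info *oi)
{
	return &oi->ip_metadata_cache;
}

/* Record the cache once, when the tree is initialised. */
static void init_extent_tree(struct extent_tree *et,
			     struct caching_info *ci, void *obj)
{
	et->et_ci = ci;
	et->et_object = obj;
}

/* A tree helper that needs the cache but never sees an inode. */
static struct caching_info *tree_cache(const struct extent_tree *et)
{
	return et->et_ci;
}
```

A caller that owns an inode would pass inode_cache(&oi) to init_extent_tree() once; a tree rooted somewhere else could pass a different caching structure without changing any helper signature.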

04 Apr, 2009 1 commit

ocfs2: Add a name indexed b-tree to directory inodes · 9b7895ef

Authored by Mark Fasheh on Nov 12, 2008

This patch makes use of Ocfs2's flexible btree code to add an additional
tree to directory inodes. The new tree stores an array of small,
fixed-length records in each leaf block. Each record stores a hash value,
and pointer to a block in the traditional (unindexed) directory tree where a
dirent with the given name hash resides. Lookup exclusively uses this tree
to find dirents, thus providing us with constant time name lookups.

Some of the hashing code was copied from ext3. Unfortunately, it has lots of
unfixed checkpatch errors. I left that as-is so that tracking changes would
be easier.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>

9b7895ef

06 Jan, 2009 6 commits

ocfs2: Create ocfs2_xattr_value_buf. · 2a50a743

Authored by Joel Becker on Dec 09, 2008

When an ocfs2 extended attribute is large enough to require its own
allocation tree, we root it with an ocfs2_xattr_value_root.  However,
these roots can be a part of inodes, xattr blocks, or xattr buckets.
Thus, they need a different journal access function for each container.

We wrap the bh, its journal access function, and the value root (xv) in
a structure called ocfs2_xattr_value_buf.  This is a package that can
be passed around.  In this first pass, we simply pass it to the
extent tree code.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

2a50a743

ocfs2: Use metadata-specific ocfs2_journal_access_*() functions. · 13723d00

Authored by Joel Becker on Oct 17, 2008

The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2
commit triggers and allow us to compute metadata ecc right before the
buffers are written out.  This commit provides ecc for inodes, extent
blocks, group descriptors, and quota blocks.  It is not safe to use
extended attributes and metaecc at the same time yet.

The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide
the type of block at their root.  Before, it didn't matter, but now the
root block must use the appropriate ocfs2_journal_access_*() function.
To keep this abstract, the structures now have a pointer to the matching
journal_access function and a wrapper call to call it.

A few places use naked ocfs2_write_block() calls instead of adding the
blocks to the journal.  We make sure to calculate their checksum and ecc
before the write.

Since we pass around the journal_access functions, let's typedef them
in ocfs2.h.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

13723d00
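
The "pointer to the matching journal_access function and a wrapper call" described in 13723d00 might look roughly like the following userspace sketch. All names and signatures here are assumptions made for illustration; the real functions take journal handles and buffer heads. The sketch only shows how a typedef'd function pointer lets the tree code stay ignorant of the root block's type.

```c
#include <stddef.h>

/* Hypothetical block types; the kernel distinguishes many more. */
enum { BLK_INODE = 1, BLK_EXTENT = 2 };

struct block {
	int type;   /* what kind of metadata this block holds */
	int armed;  /* which per-type trigger was hooked up, 0 if none */
};

/* One typedef shared by every per-metadata-type access function. */
typedef int (*journal_access_func)(struct block *blk);

static int journal_access_inode(struct block *blk)
{
	if (blk->type != BLK_INODE)
		return -1;
	blk->armed = BLK_INODE;   /* stands in for hooking the inode trigger */
	return 0;
}

static int journal_access_extent(struct block *blk)
{
	if (blk->type != BLK_EXTENT)
		return -1;
	blk->armed = BLK_EXTENT;  /* stands in for hooking the extent trigger */
	return 0;
}

/* The tree hides its root type but remembers how to journal it. */
struct tree {
	struct block *root;
	journal_access_func root_journal_access;
};

/* Wrapper call: callers never choose the per-type function themselves. */
static int tree_root_journal_access(struct tree *t)
{
	return t->root_journal_access(t->root);
}
```

The design choice is that the type decision is made once, where the tree is set up, instead of at every call site that touches the root block.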

ocfs2: Wrap extent block reads in a dedicated function. · 5e96581a

Authored by Joel Becker on Nov 13, 2008

We weren't consistently checking extent blocks after we read them.
Most places checked the signature, but none checked h_blkno or
h_fs_signature.  Create a toplevel ocfs2_read_extent_block() that does
the read and the validation.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

5e96581a
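
A minimal sketch of the kind of validation 5e96581a centralizes is given below. The header layout, the signature string, and the field name h_fs_generation are assumptions made for this example (the commit message calls the third check h_fs_signature); it is a userspace model of "read, then validate in one place", not the kernel function.

```c
#include <stdint.h>
#include <string.h>

#define EB_SIGNATURE "EXBLK01"  /* assumed signature string */

/* Hypothetical, simplified extent block header. */
struct extent_block_hdr {
	char     h_signature[8];
	uint64_t h_blkno;          /* block number the block believes it has */
	uint32_t h_fs_generation;  /* ties the block to one filesystem instance */
};

/* Returns 0 when the header matches the block we asked for, -1 otherwise. */
static int validate_extent_block(const struct extent_block_hdr *eb,
				 uint64_t expected_blkno,
				 uint32_t fs_generation)
{
	/* sizeof includes the trailing NUL, so all 8 bytes are compared. */
	if (memcmp(eb->h_signature, EB_SIGNATURE, sizeof(EB_SIGNATURE)) != 0)
		return -1;
	if (eb->h_blkno != expected_blkno)
		return -1;
	if (eb->h_fs_generation != fs_generation)
		return -1;
	return 0;
}
```

A read wrapper would call this immediately after the block arrives, so that no caller can forget one of the three checks.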

ocfs2: turn __ocfs2_remove_inode_range() into ocfs2_remove_btree_range() · fecc0112

Authored by Mark Fasheh on Nov 12, 2008

This patch genericizes the high level handling of extent removal.
ocfs2_remove_btree_range() is nearly identical to
__ocfs2_remove_inode_range(), except that extent tree operations have been
used where necessary. We update ocfs2_remove_inode_range() to use the
generic helper. Now extent tree based structures have an easy way to
truncate ranges.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>

fecc0112

ocfs2/xattr: Reserve meta/data at the beginning of ocfs2_xattr_set. · 78f30c31

Authored by Tao Ma on Nov 12, 2008

In ocfs2 xattr set, we reserve metadata and clusters wherever they are
needed. That is time-consuming and inefficient, so this patch tries to
reserve metadata and clusters at the beginning of ocfs2_xattr_set.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

78f30c31

ocfs2: Add clusters free in dealloc_ctxt. · 2891d290

Authored by Tao Ma on Nov 12, 2008

Currently in ocfs2 xattr set, the whole process is divided into many small
parts that are wrapped in different transactions, so the set doesn't
behave like a real transaction. We want to integrate it into a real one.

In some cases we will allocate some clusters and free others in a single
transaction. For example, suppose an xattr is larger than the inline size,
so the entry and its value root are stored within the inode while the value
lives outside in a cluster. If we then update it with a smaller value
(larger than the size of the root but smaller than the inline size), we may
need to free the outside cluster while allocating a new bucket (one
cluster), since the inode may now be full. The old solution locks the
global_bitmap (if the local alloc failed in a stress test) and then the
truncate log. This can cause an ABBA deadlock with truncate log flush.

This patch adds cluster freeing to dealloc_ctxt, so that we can record
the clusters to be freed during the transaction and then free them after we
release the global_bitmap in xattr set.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

2891d290
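
The "record now, free later" scheme from 2891d290 can be sketched in userspace as follows. The structure layout, the fixed-size array, and the function names are all hypothetical; the real dealloc context is a kernel structure with its own lists. The sketch shows only the ordering idea: frees are queued while the allocator lock is held and carried out after it has been released.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEFERRED 16  /* arbitrary capacity for this sketch */

struct free_rec {
	uint64_t start;  /* first cluster to free */
	uint32_t count;  /* number of clusters */
};

/* Hypothetical deferred-free context. */
struct dealloc_ctxt {
	struct free_rec recs[MAX_DEFERRED];
	size_t nr;
};

/* Called inside the transaction: remember the range, free nothing yet. */
static int defer_cluster_free(struct dealloc_ctxt *ctxt,
			      uint64_t start, uint32_t count)
{
	if (ctxt->nr == MAX_DEFERRED)
		return -1;
	ctxt->recs[ctxt->nr].start = start;
	ctxt->recs[ctxt->nr].count = count;
	ctxt->nr++;
	return 0;
}

/*
 * Called after the allocator lock has been dropped. free_fn does the real
 * work and may be NULL in a dry run. Returns the number of clusters handled.
 */
static uint32_t run_deferred_frees(struct dealloc_ctxt *ctxt,
				   void (*free_fn)(uint64_t, uint32_t))
{
	uint32_t total = 0;

	for (size_t i = 0; i < ctxt->nr; i++) {
		if (free_fn)
			free_fn(ctxt->recs[i].start, ctxt->recs[i].count);
		total += ctxt->recs[i].count;
	}
	ctxt->nr = 0;
	return total;
}
```

Because the second lock is only taken in run_deferred_frees(), the two locks are never held in the conflicting order within one code path.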

14 Oct, 2008 10 commits

ocfs2: Change ocfs2_get_*_extent_tree() to ocfs2_init_*_extent_tree() · 8d6220d6

Authored by Joel Becker on Aug 22, 2008

The original get/put_extent_tree() functions held a reference on
et_root_bh.  However, every single caller already has a safe reference,
making the get/put cycle irrelevant.

We change ocfs2_get_*_extent_tree() to ocfs2_init_*_extent_tree().  It
no longer gets a reference on et_root_bh.  ocfs2_put_extent_tree() is
removed.  Callers now have a simpler init+use pattern.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

8d6220d6

ocfs2: Make ocfs2_extent_tree the first-class representation of a tree. · f99b9b7c

Authored by Joel Becker on Aug 20, 2008

We now have three different kinds of extent trees in ocfs2: inode data
(dinode), extended attributes (xattr_tree), and extended attribute
values (xattr_value).  There is a nice abstraction for them,
ocfs2_extent_tree, but it is hidden in alloc.c.  All the calling
functions have to pick amongst a varied API and pass in type bits and
often extraneous pointers.

A better way is to make ocfs2_extent_tree a first-class object.
Everyone converts their object to an ocfs2_extent_tree() via the
ocfs2_get_*_extent_tree() calls, then uses the ocfs2_extent_tree for all
tree calls to alloc.c.

This simplifies a lot of callers, improving readability.  It also
provides an easy way to add additional extent tree types, as they only
need to be defined in alloc.c with an ocfs2_get_<new>_extent_tree()
function.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

f99b9b7c

ocfs2: Create specific get_extent_tree functions. · 1a09f556

Authored by Joel Becker on Aug 20, 2008

A caller knows what kind of extent tree they have.  There's no reason
they have to call ocfs2_get_extent_tree() with a NULL when they could
just as easily call a specific function to their type of extent tree.

Introduce ocfs2_dinode_get_extent_tree(),
ocfs2_xattr_tree_get_extent_tree(), and
ocfs2_xattr_value_get_extent_tree().  They only take the necessary
arguments, calling into the underlying __ocfs2_get_extent_tree() to do
the real work.

__ocfs2_get_extent_tree() is the old ocfs2_get_extent_tree(), but
without needing any switch-by-type logic.

ocfs2_get_extent_tree() is now a wrapper around the specific calls.  It
exists because a couple alloc.c functions can take et_type.  This will
go later.

Another benefit is that ocfs2_xattr_value_get_extent_tree() can take a
struct ocfs2_xattr_value_root* instead of void*.  This gives us
typechecking where we didn't have it before.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

1a09f556

ocfs2: Optionally limit extent size in ocfs2_insert_extent() · ca12b7c4

Authored by Tao Ma on Aug 18, 2008

In xattr buckets, we want to limit the maximum size of a btree leaf,
otherwise we'll lose the benefits of hashing because we'll have to search
large leaves.

So add a new field in ocfs2_extent_tree which indicates the maximum leaf cluster
size we want so that we can prevent ocfs2_insert_extent() from merging the leaf
record even if it is contiguous with an adjacent record.

Other btree types are not affected by this change.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ca12b7c4
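
The merge limit described in ca12b7c4 can be sketched as a contiguity check with an optional cap. The record layout, the function name, and the convention that a limit of 0 means "no limit" are assumptions for this example, and physical positions are expressed in clusters to keep the arithmetic simple.

```c
#include <stdint.h>

/* Hypothetical leaf record: logical start, physical start, length. */
struct extent_rec {
	uint32_t cpos;      /* logical cluster offset */
	uint32_t clusters;  /* length in clusters */
	uint64_t pstart;    /* physical start, in clusters */
};

/*
 * Returns 1 if "ins" may be merged onto the right edge of "left".
 * max_leaf_clusters == 0 is taken to mean "no limit" (assumed convention).
 */
static int can_merge(const struct extent_rec *left,
		     const struct extent_rec *ins,
		     uint32_t max_leaf_clusters)
{
	/* Both the logical and the physical ranges must be adjacent. */
	if ((uint64_t)left->cpos + left->clusters != ins->cpos)
		return 0;
	if (left->pstart + left->clusters != ins->pstart)
		return 0;

	/* Even a contiguous record is kept separate if it would grow too big. */
	if (max_leaf_clusters &&
	    (uint64_t)left->clusters + ins->clusters > max_leaf_clusters)
		return 0;

	return 1;
}
```

With the limit left at 0, the check reduces to plain contiguity, which is consistent with the statement that other btree types are not affected.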

ocfs2: Add xattr index tree operations · ba492615

Authored by Tao Ma on Aug 18, 2008

When necessary, an ocfs2_xattr_block will embed an ocfs2_extent_list to
store large numbers of EAs. This patch adds a new type in
ocfs2_extent_tree_type and adds the implementation so that we can re-use the
b-tree code to handle the storage of many EAs.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ba492615

ocfs2: Add extent tree operation for xattr value btrees · f56654c4

Authored by Tao Ma on Aug 18, 2008

Add some thin wrappers around ocfs2_insert_extent() for each of the 3
different btree types: ocfs2_dinode_insert_extent(),
ocfs2_xattr_value_insert_extent() and ocfs2_xattr_tree_insert_extent(). The
last is for the xattr index btree, which will be used in a followup patch.

All the old callers in file.c etc. will call ocfs2_dinode_insert_extent(),
while the other two handle the xattr cases. Initialization of the extent
tree is also handled by these functions.

When storing an xattr value which is too large, we will allocate some clusters
for it, and here ocfs2_extent_list and ocfs2_extent_rec will also be used. In
order to re-use the b-tree operation code, a new parameter named "private"
is added to ocfs2_extent_tree; it is used to indicate the root of the
ocfs2_extent_list. The reason is that we can't deduce the root from the
buffer_head now. It may be in an inode, an ocfs2_xattr_block or even worse,
anywhere in an ocfs2_xattr_bucket.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

f56654c4

ocfs2: Make high level btree extend code generic · 0eb8d47e

Authored by Tao Ma on Aug 18, 2008

Factor out the non-inode specifics of ocfs2_do_extend_allocation() into a more generic
function, ocfs2_do_cluster_allocation(). ocfs2_do_extend_allocation() calls
ocfs2_do_cluster_allocation() now, but the latter can be used for other
btree types as well.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

0eb8d47e

ocfs2: Abstract ocfs2_extent_tree in b-tree operations. · e7d4cb6b

Authored by Tao Ma on Aug 18, 2008

In the old extent tree operations, we assume that we
are using the ocfs2_extent_list in ocfs2_dinode as the tree root.
As xattr will also use ocfs2_extent_list to store a large value
for an xattr entry, we refactor the tree operations so that xattr
can use them directly.

The refactoring includes 4 steps:
1. Abstract set/get of last_eb_blk and update_clusters since they may
   be stored in different locations for dinode and xattr.
2. Add a new structure named ocfs2_extent_tree to indicate the
   extent tree the operation will work on.
3. Remove all uses of fe_bh and di; use root_bh and root_el in the
   extent tree instead. So now every fe_bh is replaced with
   et->root_bh, and el with root_el accordingly.
4. Make ocfs2_lock_allocators generic. Until now it could only be used
   for file extend allocation, but the whole function is useful when we want
   to store large EAs.

Note: This patch doesn't touch ocfs2_commit_truncate() since it is not used
for anything other than truncating inode data btrees.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

e7d4cb6b

ocfs2: Use ocfs2_extent_list instead of ocfs2_dinode. · 811f933d

Authored by Tao Ma on Aug 18, 2008

ocfs2_extend_meta_needed(), ocfs2_calc_extend_credits() and
ocfs2_reserve_new_metadata() are all useful for extent tree operations. But
they are all limited to an inode btree because they use a struct
ocfs2_dinode parameter. Change their parameter to struct ocfs2_extent_list
(the part of an ocfs2_dinode they actually use) so that the xattr btree code
can use these functions.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

811f933d

ocfs2: Modify ocfs2_num_free_extents for future xattr usage. · 231b87d1

Authored by Tao Ma on Aug 18, 2008

ocfs2_num_free_extents() is used to find the number of free extent records
in an inode btree. Hence, it takes an "ocfs2_dinode" parameter. We want to
use this for extended attribute trees in the future, so genericize the
interface to take a buffer head. A future patch will allow that buffer_head
to contain any structure rooting an ocfs2 btree.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

231b87d1

04 Oct, 2008 1 commit

ocfs2: fiemap support · 00dc417f

Authored by Mark Fasheh on Oct 03, 2008

Plug ocfs2 into ->fiemap. Some portions of ocfs2_get_clusters() had to be
refactored so that the extent cache can be skipped in favor of going
directly to the on-disk records. This makes it easier for us to determine
which extent is the last one in the btree. Also, I'm not sure we want to be
caching fiemap lookups anyway as they're not directly related to data
read/write.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: ocfs2-devel@oss.oracle.com
Cc: linux-fsdevel@vger.kernel.org

00dc417f

13 Oct, 2007 2 commits

ocfs2: Write support for directories with inline data · 5b6a3a2b

Authored by Mark Fasheh on Sep 13, 2007

Create all new directories with OCFS2_INLINE_DATA_FL and the inline data
bytes formatted as an empty directory. Inode size field reflects the actual
amount of inline data available, which makes searching for dirent space
very similar to the regular directory search.

Inline-data directories are automatically pushed out to extents on any
insert request which is too large for the available space.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

5b6a3a2b

ocfs2: Write support for inline data · 1afc32b9

Authored by Mark Fasheh on Sep 07, 2007

This fixes up write, truncate, mmap, and RESVSP/UNRESVSP to understand inline
inode data.

For the most part, the changes to the core write code can be relied on to do
the heavy lifting. Any code calling ocfs2_write_begin (including shared
writeable mmap) can count on it doing the right thing with respect to
growing inline data to an extent tree.

Size reducing truncates, including UNRESVSP, can simply zero that portion of
the inode block being removed. Size increasing truncates, including RESVSP,
have to be a little bit smarter and grow the inode to an extent tree if
necessary.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

1afc32b9

11 Jul, 2007 6 commits

ocfs2: support for removing file regions · 063c4561

Authored by Mark Fasheh on Jul 03, 2007

Provide an internal interface for the removal of arbitrary file regions.

ocfs2_remove_inode_range() takes a byte range within a file and will remove
existing extents within that range. Partial clusters will be zeroed so that
any read from within the region will return zeros.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

063c4561

ocfs2: update truncate handling of partial clusters · 35edec1d

Authored by Mark Fasheh on Jul 06, 2007

The partial cluster zeroing code used during truncate usually assumes that
the rightmost byte in the range to be zeroed lies on a cluster boundary.
This makes sense for truncate, but punching holes might require zeroing on
non-aligned rightmost boundaries.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

35edec1d

ocfs2: Support creation of unwritten extents · 2ae99a60

Authored by Mark Fasheh on Mar 09, 2007

This can now be trivially supported with re-use of our existing extend code.

ocfs2_allocate_unwritten_extents() takes a start offset and a byte length
and iterates over the inode, adding extents (marked as unwritten) until len
is reached. Existing extents are skipped over.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2ae99a60

ocfs2: btree changes for unwritten extents · 328d5752

Authored by Mark Fasheh on Jun 18, 2007

Writes to a region marked as unwritten might result in a record split or
merge. We can support splits by making minor changes to the existing insert
code. Merges require left rotations which mostly re-use right rotation
support functions.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

328d5752

ocfs2: plug truncate into cached dealloc routines · 59a5e416

Authored by Mark Fasheh on Jun 22, 2007

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

59a5e416

ocfs2: simplify deallocation locking · 2b604351

Authored by Mark Fasheh on Jun 22, 2007

Deallocation of suballocator blocks, most notably extent blocks, might
involve multiple suballocator inodes.

The locking for this can get extremely complicated, especially when the
suballocator inodes to delete from aren't known until deep within an
unrelated codepath.

Implement a simple scheme for recording the blocks to be unlinked so that
the actual deallocation can be done in a context which won't deadlock.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2b604351

27 Apr, 2007 4 commits

ocfs2: make room for unwritten extents flag · e48edee2

Authored by Mark Fasheh on Mar 07, 2007

Due to the size of our group bitmaps, we'll never have a leaf node extent
record with more than 16 bits worth of clusters. Split e_clusters up so that
leaf nodes can get a flags field where we can mark unwritten extents.
Interior nodes whose length references all the child nodes beneath it can't
split their e_clusters field, so we use a union to preserve sizing there.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e48edee2
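
The field split described in e48edee2 can be pictured with the following sketch. The member names other than e_clusters' replacement idea, the flag value, and the helper functions are assumptions for illustration, and the on-disk byte order is ignored. What it demonstrates is that a leaf record gains a flags byte while the record size, and the 32-bit interior view, stay unchanged.

```c
#include <stdint.h>

#define EXT_UNWRITTEN 0x01  /* assumed flag value for an unwritten extent */

/* Hypothetical in-memory picture of an extent record. */
struct extent_rec {
	uint32_t e_cpos;                   /* logical cluster offset */
	union {
		uint32_t e_int_clusters;   /* interior: clusters under all children */
		struct {
			uint16_t e_leaf_clusters;  /* leaf: clusters in this extent */
			uint8_t  e_reserved1;
			uint8_t  e_flags;          /* leaf only, e.g. EXT_UNWRITTEN */
		};
	};
	uint64_t e_blkno;                  /* physical block of extent or child */
};

/* The union keeps the record the same size as before the split. */
_Static_assert(sizeof(struct extent_rec) == 16, "record size must not change");

/* Fill in the leaf view of a record. */
static void set_leaf(struct extent_rec *rec, uint16_t clusters, uint8_t flags)
{
	rec->e_leaf_clusters = clusters;
	rec->e_reserved1 = 0;
	rec->e_flags = flags;
}

/* Only meaningful for leaf records. */
static int rec_is_unwritten(const struct extent_rec *rec)
{
	return (rec->e_flags & EXT_UNWRITTEN) != 0;
}
```

Code that walks interior nodes would read e_int_clusters, while leaf code reads e_leaf_clusters and e_flags; the caller has to know which kind of node it is looking at.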

ocfs2: zero tail of sparse files on truncate · 60b11392

Authored by Mark Fasheh on Feb 16, 2007

Since we don't zero on extend anymore, truncate needs to be fixed up to zero
the part of a file between i_size and the end of its cluster. Otherwise a
subsequent extend could expose bad data.

This introduces a new helper, which can be used in ocfs2_write().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

60b11392
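
The byte range that 60b11392 says truncate must zero can be worked out with a little arithmetic. The function below is a hypothetical helper, not the one the commit adds; it assumes the cluster size is a power of two given as a bit shift.

```c
#include <stdint.h>

/*
 * Compute the half-open byte range [*start, *end) that must be zeroed after
 * truncating to new_i_size: from the new size up to the end of the cluster
 * that contains it. cluster_bits is log2 of the cluster size in bytes.
 */
static void truncate_zero_range(uint64_t new_i_size, unsigned int cluster_bits,
				uint64_t *start, uint64_t *end)
{
	uint64_t csize = (uint64_t)1 << cluster_bits;

	*start = new_i_size;
	/* Round up to the next cluster boundary; aligned sizes stay put. */
	*end = (new_i_size + csize - 1) & ~(csize - 1);
}
```

When the new size already sits on a cluster boundary, start equals end and there is nothing to zero.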

ocfs2: temporarily remove extent map caching · 363041a5

Authored by Mark Fasheh on Jan 17, 2007

The code in extent_map.c is not prepared to deal with a subtree being
rotated between lookups. This can happen when filling holes in sparse files.
Instead of a lengthy patch to update the code (which would likely lose the
benefit of caching subtree roots), we remove most of the algorithms and
implement a simple path based lookup. A less ambitious extent caching scheme
will be added in a later patch.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

363041a5

ocfs2: sparse b-tree support · dcd0538f

Authored by Mark Fasheh on Jan 16, 2007

Introduce tree rotations into the b-tree code. This will allow ocfs2 to
support sparse files. Much of the added code is designed to be generic (in
the ocfs2 sense) so that it can later be re-used to implement large
extended attributes.

This patch only adds the rotation code and does minimal updates to callers
of the extent api.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

dcd0538f

02 Dec, 2006 1 commit

ocfs2: Remove struct ocfs2_journal_handle in favor of handle_t · 1fabe148

Authored by Mark Fasheh on Oct 09, 2006

This is mostly a search and replace as ocfs2_journal_handle is now no more
than a container for a handle_t pointer.

ocfs2_commit_trans() becomes very straightforward, and we remove some
out-of-date comments / code.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

1fabe148

04 Jan, 2006 1 commit

[PATCH] OCFS2: The Second Oracle Cluster Filesystem · ccd979bd

Authored by Mark Fasheh on Dec 15, 2005

The OCFS2 file system module.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>

ccd979bd

openanolis / cloud-kernel synced successfully about 1 year ago