提交 · 9f868f16e40e9ad8e39aebff94a4be0d96520734 · openanolis / cloud-kernel

06 1月, 2009 40 次提交

ocfs2/xattr: Restore not_found in xis · 9f868f16

由 Tao Ma 提交于 11月 19, 2008

During an xattr set, when we move a xattr which was stored in inode to the
outside bucket, we have to delete it and it will use the old value of
xis->not_found. xis->not_found is removed by ocfs2_calc_xattr_set_need
though, so we must restore it.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

9f868f16

ocfs2/xattr: Fix a bug in xattr allocation estimation · 97aff52a

由 Tao Ma 提交于 11月 19, 2008

When we extend one xattr's value to a large size, the old value size might
be smaller than the size of a value root. In those cases, we still need to
guess the metadata allocation.
Reported-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

97aff52a

ocfs2: Remove JBD compatibility layer · 53ef99ca

由 Mark Fasheh 提交于 11月 18, 2008

JBD2 is fully backwards compatible with JBD and it's been tested enough with
Ocfs2 that we can clean this code up now.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

53ef99ca

ocfs2: Convert ocfs2_read_dir_block() to ocfs2_read_virt_blocks() · 511308d9

由 Joel Becker 提交于 11月 13, 2008

Now that we've centralized the ocfs2_read_virt_blocks() code, let's use
it in ocfs2_read_dir_block().
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

511308d9

ocfs2: Wrap virtual block reads in ocfs2_read_virt_blocks() · a8549fb5

由 Joel Becker 提交于 11月 13, 2008

The ocfs2_read_dir_block() function really maps an inode's virtual
blocks to physical ones before calling ocfs2_read_blocks().  Let's
extract that to common code, because other places might want to do that.

Other than the block number being virtual, ocfs2_read_virt_blocks()
takes the same arguments as ocfs2_read_blocks().  It converts those
virtual block numbers to physical before calling ocfs2_read_blocks()
directly.  If the blocks asked for are discontiguous, this can mean
multiple calls to ocfs2_read_blocks(), but this is mostly hidden from
the caller.

Like ocfs2_read_blocks(), the caller can pass in an existing
buffer_head.  This is usually done to pick up some readahead I/O.
ocfs2_read_virt_blocks() checks the buffer_head's block number
against the extent map - it must match.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

a8549fb5

ocfs2: Validate metadata only when it's read from disk. · 970e4936

由 Joel Becker 提交于 11月 13, 2008

Add an optional validation hook to ocfs2_read_blocks(). Now the
validation function is only called when a block was actually read off of
disk. It is not called when the buffer was in cache.

We add a buffer state bit BH_NeedsValidate to flag these buffers. It
must always be one higher than the last JBD2 buffer state bit.

The dinode, dirblock, extent_block, and xattr_block validators are
lifted to this scheme directly. The group_descriptor validator needs to
be split into two pieces. The first part only needs the gd buffer and
is passed to ocfs2_read_block(). The second part requires the dinode as
well, and is called every time. It's only 3 compares, so it's tiny.
This also allows us to clean up the non-fatal gd check used by resize.c.
It now has no magic argument.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

970e4936

ocfs2: Wrap xattr block reads in a dedicated function · 4ae1d69b

由 Joel Becker 提交于 11月 13, 2008

We weren't consistently checking xattr blocks after we read them.
Most places checked the signature, but none checked xb_blkno or
xb_fs_signature.  Create a toplevel ocfs2_read_xattr_block() that does
the read and the validation.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

4ae1d69b

ocfs2: Wrap dirblock reads in a dedicated function. · a22305cc

由 Joel Becker 提交于 11月 13, 2008

We have ocfs2_bread() as a vestige of the original ext-based dir code.
It's only used by directories, though.  Turn it into
ocfs2_read_dir_block(), with a prototype matching the other metadata
read functions.  It's set up to validate dirblocks when the time comes.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

a22305cc

ocfs2: Wrap extent block reads in a dedicated function. · 5e96581a

由 Joel Becker 提交于 11月 13, 2008

We weren't consistently checking extent blocks after we read them.
Most places checked the signature, but none checked h_blkno or
h_fs_signature.  Create a toplevel ocfs2_read_extent_block() that does
the read and the validation.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

5e96581a

ocfs2: Morph the haphazard OCFS2_IS_VALID_GROUP_DESC() checks. · 42035306

由 Joel Becker 提交于 11月 13, 2008

Random places in the code would check a group descriptor bh to see if it
was valid. The previous commit unified descriptor block reads,
validating all block reads in the same place.  Thus, these checks are no
longer necessary.  Rather than eliminate them, however, we change them
to BUG_ON() checks.  This ensures the assumptions remain true.  All of
the code paths to these checks have been audited to ensure they come
from a validated descriptor read.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

42035306

ocfs2: Wrap group descriptor reads in a dedicated function. · 68f64d47

由 Joel Becker 提交于 11月 13, 2008

We have a clean call for validating group descriptors, but every place
that wants the always does a read_block()+validate() call pair. Create
a toplevel ocfs2_read_group_descriptor() that does the right
thing. This allows us to leverage the single call point later for
fancier handling. We also add validation of gd->bg_generation against
the superblock and gd->bg_blkno against the block we thought we read.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

68f64d47

ocfs2: Consolidate validation of group descriptors. · 57e3e797

由 Joel Becker 提交于 11月 13, 2008

Currently the validation of group descriptors is directly duplicated so
that one version can error the filesystem and the other (resize) can
just report the problem.  Consolidate to one function that takes a
boolean.  Wrap that function with the old call for the old users.

This is in preparation for lifting the read+validate step into a
single function.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

57e3e797

ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks. · 10995aa2

由 Joel Becker 提交于 11月 13, 2008

Random places in the code would check a dinode bh to see if it was
valid.  Not only did they do different levels of validation, they
handled errors in different ways.

The previous commit unified inode block reads, validating all block
reads in the same place.  Thus, these haphazard checks are no longer
necessary.  Rather than eliminate them, however, we change them to
BUG_ON() checks.  This ensures the assumptions remain true.  All of the
code paths to these checks have been audited to ensure they come from a
validated inode read.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

10995aa2

ocfs2: Wrap inode block reads in a dedicated function. · b657c95c

由 Joel Becker 提交于 11月 13, 2008

The ocfs2 code currently reads inodes off disk with a simple
ocfs2_read_block() call.  Each place that does this has a different set
of sanity checks it performs.  Some check only the signature.  A couple
validate the block number (the block read vs di->i_blkno).  A couple
others check for VALID_FL.  Only one place validates i_fs_generation.  A
couple check nothing.  Even when an error is found, they don't all do
the same thing.

We wrap inode reading into ocfs2_read_inode_block().  This will validate
all the above fields, going readonly if they are invalid (they never
should be).  ocfs2_read_inode_block_full() is provided for the places
that want to pass read_block flags.  Every caller is passing a struct
inode with a valid ip_blkno, so we don't need a separate blkno argument
either.

We will remove the validation checks from the rest of the code in a
later commit, as they are no longer necessary.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

b657c95c

ocfs2: add mount option and Kconfig option for acl · a68979b8

由 Tiger Yang 提交于 11月 14, 2008

This patch adds the Kconfig option "CONFIG_OCFS2_FS_POSIX_ACL"
and mount options "acl" to enable acls in Ocfs2.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

a68979b8

ocfs2: add ocfs2_init_acl in mknod · 89c38bd0

由 Tiger Yang 提交于 11月 14, 2008

We need to get the parent directories acls and let the new child inherit it.
To this, we add additional calculations for data/metadata allocation.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

89c38bd0

ocfs2: add ocfs2_acl_chmod · 060bc66d

由 Tiger Yang 提交于 11月 14, 2008

This function is used to update acl xattrs during file mode changes.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

060bc66d

ocfs2: add ocfs2_check_acl · 23fc2702

由 Tiger Yang 提交于 11月 14, 2008

This function is used to enhance permission checking with POSIX ACLs.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

23fc2702

ocfs2: add POSIX ACL API · 929fb014

由 Tiger Yang 提交于 11月 14, 2008

This patch adds POSIX ACL(access control lists) APIs in ocfs2. We convert
struct posix_acl to many ocfs2_acl_entry and regard them as an extended
attribute entry.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

929fb014

ocfs2: add ocfs2_xattr_get_nolock · 4e3e9d02

由 Tiger Yang 提交于 11月 14, 2008

This function does the work of ocfs2_xattr_get under an open lock.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

4e3e9d02

ocfs2: add ocfs2_init_security in during file create · 534eaddd

由 Tiger Yang 提交于 11月 14, 2008

Security attributes must be set when creating a new inode.

We do this in three steps.

- First, get security xattr's name and value by security_operation

- Calculate and reserve the meta data and clusters needed by this security
  xattr before starting transaction

- Finally, we set it before add_entry
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

534eaddd

ocfs2: add security xattr API · 923f7f31

由 Tiger Yang 提交于 11月 14, 2008

This patch add security xattr set/get/list APIs to
support security attributes in Ocfs2.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

923f7f31

ocfs2: add ocfs2_xattr_set_handle · 6c3faba4

由 Tiger Yang 提交于 11月 14, 2008

This function is used to set xattr's in a started transaction. It is only
called during inode creation inode for initial security/acl xattrs of the
new inode. These xattrs could be put into ibody or extent block, so xattr
bucket would not be use in this case.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

6c3faba4

ocfs2: move new inode allocation out of the transaction · f5d36202

由 Tiger Yang 提交于 11月 14, 2008

Move out inode allocation from ocfs2_mknod_locked() because
vfs_dq_init() must be called outside of a transaction.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

f5d36202

ocfs2: turn __ocfs2_remove_inode_range() into ocfs2_remove_btree_range() · fecc0112

由 Mark Fasheh 提交于 11月 12, 2008

This patch genericizes the high level handling of extent removal.
ocfs2_remove_btree_range() is nearly identical to
__ocfs2_remove_inode_range(), except that extent tree operations have been
used where necessary. We update ocfs2_remove_inode_range() to use the
generic helper. Now extent tree based structures have an easy way to
truncate ranges.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>
Acked-by: NJoel Becker <joel.becker@oracle.com>

fecc0112

ocfs2/xattr: Merge xattr set transaction. · 85db90e7

由 Tao Ma 提交于 11月 12, 2008

In current ocfs2/xattr, the whole xattr set is divided into
many steps are many transaction are used, this make the
xattr set process isn't like a real transaction, so this
patch try to merge all the transaction into one. Another
benefit is that acl can use it easily now.

I don't merge the transaction of deleting xattr when we
remove an inode. The reason is that if we have a large number
of xattrs and every xattrs has large values(large enough
for outside storage), the whole transaction will be very
huge and it looks like jbd can't handle it(I meet with a
jbd complain once). And the old inode removal is also divided
into many steps, so I'd like to leave as it is.

Note:
In xattr set, I try to avoid ocfs2_extend_trans since if
the credits aren't enough for the extension, it will commit
all the dirty blocks and create a new transaction which may
lead to inconsistency in metadata. All ocfs2_extend_trans
remained are safe now.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

85db90e7

ocfs2/xattr: Reserve meta/data at the beginning of ocfs2_xattr_set. · 78f30c31

由 Tao Ma 提交于 11月 12, 2008

In ocfs2 xattr set, we reserve metadata and clusters in any place
they are needed. It is time-consuming and ineffective, so this
patch try to reserve metadata and clusters at the beginning of
ocfs2_xattr_set.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

78f30c31

ocfs2/xattr: Move clusters free into dealloc. · c73f60f9

由 Tao Ma 提交于 11月 12, 2008

Move clusters free process into dealloc context so that
they can be freed after the transaction.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

c73f60f9

ocfs2: Add clusters free in dealloc_ctxt. · 2891d290

由 Tao Ma 提交于 11月 12, 2008

Now in ocfs2 xattr set, the whole process are divided into many small
parts and they are wrapped into diffrent transactions and it make the
set doesn't look like a real transaction. So we want to integrate it
into a real one.

In some cases we will allocate some clusters and free some in just one
transaction. e.g, one xattr is larger than inline size, so it and its
value root is stored within the inode while the value is outside in a
cluster. Then we try to update it with a smaller value(larger than the
size of root but smaller than inline size), we may need to free the
outside cluster while allocate a new bucket(one cluster) since now the
inode may be full. The old solution will lock the global_bitmap(if the
local alloc failed in stress test) and then the truncate log. This will
cause a ABBA lock with truncate log flush.

This patch add the clusters free in dealloc_ctxt, so that we can record
the free clusters during the transaction and then free it after we
release the global_bitmap in xattr set.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

2891d290

ocfs2/xattr: Only extend xattr bucket in need. · 976331d8

由 Tao Ma 提交于 11月 12, 2008

When the first block of a bucket is filled up with xattr
entries, we normally extend the bucket. But if we are
just replace one xattr with small length, we don't need
to extend it. This is important since we will calculate
what we need before the transaction and in this situation
no resources will be allocated.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

976331d8

ocfs2/xattr: Only set buffer update if it doesn't exist in cache. · 757055ad

由 Tao Ma 提交于 11月 06, 2008

When we call ocfs2_init_xattr_bucket, we deem that the new buffer head
will be written to disk immediately, so we just use sb_getblk. But in
some cases the buffer may have already been in ocfs2 uptodate cache,
so we only call ocfs2_set_buffer_uptodate if the buffer head isn't
in the cache.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

757055ad

ocfs2/xattr: Remove additional bucket allocation in bucket defragment. · 1c32a2fd

由 Tao Ma 提交于 11月 06, 2008

Joel has refactored xattr bucket and make xattr bucket a general
wrapper. So in ocfs2_defrag_xattr_bucket, we have already passed the
bucket in, so there is no need to allocate a new one and read it.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

1c32a2fd

ocfs2: Use buckets in ocfs2_xattr_set_entry_in_bucket(). · 02dbf38d

由 Joel Becker 提交于 10月 27, 2008

The ocfs2_xattr_set_entry_in_bucket() function is already working on an
ocfs2_xattr_bucket structure, so let's use the bucket API.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

02dbf38d

ocfs2: Use buckets in ocfs2_defrag_xattr_bucket(). · 161d6f30

由 Joel Becker 提交于 10月 27, 2008

Use the ocfs2_xattr_bucket abstraction for reading and writing the
bucket in ocfs2_defrag_xattr_bucket().
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

161d6f30

ocfs2: Use buckets in ocfs2_xattr_create_index_block(). · 178eeac3

由 Joel Becker 提交于 10月 27, 2008

Use the ocfs2_xattr_bucket abstraction in
ocfs2_xattr_create_index_block() and its helpers.  We get more efficient
reads, a lot less buffer_head munging, and nicer code to boot.  While
we're at it, ocfs2_xattr_update_xattr_search() becomes void.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

178eeac3

ocfs2: Use buckets in ocfs2_xattr_bucket_find(). · e2356a3f

由 Joel Becker 提交于 10月 27, 2008

Change the ocfs2_xattr_bucket_find() function to use ocfs2_xattr_bucket
as its abstraction.  This makes for more efficient reads, as buckets are
linear blocks, and also has improved caching characteristics.  It also
reads better.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

e2356a3f

ocfs2: Take ocfs2_xattr_bucket structures off of the stack. · ba937127

由 Joel Becker 提交于 10月 24, 2008

The ocfs2_xattr_bucket structure is a nice abstraction, but it is a bit
large to have on the stack.  Just like ocfs2_path, let's allocate it
with a ocfs2_xattr_bucket_new() function.

We can now store the inode on the bucket, cleaning up all the other
bucket functions.  While we're here, we catch another place or two that
wasn't using ocfs2_read_xattr_bucket().

Updates:
- No longer allocating xis.bucket, as it will never be used.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

ba937127

ocfs2: Copy xattr buckets with a dedicated function. · 4980c6da

由 Joel Becker 提交于 10月 24, 2008

Now that the places that copy whole buckets are using struct
ocfs2_xattr_bucket, we can do the copy in a dedicated function.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

4980c6da

ocfs2: Wrap journal_access/journal_dirty for xattr buckets. · 1224be02

由 Joel Becker 提交于 10月 24, 2008

A common action is to call ocfs2_journal_access() and
ocfs2_journal_dirty() on the buffer heads of an xattr bucket.  Let's
create nice wrappers.

While we're there, let's drop the places that try to be smart by writing
only the first and last blocks of a bucket.  A bucket is contiguous, so
writing the whole thing is actually more efficient.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

1224be02

ocfs2: Improve ocfs2_read_xattr_bucket(). · 784b816a

由 Joel Becker 提交于 10月 24, 2008

The ocfs2_read_xattr_bucket() function would read an xattr bucket into a
list of buffer heads. However, we have a nice ocfs2_xattr_bucket
structure. Let's have it fill that out instead.

In addition, ocfs2_read_xattr_bucket() would initialize buffer heads for
a bucket that's never been on disk before. That's confusing. Let's
call that functionality ocfs2_init_xattr_bucket().

The functions ocfs2_cp_xattr_bucket() and ocfs2_half_xattr_bucket() are
updated to use the ocfs2_xattr_bucket structure rather than raw bh
lists. That way they can use the new read/init calls. In addition,
they drop the wasted read of an existing target bucket.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

784b816a

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功