Commits · 68f64d471be38631d7196b938d9809802dd467fa · openeuler / raspberrypi-kernel

06 Jan, 2009 3 commits

ocfs2: Wrap group descriptor reads in a dedicated function. · 68f64d47

Authored by Joel Becker on Nov 13, 2008

We have a clean call for validating group descriptors, but every place
that wants one always does a read_block()+validate() call pair. Create
a toplevel ocfs2_read_group_descriptor() that does the right
thing. This allows us to leverage the single call point later for
fancier handling. We also add validation of gd->bg_generation against
the superblock and gd->bg_blkno against the block we thought we read.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
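
The read+validate wrapper described in this commit can be pictured with a minimal userspace sketch. Everything here is a simplified stand-in, not the kernel code: the `sketch_*` names, the in-memory "disk", and the choice of error codes are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real ocfs2 structures. */
#define SKETCH_GD_SIGNATURE "GROUP01"

struct sketch_group_desc {
	char     bg_signature[8];
	uint32_t bg_generation;
	uint64_t bg_blkno;
};

struct sketch_super {
	uint32_t fs_generation;
	const struct sketch_group_desc *blocks; /* fake "disk", indexed by block number */
	uint64_t nr_blocks;
};

/* Check signature, generation against the superblock, and self-reported block number. */
static int sketch_validate_group_descriptor(const struct sketch_super *sb,
					    const struct sketch_group_desc *gd,
					    uint64_t blkno)
{
	if (memcmp(gd->bg_signature, SKETCH_GD_SIGNATURE,
		   sizeof(SKETCH_GD_SIGNATURE)) != 0)
		return -EINVAL;
	if (gd->bg_generation != sb->fs_generation)
		return -EINVAL;
	if (gd->bg_blkno != blkno)
		return -EINVAL;
	return 0;
}

/* Single call point: read the block, then validate it before handing it back. */
static int sketch_read_group_descriptor(const struct sketch_super *sb,
					uint64_t blkno,
					const struct sketch_group_desc **gd_ret)
{
	const struct sketch_group_desc *gd;
	int rc;

	assert(sb != NULL && gd_ret != NULL);

	if (blkno >= sb->nr_blocks)
		return -EIO;
	gd = &sb->blocks[blkno];

	rc = sketch_validate_group_descriptor(sb, gd, blkno);
	if (rc)
		return rc;

	*gd_ret = gd;
	return 0;
}
```

Callers in this sketch never see an unvalidated descriptor, which is the property a single call point is meant to provide.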

ocfs2: Consolidate validation of group descriptors. · 57e3e797

Authored by Joel Becker on Nov 13, 2008

Currently the validation of group descriptors is directly duplicated so
that one version can error the filesystem and the other (resize) can
just report the problem.  Consolidate to one function that takes a
boolean.  Wrap that function with the old call for the old users.

This is in preparation for lifting the read+validate step into a
single function.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
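
A rough sketch of the "one function that takes a boolean, wrapped by the old call" shape follows. The structures, the single free-bits check, and the way the filesystem is "errored" are hypothetical simplifications, not the actual ocfs2 implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the filesystem and a group descriptor. */
struct sketch_fs { bool errored; };
struct sketch_gd { unsigned int bg_bits; unsigned int bg_free_bits_count; };

/*
 * One validation routine. With resize == true the problem is only
 * reported; otherwise the filesystem is marked as errored.
 */
static int sketch_validate_gd(struct sketch_fs *fs, const struct sketch_gd *gd,
			      bool resize)
{
	assert(fs != NULL && gd != NULL);

	if (gd->bg_free_bits_count <= gd->bg_bits)
		return 0;

	if (resize)
		fprintf(stderr, "group descriptor claims %u free bits of %u\n",
			gd->bg_free_bits_count, gd->bg_bits);
	else
		fs->errored = true;
	return -EINVAL;
}

/* The old entry point survives as a thin wrapper for the old users. */
static int sketch_check_gd(struct sketch_fs *fs, const struct sketch_gd *gd)
{
	return sketch_validate_gd(fs, gd, false);
}
```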

ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks. · 10995aa2

Authored by Joel Becker on Nov 13, 2008

Random places in the code would check a dinode bh to see if it was
valid.  Not only did they do different levels of validation, they
handled errors in different ways.

The previous commit unified inode block reads, validating all block
reads in the same place.  Thus, these haphazard checks are no longer
necessary.  Rather than eliminate them, however, we change them to
BUG_ON() checks.  This ensures the assumptions remain true.  All of the
code paths to these checks have been audited to ensure they come from a
validated inode read.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

15 Oct, 2008 2 commits

ocfs2: Simplify ocfs2_read_block() · 0fcaa56a

Authored by Joel Becker on Oct 09, 2008

More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED.
Only six pass a different flag set. Rather than have every caller care,
let's make ocfs2_read_block() take no flags and always do a cached read.
The remaining six places can call ocfs2_read_blocks() directly.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Require an inode for ocfs2_read_block(s)(). · 31d33073

Authored by Joel Becker on Oct 09, 2008

Now that synchronous readers are using ocfs2_read_blocks_sync(), all
callers of ocfs2_read_blocks() are passing an inode.  Use it
unconditionally.  Since it's there, we don't need to pass the
ocfs2_super either.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

14 Oct, 2008 9 commits

ocfs2: Don't check for NULL before brelse() · a81cb88b

Authored by Mark Fasheh on Oct 07, 2008

This is pointless as brelse() already does the check.

Signed-off-by: Mark Fasheh

ocfs2: Add the 'inode64' mount option. · 12462f1d

Authored by Joel Becker on Sep 03, 2008

Now that ocfs2 limits inode numbers to 32bits, add a mount option to
disable the limit.  This parallels XFS.  64bit systems can handle the
larger inode numbers.

[ Added description of inode64 mount option in ocfs2.txt. --Mark ]

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Limit inode allocation to 32bits. · 1187c968

Authored by Joel Becker on Sep 03, 2008

ocfs2 inode numbers are block numbers.  For any filesystem with fewer
than 2^32 blocks, this is not a problem.  However, when ocfs2 starts
using JBD2, it will be able to support filesystems with more than 2^32
blocks.  This would result in inode numbers higher than 2^32.

The problem is that stat(2) can't handle those numbers on 32bit
machines.  The simple solution is to have ocfs2 allocate all inodes
below that boundary.

The suballoc code is changed to honor an optional block limit.  Only the
inode suballocator sets that limit - all other allocations stay unlimited.

The biggest trick is to grow the inode suballocator beneath that limit.
There's no point in allocating block groups that are above the limit,
then rejecting their elements later on.  We want to prevent the inode
allocator from ever having block groups above the limit.  This involves
a little gyration with the local alloc code.  If the local alloc window
is above the limit, it signals the caller to try the global bitmap but
does not disable the local alloc file (which can be used for other
allocations).

[ Minor cleanup - removed an ML_NOTICE comment. --Mark ]

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
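
The "optional block limit" idea can be illustrated with a small sketch: a bitmap search that refuses to hand out blocks above a caller-supplied maximum, where a maximum of 0 means unlimited. The function name, the byte-per-bit bitmap, and the sentinel return value are assumptions for illustration. The block-group growth and local alloc handling described above are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "nothing could be allocated". */
#define SKETCH_NO_BLOCK UINT64_MAX

/*
 * Claim the first free bit of a group whose first block is group_start.
 * used[i] != 0 marks bit i as taken. max_block == 0 means "no limit";
 * otherwise no block number above max_block is ever returned.
 */
static uint64_t sketch_claim_block(uint8_t *used, size_t nr_bits,
				   uint64_t group_start, uint64_t max_block)
{
	size_t i;

	assert(used != NULL);

	for (i = 0; i < nr_bits; i++) {
		uint64_t blkno = group_start + i;

		if (max_block && blkno > max_block)
			break; /* every later bit is even further above the limit */
		if (!used[i]) {
			used[i] = 1;
			return blkno;
		}
	}
	return SKETCH_NO_BLOCK;
}
```

In this sketch an inode allocation would pass `UINT32_MAX` as the limit, while every other allocation would pass 0.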

ocfs2: Make ocfs2_extent_tree the first-class representation of a tree. · f99b9b7c

Authored by Joel Becker on Aug 20, 2008

We now have three different kinds of extent trees in ocfs2: inode data
(dinode), extended attributes (xattr_tree), and extended attribute
values (xattr_value).  There is a nice abstraction for them,
ocfs2_extent_tree, but it is hidden in alloc.c.  All the calling
functions have to pick amongst a varied API and pass in type bits and
often extraneous pointers.

A better way is to make ocfs2_extent_tree a first-class object.
Everyone converts their object to an ocfs2_extent_tree via the
ocfs2_get_*_extent_tree() calls, then uses the ocfs2_extent_tree for all
tree calls to alloc.c.

This simplifies a lot of callers and improves readability.  It also
provides an easy way to add additional extent tree types, as they only
need to be defined in alloc.c with an ocfs2_get_<new>_extent_tree()
function.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add extended attribute support · cf1d6c76

Authored by Tiger Yang on Aug 18, 2008

This patch implements storing extended attributes either in the inode or
in a single external block. We only store EAs in-inode when blocksize > 512
or when the inode block has free space for them. When an EA's value is
larger than 80 bytes, we store the value in a b-tree outside the inode or
block.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add extent tree operation for xattr value btrees · f56654c4

Authored by Tao Ma on Aug 18, 2008

Add some thin wrappers around ocfs2_insert_extent() for each of the 3
different btree types: ocfs2_dinode_insert_extent(),
ocfs2_xattr_value_insert_extent() and ocfs2_xattr_tree_insert_extent(). The
last is for the xattr index btree, which will be used in a followup patch.

All the old callers in file.c etc. will call ocfs2_dinode_insert_extent(),
while the other two handle the xattr cases. Initialization of the extent
tree is handled by these functions.

When storing an xattr value that is too large, we allocate some clusters
for it, and ocfs2_extent_list and ocfs2_extent_rec are used here as well. In
order to re-use the b-tree operation code, a new parameter named "private"
is added to ocfs2_extent_tree to indicate the root of the
ocfs2_extent_list. The reason is that we can no longer deduce the root from
the buffer_head: it may be in an inode, an ocfs2_xattr_block or, even worse,
anywhere in an ocfs2_xattr_bucket.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Abstract ocfs2_extent_tree in b-tree operations. · e7d4cb6b

Authored by Tao Ma on Aug 18, 2008

The old extent tree operations assume that the ocfs2_extent_list in
ocfs2_dinode is the tree root. As xattr will also use ocfs2_extent_list
to store large values for an xattr entry, we refactor the tree operations
so that xattr can use them directly.

The refactoring includes 4 steps:
1. Abstract set/get of last_eb_blk and update_clusters, since they may
   be stored in different locations for dinode and xattr.
2. Add a new structure named ocfs2_extent_tree to indicate the
   extent tree the operation will work on.
3. Remove all uses of fe_bh and di; use root_bh and root_el in the
   extent tree instead. So fe_bh is now replaced with et->root_bh,
   and el with root_el accordingly.
4. Make ocfs2_lock_allocators generic. Until now it could only be used
   for file extend allocation, but the whole function is useful when we
   want to store large EAs.

Note: This patch doesn't touch ocfs2_commit_truncate() since it is not used
for anything other than truncating inode data btrees.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Use ocfs2_extent_list instead of ocfs2_dinode. · 811f933d

Authored by Tao Ma on Aug 18, 2008

ocfs2_extend_meta_needed(), ocfs2_calc_extend_credits() and
ocfs2_reserve_new_metadata() are all useful for extent tree operations. But
they are all limited to an inode btree because they use a struct
ocfs2_dinode parameter. Change their parameter to struct ocfs2_extent_list
(the part of an ocfs2_dinode they actually use) so that the xattr btree code
can use these functions.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: throttle back local alloc when low on disk space · 9c7af40b

Authored by Mark Fasheh on Jul 28, 2008

Ocfs2's local allocator disables itself for the duration of a mount point
when it has trouble allocating a large enough area from the primary bitmap.
That can cause performance problems, especially for disks which were only
temporarily full or fragmented. This patch allows the allocator to
shrink its window first, before being disabled. Later, it can also be
re-enabled so that any performance drop is minimized.

To do this, we allow the value of osb->local_alloc_bits to be shrunk when
needed. The default value is recorded in a mostly read-only variable so that
we can re-initialize when required.

Locking had to be updated so that we could protect changes to
local_alloc_bits. Mostly this involves protecting various local alloc values
with the osb spinlock. A new state is also added, OCFS2_LA_THROTTLED, which
is used when the local allocator has shrunk, but is not disabled. If the
available space dips below 1 megabyte, the local alloc file is disabled. In
either case, local alloc is re-enabled 30 seconds after the event, or when
an appropriate number of bits is seen in the primary bitmap.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>

18 Apr, 2008 3 commits

ocfs2: Add inode stealing for ocfs2_reserve_new_inode · 4d0ddb2c

Authored by Tao Ma on Mar 05, 2008

Inode allocation is modified to look in other nodes' allocators during
extreme out-of-space situations. We retry our own slot when space is freed
back to the global bitmap, or whenever we've allocated more than 1024 inodes
from another slot.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add ac_alloc_slot in ocfs2_alloc_context · a4a48911

Authored by Tao Ma on Mar 03, 2008

In inode stealing, we no longer restrict the allocation to
happen in the local node. So it is necessary for us to add
a new member in ocfs2_alloc_context to indicate which slot
we are using for allocation. We also modify the process of
local alloc so that this member can be used there also.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add a new parameter for ocfs2_reserve_suballoc_bits · ffda89a3

Authored by Tao Ma on Mar 03, 2008

In some cases (inode stealing from other nodes), we may not want
ocfs2_reserve_suballoc_bits to allocate new groups from the
global_bitmap since it may already be full. So add a new parameter
for this.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

03 Feb, 2008 1 commit

fs/: Spelling fixes · c78bad11

Authored by Joe Perches on Feb 03, 2008

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>

26 Jan, 2008 3 commits

ocfs2: Local alloc window size changeable via mount option · 2fbe8d1e

Authored by Sunil Mushran on Dec 20, 2007

Local alloc is a performance optimization in ocfs2 in which a node
takes a window of bits from the global bitmap and then uses that for
all small local allocations. This window size is fixed to 8MB currently.
This patch allows users to specify the window size in MB including
disabling it by passing in 0. If the number specified is too large,
the fs will use the default value of 8MB.

mount -o localalloc=X /dev/sdX /mntpoint

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH 1/2] ocfs2: Add group extend for online resize · d659072f

Authored by Tao Ma on Dec 18, 2007

This patch adds the ability for a userspace program to request an extension
of the last cluster group on an Ocfs2 file system. The request is made via
ioctl, OCFS2_IOC_GROUP_EXTEND. This is derived from EXT3_IOC_GROUP_EXTEND,
but is obviously Ocfs2 specific.

tunefs.ocfs2 would call this for an online-resize operation if the last
cluster group isn't full.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Rename ocfs2_meta_[un]lock · e63aecb6

Authored by Mark Fasheh on Oct 18, 2007

Call this the "inode_lock" now, since it covers both data and metadata.
This patch makes no functional changes.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

21 Sep, 2007 1 commit

ocfs2: Allow smaller allocations during large writes · 415cb800

Authored by Mark Fasheh on Sep 16, 2007

The ocfs2 write code loops through a page much like the block code, except
that ocfs2 allocation units can be any size, including larger than page
size. Typically it's equal to or larger than page size - most kernels run 4k
pages, the minimum ocfs2 allocation (cluster) size.

Some changes introduced during 2.6.23 changed the way writes to pages are
handled, and inadvertently broke support for > 4k page size. Instead of just
writing one cluster at a time, we now handle the whole page in one pass.

This means that multiple (small) separate allocations might happen in the
same pass. The allocation code however typically optimizes by getting the
maximum which was reserved. This triggered a BUG_ON in the extend code where
it'd ask for a single bit (for one part of a > 4k page) and get back more
than it asked for.

Fix this by providing a variant of the high level allocation function which
allows the caller to specify a maximum. The traditional function remains and
just calls the new one with a maximum determined from the initial
reservation.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

11 Jul, 2007 3 commits

ocfs2: use all extent block suballocators · 1f6697d0

Authored by Mark Fasheh on Jun 25, 2007

Now that we have a method to deallocate blocks from them, each node should
allocate extent blocks from its local suballocator file.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: plug truncate into cached dealloc routines · 59a5e416

Authored by Mark Fasheh on Jun 22, 2007

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: simplify deallocation locking · 2b604351

Authored by Mark Fasheh on Jun 22, 2007

Deallocation of suballocator blocks, most notably extent blocks, might
involve multiple suballocator inodes.

The locking for this can get extremely complicated, especially when the
suballocator inodes to delete from aren't known until deep within an
unrelated codepath.

Implement a simple scheme for recording the blocks to be unlinked so that
the actual deallocation can be done in a context which won't deadlock.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

03 May, 2007 1 commit

ocfs2: fix sparse warnings in fs/ocfs2 · 1ca1a111

Authored by Mark Fasheh on Apr 27, 2007

None of these are actually harmful, but the noise makes looking for real
problems difficult.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

27 Apr, 2007 1 commit

ocfs2: Fix up i_blocks calculation to know about holes · 8110b073

Authored by Mark Fasheh on Mar 22, 2007

Older file systems which didn't support holes did a dumb calculation of
i_blocks based on i_size. This is no longer accurate, so fix things up to
take actual allocation into account.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

14 Dec, 2006 1 commit

[PATCH] Fix numerous kcalloc() calls, convert to kzalloc() · cd861280

Authored by Robert P. J. Day on Dec 13, 2006

All kcalloc() calls of the form "kcalloc(1,...)" are converted to the
equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect
ordering of the first two arguments are fixed.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
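
The conversion can be illustrated in userspace with `calloc()` standing in for the kernel allocators. The `sketch_kcalloc()` and `sketch_kzalloc()` helpers below are hypothetical stand-ins that drop the kernel's gfp flags argument; they only mirror the argument order (count first, element size second) and the single-object shorthand.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-ins, NOT the kernel API: the gfp flags argument is dropped. */
static void *sketch_kcalloc(size_t n, size_t size)
{
	return calloc(n, size);
}

static void *sketch_kzalloc(size_t size)
{
	return calloc(1, size);
}

struct sketch_item {
	int  id;
	char name[16];
};

/*
 * Before the conversion this would have read:
 *     sketch_kcalloc(1, sizeof(struct sketch_item));
 * A zeroed single object is exactly what the kzalloc-style call expresses.
 */
static struct sketch_item *sketch_item_new(void)
{
	return sketch_kzalloc(sizeof(struct sketch_item));
}

/* Arrays keep the kcalloc-style call: element count first, element size second. */
static struct sketch_item *sketch_item_array_new(size_t count)
{
	assert(count > 0);
	return sketch_kcalloc(count, sizeof(struct sketch_item));
}
```

Swapping the two size arguments still yields the same number of zeroed bytes, so the ordering fix is about matching the documented parameter meaning rather than changing behaviour.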

02 Dec, 2006 6 commits

ocfs2: Remove struct ocfs2_journal_handle in favor of handle_t · 1fabe148

Authored by Mark Fasheh on Oct 09, 2006

This is mostly a search and replace as ocfs2_journal_handle is now no more
than a container for a handle_t pointer.

ocfs2_commit_trans() becomes very straightforward, and we remove some out
of date comments / code.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: remove handle argument to ocfs2_start_trans() · 65eff9cc

Authored by Mark Fasheh on Oct 09, 2006

All callers either pass in NULL directly, or a local variable that is
already set to NULL.

The internals of ocfs2_start_trans() get a nice cleanup as a result.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: pass ocfs2_super * into ocfs2_commit_trans() · 02dc1af4

Authored by Mark Fasheh on Oct 09, 2006

This sets us up to remove handle->journal.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: remove unused handle argument from ocfs2_meta_lock_full() · 4bcec184

Authored by Mark Fasheh on Oct 09, 2006

Now that this is unused and all callers pass NULL, we can safely remove it.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: don't use handle for locking in allocation functions · da5cbf2f

Authored by Mark Fasheh on Oct 06, 2006

Instead we record our state on the allocation context structure, which all
callers already know about and whose lifetime they already manage correctly.
This means the reservation functions don't need a handle passed in any more,
and we can also take it off the alloc context.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: remove ocfs2_journal_handle flags field · c161f89b

Authored by Mark Fasheh on Oct 05, 2006

Callers can set h_sync directly on the handle_t. Whether a transaction has
been started or not can be determined via the existence of the handle_t on
the struct ocfs2_journal_handle.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

08 Aug, 2006 2 commits

ocfs2: allocation hints · 883d4cae

Authored by Mark Fasheh on Jun 05, 2006

Record the most recently used allocation group on the allocation context, so
that subsequent allocations can attempt to optimize for contiguousness.
Local alloc especially should benefit from this as the current chain search
tends to let it spew across the disk.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: better group descriptor consistency checks · 7bf72ede

Authored by Mark Fasheh on May 03, 2006

Try to catch corrupted group descriptors with some stronger checks placed in
a couple of strategic locations. Detect a failed resizefs and refuse to
allocate past what bitmap i_clusters allows.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

25 Mar, 2006 1 commit

ocfs2: don't use MLF* in the file system · b0697053

Authored by Mark Fasheh on Mar 03, 2006

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

04 Jan, 2006 1 commit

[PATCH] OCFS2: The Second Oracle Cluster Filesystem · ccd979bd

Authored by Mark Fasheh on Dec 15, 2005

The OCFS2 file system module.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>