提交 · 6b27f62fc750d85bc6fc3718b3b38ec60edc2d74 · openeuler / Kernel

04 6月, 2011 1 次提交

more conservative S_NOSEC handling · 9e1f1de0

由 Al Viro 提交于 6月 03, 2011

Caching "we have already removed suid/caps" was overenthusiastic as merged.
On network filesystems we might have had suid/caps set on another client,
silently picked by this client on revalidate, all of that *without* clearing
the S_NOSEC flag.

AFAICS, the only reasonably sane way to deal with that is
	* new superblock flag; unless set, S_NOSEC is not going to be set.
	* local block filesystems set it in their ->mount() (more accurately,
mount_bdev() does, so does btrfs ->mount(), users of mount_bdev() other than
local block ones clear it)
	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
it'll need to set MS_NOSEC in sb->s_flags *AND* take care to clear S_NOSEC when
inode attribute changes are picked from other clients.

It's not an earth-shattering hole (anybody that can set suid on another client
will almost certainly be able to write to the file before doing that anyway),
but it's a bug that needs fixing.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9e1f1de0

27 5月, 2011 1 次提交

ocfs2: add cleancache support · 1cfd8bd0

由 Dan Magenheimer 提交于 5月 26, 2011

This eighth patch of eight in this cleancache series "opts-in"
cleancache for ocfs2.  Clustered filesystems must explicitly enable
cleancache by calling cleancache_init_shared_fs anytime an instance
of the filesystem is mounted.  Ocfs2 is currently the only user of
the clustered filesystem interface but nevertheless, the cleancache
hooks in the VFS layer are sufficient for ocfs2 including the matching
cleancache_flush_fs hook which must be called on unmount.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v8: trivial merge conflict update]
[v5: jeremy@goop.org: simplify init hook and any future fs init changes]
Signed-off-by: NDan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Reviewed-by: NJeremy Fitzhardinge <jeremy@goop.org>
Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik Van Riel <riel@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Ted Tso <tytso@mit.edu>
Cc: Nitin Gupta <ngupta@vflare.org>

1cfd8bd0

24 5月, 2011 1 次提交

ocfs2: clean up mount option about atime in ocfs2.txt · e2b0c215

由 Tiger Yang 提交于 3月 02, 2011

As ocfs2 supports relatime and strictatime, we need update the
relative document. Atime_quantum need work with strictatime, so only
show it in procfs when mount with strictatime.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Signed-off-by: NJoel Becker <jlbec@evilplan.org>

e2b0c215

31 3月, 2011 1 次提交

Fix common misspellings · 25985edc

由 Lucas De Marchi 提交于 3月 30, 2011

Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>

25985edc

23 2月, 2011 1 次提交

ocfs2: Remove masklog ML_SUPER. · 32a42d39

由 Tao Ma 提交于 2月 23, 2011

Remove mlog(0) from fs/ocfs2/super.c and the masklog SUPER.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>

32a42d39

21 2月, 2011 1 次提交

ocfs2: Add ocfs2_trace.h. · 80a9a84d

由 Wengang Wang 提交于 2月 21, 2011

About one year ago, Wengang Wang tried some first steps
to add tracepoints to ocfs2. Hiss original patch is here:
http://oss.oracle.com/pipermail/ocfs2-devel/2009-November/005512.html

But as Steven Rostedt indicated in his article
http://lwn.net/Articles/383362/, we'd better have our trace
files resides in fs/ocfs2, so I rewrited the patch using the
method Steven mentioned in that article.
Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: NTao Ma <boyu.mt@taobao.com>

80a9a84d

07 3月, 2011 1 次提交

ocfs2: Remove EXIT from masklog. · c1e8d35e

由 Tao Ma 提交于 3月 07, 2011

mlog_exit is used to record the exit status of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.

This patch just try to remove it or change it. So:
1. if all the error paths already use mlog_errno, it is just removed.
   Otherwise, it will be replaced by mlog_errno.
2. if it is used to print some return value, it is replaced with
   mlog(0,...).
mlog_exit_ptr is changed to mlog(0.
All those mlog(0,...) will be replaced with trace events later.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>

c1e8d35e

21 2月, 2011 1 次提交

ocfs2: Remove ENTRY from masklog. · ef6b689b

由 Tao Ma 提交于 2月 21, 2011

ENTRY is used to record the entry of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.

So for mlog_entry_void, we just remove it.
for mlog_entry(...), we replace it with mlog(0,...), and they
will be replace by trace event later.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>

ef6b689b

20 2月, 2011 1 次提交

ocfs2: Check heartbeat mode for kernel stacks only · 52c303c5

由 Mark Fasheh 提交于 1月 31, 2011

Commit 2c442719 added some checks for proper
heartbeat mode when the o2cb stack is running.  Unfortunately, it didn't
take into account that a userpsace stack could be running. Fix this by only
doing the check if o2cb is in use. This patch allows userspace stacks to
mount the fs again.

Cc: stable@kernel.org
Signed-off-by: NMark Fasheh <mfasheh@suse.com>
Signed-off-by: NJoel Becker <jlbec@evilplan.org>

52c303c5

01 2月, 2011 1 次提交

ocfs2: use system_wq instead of ocfs2_quota_wq · 316873c9

由 Tejun Heo 提交于 2月 01, 2011

ocfs2_quota_wq is not depended upon during memory reclaim and, with
cmwq, there's no reason to use a dedicated workqueue.  Drop
ocfs2_quota_wq and use system_wq instead.  dqi_sync_work is already
sync canceled on quota disable and no further synchronization is
necessary.

This change makes ocfs2_quota_setup/shutdown() noops.  Both functions
removed.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>

316873c9

13 1月, 2011 2 次提交

A
switch ocfs2, close races · ba87167c
由 Al Viro 提交于 12月 18, 2010
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ba87167c

quota: Fix deadlock during path resolution · f00c9e44

由 Jan Kara 提交于 9月 15, 2010

As Al Viro pointed out path resolution during Q_QUOTAON calls to quotactl
is prone to deadlocks. We hold s_umount semaphore for reading during the
path resolution and resolution itself may need to acquire the semaphore
for writing when e. g. autofs mountpoint is passed.

Solve the problem by performing the resolution before we get hold of the
superblock (and thus s_umount semaphore). The whole thing is complicated
by the fact that some filesystems (OCFS2) ignore the path argument. So to
distinguish between filesystem which want the path and which do not we
introduce new .quota_on_meta callback which does not get the path. OCFS2
then uses this callback instead of old .quota_on.

CC: Al Viro <viro@ZenIV.linux.org.uk>
CC: Christoph Hellwig <hch@lst.de>
CC: Ted Ts'o <tytso@mit.edu>
CC: Joel Becker <joel.becker@oracle.com>
Signed-off-by: NJan Kara <jack@suse.cz>

f00c9e44

07 1月, 2011 1 次提交

fs: icache RCU free inodes · fa0d7e3d

由 Nick Piggin 提交于 1月 07, 2011

RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page->mapping.

The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: NNick Piggin <npiggin@kernel.dk>

fa0d7e3d

18 11月, 2010 1 次提交

BKL: remove extraneous #include <smp_lock.h> · 451a3c24

由 Arnd Bergmann 提交于 11月 17, 2010

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

451a3c24

29 10月, 2010 1 次提交

new helper: mount_bdev() · 152a0836

由 Al Viro 提交于 7月 25, 2010

... and switch of the obvious get_sb_bdev() users to ->mount()
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

152a0836

12 10月, 2010 2 次提交

ocfs2: Add a mount option "coherency=*" to handle cluster coherency for O_DIRECT writes. · 7bdb0d18

由 Tristan Ye 提交于 10月 11, 2010

Currently, the default behavior of O_DIRECT writes was allowing
concurrent writing among nodes to the same file, with no cluster
coherency guaranteed (no EX lock held).  This can leave stale data in
the cache for buffered reads on other nodes.

The new mount option introduce a chance to choose two different
behaviors for O_DIRECT writes:

    * coherency=full, as the default value, will disallow
                      concurrent O_DIRECT writes by taking
                      EX locks.

    * coherency=buffered, allow concurrent O_DIRECT writes
                          without EX lock among nodes, which
                          gains high performance at risk of
                          getting stale data on other nodes.
Signed-off-by: NTristan Ye <tristan.ye@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

7bdb0d18

Initialize max_slots early · 75d9bbc7

由 Goldwyn Rodrigues 提交于 10月 11, 2010

Functions such as ocfs2_recovery_init() make use of osb->max_slots.
Initialize osb->max_slots early so the functions may use the correct
value.
Signed-off-by: NGoldwyn Rodrigues <rgoldwyn@suse.de>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

75d9bbc7

08 10月, 2010 1 次提交

· 2c442719

由 Sunil Mushran 提交于 10月 07, 2010

ocfs2: Add support for heartbeat=global mount option

Adds support for heartbeat=global mount option. It ensures that the heartbeat
mode passed matches the one enabled on disk.
Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com>

2c442719

10 10月, 2010 1 次提交

· 98f486f2

由 Sunil Mushran 提交于 10月 09, 2010

ocfs2: Add an incompat feature flag OCFS2_FEATURE_INCOMPAT_CLUSTERINFO

OCFS2_FEATURE_INCOMPAT_CLUSTERINFO allows us to use sb->s_cluster_info for
both userspace and o2cb cluster stacks. It also allows us to extend cluster
info to include stack flags.

This patch also adds stackflags to sb->s_clusterinfo. It also introduces a
clusterinfo flag OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT to denote the enabled
global heartbeat mode.

This incompat flag can be set/cleared using tunefs.ocfs2 --fs-features. The
clusterinfo flag is set/cleared using tunefs.ocfs2 --update-cluster-stack.
Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com>

98f486f2

05 10月, 2010 2 次提交

BKL: Remove BKL from OCFS2 · 60056794

由 Arnd Bergmann 提交于 2月 24, 2010

The BKL in ocfs2/dlmfs is used in put_super, fill_super and remount_fs
that are all three protected by the superblocks s_umount rw_semaphore.

The use in ocfs2_control_open is evidently unrelated and the function
is protected by ocfs2_control_lock.

Therefore it is safe to remove the BKL entirely.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>

60056794

BKL: Explicitly add BKL around get_sb/fill_super · db719222

由 Jan Blunck 提交于 8月 15, 2010

This patch is a preparation necessary to remove the BKL from do_new_mount().
It explicitly adds calls to lock_kernel()/unlock_kernel() around
get_sb/fill_super operations for filesystems that still uses the BKL.

I've read through all the code formerly covered by the BKL inside
do_kern_mount() and have satisfied myself that it doesn't need the BKL
any more.

do_kern_mount() is already called without the BKL when mounting the rootfs
and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
afs_mntpt_follow_link(). Both later functions are actually the filesystems
follow_link inode operation. vfs_kern_mount() is calling the specified
get_sb function and lets the filesystem do its job by calling the given
fill_super function.

Therefore I think it is safe to push down the BKL from the VFS to the
low-level filesystems get_sb/fill_super operation.

[arnd: do not add the BKL to those file systems that already
       don't use it elsewhere]
Signed-off-by: NJan Blunck <jblunck@infradead.org>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Christoph Hellwig <hch@infradead.org>

db719222

10 9月, 2010 2 次提交

ocfs2: Cache system inodes of other slots. · b4d693fc

由 Tao Ma 提交于 8月 16, 2010

Durring orphan scan, if we are slot 0, and we are replaying
orphan_dir:0001, the general process is that for every file
in this dir:
1. we will iget orphan_dir:0001, since there is no inode for it.
   we will have to create an inode and read it from the disk.
2. do the normal work, such as delete_inode and remove it from
   the dir if it is allowed.
3. call iput orphan_dir:0001 when we are done. In this case,
   since we have no dcache for this inode, i_count will
   reach 0, and VFS will have to call clear_inode and in
   ocfs2_clear_inode we will checkpoint the inode which will let
   ocfs2_cmt and journald begin to work.
4. We loop back to 1 for the next file.

So you see, actually for every deleted file, we have to read the
orphan dir from the disk and checkpoint the journal. It is very
time consuming and cause a lot of journal checkpoint I/O.
A better solution is that we can have another reference for these
inodes in ocfs2_super. So if there is no other race among
nodes(which will let dlmglue to checkpoint the inode), for step 3,
clear_inode won't be called and for step 1, we may only need to
read the inode for the 1st time. This is a big win for us.

So this patch will try to cache system inodes of other slots so
that we will have one more reference for these inodes and avoid
the extra inode read and journal checkpoint.
Signed-off-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

b4d693fc

OCFS2: Allow huge (> 16 TiB) volumes to mount · 3bdb8efd

由 Patrick J. LoPresti 提交于 7月 22, 2010

The OCFS2 developers have already done all of the hard work to allow
volumes larger than 16 TiB.  But there is still a "sanity check" in
fs/ocfs2/super.c that prevents the mounting of such volumes, even when
the cluster size and journal options would allow it.

This patch replaces that sanity check with a more sophisticated one to
mount a huge volume provided that (a) it is addressable by the raw
word/address size of the system (borrowing a test from ext4); (b) the
volume is using JBD2; and (c) the JBD2_FEATURE_INCOMPAT_64BIT flag is
set on the journal.

I factored out the sanity check into its own function.  I also moved it
from ocfs2_initialize_super() down to ocfs2_check_volume(); any earlier,
and the journal will not have been initialized yet.

This patch is one of a pair, and it depends on the other ("JBD2: Allow
feature checks before journal recovery").

I have tested this patch on small volumes, huge volumes, and huge
volumes without 64-bit block support in the journal.  All of them appear
to work or to fail gracefully, as appropriate.
Signed-off-by: NPatrick LoPresti <lopresti@gmail.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

3bdb8efd

10 8月, 2010 1 次提交
- A
  convert ocfs2 to ->evict_inode() · 066d92dc
  由 Al Viro 提交于 6月 08, 2010
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  066d92dc
17 6月, 2010 1 次提交

fix typos concerning "initiali[zs]e" · 421f91d2

由 Uwe Kleine-König 提交于 6月 11, 2010

Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

421f91d2

24 5月, 2010 4 次提交

quota: rename default quotactl methods to dquot_ · 287a8095

由 Christoph Hellwig 提交于 5月 19, 2010

Follow the dquot_* style used elsewhere in dquot.c.

[Jan Kara: Fixed up missing conversion of ext2]
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

287a8095

quota: drop remount argument to ->quota_on and ->quota_off · 307ae18a

由 Christoph Hellwig 提交于 5月 19, 2010

Remount handling has fully moved into the filesystem, so all this is
superflous now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

307ae18a

quota: kill the vfs_dq_off and vfs_dq_quota_on_remount wrappers · 0f0dd62f

由 Christoph Hellwig 提交于 5月 19, 2010

Instead of having wrappers in the VFS namespace export the dquot_suspend
and dquot_resume helpers directly.  Also rename vfs_quota_disable to
dquot_disable while we're at it.

[Jan Kara: Moved dquot_suspend to quotaops.h and made it inline]
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

0f0dd62f

ocfs2: Fix use after free on remount read-only · eea7feb0

由 Jan Kara 提交于 5月 13, 2010

We also have to cancel quota syncing thread on remount read only because
at that moment quota is being turned off. Otherwise quota syncing thread
will try to access already freed quota structures.
Signed-off-by: NJan Kara <jack@suse.cz>

eea7feb0

22 5月, 2010 1 次提交

ocfs2: Fix lock inversion in quotas during umount · c06bcbfa

由 Jan Kara 提交于 5月 13, 2010

We cannot cancel delayed work from ocfs2_local_free_info because that is called
with dqonoff_mutex held and the work it cancels requires dqonoff_mutex to
finish. Cancel the work before acquiring dqonoff_mutex.
Acked-by: NJoel Becker <Joel.Becker@oracle.com>
Signed-off-by: NJan Kara <jack@suse.cz>

c06bcbfa

11 5月, 2010 1 次提交

ocfs2: Wrap signal blocking in void functions. · e4b963f1

由 Joel Becker 提交于 9月 02, 2009

ocfs2 sometimes needs to block signals around dlm operations, but it
currently does it with sigprocmask().  Even worse, it's checking the
error code of sigprocmask().  The in-kernel sigprocmask() can only error
if you get the SIG_* argument wrong.  We don't.

Wrap the sigprocmask() calls with ocfs2_[un]block_signals().  These
functions are void, but they will BUG() if somehow sigprocmask() returns
an error.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

e4b963f1

06 5月, 2010 6 次提交

ocfs2: Make nointr a default mount option · 4b37fcb7

由 Sunil Mushran 提交于 4月 13, 2010

OCFS2 has never really supported intr. This patch acknowledges this reality
and makes nointr the default mount option. In a later patch, we intend to
support intr.
Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

4b37fcb7

ocfs2: Add dir_resv_level mount option · 83f92318

由 Mark Fasheh 提交于 4月 05, 2010

The default behavior for directory reservations stays the same, but we add a
mount option so people can tweak the size of directory reservations
according to their workloads.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

83f92318

ocfs2: increase the default size of local alloc windows · 6b82021b

由 Mark Fasheh 提交于 4月 05, 2010

I have observed that the current size of 8M gives us pretty poor
fragmentation on multi-threaded workloads which do lots of writes.

Generally, I can increase the size of local alloc windows and observe a
marked decrease in fragmentation, even up and beyond window sizes of 512
megabytes. This makes sense for a couple reasons - larger local alloc means
more room for reservation windows. On multi-node workloads the larger local
alloc helps as well because we don't have to do window slides as often.

Also, I removed the OCFS2_DEFAULT_LOCAL_ALLOC_SIZE constant as it is no
longer used and the comment above it was out of date.

To test fragmentation, I used a workload which launched 4 threads that did
4k writes into a series of about 140 alternating files.

With resv_level=2, and a 4k/4k file system I observed the following average
fragmentation for various localalloc= parameters:

localalloc=	avg. fragmentation
	8		48
	32		16
	64		10
	120		7

On larger cluster sizes, the difference is more dramatic.

The new default size top out at 256M, which we'll only get for cluster
sizes of 32K and above.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

6b82021b

ocfs2: clean up localalloc mount option size parsing · 73c8a800

由 Mark Fasheh 提交于 4月 05, 2010

This patch pulls the local alloc sizing code into localalloc.c and provides
a callout to it from ocfs2_fill_super(). Behavior is essentially unchanged
except that I correctly calculate the maximum local alloc size. The old code
in ocfs2_parse_options() calculated the max size as:

ocfs2_local_alloc_size(sb) * 8

which is correct, in bits. Unfortunately though the option passed in is in
megabytes. Ultimately, this bug made no real difference - the shrink code
would catch a too-large size and bring it down to something reasonable.
Still, it's less than efficient as-is.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

73c8a800

ocfs2: use allocation reservations during file write · 4fe370af

由 Mark Fasheh 提交于 12月 07, 2009

Add a per-inode reservations structure and pass it through to the
reservations code.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

4fe370af

ocfs2: allocation reservations · d02f00cc

由 Mark Fasheh 提交于 12月 07, 2009

This patch improves Ocfs2 allocation policy by allowing an inode to
reserve a portion of the local alloc bitmap for itself. The reserved
portion (allocation window) is advisory in that other allocation
windows might steal it if the local alloc bitmap becomes
full. Otherwise, the reservations are honored and guaranteed to be
free. When the local alloc window is moved to a different portion of
the bitmap, existing reservations are discarded.

Reservation windows are represented internally by a red-black
tree. Within that tree, each node represents the reservation window of
one inode. An LRU of active reservations is also maintained. When new
data is written, we allocate it from the inodes window. When all bits
in a window are exhausted, we allocate a new one as close to the
previous one as possible. Should we not find free space, an existing
reservation is pulled off the LRU and cannibalized.
Signed-off-by: NMark Fasheh <mfasheh@suse.com>

d02f00cc

13 4月, 2010 2 次提交

ocfs2: ocfs2_group_bitmap_size has to handle old volume. · 8571882c

由 Tao Ma 提交于 4月 13, 2010

ocfs2_group_bitmap_size has to handle the case when the
volume don't have discontiguous block group support. So
pass the feature_incompat in and check it.
Signed-off-by: NTao Ma <tao.ma@oracle.com>

8571882c

ocfs2: Define data structures for discontiguous block groups. · 4cbe4249

由 Joel Becker 提交于 4月 13, 2010

Defines the OCFS2_FEATURE_INCOMPAT_DISCONTIG_BG feature bit and modifies
struct ocfs2_group_desc for the feature.
Signed-off-by: NJoel Becker <joel.becker@oracle.com>
Signed-off-by: NTao Ma <tao.ma@oracle.com>

4cbe4249

27 2月, 2010 1 次提交

ocfs2: add extent block stealing for ocfs2 v5 · b89c5428

由 Tiger Yang 提交于 1月 25, 2010

This patch add extent block (metadata) stealing mechanism for
extent allocation. This mechanism is same as the inode stealing.
if no room in slot specific extent_alloc, we will try to
allocate extent block from the next slot.
Signed-off-by: NTiger Yang <tiger.yang@oracle.com>
Acked-by: NTao Ma <tao.ma@oracle.com>
Signed-off-by: NJoel Becker <joel.becker@oracle.com>

b89c5428

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功