Commits · df152c241df9e9d2b9a65d37bd02961abe7f591a · openanolis / cloud-kernel

23 Jun, 2009 1 commit

ocfs2: Disable orphan scanning for local and hard-ro mounts · df152c24

Authored by Sunil Mushran on Jun 22, 2009

Local and Hard-RO mounts do not need orphan scanning.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

04 Jun, 2009 1 commit

ocfs2: timer to queue scan of all orphan slots · 83273932

Authored by Srinivas Eeda on Jun 03, 2009

When a dentry is unlinked, the unlinking node takes an EX on the dentry lock
before moving the dentry to the orphan directory. Other nodes that have
this dentry in cache have a PR on the same dentry lock. When the EX is
requested, the other nodes flag the corresponding inode as MAYBE_ORPHANED
during downconvert. The inode is finally deleted when the last node to iput
the inode sees that i_nlink==0 and the MAYBE_ORPHANED flag is set.

A problem arises if a node is forced to free dentry locks because of memory
pressure. If this happens, the node will no longer get downconvert
notifications for the dentries that have been unlinked on another node.
If that node is also actively using the corresponding inode and happens to
be the one performing the last iput on it, it will fail to delete the inode
because the MAYBE_ORPHANED flag is not set.

This patch fixes this shortcoming by introducing a periodic scan of the
orphan directories to delete such inodes. Care has been taken to distribute
the workload across the cluster so that no one node has to perform the task
all the time.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
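
The failure mode and the fix described in this commit can be illustrated with a small model. This is a hedged Python sketch, not kernel code: the `Inode` class, `final_iput()` and `scan_orphan_dir()` are hypothetical names invented for illustration, and only `i_nlink` and the MAYBE_ORPHANED flag come from the commit message. The sketch does not model how the scan workload is distributed across nodes.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    """Toy stand-in for an inode as seen by one node (hypothetical model)."""
    ino: int
    i_nlink: int
    maybe_orphaned: bool = False  # normally set during dentry lock downconvert
    deleted: bool = False


def final_iput(inode: Inode) -> bool:
    """Last iput: delete only if the inode is unlinked AND flagged."""
    if inode.i_nlink == 0 and inode.maybe_orphaned:
        inode.deleted = True
    return inode.deleted


def scan_orphan_dir(orphan_dir: list, in_use: set) -> list:
    """Periodic scan: reclaim unlinked inodes that no node is using any more."""
    reclaimed = []
    for inode in orphan_dir:
        if not inode.deleted and inode.i_nlink == 0 and inode.ino not in in_use:
            inode.deleted = True
            reclaimed.append(inode.ino)
    return reclaimed


# A node that freed its dentry lock under memory pressure never saw the
# downconvert, so the flag is missing and the last iput leaves the inode behind.
missed = Inode(ino=42, i_nlink=0, maybe_orphaned=False)
leaked = not final_iput(missed)

# The periodic scan later finds the inode in the orphan directory and deletes it.
reclaimed = scan_orphan_dir([missed], in_use=set())
```

In this model the inode leaks at the last iput and is only reclaimed once the scan runs, which is the gap the timer-driven scan is meant to close.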

04 Apr, 2009 1 commit

ocfs2: fix rare stale inode errors when exporting via nfs · 6ca497a8

Authored by wengang wang on Mar 06, 2009

For NFS exporting, ocfs2_get_dentry() returns the dentry for a file handle.
When the inode is not in memory, ocfs2_get_dentry() may read it from disk
without holding any cross-cluster lock. This can lead to the file system
loading a stale inode.

This patch fixes that problem.

When the inode is not in memory, we take the cluster lock (PR) on the alloc
inode from which the inode in question was allocated before reading out the
inode itself. (This causes the node on which the deletion was done to sync
the alloc inode.) We then check the bitmap of the group the inode was
allocated from to see whether its bit is clear. If the bit is clear, the
inode is stale. If the bit is set, we check the generation as the existing
code does.

We have to read the inode in question from disk first to learn its alloc
slot and alloc bit. If it is not stale, we read it again using ocfs2_iget().
The second read should then be served from cache.

We also have to add a per-superblock nfs_sync_lock to cover both the lock on
the alloc inode and the lock on the inode in question, because
ocfs2_get_dentry() and ocfs2_delete_inode() take those locks in reverse
order. nfs_sync_lock is taken in EX mode in ocfs2_get_dentry() and in PR mode
in ocfs2_delete_inode(), so that multiple ocfs2_delete_inode() calls can run
concurrently in the normal case.

[mfasheh@suse.com: build warning fixes and comment cleanups]
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
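
The order of the two staleness checks in this commit (allocation bitmap first, generation second) can be sketched as a pure function. This is an illustrative Python model under stated assumptions: `is_stale_handle()` and its parameters are hypothetical names, and the cluster locking (the PR lock on the alloc inode and nfs_sync_lock) is deliberately left out.

```python
def is_stale_handle(alloc_bit_set: bool, disk_generation: int,
                    fh_generation: int) -> bool:
    """Return True if an NFS file handle refers to a stale inode.

    Hypothetical helper mirroring the check order in the commit message.
    """
    if not alloc_bit_set:
        # The bit is clear in the group bitmap: the inode has been freed.
        return True
    # The bit is set, but the block may have been reused for a new inode,
    # so fall back to the existing generation comparison.
    return disk_generation != fh_generation


freed = is_stale_handle(alloc_bit_set=False, disk_generation=5, fh_generation=5)
reused = is_stale_handle(alloc_bit_set=True, disk_generation=6, fh_generation=5)
live = is_stale_handle(alloc_bit_set=True, disk_generation=5, fh_generation=5)
```

Here `freed` and `reused` are both stale for different reasons, while `live` passes both checks.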

06 Jan, 2009 1 commit

ocfs2: Implementation of local and global quota file handling · 9e33d69f

Authored by Jan Kara on Aug 25, 2008

For each quota type, each node has a local quota file. In this file it stores
changes users have made to disk usage via this node. Once in a while this
information is synced to the global file (and thus with other nodes) so that
limit enforcement works at least approximately.

Global quota files contain all the information about usage and limits. It's
mostly handled by the generic VFS code (which implements a trie of structures
inside a quota file). We only have to provide functions to convert structures
from on-disk format to in-memory one. We also have to provide wrappers for
various quota functions starting transactions and acquiring necessary cluster
locks before the actual IO is really started.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
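
Why enforcement is only approximate can be seen in a toy model of the local/global split. This Python sketch is an assumption-laden illustration, not the ocfs2 on-disk format or API: `GlobalQuota`, `LocalQuota`, `charge()`, `would_exceed()` and `sync()` are all hypothetical names.

```python
class GlobalQuota:
    """Hypothetical global record: total usage and the limit, shared by all nodes."""
    def __init__(self, limit: int):
        self.usage = 0
        self.limit = limit


class LocalQuota:
    """Hypothetical per-node record: usage changes not yet synced."""
    def __init__(self, global_quota: GlobalQuota):
        self.global_quota = global_quota
        self.delta = 0

    def would_exceed(self, amount: int) -> bool:
        # A node only sees the last synced global usage plus its own changes.
        seen = self.global_quota.usage + self.delta
        return seen + amount > self.global_quota.limit

    def charge(self, amount: int) -> None:
        self.delta += amount

    def sync(self) -> None:
        # Fold the local changes into the global record and start over.
        self.global_quota.usage += self.delta
        self.delta = 0


shared = GlobalQuota(limit=100)
node_a, node_b = LocalQuota(shared), LocalQuota(shared)

# Neither node sees the other's unsynced usage, so both charges are allowed.
allowed_a = not node_a.would_exceed(60)
node_a.charge(60)
allowed_b = not node_b.would_exceed(60)
node_b.charge(60)

node_a.sync()
node_b.sync()
over_limit = shared.usage > shared.limit
```

Both charges pass their local check, yet the combined usage exceeds the limit after syncing. Shortening the sync interval narrows that window at the cost of more cluster traffic.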

18 Apr, 2008 4 commits

ocfs2: Break out stackglue into modules. · 286eaa95

Authored by Joel Becker on Feb 01, 2008

We define the ocfs2_stack_plugin structure to represent a stack driver.
The o2cb stack code is split into stack_o2cb.c.  This becomes the
ocfs2_stack_o2cb.ko module.

The stackglue generic functions are similarly split into the
ocfs2_stackglue.ko module.  This module now provides an interface to
register drivers.  The ocfs2_stack_o2cb driver registers itself.  As
part of this interface, ocfs2_stackglue can load drivers on demand.
This is accomplished in ocfs2_cluster_connect().

ocfs2_cluster_disconnect() is now notified when a _hangup() is pending.
If a hangup is pending, it will not release the driver module and will
let _hangup() do that.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
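
The register-and-load-on-demand pattern described in this commit can be sketched as a small registry. This is a hedged Python illustration only: `StackPlugin`, `StackGlue`, `cluster_connect()` and `load_driver()` are hypothetical stand-ins, not the kernel's structures or functions, and the hangup handling is not modelled.

```python
class StackPlugin:
    """Hypothetical stand-in for a stack driver such as the o2cb one."""
    def __init__(self, name: str):
        self.name = name

    def connect(self) -> str:
        return f"connected via {self.name}"


class StackGlue:
    """Hypothetical registry playing the role of the generic glue module."""
    def __init__(self, loader):
        self._plugins = {}
        self._loader = loader  # stand-in for on-demand module loading

    def register(self, plugin: StackPlugin) -> None:
        self._plugins[plugin.name] = plugin

    def cluster_connect(self, stack_name: str) -> str:
        if stack_name not in self._plugins:
            # Driver not registered yet: try to load it on demand.
            self._loader(self, stack_name)
        plugin = self._plugins.get(stack_name)
        if plugin is None:
            raise LookupError(f"no stack driver named {stack_name!r}")
        return plugin.connect()


def load_driver(glue: StackGlue, name: str) -> None:
    """Pretend loader: the 'module' registers itself when it is loaded."""
    if name == "o2cb":
        glue.register(StackPlugin("o2cb"))


glue = StackGlue(load_driver)
result = glue.cluster_connect("o2cb")
```

The caller never names a driver module directly. It asks the glue layer for a stack by name, and the driver appears in the registry as a side effect of being loaded.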

ocfs2: Clean up stackglue initialization · 63e0c48a

Authored by Joel Becker on Jan 30, 2008

The stack glue initialization function needs a better name so that it can be
used cleanly when stackglue becomes a module.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Introduce the new ocfs2_cluster_connect/disconnect() API. · 4670c46d

Authored by Joel Becker on Feb 01, 2008

This step introduces a cluster stack agnostic API for initializing and
exiting. fs/ocfs2/dlmglue.c no longer uses o2cb/o2dlm knowledge to
connect to the stack. It is all handled in stackglue.c.

heartbeat.c no longer needs to know how it gets called.
ocfs2_do_node_down() is now a clean recovery trigger.

The big gotcha is the ordering of initializations and de-initializations done
underneath ocfs2_cluster_connect(). ocfs2_dlm_init() used to do all
o2dlm initialization in one block. Thus, the o2dlm functionality of
ocfs2_cluster_connect() is very straightforward. ocfs2_dlm_shutdown(),
however, did a few things between de-registration of the eviction
callback and actually shutting down the domain. Now de-registration and
shutdown of the domain are wrapped within the single
ocfs2_cluster_disconnect() call. I've checked the code paths to make
sure we can safely tear down things in ocfs2_dlm_shutdown() before
calling ocfs2_cluster_disconnect(). The filesystem has already set
itself to ignore the callback.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Separate out dlm lock functions. · 24ef1815

Authored by Joel Becker on Jan 29, 2008

This is the first in a series of patches to isolate ocfs2 from the
underlying cluster stack. Here we wrap the dlm locking functions with
ocfs2-specific calls. Because ocfs2 always uses the same dlm lock status
callbacks, we can eliminate the callbacks from the filesystem visible
functions.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

04 Mar, 2008 1 commit

[2.6 patch] fs/ocfs2/: possible cleanups · 00600056

Authored by Adrian Bunk on Jan 29, 2008

This patch contains the following cleanups that are now possible:
- make the following needlessly global functions static:
  - dlmglue.c:ocfs2_process_blocked_lock()
  - heartbeat.c:ocfs2_node_map_init()
- #if 0 the following unused global function plus support functions:
  - heartbeat.c:ocfs2_node_map_is_only()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

07 Feb, 2008 1 commit

ocfs2: Negotiate locking protocol versions. · d24fbcda

Authored by Joel Becker on Jan 25, 2008

Currently, when ocfs2 nodes connect via TCP, they advertise their
compatibility level. If the versions do not match, two nodes cannot speak
to each other and they disconnect. As a result, this provides no forward or
backwards compatibility.

This patch implements a simple protocol negotiation at the dlm level by
introducing a major/minor version number scheme for entities that
communicate. Specifically, o2dlm has a major/minor version for interaction
with o2dlm on other nodes, and ocfs2 itself has a major/minor version for
interacting with the filesystem on other nodes.

This will allow rolling upgrades of ocfs2 clusters when changes to the
locking or network protocols can be done in a backwards compatible manner.
In those cases, only the minor number is changed and the negotiated protocol
minor is returned from dlm join. In the far less likely event that a
required protocol change makes backwards compatibility impossible, we simply
bump the major number.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
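
The major/minor scheme in this commit can be sketched as a small negotiation function. This is a minimal Python illustration under stated assumptions: `ProtocolVersion` and `negotiate()` are hypothetical names, and settling on the lower of the two minor numbers is an assumption about what "backwards compatible" implies, not something the commit message spells out.

```python
from typing import NamedTuple, Optional


class ProtocolVersion(NamedTuple):
    major: int
    minor: int


def negotiate(local: ProtocolVersion,
              remote: ProtocolVersion) -> Optional[ProtocolVersion]:
    """Return the version both sides can speak, or None if incompatible."""
    if local.major != remote.major:
        # A major bump signals a change that cannot be made compatible.
        return None
    # Assumption: minor revisions are backwards compatible, so both sides
    # can fall back to the older (lower) minor.
    return ProtocolVersion(local.major, min(local.minor, remote.minor))


rolling_upgrade = negotiate(ProtocolVersion(1, 2), ProtocolVersion(1, 0))
incompatible = negotiate(ProtocolVersion(2, 0), ProtocolVersion(1, 5))
```

In the first call a newer node joins an older cluster and both settle on 1.0, which is what makes a rolling upgrade possible. In the second call the majors differ and the nodes cannot communicate.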

26 Jan, 2008 4 commits

[PATCH 1/2] ocfs2: add flock lock type · cf8e06f1

Authored by Mark Fasheh on Dec 20, 2007

This adds a new dlmglue lock type which is intended to back flock()
requests.

Since these locks are driven from userspace, usage rules are much more
liberal than the typical Ocfs2 internal cluster lock. As a result, we can't
make use of most dlmglue features - lock caching and lock level
optimizations in particular. Additionally, userspace is free to deadlock
itself, so we have to deal with that in the same way as the rest of the
kernel - by allowing a signal to abort a lock request.

In order to keep ocfs2_cluster_lock() complexity down, ocfs2_file_lock()
does its own dlm coordination. We still use the same helper functions
though, so duplicated code is kept to a minimum.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Rename ocfs2_meta_[un]lock · e63aecb6

Authored by Mark Fasheh on Oct 18, 2007

Call this the "inode_lock" now, since it covers both data and meta data.
This patch makes no functional changes.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Remove data locks · c934a92d

Authored by Mark Fasheh on Oct 18, 2007

The meta lock now covers both meta data and data, so this just removes the
now-redundant data lock.

Combining the locks saves us a round of lock mastery per inode and leaves one
less lock to ping between nodes during read/write.

We don't lose much - since meta locks were always held before a data lock
(and at the same level) ordered writeout mode (the default) ensured that
flushing for the meta data lock also pushed out data anyways.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Remove mount/unmount votes · 34d024f8

Authored by Mark Fasheh on Sep 24, 2007

The node maps that are set/unset by these votes are no longer relevant, thus
we can remove the mount and umount votes. Since those are the last two
remaining votes, we can also remove the entire vote infrastructure.

The vote thread has been renamed to the downconvert thread, and the small
amount of functionality related to managing it has been moved into
fs/ocfs2/dlmglue.c. All references to votes have been removed or updated.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

13 Oct, 2007 1 commit

ocfs2: Structure updates for inline data · 15b1e36b

Authored by Mark Fasheh on Sep 07, 2007

Add the disk, network and memory structures needed to support data in inode.

Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing
inline data.

A new inode field, i_dyn_features, is added to facilitate tracking of
dynamic inode state. Since it will be used often, we want to mirror it on
ocfs2_inode_info, and transfer it via the meta data lvb.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

03 May, 2007 1 commit

[PATCH] fs/ocfs2/: make 3 functions static · 6cb129f5

Authored by Adrian Bunk on Apr 26, 2007

This patch makes the following needlessly global functions static:
- aops.c: ocfs2_write_data_page()
- dlmglue.c: ocfs2_dump_meta_lvb_info()
- file.c: ocfs2_set_inode_size()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

27 Apr, 2007 1 commit

ocfs2: Remove delete inode vote · 50008630

Authored by Tiger Yang on Mar 20, 2007

Ocfs2 currently does cluster-wide node messaging to check the open state of
an inode during delete. This patch removes that mechanism in favor of an
inode cluster lock which is taken at shared read when an inode is first read
and dropped in clear_inode(). This allows a deleting node to test the
liveness of an inode by attempting to take an exclusive lock.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
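
The liveness test described in this commit (shared lock held while the inode is in memory, exclusive attempt by the deleting node) can be modelled with a toy lock. This Python sketch is illustrative only: `OpenLock` and its methods are hypothetical names, and real cluster locks involve messaging and lock levels that are not modelled here.

```python
class OpenLock:
    """Toy cluster lock: many shared holders, or one exclusive holder."""
    def __init__(self):
        self.shared_holders = set()

    def take_shared(self, node: str) -> None:
        # Taken when a node first reads the inode.
        self.shared_holders.add(node)

    def drop(self, node: str) -> None:
        # Dropped when the node clears the inode from memory.
        self.shared_holders.discard(node)

    def try_exclusive(self, node: str) -> bool:
        # Succeeds only if no OTHER node still holds the lock.
        return not (self.shared_holders - {node})


lock = OpenLock()
lock.take_shared("node-a")
lock.take_shared("node-b")

# node-a wants to delete, but node-b still has the inode in memory.
still_open = not lock.try_exclusive("node-a")

# Once node-b drops its lock, node-a's exclusive attempt succeeds.
lock.drop("node-b")
safe_to_delete = lock.try_exclusive("node-a")
```

The deleting node learns whether the inode is open elsewhere from the outcome of a single lock attempt, instead of sending a vote to every node in the cluster.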

02 Dec, 2006 3 commits

ocfs2: core atime update functions · 7f1a37e3

Authored by Tiger Yang on Nov 15, 2006

This patch adds the core routines for updating atime in ocfs2.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: remove unused handle argument from ocfs2_meta_lock_full() · 4bcec184

Authored by Mark Fasheh on Oct 09, 2006

Now that this is unused and all callers pass NULL, we can safely remove it.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[2.6 patch] make ocfs2_create_new_lock() static · da66116e

Authored by Adrian Bunk on Nov 20, 2006

This patch makes the needlessly global ocfs2_create_new_lock() static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

25 Sep, 2006 4 commits

ocfs2: Remove i_generation from inode lock names · 24c19ef4

Authored by Mark Fasheh on Sep 22, 2006

OCFS2 puts inode meta data in the "lock value block" provided by the DLM.
Typically, i_generation is encoded in the lock name so that a deleted inode
and a new one in the same block don't share the same lvb.

Unfortunately, that scheme means that the read in ocfs2_read_locked_inode()
is potentially thrown away as soon as the meta data lock is taken - we
cannot encode the lock name without first knowing i_generation, which
requires a disk read.

This patch encodes i_generation in the inode meta data lvb, and removes the
value from the inode meta data lock name. This way, the read can be covered
by a lock, and at the same time we can distinguish between an up to date and
a stale LVB.

This will help cold-cache stat(2) performance in particular.

Since this patch changes the protocol version, we take the opportunity to do
a minor re-organization of two of the LVB fields.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
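
How a generation stored in the LVB lets a node tell a current LVB from a stale one can be sketched as follows. This is a hedged Python illustration: the two-field layout, the `LVB_VERSION` value, and the helper names `pack_lvb()` and `lvb_is_trustable()` are assumptions made for the example and do not reproduce the real ocfs2 LVB structure.

```python
import struct

LVB_VERSION = 5          # hypothetical version number for this sketch
LVB_FORMAT = ">BxxxI"    # assumed layout: 8-bit version, padding, 32-bit generation


def pack_lvb(version: int, generation: int) -> bytes:
    """Build a toy lock value block carrying a version and i_generation."""
    return struct.pack(LVB_FORMAT, version, generation)


def lvb_is_trustable(lvb: bytes, inode_generation: int) -> bool:
    """Trust the LVB only if it matches our version AND our inode's generation."""
    version, generation = struct.unpack(LVB_FORMAT, lvb)
    return version == LVB_VERSION and generation == inode_generation


current = pack_lvb(LVB_VERSION, generation=1001)

# Same block, same lock name, but the inode was deleted and recreated:
# the new inode has a different generation, so the old LVB is rejected.
fresh_hit = lvb_is_trustable(current, inode_generation=1001)
stale_hit = lvb_is_trustable(current, inode_generation=1002)
```

Because the generation travels inside the LVB, the lock name no longer depends on it, and the lock can be taken before the inode is read from disk.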

ocfs2: Encode i_generation in the meta data lvb · f9e2d82e

Authored by Mark Fasheh on Sep 12, 2006

When i_generation is removed from the lockname, this will help us determine
whether a meta data lvb has information that is in sync with the local
struct inode.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Free up some space in the lvb · 4d3b83f7

Authored by Mark Fasheh on Sep 12, 2006

lvb_version doesn't need to be a whole 32 bits. Make it an 8 bit field to
free up some space. This should be backwards compatible until we use one of
the fields, in which case we'd bump the lvb version anyway.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Add new cluster lock type · d680efe9

Authored by Mark Fasheh on Sep 08, 2006

Replace the dentry vote mechanism with a cluster lock which covers a set
of dentries. This allows us to force d_delete() only on nodes which actually
care about an unlink.

Every node that does a ->lookup() gets a read-only lock on the dentry. During
an unlink, the unlinking node requests an exclusive lock, forcing the other
nodes that care about that dentry to d_delete() it. The
effect is that we retain a very lightweight ->d_revalidate(), and at the
same time get to make large improvements to the average case performance of
the ocfs2 unlink and rename operations.

This patch adds the cluster lock type which OCFS2 can attach to
dentries. A small number of fs/ocfs2/dcache.c functions are stubbed
out so that this change can compile.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

21 Sep, 2006 1 commit

ocfs2: add ext2 attributes · ca4d147e

Authored by Herbert Poetzl on Jul 03, 2006

Support immutable, and other attributes.

Some renaming and other minor fixes done by myself.
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

04 Jan, 2006 1 commit

[PATCH] OCFS2: The Second Oracle Cluster Filesystem · ccd979bd

Authored by Mark Fasheh on Dec 15, 2005

The OCFS2 file system module.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
