Commits · 007dce53a29ccffc000ab5373d188f73881390fd · openeuler / Kernel

18 Apr, 2008 · 40 commits

ocfs2/dlm: Dump the dlm state in a debugfs file · 007dce53

Authored by Sunil Mushran on Mar 10, 2008

This patch dumps the dlm state (dlm_ctxt) into a debugfs file.
Useful for debugging.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2/dlm: Create debugfs dirs · 6325b4a2

Authored by Sunil Mushran on Mar 10, 2008

This patch creates the debugfs directories that will hold the
files to be used to dump the dlm state.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2/dlm: Link all lockres' to a tracking list · 29576f8b

Authored by Sunil Mushran on Mar 10, 2008

This patch links all the lockres' to a tracking list in dlm_ctxt.
We will use this in an upcoming patch that will walk the entire
list and dump the lockres states to a debugfs file.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2/dlm: Create slabcaches for lock and lockres · 724bdca9

Authored by Sunil Mushran on Mar 10, 2008

This patch makes o2dlm allocate memory for lockres, lockname and lock
structures from slabcaches rather than kmalloc. This not only makes these
allocations more efficient but also lets us track the memory consumed by
these structures.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2/dlm: Rename slabcache dlm_mle_cache to o2dlm_mle · 12eb0035

Authored by Sunil Mushran on Mar 10, 2008

This patch renames dlm_mle_slabcache to prevent namespace clashes with fs/dlm.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Document /sys/fs/ocfs2 · 53f67e33

Authored by Joel Becker on Mar 31, 2008

Add ABI documentation for these files:

	/sys/fs/ocfs2/max_locking_protocol
	/sys/fs/ocfs2/loaded_cluster_plugins
	/sys/fs/ocfs2/active_cluster_plugin
	/sys/fs/ocfs2/cluster_stack
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Allow selection of cluster plug-ins. · 9341d229

Authored by Joel Becker on Mar 04, 2008

ocfs2 now supports plug-ins for the classic O2CB stack as well as
userspace cluster stacks in conjunction with fs/dlm.  This allows zero,
one, or both of the plug-ins to be selected in Kconfig.  For local mounts
(non-clustered), neither plug-in is needed.  Both plug-ins can be loaded
at one time; the runtime will select the one needed for the cluster
system in use.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add kbuild for ocfs2_stack_user.ko · b92eccdd

Authored by Joel Becker on Nov 28, 2007

Add ocfs2_stack_user.ko to the Makefile so that it builds.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Change mlog_bug_on to BUG_ON in ocfs2_lockid.h · 8f318311

Authored by Joel Becker on Mar 04, 2008

The masklog code is in the o2cb stack, but ocfs2_lockid.h now needs to
be included by the user stack.  The BUG() in ocfs2_lock_type_string()
does not need masklog support, so change it to a regular BUG_ON().
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: add fsdlm to stackglue · cf4d8d75

Authored by David Teigland on Feb 20, 2008

Add code to use fs/dlm.

[ Modified to be part of the stack_user module -- Joel ]
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add the 'set version' message to the ocfs2_control device. · d4b95eef

Authored by Joel Becker on Feb 20, 2008

The "SETV" message sets the filesystem locking protocol version as
negotiated by the client.  The client negotiates based on the maximum
version advertised in /sys/fs/ocfs2/max_locking_protocol.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add the local node id to the handshake. · 3cfd4ab6

Authored by Joel Becker on Feb 20, 2008

This is the second part of the ocfs2_control handshake.  After
negotiating the ocfs2_control protocol, the daemon tells the filesystem
what the local node id is via the SETN message.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
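
The first two handshake steps (protocol selection, then the local node id via SETN) can be pictured as a small state machine. The Python sketch below is an illustrative model only, not the kernel code: the class and method names and the protocol string "T01" are hypothetical, the "set version" step is left out, and treating the handshake as complete once the node id is known is a simplifying assumption.

```python
class ControlHandshake:
    """Toy model of the first two ocfs2_control handshake steps."""

    SUPPORTED_PROTOCOLS = ("T01",)  # hypothetical protocol name

    def __init__(self):
        self.protocol = None
        self.local_node = None

    def read_protocols(self):
        # Step 1a: the daemon reads every protocol the filesystem supports.
        return list(self.SUPPORTED_PROTOCOLS)

    def write_protocol(self, name):
        # Step 1b: the daemon writes back the protocol it will use.
        if name not in self.SUPPORTED_PROTOCOLS:
            raise ValueError("unsupported protocol: %r" % name)
        self.protocol = name

    def set_node(self, node_id):
        # Step 2: the daemon reports the local node id (the SETN message).
        if self.protocol is None:
            raise RuntimeError("protocol must be negotiated first")
        self.local_node = node_id

    def mounts_allowed(self):
        # Assumption: both modeled steps done means the handshake is complete.
        return self.protocol is not None and self.local_node is not None
```

In this model a daemon would call `read_protocols()`, pass one of the returned names to `write_protocol()`, and then call `set_node()`; `mounts_allowed()` stays false until both steps have happened.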

ocfs2: Introduce the DOWN message to ocfs2_control · de870ef0

Authored by Joel Becker on Feb 18, 2008

When the control daemon sees a node go down, it sends a DOWN message
through the ocfs2_control device.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Start the ocfs2_control handshake. · 462c7e6a

Authored by Joel Becker on Feb 18, 2008

When a control daemon opens the ocfs2_control device, it must perform a
handshake to tell the filesystem it is something capable of monitoring
cluster status.  Only after the handshake is complete will the filesystem
allow mounts.

This is the first part of the handshake.  The daemon reads all supported
ocfs2_control protocols, then writes in the protocol it will use.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add the ocfs2_control misc device. · 6427a727

Authored by Joel Becker on Feb 18, 2008

The ocfs2_control misc device is how a userspace control daemon (controld)
talks to the filesystem.  Introduce the bare-bones filesystem ops.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add the user stack module. · 8adf0536

Authored by Joel Becker on Nov 28, 2007

Add a skeleton for the stack_user module.  It's just the barebones module
code.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Add the 'cluster_stack' sysfs file. · 9c6c877c

Authored by Joel Becker on Feb 01, 2008

Userspace can now query and specify the cluster stack in use via the
/sys/fs/ocfs2/cluster_stack file.  By default, it is 'o2cb', which is
the classic stack.  Thus, old tools that do not know how to modify this
file will work just fine.  The stack cannot be modified if there is a
live filesystem.

ocfs2_cluster_connect() now takes the expected cluster stack as an
argument.  This way, the filesystem and the stack glue ensure they are
speaking to the same backend.

If the stack is 'o2cb', the o2cb stack plugin is used.  For any other
value, the fsdlm stack plugin is selected.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
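
The selection rule described in this message can be sketched as follows. This is a minimal Python model under stated assumptions, not the kernel implementation: the class, its method names, the `live_filesystems` counter, and the exception types are hypothetical; only the default value 'o2cb', the "no change while a filesystem is live" rule, and the o2cb/fsdlm plugin choice come from the commit text.

```python
class StackGlueModel:
    """Toy model of the cluster_stack file and plugin selection."""

    DEFAULT_STACK = "o2cb"  # the classic stack is the default

    def __init__(self):
        self.cluster_stack = self.DEFAULT_STACK
        self.live_filesystems = 0  # hypothetical count of mounted filesystems

    def write_cluster_stack(self, name):
        # The stack cannot be modified while a filesystem is live.
        if self.live_filesystems:
            raise RuntimeError("cluster stack is in use")
        self.cluster_stack = name

    def connect(self, expected_stack):
        # The filesystem passes the stack it expects, so both sides
        # agree on the backend before a plugin is chosen.
        if expected_stack != self.cluster_stack:
            raise ValueError("filesystem and stack glue disagree")
        self.live_filesystems += 1
        # 'o2cb' selects the o2cb plugin; any other value selects fsdlm.
        return "o2cb" if self.cluster_stack == "o2cb" else "fsdlm"
```

An old tool that never writes the file ends up on the `"o2cb"` branch, which is the backward-compatibility point the message makes.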

ocfs2: Add the USERSPACE_STACK incompat bit. · b61817e1

Authored by Joel Becker on Feb 01, 2008

The filesystem gains the USERSPACE_STACK incompat bit and the
s_cluster_info field on the superblock.  When a userspace stack is in
use, the name of the stack is stored on-disk for mount-time
verification.

The "cluster_stack" option is added to mount(2) processing.  The mount
process needs to pass the matching stack name.  If the passed name and
the on-disk name do not match, the mount fails.

When using the classic o2cb stack, the incompat bit is *not* set and no
mount option is used other than the usual heartbeat=local.  Thus, the
filesystem is compatible with older tools.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
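
The mount-time verification can be summarized in a few lines. The Python function below is a hedged sketch: its name and parameters are hypothetical, and its handling of the classic case (accepting the mount only when no stack name is passed) is an assumption, since the message only says that no such option is used with o2cb.

```python
def mount_allowed(userspace_stack_bit, on_disk_stack, mount_stack):
    """Toy check of the cluster_stack mount option against the superblock.

    userspace_stack_bit: True if the USERSPACE_STACK incompat bit is set.
    on_disk_stack: stack name stored on disk (None for classic o2cb).
    mount_stack: name passed via the "cluster_stack" option, or None.
    """
    if not userspace_stack_bit:
        # Classic o2cb: nothing is stored on disk. Assumed here that
        # the mount passes no cluster_stack option at all.
        return mount_stack is None
    # Userspace stack: the passed name must match the on-disk name.
    return mount_stack == on_disk_stack
```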

ocfs2: Create stack glue sysfs files. · 74ae4e10

Authored by Joel Becker on Jan 31, 2008

Introduce a set of sysfs files that describe the current stack glue
state.  The files live under /sys/fs/ocfs2.  The locking_protocol file
displays the version of ocfs2's locking code.  The
loaded_cluster_plugins file displays all of the currently loaded stack
plugins.  When filesystems are mounted, the active_cluster_plugin file
will display the plugin in use.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Break out stackglue into modules. · 286eaa95

Authored by Joel Becker on Feb 01, 2008

We define the ocfs2_stack_plugin structure to represent a stack driver.
The o2cb stack code is split into stack_o2cb.c.  This becomes the
ocfs2_stack_o2cb.ko module.

The stackglue generic functions are similarly split into the
ocfs2_stackglue.ko module.  This module now provides an interface to
register drivers.  The ocfs2_stack_o2cb driver registers itself.  As
part of this interface, ocfs2_stackglue can load drivers on demand.
This is accomplished in ocfs2_cluster_connect().

ocfs2_cluster_disconnect() is now notified when a _hangup() is pending.
If a hangup is pending, it will not release the driver module and will
let _hangup() do that.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
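
The register-and-load-on-demand idea can be illustrated with a small registry. This Python sketch is a model under assumptions, not the stackglue code: `StackGlueRegistry`, its methods, and the `loader` callback (a stand-in for on-demand module loading) are hypothetical names, and module reference counting and the hangup path are not modeled.

```python
class StackGlueRegistry:
    """Toy model of drivers registering with the stack glue."""

    def __init__(self, loader):
        self._plugins = {}
        # Hypothetical stand-in for loading a driver module on demand.
        # It is expected to call register() on this registry.
        self._loader = loader

    def register(self, name, ops):
        # A driver registers itself under its stack name.
        if name in self._plugins:
            raise ValueError("driver already registered: %s" % name)
        self._plugins[name] = ops

    def connect(self, name):
        # Connecting loads the driver first if it is not registered yet.
        if name not in self._plugins:
            self._loader(name, self)
        if name not in self._plugins:
            raise LookupError("no driver for stack: %s" % name)
        return self._plugins[name]
```

A second `connect()` for the same stack finds the driver already registered and does not invoke the loader again.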

ocfs2: Create ocfs2_stack_operations and split out the o2cb stack. · e3dad42b

Authored by Joel Becker on Feb 01, 2008

Define the ocfs2_stack_operations structure. Build o2cb_stack_ops from
all of the o2cb-specific stack functions. Change the generic stack glue
functions to call the stack_ops instead of the o2cb functions directly.

The o2cb functions are moved to stack_o2cb.c. The headers are cleaned up
to where only needed headers are included.

In this code, stackglue.c and stack_o2cb.c refer to some shared
extern variables. When they become modules, that will change.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Split o2cb code from generic stack functions. · 553aa7e4

Authored by Joel Becker on Feb 01, 2008

Split off the o2cb-specific functionality from the generic stack glue
calls.  This is a precursor to wrapping the o2cb functionality in an
operations vector.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Clean up stackglue initialization · 63e0c48a

Authored by Joel Becker on Jan 30, 2008

The stack glue initialization function needs a better name so that it can be
used cleanly when stackglue becomes a module.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Abstract out a debugging function for underlying dlms. · cf0acdcd

Authored by Joel Becker on Jan 29, 2008

dlmglue.c was still referencing a raw o2dlm lksb in one instance.  Let's
create a generic ocfs2_dlm_dump_lksb() function.  This allows underlying
DLMs to print whatever they want about their lock.

We then move the o2dlm dump into stackglue.c where it belongs.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: handle async EAGAIN from NOQUEUE request · 1693a5c0

Authored by David Teigland on Jan 30, 2008

When using fsdlm, -EAGAIN is returned in the async callback for NOQUEUE
requests. Fix up dlmglue to expect this.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Remove CANCELGRANT from the view of dlmglue. · de551246

Authored by Joel Becker on Feb 01, 2008

o2dlm has the non-standard behavior of providing a cancel callback
(unlock_ast) even when the cancel has failed (the locking operation
succeeded without canceling). This is called CANCELGRANT after the
status code sent to the callback. fs/dlm does not provide this
callback, so dlmglue must be changed to live without it.
o2dlm_unlock_ast_wrapper() in stackglue now ignores CANCELGRANT calls.

Because dlmglue no longer sees CANCELGRANT, ocfs2_unlock_ast() no longer
needs to check for it. ocfs2_locking_ast() must catch that a cancel was
tried and clear the cancel state.

Making these changes opens up a locking race. dlmglue uses the
OCFS2_LOCK_BUSY flag to ensure only one thread is calling the dlm at any
one time. But dlmglue must unlock the lockres before calling into the
dlm. In the small window of time between unlocking the lockres and
calling the dlm, the downconvert thread can try to cancel the lock. The
downconvert thread is checking the OCFS2_LOCK_BUSY flag - it doesn't
know that ocfs2_dlm_lock() has not yet been called.

Because ocfs2_dlm_lock() has not yet been called, the cancel operation
will just be a no-op. There's nothing to cancel. With CANCELGRANT,
dlmglue uses the CANCELGRANT callback to clear up the cancel state.
When it comes around again, it will retry the cancel. Eventually, the
first thread will have called into ocfs2_dlm_lock(), and either the
lock or the cancel will succeed. The downconvert thread can then do its
downconvert.

Without CANCELGRANT, there is nothing to clean up the cancellation
state. The downconvert thread does not know to retry its operations.
More importantly, the original lock may be blocking on the other node
that is trying to cancel us. With neither able to make progress, the
ast is never called and the cancellation state is never cleaned up that
way. dlmglue is deadlocked.

The OCFS2_LOCK_PENDING flag is introduced to remedy this window. It is
set at the same time OCFS2_LOCK_BUSY is. Thus, the downconvert thread
can check whether the lock is cancelable. If not, it just loops around
to try again. Once ocfs2_dlm_lock() is called, the thread then clears
OCFS2_LOCK_PENDING and wakes the downconvert thread. Now, if the
downconvert thread finds the lock BUSY, it can safely try to cancel it.
Whether the cancel works or not, the state will be properly set and the
lock processing can continue.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
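
The role of OCFS2_LOCK_PENDING can be shown with a flag-only model. The Python sketch below is an illustration, not dlmglue: the numeric flag values are placeholders, the three helper functions are hypothetical names, and threads, spinlocks, and wakeups are deliberately left out.

```python
# Placeholder bit values; the real flag values are not given in the text.
OCFS2_LOCK_BUSY = 0x1
OCFS2_LOCK_PENDING = 0x2


def start_request(flags):
    # PENDING is set at the same time BUSY is, before calling the dlm.
    return flags | OCFS2_LOCK_BUSY | OCFS2_LOCK_PENDING


def dlm_called(flags):
    # Once ocfs2_dlm_lock() has been called, PENDING is cleared
    # (the real code also wakes the downconvert thread here).
    return flags & ~OCFS2_LOCK_PENDING


def downconvert_step(flags):
    # What the downconvert thread may do for a given flag state.
    if not (flags & OCFS2_LOCK_BUSY):
        return "downconvert"
    if flags & OCFS2_LOCK_PENDING:
        # BUSY but the dlm has not been called yet: nothing to cancel.
        return "retry"
    # BUSY and the request is really in the dlm: a cancel is meaningful.
    return "cancel"
```

The `"retry"` branch covers the window between unlocking the lockres and calling the dlm, the case in which a cancel would be a no-op.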

ocfs2: Fill node number during cluster stack init · 0abd6d18

Authored by Mark Fasheh on Jan 29, 2008

It doesn't make sense to query for a node number before connecting to the
cluster stack. This should be safe to do because node_num is only just
printed, and we're actually only moving the setting of node num a small
amount further in the mount process.

[ Disconnect when node query fails -- Joel ]
Reviewed-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Move o2hb functionality into the stack glue. · 6953b4c0

Authored by Joel Becker on Jan 29, 2008

The last bit of classic stack used directly in ocfs2 code is o2hb.
Specifically, the check for heartbeat during mount and the call to
ocfs2_hb_ctl during unmount.

We create an extra API, ocfs2_cluster_hangup(), to encapsulate the call
to ocfs2_hb_ctl.  Other stacks will just leave hangup() empty.

The check for heartbeat is moved into ocfs2_cluster_connect().  It will
be matched by a similar check for other stacks.

With this change, only stackglue.c includes cluster/ headers.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Abstract out node number queries. · 19fdb624

Authored by Joel Becker on Jan 30, 2008

ocfs2 asks the cluster stack for the local node's node number for two
reasons: to fill the slot map and to print it. While the slot map isn't
necessary for userspace cluster stacks, the printing is very nice for
debugging. Thus we add ocfs2_cluster_this_node() as a generic API to get
this value. It is anticipated that the slot map will not be used under a
userspace cluster stack, so validity checks of the node num only need to
exist in the slot map code. Otherwise, it just gets used and printed as an
opaque value.

[ Fixed up some "int" versus "unsigned int" issues and made osb->node_num
  truly opaque. --Mark ]
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Introduce the new ocfs2_cluster_connect/disconnect() API. · 4670c46d

Authored by Joel Becker on Feb 01, 2008

This step introduces a cluster stack agnostic API for initializing and
exiting. fs/ocfs2/dlmglue.c no longer uses o2cb/o2dlm knowledge to
connect to the stack. It is all handled in stackglue.c.

heartbeat.c no longer needs to know how it gets called.
ocfs2_do_node_down() is now a clean recovery trigger.

The big gotcha is the ordering of initializations and de-initializations done
underneath ocfs2_cluster_connect(). ocfs2_dlm_init() used to do all
o2dlm initialization in one block. Thus, the o2dlm functionality of
ocfs2_cluster_connect() is very straightforward. ocfs2_dlm_shutdown(),
however, did a few things between de-registration of the eviction
callback and actually shutting down the domain. Now de-registration and
shutdown of the domain are wrapped within the single
ocfs2_cluster_disconnect() call. I've checked the code paths to make
sure we can safely tear down things in ocfs2_dlm_shutdown() before
calling ocfs2_cluster_disconnect(). The filesystem has already set
itself to ignore the callback.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Create the lock status block union. · 8f2c9c1b

Authored by Joel Becker on Feb 01, 2008

Wrap the lock status block (lksb) in a union.  Later we will add a union
element for the fs/dlm lksb.  Create accessors for the status and lvb
fields.

Other than a debugging function, dlmglue.c does not directly reference
the o2dlm locking path anymore.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Use -errno instead of dlm_status for ocfs2_dlm_lock/unlock() API. · 7431cd7e

Authored by Joel Becker on Feb 01, 2008

Change the ocfs2_dlm_lock/unlock() functions to return -errno values.
This is the first step towards eliminating dlm_status in
fs/ocfs2/dlmglue.c.  The change also passes -errno values to
->unlock_ast().

[ Fix a return code in dlmglue.c and change the error translation table into
  an array of ints. --Mark ]
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Use global DLM_ constants in generic code. · bd3e7610

Authored by Joel Becker on Feb 01, 2008

The ocfs2 generic code should use the values in <linux/dlmconstants.h>.
stackglue.c will convert them to o2dlm values.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Separate out dlm lock functions. · 24ef1815

Authored by Joel Becker on Jan 29, 2008

This is the first in a series of patches to isolate ocfs2 from the
underlying cluster stack. Here we wrap the dlm locking functions with
ocfs2-specific calls. Because ocfs2 always uses the same dlm lock status
callbacks, we can eliminate the callbacks from the filesystem visible
functions.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: New slot map format · 386a2ef8

Authored by Joel Becker on Feb 01, 2008

The old slot map had a few limitations:

- It was limited to one block, so the maximum slot count was 255.
- Each slot was signed 16bits, limiting node numbers to INT16_MAX.
- An empty slot was marked by the magic 0xFFFF (-1).

The new slot map format provides 32bit node numbers (UINT32_MAX), a
separate space to mark a slot in use, and extra room to grow.  The slot
map is now bounded by i_size, not a block.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
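
The difference between the two formats can be made concrete by parsing a few slots. In the Python sketch below, the old format follows the message (little-endian 16-bit entries, 0xFFFF meaning empty), while the new-format record layout (one validity byte, three padding bytes, a little-endian 32-bit node number) is an assumption chosen for illustration; the function names are hypothetical.

```python
import struct

OLD_EMPTY_SLOT = 0xFFFF  # magic "empty" value in the old format

# Assumed layout of one new-format slot: 1 validity byte, 3 pad bytes,
# then a little-endian 32-bit node number (8 bytes in total).
NEW_SLOT = struct.Struct("<B3xI")


def parse_old_slot_map(buf, num_slots):
    # Old format: an array of little-endian 16-bit node numbers.
    values = struct.unpack_from("<%dH" % num_slots, buf)
    return [None if v == OLD_EMPTY_SLOT else v for v in values]


def pack_new_slot_map(nodes):
    # None marks a free slot; validity is stored separately from the number.
    return b"".join(
        NEW_SLOT.pack(0, 0) if n is None else NEW_SLOT.pack(1, n)
        for n in nodes
    )


def parse_new_slot_map(buf, num_slots):
    nodes = []
    for i in range(num_slots):
        valid, node = NEW_SLOT.unpack_from(buf, i * NEW_SLOT.size)
        nodes.append(node if valid else None)
    return nodes
```

Because validity lives in its own field, no node number has to be reserved as a magic value, and node numbers beyond the 16-bit range fit without a special case.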

ocfs2: Define the contents of the slot_map file. · fb86b1f0

Authored by Joel Becker on Feb 01, 2008

The slot map file is merely an array of __le16.  Wrap it in a structure for
cleaner reference.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: De-magic the in-memory slot map. · fc881fa0

Authored by Joel Becker on Feb 01, 2008

The in-memory slot map uses the same magic as the on-disk one.  There is
a special value to mark a slot as invalid.  It relies on the size of
certain types and so on.

Write a new in-memory map that keeps validity as a separate field.  Outside
of the I/O functions, OCFS2_INVALID_SLOT now means what it is supposed to.
It also is no longer tied to the type size.

This also means that only the I/O functions refer to 16bit quantities.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: slot_map I/O based on max_slots. · 1c8d9a6a

Authored by Joel Becker on Feb 01, 2008

The slot map code assumed a slot_map file has one block allocated.
This changes the code to do I/O on as many blocks as will cover max_slots.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Change the recovery map to an array of node numbers. · 553abd04

Authored by Joel Becker on Feb 01, 2008

The old recovery map was a bitmap of node numbers.  This was sufficient
for the maximum node number of 254.  Going forward, we want node numbers
to be UINT32.  Thus, we need a new recovery map.

Note that we can't keep track of slots here.  We must write down the
node number to recover *before* we get the locks needed to convert a
node number into a slot number.

The recovery map is now an array of unsigned ints, max_slots in size.
It moves to journal.c with the rest of recovery.

Because it needs to be initialized, we move all of recovery initialization
into a new function, ocfs2_recovery_init().  This actually cleans up
ocfs2_initialize_super() a little as well.  Following on, recovery cleanup
becomes part of ocfs2_recovery_exit().

A number of node map functions are rendered obsolete and are removed.

Finally, waiting on recovery is wrapped in a function rather than naked
checks on the recovery_event.  This is a cleanup from Mark.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
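
A bounded array of node numbers, as opposed to a bitmap, can be modeled in a few lines. This Python sketch is an illustration only: the class and method names are hypothetical, and the return value of `set()` and the overflow behavior are assumptions rather than details taken from the patch.

```python
class RecoveryMap:
    """Toy recovery map: an array of node numbers, max_slots in size."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.entries = []  # node numbers currently awaiting recovery

    def set(self, node_num):
        """Record node_num; return True if it was already recorded."""
        if node_num in self.entries:
            return True
        if len(self.entries) >= self.max_slots:
            # Assumption: at most max_slots nodes can need recovery at once.
            raise OverflowError("recovery map is full")
        self.entries.append(node_num)
        return False

    def clear(self, node_num):
        # Drop the node once its recovery has completed.
        if node_num in self.entries:
            self.entries.remove(node_num)
```

Unlike a bitmap indexed by node number, the array's size depends on `max_slots` rather than on the largest possible node number, which is what lets node numbers grow to 32 bits.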

ocfs2: Make ocfs2_slot_info private. · d85b20e4

Authored by Joel Becker on Feb 01, 2008

Just use osb_lock around the ocfs2_slot_info data.  This allows us to
take the ocfs2_slot_info structure private in slot_info.c.  All access
is now via accessors.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
