1. 01 May 2007 (3 commits)
    • [DLM] add orphan purging code (1/2) · 8499137d
      Committed by David Teigland
      Add code for purging orphan locks.  A process can also purge all of its
      own non-orphan locks by passing a pid of zero.  Code already exists for
      processes to create persistent locks that become orphans when the process
      exits, but the complementary capability for another process to then purge
      these orphans has been missing.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
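      A minimal userspace sketch of the purge semantics described above, assuming a simple
      per-lockspace list; the struct and function names are hypothetical, not the dlm's actual API:

      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <unistd.h>

      struct lk {
          pid_t owner;
          int orphan;            /* set when the owning process exited */
          struct lk *next;
      };

      /* pid != 0: purge that pid's orphans; pid == 0: purge the caller's own locks */
      static void purge_locks(struct lk **head, pid_t pid, pid_t caller)
      {
          struct lk **pp = head;

          while (*pp) {
              struct lk *lk = *pp;
              int match = pid ? (lk->orphan && lk->owner == pid)
                              : (!lk->orphan && lk->owner == caller);
              if (match) {
                  *pp = lk->next;
                  printf("purged lock of pid %d\n", (int)lk->owner);
                  free(lk);
              } else {
                  pp = &lk->next;
              }
          }
      }

      int main(void)
      {
          struct lk *head = calloc(1, sizeof(*head));
          if (!head)
              return 1;
          head->owner = 1234;
          head->orphan = 1;                      /* pretend pid 1234 exited */
          purge_locks(&head, 1234, getpid());    /* purge 1234's orphans */
          return 0;
      }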
    • [DLM] split create_message function · 7e4dac33
      Committed by David Teigland
      This splits the current create_message() function into two parts so that
      later patches can call the new lower-level _create_message() function when
      they don't have an rsb struct.  No functional change in this patch.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
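      The shape of the split, as a hedged illustration (signatures and fields are invented, not the
      kernel's): the low-level helper takes the resource name explicitly so callers without an rsb
      can use it, and the original function becomes a thin wrapper.

      #include <stdlib.h>
      #include <string.h>

      struct rsb { char name[64]; int namelen; };
      struct msg { int type; int namelen; char name[64]; };

      /* low-level constructor: usable without an rsb */
      static struct msg *_create_message(int type, const char *name, int namelen)
      {
          struct msg *m = calloc(1, sizeof(*m));
          if (!m)
              return NULL;
          m->type = type;
          m->namelen = namelen;
          memcpy(m->name, name, (size_t)namelen);
          return m;
      }

      /* original entry point, now a wrapper that pulls its args from the rsb */
      static struct msg *create_message(int type, const struct rsb *r)
      {
          return _create_message(type, r->name, r->namelen);
      }

      int main(void)
      {
          struct rsb r = { "resource-1", 10 };
          struct msg *m = create_message(1, &r);
          free(m);
          return 0;
      }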
    • [DLM] overlapping cancel and unlock · ef0c2bb0
      Committed by David Teigland
      Full cancel and force-unlock support.  In the past, cancel and force-unlock
      wouldn't work if there was another operation in progress on the lock.  Now,
      both cancel and unlock-force can overlap an operation on a lock, meaning there
      may be 2 or 3 operations in progress on a lock in parallel.  This support is
      important not only because cancel and force-unlock are explicit operations
      that an app can use, but both are used implicitly when a process exits while
      holding locks.
      
      Summary of changes:
      
      - add-to and remove-from waiters functions were rewritten to handle situations
        with more than one remote operation outstanding on a lock
      
      - validate_unlock_args detects when an overlapping cancel/unlock-force
        can be sent and when it needs to be delayed until a request/lookup
        reply is received
      
      - processing request/lookup replies detects when cancel/unlock-force
        occurred during the op, and carries out the delayed cancel/unlock-force
      
      - manipulation of the "waiters" (remote operation) state of a lock moved under
        the standard rsb mutex that protects all the other lock state
      
      - the two recovery routines related to locks on the waiters list changed
        according to the way lkb's are now locked before accessing waiters state
      
      - waiters recovery detects when lkb's being recovered have overlapping
        cancel/unlock-force, and may not recover such locks
      
      - revert_lock (cancel) returns a value to distinguish cases where it did
        nothing vs cases where it actually did a cancel; the cancel completion ast
        should only be done when cancel did something
      
      - orphaned locks put on new list so they can be found later for purging
      
      - cancel must be called on a lock when making it an orphan
      
      - flag user locks (ENDOFLIFE) at the end of their useful life (to the
        application) so we can return an error for any further cancel/unlock-force
      
      - we weren't setting COMP/BAST ast flags if one was already set, so we'd lose
        either a completion or blocking ast
      
      - clear an unread bast on a lock that's become unlocked
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
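      A toy model of the delayed-overlap idea from the summary above (flag names are illustrative,
      not the actual dlm flags): a cancel that arrives while a request reply is outstanding is only
      recorded, and the reply-processing path carries it out.

      #include <stdio.h>

      #define WAIT_REQUEST   0x01   /* request outstanding on a remote node */
      #define OVERLAP_CANCEL 0x02   /* cancel arrived while waiting */

      struct lkb { unsigned int flags; };

      static void cancel_lock(struct lkb *lkb)
      {
          if (lkb->flags & WAIT_REQUEST) {
              /* too early to cancel: remember it for the reply path */
              lkb->flags |= OVERLAP_CANCEL;
              return;
          }
          printf("cancel performed immediately\n");
      }

      static void receive_request_reply(struct lkb *lkb)
      {
          lkb->flags &= ~WAIT_REQUEST;
          if (lkb->flags & OVERLAP_CANCEL) {
              lkb->flags &= ~OVERLAP_CANCEL;
              printf("carrying out delayed cancel\n");
          }
      }

      int main(void)
      {
          struct lkb lkb = { .flags = WAIT_REQUEST };
          cancel_lock(&lkb);            /* delayed: request still in flight */
          receive_request_reply(&lkb);  /* reply processing runs the cancel */
          return 0;
      }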
  2. 06 Feb 2007 (9 commits)
    • [DLM] zero new user lvbs · 62a0f623
      Committed by David Teigland
      A new lvb for a userland lock wasn't being initialized to zero.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
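      The class of bug in one line, as a hedged sketch (the length constant is made up): an lvb
      allocated with a zeroing allocator (calloc() here, kzalloc() in kernel code) cannot hand
      stale heap contents to the application.

      #include <stdlib.h>

      #define LVB_LEN 32   /* illustrative length, not the dlm constant */

      static char *alloc_user_lvb(void)
      {
          return calloc(1, LVB_LEN);   /* zero-filled, unlike plain malloc() */
      }

      int main(void)
      {
          char *lvb = alloc_user_lvb();
          free(lvb);
          return 0;
      }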
    • [DLM] can miss clearing resend flag · b790c3b7
      Committed by David Teigland
      A long, complicated sequence of events, beginning with the RESEND flag not
      being cleared on an lkb, can result in an unlock never completing.
      
      - lkb on waiters list for remote lookup
      - the remote node is both the dir node and the master node, so
        it optimizes the lookup into a request and sends a request
        reply back
      - the request reply is saved on the requestqueue to be processed
        after recovery
      - recovery runs dlm_recover_waiters_pre() which sets RESEND flag
        so the lookup will be resent after recovery
      - end of recovery: process_requestqueue takes saved request reply
        which removes the lkb off the waiters list, _without_ clearing
        the RESEND flag
      - end of recovery: dlm_recover_waiters_post() doesn't do anything
        with the now completed lookup lkb (would usually clear RESEND)
      - later, the node unmounts, unlocks this lkb that still has RESEND
        flag set
      - the lkb is on the waiters list again, now for unlock, when recovery
        occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND
        set, doesn't do anything since the master still exists
      - end of recovery: dlm_recover_waiters_post() takes this lkb off
        the waiters list because it has the RESEND flag set, then reports
        an error because unlocks are never supposed to be handled in
        recover_waiters_post().
      - later, the unlock reply is received, doesn't find the lkb on
        the waiters list because recover_waiters_post() has wrongly
        removed it.
      - the unlock operation has been lost, and we're left with a
        stray granted lock
      - unmount spins waiting for the unlock to complete
      
      The visible evidence of this problem will be a node where gfs umount is
      spinning, the dlm waiters list will be empty, and the dlm locks list will
      show a granted lock.
      
      The fix is simply to clear the RESEND flag when taking an lkb off the
      waiters list.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
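      A minimal sketch of the fix's shape (names and flag values invented): clearing the resend
      flag in the single routine that removes an entry from the waiters list, so no caller can
      leave it set by accident.

      #include <stdio.h>

      #define FLAG_RESEND 0x01u

      struct lkb {
          unsigned int flags;
          int on_waiters;
      };

      static void remove_from_waiters(struct lkb *lkb)
      {
          lkb->on_waiters = 0;
          /* the fix: always clear RESEND here, no matter which path
           * (reply processing, recovery, ...) does the removal */
          lkb->flags &= ~FLAG_RESEND;
      }

      int main(void)
      {
          struct lkb lkb = { .flags = FLAG_RESEND, .on_waiters = 1 };
          remove_from_waiters(&lkb);
          printf("RESEND still set: %s\n",
                 (lkb.flags & FLAG_RESEND) ? "yes" : "no");
          return 0;
      }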
    • [DLM] saved dlm message can be dropped · 8fd3a98f
      Committed by David Teigland
      dlm_receive_message() returns 0 instead of returning 'error'.  What would
      happen is that process_requestqueue would take a saved message off the
      requestqueue and call receive_message on it.  receive_message would then
      see that recovery had been aborted, set error to EINTR, and 'goto out',
      expecting that the error would be returned.  Instead, 0 was always
      returned, so process_requestqueue would think that the message had been
      processed and delete it instead of saving it to process next time.  This
      means the message (usually an unlock in my tests) would be lost.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
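      The error-propagation bug in miniature, with invented signatures standing in for the dlm
      functions named above: the receive path computes an error but returns zero, so the caller
      wrongly treats the saved message as handled and deletes it.

      #include <errno.h>
      #include <stdio.h>

      static int recovery_aborted = 1;

      static int receive_message(int msg)
      {
          int error = 0;

          if (recovery_aborted) {
              error = -EINTR;
              goto out;
          }
          printf("processed message %d\n", msg);
      out:
          return error;               /* the fix: this used to be "return 0;" */
      }

      static void process_requestqueue(int saved_msg)
      {
          if (receive_message(saved_msg))
              printf("keeping message %d for next time\n", saved_msg);
          else
              printf("deleting message %d\n", saved_msg);
      }

      int main(void)
      {
          process_requestqueue(42);
          return 0;
      }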
    • [DLM] fix user unlocking · a1bc86e6
      Committed by David Teigland
      When a user process exits, we clear all the locks it holds.  There is a
      problem, though, with locks that the process had begun unlocking before it
      exited.  We couldn't find the lkb's that were in the process of being
      unlocked remotely, to flag that they are DEAD.  To solve this, we move
      lkb's being unlocked onto a new list in the per-process structure that
      tracks what locks the process is holding.  We can then go through this
      list to flag the necessary lkb's when clearing locks for a process when it
      exits.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
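      A sketch of the bookkeeping described above, assuming a simplified per-process structure
      (fixed arrays instead of kernel lists, invented names): locks whose unlock is in flight move
      to a second list so exit-time cleanup can still find them and flag them DEAD.

      #include <stdio.h>

      #define MAXLOCKS  4
      #define FLAG_DEAD 0x01u

      struct lkb { int id; unsigned int flags; };

      struct process_locks {
          struct lkb *held[MAXLOCKS];       /* locks currently held */
          struct lkb *unlocking[MAXLOCKS];  /* unlocks still in flight */
      };

      static void begin_unlock(struct process_locks *p, int i)
      {
          p->unlocking[i] = p->held[i];     /* move, don't lose track of it */
          p->held[i] = NULL;
      }

      static void clear_locks_on_exit(struct process_locks *p)
      {
          for (int i = 0; i < MAXLOCKS; i++) {
              if (p->held[i])
                  p->held[i]->flags |= FLAG_DEAD;
              if (p->unlocking[i])          /* now findable too */
                  p->unlocking[i]->flags |= FLAG_DEAD;
          }
      }

      int main(void)
      {
          struct lkb a = { 1, 0 };
          struct process_locks p = { .held = { &a } };

          begin_unlock(&p, 0);              /* unlock starts, process then exits */
          clear_locks_on_exit(&p);
          printf("lock %d flagged DEAD: %s\n", a.id,
                 (a.flags & FLAG_DEAD) ? "yes" : "no");
          return 0;
      }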
    • [DLM] rename dlm_config_info fields · 68c817a1
      Committed by David Teigland
      Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
      can use macros to add configfs functions to access them (in a later
      patch).  No functional changes in this patch, just naming changes.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
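      Why a uniform prefix helps, shown with a hypothetical macro (the field names and the macro
      are illustrative, not the later configfs patch): with every member named ci_<x>, a single
      macro can stamp out an accessor per field by token pasting.

      #include <stdio.h>

      struct dlm_config_info {
          int ci_tcp_port;
          int ci_buffer_size;
      };

      #define CONFIG_GETTER(field)                                  \
          static int get_##field(const struct dlm_config_info *ci)  \
          {                                                         \
              return ci->ci_##field;                                \
          }

      CONFIG_GETTER(tcp_port)
      CONFIG_GETTER(buffer_size)

      int main(void)
      {
          struct dlm_config_info ci = { .ci_tcp_port = 21064,
                                        .ci_buffer_size = 4096 };
          printf("%d %d\n", get_tcp_port(&ci), get_buffer_size(&ci));
          return 0;
      }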
    • [DLM] fix lost flags in stub replies · 075529b5
      Committed by David Teigland
      When the dlm fakes an unlock/cancel reply from a failed node using a stub
      message struct, it wasn't setting the flags in the stub message.  So, in
      the process of receiving the fake message the lkb flags would be updated
      and cleared from the zero flags in the message.  The problem observed in
      tests was the loss of the USER flag which caused the dlm to think a user
      lock was a kernel lock and subsequently fail an assertion checking the
      validity of the ast/callback field.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
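      A minimal picture of the problem, with invented structures (not the real message layout): if
      the stub reply's flags are left zero, merging them back into the lkb wipes real flags such as
      the one marking a user lock.

      #include <stdio.h>

      #define FLAG_USER 0x04u

      struct lkb { unsigned int flags; };
      struct msg { unsigned int flags; };

      static void make_stub_reply(struct msg *stub, const struct lkb *lkb)
      {
          stub->flags = lkb->flags;   /* the fix: don't leave this zero */
      }

      static void receive_reply(struct lkb *lkb, const struct msg *m)
      {
          lkb->flags = m->flags;      /* lkb flags refreshed from the message */
      }

      int main(void)
      {
          struct lkb lkb = { .flags = FLAG_USER };
          struct msg stub = { 0 };

          make_stub_reply(&stub, &lkb);   /* without the fix: stub.flags == 0 */
          receive_reply(&lkb, &stub);
          printf("USER flag kept: %s\n",
                 (lkb.flags & FLAG_USER) ? "yes" : "no");
          return 0;
      }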
    • [DLM] fix receive_request() lvb copying · 8d07fd50
      Committed by David Teigland
      LVB's are not sent as part of new requests, but the code receiving the
      request was copying data into the lvb anyway.  The space in the message
      where it mistakenly thought the lvb lived actually contained the resource
      name, so it wound up incorrectly copying this name data into the lvb.  Fix
      is to just create the lvb, not copy junk into it.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] fix send_args() lvb copying · da49f36f
      Committed by David Teigland
      The send_args() function is used to copy parameters into a message for a
      number of different message types.  Only some of those types are set up
      beforehand (in create_message) to include space for sending lvb data.
      send_args was wrongly copying the lvb for all message types as long as the
      lock had an lvb.  This means that the lvb data was being written past the
      end of the message into unknown space.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
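      The shape of the fix as a hedged sketch (the message types, and which of them carry an lvb,
      are made up): the lvb is copied only when the message type was created with room for it, not
      merely because the lock has one.

      #include <string.h>

      #define LVB_LEN 32

      enum msg_type { MSG_REQUEST, MSG_CONVERT, MSG_GRANT };

      struct msg {
          enum msg_type type;
          char lvb_space[LVB_LEN];   /* only meaningful for some types */
      };

      static int type_has_lvb(enum msg_type t)
      {
          return t == MSG_CONVERT || t == MSG_GRANT;   /* illustrative set */
      }

      static void send_args(struct msg *m, const char *lock_lvb)
      {
          /* the old behaviour copied whenever the lock had an lvb; the fix also
           * requires the message type to have been sized for one */
          if (lock_lvb && type_has_lvb(m->type))
              memcpy(m->lvb_space, lock_lvb, LVB_LEN);
      }

      int main(void)
      {
          struct msg m = { .type = MSG_REQUEST };
          char lvb[LVB_LEN] = { 0 };

          send_args(&m, lvb);        /* no copy: requests carry no lvb space */
          return 0;
      }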
    • [DLM] fix resend rcom lock · dc200a88
      Committed by David Teigland
      There's a chance the new master of a resource hasn't learned it's the new
      master before another node sends it a lock during recovery.  The node
      sending the lock needs to resend if this happens.
      
      - A sends a master lookup for resource R to C
      - B sends a master lookup for resource R to C
      - C receives A's lookup, assigns A to be master of R and
        sends a reply back to A
      - C receives B's lookup and sends a reply back to B saying
        that A is the master
      - B receives lookup reply from C and sends its lock for R to A
      - A receives lock from B, doesn't think it's the master of R
        and sends an error back to B
      - A receives lookup reply from C and becomes master of R
      - B gets error back from A and resends its lock back to A
        (this resending is what this patch does)
      - A receives lock from B, it now sees it's the master of R
        and takes the lock
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
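      A toy version of the resend step this patch adds (the function and return codes are
      invented): if the presumed master has not yet learned it is the master, it answers with an
      error and the sender simply tries again.

      #include <stdio.h>

      static int master_knows_it;   /* flips once the master's own lookup reply lands */

      static int send_rcom_lock(void)
      {
          if (!master_knows_it) {
              master_knows_it = 1;  /* the lookup reply arrives in the meantime */
              return -1;            /* "I don't think I'm the master" */
          }
          return 0;                 /* lock accepted by the master */
      }

      int main(void)
      {
          while (send_rcom_lock() != 0)
              printf("master not ready, resending the lock\n");
          printf("lock accepted\n");
          return 0;
      }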
  3. 30 Nov 2006 (2 commits)
    • [DLM] clear sbflags on lock master · 6f90a8b1
      Committed by David Teigland
      RH BZ 211622
      
      The ALTMODE flag can be set in the lock master's copy of the lock but
      never cleared, so ALTMODE will also be returned in a subsequent conversion
      of the lock when it shouldn't be.  This results in lock_dlm incorrectly
      switching to the alternate lock mode when returning the result to gfs
      which then asserts when it sees the wrong lock state.  The fix is to
      propagate the cleared sbflags value to the master node when the lock is
      requested.  QA's d_rwrandirectlarge test triggers this bug very quickly.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] fix requestqueue race · d4400156
      Committed by David Teigland
      Red Hat BZ 211914
      
      There's a race between dlm_recoverd (1) enabling locking and (2) clearing
      out the requestqueue, and dlm_recvd (1) checking if locking is enabled and
      (2) adding a message to the requestqueue.  An order of recoverd(1),
      recvd(1), recvd(2), recoverd(2) will result in a message being left on the
      requestqueue.  The fix is to have dlm_recvd check whether dlm_recoverd has
      enabled locking after taking the requestqueue mutex; if it has, dlm_recvd
      processes the message instead of queueing it.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
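      The fix's structure in a simplified userspace model (a pthread mutex instead of the kernel's,
      invented names): the receive side re-checks whether locking has been enabled only after it
      holds the requestqueue mutex, so recovery cannot drain the queue in between.

      #include <pthread.h>
      #include <stdio.h>

      static pthread_mutex_t rq_mutex = PTHREAD_MUTEX_INITIALIZER;
      static int locking_enabled;
      static int requestqueue_len;

      /* dlm_recvd side: queue the message unless locking is already enabled */
      static void add_to_requestqueue(int msg)
      {
          pthread_mutex_lock(&rq_mutex);
          if (locking_enabled) {
              printf("processing message %d directly\n", msg);
          } else {
              requestqueue_len++;
              printf("queued message %d\n", msg);
          }
          pthread_mutex_unlock(&rq_mutex);
      }

      /* dlm_recoverd side: enable locking and drain the queue under the same mutex */
      static void enable_locking_and_drain(void)
      {
          pthread_mutex_lock(&rq_mutex);
          locking_enabled = 1;
          printf("draining %d queued message(s)\n", requestqueue_len);
          requestqueue_len = 0;
          pthread_mutex_unlock(&rq_mutex);
      }

      int main(void)
      {
          add_to_requestqueue(1);       /* queued: locking not yet enabled */
          enable_locking_and_drain();
          add_to_requestqueue(2);       /* processed directly, never left behind */
          return 0;
      }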
  4. 25 Sep 2006 (1 commit)
  5. 09 Sep 2006 (1 commit)
    • [DLM] confirm master for recovered waiting requests · fa9f0e49
      Committed by David Teigland
      Fixing the following scenario:
      - A request is on the waiters list waiting for a reply from a remote node.
      - The request is the first one on the resource, so first_lkid is set.
      - The remote node fails causing recovery.
      - During recovery the requesting node becomes master.
      - The request is now processed locally instead of being a remote operation.
      - At this point we need to call confirm_master() on the resource since
        we're certain we're now the master node.  This will clear first_lkid.
      - We weren't calling confirm_master(), so first_lkid was not being cleared
        causing subsequent requests on that resource to get stuck.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
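      A sketch of the first_lkid bookkeeping described above (simplified, with invented helpers):
      while first_lkid is set only that request may proceed, so becoming master locally must clear
      it.

      #include <stdio.h>

      struct rsb { int first_lkid; };

      /* called once this node is certain it is the master */
      static void confirm_master(struct rsb *r)
      {
          r->first_lkid = 0;   /* nothing holds back later requests any more */
      }

      static int request_may_proceed(const struct rsb *r, int lkid)
      {
          /* while first_lkid is set, only that first request may proceed */
          return !r->first_lkid || r->first_lkid == lkid;
      }

      int main(void)
      {
          struct rsb r = { .first_lkid = 7 };

          printf("lkid 9 may proceed: %d\n", request_may_proceed(&r, 9));
          confirm_master(&r);   /* recovery made this node the master */
          printf("lkid 9 may proceed: %d\n", request_may_proceed(&r, 9));
          return 0;
      }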
  6. 24 Aug 2006 (1 commit)
  7. 23 Aug 2006 (2 commits)
  8. 21 Aug 2006 (1 commit)
  9. 08 Aug 2006 (1 commit)
  10. 26 Jul 2006 (2 commits)
  11. 20 Jul 2006 (2 commits)
  12. 13 Jul 2006 (1 commit)
  13. 03 May 2006 (1 commit)
  14. 01 Mar 2006 (1 commit)
  15. 23 Feb 2006 (1 commit)
  16. 20 Jan 2006 (1 commit)
  17. 18 Jan 2006 (1 commit)