1. 09 Jul 2007 · 1 commit
    • [DLM] add lock timeouts and warnings [2/6] · 3ae1acf9
      David Teigland authored
      New features: lock timeouts and time warnings.  If the DLM_LKF_TIMEOUT
      flag is set, the request/conversion is canceled after waiting the
      specified number of centiseconds (specified per lock).  This feature is
      currently only available for locks requested through libdlm (it can be
      enabled for kernel dlm users if a need arises).
      
      If the new DLM_LSFL_TIMEWARN flag is set when creating the lockspace, then
      a warning message will be sent to userspace (using genetlink) after a
      request/conversion has been waiting for a given number of centiseconds
      (configurable per node).  The time warnings will be used in the future
      to do deadlock detection in userspace.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
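      A minimal userspace sketch of the new timeout flag follows.  It assumes
      the dlm_ls_lockx() entry point added on the libdlm side of this series;
      the exact signature, the cancel status delivered in the lksb, and the
      resource name are assumptions, so read it as an illustration rather
      than the patch itself.

      /* Hedged sketch: request an EX lock that the dlm cancels if it
       * is still waiting after 5 seconds (timeouts are in centiseconds).
       * dlm_ls_lockx() and its argument order are assumptions based on
       * the libdlm side of this series; consult libdlm.h.  Event
       * dispatch (dlm_get_fd()/dlm_dispatch()) is elided. */
      #include <stdio.h>
      #include <string.h>
      #include <stdint.h>
      #include <libdlm.h>

      static struct dlm_lksb lksb;

      static void ast_cb(void *arg)
      {
          /* A timed-out request completes here with a nonzero
           * (cancel) status in sb_status. */
          if (lksb.sb_status != 0)
              fprintf(stderr, "lock canceled/failed: %d\n",
                      lksb.sb_status);
      }

      int request_with_timeout(dlm_lshandle_t ls)
      {
          uint64_t timeout_cs = 500;  /* 500 centiseconds = 5s */

          return dlm_ls_lockx(ls, LKM_EXMODE, &lksb, DLM_LKF_TIMEOUT,
                              "myres", strlen("myres"), 0,
                              ast_cb, NULL, NULL,
                              NULL /* xid */, &timeout_cs);
      }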
  2. 06 Feb 2007 · 1 commit
  3. 30 Nov 2006 · 5 commits
    • [DLM] fix format warnings in rcom.c and recoverd.c · 57adf7ee
      Ryusuke Konishi authored
      This fixes the following gcc warnings on architectures where
      uint64_t != unsigned long long (e.g. ppc64).
      
      fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
      fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t'
      fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
      fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
      fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
      Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net>
      Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
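      The portable idioms, shown as a minimal userspace illustration (the
      kernel patch itself silences the warnings by making the printk format
      and argument types agree, typically with a cast):

      #include <inttypes.h>
      #include <stdio.h>

      int main(void)
      {
          uint64_t seq = 0xdeadbeefULL;

          /* printf("%llx\n", seq); warns where uint64_t is
           * unsigned long rather than unsigned long long (ppc64) */

          /* Option 1: cast the argument to match the format. */
          printf("%llx\n", (unsigned long long)seq);

          /* Option 2: let <inttypes.h> supply the right format. */
          printf("%" PRIx64 "\n", seq);
          return 0;
      }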
    • [DLM] fix add_requestqueue checking nodes list · 2896ee37
      David Teigland authored
      Requests that arrive after recovery has started are saved in the
      requestqueue and processed after recovery is done.  Some of these requests
      are purged during recovery if they are from nodes that have been removed.
      We move the purging of the requests (dlm_purge_requestqueue) to later
      in the recovery sequence, which allows the routine saving requests
      (dlm_add_requestqueue) to stop filtering by nodeid, since the purge
      does the same filtering.  The current code has add_requestqueue
      filtering by nodeid without holding any locks while it accesses the
      list of current nodes.  This also means the purge routine must be
      called when the lockspace is shut down, since the add routine no
      longer rejects requests itself.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
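      A hedged kernel-style sketch of the resulting save-then-purge pattern.
      The list and mutex names follow fs/dlm and dlm_is_member() is assumed
      from member.c, but the entry layout is simplified, so read it as an
      outline of the patch rather than its code:

      #include <linux/list.h>
      #include <linux/mutex.h>
      #include <linux/slab.h>
      #include "dlm_internal.h"   /* struct dlm_ls, dlm_is_member() */

      struct rq_entry {
          struct list_head list;
          int nodeid;
          /* the saved message follows in the real structure */
      };

      /* Saving no longer filters by nodeid, so no locking against the
       * nodes list is needed here. */
      static void save_request(struct dlm_ls *ls, int nodeid,
                               struct rq_entry *e)
      {
          e->nodeid = nodeid;
          mutex_lock(&ls->ls_requestqueue_mutex);
          list_add_tail(&e->list, &ls->ls_requestqueue);
          mutex_unlock(&ls->ls_requestqueue_mutex);
      }

      /* Later in recovery (and now also at shutdown), drop requests
       * from nodes that are no longer members. */
      static void purge_requests(struct dlm_ls *ls)
      {
          struct rq_entry *e, *safe;

          mutex_lock(&ls->ls_requestqueue_mutex);
          list_for_each_entry_safe(e, safe, &ls->ls_requestqueue, list) {
              if (!dlm_is_member(ls, e->nodeid)) {
                  list_del(&e->list);
                  kfree(e);
              }
          }
          mutex_unlock(&ls->ls_requestqueue_mutex);
      }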
    • [DLM] do full recover_locks barrier · 4b77f2c9
      David Teigland authored
      Red Hat BZ 211914
      
      The previous patch "[DLM] fix aborted recovery during
      node removal" was incomplete as discovered with further testing.  It set
      the bit for the RS_LOCKS barrier but did not then wait for the barrier.
      This is often ok, but sometimes it will cause yet another recovery hang.
      If the node that skips the barrier wait is a new node that also has
      the lowest nodeid, it misses the important step of collecting and
      reporting the barrier status from the other nodes (the job of the low
      nodeid in the barrier wait routine).
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
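      A hedged sketch of the corrected step, with function and status names
      following fs/dlm/recover.c and error handling trimmed:

      /* Setting the status bit alone is not enough: every node must
       * also enter the barrier wait, because the low nodeid collects
       * and reports barrier status for all nodes from inside it. */
      static int recover_locks_barrier(struct dlm_ls *ls)
      {
          int error;

          dlm_set_recover_status(ls, DLM_RS_LOCKS);

          error = dlm_recover_locks_wait(ls);
          if (error)
              log_debug(ls, "recover_locks_wait failed %d", error);
          return error;
      }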
    • [DLM] fix stopping unstarted recovery · 2cdc98aa
      David Teigland authored
      Red Hat BZ 211914
      
      When many nodes are joining a lockspace simultaneously, the dlm gets a
      quick sequence of stop/start events, a pair for adding each node.
      dlm_controld in user space sends dlm_recoverd in the kernel each stop and
      start event.  dlm_controld will sometimes send the stop before
      dlm_recoverd has had a chance to take up the previously queued start.  The
      stop aborts the processing of the previous start by setting the
      RECOVERY_STOP flag.  dlm_recoverd erroneously clears this flag, and so
      ignores the stop/abort, if it happens to take up the start after the
      stop that was meant to abort it.  The fix is to check the sequence
      number, which is incremented for each stop/start, before clearing the
      flag.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
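      A hedged sketch of the sequence check.  Field and flag names follow
      fs/dlm (ls_recover_lock, ls_recover_seq, LSFL_RECOVERY_STOP); the
      surrounding dlm_recoverd event loop is elided and the helper name is
      invented:

      /* Only clear the stop/abort flag when no newer stop has bumped
       * the sequence number since this start was queued. */
      static int start_is_current(struct dlm_ls *ls, uint64_t seq)
      {
          int ok = 0;

          spin_lock(&ls->ls_recover_lock);
          if (ls->ls_recover_seq == seq) {
              clear_bit(LSFL_RECOVERY_STOP, &ls->ls_flags);
              ok = 1;
          }
          spin_unlock(&ls->ls_recover_lock);

          /* !ok means a stop meant for this start arrived first:
           * leave the flag set so the recovery is aborted. */
          return ok;
      }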
    • [DLM] fix aborted recovery during node removal · 91c0dc93
      David Teigland authored
      Red Hat BZ 211914
      
      With the new cluster infrastructure, dlm recovery for a node removal can
      be aborted and restarted for a node addition.  When this happens, the
      restarted recovery isn't aware that it's doing recovery for the earlier
      removal as well as the addition.  It therefore skips the recovery
      steps only required when nodes are removed.  This can result in locks
      not being
      purged for failed/removed nodes.  The fix is to check for removed nodes
      for which recovery has not been completed at the start of a new recovery
      sequence.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
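      A hedged, illustrative sketch of the added check.  The real patch
      works against the lockspace's list of departed nodes (ls_nodes_gone);
      the structure and flag below are invented for illustration:

      /* Count nodes removed by an earlier, aborted recovery whose
       * locks were never purged, so the removal-only steps run even
       * when the current sequence was triggered by an addition. */
      struct gone_node {
          struct list_head list;
          int nodeid;
          int needs_recovery;  /* set on removal, cleared by purge */
      };

      static int count_unrecovered(struct dlm_ls *ls)
      {
          struct gone_node *gn;
          int neg = 0;

          list_for_each_entry(gn, &ls->ls_nodes_gone, list)
              if (gn->needs_recovery)
                  neg++;
          return neg;
      }

      /* At the start of a recovery sequence:
       *   neg  = nodes removed by this event;
       *   neg += count_unrecovered(ls);
       *   if (neg) purge locks of departed nodes, rebuild the
       *   directory, and run the other removal-only steps. */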
  4. 25 Aug 2006 · 1 commit
    • [DLM] add new lockspace to list earlier · 5f88f1ea
      David Teigland authored
      When a new lockspace was being created, the recoverd thread was being
      started for it before the lockspace was added to the global list of
      lockspaces.  The new thread was looking up the lockspace in the global
      list and sometimes not finding it due to the race with the original thread
      adding it to the list.  We need to add the lockspace to the global list
      before starting the thread instead of after, and if the new thread can't
      find the lockspace for some reason, it should return an error.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
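      A hedged sketch of the reordering.  lslist, lslist_lock, ls_list, and
      dlm_recoverd_start() follow fs/dlm/lockspace.c and recoverd.c; the
      error unwinding is simplified:

      static int start_lockspace(struct dlm_ls *ls)
      {
          int error;

          /* 1. publish the lockspace first, so the new thread's
           *    lookup cannot miss it */
          spin_lock(&lslist_lock);
          list_add(&ls->ls_list, &lslist);
          spin_unlock(&lslist_lock);

          /* 2. only then start dlm_recoverd; it returns an error
           *    itself if its lookup still fails for another reason */
          error = dlm_recoverd_start(ls);
          if (error) {
              /* undo the publication on failure */
              spin_lock(&lslist_lock);
              list_del(&ls->ls_list);
              spin_unlock(&lslist_lock);
          }
          return error;
      }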
  5. 09 Aug 2006 · 1 commit
  6. 20 Jan 2006 · 1 commit
  7. 18 Jan 2006 · 1 commit