- 06 2月, 2007 1 次提交
-
-
由 David Teigland 提交于
A reply to a recovery message will often be received after the relevant recovery sequence has aborted and the next recovery sequence has begun. We need to ignore replies to these old messages from the previous recovery. There's already a way to do this for synchronous recovery requests using the rc_id number, but not for async. Each recovery sequence already has a locally unique sequence number associated with it. This patch adds a field to the rcom (recovery message) structure where this recovery sequence number can be placed, rc_seq. When a node sends a reply to a recovery request, it copies the rc_seq number it received into rc_seq_reply. When the first node receives the reply to its recovery message, it will check whether rc_seq_reply matches the current recovery sequence number, ls_recover_seq, and if not then it ignores the old reply. An old, inadequate approach to filtering out old replies (checking if the current stage of recovery has moved back to the start) has been removed from two spots. The protocol version number is changed to reflect the different rcom structures. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 30 11月, 2006 1 次提交
-
-
由 David Teigland 提交于
We often abort a recovery after sending a status request to a remote node. We want to ignore any potential status reply we get from the remote node. If we get one of these unwanted replies, we've often moved on to the next recovery message and incremented the message sequence counter, so the reply will be ignored due to the seq number. In some cases, we've not moved on to the next message so the seq number of the reply we want to ignore is still correct, causing the reply to be accepted. The next recovery message will then mistake this old reply as a new one. To fix this, we add the flag RCOM_WAIT to indicate when we can accept a new reply. We clear this flag if we abort recovery while waiting for a reply. Before the flag is set again (to allow new replies) we know that any old replies will be rejected due to their sequence number. We also initialize the recovery-message sequence number to a random value when a lockspace is first created. This makes it clear when messages are being rejected from an old instance of a lockspace that has since been recreated. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 24 8月, 2006 1 次提交
-
-
由 David Teigland 提交于
The down-conversion optimization was resulting in the lkb flags being cleared because the stub message reply had no flags value set. Copy the current flags into the stub message so they'll be copied back into the lkb as part of processing the fake reply. Also add an assertion to catch this error more directly if it exists elsewhere. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 10 8月, 2006 1 次提交
-
-
由 David Teigland 提交于
When recoveries are aborted by other recoveries we can get replies to status or names requests that we've given up on. This can cause problems if we're making another request and receive an old reply. Add a sequence number to status/names requests and reject replies that don't match. A field already exists for the seq number that's used in other message types. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 09 8月, 2006 1 次提交
-
-
由 David Teigland 提交于
To aid debugging, it's useful to be able to see what nodeid the dlm is waiting on for a message reply. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 26 7月, 2006 1 次提交
-
-
由 David Teigland 提交于
Display more information from debugfs, particularly locks waiting for a master lookup or operations waiting for a remote reply. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 13 7月, 2006 1 次提交
-
-
由 David Teigland 提交于
This changes the way the dlm handles user locks. The core dlm is now aware of user locks so they can be dealt with more efficiently. There is no more dlm_device module which previously managed its own duplicate copy of every user lock. Signed-off-by: NPatrick Caulfield <pcaulfie@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 03 5月, 2006 1 次提交
-
-
由 David Teigland 提交于
In dlm_grant_after_purge() we were holding a hash table read_lock while calling put_rsb() which potentially removes the rsb from the hash table, taking the same lock in write. Fix this by flagging rsb's ahead of time that have been purged. Then iteratively read_lock the hash table, find a flagged rsb, unlock, process rsb. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 23 2月, 2006 1 次提交
-
-
由 David Teigland 提交于
This patch removes support for range locking from the DLM Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 20 1月, 2006 1 次提交
-
-
由 David Teigland 提交于
Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteve Whitehouse <swhiteho@redhat.com>
-
- 18 1月, 2006 1 次提交
-
-
由 David Teigland 提交于
This is the core of the distributed lock manager which is required to use GFS2 as a cluster filesystem. It is also used by CLVM and can be used as a standalone lock manager independantly of either of these two projects. It implements VAX-style locking modes. Signed-off-by: NDavid Teigland <teigland@redhat.com> Signed-off-by: NSteve Whitehouse <swhiteho@redhat.com>
-