1. 06 Feb, 2007 1 commit
    • [DLM] fix old rcom messages · 38aa8b0c
      David Teigland committed
      A reply to a recovery message will often be received after the relevant
      recovery sequence has aborted and the next recovery sequence has begun.
      We need to ignore replies to these old messages from the previous
      recovery.  There's already a way to do this for synchronous recovery
      requests using the rc_id number, but not for async.
      
      Each recovery sequence already has a locally unique sequence number
      associated with it.  This patch adds a field to the rcom (recovery
      message) structure where this recovery sequence number can be placed,
      rc_seq.  When a node sends a reply to a recovery request, it copies the
      rc_seq number it received into rc_seq_reply.  When the first node receives
      the reply to its recovery message, it will check whether rc_seq_reply
      matches the current recovery sequence number, ls_recover_seq, and if not
      then it ignores the old reply.
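
The check described above can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel code: the struct layouts, field types, and the helper names stamp_request, stamp_reply, and is_old_reply are assumptions made for the example. Only rc_id, rc_seq, rc_seq_reply, and ls_recover_seq come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures. Only the fields named
 * in the commit message are modelled; their types are assumed. */
struct rcom_sketch {
    uint64_t rc_id;        /* existing id used for synchronous requests */
    uint64_t rc_seq;       /* sender's recovery sequence number */
    uint64_t rc_seq_reply; /* rc_seq echoed back by the replying node */
};

struct ls_sketch {
    uint64_t ls_recover_seq; /* number of the current local recovery */
};

/* Requesting node: stamp an outgoing recovery message (hypothetical helper). */
static void stamp_request(const struct ls_sketch *ls, struct rcom_sketch *rc)
{
    assert(ls && rc);
    rc->rc_seq = ls->ls_recover_seq;
    rc->rc_seq_reply = 0;
}

/* Replying node: copy the received rc_seq into rc_seq_reply (hypothetical helper). */
static void stamp_reply(const struct rcom_sketch *req, struct rcom_sketch *reply)
{
    assert(req && reply);
    reply->rc_id = req->rc_id;
    reply->rc_seq_reply = req->rc_seq;
}

/* Requesting node: a reply is old when the echoed number no longer
 * matches the current recovery sequence number (hypothetical helper). */
static bool is_old_reply(const struct ls_sketch *ls,
                         const struct rcom_sketch *reply)
{
    return reply->rc_seq_reply != ls->ls_recover_seq;
}
```

In this sketch, a reply stamped during recovery 5 is reported as old once ls_recover_seq has advanced to 6, while a reply to a request sent during recovery 6 is accepted.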
      
      An old, inadequate approach to filtering out old replies (checking if the
      current stage of recovery has moved back to the start) has been removed
      from two spots.
      
      The protocol version number is changed to reflect the different rcom
      structures.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  2. 30 Nov, 2006 1 commit
    • [DLM] don't accept replies to old recovery messages · 98f176fb
      David Teigland committed
      We often abort a recovery after sending a status request to a remote node.
      We want to ignore any potential status reply we get from the remote node.
      If we get one of these unwanted replies, we've often moved on to the next
      recovery message and incremented the message sequence counter, so the
      reply will be ignored due to the seq number.  In some cases, we've not
      moved on to the next message so the seq number of the reply we want to
      ignore is still correct, causing the reply to be accepted.  The next
      recovery message will then mistake this old reply as a new one.
      
      To fix this, we add the flag RCOM_WAIT to indicate when we can accept a
      new reply.  We clear this flag if we abort recovery while waiting for a
      reply.  Before the flag is set again (to allow new replies) we know that
      any old replies will be rejected due to their sequence number.  We also
      initialize the recovery-message sequence number to a random value when a
      lockspace is first created.  This makes it clear when messages are being
      rejected from an old instance of a lockspace that has since been
      recreated.
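
The interaction between the wait flag and the sequence counter can be sketched as below. This is a simplified, single-threaded illustration under stated assumptions, not the kernel implementation: the struct, the flag constant RCOM_WAIT_SKETCH, and the helpers ls_init, begin_wait, abort_wait, and accept_reply are hypothetical names. Clearing the flag once a reply is accepted is also an assumption of the sketch, and the random starting value is supplied by the caller rather than generated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RCOM_WAIT_SKETCH 0x1u /* stand-in for the RCOM_WAIT flag */

struct lockspace_sketch {
    unsigned int flags;
    uint64_t rcom_seq; /* recovery-message sequence counter */
};

/* Lockspace creation: start the counter at a caller-supplied value,
 * which would be random in the scheme the commit describes. */
static void ls_init(struct lockspace_sketch *ls, uint64_t random_start)
{
    assert(ls);
    ls->flags = 0;
    ls->rcom_seq = random_start;
}

/* Before sending a request: bump the counter first, then allow replies.
 * Any older reply now fails the sequence check. */
static uint64_t begin_wait(struct lockspace_sketch *ls)
{
    ls->rcom_seq++;
    ls->flags |= RCOM_WAIT_SKETCH;
    return ls->rcom_seq; /* id to place in the outgoing message */
}

/* Recovery aborted while waiting: stop accepting replies. */
static void abort_wait(struct lockspace_sketch *ls)
{
    ls->flags &= ~RCOM_WAIT_SKETCH;
}

/* Accept a reply only while waiting and only if its id matches. */
static bool accept_reply(struct lockspace_sketch *ls, uint64_t reply_id)
{
    if (!(ls->flags & RCOM_WAIT_SKETCH) || reply_id != ls->rcom_seq)
        return false;
    ls->flags &= ~RCOM_WAIT_SKETCH; /* assumed: one reply per request */
    return true;
}
```

The case the commit targets is the one where recovery is aborted before the counter moves on: the late reply still carries a matching id, and only the cleared flag causes it to be rejected.
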
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  3. 24 Aug, 2006 1 commit
  4. 10 Aug, 2006 1 commit
  5. 09 Aug, 2006 1 commit
  6. 26 Jul, 2006 1 commit
  7. 13 Jul, 2006 1 commit
  8. 03 May, 2006 1 commit
  9. 23 Feb, 2006 1 commit
  10. 20 Jan, 2006 1 commit
  11. 18 Jan, 2006 1 commit