Commit fa9f0e49 · Author: David Teigland · Committer: Steven Whitehouse

[DLM] confirm master for recovered waiting requests

Fix the following scenario:
- A request is on the waiters list waiting for a reply from a remote node.
- The request is the first one on the resource, so first_lkid is set.
- The remote node fails causing recovery.
- During recovery the requesting node becomes master.
- The request is now processed locally instead of being a remote operation.
- At this point we need to call confirm_master() on the resource since
  we're certain we're now the master node.  This will clear first_lkid.
- We weren't calling confirm_master(), so first_lkid was not being cleared
  causing subsequent requests on that resource to get stuck.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Parent 37b2fa6a
@@ -3283,6 +3283,8 @@ int dlm_recover_waiters_post(struct dlm_ls *ls)
 	hold_rsb(r);
 	lock_rsb(r);
 	_request_lock(r, lkb);
+	if (is_master(r))
+		confirm_master(r, 0);
 	unlock_rsb(r);
 	put_rsb(r);
 	break;
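The scenario in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct toy_rsb`, `toy_request_lock()`, `toy_confirm_master()` and `toy_second_request_stuck()` are hypothetical names invented here, and the rule that a later request waits while `first_lkid` names a different lock is a simplification of how the resource holds back requests until the master is confirmed.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a DLM resource; NOT the kernel's struct dlm_rsb. */
struct toy_rsb {
    int      is_master;   /* 1 if the local node is master of the resource */
    uint32_t first_lkid;  /* id of the first unconfirmed request, 0 if none */
};

/*
 * Hypothetical model of issuing a request on a resource.
 * Returns 1 if the request can be processed, 0 if it has to wait
 * because another request is still recorded in first_lkid.
 */
int toy_request_lock(struct toy_rsb *r, uint32_t lkid)
{
    if (r->first_lkid != 0 && r->first_lkid != lkid)
        return 0;                /* stuck behind the first request */
    if (!r->is_master && r->first_lkid == 0)
        r->first_lkid = lkid;    /* first request on a remote master */
    return 1;
}

/* Hypothetical model of confirming the master: clears first_lkid. */
void toy_confirm_master(struct toy_rsb *r)
{
    r->first_lkid = 0;
}

/*
 * Replays the scenario from the commit message.
 * apply_fix = 0 models the old behaviour, 1 models the patched one.
 * Returns 1 if a second request on the resource ends up stuck.
 */
int toy_second_request_stuck(int apply_fix)
{
    struct toy_rsb r = { 0, 0 };
    int ok;

    ok = toy_request_lock(&r, 1);  /* first request, sent to remote node */
    assert(ok && r.first_lkid == 1);

    r.is_master = 1;               /* remote node fails, we become master */

    ok = toy_request_lock(&r, 1);  /* recovery reprocesses it locally */
    assert(ok);
    if (apply_fix && r.is_master)
        toy_confirm_master(&r);    /* the two lines added by the patch */

    return !toy_request_lock(&r, 2);
}
```

In this model, `toy_second_request_stuck(0)` reports the second request as stuck because `first_lkid` still holds the recovered request's id, while `toy_second_request_stuck(1)` lets it proceed once the master has been confirmed.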