Commit 1b02bd8f authored by Kenan Yao

Fix postmaster reset failure on segment nodes with mirror configured

If a QE crashes for reasons such as SIGSEGV, SIGKILL or PANIC, segment
postmaster reset sometimes fails. The root cause is: the primary segment
postmaster first tells its child processes to exit, then starts a filerep
peer reset process to instruct the mirror postmaster to do a reset; the
filerep peer reset process exits only when the mirror postmaster finishes
or fails the reset procedure; the primary postmaster waits for the
termination of important processes such as AutoVacuum, BgWriter, CheckPoint
and the filerep peer reset process before it resets shared memory and
restarts auxiliary processes. However, in some cases the primary postmaster
gets stuck in the filerep peer reset step, because the mirror postmaster is
hanging or waiting for some event; when this happens, the filerep peer reset
process waits until timeout (1 hour) and retries 10 times before reporting
failure to the primary postmaster, so the primary postmaster takes 10 hours
in total to report the reset failure.
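The retry behavior can be sketched roughly as follows. This is only a
minimal illustration, not the actual Greenplum source; the constants
PEER_RESET_TIMEOUT_SECS and PEER_RESET_MAX_RETRIES and the helper
request_mirror_reset_and_wait() are assumed names based on the
description above:

    /*
     * Rough sketch of the filerep peer reset retry loop described above.
     * All identifiers here are hypothetical placeholders, not the actual
     * Greenplum names.
     */
    #include <stdbool.h>

    #define PEER_RESET_TIMEOUT_SECS   (60 * 60)   /* 1 hour per attempt */
    #define PEER_RESET_MAX_RETRIES    10          /* 10 attempts in total */

    /* ask the mirror to reset and poll until it finishes, fails or times out */
    extern bool request_mirror_reset_and_wait(int timeout_secs);

    static bool
    filerep_peer_reset(void)
    {
        for (int attempt = 0; attempt < PEER_RESET_MAX_RETRIES; attempt++)
        {
            if (request_mirror_reset_and_wait(PEER_RESET_TIMEOUT_SECS))
                return true;        /* mirror completed its reset */
        }

        /*
         * Only now does the primary postmaster learn about the failure,
         * i.e. roughly 10 hours after the reset started.
         */
        return false;
    }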

This happens almost every time when the mirror segment host machine has poor
performance, for the following reason: the mirror postmaster performs a reset
procedure similar to the primary postmaster's, i.e. it notifies its child
processes to exit, waits for their termination, and then restarts the
auxiliary processes; the filerep peer reset process first connects to the
mirror postmaster to request a postmaster reset, and then checks the reset
status of the mirror every 10ms by connecting to the mirror postmaster again.
The filerep peer reset process therefore keeps connecting to the mirror
postmaster, which continuously forks new dead_end backend processes, while at
the same time the mirror postmaster is waiting for all dead_end backend
processes to exit. If new dead_end processes are created faster than existing
ones exit, the mirror postmaster never sees its children cleared. All in all,
this can lead to a hang and to the failure of the postmaster reset.
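The polling side of that race can be sketched as follows (again an
illustration with assumed names, not the actual source). Every status
check opens a fresh connection to the mirror postmaster, and each such
connection forks one more dead_end backend on the mirror while the mirror
is waiting for its BackendList to drain:

    /*
     * Rough sketch of the 10ms status-polling loop in the filerep peer
     * reset process; mirror_reset_completed() is a hypothetical helper
     * that connects to the mirror postmaster and asks whether the reset
     * is done. Each call forks one more dead_end backend on the mirror,
     * so on a slow mirror host the dead_end children can accumulate
     * faster than they exit.
     */
    #include <stdbool.h>
    #include <unistd.h>

    extern bool mirror_reset_completed(void);

    static void
    wait_for_mirror_reset(void)
    {
        while (!mirror_reset_completed())
            usleep(10 * 1000);      /* re-check every 10ms, as described above */
    }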

This issue exists for master postmaster reset as well under heavy workload.
Parent aa02fa06
@@ -5027,10 +5027,10 @@ static void do_reaper()
     /*
      * Wait for all important children to exit, then reset shmem and
-     * redo database startup. (We can ignore the archiver and stats processes
-     * here since they are not connected to shmem.)
+     * redo database startup. (We can ignore the syslogger, archiver and stats
+     * processes here since they are not connected to shmem.)
      */
-    if (DLGetHead(BackendList) ||
+    if (CountChildren(BACKEND_TYPE_ALL) != 0 ||
         StartupPID != 0 ||
         StartupPass2PID != 0 ||
         StartupPass3PID != 0 ||
@@ -5048,6 +5048,25 @@ static void do_reaper()
         goto reaper_done;
     }

+    /*
+     * Start waiting for dead_end children to die. This state change causes
+     * ServerLoop to stop creating new ones. Otherwise, we may infinitely
+     * wait here on heavy workload circumstances, or in postmaster reset
+     * cases of segments where FilerepPeerReset process on primary segment
+     * continuously connects corresponding mirror postmaster.
+     */
+    if (DLGetHead(BackendList) != NULL)
+    {
+        pmState = PM_CHILD_STOP_WAIT_DEAD_END_CHILDREN;
+        goto reaper_done;
+    }
+
+    /*
+     * NB: We cannot change the pmState to PM_CHILD_STOP_NO_CHILDREN here,
+     * since there should be syslogger existing, and maybe archiver and
+     * pgstats as well.
+     */
+
     if ( RecoveryError )
     {
         ereport(LOG,
@@ -5677,9 +5696,6 @@ static PMState StateMachineCheck_WaitBackends(void)
     }
     else
     {
-        /*
-         * This state change causes ServerLoop to stop creating new ones.
-         */
         Assert(Shutdown > NoShutdown);
         moveToNextState = true;
     }