Commit 8d008792, authored by Denis Smirnov, committed by David Kimura

Fix dbid inconsistency on spread mirroring

Mirror registration passes through several steps at the moment:
1. CREATE_QE_ARRAY (QE_MIRROR_ARRAY is ordered by content)
2. ARRAY_REORDER (QE_MIRROR_ARRAY is ordered by port)
3. CREATE_ARRAY_SORTED_ON_CONTENT_ID (build QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID
   from QE_MIRROR_ARRAY)
4. REGISTER_MIRRORS (walk through QE_MIRROR_ARRAY, register mirrors with
   pg_catalog.gp_add_segment_mirror on master's gp_segment_configuration
   and update QE_MIRROR_ARRAY with returned dbids)
5. CREATE_SEGMENT (walk through QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID with old
   dbids and create mirrors on segment hosts with pg_basebackup)
The problem is in step 4: we update the wrong array (QE_MIRROR_ARRAY instead
of QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID). As a result, the mirror dbids in
gp_segment_configuration are inconsistent with those in the internal.auto.conf
files. This can leave the cluster inoperable in some situations, when we
promote a mirror with wrong dbids after a primary fails (FTS can't resolve this).
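
The stale-copy effect described above can be reproduced in isolation. The following is a minimal sketch, not the gpinitsystem code: the records, the placeholder dbid 0, and the `fake_add_segment_mirror` stand-in for pg_catalog.gp_add_segment_mirror are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Record layout: hostname~address~port~datadir~dbid~content.
# All values are made up; dbid 0 is a placeholder used before registration.
S='~'
QE_MIRROR_ARRAY=(
  "sdw2~sdw2~7000~/data/mirror/gpseg0~0~0"
  "sdw3~sdw3~7000~/data/mirror/gpseg1~0~1"
)

# Step 3: derive the content-sorted copy before the real dbids are known.
QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID=($(printf '%s\n' "${QE_MIRROR_ARRAY[@]}" | sort -t"$S" -n -k6,6))

# Hypothetical stand-in for gp_add_segment_mirror: hands out the next dbid.
NEXT_DBID=5
fake_add_segment_mirror() {
  REGISTERED_DBID=$NEXT_DBID
  NEXT_DBID=$((NEXT_DBID + 1))
}

# Step 4 as it behaved before the fix: only QE_MIRROR_ARRAY gets the new dbids.
UPDATED=()
for rec in "${QE_MIRROR_ARRAY[@]}"; do
  IFS="$S" read -r host addr port dir _old_dbid content <<< "$rec"
  fake_add_segment_mirror
  UPDATED+=("${host}${S}${addr}${S}${port}${S}${dir}${S}${REGISTERED_DBID}${S}${content}")
done
QE_MIRROR_ARRAY=("${UPDATED[@]}")

# Step 5 reads the sorted copy, which still carries the placeholder dbid.
fresh_dbid=$(echo "${QE_MIRROR_ARRAY[0]}" | cut -d"$S" -f5)
stale_dbid=$(echo "${QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID[0]}" | cut -d"$S" -f5)
echo "registered dbid: $fresh_dbid, dbid used to create the segment: $stale_dbid"
```

Because the sorted copy is a separate array, writing the returned dbids back into it (as this commit does) is what keeps steps 4 and 5 in agreement.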

Also fixed the column indexes used for segment array ordering.
This was not done after commit https://github.com/greenplum-db/gpdb/commit/03c7d557720c5a78af1e2574ac385d10a0797f5e,
which prepended a new hostname column to the array.
Co-authored-by: NVasiliy Ivanov <ivi@arenadata.io>
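
The column shift mentioned in the message can be illustrated with a small sketch. It assumes the five-field and six-field record layouts visible in the diff; the sample records and the `sort_by_field` helper are hypothetical.

```shell
#!/usr/bin/env bash
S='~'
# Old layout (5 fields): address~port~datadir~dbid~content
old_records=("sdw1~7001~/data/m1~3~1" "sdw2~7000~/data/m0~2~0")
# New layout (6 fields): hostname~address~port~datadir~dbid~content
new_records=("sdw1~sdw1~7001~/data/m1~3~1" "sdw2~sdw2~7000~/data/m0~2~0")

# Hypothetical helper: numeric sort of records on a single ~-delimited field.
sort_by_field() {  # usage: sort_by_field <field-number> <records...>
  local field=$1
  shift
  printf '%s\n' "$@" | sort -t"$S" -n -k"${field},${field}"
}

# The port is field 2 in the old layout and field 3 in the new one.
old_by_port=$(sort_by_field 2 "${old_records[@]}" | head -n 1)
new_by_port=$(sort_by_field 3 "${new_records[@]}" | head -n 1)

# Keeping the old key on the new layout sorts on the address column instead.
stale_key=$(sort_by_field 2 "${new_records[@]}" | head -n 1)

echo "old layout, key 2: $old_by_port"
echo "new layout, key 3: $new_by_port"
echo "new layout, key 2: $stale_key"
```

With the stale key, the numeric sort sees non-numeric addresses, so the resulting order no longer reflects the port. The same one-column shift applies to the content key (5 becomes 6).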
Parent 9d0e3cd0
@@ -1006,10 +1006,10 @@ ARRAY_REORDER() {
 ;;
 esac
-QE_REORDER_ARRAY=(`$ECHO ${QE_PRIMARY_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k2,2|$TR '\n' ' '`)
+QE_REORDER_ARRAY=(`$ECHO ${QE_PRIMARY_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k3,3|$TR '\n' ' '`)
 QE_PRIMARY_ARRAY=(${QE_REORDER_ARRAY[@]})
 if [ $MIRROR_TYPE -eq 1 ];then
-QE_REORDER_ARRAY=(`$ECHO ${QE_MIRROR_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k2,2|$TR '\n' ' '`)
+QE_REORDER_ARRAY=(`$ECHO ${QE_MIRROR_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k3,3|$TR '\n' ' '`)
 QE_MIRROR_ARRAY=(${QE_REORDER_ARRAY[@]})
 fi
 LOG_MSG "[INFO]:-End Function $FUNCNAME"
@@ -1021,10 +1021,10 @@ CREATE_ARRAY_SORTED_ON_CONTENT_ID() {
 local REORDERING_ON_CONTENT
-REORDERING_ON_CONTENT=(`$ECHO ${QE_PRIMARY_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k5,5|$TR '\n' ' '`)
+REORDERING_ON_CONTENT=(`$ECHO ${QE_PRIMARY_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k6,6|$TR '\n' ' '`)
 QE_PRIMARY_ARRAY_SORTED_ON_CONTENT_ID=(${REORDERING_ON_CONTENT[@]})
 if [ $MIRRORING -ne 0 ] ; then
-REORDERING_ON_CONTENT=(`$ECHO ${QE_MIRROR_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k5,5|$TR '\n' ' '`)
+REORDERING_ON_CONTENT=(`$ECHO ${QE_MIRROR_ARRAY[@]}|$TR ' ' '\n'|$SORT -t$S -n -k6,6|$TR '\n' ' '`)
 QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID=(${REORDERING_ON_CONTENT[@]})
 fi
 LOG_MSG "[INFO]:-End Function $FUNCNAME"
@@ -1436,15 +1436,15 @@ CREATE_SEGMENT () {
 REGISTER_MIRRORS () {
 LOG_MSG "[INFO]:-Start Function $FUNCNAME"
-for I in "${QE_MIRROR_ARRAY[@]}"
+for I in "${QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID[@]}"
 do
 SET_VAR $I
 dbid=`env PGOPTIONS="-c gp_role=utility" $PSQL -p $MASTER_PORT -d "${DEFAULTDB}" -X -A -t -c "select pg_catalog.gp_add_segment_mirror(${GP_CONTENT}::int2, '${GP_HOSTNAME}', '${GP_HOSTADDRESS}', ${GP_PORT}, '${GP_DIR}');" 2>/dev/null` >> $LOG_FILE 2>&1
 ERROR_CHK $? "failed to register mirror for contentid=${GP_CONTENT}" 2
-MIRRORS_UPDATED_DBID=(${MIRRORS_UPDATED_DBID[@]} ${GP_HOSTADDRESS}~${GP_PORT}~${GP_DIR}~${dbid}~${GP_CONTENT})
+MIRRORS_UPDATED_DBID=(${MIRRORS_UPDATED_DBID[@]} ${GP_HOSTNAME}~${GP_HOSTADDRESS}~${GP_PORT}~${GP_DIR}~${dbid}~${GP_CONTENT})
 done
-QE_MIRROR_ARRAY=(${MIRRORS_UPDATED_DBID[@]})
+QE_MIRROR_ARRAY_SORTED_ON_CONTENT_ID=(${MIRRORS_UPDATED_DBID[@]})
 LOG_MSG "[INFO]:-End Function $FUNCNAME"
 }
......
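
The REGISTER_MIRRORS hunk above also adds GP_HOSTNAME to the rebuilt record. A rough sketch of why the field count matters follows. The `split_record` parser is a hypothetical stand-in for a six-field reader (the real SET_VAR helper may work differently), and the sample records are invented.

```shell
#!/usr/bin/env bash
S='~'

# Hypothetical parser for the six-field layout
# hostname~address~port~datadir~dbid~content (not the real SET_VAR).
split_record() {
  IFS="$S" read -r GP_HOSTNAME GP_HOSTADDRESS GP_PORT GP_DIR GP_DBID GP_CONTENT <<< "$1"
}

# A record rebuilt with the hostname column parses as expected.
split_record "sdw2~sdw2-1~7000~/data/mirror/gpseg0~5~0"
good_port=$GP_PORT
good_content=$GP_CONTENT

# A record rebuilt without the hostname column shifts every field left by one.
split_record "sdw2-1~7000~/data/mirror/gpseg0~5~0"
bad_port=$GP_PORT
bad_content=$GP_CONTENT

echo "six fields:  port=$good_port content=$good_content"
echo "five fields: port=$bad_port content=$bad_content"
```

In this sketch the five-field record puts the data directory where the port should be and leaves the content ID empty, which is the kind of misalignment the added `${GP_HOSTNAME}~` prefix avoids.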
@@ -17,6 +17,19 @@ Feature: Tests for gpaddmirrors
 ########################### @concourse_cluster tests ###########################
 # The @concourse_cluster tag denotes the scenario that requires a remote cluster
+@concourse_cluster
+Scenario: spread mirroring configuration
+Given a working directory of the test as '/tmp/gpaddmirrors'
+And the database is not running
+And a cluster is created with "spread" segment mirroring on "mdw" and "sdw1, sdw2, sdw3"
+Then verify that mirror segments are in "spread" configuration
+Given a preferred primary has failed
+When the user runs "gprecoverseg -a"
+Then gprecoverseg should return a return code of 0
+And all the segments are running
+And the segments are synchronized
+And the user runs "gpstop -aqM fast"
 @concourse_cluster
 Scenario: gprecoverseg works correctly on a newly added mirror with HBA_HOSTNAMES=0
 Given a working directory of the test as '/tmp/gpaddmirrors'
......