    Limit gxact number on master with MaxBackends. · 2a961e65
    Paul Guo committed
    Previously it was set to max_prepared_xacts. This value is used to size some
    2PC-related shared memory. For example, the array shmCommittedGxactArray is
    created with this length, and that array is used to collect not-yet-"forgotten"
    distributed transactions during master/standby recovery. That array length
    is problematic because:
    
    1. Even when master max_prepared_xacts equals segment max_prepared_xacts, as is
    usual, some distributed transactions may use only a partial gang, so the total
    number of distributed transactions on the master can be larger (even much
    larger) than max_prepared_xacts. The documentation says max_prepared_xacts
    should be greater than max_connections, but no code enforces that.
    
    2. Master max_prepared_xacts may also differ from segment max_prepared_xacts.
    The documentation does not suggest this, but no code prevents it.
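
    The first point can be illustrated with a minimal sketch. It is a simplified
    model, not Greenplum code: the function names are hypothetical, and it assumes
    every distributed transaction touches exactly one segment (a partial gang), so
    each segment stays within its own limit while the master-side total grows with
    the number of segments.

```c
#include <assert.h>

/*
 * Hypothetical model: each distributed transaction uses a one-segment gang.
 * Every segment accepts up to max_prepared_xacts prepared transactions, so
 * the master may have to track up to num_segments * max_prepared_xacts.
 */
int count_master_gxacts(int num_segments, int max_prepared_xacts)
{
    assert(num_segments >= 0 && max_prepared_xacts >= 0);

    int total = 0;
    for (int seg = 0; seg < num_segments; seg++) {
        int prepared_here = 0;
        /* Fill this segment up to its own limit, one gxact at a time. */
        while (prepared_here < max_prepared_xacts) {
            prepared_here++;
            total++;        /* the master tracks every one of them */
        }
    }
    return total;
}

/* Nonzero when a master-side array of array_len entries would overflow. */
int overflows_master_array(int array_len, int num_segments,
                           int max_prepared_xacts)
{
    return count_master_gxacts(num_segments, max_prepared_xacts) > array_len;
}
```

    Under these assumptions, overflows_master_array(3, 3, 3) is nonzero: three
    segments with a limit of 3 each can leave more distributed transactions on
    the master than an array of length max_prepared_xacts can hold.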
    
    To fix this, use MaxBackends as the gxact limit on the master. The
    max_connections GUC alone might suffice (MaxBackends additionally counts
    autovacuum workers and background workers on top of max_connections), but I'm
    conservatively using MaxBackends because this issue is annoying: the standby
    cannot recover, failing with the FATAL message below even after a postgres
    restart, unless we temporarily increase the max_prepared_transactions GUC.
    
    2020-07-17 16:48:19.178667
    CST,,,p33652,th1972721600,,,,0,,,seg-1,,,,,"FATAL","XX000","the limit of 3
    distributed transactions has been reached","It should not happen. Temporarily
    increase max_connections (need postmaster reboot) on the postgres (master or
    standby) to work around this issue and then report a bug",,,,"xlog redo at
    0/C339BA0 for Transaction/DISTRIBUTED_COMMIT: distributed commit 2020-07-17
    16:48:19.101832+08 gid = 1594975696-0000000009, gxid =
    9",,0,,"cdbdtxrecovery.c",571,"Stack trace:
    
    1    0xb3a30f postgres errstart (elog.c:558)
    2    0xc3da4d postgres redoDistributedCommitRecord (cdbdtxrecovery.c:565)
    3    0x564227 postgres <symbol not found> (xact.c:6942)
    4    0x564671 postgres xact_redo (xact.c:7080)
    5    0x56fee5 postgres StartupXLOG (xlog.c:7207)
    Reviewed-by: Nxiong-gang <gxiong@pivotal.io>