Fix xids on segments
Problem: After upgrade of the database 5->6 or 6->6, when a create table is executed, it reports the below warning when you create a table. ``` WARNING: database with OID 0 must be vacuumed within 147483647 transactions (seg2 127.0.0.1:25434 pid=97928) HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. ``` If you further go to run VACUUM FREEZE, it reports the below error: ``` template1=# vacuum freeze; ERROR: found xmin 711 from before relfrozenxid 800 (seg0 127.0.0.1:50433 pid=39596) ``` Even trying to run VACUUM FREEZE on the upgraded tables reports the same error as above. For upgrading gpdb database, first we upgrade the master and then the segments. The interesting ones related to the problem here are the below ``` 1. Create a new cluster 2. Upgrade the master of the new cluster 2.a) Execute copy_clog_xlog_xid - Which executes pg_resetxlog to the set new control data files using the values based on the old master control data file. 2.b) Execute set_frozenxids: Which updates the catalog with datfrozenxid, datminmxid. relfrozenxid, relminmxid based on the corresponding old master 2.c) Restore the schema 2.d) Vacuum Freeze - This updates the value of `checkpoint's oldestXID` and `checkpoint's oldestXID's DB` too. 5. Upgrade segment 5.a) Copy master catalog to the segment 5.b) Execute copy_clog_xlog_xid - Which executes pg_resetxlog to the set new control data files using the values based on the old master control data file. ``` The below error is coming from segments: ``` WARNING: database with OID 0 must be vacuumed within 147483647 transactions (seg2 127.0.0.1:25434 pid=97928) HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. ``` If you look the controldata files for the segment after upgrade, they have default values for the below variables, But for the master, they have valid values. (SINCE on master, VACUUM FREEZE was run which updates the values) pg_controldata <path_of_segment_data_dir> ``` Latest checkpoint's oldestXID: 2294968074 Latest checkpoint's oldestXID's DB: 0 ``` The above values are currently invalid. When pg_resetxlog is run to set the `Latest checkpoint's NextXID:`, it sets the value of the nextxid, and put default values for oldestXidDB and oldestXid with the assumption that autovacuum will take care of it. Autovacuum is disabled in greenplum, so these values are not updated, and the WARNING `WARNING: database with OID 0 must be vacuumed ` is observed. Refer to the code below: ``` if (set_xid != 0) { ControlFile.checkPointCopy.nextXid = set_xid; /* * For the moment, just set oldestXid to a value that will force * immediate autovacuum-for-wraparound. It's not clear whether adding * user control of this is useful, so let's just do something that's * reasonably safe. The magic constant here corresponds to the * maximum allowed value of autovacuum_freeze_max_age. */ ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000; ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000; if (ControlFile.checkPointCopy.oldestXid < FirstNormalTransactionId) ControlFile.checkPointCopy.oldestXid += FirstNormalTransactionId; ControlFile.checkPointCopy.oldestXidDB = InvalidOid; } ```` For the segments, a VACUUM FREEZE should be executed after pg_resetxlog is performed to update these values. Further, even if you try to run VACUUM, it complains ``` ERROR: found xmin 711 from before relfrozenxid 800 (seg0 127.0.0.1:50433 pid=39596) ``` During the upgrade of a segment, the following steps are performed 1. Copy the upgraded master catalog 2. Execute copy_clog_xlog_xid which executes pg_resetxlog to the set new control data files using the values based on the old master control data file. There is no step performed to update the relfrozenxid of the relation in pg_class of the target cluster segments to reflect the same values as that in the source cluster segment. GPDB has multiple databases which acts as master or segment, and the values of these variables can be different. In Master, the restore scripts contains update statement for relfrozenxid and relminmxid and the same is copied to the segment. However, we should all update the target segment database values with the values of the corresponding source segment database. This PR adds primarily the below 2 things: 1. Execute VACUUM FREEZE on segments (and the required steps before it) 1.a) Execute set_frozenxids: After pg_resetxlog has been run the 2. Update the segments relfrozenxid, relminxmid, datfrozenxid, datminmxid
Showing
想要评论请 注册 或 登录