- 12 Dec 2018, 1 commit

Committed by Jialun
- Add a fault injection point to test the rollback process.
- Remove the pg_hba.conf update during online expand: only the master connects to segments via libpq, and the master's IP is already in pg_hba.conf.

- 10 Dec 2018, 1 commit

Committed by Tang Pengzhou
Commits 8eed4217 & e0b06678 allow us to do a gpexpand of a GPDB cluster without a restart, so we call this strategy "online expand". Those two commits mainly focused on how to avoid restarting the cluster when expanding it; this commit does the remaining work to improve gpexpand using a few features we recently merged to master.

The first improvement is that it is no longer necessary to change the policy of all non-partitioned tables to random at the beginning of gpexpand. Previously, we couldn't tell the difference between expanded and non-expanded tables, so if the distribution policies of these tables were the same, the planner took their data as co-located and produced an incorrect plan. To avoid this, gpexpand used to change the policy of all tables to random so the planner would produce a correct but inefficient plan, because a random policy always means an annoying broadcast motion. Now that each table has a 'numsegments' attribute, introduced by 4eb65a53, GPDB can recognize expanded and non-expanded tables and produce correct plans, so the first improvement is removing the randomization step.

The second improvement is a brand-new syntax, "ALTER TABLE foo EXPAND TABLE", focused on rebalancing table data (both the old and new commands are sketched after this message). Previously, tables were converted to random distribution before rebalancing, and the numsegments of a table was always the same as the GPDB cluster size, so a tricky "ALTER TABLE foo SET WITH (REORGANIZE = true) DISTRIBUTED BY (original_key)" command could rebalance data onto the newly added segments. Now that policy and numsegments are not changed before rebalancing, expanding such tables no longer matches the concept of "SET DISTRIBUTED BY", so we need a proper syntax dedicated to table expansion. The new syntax is "ALTER TABLE foo EXPAND TABLE"; internally we have two methods for the actual data movement, CTAS and RESHUFFLE. Which one is better depends on how much data needs to be moved, whether the table has an index, and whether it is an append-only table (analyzed by Heikki).

A drawback of this commit is that we now always expand a partitioned table within a single transaction; the old behavior was to expand the leaf partitions in parallel, which is faster for a partitioned table. We don't allow a root partition and its leaf partitions to have different numsegments for now, so this commit disables the old behavior temporarily. How to expand partitioned tables in parallel is under discussion, and we hope to bring that ability back properly in the future.

* Refine name quoting. E.g., if the schema name of a table is a.b and the table name is c'd, the current gpexpand can't handle it.
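For illustration, a minimal sketch of the two rebalance commands described above, taken from the commit message; 'foo' and 'original_key' are placeholders:

    -- Old approach: a full reorganization that rewrites the whole table,
    -- relying on its policy having been changed to random beforehand.
    ALTER TABLE foo SET WITH (REORGANIZE = true) DISTRIBUTED BY (original_key);

    -- New approach: a dedicated command that moves only the rows belonging
    -- on the newly added segments (internally via CTAS or RESHUFFLE).
    ALTER TABLE foo EXPAND TABLE;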

- 07 Dec 2018, 1 commit

Committed by Daniel Gustafsson
The pass statement is a no-op intended for use in empty classes and other kinds of stubs. When it follows other statements in a block it has no purpose. Remove all such occurrences.

Reviewed-by: Jacob Champion <pchampion@pivotal.io>

- 04 Dec 2018, 2 commits

Committed by Jim Doty
- Some tests were expanding into /tmp, which ran out of space; they now expand into /data/gpdata.
- Consolidate the tests that verify redistribution after an expand.
- Actually use a dump of the ICW database in the relevant test.

Co-authored-by: David Krieger <dkrieger@pivotal.io>
Co-authored-by: Jim Doty <jdoty@pivotal.io>
Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>

Committed by Jim Doty
Because expand copies the entire master directory, we need to zero out the distributed_xlog, since the segment is not yet part of the cluster and should only see local transactions. Also fix the log level to WARN when expanding a cluster with unique indexes.

Co-authored-by: David Krieger <dkrieger@pivotal.io>
Co-authored-by: Jim Doty <jdoty@pivotal.io>
Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>

- 30 Nov 2018, 1 commit

Committed by Daniel Gustafsson
Reviewed-by: Jacob Champion <pchampion@pivotal.io>
Reviewed-by: Jimmy Yih <jyih@pivotal.io>

- 29 Nov 2018, 1 commit

Committed by Daniel Gustafsson
While "== None" works for comparison, it is a wasteful operation, as it performs type conversion and expansion. Instead, move to the "is" operator, which is the documented best practice for Python code.

Reviewed-by: Jacob Champion

- 25 Oct 2018, 3 commits

Committed by Tang Pengzhou
* Don't use GpIdentity.numsegments directly for the number of segments; use getgpsegmentCount() instead.

* Unify the way we fetch and manage the number of segments. Commit e0b06678 lets us expand a GPDB cluster without a restart, so the number of segments may now change during a transaction, and we need to take care of numsegments. We had two ways to get the segment count: 1) from GpIdentity.numsegments, and 2) from gp_segment_configuration (cdb_component_dbs), which the dispatcher uses to decide the range of segments to dispatch to. Commit e0b06678 did some hard work to update GpIdentity.numsegments correctly, which made segment management more complicated; we now do it in an easier way:
1. Segment info (including the number of segments) may only be obtained through gp_segment_configuration, which always has the newest segment info (a query sketch follows at the end of this message). There is no need to update GpIdentity.numsegments anymore; it is kept only for debugging and can be removed entirely in the future.
2. Each global transaction fetches the newest snapshot of gp_segment_configuration and never changes it until the end of the transaction, unless a writer gang is lost, so a global transaction sees a consistent state of the segments. We used gxidDispatched to do the same thing before; it can be removed now.

* Remove GpIdentity.numsegments, which now has no effect. This commit does not remove gp_num_contents_in_cluster, because that requires modifying utilities like gpstart, gpstop, and gprecoverseg; let's do that cleanup in another PR.

* Exchange the default UP/DOWN value in the FTS cache. Previously, the FTS prober read gp_segment_configuration, checked the status of the segments, and recorded it in the shared-memory struct ftsProbeInfo->fts_status[], so that other components (mainly the dispatcher) could detect that a segment was down. All segments were initialized as DOWN and then updated to UP in the most common case, which brought two problems:
1. fts_status is invalid until FTS finishes its first loop, so the QD needed to check ftsProbeInfo->fts_statusVersion > 0.
2. When gpexpand adds a new segment to gp_segment_configuration, the new segment may be marked DOWN if FTS has not scanned it yet.
This commit changes the default value from DOWN to UP, which resolves both problems.

* FTS should not be used to notify backends that a gpexpand has occurred. As Ashwin mentioned in PR #5679, "I don't think giving FTS responsibility to provide new segment count is right. FTS should only be responsible for HA of the segments. The dispatcher should independently figure out the count based on catalog. gp_segment_configuration should be the only way to get the segment count." FTS should be decoupled from gpexpand.

* Access gp_segment_configuration inside a transaction.

* Upgrade the log level from ERROR to FATAL if the expand version has changed.

* Modify the gpexpand test cases according to the new design.
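To make the catalog-backed approach concrete, a minimal sketch of the query that derives the segment count from gp_segment_configuration, the same count getgpsegmentCount() is meant to reflect:

    -- Primary segments only: content >= 0 excludes the master (content = -1),
    -- and role = 'p' excludes mirrors.
    SELECT count(*) AS segment_count
    FROM gp_segment_configuration
    WHERE content >= 0 AND role = 'p';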

Committed by Jamie McAtamney
Authored-by: Jamie McAtamney <jmcatamney@pivotal.io>

Committed by Jamie McAtamney
After commit 4eb65a53 added the numsegments column to gp_distribution_policy, gpexpand no longer correctly redistributed tables over the new segments after an expansion. ALTER TABLE ... SET WITH(REORGANIZE=true) now relies on the numsegments value, and that value was not being updated by gpexpand. This commit adds an UPDATE statement before every redistribution statement, setting numsegments to the proper value for the corresponding table so that its data is correctly redistributed over the new segments.

Authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
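A hedged sketch of the per-table statement pair this implies; the gp_distribution_policy column names (localoid, numsegments) are taken from the catalog described above, and the table and key names are placeholders:

    -- Bring the catalog's idea of the table's segment span up to the new
    -- cluster size before reorganizing.
    UPDATE gp_distribution_policy
       SET numsegments = (SELECT count(*) FROM gp_segment_configuration
                          WHERE content >= 0 AND role = 'p')
     WHERE localoid = 'public.foo'::regclass;

    -- Then redistribute the data over all segments.
    ALTER TABLE public.foo SET WITH (REORGANIZE = true) DISTRIBUTED BY (original_key);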

- 17 Oct 2018, 1 commit

Committed by Kalen Krempely
gpexpand does not work with an ICW dump yet, so this commit prepares for that step as we work through the outstanding errors. We have also made the following changes:
- Refactor and clean up the gpexpand behave tests.
- Add a gpexpand test that can be run locally.
- Add a new gpexpand test to verify redistribution.
- Add an initial gpexpand test using snowflake-simple-database until we fix the gpexpand ICW dump errors.

Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
Co-authored-by: Shoaib Lari <slari@pivotal.io>

- 15 Oct 2018, 1 commit

Committed by Ning Yu
* Protect catalog changes on master. To allow gpexpand to do its job without restarting the cluster, we need to prevent concurrent catalog changes on master. A catalog lock is provided for this purpose: every insert/update/delete to a catalog table must hold this lock in shared mode before making the change, and gpexpand can hold it in exclusive mode to 1) wait for in-progress catalog updates to commit or roll back and 2) prevent concurrent catalog updates. Add a UDF to hold the catalog lock in exclusive mode (a usage sketch follows at the end of this message), and add test cases for the catalog lock.

* Numsegments management. GpIdentity.numsegments is a global variable holding the cluster size in each process, and it is important to the cluster: a change of the cluster size means the expansion has finished and the new segments have taken effect. FTS counts the number of primary segments in gp_segment_configuration and records it in shared memory; other processes (master, FTS, GDD, QD) then update GpIdentity.numsegments from that information. So far it is not easy to make old transactions run on new segments, so on the QD GpIdentity.numsegments can only be updated at the beginning of a transaction; old transactions only see the old segments. The QD dispatches GpIdentity.numsegments to the QEs, so they see the same cluster size. Catalog changes in old transactions are disallowed. Consider the workflow below:

    A: begin;
    B: gpexpand from N segments to M (M > N);
    A: create table t1 (c1 int);
    A: commit;
    C: drop table t1;

Transaction A began when the cluster size was still N, so all of its commands are dispatched to N segments only, even after the cluster has been expanded to M segments. As a result, t1 is created on only N segments (not only its data distribution but also its catalog records), and the later DROP TABLE fails on the new segments. To prevent this issue, we currently disable all catalog updates in old transactions once the expansion is done. New transactions can update the catalog, as they already run on all M segments.

* Online gpexpand implementation. Do not restart the cluster during expansion:
- Lock the catalog.
- Create the new segments from the master segment.
- Add the new segments to the cluster.
- Reload the cluster so new transactions and background processes can see the new segments.
- Unlock the catalog.
Also add test cases.

* New job to run ICW after expansion. It is better to run ICW after an online expansion to check that the cluster still works well, so we add a new job that first creates a cluster with two segments, then expands it to three and runs all the ICW cases. Restarting the cluster is forbidden, since we must be sure all cases pass after an online expansion, so any case that restarts the cluster must be removed from this job. The new job is icw_planner_centos7_online_expand; it is the same as icw_planner_centos7 but has two new parameters, EXCLUDE_TESTS and ONLINE_EXPAND. If ONLINE_EXPAND is set, the ICW shell takes a different branch: creating a cluster of size two, expanding it, and so on. EXCLUDE_TESTS lists the cases that restart the cluster, or that would fail without those restarting cases. After the whole run, the master's pid is checked; if it changed, the cluster must have been restarted, and the job fails. As a result, any new restarting test case must be added to EXCLUDE_TESTS.

* Add README.

* Small changes per review comments:
- Delete a confusing test case.
- Rename updateBackendGpIdentityNumsegments to updateSystemProcessGpIdentityNumsegments.
- Fix some typos.

* Remove useless asserts. These two asserts caused a failure in isolation2/uao/compaction_utility_insert, which checks AO tables in utility mode. numsegments is meaningless on a sole segment, so the asserts must be removed.

Co-authored-by: Jialun Du <jdu@pivotal.io>
Co-authored-by: Ning Yu <nyu@pivotal.io>
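A minimal sketch of how the exclusive catalog lock UDF might be used during an online expansion; the function name gp_expand_lock_catalog is an assumption based on the commit description above, not a confirmed API:

    -- Hypothetical name, per the commit description. Blocks until in-progress
    -- catalog updates commit or roll back, then holds the catalog lock
    -- exclusively so no new catalog change can start.
    SELECT gp_expand_lock_catalog();

    -- ... gpexpand creates the new segments and registers them ...

    -- The lock is released when the gpexpand session ends.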

- 18 Sep 2018, 1 commit

Committed by Jialun
We found that gpexpand on master did not start the new mirror segments after expansion, and the new mirrors were marked as down in gp_segment_configuration: the function sync_new_mirrors had been removed in commit 28eec592, and the test cases did not cover the status check. Re-add the function and add a status check to the test cases.

- 17 Aug 2018, 1 commit

Committed by Nadeem Ghani

- 15 Aug 2018, 1 commit

Committed by David Kimura
The purpose of this refactor is to align the GUC more closely with postgres. It started as a suggestion in https://github.com/greenplum-db/gpdb/pull/4790. There are still differences, particularly around when the GUC can be set: in GPDB it can be set by anyone at any time (PGC_USERSET), whereas in postgres it can only be set at postmaster start (PGC_POSTMASTER). This difference was kept on purpose until we have more buy-in, as changing it has a bigger impact on end users.

Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>

- 29 Mar 2018, 1 commit

Committed by Pengzhou Tang
* Support replicated tables in GPDB. Currently, tables are distributed across all segments by hash or randomly. There are requirements for a new table type, the replicated table, where every segment holds a full, duplicate copy of the table's data (a DDL sketch follows at the end of this message). To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to mark a replicated table, and a new locus type named CdbLocusType_SegmentGeneral to describe the distribution of its tuples. CdbLocusType_SegmentGeneral implies that the data is generally available on all segments but not on the QD, so a plan node with this locus type can be flexibly planned to execute on either a single QE or all QEs. It is similar to CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral node can't be executed on the QD. To guarantee this, we try our best to add a gather motion on top of a CdbLocusType_SegmentGeneral node when planning motions for a join, even when the other rel has a bottleneck locus type. Such a motion may be redundant if the single QE is not ultimately promoted to execute on the QD, so we detect that case and omit the redundant motion at the end of apply_motion(). We don't reuse CdbLocusType_Replicated, since it always implies a broadcast motion below it and is hard to plan as a direct dispatch without fetching duplicate data. We don't support replicated tables with inherit/partition clauses yet; the main problem is that update/delete on multiple result relations doesn't work correctly, and we can fix this later.

* Allow spi_* to access replicated tables on QEs. Previously, GPDB didn't allow a QE to access non-catalog tables because their data is incomplete; we can lift that limitation when only replicated tables are accessed. One problem is that a QE needs to know whether a table is replicated. QEs didn't maintain the gp_distribution_policy catalog before, so we now pass the policy info to QEs for replicated tables.

* Change the schema of gp_distribution_policy to identify replicated tables. Previously, we used the magic number -128 in gp_distribution_policy to identify a replicated table, which was quite a hack, so we add a new column to gp_distribution_policy to identify replicated and partitioned tables. This commit also abandons the old scheme of using a 1-length NULL list and a 2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED FULLY clauses. Besides, this commit refactors the code to make the decision-making around distribution policies clearer.

* Support COPY for replicated tables.

* Disable the row-ctid unique path for replicated tables. Previously, GPDB used a special Unique path on rowid to address queries like "x IN (subquery)". For example, for select * from t1 where t1.c2 in (select c2 from t2), the plan looks like:

    -> HashAggregate
         Group By: t1.ctid, t1.gp_segment_id
         -> Hash Join
              Hash Cond: t2.c2 = t1.c2
              -> Seq Scan on t2
              -> Hash
                   -> Seq Scan on t1

Obviously, this plan is wrong if t1 is a replicated table, because ctid + gp_segment_id can't identify a tuple there: in a replicated table, a logical row may have different ctid and gp_segment_id values on different segments. So we disable such plans for replicated tables temporarily. This is not the best way, because the rowid unique path may be cheaper than a normal hash semi-join, so we leave a FIXME for later optimization.

* ORCA-related fix (reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>): fall back to the legacy query optimizer for queries over replicated tables.

* Adapt pg_dump/gpcheckcat to replicated tables. gp_distribution_policy is no longer a master-only catalog; do the same checks as for other catalogs.

* Support gpexpand on replicated tables, and altering the distribution policy of a replicated table.
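For illustration, a minimal sketch of the DDL this feature enables, alongside the pre-existing policies; table and column names are placeholders:

    -- Replicated: every segment holds a full copy of the data.
    CREATE TABLE dim_country (code text, name text) DISTRIBUTED REPLICATED;

    -- Pre-existing policies, for contrast:
    CREATE TABLE fact_sales (id bigint, country text) DISTRIBUTED BY (id);  -- hash
    CREATE TABLE staging_rows (payload text) DISTRIBUTED RANDOMLY;          -- random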

- 07 Feb 2018, 1 commit

Committed by Shoaib Lari
gpstart did a cluster-wide check of the heap_checksum setting and refused to start the cluster if the setting was inconsistent. This meant a round of ssh'ing across the cluster, which was causing OOM errors on large clusters. This commit moves the heap_checksum validation to gpsegstart.py and changes the logic so that only those segments whose heap_checksum setting matches the master's are started.

Author: Jim Doty <jdoty@pivotal.io>
Author: Nadeem Ghani <nghani@pivotal.io>
Author: Shoaib Lari <slari@pivotal.io>

- 30 Jan 2018, 2 commits

Committed by Shoaib Lari
- Change gpdeletesystem to delete tablespaces before the data directory.
- Refactor SegmentStart.noWait to pg_ctrl_wait.
- Create a PgBaseBackup class.
- Revert the change to the default Mirror status.
- Assorted typo fixes and bugfixes.

Author: Shoaib Lari <slari@pivotal.io>
Author: Jim Doty <jdoty@pivotal.io>
Author: Nadeem Ghani <nghani@pivotal.io>

Committed by Shoaib Lari
The gpexpand utility and associated code are modified to work with the WALREP code. Previously, gpexpand only worked with the primaries and relied on Filerep to populate the mirrors. We change gpexpand to initialize the mirrors using pg_basebackup and set them up for WAL replication. Also, since the WALREP branch removed Filespaces, we have done the same, replacing references to Filespaces with the segments' data directories.

Author: Marbin Tan <mtan@pivotal.io>
Author: Shoaib Lari <slari@pivotal.io>
Author: Nadeem Ghani <nghani@pivotal.io>

- 13 Jan 2018, 9 commits

Committed by Shoaib Lari
The syncing should be done automatically by the mirror when it connects to the primary.

Author: Marbin Tan <mtan@pivotal.io>
Author: Shoaib Lari <slari@pivotal.io>

Committed by Heikki Linnakangas
The vacuum was causing trouble (see GitHub issue #4298), and it seems pretty pointless, or at least unrelated to gpexpand, anyway.

Committed by Heikki Linnakangas
This reverts commit 4efea6f73ef1b828e7ca435a9a62790f787ed861. I found the culprit, and don't need that debugging information anymore.

Committed by Heikki Linnakangas
To hunt down the "not enough arguments for format string" failure we're seeing in the MM_gpexpand_1 test in the pipeline.

Committed by C.J. Jameson
- Continuation of previous general filespace removal commits.

Author: C.J. Jameson <cjameson@pivotal.io>

Committed by Heikki Linnakangas

Committed by Heikki Linnakangas

Committed by Heikki Linnakangas
Persistent tables are no more.

Committed by Heikki Linnakangas
* Revert almost all the changes in smgr.c / md.c, to not go through the Mirrored* APIs.
* Remove the mmxlog stuff. Use the upstream "pending relation deletion" code instead.
* Get rid of multiple startup passes. Now it's just a single pass, as in upstream.
* Revert the handling of database drop/create to the upstream way. It doesn't use PT anymore, but accesses the file system directly and WAL-logs a single CREATE/DROP DATABASE WAL record.
* Get rid of MirroredLock.
* Remove a few tests that were specific to persistent tables.
* Plus a lot of little removals and reverts to upstream code.

- 09 Jan 2018, 2 commits

Committed by Shoaib Lari
Long ago, we thought we might need to support multiple mirrors, but we don't, and we don't foresee that coming soon. Simplify the code to only ever have one mirror, while still allowing for the possibility of no mirrors.

Author: Shoaib Lari <slari@pivotal.io>
Author: C.J. Jameson <cjameson@pivotal.io>

Committed by Marbin Tan
The gpArray use of the GpDB and Segment classes was confusing. This change renames GpDB to Segment and Segment to SegmentPair to clarify usage. It's a big diff, but a simple, repeating change.

Author: Shoaib Lari <slari@pivotal.io>
Author: Marbin Tan <mtan@pivotal.io>
Author: C.J. Jameson <cjameson@pivotal.io>

- 06 Dec 2017, 1 commit

Committed by yanchaozhong

- 14 Nov 2017, 1 commit

Committed by Daniel Gustafsson

- 01 Sep 2017, 1 commit

Committed by Shoaib Lari
Previously, RemoveDirectories and RemoveFiles used the unix command "rm -rf", which is inefficient for huge numbers of files; these functions also accepted any globbed path. Instead, use "rsync" to optimize deletion of the files in a directory. On a DCA with 1 million files, this increased deletion speed by about 3x. This commit also breaks the different deletion use cases into separate methods, adding RemoveDirectoryContents(), RemoveFile(), and RemoveGlob() to isolate the assumptions of each case and optimize for them.

Signed-off-by: Larry Hamel <lhamel@pivotal.io>
Signed-off-by: Shoaib Lari <slari@pivotal.io>

- 30 Aug 2017, 4 commits

Committed by Nadeem Ghani
Remove the global variable table_expand_error by checking the pool of completed ExpandCommand(s).

Signed-off-by: Marbin Tan <mtan@pivotal.io>

Committed by Shoaib Lari
This commit adds a cluster-state check before doing the expansion: the heap_checksum setting on all primary segments must match the heap_checksum setting on the master. If all primaries match the master, gpexpand continues with setting up the expansion segments; otherwise, it logs the inconsistent primaries and exits.

Signed-off-by: Marbin Tan <mtan@pivotal.io>

Committed by Nadeem Ghani
gpexpand had a lot of code in the __main__ module method, along with global variables used by other methods and classes in the module. This commit introduces a main() method, which can be called from unit tests, and converts the global variables to parameters and fields.

Signed-off-by: Shoaib Lari <slari@pivotal.io>

Committed by Shoaib Lari
Signed-off-by: Nadeem Ghani <nghani@pivotal.io>

- 02 Jun 2017, 1 commit

Committed by Marbin Tan
When gpexpand is run with a specific duration or end time using the '-d' or '-e' flag, there is an off chance that gpexpand reports the redistribution as successful when it is not: gpexpand status details reports COMPLETED, but the data has not been redistributed and is still distributed RANDOMLY.

Signed-off-by: Nadeem Ghani <nghani@pivotal.io>
Signed-off-by: Tushar Dadlani <tdadlani@pivotal.io>

- 04 May 2017, 2 commits

Committed by Marbin Tan
Signed-off-by: Larry Hamel <lhamel@pivotal.io>

Committed by Larry Hamel
Signed-off-by: Marbin Tan <mtan@pivotal.io>