- 05 Nov 2020, 2 commits
-
-
Committed by Bhuvnesh Chaudhary
This commit does the following:
1. Extract config_primaries_for_replication to be used by both gpaddmirrors and gprecoverseg.
2. Add --hba-hostname handling.
3. gprecoverseg: add replication entries for primaries and add tests.
Co-authored-by: Kalen Krempely <kkrempely@vmware.com>
-
Committed by Bhuvnesh Chaudhary
Co-authored-by: Kalen Krempely <kkrempely@vmware.com>
-
- 04 Nov 2020, 5 commits
-
-
Committed by Robert Mu
2019-12-04 23:16:23.075209 CST,"gpadmin","regression",p30552,th-790954624,"[local]",,2019-12-04 23:16:03 CST,8192,con143,cmd54,seg-1,,,x8192,sx1,"ERROR","XX000","unrecognized node type: 0 (copyfuncs.c:6424)",,,,,,"REFRESH MATERIALIZED VIEW m_aocs WITH NO DATA;",0,,"copyfuncs.c",6273,"Stack trace:
1    0xaf1f9c postgres errstart (elog.c:561)
2    0xaf4b83 postgres elog_finish (elog.c:1734)
3    0x7b15f7 postgres copyObject (copyfuncs.c:6424)
4    0x6c6ad1 postgres ExecRefreshMatView (matview.c:409)
5    0x997841 postgres <symbol not found> (utility.c:1743)
6    0x9969f4 postgres standard_ProcessUtility (utility.c:1071)
7    0x993bc5 postgres <symbol not found> (palloc.h:176)
8    0x9945c5 postgres <symbol not found> (pquery.c:1552)
9    0x995a21 postgres PortalRun (pquery.c:1022)
10   0x9908d4 postgres <symbol not found> (postgres.c:1791)
11   0x99359b postgres PostgresMain (postgres.c:5123)
12   0x541c32 postgres <symbol not found> (postmaster.c:4445)
13   0x87dcea postgres PostmasterMain (postmaster.c:1519)
14   0x543fbb postgres main (discriminator 1)
15   0x7f4ccb783505 libc.so.6 __libc_start_main + 0xf5
16   0x54485f postgres <symbol not found> + 0x54485f
Root cause:
1. The rule (a query tree with parentStmtType = PARENTSTMTTYPE_NONE) is saved in the pg_rewrite table when the matview relation is created.
2. ExecRefreshMatView uses the dataQuery pointer to get the rule (query tree) from the matview relation's relcache data.
3. ExecRefreshMatView sets dataQuery->parentStmtType = PARENTSTMTTYPE_REFRESH_MATVIEW.
4. The QD may receive a reset message (shared-inval-queue overflow) when make_new_heap is called, causing the QD to rebuild the backend's entire relcache, including the matview relation. When rebuilding the matview's relcache, oldRel's rule (parentStmtType = PARENTSTMTTYPE_REFRESH_MATVIEW) is found not equal to newRel's rule (parentStmtType = PARENTSTMTTYPE_NONE), so oldRel's rule (dataQuery) is released.
5. refresh_matview_datafill then uses the released dataQuery and reports the error.
(cherry picked from commit 474088cb)
-
Committed by 盏一
The autovacuum worker for template0 would exit with FATAL because Gp_session_role was still the dispatcher role, producing the message below.
2020-09-29 21:20:02.686827 CST,,,p19902,th-881792832,,,,0,,,seg2,,,,sx1,"FATAL","57P03","connections to primary segments are not allowed","This database instance is running as a primary segment in a Greenplum cluster and does not permit direct connections.","To force a connection anyway (dangerous!), use utility mode.",,,,,0,,"postinit.c",1151,
Fix this by setting Gp_session_role to the utility role.
-
Committed by Hans Zeller
* Avoid costing change for IN predicates on btree indexes

  Commit e5f1716 changed the way we handle IN predicates on indexes; it now uses a more efficient array comparison instead of treating it like an OR predicate. A side effect is that the cost function, CCostModelGPDB::CostBitmapTableScan, now goes through a different code path, using the "small NDV" or "large NDV" costing method. This produces very high cost estimates when the NDV increases beyond 2, so we basically never choose an index for these cases, although a btree index used in a bitmap scan isn't very sensitive to the NDV. To avoid this, we go back to the old formula we used before commit e5f1716. The fix is restricted to IN predicates on btree indexes, used in a bitmap scan.

* Add an MDP for a larger IN list, using a btree index on an AO table

* Misc. changes to the calibration test program
  - Added tests for btree indexes (btree_scan_tests).
  - Changed data distribution so that all column values range from 1...n.
  - Parameter values for test queries are now proportional to selectivity; a parameter value of 0 produces a selectivity of 0%.
  - Changed the logic to fake statistics somewhat; hopefully this will lead to more precise estimates. Incorporated the changes to the data distribution with no more 0 values. Added fake stats for unique columns.
  - Headers of tests now use semicolons to separate parts, to give a nicer output when pasting into Google Docs.
  - Some formatting changes.
  - Log fallbacks.
  - When using existing tables, the program now determines the table structure (heap or append-only) and the row count.
  - Split off two very slow tests into separate test units. These are not included when running "all" tests; they have to be run explicitly.
  - Add btree join tests, rename "bitmap_join_tests" to "index_join_tests", and run both bitmap and btree joins.
  - Update min and max parameter values to cover a range that includes, or at least is closer to, the cross-over between index and table scan.
  - Remove the "high NDV" tests, since the ranges in the general test now include both low and high NDV cases (<= and > 200).
  - Print out the selectivity of each query, if available.
  - Suppress standard deviation output when we execute queries only once.
  - Set the search path when connecting.
  - Decrease the parameter range when running bitmap scan tests on heap tables.
  - Run btree scan tests only on AO tables; they are not designed for testing index scans.

* Updates to the experimental cost model, new calibration

  1. Simplify some of the formulas; the calibration process seemed to justify that. We might have to revisit if problems come up. Changes:
     - Rewrite some of the formulas so the costs per row and costs per byte are easier to see.
     - Make the cost for the width directly proportional.
     - Unify the formula for scans and joins: use the same per-byte costs and make NDV-dependent costs proportional to num_rebinds * dNDV, except for the logic in item 3. That makes the cost for the new experimental cost model a simple linear formula:

       num_rebinds * (rows * c1 + rows * width * c2 + ndv * c3 + bitmap_union_cost + c4) + c5

       We have 5 constants, c1 ... c5:
       c1: cost per row (rows on one segment)
       c2: cost per byte
       c3: cost per distinct value (total NDV on all segments)
       c4: cost per rebind
       c5: initialization cost
       bitmap_union_cost: see item 3 below
  2. Recalibrate some of the cost parameters, using the updated calibration program src/backend/gporca/scripts/cal_bitmap_test.py.
  3. Add a cost penalty for bitmap index scans on heap tables. The added cost takes the form bitmap_union_cost = <base table rows> * (NDV-1) * c6. The reason for this is, as others have pointed out, that heap tables lead to much larger bit vectors, since their CTIDs are more spaced out than those of AO tables. The main factor seems to be the cost of unioning these bit vectors, and that cost is proportional to the number of bitmaps minus one and to the size of the bitmaps, which is approximated here by the number of rows in the table. Note that because we use (NDV-1) in the formula, this penalty does not apply to usual index joins, which have an NDV of 1 per rebind. This is consistent with what we see in measurements, and it also seems reasonable, since we don't have to union bitmaps in this case.
  4. Fix to select CostModelGPDB for the 'experimental' model, as we do in 5X.
  5. Calibrate the constants involved (c1 ... c6), using the calibration program and running experiments with heap and append-only tables on a laptop and also on a Linux cluster with 24 segments. Also run some other workloads for validation.
  6. Give a small initial advantage to bitmap scans, so they will be chosen over table scans for small tables. Otherwise, small queries will have more or less random plans, all of which cost around 431, the value of the initial cost. Added a 10% advantage for the bitmap scan.
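The linear formula described in item 1, together with the heap-table penalty from item 3, can be sketched directly. This is a minimal illustration; the constant defaults below are placeholders, not the calibrated values produced by the calibration program:

```python
def bitmap_scan_cost(num_rebinds, rows, width, ndv, base_table_rows,
                     is_heap, c1=1.0, c2=0.1, c3=2.0, c4=5.0, c5=10.0,
                     c6=0.01):
    """Sketch of the experimental cost model formula described above.

    c1: cost per row (rows on one segment), c2: cost per byte,
    c3: cost per distinct value (total NDV on all segments),
    c4: cost per rebind, c5: initialization cost, c6: heap bitmap-union
    factor. The default constant values are illustrative placeholders.
    """
    # Item 3: penalty for unioning bitmaps on heap tables. Because of the
    # (NDV - 1) factor, an ordinary index join (NDV of 1 per rebind)
    # pays no penalty.
    bitmap_union_cost = base_table_rows * (ndv - 1) * c6 if is_heap else 0.0
    return num_rebinds * (rows * c1 + rows * width * c2
                          + ndv * c3 + bitmap_union_cost + c4) + c5
```

With NDV = 1 the heap and AO costs coincide, matching the note that the penalty does not apply to usual index joins.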
-
Committed by Hans Zeller
* Use original join pred for DPE with index nested loop joins

  Dynamic partition selection is based on a join predicate. For index nested loop joins, however, we push the join predicate to the inner side and replace the join predicate with "true". This meant that we couldn't do DPE for index nested loop joins. This commit remembers the original join predicate in the index nested loop join, to be used in the generated filter map for DPE. The original join predicate needs to be passed through multiple layers.

* SPE for index preds

  Some of the xforms use the method CXformUtils::PexprRedundantSelectForDynamicIndex to duplicate predicates that could be used both as index predicates and as partition elimination predicates. The call was missing in some other xforms; added it.

* Changes to equivalent distribution specs with redundant predicates

  Adding redundant predicates causes some issues with generating equivalent distribution specs, to be used for the outer table of a nested index loop join. We want the equivalent spec to be expressed in terms of outer references, which are the columns of the outer table. By passing in the outer refs, we can ensure that we won't replace an outer ref in a distribution spec with a local variable from the original distribution spec.

  Also removed the asserts in CPhysicalFilter::PdsDerive that ensure the distribution spec is complete (consisting of only columns from the outer table) after we see a select node. Even without my changes, the asserts do not always hold, as this test case shows:

  drop table if exists foo, bar;
  create table foo(a int, b int, c int, d int, e int) distributed by(a,b,c);
  create table bar(a int, b int, c int, d int, e int) distributed by(a,b,c);
  create index bar_ixb on bar(b);
  set optimizer_enable_hashjoin to off;
  set client_min_messages to log;
  -- runs into assert
  explain select * from foo join bar on foo.a=bar.a and foo.b=bar.b where bar.c > 10 and bar.d = 11;

  Instead of the asserts, we now use the new method of passing in the outer refs to ensure that we move towards completion. We also know now that we can't always achieve a complete distribution spec, even without redundant predicates.

* MDP changes

  Various changes to MDPs:
  - New SPE filters used in plan
  - New redundant predicates (partitioning or on non-partitioning columns)
  - Plan space changes
  - Cost changes
  - Motion changes
  - Regenerated, because the plan switched to a hash join, so used a guc to force an index plan
  - Fixed lookup failures
  - Add an mdp where we try unsuccessfully to complete a distribution spec

* ICG result changes

  - A test used the 'experimental' cost model to force an index scan, but we now get the index scan even with the default cost model.
-
- 03 Nov 2020, 2 commits
-
-
Committed by xiong-gang
In function DropResourceGroup(), group->lockedForDrop is set to true by calling ResGroupCheckForDrop; however, it can only be set back to false inside dropResgroupCallback. This callback is registered at the end of DropResourceGroup, so if an error occurred between the two, group->lockedForDrop would remain true forever. Fix it by moving the callback registration ahead of the lock call. To prevent Assert(group->nRunning* > 0) when ResGroupCheckForDrop throws an error, return directly if group->lockedForDrop did not change. See:
```
gpconfig -c gp_resource_manager -v group
gpstop -r -a
psql
CPU_RATE_LIMIT=20, MEMORY_LIMIT=20, CONCURRENCY=50,
MEMORY_SHARED_QUOTA=80, MEMORY_SPILL_RATIO=20, MEMORY_AUDITOR=vmtracker
);
psql -U user_test
> \d
-- hang
```
Co-authored-by: dh-cloud <60729713+dh-cloud@users.noreply.github.com>
-
Committed by (Jerome)Junfeng Yang
On the QD, we track whether a QE wrote xlog in the libpq connection. The logic is: if a QE writes xlog, it sends a libpq message to the QD. But the message is sent in ReadyForQuery, so before the QE executes that function, it may already have sent results back to the QD. Then, when the QD processes the message, it does not see the new wrote_xlog value. This leaves the connection holding the previous dispatch's wrote_xlog value, which affects whether one-phase commit is chosen. The issue only happens when the QE flushes the libpq message before ReadyForQuery, so it is hard to find a test case to cover it. I found the issue while experimenting with sending extra information from QE to QD, which broke the gangsize test that shows the commit info. (cherry picked from commit 777b51cd)
-
- 02 Nov 2020, 2 commits
-
-
Committed by Ning Wu
The repo https://github.com/greenplum-db/greenplum-database-release has changed its default branch from master to main. This commit syncs up with that change.
Co-authored-by: Ning Wu <ningw@vmware.com>
Co-authored-by: Shaoqi Bai <bshaoqi@vmware.com>
-
Committed by Jialun
- create table ... as select ...
- create materialized view ... as select ...
This is a backport of commit 7ae210a1bf7e569a18cda32dcec3b55665a42ee7.
-
- 31 Oct 2020, 1 commit
-
-
Committed by David Yozie
-
- 30 Oct 2020, 3 commits
-
-
Committed by David Yozie
* Docs - update interconnect proxy discussion to cover hostname support
* Change gp_interconnect_type -> gp_interconnect_proxy_addresses in note
-
Committed by Lisa Owen
-
Committed by dh-cloud
Looking at the GP documents, there is no indication that the master dbid must be 1. However, during CREATE_QD_DB, gpinitsystem always writes "gp_dbid=1" into the file `internal.auto.conf`, even if we specify:
```
mdw~5432~/data/master/gpseg-1~2~-1
OR
mdw~5432~/data/master/gpseg-1~0~-1
```
But the catalog gp_segment_configuration can hold the correct master dbid value (2 or 0), and the mismatch causes gpinitsystem to hang. Users can run into this problem the first time they use gpinitsystem -I. We test dbid 0 here because PostmasterMain() simply checks dbid >= 0 (non-utility mode); it says:
> This value must be >= 0, or >= -1 in utility mode
So 0 appears to be a valid value.
Changes:
- use the specified master dbid field in CREATE_QD_DB.
Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
(cherry picked from commit 00ae3013)
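The intent of the fix can be sketched as: read the dbid out of the `~`-separated spec line instead of hardcoding 1. The field order hostname~port~datadir~dbid~content is inferred from the examples above, and `parse_qd_spec` / `internal_auto_conf` are hypothetical helper names, not actual gpinitsystem functions:

```python
def parse_qd_spec(line):
    # Assumed field order, taken from the examples in the commit message:
    # hostname~port~datadir~dbid~content
    hostname, port, datadir, dbid, content = line.strip().split("~")
    return {"hostname": hostname, "port": int(port), "datadir": datadir,
            "dbid": int(dbid), "content": int(content)}

def internal_auto_conf(spec):
    # The fix: write the specified dbid, not a hardcoded gp_dbid=1.
    return "gp_dbid={}\n".format(spec["dbid"])
```

For the first example line above, this yields gp_dbid=2, matching what gp_segment_configuration would record.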
-
- 29 Oct 2020, 2 commits
-
-
Committed by Lisa Owen
-
Committed by dh-cloud
If cdbcomponent_getCdbComponents() caught an error thrown by getCdbComponents, FtsNotifyProber would be called. But if this happened inside the FTS process, the FTS process would hang. Skip the FTS probe when running in the FTS process; with that change, under the same situation, the FTS process exits and is then restarted by the postmaster. (cherry picked from commit 3cf72f6c)
-
- 28 Oct 2020, 10 commits
-
-
Committed by Adam Lee
Otherwise it will raise an exception "command not run yet".
-
Committed by Adam Lee
gprecoverseg did not log the error message when pg_rewind fails; fix that to make DBA/field/developer's life easier. Before this:
```
20201022:15:19:10:011118 gprecoverseg:earth:adam-[INFO]:-Running pg_rewind on required mirrors
20201022:15:19:12:011118 gprecoverseg:earth:adam-[WARNING]:-Incremental recovery failed for dbid 2. You must use gprecoverseg -F to recover the segment.
20201022:15:19:12:011118 gprecoverseg:earth:adam-[INFO]:-Starting mirrors
20201022:15:19:12:011118 gprecoverseg:earth:adam-[INFO]:-era is 0406b847bf226356_201022151031
```
After this:
```
20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-Running pg_rewind on required mirrors
20201022:15:33:31:019577 gprecoverseg:earth:adam-[WARNING]:-pg_rewind: fatal: could not find common ancestor of the source and target cluster's timelines
20201022:15:33:31:019577 gprecoverseg:earth:adam-[WARNING]:-Incremental recovery failed for dbid 2. You must use gprecoverseg -F to recover the segment.
20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-Starting mirrors
20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-era is 0406b847bf226356_201022151031
```
-
Committed by Jimmy Yih
Currently, when inserting into a table distributed by a bpchar column using the legacy bpchar hash operator, the row goes through jump consistent hashing instead of legacy modular hashing. This is because the cdblegacyhash_bpchar funcid is missing from the isLegacyCdbHashFunction check, which determines whether an attribute is using a legacy hash function. The funcids currently in that check come from the auto-generated fmgroids.h header file, which only creates a DEFINE for the pg_proc.prosrc field. Unfortunately, cdblegacyhash_bpchar is left out because its prosrc is cdblegacyhash_text. A proper fix would require a catalog change. To fix this issue in 6X_STABLE, we need to hardcode the cdblegacyhash_bpchar funcid 6148 into the isLegacyCdbHashFunction check. This should be fine since the GPDB 6X_STABLE catalog is frozen.
This issue was reported by GitHub user cobolbaby in the gpbackup repository while migrating GPDB 5X tables to GPDB 6X: https://github.com/greenplum-db/gpbackup/issues/425
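The two row-placement schemes the commit contrasts can be sketched as follows. The jump consistent hash below is the published Lamping-Veach algorithm; the bpchar hash-value computation itself (shared with cdblegacyhash_text) is not reproduced here, only the hash-to-segment mapping step:

```python
def jump_consistent_hash(key, num_segments):
    """Jump consistent hashing (Lamping & Veach): maps a 64-bit hash
    value to a segment, moving few keys when the segment count changes.
    This is the placement GPDB 6 uses for non-legacy distribution keys."""
    b, j = -1, 0
    while j < num_segments:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

def legacy_modular_hash(hash_value, num_segments):
    """Legacy (GPDB 5-style) placement: plain modulo over the segments."""
    return hash_value % num_segments
```

The bug means a legacy-hash bpchar key was routed through the first function instead of the second, so rows landed on different segments than a GPDB 5 table with the same data, which is what broke the gpbackup migration.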
-
Committed by Jesse Zhang
We started hitting this on Thursday, and there have been ongoing reports from the community about it as well. While upstream is figuring out a long-term solution [1], we've been advised [2] to pin to the previous release (v0.21.0) to avoid being blocked for hours at a time.
[1]: https://github.com/telia-oss/github-pr-resource/pull/238
[2]: https://github.com/telia-oss/github-pr-resource/pull/238#issuecomment-714830491
(cherry picked from commit f4bf9be6)
-
Committed by Bradford D. Boyle
Compiling with gcc 10 on Debian testing fails with the following error:
```
/usr/bin/ld: utils/misc/guc_gp.o:(.bss+0x308): multiple definition of `data_directory'; utils/misc/guc.o:(.bss+0x70): first defined here
```
-
Committed by Bradford D. Boyle
Some platforms do not have unversioned "python" available but do have versioned "python2". Configure should look for either "python" or "python2" when run with the option "--with-python". These changes were manually copied from the Postgres build system but omitted searching for "python3" since Greenplum does not have support for Python 3 yet.
-
Committed by Lisa Owen
-
Committed by Lisa Owen
-
Committed by David Kimura
(cherry picked from commit 91ed33c9)
-
Committed by David Kimura
This allows us to reduce code duplication across workload SQL scripts. (cherry picked from commit 8c204bd5)
-
- 27 Oct 2020, 3 commits
-
-
Committed by Xiaoran Wang
Greenplum only supports INSERT here, because UPDATE/DELETE requires the hidden column gp_segment_id, and there is also the "ModifyTable mixes distributed and entry-only tables" issue.
-
Committed by Chris Hajas
These assertions started getting tripped in the previous commit when adding tests, but they aren't related to the Epsilon change. Rather, we're calculating the frequency of a singleton bucket using two different methods, which causes the assertion to break down. The first method (calculating the upper_third) assumes the singleton has 1 NDV and that there is an even distribution across the NDVs. The second (in GetOverlapPercentage) calculates a "resolution" that is based on Epsilon and assumes the bucket contains some small Epsilon frequency. It results in the overlap percentage being too high; instead, it too should likely be based on the NDV. In practice, this won't have much impact unless the NDV is very small. Additionally, the conditional logic is based on the bounds, not the frequency. However, it would be good to align the two methods in the future so our statistics calculations are simpler to understand and predictable. For now, we'll remove the assertions and add a TODO. Once we align the methods, we should add these assertions back.
-
Committed by Chris Hajas
When merging statistics buckets for UNION and UNION ALL queries involving a column that maps to Double (e.g. floats, numerics, time-related types), we could end up in an infinite loop. This occurred if the bucket boundaries being compared were within a very small value, defined in Orca as Epsilon. While we considered two values equal if they were within Epsilon, we didn't apply that when computing whether datum1 < datum2. Therefore we'd get into a situation where a datum could be both equal to and less than another datum, which the logic wasn't able to handle.
The fix is to make sure we have a hard boundary for when we consider one datum less than another by including the epsilon logic in all datum comparisons. Now, two datums are equal if they are within epsilon, but datum1 is less than datum2 only if datum1 < datum2 - epsilon.
Also add some tests, since we didn't have any for types that map to Double.
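The fixed comparison rule can be sketched as follows; the epsilon value here is illustrative, not Orca's actual constant:

```python
EPSILON = 1e-10  # illustrative; Orca's actual Epsilon differs

def datums_equal(d1, d2, eps=EPSILON):
    """Two Double-mapped datums are considered equal within epsilon."""
    return abs(d1 - d2) <= eps

def datum_less(d1, d2, eps=EPSILON):
    """d1 is strictly less than d2 only if it is below d2 by more than
    epsilon -- the hard boundary the fix introduces, so a pair of datums
    can never be both 'equal' and 'less than', which is what allowed the
    bucket-merge loop to run forever."""
    return d1 < d2 - eps
```

With this rule, exactly one of equal / less / greater holds for any pair, so the merge loop always makes progress.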
-
- 26 Oct 2020, 1 commit
-
-
Committed by Xiaoran Wang
* Enable the postgres_fdw test in the ICW test. The postgres_fdw test is disabled by default; it is enabled in the gpdb pipelines.
-
- 23 Oct 2020, 4 commits
-
-
Committed by dh-cloud
The PostgreSQL libpq documentation says:
> Note that when PQconnectStart or PQconnectStartParams returns a
> non-null pointer, you must call PQfinish when you are finished
> with it, in order to dispose of the structure and any associated
> memory blocks. **This must be done even if the connection attempt
> fails or is abandoned**.
However, the cdbconn_disconnect() function did not call PQfinish when the status was CONNECTION_BAD, which can cause socket leaks (sockets stuck in the CLOSE_WAIT state).
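The contract can be sketched with a toy connection object (hypothetical names; libpq itself is C): disposal must happen regardless of connection status.

```python
class FakeConn:
    """Toy stand-in for a PGconn handle; counts un-freed handles."""
    open_handles = 0

    def __init__(self, ok):
        self.status = "CONNECTION_OK" if ok else "CONNECTION_BAD"
        self.finished = False
        FakeConn.open_handles += 1

    def finish(self):
        # Plays the role of PQfinish(): frees the structure and socket.
        if not self.finished:
            self.finished = True
            FakeConn.open_handles -= 1

def disconnect(conn):
    """Sketch of the fixed behavior: call finish() even when the status is
    CONNECTION_BAD, instead of returning early and leaking the socket
    (which the kernel then leaves in CLOSE_WAIT)."""
    if conn is not None:
        conn.finish()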
-
Committed by Paul Guo
Fix an orphaned prepared transaction case due to a race between the checkpointer and COMMIT PREPARED xlog recording.

On Greenplum, a checkpoint collects prepared transactions, including some that are actually already committed. If the COMMIT PREPARED xlog record is before checkpoint.redo, then after the segment reboots there will always be an orphaned (actually committed) prepared transaction in memory. That happens when the checkpointer collects the prepared transaction after the COMMIT PREPARED xlog is recorded but before gxact->valid is reset; see the code in FinishPreparedTransaction(). That can lead to various issues, e.g. dtx recovery would keep trying to abort it and then panic on the segment with a message like "cannot abort transaction 3285003, it was already committed (twophase.c:2205)".

Fix this by adding a new variable, committed, in gxact to specify whether the global transaction is committed or not. If it is committed, we surely do not need to log the gxact in the checkpoint xlog.

We could also fix this by delaying the checkpoint until after gxact->valid is reset in FinishPreparedTransaction(), but RecordTransactionCommitPrepared() -> SyncRepWaitForLSN() might be time-consuming or block for some time (locking, network lag, etc.), and thus could block the checkpointer for too long, which is surely not good. It also seems we could fix it by moving "gxact->valid = false" ahead of the delayChkpt reset, but that is ugly as well, and a bit risky for a stable release. Two other solutions were discussed previously: one is a locking mechanism, but that hurts OLTP performance; another is to remove the false-positive cases in RecoverPreparedTransactions(), but the related clog may have been removed by subsequent vacuum operations, so that is not reliable either.

Co-authored-by: Hao Wu <gfphoenix78@gmail.com>
Reviewed-by: Asim R P <pasim@vmware.com>
Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
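The race window and the new flag can be sketched as follows. The new committed flag is set once the COMMIT PREPARED record is written, before valid is reset, so a checkpoint landing in that window no longer collects the transaction. Names are simplified from the description above, not copied from the source:

```python
class GXact:
    """Sketch of the relevant in-memory prepared-transaction state."""
    def __init__(self, xid):
        self.xid = xid
        self.valid = True       # reset late in FinishPreparedTransaction()
        self.committed = False  # new flag: COMMIT PREPARED xlog written

def checkpoint_collect(gxacts):
    # Checkpointer side of the fix: skip transactions whose commit record
    # already exists, even though valid has not been reset yet. Before the
    # fix the filter was only `g.valid`, so a gxact in the race window was
    # wrongly logged and resurrected as "prepared" after a reboot.
    return [g.xid for g in gxacts if g.valid and not g.committed]
```

After a reboot, only transactions collected here are recovered as prepared, so skipping already-committed ones removes the orphan.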
-
Committed by Peter Eisentraut
Fixup for "Drop slot's LWLock before returning from SaveSlotToPath()"
Reported-by: Michael Paquier <michael@paquier.xyz>
(cherry picked from commit 72b2b9c52e3a86ae414fc07acf6db3de0776fc13)
-
Committed by Peter Eisentraut
When SaveSlotToPath() is called with elevel=LOG, the early exits didn't release the slot's io_in_progress_lock. This could result in a walsender being stuck on the lock forever. A possible way to get into this situation is if the offending code paths are triggered in a low disk space situation.
Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com>
Reported-by: Craig Ringer <craig@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
(cherry picked from commit ce28a43ffa89b49584e75d6bb9f8ae03a8e13151)
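The control-flow pattern of the fix can be sketched in a few lines: every exit path, including the early returns taken when the save is skipped or fails non-fatally, must release the lock. This is an illustrative model (here via try/finally), not the C code:

```python
import threading

io_in_progress_lock = threading.Lock()

def save_slot_to_path(slot_is_dirty):
    """Sketch of the fixed SaveSlotToPath() control flow: the early exit
    no longer leaves io_in_progress_lock held, so a walsender waiting on
    the slot cannot block forever."""
    io_in_progress_lock.acquire()
    try:
        if not slot_is_dirty:
            return False   # early exit -- lock still released by finally
        # ... serialize the slot state to disk; may fail with elevel=LOG ...
        return True
    finally:
        io_in_progress_lock.release()
```

The buggy shape was the same function with plain `release()` calls on only some paths, so the `return False` branch kept the lock.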
-
- 22 Oct 2020, 3 commits
-
-
Committed by David Yozie
-
Committed by Bhuvnesh Chaudhary
The parameters were incorrectly passed when gprecoverseg was invoked, causing gprecoverseg to fail.
Co-authored-by: Aleksey Kashin <kashinav@yandex-team.ru>
-
Committed by Jamie McAtamney
demo_cluster.sh: remove GPSEARCH
-
- 21 Oct 2020, 2 commits
-
-
Committed by Jinbao Chen
The result of NULL NOT IN a non-empty set is not true (it is NULL, which a WHERE clause treats as false). The result of NULL NOT IN an empty set is true. But if a non-empty inner set has partitioned locus, it will be divided into several per-segment subsets, and some of those subsets may be empty. Because NULL NOT IN an empty set evaluates to true, tuples that shouldn't exist in the result set can appear. The patch disables the partitioned locus of the inner table by removing the join clause from the redistribution_clauses.
This commit is cherry-picked from master commit f77bf087.
Co-authored-by: Hubert Zhang <hubertzhang@apache.org>
Co-authored-by: Richard Guo <riguo@pivotal.io>
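The NULL semantics driving the bug can be sketched with a three-valued evaluator (None models SQL NULL); this is an illustrative model, not planner code:

```python
def sql_not_in(value, values):
    """Three-valued SQL evaluation of `value NOT IN (values)`.

    Returns True, False, or None (unknown). A WHERE clause keeps a row
    only when the result is True, so NULL NOT IN (<non-empty>) filters
    the row out, while NULL NOT IN (<empty>) keeps it -- the asymmetry
    that breaks when a non-empty inner set is split into per-segment
    subsets, some of which are empty.
    """
    if not values:
        return True                  # vacuous AND over zero comparisons
    if value is None:
        return None                  # every comparison is unknown
    non_null = [v for v in values if v is not None]
    if value in non_null:
        return False                 # some comparison is definitely false
    if len(non_null) < len(values):
        return None                  # a NULL element leaves it unknown
    return True
```

Splitting a non-empty set across segments turns some `sql_not_in(None, subset)` calls into the empty-set case, flipping not-true into True for NULL keys, which is exactly the spurious-tuple bug described above.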
-
Committed by Richard Guo
When constructing plans for a list of rollups, the rollup subplan may be retrieved from root->simple_rel_array[i]->subplan, which is not always a SubqueryScan. So later, in function rebuild_append_simple_rel_and_rte(), we need to verify that the subplan is a SubqueryScan node before assigning a scanrelid to it.
Fixes issue #10813 and issue #10841.
Reviewed-by: Junfeng (Jerome) Yang <jeyang@pivotal.io>
-