- 29 November 2018, 7 commits
-
-
Submitted by Ning Yu
This is preparation for an upcoming random ICG pipeline job. In that pipeline job we will have tables created on different segments; so far the only way to hack this behavior is by using the gp_debug_numsegments extension. We want to load and execute that extension for all the tests without modifying them directly, so a hook mechanism is needed. Added a pg_regress argument --prehook to set a hook. A hook script should be put under the src/test/regress/{sql,input}/hooks/ directory, depending on whether it needs pg_regress to substitute the @@ tokens. At most one hook can be specified. It can be set via the make command like this:

    # suppose there is src/test/regress/sql/hooks/hookname.sql
    make installcheck EXTRA_REGRESS_OPTS=--prehook=hookname
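As a sketch, a prehook script that loads the extension for every test might look like this (the file name and its contents are illustrative, not taken from the commit):

```sql
-- src/test/regress/sql/hooks/hookname.sql
-- Run by pg_regress before the scheduled tests when
-- --prehook=hookname is passed.
CREATE EXTENSION IF NOT EXISTS gp_debug_numsegments;
SELECT gp_debug_set_create_table_default_numsegments('RANDOM');
```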
-
Submitted by Ning Yu
This is preparation for an upcoming random ICG pipeline job. In that pipeline job we will have tables created on different segments, so the plans might differ from the expected output. atmsort.pm already has an argument -gpd_ignore_plans to ignore plan diffs, but there was no equivalent in pg_regress. Added a pg_regress argument --ignore-plans to ignore plan diffs. It can be enabled via the make command like this:

    make installcheck EXTRA_REGRESS_OPTS=--ignore-plans
-
Submitted by Rahul Iyer
-
Submitted by Heikki Linnakangas
PostgreSQL 9.4 brought us the 'pg_lsn' datatype, which is functionally equivalent. And we weren't actually using 'gpxlogloc' for anything, anyway.
-
Submitted by Ashwin Agrawal
Given the online gpexpand work, the gp_num_contents_in_cluster GUC is unused. So, remove it from the code to avoid confusion and to eliminate this long argument that was required to start a postgres instance in GPDB.
-
Submitted by Ekta Khanna
Co-authored-by: Shoaib Lari <slari@pivotal.io>
-
Submitted by Ekta Khanna
Prior to this commit, we always did a lookup for the relationOid on the QE. The original intent of the code was to avoid the relationOid lookup on the QE if it had already been stashed in the IndexStmt. This commit updates the code for that, along with the `partition_locking` test expected file. Co-authored-by: Shoaib Lari <slari@pivotal.io>
-
- 28 November 2018, 2 commits
-
-
Submitted by Bhuvnesh Chaudhary
-
Submitted by Pengzhou Tang
Recently, the ICW case 'select_into' failed intermittently with an error like:

    CREATE TABLE selinto_schema.tmp3 (a,b,c) AS SELECT oid,relname,relacl FROM pg_class WHERE relname like '%c%'; -- OK
    ERROR: role "30973" does not exist

The root cause is that the 'lock' test should not run concurrently with 'select_into'. A minimal repro:

1. In the 'lock' test:

    CREATE TABLE lock_tbl1 (a BIGINT);
    CREATE ROLE regress_rol_lock1;
    GRANT UPDATE ON TABLE lock_tbl1 TO regress_rol_lock1;
    select oid, relname, relacl from pg_class where relname like '%c%';
      oid   |  relname  |                        relacl
    --------+-----------+-------------------------------------------------------
     180367 | lock_tbl1 | {gpadmin=arwdDxt/gpadmin,regress_rol_lock1=w/gpadmin}

2. In the 'select_into' test:

    CREATE TABLE selinto_schema.tmp3 (a,b,c) AS SELECT oid,relname,relacl FROM pg_class WHERE relname like '%c%';

3. In the 'lock' test:

    DROP TABLE lock_tbl1;
    DROP ROLE regress_rol_lock1;

4. In the 'select_into' test:

    select * from tmp3;
       a    |     b     |                     c
    --------+-----------+--------------------------------------------
     180367 | lock_tbl1 | {gpadmin=arwdDxt/gpadmin,180370=w/gpadmin}
    create table tmp4 as select * from tmp3;
    NOTICE: Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy entry.
    ERROR: role "180370" does not exist

In step 4, role 'regress_rol_lock1' has already been dropped, so when the ACL is parsed, role "180370" no longer exists; the error is reasonable. In upstream, the 'lock' test doesn't run concurrently with 'select_into'. I also noticed that we have two 'lock' entries in parallel_schedule, so remove the first one to match upstream and avoid the intermittent failure of 'select_into'.
-
- 27 November 2018, 2 commits
-
-
Submitted by xiong-gang
EvalPlanQual materializes the slot into a heap tuple, after which PRIVATE_tts_values points to freed memory. We need to reset PRIVATE_tts_nvalid in ExecMaterializeSlot to prevent the following ExecFilterJunk from referencing the dangling pointer.
-
Submitted by Zhenghua Lyu
Previously the reshuffle node's numsegments was always set to the cluster size. Now that we have the flexible gang & dispatch API, we should correct the numsegments field of the reshuffle node by setting it to its lefttree's flow->numsegments. Co-authored-by: Shujie Zhang <shzhang@pivotal.io> Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
-
- 26 November 2018, 1 commit
-
-
Submitted by Ning Yu
In CREATE TABLE we used to decide numsegments from the LIKE, INHERITS and DISTRIBUTED BY clauses. However, we do not want partially distributed tables to be created by end users, so change the logic to always create tables with DEFAULT as numsegments. We still allow developers to hack the DEFAULT numsegments with the gp_debug_numsegments extension.
-
- 25 November 2018, 1 commit
-
-
Submitted by Daniel Gustafsson
Commit 17f9b7f070dbe17b2844a8b4dd428 in the pgweb repository removed the /static/ portion on all doc URLs, leaving a redirect in place. To avoid incurring a needless redirect, remove the /static/ part in the links to the PostgreSQL documentation. The /static/ URLs stem from a time when there were interactive docs that had comment functionality. These docs were removed a very long time ago, but the static differentiator was left in place until now. Reviewed-by: Mel Kiyama
-
- 24 November 2018, 1 commit
-
-
Submitted by Heikki Linnakangas
The old error was: ERROR: cannot use expression as distribution key, because it is not hashable (cdbmutate.c:1329). The new error is the same one you get with CREATE TABLE. While we're at it, also change the "can't" contraction to "cannot" in the error message, to follow the PostgreSQL error message guidelines.
-
- 23 November 2018, 5 commits
-
-
Submitted by BaiShaoqi
This reverts four commits:
1. Remove Master/Standby SyncRepWait Greenplum hack: 7f6066ea
2. Add alter system synchronous_standby_names to * when gpinitstandby -n: 1136f2fb
3. Hot fix in gpinitstandby behave test failure: b6c77b2f
4. Remove unused variables, to silence compiler warnings: 88a185a5
The first commit should be reverted because gpinitstandby was not changed to correctly conform to it: if the standby is down and synchronous_standby_names is *, the cluster will not start and will hang. The second, third and fourth should be reverted because they depend on the first commit.
-
Submitted by Pengzhou Tang
Previously, when creating a join path between a CdbLocusType_SingleQE path and a CdbLocusType_SegmentGeneral path, we always added a motion on top of the CdbLocusType_SegmentGeneral path, so that even if the join path is promoted to execute on the QD, the CdbLocusType_SegmentGeneral path can still be executed on the segments:

                join (CdbLocusType_SingleQE)
               /                  \
              /                    \
    CdbLocusType_SingleQE      Gather Motion
                                     \
                          CdbLocusType_SegmentGeneral

For example: (select * from partitioned_table limit 1) as t1

    Nested Loop
      ->  Gather Motion 1:1
            ->  Seq Scan on replicated_table
      ->  Materialize
            ->  Subquery Scan on t1
                  ->  Limit
                        ->  Gather Motion 3:1
                              ->  Limit
                                    ->  Seq Scan on partitioned_table

replicated_table only stores tuples on segments, so without the gather motion the seq scan of replicated_table provides no tuples.

There is another problem: if the join path is not promoted to the QD, the gather motion might be redundant. For example: (select * from replicated_table, (select * from partitioned_table limit 1) t1) sub1;

    Gather Motion 3:1
      ->  Nested Loop
            ->  Seq Scan on partitioned_table_2
            ->  Materialize
                  ->  Broadcast Motion 1:3
                        ->  Nested Loop
                              ->  Gather Motion 1:1 (redundant motion)
                                    ->  Seq Scan on replicated_table
                              ->  Materialize
                                    ->  Subquery Scan on t1
                                          ->  Limit
                                                ->  Gather Motion 3:1
                                                      ->  Limit
                                                            ->  Seq Scan on partitioned_table

So in apply_motion_mutator() we omit such a redundant motion if it is not gathered to the top slice (QD). sliceDepth == 0 means it is the top slice; however, sliceDepth is shared by both init plans and the main plan, so if the main plan has increased sliceDepth, an init plan may omit the gather motion unexpectedly, producing wrong results. The fix is simply to reset sliceDepth for init plans.
-
Submitted by Pengzhou Tang
-
Submitted by Ning Yu
When an Append node contains a SingleQE subpath we used to put the Append on ALL the segments; however, if the SingleQE is partially distributed then we obviously cannot put the SingleQE on ALL the segments, and this conflict could result in runtime errors or incorrect results. To fix this, we should put the Append on the SingleQE's segments. On the other hand, when there are multiple SingleQE subpaths we should put the Append on the common segments of the SingleQEs. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
Submitted by Ning Yu
There are 3 reshuffle tests: the ao one, the co one, and the heap one. They share almost the same cases, but differ in table names and CREATE TABLE options. There are also some differences introduced when adding regression tests that were only added in one file but not the others. We want to keep the differences between these tests minimal, so that a regression test for ao also covers the similar case for heap, and once we understand one of the test files we have almost the same knowledge of the others. Here is a list of changes to these tests:
- reduce differences in table names by using a schema;
- reduce differences in CREATE TABLE options by setting default storage options;
- simplify the creation of partially distributed tables by using the gp_debug_numsegments extension;
- copy some regression tests to all the tests;
- retire the no-longer-used helper function;
- move the tests into an existing parallel test group.
The pg_regress test framework provides @@ tokens for ao/co tests; however, we still cannot merge the ao and co tests into one file, as WITH (OIDS) is only supported by ao but not co.
-
- 22 November 2018, 7 commits
-
-
Submitted by Heikki Linnakangas
When determining the locus for a LEFT or RIGHT JOIN, we can use the outer side's distribution key as is. The EquivalenceClasses from the nullable side are not of interest above the join, and the outer side's distribution key can lead to better plans, because it can be made a Hashed locus rather than HashedOJ. A Hashed locus can be used for grouping, for example, unlike a HashedOJ. This buys back better plans for some INSERT and CTAS queries that started to need Redistribute Motions after the previous commit. Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
-
Submitted by Heikki Linnakangas
There was some confusion about how NULLs are distributed when CdbPathLocus is of Hashed or HashedOJ type. The comment in cdbpathlocus.h suggested that NULLs can be on any segment, but the rest of the code assumed that that's true only for HashedOJ, and that for Hashed, all NULLs are stored on a particular segment. There was a comment in cdbgroup.c that said "Or would HashedOJ ok, too?"; the answer to that is "No!". Given the comment in cdbpathlocus.h, I'm not surprised that the author was not very sure about that. Clarify the comments in cdbpathlocus.h and cdbgroup.c on that.

There were a few cases where we got that actively wrong. The repartitionPlan() function is used to inject a Redistribute Motion into queries used for CREATE TABLE AS and INSERT, if the "current" locus doesn't match the target table's policy. It did not check for HashedOJ. Because of that, if the query contained FULL JOINs, NULL values might end up on all segments. Code elsewhere, particularly in cdbgroup.c, assumes that all NULLs in a table are stored on a single segment, identified by the cdbhash value of a NULL datum. Fix that by adding a check for HashedOJ in repartitionPlan(), and forcing a Redistribute Motion.

CREATE TABLE AS had a similar problem, in the code that decides which distribution key to use if the user didn't specify DISTRIBUTED BY explicitly. The default behaviour is to choose a distribution key that matches the distribution of the query, so that we can avoid adding an extra Redistribute Motion. After fixing repartitionPlan(), there was no correctness problem, but if we chose the key based on a HashedOJ locus there would be no performance benefit, because we'd need a Redistribute Motion anyway. So modify the code that chooses the CTAS distribution key to ignore HashedOJ. While we're at it, refactor the code that chooses the CTAS distribution key by moving it to a separate function; it had become ridiculously deeply indented.

Fixes https://github.com/greenplum-db/gpdb/issues/6154, and adds tests. Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
-
Submitted by Heikki Linnakangas
The case where some, but not all, of the query's distribution keys were present in the result set, was not covered by any existing tests. Per Paul Guo's observation.
-
Submitted by Zhenghua Lyu
This commit is the first step in refactoring ATExecSetDistributedBy. Its main purpose is to remove some dead code in this function; during the process we found that some helper functions could also be simplified, so that simplification is included in this commit. According to MPP-7770, we should disable changing storage options for now. It is ugly to just throw an error on encountering the `appendonly` option without removing the code, so this commit removes all the related logic. Because the WITH clause can only contain reshuffle|reorganize, we only need new_rel_opts if the table itself is ao|aoco; there is no need to deduce it from the WITH clause. We also remove the unnecessary checks at the start of this function, because these checks have already been done in ATPrepCmd. Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
-
Submitted by Jimmy Yih
When the standby master is unavailable, the master will not block on commits even though we enable synchronous replication. This is because we have a Greenplum hack which checks if the WAL stream with the standby master is valid. If the stream is invalid, the master will quickly skip SyncRepWait and continue with its commit. Remove this hack in order to make Master/Standby and Primary/Mirror WAL replication more similar. Co-authored-by: Shaoqi Bai <sbai@pivotal.io>
-
Submitted by Ning Yu
Suppose t1 has numsegments=1 and t2 has numsegments=2; then the query below will have an incorrect plan:

    explain (costs off)
    select * from t2 a join t2 b using(c2)
    union all
    select * from t1 c join t1 d using(c2);

                                  QUERY PLAN
    ------------------------------------------------------------------------
     Gather Motion 1:1  (slice3; segments: 1)
       ->  Append
             ->  Hash Join
                   Hash Cond: (a.c2 = b.c2)
                   ->  Redistribute Motion 2:2  (slice1; segments: 2)
                         Hash Key: a.c2
                         ->  Seq Scan on t2 a
                   ->  Hash
                         ->  Redistribute Motion 2:2  (slice2; segments: 2)
                               Hash Key: b.c2
                               ->  Seq Scan on t2 b
             ->  Hash Join
                   Hash Cond: (c.c2 = d.c2)
                   ->  Seq Scan on t1 c
                   ->  Hash
                         ->  Seq Scan on t1 d
     Optimizer: legacy query optimizer
    (17 rows)

slice2 has a 2:2 redistribute motion to slice3, but slice3 only has 1 segment; this is because Append's numsegments was decided from the last subpath. To fix the issue we should use the max numsegments of the subpaths for Append. The issue was already fixed in 39856768; we are only adding tests for it now.
-
Submitted by Ning Yu
Introduced a new debugging extension gp_debug_numsegments to get / set the default numsegments used when creating tables.
- gp_debug_get_create_table_default_numsegments() gets the default numsegments.
- gp_debug_set_create_table_default_numsegments(text) sets the default numsegments in text format; valid values are:
  - 'FULL': all the segments;
  - 'RANDOM': pick a random set of segments each time;
  - 'MINIMAL': the minimal set of segments.
- gp_debug_set_create_table_default_numsegments(integer) sets the default numsegments directly; the valid range is [1, gp_num_contents_in_cluster].
- gp_debug_reset_create_table_default_numsegments(text) or gp_debug_reset_create_table_default_numsegments(integer) resets the default numsegments to the specified value, and the value can be reused later.
- gp_debug_reset_create_table_default_numsegments() resets the default numsegments to the value passed last time; if there has been no previous call, the value is 'FULL'.
Refactored the ICG test partial_table.sql to create partial tables with this extension.
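A minimal usage sketch of the functions described above, assuming the extension is installed in the current database (table name is illustrative):

```sql
CREATE EXTENSION gp_debug_numsegments;

-- Check the current default.
SELECT gp_debug_get_create_table_default_numsegments();

-- Make subsequent CREATE TABLE use the minimal set of segments.
SELECT gp_debug_set_create_table_default_numsegments('MINIMAL');
CREATE TABLE t_partial (c1 int) DISTRIBUTED BY (c1);

-- Restore the previously set default.
SELECT gp_debug_reset_create_table_default_numsegments();
```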
-
- 20 November 2018, 4 commits
-
-
Submitted by Daniel Gustafsson
There is no restriction in the SQL specification regarding duplicate column aliases in a CTE. Remove the specific check and align us with how queries work in upstream. Also fix test fallout to match the new behavior with slightly tweaked queries to keep output volume down.
-
Submitted by Ning Yu
A temp table is created during data reorganization; its numsegments must be the same as the original table's, otherwise some data will be lost after the reorganization. Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io> Reviewed-by: Zhenghua Lyu <zlv@pivotal.io> Reviewed-by: Shujie Zhang <shzhang@pivotal.io>
-
Submitted by BaiShaoqi
Commit 4c54c894 eliminated our divergence in deparsing filter expressions, mostly by adding parentheses. However, only two out of four expectfiles were updated, most likely because the author didn't run the tests on alternate systems (i.e. macOS). This leads to one of those duh-it-fails-on-my-mac failures that can be easily fixed. This commit brings the other two expectfiles up to date. Co-authored-by: Jesse Zhang <sbjesse@gmail.com> Reviewed by Paul Guo, Heikki Linnakangas
-
Submitted by Heikki Linnakangas
The total number of rows in table, stored in pg_class.reltuples, was calculated incorrectly. It was set to the number of rows in the sample, not the (estimated?) total number of rows in the table. We didn't notice, because most of the existing tests use small tables, so that the whole table fit in the sample. Put back code like we used to have, for collecting the totals. This papers over the assertion failure we're seeing in the 'aocs' test in the concourse pipeline, with ORCA, by changing the plan to one that doesn't crash. It needs a proper fix, of course, but we're tracking that separately as https://github.com/greenplum-db/gpdb/issues/6274.
-
- 19 November 2018, 5 commits
-
-
Submitted by Heikki Linnakangas
The user-visible feature here is to speed up ANALYZE on heap tables, by using the block-level sampling method from PostgreSQL instead of scanning the whole table. This refactors the ANALYZE code, to make it possible to call the upstream function to do the block-level sampling again. A secondary feature is that this enables the ANALYZE callback for Foreign Data Wrappers, introduced in PostgreSQL 9.2. It was disabled in the 9.2 merge, because the function signature had been changed in GPDB, making it cumbersome to call it. This patch reverts the acquire-function's signature to the same as in upstream, so the FDW callback now works again. This also moves the logic to decide which rels to analyze, based on the optimizer_analyze_root_partition or optimizer_analyze_midlevel_partition GUCs and the ROOTPARTITION option, wholly to get_rel_oids() in vacuum.c. It was mostly done there already, but there were some warts in analyze.c. None of the code in analyze.c should now treat partitions differently from normal inherited tables, which helps to minimize the diff vs upstream. (Except that the HLL code still heavily uses partition-related functions. That's a TODO.)
-
Submitted by Heikki Linnakangas
Commit 4a174240 changed the hash method, which changed the distribution of data. It adjusted many test cases that depended on the data distribution, but missed this one. This one isn't failing at the moment, but it was clearly meant to look at the relfrozenxid of the segment that it inserts all the data into. Change the value of the distribution key so that all the data falls into segment 0 again.
-
Submitted by Heikki Linnakangas
-
Submitted by ZhangJackey
The previous code made an UPDATE statement for the root and its children partitions when we reshuffle a partition table. This not only involves redundant work but also leads to an error while reshuffling a two-level partition table (because the mid-level partitions have no data). This commit does the following work:
* Only make an UPDATE statement for a leaf partition or a non-partition table.
* Refactor the reshuffle test cases. We remove the python udf code and use `gp_execute_on_server` and `gp_dist_random` to test replicated tables.
Co-authored-by: Shujie Zhang <shzhang@pivotal.io> Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
-
Submitted by Adam Lee
This commit adds support for the option `mpp_execute 'MASTER | ANY | ALL SEGMENTS'` on foreign tables. MASTER is the default: the FDW requests data from the master. With ANY, the FDW requests data from the master or any one segment, depending on which path costs less. With ALL SEGMENTS, the FDW requests data from all segments; wrappers need to have a policy matching the segments to data. For instance, file_fdw probes the mpp_execute value, then loads different files based on the segment number. But something like gpfdist on the foreign side doesn't need this, since it hands out a different slice of the data to each request, so all segments can request the same location.
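A sketch of how the option might be set, per the description above (the server and table names are hypothetical, and the exact wrapper setup is illustrative):

```sql
-- Default behavior: the master issues the FDW request.
CREATE FOREIGN TABLE events_on_master (id int, payload text)
    SERVER some_fdw_server
    OPTIONS (mpp_execute 'master');

-- Every segment requests data; the wrapper must map segments to
-- data, e.g. file_fdw loading a different file per segment number.
CREATE FOREIGN TABLE events_on_segments (id int, payload text)
    SERVER some_fdw_server
    OPTIONS (mpp_execute 'all segments');
```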
-
- 18 November 2018, 1 commit
-
-
Submitted by Daniel Gustafsson
Commit c20e5c4e added the gp_xml testsuite for GPDB XML tests in order not to pollute the upstream xml testsuite; it however failed to add the alternative output for clusters compiled without support for XML. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
- 17 November 2018, 1 commit
-
-
Submitted by Alexandra Wang
This reverts commit fd602f46. Co-authored-by: David Kimura <dkimura@pivotal.io> Co-authored-by: Alexandra Wang <lewang@pivotal.io>
-
- 16 November 2018, 3 commits
-
-
Submitted by Daniel Gustafsson
The xml type is not a default built-in type, but requires the server to be compiled with --with-libxml. Since the type doesn't seem to do much in this test, remove it to avoid failing tests in environments not compiled with libxml support. Adding an alternate output for builds without XML support would have solved the issue too, but it would render most, if not all, of this test useless. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
-
Submitted by Ekta Khanna
When a subtransaction is aborted, a backend calls AbortSubTransaction() instead of AbortTransaction(). Add a second call to CancelMyReaders() to AbortSubTransaction() to cover these cases. Co-authored-by: Asim R P <apraveen@pivotal.io> Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
-
Submitted by Ekta Khanna
A proper fix to ensure that a QE writer signals QE readers before aborting a transaction was made in PR #6089. The temporary workaround is no longer necessary. This reverts commit b6dd73a4. Co-authored-by: Asim R P <apraveen@pivotal.io>
-