- 05 Jun 2020, 1 commit
-
-
Committed by Lisa Owen
-
- 04 Jun 2020, 3 commits
-
-
Committed by Heikki Linnakangas
The logic with 'whichrow' and 'whichresultset' introduced in commit 991273b2 was slightly wrong. The last row (or first? not sure) of each result set was returned twice, and the corresponding number of rows at the end of the last result set were omitted. For example:

postgres=# select gp_segment_id, * from pg_locks;
 gp_segment_id |  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted | fastpath | mppsessionid | mppiswriter | gp_segment_id
---------------+------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------+--------------+-------------+---------------
            -1 | relation   |    13200 |    11869 |      |       |            |               |         |       |          | 1/8                | 28748 | AccessShareLock | t       | t        |            6 | t           |            -1
            -1 | virtualxid |          |          |      |       | 1/8        |               |         |       |          | 1/8                | 28748 | ExclusiveLock   | t       | t        |            6 | t           |            -1
             0 | virtualxid |          |          |      |       | 1/7        |               |         |       |          | 1/7                | 28750 | ExclusiveLock   | t       | t        |            6 | t           |             0
             1 | virtualxid |          |          |      |       | 1/7        |               |         |       |          | 1/7                | 28751 | ExclusiveLock   | t       | t        |            6 | t           |             1
             1 | virtualxid |          |          |      |       | 1/7        |               |         |       |          | 1/7                | 28751 | ExclusiveLock   | t       | t        |            6 | t           |             1
(5 rows)

Note how the last row is duplicated, and the row for 'virtualxid' from segment 2 is omitted. I noticed this while working on the PostgreSQL v12 merge: the 'lock' regression test was failing because of this. I'm not entirely sure why we haven't seen failures on 'master'; I think it's pure chance that none of the lines that the test prints have been omitted on 'master'. But because that test has been failing, I don't feel the need to add more tests for this.
-
Committed by Jesse Zhang
Regression test "gporca" started failing after merging d565edac. This reverts commit d565edac.
-
Committed by Hans Zeller
Orca uses this property for cardinality estimation of joins. For example, a join predicate foo join bar on foo.a = upper(bar.b) will have a cardinality estimate similar to foo join bar on foo.a = bar.b. Other functions, like foo join bar on foo.a = substring(bar.b, 1, 1), won't be treated that way, since they are more likely to have a greater effect on join cardinalities. Since this is specific to ORCA, we use logic in the translator to determine whether a function or operator is NDV-preserving. Right now, we consider a very limited set of operators; we may add more at a later time.

Let's assume that we join tables R and S and that f is a function or expression that refers to a single column and does not preserve NDVs. Let's also assume that p is a function or expression that also refers to a single column and that does preserve NDVs:

join predicate      card. estimate                          comment
------------------- --------------------------------------- -----------------------------
col1 = col2         |R| * |S| / max(NDV(col1), NDV(col2))   build an equi-join histogram
f(col1) = p(col2)   |R| * |S| / NDV(col2)                   use NDV-based estimation
f(col1) = col2      |R| * |S| / NDV(col2)                   use NDV-based estimation
p(col1) = col2      |R| * |S| / max(NDV(col1), NDV(col2))   use NDV-based estimation
p(col1) = p(col2)   |R| * |S| / max(NDV(col1), NDV(col2))   use NDV-based estimation
otherwise           |R| * |S| * 0.4                         this is an unsupported pred

Note that adding casts to these expressions is ok, as well as switching left and right side. Here is a list of expressions that we currently treat as NDV-preserving:

coalesce(col, const)
col || const
lower(col)
trim(col)
upper(col)

One more note: we need the NDVs of the inner side of Semi and Anti-joins for cardinality estimation, so only normal columns and NDV-preserving functions are allowed in that case.

This is a port of these GPDB 5X and GPOrca PRs:
https://github.com/greenplum-db/gporca/pull/585
https://github.com/greenplum-db/gpdb/pull/10090
-
- 03 Jun 2020, 6 commits
-
-
Committed by Shreedhar Hardikar
Duplicate sensitive HashDistribute Motions generated by ORCA get translated to Result nodes with hashFilter cols set. However, if the Motion needs to distribute based on a complex expression (rather than just a Var), the expression must be added to the targetlist of the Result node and then referenced in hashFilterColIdx. This can affect other operators above the Result node. For example, a Hash operator expects the targetlist of its child node to contain only elements that are to be hashed; additional expressions here can cause issues with memtuple bindings that can lead to errors. (E.g., the attached test case, when run without our fix, gives the error "invalid input syntax for integer".) This PR fixes the issue by adding an additional Result node on top of the duplicate sensitive Result node to project only the elements from the original targetlist in such cases.
-
Committed by Asim R P
Remember if the select call was interrupted. Act on it after emitting debug logs and checking cancel requests from dispatcher.
-
Committed by Asim R P
Previously, the result of the select() system call and the errno set by it were checked only after performing several function calls, including checking for interrupts and checkForCancelFromQD. That made it very likely for errno to change, losing the original value that was set by select(). This patch fixes it so that errno is checked immediately after the system call. This should address intermittent failures in CI with error messages like this: ERROR","58M01","interconnect error: select: Success"
-
Committed by Wen Lin
-
Committed by Andrey Borodin
Cluster on AO tables is implemented by sorting the entire AO table using tuple sort framework, according to a btree index defined on the table. A faster way to cluster is to scan the tuples in index-order, but this requires index-scan support. Append-optimized tables do not support index-scans currently, but when this support is added, the cluster operation can be enhanced accordingly. Author: Andrey Borodin <amborodin@acm.org> Reviewed and slightly edited by: Asim R P <pasim@vmare.com> Merges GitHub PR #9996
-
Committed by Hans Zeller
* Make DbgPrint and OsPrint methods on CRefCount

Create a single DbgPrint() method on the CRefCount class. Also create a virtual OsPrint() method, making some objects derived from CRefCount easier to print from the debugger. Note that not all the OsPrint methods had the same signatures; some additional OsPrintxxx() methods have been generated for that.

* Making print output easier to read, print some stuff on demand

Required columns in required plan properties are always the same for a given group. Also, equivalent expressions in required distribution properties are important in certain cases, but in most cases they disrupt the display and make it harder to read. Added two traceflags, EopttracePrintRequiredColumns and EopttracePrintEquivDistrSpecs, that have to be set to print this information. If you want to go back to the old display, use these options when running gporca_test: -T 101016 -T 101017

* Add support for printing alternative plans

A new method, CEngine::DbgPrintExpr(), can be called from COptimizer::PexprOptimize to allow printing of the best plan for different contexts. This is only enabled in debug builds. To use this:

- run an MDP using gporca_test, using a debug build
- print out memo after optimization (-T 101006 -T 101010)
- set a breakpoint near the end of COptimizer::PexprOptimize()
- if, after looking at the contents of memo, you want to see the optimal plan for context c of group g, do the following: p eng.DbgPrintExpr(g, c)

You could also get the same info from the memo printout, but it would take a lot longer.
-
- 02 Jun 2020, 2 commits
-
-
Committed by Heikki Linnakangas
Introduce a new DistributionKeyElem to hold each element in the list of columns (and optionally their opclasses) in the DISTRIBUTED BY (<col> [opclass], ...) syntax. Previously, we used IndexElem, which conveniently also holds a column name and its opclass, but it was not a very good fit because IndexElem also contains many other fields that are not needed. Using a new node type specifically for DISTRIBUTED BY makes the code dealing with distribution key lists clearer. To compare, PostgreSQL v10 uses a struct called PartitionElem for similar purposes in the PARTITION BY clause. (But that is not to be confused with the PartitionElem struct in GPDB 6 and below, which is also related to partitioning syntax but is quite different!)

Unlike IndexElem, the new node type includes a 'location' field, to provide error position information in error messages. This can be seen in the error message changes in the 'gp_create_table' test.

While we're at it, remove the quotes around DISTRIBUTED BY in the "column <col> named in DISTRIBUTED BY clause does not exist" message, for consistency with the same error message thrown with CREATE TABLE AS from the setQryDistributionPolicy() function, and with the "duplicate column in DISTRIBUTED BY clause" error. The error thrown in the CTAS case was not covered by existing tests, so also add a test for that.

Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
-
Committed by Richard Guo
In PostgreSQL, semi-joins are implemented with the JOIN_SEMI join type, or by first eliminating duplicates from the inner side and then performing a normal inner join (that's JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER). GPDB has a third way to implement them: perform an inner join first, and then eliminate duplicates from the result. This is performed by a UniquePath above the join. There are two ways to implement the UniquePath: sorting and hashing. But if hashing is not enabled, we should not consider it.

This patch fixes github issue #8437.

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
Reviewed-by: Paul Guo <pguo@pivotal.io>
-
- 01 Jun 2020, 2 commits
-
-
Committed by Hubert Zhang
We now use the initplan id to differentiate the tuplestores used by different INITPLAN functions, and each INITPLAN writes its function result into a different tuplestore. Also fix the bug which appended the initplan in the wrong place; it could generate wrong results in the UNION ALL case.

Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
-
Committed by Hao Wu
After some refactoring of the path/planner code, the code assumes that the modulus of the hash function of a Motion equals the Gang size of the parent slice. However, that isn't always true if some tables are distributed on only part of the segments, which can happen during gpexpand. Assume the following case: there is a GPDB cluster with 2 segments, and the cluster is running `gpexpand` to add 1 segment. Now t1 and t2 are distributed on the first 2 segments, while t3 has finished data transfer, i.e. t3 is distributed on three segments. See the following plan:

gpadmin=# explain select t1.a, t2.b from t1 join t2 on t1.a = t2.b union all select * from t3;
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=2037.25..1230798.15 rows=7499310 width=8)
   ->  Append  (cost=2037.25..1080811.95 rows=2499770 width=8)
         ->  Hash Join  (cost=2037.25..1005718.85 rows=2471070 width=8)
               Hash Cond: (t2.b = t1.a)
               ->  Redistribute Motion 2:3  (slice2; segments: 2)  (cost=0.00..2683.00 rows=43050 width=4)
                     Hash Key: t2.b
                     ->  Seq Scan on t2  (cost=0.00..961.00 rows=43050 width=4)
               ->  Hash  (cost=961.00..961.00 rows=28700 width=4)
                     ->  Seq Scan on t1  (cost=0.00..961.00 rows=28700 width=4)
         ->  Seq Scan on t3  (cost=0.00..961.00 rows=28700 width=8)
 Optimizer: Postgres query optimizer

The slice2 shows that t2 will be redistributed to all 3 segments and joined with t1. Since t1 is distributed only on the first 2 segments, the data from t2 redistributed to the third segment couldn't have a match, which returns wrong results. The root cause is that the modulus of the cdb hash used to redistribute t2 is 3, i.e. the Gang size of the parent slice. To fix this issue, we add a field in Motion to record the number of receivers.

With this patch, the plan generated is:

gpadmin=# explain select * from t1 join t2 on t1.a = t2.b union all select a,a,b,b from t3;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=2.26..155.16 rows=20 width=16)
   ->  Append  (cost=2.26..155.16 rows=7 width=16)
         ->  Hash Join  (cost=2.26..151.97 rows=7 width=16)
               Hash Cond: (t1.a = t2.b)
               ->  Seq Scan on t1  (cost=0.00..112.06 rows=5003 width=8)
               ->  Hash  (cost=2.18..2.18 rows=2 width=8)
                     ->  Redistribute Motion 2:3  (slice1; segments: 2)  (cost=0.00..2.18 rows=3 width=8)
                           Hash Key: t2.b
                           Hash Module: 2
                           ->  Seq Scan on t2  (cost=0.00..2.06 rows=3 width=8)
         ->  Seq Scan on t3  (cost=0.00..3.06 rows=2 width=16)
 Optimizer: Postgres query optimizer

Note: the interconnect for the redistribute motion is still 2:3, but the data transfer only happens in 2:2.

Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
-
- 31 May 2020, 4 commits
-
-
Committed by Heikki Linnakangas
Because why not? This was forbidden back in 2012, with an old JIRA ticket that said:

MPP-15735 - Inheritance is not supported with column oriented table

This shouldn't be possible:

Create table parent_tb2 (a1 int, a2 char(5), a3 text, a4 timestamp, a5 date, column a1 ENCODING (compresstype=zlib,compresslevel=9,blocksize=32768)) with (appendonly=true,orientation=column) distributed by (a1);
Create table child_tb4 (c1 int, c2 char) inherits(parent_tb2) with (appendonly = true, orientation = column);

The reason is, dealing with column compression and inheritance is tricky and inheritance shouldn't even be supported.

That explanation didn't go into any details, but I can see the trickiness. There are many places you can specify column compression options even without inheritance: in the gp_default_storage_options GUC, in an ENCODING directive in the CREATE TABLE command, in a "COLUMN foo ENCODING ..." directive in the CREATE TABLE command, or in the datatype's default storage options. Inheritance adds another dimension to it: should the current gp_default_storage_options override the options inherited from the parent? What if there are multiple parents with conflicting options?

The easiest solution is to never inherit the column encoding options from the parent. That's a bit lame, but there's some precedent for that with partitions: if you ALTER TABLE ADD PARTITION, the encoding options are also not copied from the parent. So that's what this patch does.

One interesting corner case is to specify the options for an inherited column with "CREATE TABLE (COLUMN parentcol ENCODING ...) INHERITS (...)". Thanks to the previous refactoring, that works, even though 'parentcol' is not explicitly listed in the CREATE TABLE command but inherited from the parent. To make dump & restore of that work correctly, modify pg_dump so that it always uses the "COLUMN foo ENCODING ..." syntax to specify the options, instead of just tacking "ENCODING ..." after the column definition.

One excuse for doing this right now is that even though we had forbidden "CREATE TABLE <tab> INHERITS (<parent>)" on AOCO tables, we missed "ALTER TABLE <tab> INHERIT <parent>". That was still allowed, and if you did that you got an inherited AOCO table that worked just fine, except that when it was dumped with pg_dump, the dump was unrestorable. It would have been trivial to add a check to forbid "ALTER TABLE INHERIT" on an AOCO table instead, but it's even better to allow it.

Fixes https://github.com/greenplum-db/gpdb/issues/10111
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Heikki Linnakangas
Before this commit, this worked:

CREATE TABLE mpp17740 (
  a integer ENCODING (compresstype=none,blocksize=32768,compresslevel=0),
  b integer ENCODING (compresstype=zlib,blocksize=32768,compresslevel=2),
  e date ENCODING (compresstype=none,blocksize=32768,compresslevel=0)
) WITH (appendonly='true', orientation='column')
DISTRIBUTED BY ("a")
PARTITION BY RANGE("e")
(
  PARTITION mpp17740_20120520 START ('2012-05-20'::date) END ('2012-05-21'::date) WITH (appendonly='true'),
  PARTITION mpp17740_20120523 START ('2012-05-23'::date) END ('2012-05-24'::date) WITH (appendonly='true'),
  PARTITION mpp17740_20120524 START ('2012-05-24'::date) END ('2012-05-25'::date) WITH (appendonly='true')
);

But this errored out:

CREATE TABLE mpp17740 (
  a integer,
  b integer,
  e date,
  COLUMN "a" ENCODING (compresstype=none,blocksize=32768,compresslevel=0),
  COLUMN "b" ENCODING (compresstype=zlib,blocksize=32768,compresslevel=2),
  COLUMN "e" ENCODING (compresstype=none,blocksize=32768,compresslevel=0)
) WITH (appendonly='true', orientation='column')
DISTRIBUTED BY ("a")
PARTITION BY RANGE("e")
(
  PARTITION mpp17740_20120520 START ('2012-05-20'::date) END ('2012-05-21'::date) WITH (appendonly='true'),
  PARTITION mpp17740_20120523 START ('2012-05-23'::date) END ('2012-05-24'::date) WITH (appendonly='true'),
  PARTITION mpp17740_20120524 START ('2012-05-24'::date) END ('2012-05-25'::date) WITH (appendonly='true')
);

psql: NOTICE:  CREATE TABLE will create partition "mpp17740_1_prt_mpp17740_20120520" for table "mpp17740"
psql: ERROR:  ENCODING clause only supported with column oriented partitioned tables

That seems inconsistent. It's also problematic because in the next commit, I'm going to change pg_dump to use the latter syntax. Relax the checks so that the latter syntax doesn't throw an error. This includes a little refactoring to the way CREATE TABLE AS is dispatched. The internal CreateStmt that is constructed to create the table is now saved in the QD and dispatched to the QEs, instead of being re-constructed in each segment separately. A wholesale approach like that seems nicer than dispatching the looked-up tablespace name separately, and we would've needed an exception to the checks for ENCODING options to allow the QE to parse them.

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Heikki Linnakangas
These checks are supposed to run on the original user-supplied clauses, not the clauses derived from the user-supplied values and defaults. The derived clauses always pass the checks.

Fixes https://github.com/greenplum-db/gpdb/issues/10115
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Heikki Linnakangas
Instead of parsing them in the parse analysis phase, delay it until DefineRelation(), after merging the definitions of inherited attributes. We don't support the INHERIT clause on an AOCO table today, but if we lift that limitation (as I'm planning to do in a follow-up commit), we need to be able to apply "COLUMN <col> ENCODING ..." clauses to inherited columns too. Before this, all callers of create_ctas_internal() had to also call AddDefaultRelationAttributeOptions(), in case it was an AOCO table. Now that DefineRelation() handles the default storage options, that's no longer needed.

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
- 30 May 2020, 2 commits
-
-
Committed by Chris Hajas
Previously, in the DPv2 transform (exhaustive2), while we penalized cross joins for the remaining joins in the greedy phase, we did not for the first join, which in some cases selected a cross join. This ended up selecting a poor join order in many cases and went against the intent of the alternative being generated, which is to minimize cross joins. We also increase the default penalty from 5 to 1024, which is the value we use in the cost model during the optimization stage. The greedy alternative also wasn't kept in the heap, so we include that now too.
-
Committed by Chris Hajas
In cases where Orca generates a NLJ with a parameter on the inner side, the executor will not pass the EXEC_FLAG_REWIND flag down, as it assumes the inner side will always need to be rescanned. The Materialize node will therefore not have its rewind flag set and can act as a no-op. This is not always correct. While the executor will set EXEC_FLAG_REWIND if the Materialize is directly above a Motion, it does not recognize the case where the Materialize is on the inner side with other nodes between it and the Motion, even though the Materialize serves to prevent a rescan of the underlying Motion node. This causes the execution to fail with: `Illegal rescan of motion node: invalid plan (nodeMotion.c:1623)` as it would attempt to rescan a motion. Since Orca only produces a Materialize when necessary, either for performance reasons or to prevent a rescan of an underlying Motion, EXEC_FLAG_REWIND should be set for any Materialize generated by Orca. Below is a valid plan generated by Orca:

```
 Result  (cost=0.00..3448.01 rows=1 width=4)
   ->  Nested Loop  (cost=0.00..3448.01 rows=1 width=1)
         Join Filter: true
         ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.00 rows=2 width=4)
               ->  Seq Scan on foo1  (cost=0.00..431.00 rows=1 width=4)
         ->  Result  (cost=0.00..431.00 rows=1 width=1)
               Filter: (foo1.a = foo2.a)
               ->  Materialize  (cost=0.00..431.00 rows=1 width=4)
                     ->  Hash Semi Join  (cost=0.00..431.00 rows=1 width=4)
                           Hash Cond: (foo2.b = foo3.b)
                           ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..0.00 rows=1 width=8)
                                 ->  Bitmap Heap Scan on foo2  (cost=0.00..0.00 rows=1 width=8)
                                       Recheck Cond: (c = 3)
                                       ->  Bitmap Index Scan on f2c  (cost=0.00..0.00 rows=0 width=0)
                                             Index Cond: (c = 3)
                           ->  Hash  (cost=431.00..431.00 rows=1 width=4)
                                 ->  Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..431.00 rows=2 width=4)
                                       ->  Seq Scan on foo3  (cost=0.00..431.00 rows=1 width=4)
 Optimizer: Pivotal Optimizer (GPORCA)
```

Co-authored-by: Chris Hajas <chajas@pivotal.io>
Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
-
- 29 May 2020, 4 commits
-
-
Committed by Heikki Linnakangas
Commit 152783e1 added a test along with the fix, but I broke it soon after in commit c91e320c. I changed the name of the test table from 'test' to 'wide_width_test', but forgot to adjust UPDATE command accordingly, so the UPDATE did nothing. Fix that, and also change the test to not rely on pg_attribute to create the test table. I noticed this as I was working on the PostgreSQL v12 merge: the test was throwing an error because a new 'anyarray' column (attmissingval) was added to pg_attribute in upstream. That caused the CTAS to fail. Listing the table columns explicitly avoids that problem. I verified this by reverting commit 152783e1 locally, and running the test. It now throws an integer overflow error as intended.
-
Committed by ggbq
In most cases the variable LN_S is 'ln -s'; however, LN_S can be 'cp -pR' if configure finds that the file system does not support symbolic links. That is incompatible with linking a file in a subdirectory to a relative path, so cd into the subdirectory first before linking the file.
-
Committed by Heikki Linnakangas
processIncomingChunks() receives a list of chunks from one sender, and then calls addChunkToSorter() on each chunk. addChunkToSorter() looks up some things based on the sender. But since all the chunks came from the same sender, we can move the lookups outside of the loop and save some overhead.

Reviewed-by: Gang Xiong <gxiong@pivotal.io>
-
Committed by Hubert Zhang
When introducing a new mirror, we need two steps:

1. start the mirror segment
2. update the gp_segment_configuration catalog

Previously gp_add_segment_mirror was called to update the catalog, but the dbid was chosen by get_availableDbId(), which cannot be guaranteed to match the dbid in internal.auto.conf.

Reported in issue #9837.
Reviewed-by: Paul Guo <pguo@pivotal.io>
-
- 28 May 2020, 2 commits
-
-
Committed by Lena Hunter
* clarifying pg_upgrade note
* gpinitsystem -I second format
* gpinitsystem edits
* edits from review
-
Committed by Sambitesh Dash
This is a continuation of commit 456b2b31 in GPORCA. Add more errors to the list of errors that don't get logged in the log file. We also remove the code that writes to std::cerr, which generated a not very nice looking log message. Instead, add the information on whether the error was unexpected to another log message that we also generate.
-
- 27 May 2020, 2 commits
-
-
Committed by 盏一
Just like `man strtol` says: > the calling program should set errno to 0 before the call, and then determine if > an error occurred by checking whether errno has a nonzero value after the call.
-
Committed by Heikki Linnakangas
The case in 'bfv_index' is identical to the one in 'bfv_joins', except that the case in 'bfv_index' contained a few more queries. Remove the redundant case from 'bfv_joins'.
-
- 26 May 2020, 3 commits
-
-
Committed by Heikki Linnakangas
It is no longer used.
-
Committed by Heikki Linnakangas
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
-
Committed by Heikki Linnakangas
To make that possible, add functions to tuplestore.c to share the store across processes, similar to how NTupleStore can be shared.

Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
-
- 25 May 2020, 4 commits
-
-
Committed by Heikki Linnakangas
The commit message of commit a7412321 explained the reason well, but let's also add it in a comment in the test itself.
-
Committed by 盏一
The distributed xid is just a plain counter; there is no WRAPAROUND in it.
-
Committed by Jinbao Chen
The psql client ignored the relation storage when it built the \dm command, so the output of \dm was empty. Add the correct relation storage check to the command.
-
Committed by Pengzhou Tang
This issue is exposed when doing an experiment to remove the special "eval_stable_functions" handling in evaluate_function(): the qp_functions_in_* test cases get stuck sometimes, and it turns out to be a gp_interconnect_id disorder issue.

Under the UDPIFC interconnect, gp_interconnect_id is used to distinguish the executions of an MPP-fied plan in the same session. On the receiver side, packets with a smaller gp_interconnect_id are treated as 'past' packets, and the receiver will stop the sender from sending them.

The RCA of the hang is:
1. The QD calls InitSliceTable() to advance the gp_interconnect_id and stores it in the slice table.
2. In CdbDispatchPlan->exec_make_plan_constant(), the QD finds some stable function that needs to be simplified to a const, so it executes this function first.
3. The function contains SQL, so the QD inits another slice table and advances the gp_interconnect_id again, then dispatches the new plan and executes it.
4. After the function is simplified to a const, the QD continues to dispatch the previous plan; however, its gp_interconnect_id is now the older one. When a packet comes, if the receiver hasn't set up the interconnect yet, the packet is handled by handleMismatch() and treated as a 'past' packet, so the senders are stopped early by the receiver. When the receiver later finishes the setup of the interconnect, it cannot get any packets from the senders and gets stuck.

To resolve this, we advance the gp_interconnect_id when a plan is really dispatched; plans are dispatched sequentially, so a later-dispatched plan will have a higher gp_interconnect_id. Also limit the usage of gp_interconnect_id in the rx thread of UDPIFC; we prefer to use sliceTable->ic_instance_id in the main thread.

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Asim R P <apraveen@pivotal.io>
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
-
- 22 May 2020, 5 commits
-
-
Committed by (Jerome)Junfeng Yang
Enable COPY FROM for foreign tables to remove the external table dependency in copy.c. This commit backports a small part of commit 3d956d95 from Postgres.

Remove the fileam.h include from non-external code, so we can extract external tables into an extension later. Move the function external_set_env_vars to the URL component, since extvar_t is defined in url.h.

Implement the external table FDW's BeginForeignInsert and EndForeignInsert, so COPY FROM goes through the FDW routine instead of the external insert.

Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
-
Committed by Huiliang.liu
pkill is not in the /bin/ folder on Ubuntu, so gpfdist can't be killed in the sreh test, which would make the gpfdist regression test fail.
-
Committed by Hao Wu
`select c.c1, c.c2 from d1 c union all select a.c1, a.c2 from d2 a;`

Both d1 and d2 are replicated tables, but their `numsegments` in gp_distribution_policy are different. This could happen during gpexpand. The bug exists in the function cdbpath_create_motion_path: both `subpath->locus` and `locus` are SegmentGeneral, but the locuses are not equal.

Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
-
Committed by Pengzhou Tang
This is mainly to resolve slow response to sequence requests under the TCP interconnect. Sequence requests are sent through libpqs from QEs to the QD (we call them dispatcher connections). In the past, under the TCP interconnect, the QD checked the events on dispatcher connections every 2 seconds, which is obviously inefficient. Under UDPIFC mode, the QD also monitors the dispatcher connections when receiving tuples from QEs, so the QD can process sequence requests in time; this commit applies the same logic to the TCP interconnect.

Reviewed-by: Hao Wu <gfphoenix78@gmail.com>
Reviewed-by: Ning Yu <nyu@pivotal.io>
-
Committed by Hubert Zhang
* Fix flaky test: refresh matview should use 2PC commit.

We have a refresh matview with unique index check test case:

CREATE TABLE mvtest_foo(a, b) AS VALUES(1, 10);
CREATE MATERIALIZED VIEW mvtest_mv AS SELECT * FROM mvtest_foo distributed by(a);
CREATE UNIQUE INDEX ON mvtest_mv(a);
INSERT INTO mvtest_foo SELECT * FROM mvtest_foo;
REFRESH MATERIALIZED VIEW mvtest_mv;

Only one segment contains tuples and will fail the unique index check. Without 2PC, the other segments will just commit their transactions successfully. Since one segment errors out, the QD sends a cancel signal to all the segments, and if these segments have not finished the commit process, they report the warning:

DETAIL: The transaction has already changed locally, it has to be replicated to standby.

Reviewed-by: Paul Guo <paulguo@gmail.com>
-