- 02 Oct 2019, 2 commits
-
-
Committed by Chen Mulong
Due to an option index error, exclude-tests, ignore-plans, prehook and print-failure-diffs were not parsed correctly from the command line.
-
Committed by Adam Berlin
- as an experiment. We want to see if it becomes less flaky.
-
- 30 Sep 2019, 1 commit
-
-
Committed by Heikki Linnakangas
GPDB_EXTRA_COL() is only supposed to affect the next DATA line, but we failed to reset it between DATA lines, so the setting stayed in effect for all subsequent lines, too. This is relatively harmless, because it is mostly only used for the prodataaccess column, which the system ignores anyway. The only other place it was used was to set proexeclocation for pg_event_trigger_dropped_objects(), which also happened to do little damage to the subsequent lines, because all the subsequent lines include the GPDB-specific columns, which overrode the bogus GPDB_EXTRA_COL() setting. This was broken in 6X_STABLE too, but it's too late to change the catalogs there. Bump catversion. Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io> Reviewed-by: Adam Lee <ali@pivotal.io>
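The fix amounts to treating the override as one-shot state that is cleared after each DATA line. A minimal sketch (the directive names and data shapes here are illustrative, not the actual catalog-generation script):

```python
def emit_data_lines(directives):
    """Illustrative sketch: an EXTRA_COL override applies to the next
    DATA line only, and must be reset once that line is emitted."""
    extra_col = None   # stands in for the GPDB_EXTRA_COL() setting
    out = []
    for kind, payload in directives:
        if kind == "EXTRA_COL":
            extra_col = payload            # affects only the next DATA line
        elif kind == "DATA":
            out.append((payload, extra_col))
            extra_col = None               # the fix: reset between DATA lines
    return out
```

Without the final reset, the second DATA line below would wrongly inherit "x".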
-
- 26 Sep 2019, 2 commits
-
-
Committed by Georgios Kokolatos
The cause of the PANIC was an incorrectly populated list containing the namespace information for the affected relation. A GrantStmt contains the necessary objects in a list named objects. It is initially populated during parsing (via the privilege_target rule) and processed during parse analysis, based on the target type and object type, into RangeVar nodes, FuncWithArgs nodes, or plain names. In Greenplum, the catalog information about partition hierarchies is not propagated to all segments. This information needs to be processed in the dispatcher and added back into the parsed statement for the segments to consume. In this commit, the partition hierarchy information is expanded only for the target and object types that require it. The parsed statement is updated before dispatching, regardless of partitioned references, for the required types. The privileges tests have been updated to also check for privileges on the segments. Problem identified and initial patch by Fenggang <ginobiliwang@gmail.com>, reviewed and refactored by me.
-
Committed by Ashwin Agrawal
The current code for COPY FROM picks COPY_DISPATCH mode even for non-distributed/non-replicated tables, which causes a crash. It should use COPY_DIRECT, the normal/direct mode for such tables. The crash was exposed by the following SQL commands:
CREATE TABLE public.heap01 (a int, b int) distributed by (a);
INSERT INTO public.heap01 VALUES (generate_series(0,99), generate_series(0,98));
ANALYZE public.heap01;
COPY (select * from pg_statistic where starelid = 'public.heap01'::regclass) TO '/tmp/heap01.stat';
DELETE FROM pg_statistic where starelid = 'public.heap01'::regclass;
COPY pg_statistic from '/tmp/heap01.stat';
Important note: yes, it is known and strongly recommended not to touch `pg_statistic` or any other catalog table this way, but panicking is no good either. After this change, the copy into `pg_statistic` ERRORs out "correctly" with `cannot accept a value of type anyarray` instead of crashing, as there just isn't any way at the SQL level to insert data into pg_statistic's anyarray columns. Refer: https://www.postgresql.org/message-id/12138.1277130186%40sss.pgh.pa.us
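The corrected mode selection can be sketched as follows (the enum values and the helper function are illustrative; the real logic lives in GPDB's COPY code, not in this exact shape):

```python
COPY_DIRECT, COPY_DISPATCH, COPY_EXECUTE = range(3)

def choose_copy_mode(on_dispatcher, distribution_policy):
    """Pick the COPY FROM mode; the bug was choosing COPY_DISPATCH even
    when the table has no distribution policy (e.g. a catalog table)."""
    if distribution_policy is None:
        return COPY_DIRECT    # the fix: load non-distributed tables directly
    if on_dispatcher:
        return COPY_DISPATCH  # QD: dispatch rows to the segments
    return COPY_EXECUTE       # QE: receive rows from the QD
```

With the fix, a catalog table like pg_statistic takes the COPY_DIRECT path on the dispatcher instead of being dispatched.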
-
- 25 Sep 2019, 1 commit
-
-
Committed by Adam Berlin
This issue was causing the build pipeline to go red. Reverting for now. This reverts commit ba6148c6.
-
- 24 Sep 2019, 10 commits
-
-
Committed by Fenggang
It has been discovered in GPDB v6 and above that a 'GRANT ALL ON ALL TABLES IN SCHEMA XXX TO YYY;' statement leads to a PANIC. From the resulting core dumps, now-obsolete code in the QD that tried to encode the objects of a partition reference into RangeVars was identified as the culprit. The list to which the resulting vars were anchored expected and handled only StrVars. The original code was added on the premise that catalog information was not available in the segments; it also tried to optimize caching, yet was never fully written. Instead, the offending block is removed, which solves the issue and allows for greater alignment with upstream. Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Heikki Linnakangas
I did, in fact, add a test for that case in the previous commit, so the comment that we couldn't repro it was not accurate.
-
Committed by Heikki Linnakangas
In PostgreSQL, there can be multiple SubPlan expressions referring to the outputs of the same subquery, but this mechanism had been lobotomized in GPDB. There was a pass over the plan tree, fixup_subplans(), that duplicated any subplans that were referred to more than once, and the rest of the GPDB planner and executor code assumed that there is only one reference to each subplan. Refactor the GPDB code, mostly cdbparallelize(), to remove that assumption, and stop duplicating SubPlans.
* In cdbparallelize(), instead of immediately recursing into the plan tree of each SubPlan, process the subplan list in glob->subplans as a separate pass. Add a new 'recurse_into_subplans' argument to plan_tree_walker() to facilitate that; all other callers pass 'true' so that they still recurse.
* Replace the SubPlan->qDispSliceId and initPlanParallel fields with new arrays in PlannerGlobal.
* In FillSliceTable(), keep track of which subplans have already been recursed into, and only recurse on the first encounter. (I've got a feeling that the executor startup is doing more work than it should need to, to set up the slice table. The slice information is available when the plan is built, so why does the executor need to traverse the whole plan to build the slice table? But I'll leave refactoring that for another day.)
* Move the logic to remove unused subplans into cdbparallelize(). This used to be done as a separate pass from standard_planner(), but after refactoring cdbparallelize(), it is now very convenient and logical to do the unused subplan removal there, too.
* Early in the planner, wrap SubPlan references in PlaceHolderVars. This is needed in case a SubPlan reference gets duplicated to two different slices. A single subplan can only be executed from one slice, because the motion nodes in the subplan are set up to send to a particular parent slice. The PlaceHolderVar makes sure that the SubPlan is evaluated only once, and if it's needed above the bottommost Plan node where it's evaluated, its value is propagated to the upper Plan nodes in their targetlists.
There are many other plan tree walkers that still recurse into subplans from every SubPlan reference, but AFAICS recursing twice is harmless for all of them. It would be nice to refactor them, too, but I'll leave that for another day. Reviewed-by: Bhuvnesh Chaudhary <bhuvnesh2703@gmail.com> Reviewed-by: Richard Guo <riguo@pivotal.io>
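The "recurse only on first encounter" rule in FillSliceTable() boils down to tracking visited subplan IDs. A hedged sketch (the real code walks Plan trees; here the references are just a flat list of IDs):

```python
def visit_subplans(subplan_refs):
    """Return subplan ids in first-encounter order, recursing into each
    referenced subplan exactly once even if it is referenced many times."""
    visited = set()
    order = []
    for plan_id in subplan_refs:
        if plan_id in visited:
            continue            # later references reuse the first visit
        visited.add(plan_id)
        order.append(plan_id)   # the real code would recurse into the tree here
    return order
```

Before this change, each duplicate reference forced a duplicated subplan; after it, duplicates are simply skipped.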
-
Committed by Heikki Linnakangas
The mock setup in the old test was very limited: the Node structs it set up were left as zeros, and were even allocated with incorrect lengths (SubPlan vs. SubPlanState). It worked just enough for the codepath it was testing, but IMHO it's better to test the error "in vivo", and it requires less setup, too. So remove the mock test, and replace it with a fault injector test that exercises the same codepath.
-
Committed by Heikki Linnakangas
I thought that after adding the fixup_subplans() pass in commit d0aea184, we shouldn't need the check for subplans in FindEqKey, which was added in commit 9d63d3c1, as long as we had the walker/mutator support. But for some reason, the regression test added in commit 9d63d3c1 passes when the contains_subplan() check is removed, even without the walker/mutator support, so I'm not sure where exactly that case is still blocked. But in any case, let's be tidy, even if there is no ill effect at the moment. The missing walker/mutator support was already noted by @hsyuan in the comments on PR #2444, but we didn't act on it then.
-
Committed by Zhenghua Lyu
Currently, for partitioned tables we maintain stat info for the root table if the GUC optimizer_analyze_root_partition is set, so that we can use the root's stats directly. Previously we used the largest child's stats for the root partition, which can lead to a serious issue. Consider a partitioned table t where all data with a NULL partition key goes into the default partition, which happens to be the largest child. Then, for a query that joins t with another table on the partition key, we will estimate a result size of 0, because we use the default partition's stats, which contain only NULL partition keys. What is worse, we may broadcast the join result. This commit fixes the issue but leaves some future work: maintain STATISTIC_KIND_MCELEM and STATISTIC_KIND_DECHIST for the root table. This commit sets the GUC gp_statistics_pullup_from_child_partition to false by default. The whole logic is now:
* if gp_statistics_pullup_from_child_partition is true, we try to use the largest child's stats;
* if gp_statistics_pullup_from_child_partition is false, we first try to fetch the root's stats:
  - if the root has stat info, that's fine, we just use it;
  - otherwise, we still try to use the largest child's stats.
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
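The lookup order above can be sketched in a few lines (the function and argument names are invented for illustration; the real decision is made inside the planner's stats lookup):

```python
def stats_for_root(pullup_from_child, root_stats, largest_child_stats):
    """Choose which stats to use for a partitioned table's root."""
    if pullup_from_child:
        # gp_statistics_pullup_from_child_partition = true: old behavior
        return largest_child_stats
    # default (false): prefer the root's own stats, if ANALYZE collected them
    if root_stats is not None:
        return root_stats
    return largest_child_stats   # fall back to the largest child
```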
-
Committed by Heikki Linnakangas
Printing the slice information makes sense for Init Plans, which are dispatched separately, before the main query. But not so much for other Sub Plans, which are just part of the plan tree; there is no dispatching or motion involved at such SubPlans. The SubPlan might *contain* Motions, but we print the slice information for those Motions separately. The slice information was always just the same as the parent node's, which adds no information, and can be misleading if it makes the reader think that there is inter-node communication involved in such SubPlans.
-
Committed by Ashwin Agrawal
The gp_tablespace_with_faults test writes a no-op record and waits for the mirror to replay it before deleting the tablespace directories. This step sometimes fails in CI, causing flaky behavior. This is due to existing behavior of the startup and walreceiver processes: if the primary writes a big xlog record (one spanning multiple pages), flushes only part of it due to XLogBackgroundFlush(), and then restarts before committing the transaction, the mirror receives only the partial record and waits for the complete one. Meanwhile, after recovery, the no-op record gets written in place of that big record, and the startup process on the mirror keeps waiting to receive xlog beyond the previously received point before proceeding. Hence, as a temporary workaround until the actual code problem is resolved, and to avoid failures in this test, switch xlog before emitting the no-op record, so that the no-op record lands far from the previously emitted xlog record.
-
Committed by Jimmy Yih
When the gp_use_legacy_hashops GUC was set, CTAS would not assign the legacy hash operator class to the new table. This is because CTAS goes through a different code path and uses the first operator class of the SELECT's result when no distribution key is provided.
-
Committed by Ashuka Xue
After commit 1c2489d0, nMotionNodes is no longer part of the Plan struct.
-
- 23 Sep 2019, 8 commits
-
-
Committed by Zhenghua Lyu
In Greenplum, when estimating costs, most of the time we are in a global view, but sometimes we should shift to a local (per-segment) view. Postgres does not suffer from this issue because everything is in one single segment. The function `estimate_hash_bucketsize` comes from Postgres and plays a very important role in the cost model of hash join. It should output a result based on the local view. However, its input parameters, such as the number of rows in a table and the ndistinct of the relation, are all taken from a global view (across all segments). So we have to compensate for that. The logic is:
1. For a broadcast-like locus, the global ndistinct is the same as the local one; we compensate by `ndistinct *= numsegments`.
2. For the case that the hash key is collocated with the locus, each segment has `ndistinct/numsegments` distinct groups, so no compensation is needed.
3. Otherwise, the locus is partitioned and not collocated with the hash keys; for these cases, we first estimate the local distinct group count, and then do the compensation.
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
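The three compensation cases can be sketched as follows (illustrative only; the real adjustment is applied to the inputs of estimate_hash_bucketsize, and the third case's local estimation is elided here):

```python
def compensated_ndistinct(global_nd, numsegments, locus, collocated):
    """Adjust a cluster-wide ndistinct so per-segment math comes out right."""
    if locus == "broadcast":
        # every segment holds all groups, so local ndistinct == global;
        # pre-multiplying cancels the model's later division by numsegments
        return global_nd * numsegments
    if collocated:
        # groups are split evenly across segments: local == global/numsegments,
        # which is exactly what the uncompensated math already assumes
        return global_nd
    # partitioned but not collocated with the hash keys: estimate the local
    # distinct-group count first (not sketched), then compensate the same way
    raise NotImplementedError("local distinct estimation not sketched")
```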
-
Committed by Heikki Linnakangas
The function did completely different things for different callers, so it seems better to move the logic to the callers instead. Reviewed-by: Adam Lee <ali@pivotal.io> Reviewed-by: Ning Yu <nyu@pivotal.io>
-
Committed by Heikki Linnakangas
Remove the Query argument from cdbparallelize(), and its apply_motion() subroutine. Like most planner functions, these functions are passed a "PlannerInfo root" which represents the query, and its Query struct is available at root->parse. Passing a separate Query is confusing because you might think that you could pass some different query, perhaps a subquery.
-
Committed by Heikki Linnakangas
Fixes github issue https://github.com/greenplum-db/gpdb/issues/8621
-
Committed by Heikki Linnakangas
It was quite silly to have them in the Plan struct, which all the plan nodes "inherit", when the fields were actually only used in the topmost node of a plan tree. The silliness was noted in the comments, along with "Someday, find a better place to keep it". Today is that day. In the executor, the natural place for these is the PlannedStmt struct. PlannedStmt contains information for the plan tree as a whole, and in fact we already had copies of the fields there; we were just not always using them! PlannedStmt is only built in the last steps of planning, though. During planning, stash them in PlannerGlobal, like many other fields that are finally copied to PlannedStmt. There was one little wrinkle in this plan: there was a check in EvalPlanQual which verified that EvalPlanQual is not used on a Plan node that has any Motions in its subtree. Move that check to ExecInitMotion().
-
Committed by Heikki Linnakangas
There is a test case that reaches this, in the 'file_fdw' test. With ERRCODE_INTERNAL_ERROR, the error message includes the source file location (planner.c:1513) in the error message. That's problematic, because the line number changes whenever we touch planner.c. Since this error is in fact reachable, mark it as FEATURE_NOT_SUPPORTED.
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
Commit 7d74aa55 introduced a new function, planIsParallel() to check whether the main plan tree needs the interconnect, by checking whether it contains any Motion nodes. However, we already determine that, in cdbparallelize(), by setting the Plan->dispatch flag. We were just not checking it when deciding whether the interconnect needs to be set up. Let's just check the 'dispatch' flag, like we did earlier in the function, instead of introducing another way of determining whether dispatching is needed. I'm about to get rid of the Plan->nMotionNodes field soon, which is why I don't want any new code to rely on it.
-
- 21 Sep 2019, 3 commits
-
-
Committed by Heikki Linnakangas
I've been wondering for some time why we have disabled constructing Init Plans in queries that are planned in QEs, like in SPI queries that run in user-defined functions. So I removed the diff vs upstream in build_subplan() to see what happens. It turns out it was because we always ran the ExtractParamsFromInitPlans() function in QEs, to get the InitPlan values that the QD sent with the plan, even for queries that were not dispatched from the QD but planned locally. Fix the call in InitPlan to only call ExtractParamsFromInitPlans() for queries that were actually dispatched from the QD, and allow QE-local queries to build Init Plans. Include a new test case, for clarity, even though there were some existing ones that incidentally covered this case.
-
Committed by Heikki Linnakangas
MOTIONTYPE_FIXED was used for both Gather and Broadcast Motions, with an extra flag to indicate which one it was. There was a comment suggesting we should have two different MOTIONTYPE codes for them instead. I totally agree: Gather and Broadcast Motions are quite different, and practically all the code that checked for MOTIONTYPE_FIXED also had to check the flag to see which one it was, so separating the two makes a lot of sense. This doesn't have any user-visible effect; it's just refactoring to make the code nicer.
-
Committed by Heikki Linnakangas
If you passed e.g. "1+1" as the 'count' argument and "int" as the 'Type' argument, the macro would expand the allocation to palloc(1+1*sizeof(int)), when clearly it should be palloc((1+1)*sizeof(int)).
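The bug is plain operator precedence; evaluating both expansions with an illustrative sizeof(int) of 4 shows the difference in requested allocation size:

```python
SIZEOF_INT = 4                      # illustrative sizeof(int)

# the macro expanded its 'count' argument without parentheses:
buggy_bytes = 1 + 1 * SIZEOF_INT    # palloc(1+1*sizeof(int))   -> 5 bytes
fixed_bytes = (1 + 1) * SIZEOF_INT  # palloc((1+1)*sizeof(int)) -> 8 bytes
```

The unparenthesized expansion under-allocates, which is a classic source of heap corruption.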
-
- 20 Sep 2019, 5 commits
-
-
Committed by Paul Guo
* Ship the modified python module subprocess32 again. subprocess32 is preferred over subprocess according to the Python documentation. In addition, we long ago modified its code to use vfork() instead of fork(), to avoid "Cannot allocate memory" errors (a false alarm; memory is actually sufficient) in GPDB production environments, which usually run with memory overcommit disabled. We used to compile and ship it, but at some point a makefile change left it compiled but not shipped (possibly a regression). Let's ship it again.
* Replace subprocess with our own subprocess32 in python code.
-
Committed by Paul Guo
1. checkpoint_segments no longer exists since PostgreSQL 9.5; clean up the code that references it.
2. GPTest.pm should be cleaned up in src/test/regress/GNUmakefile.
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Adam Lee
Greenplum's Assert() invokes Trap(), then ExceptionalCondition(), then errdetail(), MemoryAccounting_Allocate(), MemoryAccounting_ConvertIdToAccount(), and Assert() again. It will fall into an infinite error-handling loop until the process gets signaled. Just abort here.
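The fix boils down to cutting the recursion: if the failure handler is entered again while already reporting, stop immediately instead of looping. A hedged sketch (names invented; C's abort() is stood in by a return value so the sketch is testable):

```python
def on_assert_failure(report, _state={"entered": False}):
    """Run the error-reporting path once; on re-entry, 'abort' right away."""
    if _state["entered"]:
        return "abort"          # stand-in for abort(): stop the recursion
    _state["entered"] = True
    try:
        return report()         # may itself hit a failed assertion
    finally:
        _state["entered"] = False
```

If report() triggers another assertion failure, the nested call sees the guard set and bails out instead of recursing forever.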
-
Committed by Sambitesh Dash
- The corresponding ORCA PR is: https://github.com/greenplum-db/gporca/pull/533
- Change the GUC value to OPTIMIZER_UNEXPECTED_FAIL so that we log only unexpected failures.
Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io> Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
-
Committed by Shreedhar Hardikar
- Fix "unelaborated friend declaration" warnings
- Fix "missing prototype" warnings
- Fix "generalized initializer lists are a C++ extension" warnings
CTranslatorQueryToDXL.h:63:10: warning: unelaborated friend declaration is a C++11 extension; specify 'class' to befriend 'gpdxl::CTranslatorScalarToDXL' [-Wc++11-extensions]
funcs.cpp:43:1: warning: no previous prototype for function 'DisableXform' [-Wmissing-prototypes]
funcs.cpp:76:1: warning: no previous prototype for function 'EnableXform' [-Wmissing-prototypes]
funcs.cpp:109:1: warning: no previous prototype for function 'LibraryVersion' [-Wmissing-prototypes]
funcs.cpp:123:1: warning: no previous prototype for function 'OptVersion' [-Wmissing-prototypes]
CTranslatorDXLToScalar.cpp:730:9: warning: generalized initializer lists are a C++11 extension [-Wc++11-extensions]
-
- 19 Sep 2019, 8 commits
-
-
Committed by xiong-gang
-
Committed by Gang Xiong
to fix it.
-
Committed by Gang Xiong
-
Committed by Gang Xiong
- Some members of 'MyTmGxact' are only accessed locally; extract them into a local variable 'MyTmGxactLocal'.
- Get rid of the gid in MyTmGxact and form it from the timestamp and gxid when needed.
- Get rid of 'currentGxact' and check 'MyTmGxactLocal->state' to see whether the distributed transaction has started.
-
Committed by Gang Xiong
-
Committed by Ning Yu
In standard_ExecutorStart() we should dispatch a plan if it is parallel; currently this is determined by checking whether planTree->dispatch is DISPATCH_PARALLEL. However, sometimes a DISPATCH_UNDETERMINED plan can also be parallel. For example:
CREATE TABLE arrtest_f (f0 int, f1 text, f2 float8) DISTRIBUTED RANDOMLY;
EXPLAIN SELECT ARRAY(select f2 from arrtest_f order by f2) AS "ARRAY" ORDER BY 1;
                    QUERY PLAN
--------------------------------------------------
 Result
   InitPlan 1 (returns $0)  (slice2)
     ->  Gather Motion 3:1  (slice1; segments: 3)
           Merge Key: f2
           ->  Sort
                 Sort Key: f2
                 ->  Seq Scan on arrtest_f
 Optimizer: Postgres query optimizer
(8 rows)
To fix it, we should also check whether the plan contains Motions. Note that we should only check the Motions of the plan itself; the Motions of its init plans should not be counted. (cherry picked from commit 7d74aa55)
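The check can be sketched as a walk that looks for Motion nodes in a plan tree while deliberately not descending into init plans (the node shape here is invented for illustration; the real code walks Plan structs):

```python
def contains_motion(node):
    """True if this plan tree contains a Motion node. Init plans hanging
    off a node are intentionally NOT walked: they are dispatched on their
    own, so their Motions don't make this tree parallel."""
    if node is None:
        return False
    if node["type"] == "Motion":
        return True
    # note: node.get("initplans") is deliberately skipped here
    return any(contains_motion(c) for c in node.get("children", []))
```

For the EXPLAIN example above, the main Result tree itself contains no Motion, while its init plan's Gather Motion is counted only when that init plan's own tree is checked.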
-
Committed by xiong-gang
There was a hang like this: when one QE errors out before 'SetupInterconnect', the QD keeps waiting for the incoming connections to be established and doesn't check the error message from the dispatcher; the other QEs finish and hang in the function 'waitOnOutbound'. Co-authored-by: Asim R P <apraveen@pivotal.io> Co-authored-by: Gang Xiong <gxiong@pivotal.io> Co-authored-by: Ning Yu <nyu@pivotal.io> (cherry picked from commit b2101122)
-
Committed by Weinan WANG
In DispatchSyncPGVariable, `ListCell *l` is unused. Remove it.
-