- 12 Oct 2019, 1 commit
-
-
Committed by David Yozie
-
- 11 Oct 2019, 5 commits
-
-
Committed by Heikki Linnakangas
Bitmap AM relies on B-tree index code for all the operator support. In fact, it creates an actual B-tree index, the LOV index, and lets the LOV index evaluate the operators. We should have all the same operator classes and families for the bitmap index AM that we have for B-trees. Perhaps we should teach the planner to reuse the B-tree opfamilies for bitmap indexes, but for now, let's auto-generate the bitmap opfamilies at initdb-time from the corresponding B-tree opfamilies. That ensures that the two stay in sync, and we don't need to carry the diff vs. upstream in the catalog headers.

This changes the OIDs for the opclasses and opfamilies, so bump catversion. That shouldn't affect upgrade or clients, though; they should operate on opclass/opfamily names, not OIDs.

Some of the operators are still not indexable using bitmap indexes, because they are handled by special code in the backend; see match_special_index_operator(). But it doesn't do harm to have the entries in the catalogs for them anyway. An example of this is the inet << inet operator and its siblings. One user-visible effect of this is that you can now create a bitmap index on the 'complex' datatype.

I did this now because I was seeing an 'opr_sanity' test failure on the 9.6 merge branch: some cross-type operators were missing for bitmap indexes. That was because the integer bitmap operator classes were not gathered into the same operator family; they were still following the old model from before operator families were introduced. So there were separate int2_ops, int4_ops, int8_ops operator families, whereas with the B-tree there's only one integer_ops operator family, which contains all the opclasses and cross-type operators. I don't think it makes any practical difference, though. For the B-tree, including all "compatible" operators in the same opfamily allows the planner to deduce quals more freely, but I'm not sure if that's significant for the bitmap AM, because all the B-tree operators already exist.

Also change pg_am.amsupport for the bitmap AM, to account for the new 'sortsupport' support function that was added to the B-tree AM. I'm not sure if bitmap indexes can actually make use of it yet, but we now copy all the pg_amproc rows, so pg_am.amsupport had better match or the 'opr_sanity' test complains.

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Heikki Linnakangas
- Don't write Path->parent, to avoid infinite recursion, and revert the GPDB changes in outPlannerInfo that were previously used to avoid it. There's no need to differ from upstream in this area, so let's not.
- Add missing window_pathkeys handling in outPlannerInfo. Was missed in a merge, I guess. No reason to differ from upstream.
- Remove out/readfast support for Paths and other structs that are only used within the planner, and not needed in segments.
- Don't serialize Flow->hashExprs. It's also not needed in segments, and it might contain PlaceHolderVars because it's not processed by set_plan_references().

The infinite recursion issue was spotted by Melanie and Deep.
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
Reviewed-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
-
Committed by Hao Wu
GCC 7+ has a compile option, -Wimplicit-fallthrough, which generates a warning/error if the code falls through a case (or default) implicitly. Such implicit fallthrough may cause bugs that are hard to catch. To enable this compile option, no switch case is allowed to fall through implicitly. To accomplish this, 2 steps are done:
1. Append a comment line, /* fallthrough */ or the like, at the end of the case block.
2. Add a break clause at the end of the case block if the last statement is ereport(ERROR) or the like.
When -Wimplicit-fallthrough is enabled, -Werror=implicit-fallthrough is also set, so fallthrough must be done explicitly. Re-generate configure after modifying configure.in. No implicit-fallthrough for gpmapreduce and orafce.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
Reviewed-by: Paul Guo <pguo@pivotal.io>
-
Committed by Bhuvnesh Chaudhary
Earlier, while creating the workfile for hash aggregates, the data was written to the file for each agg state, and after all the agg states' data had been written, the total size was updated in the file. Updating the total size required seeking backwards to the offset where the total size was written previously. This worked for workfiles without compression. However, when gp_workfile_compression=on, the workfiles are compressed, and an attempt to seek back to the earlier offset errors out, as a compressed file is not expected to seek backwards. This commit fixes the issue by first writing all the data to a buffer, so that the total size is known, and then writing it to the file.
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by David Yozie
-
- 10 Oct 2019, 10 commits
-
-
Committed by Lisa Owen
* docs - add pxf upgrade topic for greenplum 6.x
* clarify to newer version of 6.x
-
Committed by Georgios Kokolatos
Commit <00e25afe> removed cdbpullup_exprHasSubplanRef and with it the need for findAnyVar_walker. Proceed to clean up. Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Jinbao Chen
This commit enables materialized views in both swap-file and concurrent refresh modes. There are four main changes to the code:
1. Add a new type, 'PARENTSTMTTYPE_REFRESH_MATVIEW', to ParentStmtType in the Query struct, so that we can build the right query plan without a gather motion.
2. Add a new 'transientrel_init' function, called in 'initplan', to create the temp table on segments.
3. Call the swap function on both master and segments.
4. Change the SPI query to make the right diff on a parallel system.
Co-authored-by: Zhenghua Lyu <kainwen@gmail.com>
-
Committed by Ning Yu
Session order on segments is important in the GDD tests; it determines the cancellation order on deadlocks. We used to create a full gang on the BEGIN command, and this was used in the GDD tests to control the session order on segments. However, this behavior changed during the optimization of read-only queries: the BEGIN command no longer creates a full gang. So now we use the `RESET optimizer` command to trigger the gang creation, although we do not really care about the optimizer status. Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
-
Committed by Ashuka Xue
Send btree indexes on AO tables as type btree. Previously, GPDB sent btree indexes on AO tables to ORCA as type bitmap, since btree indexes were not allowed on AO tables. This was a bit hacky, so instead, send them over as type btree and let ORCA handle them properly.
Co-authored-by: Ashuka Xue <axue@pivotal.io>
Co-authored-by: Hans Zeller <hzeller@pivotal.io>
-
Committed by Shreedhar Hardikar
Bumps ORCA version to 3.75.0 and updates the ICG tests.
-
Committed by Lisa Owen
* docs - combine mgmt and client utilities into single section
* address some of the requested edits
* misc edits, conditionalizing
* minor edits requested by david
* correct the superscript referenced in the note
* use numbers for superscripts
-
Committed by Lisa Owen
* docs - provide more detail about pxf s3 select FILE_HEADER option
* address review comments from david and francisco
* add error message returned when both order/names differ
-
Committed by Ashwin Agrawal
The segspace test passes without needing to set the GUC gp_workfile_limit_per_segment. Just to set and reset this GUC, the test's setup and cleanup used to restart the cluster. Since that is not needed, delete these steps, which helps to save time. I don't have context on why the setup was needed earlier, but it definitely doesn't seem to be required now. Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/9lB2ovwKP3I/W6iQoN4GCAAJ
-
Committed by Chris Hajas
This GUC makes ORCA use GPDB allocators. This allows for faster optimization due to less overhead, reduces memory usage during optimization due to smaller/fewer headers, and makes ORCA observe vmem limits instead of crashing. With optimizer_use_gpdb_allocators set, we now track ORCA's memory usage within resource groups/vmem. As a result, we need to update the test, as we go OOM slightly earlier when ORCA is enabled. Authored-by: Chris Hajas <chajas@pivotal.io>
-
- 09 Oct 2019, 6 commits
-
-
Committed by Heikki Linnakangas
The old way to construct a plan for a correlated subquery was to plan the subquery as usual, except that Index Scans were not allowed. Then, after constructing a Plan tree, the post-processing step apply_motion() added Redistribute Motion nodes so that the plan tree was executable on any node. That was pretty simplistic; disabling Index Scans completely obviously hurts performance. Also, because the planner didn't take into account that all base relations are actually redistributed everywhere, this was not reflected in the cost estimates, and the planner might choose a sub-optimal plan.

To improve that, change the way this works so that the Motions are added earlier in planning. The planner needs to know which restrictions (WHERE clauses) refer to the outer query, i.e. which quals are correlated, and make sure that those quals are always evaluated in the same slice as the parent query. The outer query cannot pass parameters down through a Motion node, so the subquery plan must not contain any Motion nodes between the evaluation of the correlated variable and the outer plan. This is enforced by a new CdbPathLocus type, "OuterQuery". A node with OuterQuery locus must be evaluated in the outer query. It is mostly the same as "general", which means that it can be evaluated anywhere, but with the restriction that it is not OK to redistribute an input that has OuterQuery locus. Whenever a planner node evaluates a correlated var, that node must have Upper locus, by adding Motions below that node. This has a similar effect as the old approach, but gives the planner a bit more flexibility, and the Motions are taken into account in cost estimates. This allows using Index Scans in subplans, but only if the Index Quals don't contain correlated vars.

This still isn't perfect; it would sometimes be good to, for example, delay the evaluation of a correlated var until later, above a join node, because that might avoid expensive Redistribute Motions. Even though this patch doesn't allow such plans yet, it's a step in the right direction.

This moves the cdbllize() step to run *before* set_plan_references(). Now that we no longer add Motion nodes to an already-constructed plan tree, we don't need the 'useExecutorVarFormat' stuff in many functions anymore.

Fixes https://github.com/greenplum-db/gpdb/issues/8648
Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
Reviewed-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
-
Committed by Georgios Kokolatos
It was initially backported to Greenplum before the src/common directory was introduced. It is not needed anymore. Addresses GitHub issue #8753.
Reported-by: Yandong Yao <yyao@pivotal.io>
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
Committed by Chuck Litzell
-
Committed by Chuck Litzell
* Can use gpcopy to migrate data
* Conditionalize gpcopy references
* Edits for review comments
-
Committed by Mel Kiyama
Other changes:
-- updated some queries and results to use the state column from pg_stat_activity
-- updated gp_toolkit.gp_workfile_* tables, based on replacing workfiles with buffiles (dev PR https://github.com/greenplum-db/gpdb/pull/6508)

Updates in HTML format. Changed references to current_query to query, changed some queries:
http://docs-msk-gpdb6-dev.cfapps.io/6-0/admin_guide/perf_troubleshoot.html#topic5
http://docs-msk-gpdb6-dev.cfapps.io/6-0/admin_guide/workload_mgmt.html#topic27
http://docs-msk-gpdb6-dev.cfapps.io/6-0/admin_guide/workload_mgmt_resgroups.html
http://docs-msk-gpdb6-dev.cfapps.io/6-0/admin_guide/managing/startstop.html
http://docs-msk-gpdb6-dev.cfapps.io/6-0/ref_guide/config_params/guc-list.html#track_activity_query_size
http://docs-msk-gpdb6-dev.cfapps.io/6-0/ref_guide/sql_commands/DROP_RESOURCE_GROUP.html

Minor edit to column name:
http://docs-msk-gpdb6-dev.cfapps.io/6-0/ref_guide/system_catalogs/pg_stat_activity.html

Reword change of column name from current_query to query:
http://docs-msk-gpdb6-dev.cfapps.io/6-0/install_guide/migrate.html

Update gp_toolkit.gp_workfile_* tables, based on replacing workfiles with buffiles:
http://docs-msk-gpdb6-dev.cfapps.io/6-0/ref_guide/gp_toolkit.html#topic32

Add schema to tablename:
http://docs-msk-gpdb6-dev.cfapps.io/6-0/admin_guide/managing/monitor.html
-
Committed by Heikki Linnakangas
Add new table_insert/update/delete() functions to stand in place of heap_insert/update/delete(), and move there the logic to dispatch the call to the AO, AOCO or external table function, depending on what kind of a table it is. This moves the GPDB-specific code out of the upstream functions, making the diff vs. upstream nicer to read. PostgreSQL v12 introduced a new Table AM API. These new functions sit in the same spot where the calls to the new API are made in v12. The signatures don't quite match, but this is a similar interface in spirit. Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
- 08 Oct 2019, 10 commits
-
-
Committed by Heikki Linnakangas
The reason to do this now is that the dumputils mock test was failing to compile on the 9.6 merge branch, because of the Assert that was added to dumputils.c. I'm not sure why, but separating the GPDB functions into a separate file seemed like a good idea anyway, so let's just do that.
-
Committed by Heikki Linnakangas
In master, we already got this fix as part of the 9.5 merge. Change it slightly, so that we don't use the subplan's tuple descriptor as is, but only copy the "tdhasoid" flag from it. It's in principle possible that some unimportant information, like attribute names or typmods, is not set up in the subplan's result type and should be computed from the Motion's target list. We're not seeing problems like that on master, but on the 9.6 merge branch we are. More importantly, this adds bespoke tests for this scenario. It arose in the 'rowsecurity' test, but it doesn't have anything to do with row-level security; that was accidental. We didn't backport the fix to 6X_STABLE before, so do that now.
Fixes https://github.com/greenplum-db/gpdb/issues/8765.
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Bradford D. Boyle
Authored-by: Bradford D. Boyle <bboyle@pivotal.io>
-
Committed by Bradford D. Boyle
Server release candidate artifacts are not published until after an extensive set of tests have passed in the CI pipeline. These tests include ICW and all the CLI test suites. It is not unusual for the time between a commit being pushed and a release candidate being published to be several hours, potentially slowing down the feedback cycle for component teams.

This commit adds a "published" output to the compilation tasks. The new build artifact is stored in an immutable GCS bucket with the version in the filename. This makes it trivial for other pipelines to safely consume the latest build. This artifact **has not** passed any sort of testing (e.g., ICW) and should only be used in development pipelines that need near-instantaneous feedback on commits going into GPDB.

For the server-build artifact, `((rc-build-type-gcs))` will resolve to `.debug` for the with-asserts pipeline and `''` (i.e., empty string) for the without-asserts pipeline. The two types of server artifacts that are "published" are:
1. server-build
2. server-rc
server-build is the output of the compilation task and has had no testing; server-rc is a release candidate of the server component.
Authored-by: Bradford D. Boyle <bboyle@pivotal.io>
-
Committed by Ashwin Agrawal
Previously, the fsync, heap_checksum and walrep tests used separate databases. But it seems that with the merge, when src/makefiles/pgxs.mk added --dbname to REGRESS_OPTS, all of these tests started using the `contrib_regression` database; even though each of these tests defined --dbname=<>, it was overridden by src/makefiles/pgxs.mk. Hence, set USE_MODULE_DB=1, which makes these tests use --dbname=$(CONTRIB_TESTDB_MODULE) instead of `contrib_regression`. This way they will again have separate databases.
-
Committed by Ashwin Agrawal
A command in the next session gets executed in parallel with the current session when "&" is used. Hence, if a session 1 command is running very slowly but the session 0 commands execute quickly, the test can be flaky. To avoid that, add retries when checking whether all processes for the session are blocked.
-
Committed by Mel Kiyama
* docs - add guc optimizer_use_gpdb_allocators
* docs - add information about improved GPDB query mem. mgmt.
-
Committed by Chris Hajas
This reverts commit b81d4501. The resource group tests use ORCA and need to be modified for this change.
-
Committed by Chris Hajas
This GUC makes ORCA use GPDB allocators. This allows for faster optimization due to less overhead, reduces memory usage during optimization due to smaller/fewer headers, and makes ORCA observe vmem limits instead of crashing. Authored-by: Chris Hajas <chajas@pivotal.io>
-
Committed by Heikki Linnakangas
I found the logic to decide the target locus hard to understand, so I rewrote it in a table-driven approach. I hope it's not just me.
Fixes github issue https://github.com/greenplum-db/gpdb/issues/8711
Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
-
- 05 Oct 2019, 1 commit
-
-
Committed by Chris Hajas
We introduce a new type of memory pool and memory pool manager: CMemoryPoolPalloc and CMemoryPoolPallocManager.

The motivation for this PR is to improve memory allocation/deallocation performance when using GPDB allocators. Additionally, we would like to use the GPDB memory allocators by default (change the default for optimizer_use_gpdb_allocators to on), to prevent ORCA from crashing when we run out of memory (OOM). However, with the current way of doing things, doing so would add around 10% performance overhead to ORCA.

CMemoryPoolPallocManager overrides the default CMemoryPoolManager in ORCA, and creates a CMemoryPoolPalloc memory pool instead of a CMemoryPoolTracker. In CMemoryPoolPalloc, we now call MemoryContextAlloc and pfree instead of gp_malloc and gp_free, and we don't do any memory accounting.

So where does the performance improvement come from? Previously, we would (essentially) pass in gp_malloc and gp_free to an underlying allocation structure (which has been removed on the ORCA side). However, we would add additional headers and overhead to maintain a list of all of these allocations. When tearing down the memory pool, we would iterate through the list of allocations and explicitly free each one. So we would end up doing overhead on the ORCA side AND the GPDB side, and the overhead on both sides was quite expensive! If you want to compare against the previous implementation, see the Allocate and Teardown functions in CMemoryPoolTracker.

With this PR, we improve optimization time by ~15% on average and up to 30-40% on some queries which are memory intensive.

This PR does remove memory accounting in ORCA. This was only enabled when the optimizer_use_gpdb_allocators GUC was set. By setting `optimizer_use_gpdb_allocators`, we still capture the memory used when optimizing a query in ORCA, without the overhead of the memory accounting framework.

Additionally, add a top-level ORCA context where new contexts are created. The OptimizerMemoryContext is initialized in InitPostgres(). For each memory pool in ORCA, a new memory context is created in OptimizerMemoryContext.

Bumps ORCA version to 3.74.0.

This is a re-commit of 339dedf0d2, which didn't properly catch/rethrow exceptions in gpos_init.
Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
Co-authored-by: Chris Hajas <chajas@pivotal.io>
-
- 04 Oct 2019, 7 commits
-
-
Committed by Asim R P
Spotted when analyzing a CI failure pertaining to another test. Reviewed by Heikki and Georgios.
-
Committed by Asim R P
The previous pattern used by the test was not strong enough: it could accidentally match names of partitioned tables. The name of a partition being added is generated by suffixing a random number if no name is specified by the user, e.g. "sales_1_prt_r1171829080_2_prt_usa". At least one time, the test failed CI due to this weakness. The new pattern is strong enough to match only the auxiliary table names that end with "_<oid>". Reviewed by Heikki and Georgios.
-
Committed by Lisa Owen
* docs - mention resource groups in spill file topic, misc edits
* edits requested by david
-
Committed by Heikki Linnakangas
This test is about MergeAppends; there's even a comment saying "we want a plan with two MergeAppends". enable_mergejoin defaults to off in GPDB, so we have to enable it to get the same plan as in upstream.
-
Committed by Lisa Owen
* docs - move gpmapreduce yaml info to own utility page; misc topic edits
* relocate topic and graphic, add shortdesc, fix linking
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
-