- 16 May 2020 (3 commits)
-
-
Committed by Mel Kiyama

* docs - clarify/fix CREATE TABLE syntax for partitioned tables; also add more partitioned table examples.
* docs - minor updates to partitioned table syntax.
* docs - minor fix to syntax diagram.
-
Committed by Mel Kiyama

* docs - update bloat best-practices information from dev:
  - Remove copying or redistributing table data as alternatives to VACUUM FULL.
  - Mention that VACUUM (without FULL) maintenance applies to both heap and AO tables.
  Also reorganized the information and clarified that the ACCESS EXCLUSIVE lock is the reason users cannot access a table during VACUUM FULL.
* docs - updates based on review comments.
* docs - removed warning about stopping VACUUM FULL.
-
Committed by mkiyama
-
- 15 May 2020 (8 commits)
-
-
Committed by Tom Lane

A negative availMemLessRefund would be problematic. It's not entirely clear whether that case can be hit in the code as it stands, but this seems like good future-proofing in any case. While we're at it, insist that the value be not merely positive but not tiny, so as to avoid doing a lot of repalloc work for little gain.

Peter Geoghegan

Discussion: <CAM3SWZRVkuUB68DbAkgw=532gW0f+fofKueAMsY7hVYi68MuYQ@mail.gmail.com>
-
Committed by Heikki Linnakangas

Now that ShareInputScan manages its own tuplestore, Material doesn't need the extra features that tuplestorenew.c provides.

Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Heikki Linnakangas

Reviewed-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Heikki Linnakangas

It seems a bit silly to have the Material node involved. Just create and manage the tuplestore in the ShareInputScan node itself, and leave out the Material node.

Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Heikki Linnakangas

ShareInputScan no longer tries to share the sort tapes between processes, so all this infrastructure to track multiple read positions is no longer needed.

Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Heikki Linnakangas

Previously, ShareInputScan could cooperate with a Sort node to share the final sort tape directly with other processes. Remove that support. If a Sort node is shared, we now put a Materialize node on top of the Sort, like all other nodes.

That is obviously less performant than sharing the sort tape directly. However, I don't believe that is significant in practice. Firstly, if you consider how a ShareInputScan is used, having a Sort below the ShareInputScan should be rare. A ShareInputScan is used to implement CTEs, and in order to have a Sort node just below the ShareInputScan, you need an ORDER BY in the CTE. For example (from the 'sisc_sort_spill' test):

```sql
select avg(i3) from (
  with ctesisc as (select * from testsisc order by i2)
  select t1.i3, t2.i2
  from ctesisc as t1, ctesisc as t2
  where t1.i1 = t2.i2
) foo;
```

However, in a query like that, the ORDER BY is actually useless; the order is not guaranteed to be preserved. In fact, ORCA optimizes it away completely.

Secondly, even if you have a query like that, I don't think optimizing away the Material is very significant. If the number of rows is small enough to fit in memory, the Sort can be performed in memory, so you're still writing it to disk only once, in the Material node. If it's large enough to spill, the Material node will shield the Sort node from needing to support random access, which enables the "on-the-fly" final merge optimization in the tuplesort. So I believe you'll do roughly the same amount of I/O in that case, too. One way to think about this is that the final merge will be written out to the Material's tuplestore instead of the tuplesort's file.

There is one drawback: the Material node won't be able to reuse the disk space used by the sort tapes as the final merge is performed, so you'll momentarily need twice the disk space. I think that's acceptable. If you don't like that, don't put superfluous ORDER BYs in your WITH queries.

Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Wen Lin

While gpload is loading data, if the configuration file contains "error_table" but does not contain "preload", an error of no attribute "staging_table" or "fast_path" occurs.
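The failure described above is the classic uninitialized-attribute pattern in Python tools. As a minimal sketch (the class and option names here are illustrative stand-ins, not gpload's actual internals), initializing every optional attribute up front avoids the AttributeError regardless of which configuration keys are present:

```python
class LoadSession:
    """Hypothetical stand-in for a gpload-like session object."""

    def __init__(self, config):
        # Initialize both attributes unconditionally, so code paths that
        # run when "error_table" is set but "preload" is absent never hit
        # AttributeError for a missing "staging_table" or "fast_path".
        self.staging_table = config.get("staging_table")
        self.fast_path = config.get("fast_path", False)


session = LoadSession({"error_table": "err_tab"})  # no "preload" section
print(session.staging_table, session.fast_path)
```

The point of the sketch: attributes consulted by later code must exist on every configuration path, not only when "preload" was parsed.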
-
Committed by David Yozie
-
- 14 May 2020 (14 commits)
-
-
Committed by Tyler Ramer

I'm not quite sure of the purpose of this utility; nor, apparently, is any readme or historical repo. Apart from a small fix provided in commit 71d67305, there has been no modification to this file since at least 2008. More importantly, I'm not quite sure of any reasonable use for this file. The supported platforms are only linux, darwin, and sunos5, and the listed use, printing the memory size in bytes, is trivial on any of those systems without resorting to a python script that wraps a command-line call. Given that it hasn't been updated since 2008, it still targets some ancient version of python, which means it is yet another file to upgrade to python 3. In this case, let's drop the program rather than bother upgrading it.

Authored-by: Tyler Ramer <tramer@pivotal.io>
-
Committed by Ning Yu

The pg_partition_oid_index of template0 is used as a template for empty indexes; its path, however, is not fixed, so we need to determine it at runtime.
-
Committed by Zhenghua Lyu

When handling a join, if the inner path's locus is CdbLocusType_OuterQuery, the function cdbpath_motion_for_join also sets the outer path's locus to CdbLocusType_OuterQuery. Later, in cdbpath_create_motion_path, a materialize path is added over the motion path if the original path's locus is CdbLocusType_OuterQuery. If the whole subquery can never be rescanned, this materialize path is useless and leads to a performance loss. A typical plan before this patch (from the case regress/expected/join.out) is:

```sql
explain (verbose, costs off)
select * from int4_tbl a, lateral (
  select * from int4_tbl b
  left join int8_tbl c on (b.f1 = q1 and a.f1 = q2)
) ss;
```

```
                               QUERY PLAN
------------------------------------------------------------------------
 Nested Loop
   Output: a.f1, b.f1, c.q1, c.q2
   ->  Materialize          **< this plannode is useless >**
         Output: a.f1
         ->  Gather Motion 3:1  (slice1; segments: 3)
               Output: a.f1
               ->  Seq Scan on public.int4_tbl a
                     Output: a.f1
   ->  Materialize
         Output: b.f1, c.q1, c.q2
         ->  Hash Right Join
               Output: b.f1, c.q1, c.q2
               Hash Cond: (c.q1 = b.f1)
               ->  Result
                     Output: c.q1, c.q2
                     Filter: (a.f1 = c.q2)
```

There are two cases in which we can do this safely:

* not in a SubPlan
* not in a lateral join's inner subquery

This commit adds a flag in PlannerInfo->config to indicate whether we can remove the useless Materialize plannode in the join's left tree.
-
Committed by Ning Yu

It is no longer needed; the correct approach is to install meta-only index files on the new segments.
-
Committed by Ning Yu

An empty b-tree index file is not actually empty; it contains the meta page. By transferring meta-only index files to the new segments, they can be launched directly without the "ignore_system_indexes" setting, and we do not need an extra relaunch of the new segments. We use base/13199/5112, the pg_partition_oid_index of template0, as the template for meta-only index files.
-
Committed by Ning Yu

It was introduced to exclude a large number of paths. Also changed the exclusion logic for './db_dumps' and './promote': they were excluded only when an empty 'excludePaths' was specified by the caller, which is odd, so the logic now always excludes these two paths.
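The always-exclude behavior can be sketched like this (function and constant names are illustrative, not the actual gpexpand code):

```python
# Paths that must be skipped no matter what the caller passes in.
ALWAYS_EXCLUDED = ("./db_dumps", "./promote")


def build_exclude_list(exclude_paths):
    """Merge caller-supplied exclude paths with the unconditional ones,
    instead of applying the defaults only when exclude_paths is empty."""
    merged = list(ALWAYS_EXCLUDED)
    for path in exclude_paths:
        if path not in merged:
            merged.append(path)
    return merged


print(build_exclude_list([]))          # defaults apply even with no caller paths
print(build_exclude_list(["./logs"]))  # defaults plus the caller's path
```

Unconditional merging removes the surprising coupling between "caller passed an empty list" and "defaults apply".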
-
Committed by Ning Yu

- Be careful when creating placeholders of the master-only files in the template; raise an error if they already exist.
- Increase code readability slightly.
-
Committed by Ning Yu

Gpexpand creates new primary segments by first creating a template from the master datadir and then copying it to the new segments. Some catalog tables, such as gp_segment_configuration, are only meaningful on the master, so their contents are cleared on each new segment with "delete from ..." commands. This works but is slow: we have to include the content of the master-only tables in the archive, distribute it via the network, and clear it via the slow "delete from ..." commands. The "truncate" command is fast, but it is disallowed on catalog tables because a catalog table's filenode must not change. To make this faster, we now exclude these tables from the template directly, so less data is transferred and there is no need to "delete from" them explicitly.
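A rough sketch of the template-filtering idea, assuming a hypothetical helper and made-up relfilenode values (the real gpexpand code works against the datadir layout, not this simplified list):

```python
def exclude_master_only_files(template_files, master_only_relfilenodes):
    """Drop the data files of master-only catalog tables from the template,
    so they are never archived, transferred, or cleared with "delete from"
    on the new segments."""
    kept = []
    for path in template_files:
        # A relation's data files are named <relfilenode>, <relfilenode>.1, ...
        base = path.rsplit("/", 1)[-1].split(".", 1)[0]
        if base not in master_only_relfilenodes:
            kept.append(path)
    return kept


# "5036" is an invented relfilenode for illustration only.
files = ["base/13199/5036", "base/13199/5036.1", "base/13199/1249"]
print(exclude_master_only_files(files, {"5036"}))
```

Filtering at archive-creation time means the excluded data never touches the network, which is where the old approach paid twice (transfer plus DELETE).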
-
Committed by Ning Yu

When cleaning up the master-only files on the new segments we used to do the job one by one; with tens or hundreds of segments this can be very slow. Now we clean up in parallel.
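The one-by-one versus parallel difference can be sketched with a thread pool (the cleanup body here is a placeholder, not the real file-removal logic):

```python
from concurrent.futures import ThreadPoolExecutor


def cleanup_segment(segment):
    # Placeholder for removing the master-only files on one segment;
    # in practice this is a remote operation dominated by latency.
    return "cleaned %s" % segment


def cleanup_all(segments, max_workers=16):
    """Clean all segments concurrently instead of one by one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order in the returned results.
        return list(pool.map(cleanup_segment, segments))


print(cleanup_all(["seg0", "seg1", "seg2"]))
```

Threads suit this workload because each cleanup waits on I/O rather than burning CPU, so wall-clock time drops roughly with the worker count.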
-
Committed by Ning Yu

Removed the duplicated 'gp_segment_configuration' entry in the MASTER_ONLY_TABLES list. Also sorted the list in alphabetical order to prevent duplicates in the future.
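Keeping such a list de-duplicated and sorted is a one-liner; the entries below are an illustrative subset, not the real MASTER_ONLY_TABLES contents:

```python
master_only_tables = [
    "gp_segment_configuration",
    "gp_configuration_history",
    "gp_segment_configuration",  # accidental duplicate
]
# Sorting a de-duplicated copy keeps the list alphabetical, which makes
# any future duplicate stand out immediately in code review.
master_only_tables = sorted(set(master_only_tables))
print(master_only_tables)
```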
-
Committed by Ning Yu

In the gpexpand behave tests we used to use the same name for multiple scenarios; now we give them distinct, descriptive names. Also corrected some bad indentation.
-
Committed by xiong-gang

It takes time to start the walsender after gpinitstandby, so this commit adds a wait loop to reduce flakiness. It also fixes the next test, commit_blocking_on_standby.
-
Committed by Ashuka Xue
-
Committed by Ashuka Xue

In commit `Improve statistics calculation for exprs like "var = ANY (ARRAY[...])"`, we improved the performance of cardinality estimation for ArrayCmp. However, that change caused ArrayCmp expressions with text-like types to fall back to NDV-based cardinality estimates in spite of present and valid histograms. This commit re-enables using histograms for text-like types provided it is safe to do so.

Removed because non-singleton buckets for text are not valid:

- src/backend/gporca/data/dxl/minidump/CTE-12.mdp
- src/backend/gporca/data/dxl/statistics/Join-Statistics-Text-Input.xml
- src/backend/gporca/data/dxl/statistics/Join-Statistics-Text-Output.xml

Co-authored-by: Ashuka Xue <axue@pivotal.io>
Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
-
- 13 May 2020 (12 commits)
-
-
Committed by Heikki Linnakangas

This showed up as bogus "Executor memory" lines in EXPLAIN ANALYZE output.

Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
-
Committed by Adam Lee

Greenplum only supports INSERT here, because UPDATE/DELETE requires the hidden column gp_segment_id and runs into the "ModifyTable mixes distributed and entry-only tables" issue.
-
Committed by Adam Lee

GPDB uses the high bits of the major version to indicate special internal libpq communication, and that was previously the default. This commit tags it the opposite way: the PG version is used by default, while the QD-QE, FTS, WAL, and fault-injector connections are tagged as the special internal ones.
-
Committed by Adam Lee

Otherwise, inheritance_planner() and grouping_planner() report differently.
-
Committed by Adam Lee

PG doesn't read it, because it would be transformed, but Greenplum dispatches it.

```
16 0x8d6730 postgres nodeToBinaryStringFast (outfast.c:2165)
17 0xc9ceea postgres serializeNode (cdbsrlz.c:52)
18 0xc5cf8c postgres <symbol not found> (cdbdisp_query.c:585)
19 0xc5ddbc postgres <symbol not found> (cdbdisp_query.c:1081)
20 0xc5c761 postgres CdbDispatchPlan (cdbdisp_query.c:234)
21 0x7bcee8 postgres standard_ExecutorStart (execMain.c:658)
22 0x7bc126 postgres ExecutorStart (execMain.c:214)
23 0xa1ece6 postgres PortalStart (pquery.c:743)
24 0x746f76 postgres PerformCursorOpen (portalcmds.c:164)
25 0xa21364 postgres standard_ProcessUtility (utility.c:554)
26 0xa20e0c postgres ProcessUtility (utility.c:363)
27 0xa1fc80 postgres <symbol not found> (discriminator 4)
28 0xa1ff3b postgres <symbol not found> (pquery.c:1552)
29 0xa1f3a9 postgres PortalRun (pquery.c:1022)
```
-
Committed by Adam Lee
-
Committed by Adam Lee

Greenplum does that on the QE, which is reasonable for an MPP system, but extensions shouldn't panic over it.
-
Committed by Heikki Linnakangas

Everywhere except ShareInputScanState, we're always dealing with either a tuplestore or a tuplesort, so we don't need to use GenericTupStore.

Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Tingfang Bao

The future Greenplum 7 (master) may not ever target SLES 12 as a supported platform. We can backport this to 6X_STABLE as well, because SLES 12 is not yet a supported platform there; it will be at some point in the future. [#172595616]

Authored-by: Tingfang Bao <baotingfang@gmail.com>
-
Committed by Ning Yu

We use "pkill postgres" to clean up leaked segments in the behave tests. If the postgres processes have already exited, the pkill command fails with code 1, "No processes matched or none of them could be signalled". Fixed by ignoring the return code of pkill.
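A sketch of the tolerant invocation (the helper name is made up; the real fix lives in the behave test harness). pkill exits 0 when it signalled something and 1 when nothing matched; only other codes indicate a real failure:

```python
import subprocess


def run_allowing_no_match(cmd):
    """Run cmd, treating exit code 1 ("no processes matched") as success.
    Any other non-zero code is still raised as an error."""
    rc = subprocess.run(cmd).returncode
    if rc not in (0, 1):
        raise RuntimeError("%s failed with code %d" % (" ".join(cmd), rc))
    return rc


# In the behave cleanup this would be:
#     run_allowing_no_match(["pkill", "postgres"])
```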
-
Committed by Hans Zeller

The scripts we use in Concourse pipelines download Apache xerces-c-3.1.2 and then apply a patch that is part of our source code tree. Abhijit has pointed out that this is no longer necessary. This commit removes the patch and uses the vanilla xerces-c-3.1.2 source code instead. Eventually, we want to stop including xerces in our releases and rely on the natively installed xerces. See also https://github.com/greenplum-db/gpdb/pull/10068.
-
Committed by Ning Yu

In the scenario "inject a fail and test if rollback is ok", the expansion is canceled after the new segment is launched; the segment must be shut down in time to prevent port conflicts in the following scenarios.
-
- 12 May 2020 (3 commits)
-
-
Committed by xiong-gang
-
Committed by Jesse Zhang

This fixes an accidental trailing semicolon that "liberated" some logging from the condition. This was introduced in 4c7854ee, and it generates a compiler warning for me:

> workfile_mgr.c:603:54: warning: if statement has empty body [-Wempty-body]
>         if (node->next == NULL || node->next->prev != node);
>                                                            ^
> workfile_mgr.c:603:54: note: put the semicolon on a separate line to silence this warning
> 1 warning generated.
-
Committed by Peifeng Qiu

gpload in the latest Windows client package requires the VS redistributable package. Output a more meaningful message if pg.py fails to load.
-