- 11 Feb 2020, 2 commits
-
-
By Adam Lee
Remove gpperfmon, since we have the alternative metrics collector.
1. Remove all gpperfmon code, including gpmon, gpsmon, gpmmon, gpperfmon, and alert_log.
2. Remove all gpperfmon GUCs: `gp_enable_gpperfmon`, `gp_perfmon_segment_interval`, `gp_perfmon_print_packet_info`, `gpperfmon_port`, and `gpperfmon_log_alert_level`.
3. Remove the `perfmon` and `stats sender` processes/bgworkers.
4. Remove the `apu` and `libsigar` dependencies.
-
By Jamie McAtamney
Previously, gpstart could not start the cluster if a standby master host was configured but currently down. In order to check whether the standby was supposed to be the acting master (and prevent the master from being started if that was the case), gpstart needed to access the standby host to retrieve the TimeLineID of the standby, and if the standby host was down the master would not start. This commit modifies gpstart to assume that the master host is the acting master if the standby is unreachable, so that it never gets into a state where neither the master nor the standby can be started.
Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
Co-authored-by: Mark Sliva <msliva@pivotal.io>
Co-authored-by: Adam Berlin <aberlin@pivotal.io>
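The decision described above can be sketched as a small decision function. This is an illustrative sketch only, not the actual gpstart source; the function name, parameters, and timeline comparison are assumptions made for clarity.

```python
# Hypothetical sketch of gpstart's start-up decision (not the real code).
def may_start_master(standby_configured, standby_reachable,
                     master_tli=None, standby_tli=None):
    """Return True if gpstart should treat the master as the acting master."""
    if not standby_configured:
        return True
    if not standby_reachable:
        # The fix described above: if the standby cannot be contacted,
        # assume the master is the acting master rather than refusing
        # to start anything at all.
        return True
    # Standby reachable: only start the master if its timeline is current.
    return master_tli >= standby_tli
```

The key change is the unreachable-standby branch: before the fix, an unreachable standby aborted startup entirely.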
-
- 10 Feb 2020, 2 commits
-
-
By Daniel Gustafsson
Commit a693c889 removed all callers of skipPadding, but left the function in, which generated a compiler warning. Fix by removing it.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
By Huiliang.liu
gpload uses gpversion.py to parse the gpdb version, so that it is compatible with both gpdb5 and gpdb6. This way we only need to maintain one gpload version, and new features or bug fixes can also be used by gpdb5 customers. So we package gppylib.gpversion into the gpdb clients tarball.
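The version handling above amounts to extracting the Greenplum major/minor version from the server's version() string. The sketch below is illustrative only; the function name and return shape are assumptions, not the real gppylib.gpversion API.

```python
import re

# Hypothetical parser in the spirit of gpversion.py (not its real API):
# a Greenplum server reports something like
#   "PostgreSQL 9.4.24 (Greenplum Database 6.4.0 build commit:xyz) ..."
def parse_gpdb_version(version_string):
    """Extract (major, minor) of the Greenplum version from version() output."""
    m = re.search(r'Greenplum Database (\d+)\.(\d+)', version_string)
    if not m:
        raise ValueError('not a Greenplum version string')
    return int(m.group(1)), int(m.group(2))
```

With such a parser, the same gpload code can branch on major version 5 vs 6 instead of shipping two copies of the tool.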
-
- 08 Feb 2020, 2 commits
-
-
By David Yozie
* Add gpss link
* Correct madlib typo
* Remove broken link (unneeded)
* Fix link to gptext/fts comparison
-
By Ashwin Agrawal
gpcheckcat hard-coded the master dbid to 1 for various queries. This assumption is flawed: there is no restriction that the master can only have dbid 1, it can be any value. For example, after failover to the standby, gpcheckcat is not usable with that assumption. Hence, find the value of the master's dbid at run time, using the fact that its content-id is always -1.
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
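The corrected lookup can be sketched as follows. This is an illustrative model of the logic, not gpcheckcat's code: rows are modeled as (dbid, content, role) tuples in the shape of gp_segment_configuration, where content -1 marks the master/standby pair and role 'p' marks the acting primary.

```python
# Sketch of the fix: derive the master's dbid from the segment configuration
# (content == -1 identifies master/standby; role == 'p' the acting master)
# instead of hard-coding dbid 1.
def master_dbid(segment_config):
    """segment_config: iterable of (dbid, content, role) tuples."""
    for dbid, content, role in segment_config:
        if content == -1 and role == 'p':
            return dbid
    raise LookupError('no acting master found in segment configuration')
```

After a failover the acting master can carry any dbid (here 5), and the lookup still finds it where the hard-coded `1` would not.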
-
- 07 Feb 2020, 3 commits
-
-
By Ning Yu
-
By Ning Yu
We used to generate Ubuntu-only jobs even if ubuntu was not in the OS list; wrap them with OS checkers now.
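A minimal sketch of the guard described above, assuming a generator that emits one compile job per OS; the job names and the generator shape are made up for illustration, this is not the real pipeline template.

```python
# Illustrative sketch: only emit Ubuntu-specific jobs when 'ubuntu' is
# actually in the configured OS list (the "os checker" guard).
def generate_jobs(os_list):
    jobs = []
    for os_name in os_list:
        jobs.append('compile_%s' % os_name)
        if os_name == 'ubuntu':          # guard the ubuntu-only extras
            jobs.append('ubuntu_only_packaging')
    return jobs
```

Before the fix, the ubuntu-only extras were emitted unconditionally, producing jobs for an OS the pipeline was not configured to build.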
-
By Zhenghua Lyu
Previously the number of motions in subquery plans was not counted. This commit fixes that.
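The fix amounts to making the motion count recurse into subquery plans as well as regular children. The dict-based plan representation below is made up for illustration; it is not GPDB's plan tree structure.

```python
# Sketch of the fix: count Motion nodes recursively, including those in
# subquery plans, which the old code missed (plan shape is illustrative).
def count_motions(plan):
    """plan: dict with 'type', optional 'children' and 'subplans' lists."""
    total = 1 if plan['type'] == 'Motion' else 0
    for child in plan.get('children', []):
        total += count_motions(child)
    for sub in plan.get('subplans', []):   # previously not counted
        total += count_motions(sub)
    return total
```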
-
- 06 Feb 2020, 1 commit
-
-
By Hubert Zhang
Use 'select pg_catalog.gp_acquire_sample_rows(...)' instead of 'select * from pg_catalog.gp_acquire_sample_rows(...) as (...)' to avoid specifying the columns of the function's return value explicitly. The old style requires USAGE privilege on each column, which is not consistent with GPDB 5X. The following SQL fails the ACL check on master now:
revoke all on schema public from public;
create role gmuser1;
grant create on schema public to gmuser1;
create extension citext;
create table testid (id int, test citext);
alter table testid owner to gmuser1;
analyze testid;
Idea from Ashwin Agrawal <aagrawal@pivotal.io>
Idea from Taylor Vesely <tvesely@pivotal.io>
Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
-
- 05 Feb 2020, 5 commits
-
-
By Alexandra Wang
neededColumnContextWalker() is called to scan through Vars for the targetlist, quals, etc. It should only look at Vars for the table being scanned and ignore all other Vars. Currently, we are not aware of any plans which can produce a situation where neededColumnContextWalker() will encounter some other Vars, but in GPDB5 we get OUTER Vars here if an index scan is the right tree of a NestedLoop join. Hence, it seems better to have the protective code so we never write out of bounds. Also adds a test to cover the scenario, which was missing.
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Asim R P <apraveen@pivotal.io>
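The protective check can be modeled like this. This is a Python sketch of the idea, not the C walker: the varno constants and the (varno, attno) representation are assumptions made for illustration.

```python
# Sketch of the guard: when collecting needed columns, accept only Vars that
# refer to the relation being scanned; skip OUTER/INNER references so the
# needed-columns array is never indexed out of bounds.
SCAN_VARNO = 1          # varno of the scanned relation (illustrative)
OUTER_VARNO = 65001     # placeholder for an OUTER-reference varno

def needed_columns(vars_seen, natts):
    """vars_seen: iterable of (varno, attno); returns a per-column bitmap."""
    needed = [False] * natts
    for varno, attno in vars_seen:
        if varno != SCAN_VARNO:
            continue                 # other relation or OUTER/INNER Var: skip
        if 1 <= attno <= natts:      # bounds check before writing
            needed[attno - 1] = True
    return needed
```

Without the `varno` filter, an OUTER Var with a large attno would write past the end of the array, which is exactly the out-of-bounds write the commit guards against.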
-
By Shreedhar Hardikar
Better cardinality estimation for citext in ORCA
-
By Mel Kiyama
* reorganize ddboost replication information.
  --move replication info. into separate topic.
  --update toc
* docs - updated docs based on review comments.
  --created sections for gpbackup and gpbackup_manager
  --added link to example config. files.
-
By Huiliang.liu
Use bin_gpdb_centos7 instead of bin_gpdb_centos7_rc as bin_gpdb in the test_gpdb_clients_windows job. Update the output file of the gpfdist_ssl test case, since the message changed after the external table refactor.
-
By Mel Kiyama
* docs - resource group support of runaway query detection
  Update GUC runaway_detector_activation_percent. Add cross references in:
  --Admin Guide resource group memory management topic
  --CREATE RESOURCE GROUP parameter MEMORY_AUDITOR
  This will be backported to 5X_STABLE.
* docs - minor edit
* docs - review comment updates
* docs - simplified description for resource groups
  --replaced requirement for vmtracker mem. auditor w/ admin_group and default_group
  --Added global shared memory example from Simon
* docs - created an Admin Guide section for resource group automatic query termination.
* docs - fix math error
-
- 04 Feb 2020, 3 commits
-
-
By Daniel Gustafsson
postgres_fdw was enabled by mistake in c9d4c1e5, but it should remain disabled as it's still undergoing work in order to function properly for Greenplum.
Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/Lepem3qwJGw
-
By Zhenghua Lyu
The isolation2/gdd test suite is for the global deadlock detector; a previous commit optimized it to run faster. gdd/end is used to reset the cluster's global-deadlock-detector state. It invokes the helper UDF pg_ctl to restart only the master's postmaster. pg_ctl may invoke test_postmaster_connection to check the restart status, and if that takes more time than the checking interval, it prints a different message, making this case flaky. Since at the end of gdd/end.sql we already have tests to make sure the cluster is in the correct condition, this commit fixes the flakiness by modifying the UDF pg_ctl: if it works correctly (the return value is 0) it just prints 'OK'; if something is wrong, it raises an exception containing the stdout and stderr for debugging.
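The stable-output contract of the modified helper can be sketched as below. The real helper is a SQL-callable UDF; this Python sketch only mimics its behavior, and the `runner` parameter is an assumption added so the logic is testable without a live cluster.

```python
import subprocess

# Sketch of the flakiness fix: swallow pg_ctl's timing-dependent progress
# chatter and report a stable 'OK' on success; on failure, raise with the
# full stdout/stderr so the test log has something to debug with.
def run_pg_ctl(argv, runner=subprocess.run):
    result = runner(argv, capture_output=True, text=True)
    if result.returncode == 0:
        return 'OK'                      # stable output regardless of timing
    raise RuntimeError('pg_ctl failed: stdout=%r stderr=%r'
                       % (result.stdout, result.stderr))
```

Because the success path always prints the same single token, the expected-output file for gdd/end no longer depends on how long the restart takes.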
-
By dyozie (https://github.com/greenplum-db/gpdb/pull/9414)
Squashed commit of the following docs analytics reorganization, authored by Lena Hunter <lhunter@pivotal.io>:
- e76c278c Merge branch 'feature/docs-analytics-2' of git://github.com/lenapivotal/gpdb into lenapivotal-feature/docs-analytics-2
- 98a67e11 edits from most recent review
- 0b203c8a link fixes
- 388d3f94 changes for 6.x -> 7.0 and link fixes
- 57eaafb1 fixing gp version
- cb4be6ad resolving conflicts
- 944bb49b adding menu times overview title change edits to intro for PL
- 09c0a07f edits from feedback
- f1ac6beb changes for new text.xml page
- 82221149 madlib page changes with diagram
- 4ef989ca small edit for DITA format
- e39cfbfa edits to phase 1 menu
- 5a88a930 edits to MADlib overview
- b3da2616 analytics edits
- ccdbdcb0 further edits to overview page
- db3679cd edits to gp analytics
- 54679c10 changes to analytics subject
- 40c4e35e updates to analytics work
- caa37b01 further edits to analytics
- 318a742a menu order change
- a01c62da added graphics folder for analytics
- c76c1b9d fixing xml error and image location/size
- 67c7d37f ditamaps and erb edits for menus
- b6be012d changes relating to ditamaps
- c3ac846d adding new overview page
- a6c37d7a testing image insert
- 3045f7d8 fixes to ref links
- 5693353a fixes for broken ref links
- d1b389f8 changes to analytics section
- 8c6a9b41 initial reorg of analytics work
-
- 31 Jan 2020, 11 commits
-
-
By Heikki Linnakangas
I initially thought that this was dead code, because you can't create a CHECK constraint on an external table normally. However, when you exchange an external table with a table partition, the partition's CHECK constraints, which check the partition boundaries, are applied to the external table. This fixes a regression failure in the 'partindex_test' test. Resurrect the old partition checking code, so that you get the same behavior as before. I'm not convinced this is really the best behavior, but this lets us move forward while we discuss what behavior we actually want.
Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/v-ZTreV0ud4/gZutaKYNDQAJ
-
By Heikki Linnakangas
This makes external tables less of a special case throughout the planner and executor. They're now mostly handled through the FDW API callbacks.
* Add a FDW handler function and implement the API routines to construct ForeignScan paths and Plan nodes, iterate through a scan, and to insert tuples to the external/foreign table. The API routines just call through to the existing external table functions, like external_getnext() and external_insert().
* Remove ExternalScan plan node in the executor and the ExternalScan path node from the planner. Use ForeignScan plan and path nodes instead, like for any other FDW. Move code related to external table planning to a new exttable_fdw_shim.c file.
* The parameters previously carried in the ExternalScan struct are now in a new ExternalScanInfo struct. (Or, the ExternalScan struct has been renamed to ExternalScanInfo, if you want to think of it that way.) It's not a plan node type anymore, but it still needs read/out function support so that the parameters can be serialized. Alternatively, the parameters could be carried in Lists of Values and other existing expression types, but a struct seems easier to handle. Perhaps the cleanest solution would be to use the ExtensibleNode infrastructure for it, but I'll leave that for another patch. As long as the external table FDW is in the backend anyway, it's simplest to have the out/read/copy funcs built-in as well.
* Modify ForeignScan executor node so that it can make up "fake ctids" like ExternalScan did, and also add "squelching" support to it.
* Remove special handling of external tables from ModifyTable node. It now uses the normal FDW routines for it. COPY still calls directly into the external_insert() function, because PostgreSQL doesn't support COPY into a foreign table until version 11. (We don't seem to have any tests for COPY TO/FROM an external table. TODO: add test.)
-
By Heikki Linnakangas
External tables now use relkind='f', like all foreign tables. They have an entry in pg_foreign_table, as if they belonged to a special foreign server called "exttable_server". That foreign server gets special treatment in the planner and executor, so that we still plan and execute it the same as before.
* ALTER / DROP EXTERNAL TABLE is now mapped to ALTER / DROP FOREIGN TABLE. There is no "OCLASS_EXTTABLE" anymore. This leaks through to the user in error messages, e.g.:
    postgres=# drop external table boo;
    ERROR:  foreign table "boo" does not exist
  and to the command tag on success:
    postgres=# drop external table boo;
    DROP FOREIGN TABLE
* psql \d now prints external tables as Foreign Tables.
Next steps:
* Use the foreign table API routines instead of special casing "exttable_server" everywhere.
* Get rid of the pg_exttable table, and store all the options in pg_foreign_table.ftoptions instead.
* Get rid of the extra fields in pg_authid that store permissions to create different kinds of external tables. Store them as ACLs in pg_foreign_server.
-
By Heikki Linnakangas
-
By Heikki Linnakangas
In PostgreSQL, ALTER TABLE is allowed where ALTER FOREIGN TABLE is. Allow external tables the same leniency.
-
By Heikki Linnakangas
It was set but never used.
-
By Heikki Linnakangas
The condition listed all possible values of relstorage, except for 'f' for RELSTORAGE_FOREIGN. The condition on relkind filters out foreign tables as well, so the condition on relstorage is redundant. (Although I don't think filtering out foreign tables was even the intention here.)
-
By Oliver Albertini
Now that we are moving to FDW, we can remove the old external-table-based PXF module. Since the PXF FDW currently only supports Greenplum (not stand-alone Postgres), `pxf_fdw` should live under `gpcontrib`.
Authored-by: Oliver Albertini <oalbertini@pivotal.io>
-
By Francisco Guerrero
Currently, every segment node retrieves metadata about the full list of fragments it is going to process, filters that list down to the fragments assigned to it, and then processes each fragment, one at a time. This operation can stress the external metadata servers when the Greenplum cluster is large, because every segment will connect to the external system at the same time to fetch metadata. An optimization was introduced in PXF to cache the metadata at the PXF Server level: when multiple segments were trying to access the same metadata, PXF would only issue one query to the external system. This helped improve the situation, but still, every segment host was fetching the same metadata. With Foreign Data Wrappers, this metadata query can be done in a single place, on master, and master can provide the information to the segments.
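The master-side distribution described above can be sketched as a single fetch followed by an assignment step. The round-robin policy and data shapes here are illustrative assumptions, not PXF's actual fragment-assignment algorithm.

```python
# Sketch of the optimization: the master fetches the fragment list once and
# hands each segment its share, so segments no longer all query the external
# metadata server themselves (round-robin assignment is illustrative).
def assign_fragments(fragments, num_segments):
    """Return {segment_id: [fragments]} computed once on the master."""
    assignment = {seg: [] for seg in range(num_segments)}
    for i, frag in enumerate(fragments):
        assignment[i % num_segments].append(frag)
    return assignment
```

One metadata round trip replaces N concurrent ones, which is the load reduction the commit is after.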
-
By Francisco Guerrero
PXF (Platform Extension Framework) provides access to external data in Greenplum [1]. Previously it was based on external tables; this commit introduces the FDW that can be used to communicate with PXF.
* Provide the skeleton for PXF Foreign Data Wrappers.
* Add pxf_fdw_version().
* Validation for WRAPPER, SERVER, USER-MAPPING, and FOREIGN TABLE options.
* Only build pxf_fdw when --enable_pxf is present. PXF is enabled by default, but this gives the user the option of turning off the building of the PXF contrib module with --disable-pxf.
* Integrate FDW with PXF's fragmenter and bridge. This allows a Greenplum user to create foreign tables that read from external data via PXF. See [2] for documentation. The only wire format (the format used to communicate between the PXF JVM and Greenplum segments) supported is TEXT for FDW. Other formats which come across as a binary stream of data (like Parquet) are not implemented. We validate and pass along the following options: header, delimiter, quote, escape, null, encoding, newline, fill_missing_fields, force_not_null, force_null, reject_limit, and reject_limit_type (percent/rows), and enforce precedence rules for `other_options`. For options other than protocol, resource, format, and reject_limit*, the following precedence rules apply:
  - Table options take precedence over all other options
  - User-Mapping options take precedence over server and wrapper options
  - Server options take precedence over wrapper options
* Add support for and validation of log_errors.
* Complete FDW write (master only). This is facilitated by externalizing a new function in COPY: `BeginCopy()`.
* Introduce a `config` option for SERVER. Access to Hadoop clusters is nuanced, because with a single set of configurations users are able to access HDFS, Hive, HBase and other services. Suppose an enterprise user has a Hortonworks hadoop installation that includes HDFS, Hive, and HBase. We would configure one server per technology we access, for example:
    CREATE SERVER hdfs_hdp FOREIGN DATA WRAPPER hdfs_pxf_fdw OPTIONS ( config 'hdp_1' );
    CREATE SERVER hive_hdp FOREIGN DATA WRAPPER hive_pxf_fdw OPTIONS ( config 'hdp_1' );
    CREATE SERVER hbase_hdp FOREIGN DATA WRAPPER hbase_pxf_fdw OPTIONS ( config 'hdp_1' );
  To reduce the amount of configuration required for each server, the new `config` option provides the name of the server directory where the configuration files reside. In the example above, the configuration files live in the `$PXF_CONF/servers/hdp_1` directory, and all three servers share the same configuration directory.
* Default wire_format to CSV. Only when the file format is tab-delimited text do we use TEXT as the wire_format.
* Add column projection.
* Pass filter strings (WHERE clauses) to PXF.
[1] https://github.com/greenplum-db/pxf
[2] https://github.com/greenplum-db/pxf/blob/pxf-fdw-d/PXF_FDW.md
Co-authored-by: Oliver Albertini <oalbertini@pivotal.io>
Co-authored-by: Raymond Yin <ryin@pivotal.io>
Co-authored-by: Francisco Guerrero <aguerrero@pivotal.io>
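The option-precedence rules listed above map naturally onto layered dictionary merging. This is an illustrative model of the rules, not PXF's validator code; the dict-based representation is an assumption.

```python
# Sketch of the precedence rules: table options override user-mapping
# options, which override server options, which override wrapper options.
def resolve_options(wrapper=None, server=None, user_mapping=None, table=None):
    resolved = {}
    # Apply the lowest-precedence layer first so later updates win.
    for opts in (wrapper, server, user_mapping, table):
        resolved.update(opts or {})
    return resolved
```

A layer only needs to set the options it wants to override; everything else falls through from the layers beneath it, such as the shared `config 'hdp_1'` on the server.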
-
By Francisco Guerrero
Currently, BeginCopyToForExternalTable allows external tables to hook into the COPY code for writing. Instead of adding a similar BeginCopyToForForeignTable function, we instead expose BeginCopy. This will allow pxf_fdw to get a CopyState for writing data from Greenplum to an external source through PXF.
Co-authored-by: Francisco Guerrero <aguerrero@pivotal.io>
Co-authored-by: Oliver Albertini <oalbertini@pivotal.io>
-
- 30 Jan 2020, 9 commits
-
-
By Heikki Linnakangas
Remove unnecessary includes that referenced 'currentSliceId', the files don't actually use it. Fix placement of local variable in ExecInitNode.
-
By Heikki Linnakangas
The Plan->memoryAccountId field was removed in commit 7c9cc053.
-
By Heikki Linnakangas
When executor nodes are initialized at executor startup, in the ExecInitPlan() stage, any nodes that are not going to be executed in the current slice were assigned to the so-called Alien memory account. Previously, that was done to keep the useless nodes out of the "real" memory balances, but nowadays we normally don't bother initializing alien nodes in the first place. Alien node elimination can be disabled with 'set execute_pruned_plan=off', but that's a developer option that people shouldn't normally mess with. So in normal operation, the Alien memory account is never used. The Alien memory account was kept around when the alien node elimination was implemented (see commit 9b8f5c0b). The idea was that we could turn it on/off, and see how much we're saving by looking at the memory usage in the Alien memory account when it's turned 'off'. But that's hardly interesting anymore; we know that alien node elimination is useful, and it has worked great in production for some time now. We could probably get rid of the 'execute_pruned_plan' GUC altogether at this point, but I kept it for now. If you do turn it off, all the alien nodes will now get their own memory accounts, like non-alien nodes.
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
By Heikki Linnakangas
Commit 9936ca3b improved the cost model of multi-stage Aggregates, introducing a formula for estimating the number of distinct groups seen in each segment (the groupNumberPerSegment() function, later renamed to estimate_num_groups_per_segment()). However, we lost that with the rewrite of the multi-stage agg planning code in the 9.6 merge, and reverted to a more naive estimate. Put back the more accurate formula.
Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/zsl3m_Tcb1g/MCo7pY-vAgAJ
Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
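To illustrate the kind of estimate involved, here is a textbook occupancy-style formula for the expected number of distinct groups seen in one segment's share of the rows. This is a sketch of the general idea, not necessarily the exact formula used by estimate_num_groups_per_segment().

```python
# Occupancy estimate (illustrative, not GPDB's exact formula): drawing
# rows_per_segment rows uniformly from d distinct groups, the expected
# number of groups seen is d * (1 - (1 - 1/d) ** rows_per_segment).
def groups_per_segment(total_groups, total_rows, num_segments):
    rows_per_segment = total_rows / num_segments
    d = float(total_groups)
    return d * (1.0 - (1.0 - 1.0 / d) ** rows_per_segment)
```

The point of such a formula: with many rows per segment each segment sees nearly all groups, while with few rows per segment it sees far fewer, and a naive estimate ignores that difference.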
-
By Mel Kiyama
* docs - add python3 information to the PL/Container configuration example. Also some other minor updates and fixes.
* docs - updates based on review comments for PL/Container support of Python 3
* docs - minor edit
-
By Sambitesh Dash
-
By Sambitesh Dash
-
By Sambitesh Dash
-
By Sambitesh Dash
As of now ORCA doesn't support a multi-argument DQA, like:
SELECT distinct (a,b) from foo;
Earlier the planner didn't support it either, so we errored out in the parser itself. But now the planner supports it, so ORCA needs to handle the fallback.
Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
-
- 29 Jan 2020, 2 commits
-
-
By Heikki Linnakangas
In a query with a single DISTINCT-qualified aggregate, the row count estimate for the bottom deduplication aggregate steps was taken from the overall aggregation's row count estimate. That could be dramatically different. For example, in this query from the regression tests:
> explain select sum(distinct b) from olap_test_single;
>                                                   QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=166.23..166.24 rows=1 width=8)
>    ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=166.19..166.22 rows=1 width=8)
>          ->  Partial Aggregate  (cost=166.19..166.20 rows=1 width=8)
>                ->  HashAggregate  (cost=166.16..166.19 rows=1 width=4)
>                      Group Key: b
>                      ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=165.00..165.99 rows=11 width=4)
>                            Hash Key: b
>                            ->  Streaming HashAggregate  (cost=165.00..165.33 rows=11 width=4)
>                                  Group Key: b
>                                  ->  Seq Scan on olap_test_single  (cost=0.00..115.00 rows=3334 width=4)
>  Optimizer: Postgres query optimizer
> (11 rows)
Before this patch, the Streaming HashAggregate at the bottom had a row count estimate of 1 row. One row is correct for the overall query, as an aggregate query with no GROUP BY always returns one row, but wildly incorrect for the deduplicating Streaming HashAggregate: it returns as many rows as there are distinct values. The aggregation that rolls them up to one row only happens in the Partial and Finalize Aggregate steps.
Reviewed-by: Taylor Vesely <tvesely@pivotal.io>
-
By Heikki Linnakangas
A number of small changes to improve the readability of the function:
- Introduce NUM_SAMPLE_FIXED_COLS constant for the number of "header" columns in the gp_acquire_sample_rows() result set.
- Rename 'rows' and 'numrows' fields to 'sample_rows' and 'num_sample_rows', to make it more clear that they refer to the rows in the sample, not to the rows in the function's result set. (The result set has one row per sample row, plus one summary row, emitted by each segment.)
- Rename 'natts' local variable to 'live_natts', to make it more clear that it does not include dropped cols.
- Remove 'natts' (live_natts) from the state struct. It can be computed from the number of output attributes.
- Remove redundant code to initialize output values/isnulls arrays for the summary row. They were initialized to all-NULLs twice, which is harmless but unnecessary and confusing.
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-