- 09 Nov, 2020 1 commit
-
-
Committed by xiaoxiao
* Refactor the gpload test file TEST.py: 1. migrate the gpload tests to pytest; 2. new function to generate the config file through the yaml package and make it more reasonable; 3. add a case to cover the gpload update_condition argument.
* Migrate gpload and TEST.py to Python 3.6. New test case 43 tests gpload behavior when a column name has capital letters and no data type. Change some ans files since psql reacts differently.
* Change the SQL that finds reusable external tables to make gpload compatible with both GP7 and GP6. Improve TEST.py to write the config file with the ruamel.yaml module.
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
-
- 25 Sep, 2020 5 commits
-
-
Committed by Jamie McAtamney
Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
-
Committed by Jamie McAtamney
This commit makes several broad changes to address conversion issues common to multiple utilities:
- The input and output of subprocess in Python 3 are now bytestrings instead of strings, so some sanitizing of inputs and outputs is necessary.
- Many built-in functions like raw_input and __cmp__ are deprecated in Python 3, and as a side effect list sorting and hashing work differently, requiring a different set of helper functions.
- Implicit relative imports no longer work, so dbconn (in utilities code) and mgmt_utils (in test code) must be added to the search path and imported using a full path instead.
- File objects require flush methods in Python 3, and popen2 has been deprecated.
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>
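The bytestring point above can be sketched in isolation (a minimal illustration of the Python 3 behavior, not the utilities' actual helper code):

```python
import subprocess
import sys

# Under Python 3, subprocess returns bytes by default; decode before
# comparing against str, or pass text=True to get str directly.
raw = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
).stdout                                  # bytes, e.g. b'hello\n'

decoded = raw.decode("utf-8").strip()     # now a str: 'hello'

# Equivalent, letting subprocess do the decoding:
text = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    text=True,
).stdout.strip()
```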
-
Committed by Tyler Ramer
The subprocess32 package is a backport of Python 3 subprocess functionality to Python 2, so with the upgrade to Python 3 it is no longer necessary. This commit deletes the package from pythonSrc and changes import statements to import subprocess directly, instead of falling back to it only if subprocess32 is not importable.
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>
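The import-pattern change described above looks roughly like this (a sketch of the common fallback idiom, not a line lifted from the gpMgmt sources):

```python
# Old Python 2 pattern: prefer the subprocess32 backport, fall back to
# the stdlib module only when the backport is not installed.
try:
    import subprocess32 as subprocess  # Python 2 environments only
except ImportError:
    import subprocess

# New Python 3 pattern: the stdlib module already has the backported
# behavior, so import it directly.
import subprocess
```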
-
Committed by Tyler Ramer
- Update Python file shebangs to use python3, and update gp_replicate_check and gpversion.py to allow running under Python 3
- Use CentOS 7 dev containers with Python 3 and pip3 installed for testing, as prod containers do not yet work with Python 3, and update Travis with Python 3
- Install dependencies with pip3 to get Python 3-compatible versions
- Copy the Python 3 version of .so files, don't unset PYTHONHOME and PYTHONPATH, and don't remove built files from install locations, so that the Python 2 and Python 3 versions of various files can coexist
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Kris Macoskey <kmacoskey@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>
-
Committed by Jamie McAtamney
The 2to3 utility is an officially-supported script that automatically converts Python 2 code to Python 3. It's not a complete fix by any means, but it handles most basic syntax transformations and the like. This commit is the result of running 2to3 against every Python file in the gpMgmt directory, so it's quite large and fairly scattershot. Manual updates to any code that 2to3 can't handle will come in later commits.
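For context, these are typical transformations 2to3 applies (Python 2 originals shown in comments; an illustration, not code from gpMgmt itself):

```python
# print "done"              ->  print function call
print("done")

# d.iteritems() / d.iterkeys() -> views, wrapped in list() by 2to3
d = {"a": 1}
items = list(d.items())
keys = list(d.keys())

# x = raw_input()           ->  x = input()   (not called here)

# except ValueError, e:     ->  "as" syntax
try:
    raise ValueError("boom")
except ValueError as e:
    message = str(e)
```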
-
- 16 Sep, 2020 1 commit
-
-
Committed by xiaoxiao
* Add double quotations when creating the staging table; omit the distribution key.
* Fix gpload failing when column names have capital letters in merge mode.
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
-
- 07 Sep, 2020 2 commits
-
-
Committed by xiaoxiao
* Fix gpload failing when there are capital letters in a column name in merge mode: add double quotations around column names when creating staging tables; omit the distribution key.
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
- 31 Aug, 2020 1 commit
-
-
Committed by xiaoxiao
Fix the match column condition to resolve primary key conflicts when using the gpload merge mode to import data into a multi-level partitioned table. Fix failures when there are special characters or capital letters in column names.
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
-
- 08 Jul, 2020 1 commit
-
-
Committed by Tyler Ramer
The version of PyYAML vendored in gpMgmt/bin/ext is old, unmaintained, and does not support Python 3. In fact, it does not even contain a `__version__` attribute, so it is not possible to know the version. We need to unvendor YAML and move to a library version that supports Python 3 - for this reason, we are updating to the latest PyYAML available. Also update yaml.load to use yaml.safe_load instead.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
-
- 17 Jun, 2020 1 commit
-
-
Committed by Tyler Ramer
This commit updates pygresql from 4.0.0 to 5.1.2, which requires numerous changes to take advantage of the major result syntax change that pygresql 5 implemented. Of note, cursors and query objects automatically cast returned values to the appropriate Python types - a list of ints, for example, instead of a string like "{1,2}". This is the bulk of the changes.
Updating to pygresql 5.1.2 provides numerous benefits, including the following:
- CVE-2018-1058 was addressed in pygresql 5.1.1
- We can save notices in the pgdb module, rather than relying on importing the pg module, thanks to the new "set_notices()"
- pygresql 5 supports Python 3
- Thanks to a change in the cursor, using a "with" block guarantees a "commit" on the close of the block
This commit is a starting point for additional changes, including refactoring the dbconn module. Additionally, since isolation2 uses pygresql, some pl/python scripts were updated, and isolation2 SQL output is further decoupled from pygresql. The output of a psql command should be similar enough to isolation2's pg output that minimal or no modification is needed to ensure gpdiff can recognize the output.
Co-authored-by: Tyler Ramer <tramer@pivotal.io>
Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
-
- 03 Jun, 2020 1 commit
-
-
Committed by Wen Lin
-
- 15 May, 2020 1 commit
-
-
Committed by Wen Lin
While gpload is loading data, if the config file contains "error_table" and doesn't contain "preload", an error of no attribute "staging_table" or "fast_path" occurs.
-
- 12 May, 2020 1 commit
-
-
Committed by Peifeng Qiu
gpload in the latest Windows client package requires the VS redistributable package. Output a more meaningful message if pg.py fails to load.
-
- 03 Apr, 2020 1 commit
-
-
Committed by Ryan Zhang
Co-authored-by: Ryan <ryan@chapterx.com>
-
- 20 Mar, 2020 1 commit
-
-
Committed by (Jerome)Junfeng Yang
In ETL user scenarios, there are cases that frequently create and drop the same external table, and once the external table gets dropped, all errors stored in the error log are lost.
To make the error log persistent for external tables with the same "dbname"."namespace"."table", bring in the "error_log_persistent" external table option. If the external table is created with `OPTIONS (error_log_persistent 'true')` and `LOG ERRORS`, the external table's error log will be named "dbid_namespaceid_tablename" under the "errlogpersistent" directory, and dropping the external table will skip deleting the error log. Since GPDB 5 and 6 still use pg_exttable's options to mark LOG ERRORS PERSISTENTLY, keep the ability to load from OPTIONS(error_log_persistent 'true').
Create a separate `gp_read_persistent_error_log` function to read the persistent error log. If the external table gets deleted, only the namespace owner has permission to delete the error log.
Create a separate `gp_truncate_persistent_error_log` function to delete the persistent error log. If the external table gets deleted, only the namespace owner has permission to delete the error log. It also supports wildcard input to delete error logs belonging to a database or the whole cluster.
If an external table created with `error_log_persistent` is dropped, and the same "dbname"."namespace"."table" external table is then created without the persistent error log option, it'll write errors to the normal error log. The persistent error log still exists.
Reviewed-by: HaozhouWang <hawang@pivotal.io>
Reviewed-by: Adam Lee <ali@pivotal.io>
-
- 28 Feb, 2020 1 commit
-
-
Committed by Huiliang.liu
Add a max_retries flag for gpload. It indicates the maximum number of retries when connecting to GPDB times out. The max_retries default value is 0, which means no retry. If max_retries is -1 or another negative value, it means retry forever. Testing has been done manually.
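A retry loop of the shape described above could be sketched as follows (a hedged illustration of the semantics - 0 means no retry, negative means retry forever; `connect_with_retries` and `ConnectionTimeout` are illustrative names, not gpload's actual code):

```python
import time

class ConnectionTimeout(Exception):
    """Stand-in for a connection-timeout error."""

def connect_with_retries(connect, max_retries, delay=1.0):
    """Call connect(); on timeout, retry up to max_retries times.

    max_retries == 0 means no retry; a negative value retries forever.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionTimeout:
            if max_retries >= 0 and attempt >= max_retries:
                raise  # retries exhausted; propagate the timeout
            attempt += 1
            time.sleep(delay)

# Tiny demonstration: fail twice, then succeed on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionTimeout()
    return "connected"

result = connect_with_retries(flaky, max_retries=5, delay=0)
```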
-
- 31 Jan, 2020 2 commits
-
-
Committed by Heikki Linnakangas
External tables now use relkind='f', like all foreign tables. They have an entry in pg_foreign_table, as if they belonged to a special foreign server called "exttable_server". That foreign server gets special treatment in the planner and executor, so that we still plan and execute it the same as before.
* ALTER / DROP EXTERNAL TABLE is now mapped to ALTER / DROP FOREIGN TABLE. There is no "OCLASS_EXTTABLE" anymore. This leaks through to the user in error messages, e.g.:
    postgres=# drop external table boo;
    ERROR:  foreign table "boo" does not exist
  and in the command tag on success:
    postgres=# drop external table boo;
    DROP FOREIGN TABLE
* psql \d now prints external tables as Foreign Tables.
Next steps:
* Use the foreign table API routines instead of special-casing "exttable_server" everywhere.
* Get rid of the pg_exttable table, and store all the options in pg_foreign_table.ftoptions instead.
* Get rid of the extra fields in pg_authid that store permissions to create different kinds of external tables. Store them as ACLs in pg_foreign_server.
-
Committed by Heikki Linnakangas
The condition listed all possible values of relstorage, except 'f' for RELSTORAGE_FOREIGN. The condition on relkind filters out foreign tables as well, so the condition on relstorage is redundant. (Although I don't think filtering out foreign tables was even the intention here.)
-
- 01 Nov, 2019 1 commit
-
-
Committed by Huiliang.liu
gpload: change the metadata query SQL to improve performance. The old query SQL may take a long time if the catalog is large.
-
- 27 Sep, 2019 1 commit
-
-
Committed by Huiliang.liu
-
- 20 Sep, 2019 1 commit
-
-
Committed by Paul Guo
* Ship the modified python module subprocess32 again
subprocess32 is preferred over subprocess according to the Python documentation. In addition, we long ago modified its code to use vfork() instead of fork() to avoid "Cannot allocate memory" kinds of errors (false alarms though - memory is actually sufficient) in gpdb production environments, which usually have memory overcommit disabled. We compiled and shipped it back then, but later it was just compiled and not shipped, somehow due to a makefile change (maybe a regression). Let's ship it again.
* Replace subprocess with our own subprocess32 in Python code.
-
- 26 Aug, 2019 2 commits
-
-
Committed by Huiliang.liu
-
Committed by Huiliang.liu
* Get the gpdb version and support gpdb5 and gpdb6
* Add gpversion.py to the Windows package
-
- 09 Jul, 2019 1 commit
-
-
Committed by Daniel Gustafsson
Setting a variable to itself is a no-op which can be removed. This may have been introduced in error and instead be masking a real bug, but if so, we have lived with it for two years, so I'm opting for removal.
Reviewed-by: Asim R P and Bhuvnesh Chaudhary
-
- 10 Apr, 2019 1 commit
-
-
Committed by Ben Christel
We don't support Greenplum on these platforms. Some files (e.g. Makefile.{hpux,solaris}) have been left in place because they are upstream postgres files. Removing them isn't worth the headache it would cause when merging commits from postgres.
Authored-by: Ben Christel <bchristel@pivotal.io>
-
- 01 Feb, 2019 1 commit
-
-
Committed by Heikki Linnakangas
This is in preparation for adding operator classes as a new column (distclass) to gp_distribution_policy. This naming is consistent with pg_index.indkey/indclass.
Change the datatype to int2vector, also for consistency with pg_index and some other catalogs that store attribute numbers, and because int2vector is slightly more convenient to work with in the backend. Move the column to the end of the table, so that all the variable-length and nullable columns are at the end, which makes it possible to reference the other columns directly in Form_gp_policy.
Add a backend function, pg_get_table_distributedby(), to deparse the DISTRIBUTED BY definition of a table into a string. This is similar to the pg_get_indexdef_columns(), pg_get_functiondef() etc. functions that we have. Use the new function in psql and pg_dump when connected to a GPDB6 server.
Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Co-authored-by: Peifeng Qiu <pqiu@pivotal.io>
Co-authored-by: Adam Lee <ali@pivotal.io>
-
- 17 Jan, 2019 1 commit
-
-
Committed by Daniel Gustafsson
This removes a duplicate import and a few set-but-never-used vars from the gpload.py code, as well as the including_defaults token, as it was clearly unused. Also fixes a few typos while in there, one of which is in a user-facing error message.
Reviewed-by: Jacob Champion <pchampion@pivotal.io>
-
- 13 Dec, 2018 1 commit
-
-
Committed by Daniel Gustafsson
The Greenplum-specific error handling via ereport()/elog() calls was in need of a unification effort, as some parts of the code were using a different messaging style to others (and to upstream). This aims at bringing many of the GPDB error calls in line with the upstream error message writing guidelines and thus making the user experience of Greenplum more consistent.
The main contributions of this patch are:
* errmsg() messages shall start with a lowercase letter and not end with a period. errhint() and errdetail() shall be complete sentences starting with a capital letter and ending with a period. This attempts to fix this on as many ereport() calls as possible, with too-detailed errmsg() content broken up into details and hints where possible.
* Reindent ereport() calls to be more consistent with the common style used in upstream and most parts of Greenplum:
    ereport(ERROR,
            (errcode(<CODE>),
             errmsg("short message describing error"),
             errhint("Longer message as a complete sentence.")));
* Avoid breaking messages across long lines, since that makes grepping for error messages harder when debugging. This is also the de facto standard in upstream code.
* Convert a few internal error ereport() calls to elog(). There are no doubt more that can be converted, but the low-hanging fruit has been dealt with. Also convert a few elog() calls which are user-facing to ereport().
* Update the test files to match the new messages.
Spelling and wording are mostly left for a follow-up commit, as this was getting big enough as it was. The most obvious cases have been handled, but there is work left to be done here.
Discussion: https://github.com/greenplum-db/gpdb/pull/6378
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
- 30 Nov, 2018 1 commit
-
-
Committed by Daniel Gustafsson
Reviewed-by: Jacob Champion <pchampion@pivotal.io>
Reviewed-by: Jimmy Yih <jyih@pivotal.io>
-
- 29 Nov, 2018 1 commit
-
-
Committed by Daniel Gustafsson
While == None works for comparison, it's a wasteful operation, as it performs type conversion and expansion. Instead, move to using the "is" operator, which is the documented best practice for Python code.
Reviewed-by: Jacob Champion
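The difference can be shown in isolation (a minimal illustration of the general Python idiom, not gpload's actual code): `== None` goes through the rich-comparison machinery, including any user-defined `__eq__`, while `is None` is a plain identity check.

```python
class Chatty:
    """A class whose __eq__ runs arbitrary code, so `== None`
    can do real work behind the scenes."""
    calls = 0

    def __eq__(self, other):
        Chatty.calls += 1          # comparison machinery was invoked
        return other is self

obj = Chatty()

# `== None` invokes __eq__ ...
eq_result = (obj == None)  # noqa: E711 - shown only for contrast
# ... while `is` never calls user code; it only compares identity.
is_result = (obj is None)
```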
-
- 14 Nov, 2018 1 commit
-
-
Committed by Huiliang.liu
* Add the external table encoding option as a condition for finding a reusable table. Get the database default encoding if ENCODING is not set in the config file. Find the encoding code from the encoding string, and then add the encoding code as one of the conditions for finding a reusable table.
-
- 30 Jul, 2018 1 commit
-
-
Committed by Peifeng Qiu
The gpload test case runs gpload with subprocess, reads stdout and stderr from it, and waits for it to exit. sys.exit in gpload does some cleanup that may cause a deadlock between the test and gpload. os._exit exits immediately, but we need to flush stdout and stderr before calling it.
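The flush-then-_exit pattern can be demonstrated in isolation (a sketch, not gpload's code): os._exit skips atexit handlers and interpreter shutdown, including stdio flushing, so anything still buffered must be flushed explicitly first.

```python
import subprocess
import sys

# Child process that writes to a (block-buffered) pipe, flushes
# explicitly, then hard-exits with os._exit.
child_src = (
    "import os, sys\n"
    "sys.stdout.write('flushed before _exit')\n"
    "sys.stdout.flush()  # without this, buffered output may be lost\n"
    "os._exit(0)         # immediate exit: no cleanup, no deadlock\n"
)

proc = subprocess.run(
    [sys.executable, "-c", child_src],
    stdout=subprocess.PIPE,
)
output = proc.stdout.decode()
```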
-
- 24 Jul, 2018 1 commit
-
-
Committed by Huiliang.liu
- The results of the fast_match SQL don't include the schema name, so we need to add the schema name to extSchemaTable for fast_match
- Remove locationStr, which is unused
-
- 23 Jul, 2018 1 commit
-
-
Committed by Huiliang.liu
- Add a fast_match option to the gpload config file. If both reuse_tables and fast_match are true, gpload will try to fast-match an external table (without checking columns). If reuse_tables is false and fast_match is true, it will print a warning message.
-
- 23 Apr, 2018 1 commit
-
-
Committed by Peifeng Qiu
-
- 03 Apr, 2018 1 commit
-
-
Committed by Adam Lee
The pg_exttable.fmterrtbl column stored the OID of the error table, but without an error table it is just set to the OID of the external table itself. That is not necessary; there are other columns which indicate whether error logging is enabled, so this column can be removed.
-
- 27 Mar, 2018 1 commit
-
-
Committed by Peifeng Qiu
When gpload finishes its query, it sends SIGTERM to gpfdist. gpfdist handles SIGTERM with exit(1), which invokes the registered apr handlers and cleans up all apr resources, including the apr_pool. If this happens during the normal destruction of the apr_pool in do_close, gpfdist will hang. Call _exit in gpfdist to skip any cleanup handlers, and let gpload send SIGKILL to perform a hard kill.
-
- 26 Feb, 2018 1 commit
-
-
Committed by huiliang-liu
- If the data file contains "\N" as the null delimiter, it would not be recognized properly by gpload
- Root cause: gpload replaces the quote in the nullas option, and also replaces '\' with '\\'
- Solution: add a quote_no_slash function to handle the nullas option
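A function of the shape the fix describes could be sketched as follows (a hypothetical illustration - gpload's real quote_no_slash may differ): quote the value for SQL without doubling backslashes, so a nullas marker like \N survives intact.

```python
def quote_no_slash(val):
    """Single-quote val for SQL, escaping embedded single quotes but
    leaving backslashes untouched."""
    return "'" + val.replace("'", "''") + "'"

def quote_with_slash(val):
    """The problematic behavior: backslashes get doubled as well,
    which corrupts a nullas marker such as \\N."""
    return "'" + val.replace("\\", "\\\\").replace("'", "''") + "'"

nullas = "\\N"                    # the two characters backslash, N
good = quote_no_slash(nullas)     # marker preserved
bad = quote_with_slash(nullas)    # backslash doubled; no longer matches
```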
-