- 24 June 2020, 1 commit
-
-
Committed by Tyler Ramer

[Lockfile](https://pypi.org/project/lockfile/) has not been maintained since around 2015. Furthermore, the functionality it provided was poor: a review of the code showed that it used the presence of the PID file itself as the lock. On Unix, checking for a file's existence and then creating it is not atomic, so the lock was prone to race conditions. The lockfile package also did not clean up after itself: a process that died unexpectedly would not clear the locks it created, so some faulty logic was added to mainUtils.py that checked whether a process with the same PID as the lockfile's creator was running. This is obviously failure prone, as a new process might be assigned the same PID as the old lockfile's owner without actually being the same process. (Of note, the SIG_DFL argument to os.kill() is not a signal at all, but rather of type signal.handler. It appears that Python casts this handler to the int 0, which, according to man 2 kill, means no signal is sent but existence and permission checks are still performed. So it is a happy accident that this code worked at all.)

This commit removes lockfile from the codebase entirely. It also adds a PIDLockFile class that provides an atomicity-guaranteed lock via the mkdir and rmdir commands on Unix. It is therefore not safely portable to Windows, but this should not be an issue, as only Unix-based utilities use the simple_main() function. PIDLockFile provides API-compatible replacements for most of the functionality of lockfile.PidLockFile, but drops the timeout logic, as it was not used in any meaningful way: a hard-coded timeout of 1 second was used, where an immediate answer to whether the lock is held is sufficient.

PIDLockFile also includes appropriate __enter__, __exit__, and __del__ attributes, so that, should we extend this class in the future, the with syntax works, and __del__ calls release, so a process reaped unexpectedly still cleans up its own locks as part of garbage collection.

Authored-by: Tyler Ramer <tramer@pivotal.io>
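A minimal sketch of the mkdir-based locking idea described above (the class name and methods follow the commit's description; the actual PIDLockFile in the tree may differ in detail):

```python
import errno
import os


class PIDLockFile:
    """Advisory lock built on mkdir/rmdir, which are atomic on Unix."""

    def __init__(self, path):
        self.path = path  # the lock is the directory itself
        self._held = False

    def acquire(self):
        try:
            os.mkdir(self.path)  # atomic: fails if the directory already exists
        except OSError as e:
            if e.errno == errno.EEXIST:
                raise RuntimeError("lock is already held: %s" % self.path)
            raise
        # record our PID inside the lock directory for diagnostics
        with open(os.path.join(self.path, "PID"), "w") as f:
            f.write(str(os.getpid()))
        self._held = True

    def release(self):
        if self._held:
            os.remove(os.path.join(self.path, "PID"))
            os.rmdir(self.path)
            self._held = False

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()

    def __del__(self):
        # best-effort cleanup if the holder is garbage collected
        self.release()
```

Because release() is a no-op unless the lock is held, __del__ is safe to run after a normal release or after a failed acquire.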
-
- 23 June 2020, 2 commits
-
-
Committed by Tyler Ramer

PyGreSQL 5.2.0, which contains the fixes submitted and referenced in cb8d54a6, was released on June 21, 2020. Update the build process to use this tagged release rather than a pre-release hash.

Authored-by: Tyler Ramer <tramer@vmware.com>
-
Committed by Tyler Ramer

psutil 4.0.0 is quite old and only lists support for Python 3.4. We will need support for Python 3.6 and 3.8 as we update to Python 3.

Authored-by: Tyler Ramer <tramer@pivotal.io>
-
- 18 June 2020, 1 commit
-
-
Committed by Tyler Ramer

PyGreSQL may now be installed via pip or via Ubuntu apt. Update the Travis pipeline as well, using submodules to pull the necessary Python dependencies; they are therefore removed from the pip requirements as well.

Authored-by: Tyler Ramer <tramer@pivotal.io>
-
- 17 June 2020, 4 commits
-
-
Committed by Tyler Ramer

We encountered a bug in escaping dbname and connection options in PyGreSQL 5.1.2, for which we submitted a patch: https://github.com/PyGreSQL/PyGreSQL/pull/40 This has been merged, but it will take time to reach a tagged release. For this reason, we install from the source downloaded at commit https://github.com/PyGreSQL/PyGreSQL/commit/b1e040e989b5b1b75f42c1103562bfe8f09f93c3

Co-authored-by: Tyler Ramer <tramer@pivotal.io>
Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
-
Committed by Tyler Ramer

Due to the refactor of dbconn and newer versions of PyGreSQL, using `with dbconn.connect() as conn` no longer attempts to close the connection, even if it did before. Instead, this syntax uses the connection itself as the context and, as noted in execSQL, overrides execSQL's autocommit behavior. Therefore, close the connection manually to ensure that execSQL is auto-committed and the connection is closed.

Co-authored-by: Tyler Ramer <tramer@pivotal.io>
Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
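The same commit-but-don't-close behavior exists in the standard library's DB-API driver, so the pattern the commit adopts can be illustrated without Greenplum at all (sqlite3 here is an analogy, not the driver the commit touches):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The connection-as-context form commits on success (or rolls back on
# error) but does NOT close the connection.
with conn:
    conn.execute("CREATE TABLE t (n INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")

# Still open and usable after the with block...
rows = conn.execute("SELECT n FROM t").fetchall()

# ...so the connection must be closed manually, as the dbconn refactor does.
conn.close()
```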
-
Committed by Tyler Ramer

One reason PyGreSQL was previously modified was that it did not handle closing a connection very gracefully. In the process of updating PyGreSQL, we wrapped the connection it provides in a ClosingConnection function, which gracefully closes the connection when the `with dbconn.connect() as conn` syntax is used. This did, however, expose cases where a cursor created as the result of a dbconn.execSQL() call holds the connection open if not specifically closed. It is therefore necessary to remove the ability to get a cursor from dbconn.execSQL(). To highlight this difference, and to ensure that future use of this library is easy, I've cleaned up and clarified the dbconn execution code with the following changes:
- dbconn.execSQL() closes the cursor as part of the function and returns no rows
- dbconn.query() is added, which behaves like dbconn.execSQL() except that it returns a cursor
- dbconn.execQueryforSingleton() is renamed dbconn.querySingleton()
- dbconn.execQueryforSingletonRow() is renamed dbconn.queryRow()

Authored-by: Tyler Ramer <tramer@pivotal.io>
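The shape of the renamed API can be sketched over any DB-API connection (the function names come from the commit; the bodies below are illustrative stand-ins, not Greenplum's actual dbconn code):

```python
def execSQL(conn, sql):
    """Execute a statement; the cursor is closed internally, no rows returned."""
    cursor = conn.cursor()
    cursor.execute(sql)
    cursor.close()


def query(conn, sql):
    """Like execSQL, but returns the open cursor for the caller to consume."""
    cursor = conn.cursor()
    cursor.execute(sql)
    return cursor


def queryRow(conn, sql):
    """Return the single row of a single-row result."""
    cursor = query(conn, sql)
    rows = cursor.fetchall()
    cursor.close()
    if len(rows) != 1:
        raise ValueError("query did not return a single row")
    return rows[0]


def querySingleton(conn, sql):
    """Return the single value of a single-row, single-column result."""
    row = queryRow(conn, sql)
    if len(row) != 1:
        raise ValueError("query did not return a single column")
    return row[0]
```

The point of the split is visible in the signatures: only query() hands a live cursor to the caller, so only query() callers are responsible for closing it.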
-
Committed by Tyler Ramer

This commit updates PyGreSQL from 4.0.0 to 5.1.2, which requires numerous changes to take advantage of the major result-syntax change that PyGreSQL 5 implemented. Of note, cursors and query objects automatically cast returned values to the appropriate Python types: a list of ints, for example, instead of a string like "{1,2}". This accounts for the bulk of the changes. Updating to PyGreSQL 5.1.2 provides numerous benefits, including the following:
- CVE-2018-1058 was addressed in PyGreSQL 5.1.1
- We can save notices in the pgdb module, rather than relying on importing the pg module, thanks to the new set_notices()
- PyGreSQL 5 supports Python 3
- Thanks to a change in the cursor, using the "with" syntax guarantees a "commit" at the close of the with block

This commit is a starting point for additional changes, including refactoring the dbconn module. Additionally, since isolation2 uses PyGreSQL, some PL/Python scripts were updated, and isolation2 SQL output is further decoupled from PyGreSQL. The output of a psql command should be similar enough to isolation2's pg output that minimal or no modification is needed for gpdiff to recognize the output.

Co-authored-by: Tyler Ramer <tramer@pivotal.io>
Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
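To make the casting change concrete: under PyGreSQL 4, an `int[]` column came back as the raw literal and callers parsed it by hand, while PyGreSQL 5 returns a Python list directly. A rough sketch of the kind of manual parsing the upgrade removes (simplified: real array literals can also contain quoting, NULLs, and nesting, which this ignores):

```python
def parse_int_array(literal):
    """Parse a Postgres int[] literal such as '{1,2,3}' into a Python list.

    Simplified illustration of pre-PyGreSQL-5 client-side parsing; not
    a general array-literal parser.
    """
    inner = literal.strip().lstrip("{").rstrip("}")
    if not inner:
        return []
    return [int(item) for item in inner.split(",")]
```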
-
- 08 June 2020, 1 commit
-
-
Committed by Paul Guo

Use only the GUC gp_role now, folding the functionality of the GUC gp_session_role into it. Previously we had both GUCs. The difference between the two (copied from a code comment):

gp_session_role
- does not affect the operation of the backend, and
- does not change during the lifetime of a PostgreSQL session.

gp_role
- determines the operating role of the backend, and
- may be changed by a superuser via the SET command.

This is not friendly for coding. For example, Gp_role and Gp_session_role are both set to GP_ROLE_DISPATCH on the postmaster and many aux processes on all nodes (even QE nodes) in a cluster, so to tell a QD postmaster from a QE postmaster, current gpdb uses an additional -E option in the postmaster arguments. This confuses developers writing role-related branch code, given that there are three related variables, and some related code is even buggy now (e.g. 'set gp_role' FATAL quits). With this patch we have just gp_role. Some changes in the patch that might be interesting:
1. For the postmaster, specify '-c gp_role=' (e.g. via a pg_ctl argument) to determine the role, else the utility role is assumed.
2. For a stand-alone backend, the utility role is enforced (users need not specify it).
3. QE/QD nodes can still be connected in utility mode with PGOPTIONS, etc., as before.
4. Remove the '-E' gpdb hack and align '-E' usage with upstream.
5. Move pm_launch_walreceiver out of the FTS-related shmem, given that the latter is not used on QE.

Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
Reviewed-by: Hao Wu <gfphoenix78@gmail.com>
Reviewed-by: Yandong Yao <yyao@pivotal.io>
-
- 03 June 2020, 1 commit
-
-
Committed by Wen Lin
-
- 29 May 2020, 2 commits
-
-
Committed by ggbq

In most cases the variable LN_S is 'ln -s'; however, configure changes LN_S to 'cp -pR' if it finds that the file system does not support symbolic links. The two are incompatible when linking a file in a subdirectory to a relative path, so cd into the subdirectory first before linking a file.
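The root of the incompatibility is that `ln -s` resolves a relative target against the link's own directory, while `cp -pR` resolves it against the current directory; running the command from inside the subdirectory makes the two agree. The symlink side of that asymmetry can be demonstrated in Python (Unix only):

```python
import os
import tempfile

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "subdir"))
with open(os.path.join(root, "target.txt"), "w") as f:
    f.write("hello")

# A relative target is stored verbatim and resolved against the link's
# directory: subdir/ has no target.txt, so this link dangles.
broken = os.path.join(root, "subdir", "broken")
os.symlink("target.txt", broken)

# Spelling the target relative to the link's directory works.
good = os.path.join(root, "subdir", "good")
os.symlink(os.path.join("..", "target.txt"), good)
```

A plain `cp -pR target.txt subdir/broken` run from `root` would have succeeded where the first symlink dangles, which is exactly the divergence the commit avoids by cd'ing first.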
-
Committed by Hubert Zhang

When introducing a new mirror, we need two steps: 1. start the mirror segment, 2. update the gp_segment_configuration catalog. Previously gp_add_segment_mirror was called to update the catalog, but the dbid chosen by get_availableDbId() is not guaranteed to match the dbid in internal.auto.conf. Reported in issue 9837.

Reviewed-by: Paul Guo <pguo@pivotal.io>
-
- 19 May 2020, 1 commit
-
-
Committed by Adam Lee
-
- 18 May 2020, 6 commits
-
- 15 May 2020, 1 commit
-
-
Committed by Wen Lin

While gpload is loading data, if the configuration file contains "error_table" and does not contain "preload", an error of no attribute "staging_table" or "fast_path" occurs.
-
- 14 May 2020, 9 commits
-
-
Committed by Tyler Ramer

I'm not quite sure of the purpose of this utility, nor, apparently, is any README or the historical repo. Apart from a small fix provided in commit 71d67305, there has been no modification to this file since at least 2008. More importantly, I'm not quite sure of any reasonable use for this file. The supported platforms are only linux, darwin, and sunos5, and the listed use, printing the memory size in bytes, is trivial on any of those systems without resorting to a Python script that wraps a command-line call. Given that it hasn't been updated since 2008, it's still compatible with some ancient version of Python, which means it's yet another file to upgrade to Python 3; in this case, let's drop the program rather than bother upgrading it.

Authored-by: Tyler Ramer <tramer@pivotal.io>
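For reference, the utility's one output, total physical memory in bytes, really is a one-liner today (shown with `os.sysconf`; these sysconf names are available on Linux but not on every platform):

```python
import os

# Total physical memory in bytes: page size times number of physical pages.
mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
print(mem_bytes)
```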
-
Committed by Ning Yu

The pg_partition_oid_index of template0 is used as a template for empty indexes; its path, however, is not fixed, so we need to determine it at runtime.
-
Committed by Ning Yu

It is no longer needed; the correct approach is to install meta-only index files on the new segments.
-
Committed by Ning Yu

An "empty" b-tree index file is not actually empty: it contains the meta page. By transferring meta-only index files to the new segments, they can be launched directly without the "ignore_system_indexes" setting, and we do not need an extra relaunch of the new segments. We use base/13199/5112, the pg_partition_oid_index of template0, as the template for meta-only index files.
-
Committed by Ning Yu

It was introduced to exclude a large number of paths. Also changed the exclusion logic for './db_dumps' and './promote': they were excluded only when an empty 'excludePaths' was specified by the caller, which is weird, so I changed the logic to always exclude these two paths.
-
Committed by Ning Yu

- Be careful when creating placeholders for the master-only files in the template: raise an error if they already exist.
- Increase code readability slightly.
-
Committed by Ning Yu

Gpexpand creates new primary segments by first creating a template from the master datadir and then copying it to the new segments. Some catalog tables are only meaningful on the master, such as gp_segment_configuration; their contents are cleared on each new segment with "delete from ..." commands. This works but is slow, because we have to include the contents of the master-only tables in the archive, distribute them via the network, and clear them via the slow "delete from ..." commands. (The "truncate" command is fast, but it is disallowed on catalog tables, as the filenode of a catalog table must not change.) To make this faster we now exclude these tables from the template directly, so less data is transferred and there is no need to "delete from" them explicitly.
-
Committed by Ning Yu

When cleaning up the master-only files on the new segments, we used to do the job one segment at a time; with tens or hundreds of segments this can be very slow. Now we clean up in parallel.
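The serial-to-parallel change can be sketched with the standard library (the segment list and the per-segment cleanup function below are placeholders, not gpexpand's real code):

```python
from concurrent.futures import ThreadPoolExecutor


def cleanup_segment(segment):
    """Placeholder for removing the master-only files on one segment."""
    return "cleaned %s" % segment


segments = ["seg%d" % i for i in range(8)]

# Before: one segment at a time.
#   results = [cleanup_segment(seg) for seg in segments]

# After: run the per-segment cleanups concurrently; map() preserves order.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(cleanup_segment, segments))
```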
-
Committed by Ning Yu

Removed the duplicated 'gp_segment_configuration' entry in the MASTER_ONLY_TABLES list. Also sorted the list alphabetically to prevent duplicates in the future.
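The sorted-and-unique invariant can even be asserted mechanically; a sketch with a hypothetical excerpt (the real MASTER_ONLY_TABLES contents live in the gpexpand sources):

```python
# Hypothetical excerpt of the list; only gp_segment_configuration is
# named in the commits above.
MASTER_ONLY_TABLES = [
    "gp_configuration_history",
    "gp_segment_configuration",
    "pg_statistic",
]

# A sorted, duplicate-free list compares equal to its sorted set.
assert MASTER_ONLY_TABLES == sorted(set(MASTER_ONLY_TABLES)), \
    "MASTER_ONLY_TABLES must be sorted and free of duplicates"
```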
-
- 12 May 2020, 1 commit
-
-
Committed by Peifeng Qiu

gpload in the latest Windows client package requires the VS redistributable package. Output a more meaningful message if pg.py fails to load.
-
- 05 May 2020, 1 commit
-
-
Committed by Tyler Ramer

This commit removes any reference to vendored Python, formerly installed in $GPHOME/ext/python. Unvendoring Python means that a system Python 2.7 is required. To make this possible, several sub-fixes or testing-scope fixes are required:
- Python requirements should be installed globally using pip
- References to PYTHONHOME are removed
- PYTHONPATH becomes "$GPHOME/lib/python:${PYTHONPATH}"

GCC is no longer overridden as part of the gpAux makefile process:
- Previously, the gpAux Makefile overrode the $CC variable with the value "gcc". This breaks convention, which is itself a problem, but it is also broken because the top-level Makefile and configure DO respect a CC variable being set.
- Setting CC="gcc" also means that a gcc binary must be on the user's path. This is not a requirement or a guarantee for compiling, so why keep this behavior?
- However, Python packages should be compiled with the same GCC version that compiled system Python; thus, we unset the CC variable when installing additional Python libraries. Specifically, the configure args used to compile Python are saved and reused to compile libraries when the python setup.py build process is used. So if system Python was built with compiler flags that a newer version of gcc no longer accepts, the library build fails. This is specifically of note when compiling Python libraries on SLES, where the system Python compiled with GCC 4.8 uses the `-fstack-clash-protection` flag, which is replaced by `-fstack-protector` in newer GCC versions. The configure args passed thus cause a compile failure with an "unrecognized command line option" error if a newer gcc version is used.

This does make significant improvements to simplify the code building and testing framework:
- The patchelf requirement goes away, as virtualenv is no longer necessary
- There is no need to copy system or other Python into $GPHOME/etc/python

This commit does not address any of the following:
- Unvendoring individual Python libraries, like psutil, pygresql, or yaml
- Updating any Python code to work with Python newer than 2.7

Co-authored-by: Tyler Ramer <tramer@pivotal.io>
Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
-
- 15 April 2020, 1 commit
-
-
Committed by Daniel Gustafsson
Various typos spotted in internal in-tree documentation.
-
- 03 April 2020, 1 commit
-
-
Committed by Ryan Zhang

Co-authored-by: Ryan <ryan@chapterx.com>
-
- 20 March 2020, 1 commit
-
-
Committed by (Jerome)Junfeng Yang

For ETL scenarios, some workloads frequently create and drop the same external table, and once the external table is dropped, all errors stored in its error log are lost. To make the error log persistent for external tables with the same "dbname"."namespace"."table", introduce the "error_log_persistent" external table option. If an external table is created with `OPTIONS (error_log_persistent 'true')` and `LOG ERRORS`, its error log is named "dbid_namespaceid_tablename" under the "errlogpersistent" directory, and dropping the external table does not delete the error log. Since GPDB 5 and 6 still use pg_exttable's options to mark LOG ERRORS PERSISTENTLY, keep the ability to load from OPTIONS(error_log_persistent 'true').

Create a separate `gp_read_persistent_error_log` function to read the persistent error log. If the external table has been deleted, only the namespace owner has permission to read the error log.

Create a separate `gp_truncate_persistent_error_log` function to delete the persistent error log; again, if the external table has been deleted, only the namespace owner has permission to delete the error log. It also supports wildcard input to delete the error logs belonging to a whole database or the whole cluster.

If an external table created with `error_log_persistent` is dropped, and the same "dbname"."namespace"."table" external table is then created without the persistent error log option, errors are written to the normal error log; the persistent error log still exists.

Reviewed-by: Haozhou Wang <hawang@pivotal.io>
Reviewed-by: Adam Lee <ali@pivotal.io>
-
- 18 March 2020, 1 commit
-
-
Committed by Jamie McAtamney

Previously, gpinitsystem incorrectly filled the hostname field of each segment in gp_segment_configuration with the segment's address. This commit changes it to correctly resolve hostnames and update the catalog accordingly.

Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
-
- 14 March 2020, 2 commits
-
-
Committed by Ashuka Xue

Previously, analyzedb would fail with an error if a table was dropped while analyzedb was running. Now, we silently skip dropped tables when determining the tables to analyze.
-
Committed by Adam Berlin

gpinitsystem did not quote the username when performing ALTER USER. When the username is a numeric value, the postgres parser rejects it unless the username is quoted. See https://www.postgresql.org/docs/9.4/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS for more details:
- SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_).
- There is also a second kind of identifier: the delimited identifier or quoted identifier, formed by enclosing an arbitrary sequence of characters in double quotes (").

This commit:
- uses the variable interpolation provided by psql to properly quote user-provided values.
- uses RETVAL to perform testing, due to commit d7b7a40a.

Co-authored-by: Jacob Champion <pchampion@pivotal.io>
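The delimited-identifier rule can be sketched as a small quoting helper (illustrative only; the actual fix uses psql's variable interpolation in the gpinitsystem shell script, not Python):

```python
def quote_ident(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"%s"' % name.replace('"', '""')


# A purely numeric username is invalid as a bare identifier, but is
# accepted once it becomes a delimited identifier.
sql = "ALTER USER %s PASSWORD 'secret'" % quote_ident("12345")
```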
-
- 13 March 2020, 1 commit
-
-
Committed by Chris Hajas

Previously, running analyzedb with an input file (`analyzedb -f <config_file>`) containing a root partition would fail because we did not properly populate the list of leaf partitions. The logic in analyzedb assumes that we enumerate leaf partitions from the root partition the user supplied (either on the command line or in an input file). While we did this properly when the table was passed on the command line, for input files we looked up the bare table name rather than the schema-qualified name. This caused partitioned heap tables to fail when writing the report/status files at the end, and caused analyzedb not to track DML changes in partitioned AO tables. Now, we properly check for the schema-qualified table name.
-
- 10 March 2020, 1 commit
-
-
Committed by Heikki Linnakangas

It was only used for one message in gprecoverseg, and it doesn't seem important. The second argument to the function hasn't done anything since the removal of email and SNMP alerts in commit 65822b80, and the NULL checks for the arguments were pointless because the function was marked strict. But rather than clean those up, let's just remove it altogether.

Reviewed-by: Asim R P <apraveen@pivotal.io>
Reviewed-by: Jimmy Yih <jyih@pivotal.io>
-
- 09 March 2020, 1 commit
-
-
Committed by Heikki Linnakangas

'modcount' is no longer kept up to date in the QD node, so we need to sum it up across all the segments. The analyzedb tests on the concourse master pipeline were failing because modcount always came out as 0.
-