1. 24 Jun, 2020 1 commit
    • Remove lockfile from mainUtils · 8190ed40
      Tyler Ramer committed
      [Lockfile](https://pypi.org/project/lockfile/) has not been maintained
      since around 2015. Further, the functionality it provided seems poor - a
      review of the code indicated that it used the presence of the PID file
      itself as the lock - in Unix, using a file's existence followed by a
      creation is not atomic, so a lock could be prone to race conditions.
      
      The lockfile package also did not clean up after itself - a process
      which was destroyed unexpectedly would not clear the created locks, so
      some faulty logic was added to mainUtils.py, which checked to see if a
      process with the same PID as the lockfile's creator was running. This
      is obviously failure prone, as a new process might be assigned the same
      PID as the old lockfile's owner, without actually being the same process.
      
      (Of note, the SIG_DFL argument to os.kill() is not a signal at all, but
      rather of type signal.handler. It appears that Python casts this
      handler to the int 0, which, according to man 2 kill, leads to no signal
      being sent, but existence and permission checks are still performed. So
      it is a happy accident that this code worked at all.)
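
The probe described in the parenthetical above can be sketched as follows. This is a minimal illustration, not the code that was removed from mainUtils.py; the helper name pid_is_running is hypothetical. It only tells you that some process currently owns the PID, which is exactly why the commit message calls the old check failure prone.

```python
import os
import signal
import subprocess
import sys


def pid_is_running(pid: int) -> bool:
    """Probe a PID with signal 0: nothing is delivered, only checks are run."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process has this PID
    except PermissionError:
        return True   # a process exists but belongs to another user
    return True


# SIG_DFL is a handler constant, not a signal; its integer value is 0.
sig_dfl_as_int = int(signal.SIG_DFL)

# A child that has exited and been reaped no longer answers the probe.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
child_alive_after_exit = pid_is_running(child.pid)
```

A True result says nothing about whether the process is the one that created the lock, since the kernel may hand the same PID to an unrelated process later.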
      
      This commit removes lockfile from the codebase entirely.
      
      It also adds a "PIDLockFile" class which provides a lock that is
      guaranteed to be atomic via the mkdir and rmdir commands on Unix - thus,
      it is not safely portable to Windows, but this should not be an issue as
      only Unix-based utilities use the "simple_main()" function.
      
      PIDLockFile provides API-compatible classes to replace most of the
      functionality from lockfile.PidLockFile, but removes the timeout
      logic, as it was not used in any meaningful sense - a hard-coded timeout
      of 1 second was used, but an immediate answer as to whether the lock is
      held is sufficient.
      
      PIDLockFile also includes appropriate __enter__, __exit__, and __del__
      methods, so that, should we extend this class in the future, the "with"
      syntax works, and __del__ calls release, so a process reaped
      unexpectedly should still clean up its own locks as part of the garbage
      collection process.
      Authored-by: Tyler Ramer <tramer@pivotal.io>
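
A directory-based lock of the kind this commit describes might look roughly like the sketch below. It is not the PIDLockFile class from the commit: the class name DirLock, the "PID" file name, and the error type are assumptions made for illustration. The point it shows is that os.mkdir either creates the directory or fails, in one step, so there is no gap between checking and creating.

```python
import os
import tempfile


class DirLock:
    """Hypothetical mkdir-based lock; not the PIDLockFile from the commit."""

    def __init__(self, path):
        self.path = path
        self.pid_file = os.path.join(path, "PID")  # assumed file name
        self.held = False

    def acquire(self):
        """Return True if the lock was taken, False if someone holds it."""
        try:
            os.mkdir(self.path)  # atomic: fails if the directory exists
        except FileExistsError:
            return False
        with open(self.pid_file, "w") as f:
            f.write(str(os.getpid()))
        self.held = True
        return True

    def release(self):
        """Remove the lock directory, but only if this object created it."""
        if self.held:
            os.remove(self.pid_file)
            os.rmdir(self.path)
            self.held = False

    def __enter__(self):
        if not self.acquire():
            raise RuntimeError("lock already held: " + self.path)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()


lock_dir = os.path.join(tempfile.mkdtemp(), "demo.lock")
with DirLock(lock_dir):
    # A second attempt returns immediately instead of waiting on a timeout.
    second_got_lock = DirLock(lock_dir).acquire()
released = not os.path.exists(lock_dir)
```

The immediate True/False answer from acquire() mirrors the commit's decision to drop timeout logic.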
  2. 17 Jun, 2020 3 commits
    • Close short lived connections · bc35b6b2
      Tyler Ramer committed
      Due to the refactor of dbconn and newer versions of pygresql, using
      `with dbconn.connect() as conn` no longer attempts to close the
      connection, even if it did previously. Instead, this syntax uses the
      connection itself as the context and, as noted in execSQL, overrides the
      autocommit functionality of execSQL.
      
      Therefore, close the connection manually to ensure that execSQL is
      auto-committed and the connection is closed.
      Co-authored-by: Tyler Ramer <tramer@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
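
The distinction this commit relies on, a connection used as a transaction context versus a connection that is actually closed, can be illustrated with an analogy. The sketch below uses Python's standard sqlite3 module, not pygresql or dbconn, because sqlite3 connections happen to behave the same way in this respect: the "with" block manages the transaction and leaves the connection open.

```python
import sqlite3
from contextlib import closing

conn = sqlite3.connect(":memory:")
with conn:  # transaction scope only: commits on success, does not close
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")

# The connection is still usable after the block.
still_open = conn.execute("SELECT count(*) FROM t").fetchone()[0] == 1
conn.close()  # closing has to be done explicitly

# contextlib.closing() is one way to guarantee close() at block exit.
with closing(sqlite3.connect(":memory:")) as conn2:
    conn2.execute("CREATE TABLE t (x INTEGER)")

try:
    conn2.execute("SELECT 1")
    closed_after_block = False
except sqlite3.ProgrammingError:
    closed_after_block = True
```

Whether dbconn callers use an explicit close() or a wrapper is a design choice; the commit message only states that the connection is closed manually.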
    • Refactor dbconn · 330db230
      Tyler Ramer committed
      One reason pygresql was previously modified was that it did not handle
      closing a connection very gracefully. In the process of updating
      pygresql, we've wrapped the connection it provides with a
      ClosingConnection function, which should handle gracefully closing the
      connection when the "with dbconn.connect as conn" syntax is used.
      
      This did, however, illustrate issues where a cursor might have been
      created as the result of a dbconn.execSQL() call, which seems to hold
      the connection open if not specifically closed.
      
      It is therefore necessary to remove the ability to get a cursor from
      dbconn.execSQL(). To highlight this difference, and to ensure that
      this library is easy to use in the future, I've cleaned up and
      clarified the dbconn execution code to include the following features.
      
      - dbconn.execSQL() closes the cursor as part of the function. It returns
        no rows
      - function dbconn.query() is added, which behaves like dbconn.execSQL()
        except that it returns a cursor
      - function dbconn.execQueryforSingleton() is renamed
        dbconn.querySingleton()
      - function dbconn.execQueryforSingletonRow() is renamed
        dbconn.queryRow()
      Authored-by: Tyler Ramer <tramer@pivotal.io>
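
The split between "execute and discard" and "query and return a cursor" listed above can be sketched as below. These are not the dbconn functions themselves: the snake_case names, the parameter lists, the error types, and the use of the standard sqlite3 module in place of pygresql are all assumptions made so the example runs on its own.

```python
import sqlite3


def exec_sql(conn, sql, params=()):
    """Run a statement, always close the cursor, and return no rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql, params)
    finally:
        cursor.close()


def query(conn, sql, params=()):
    """Run a query and hand the open cursor to the caller, who must close it."""
    cursor = conn.cursor()
    cursor.execute(sql, params)
    return cursor


def query_row(conn, sql, params=()):
    """Return the single row of a query; raise if there is not exactly one."""
    cursor = query(conn, sql, params)
    try:
        rows = cursor.fetchall()
    finally:
        cursor.close()
    if len(rows) != 1:
        raise ValueError("expected exactly 1 row, got %d" % len(rows))
    return rows[0]


def query_singleton(conn, sql, params=()):
    """Return the single value of a one-row, one-column query."""
    row = query_row(conn, sql, params)
    if len(row) != 1:
        raise ValueError("expected exactly 1 column, got %d" % len(row))
    return row[0]


conn = sqlite3.connect(":memory:")
exec_sql(conn, "CREATE TABLE seg (dbid INTEGER, port INTEGER)")
exec_sql(conn, "INSERT INTO seg VALUES (?, ?)", (1, 5432))
port = query_singleton(conn, "SELECT port FROM seg WHERE dbid = ?", (1,))
row = query_row(conn, "SELECT dbid, port FROM seg")
conn.close()
```

Keeping the cursor-returning path in a separately named function makes it visible at the call site which callers are responsible for closing a cursor.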
    • Update PyGreSQL from 4.0.0 to 5.1.2 · f5758021
      Tyler Ramer committed
      This commit updates pygresql from 4.0.0 to 5.1.2, which requires
      numerous changes to take advantage of the major result syntax change
      that pygresql5 implemented. Of note, cursors or query objects
      automatically cast returned values to appropriate python types - a list
      of ints, for example, instead of a string like "{1,2}". This is the bulk
      of the changes.
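
To make the type change above concrete, the sketch below shows the kind of string parsing a caller needs when an integer array arrives as "{1,2}", and a shim that accepts either form. Both helper names are hypothetical and the parser only handles flat integer arrays; it is not code from this commit.

```python
def parse_int_array(literal):
    """Parse a flat PostgreSQL integer array literal such as "{1,2}"."""
    body = literal.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("not an array literal: %r" % literal)
    inner = body[1:-1].strip()
    if not inner:
        return []
    return [int(item) for item in inner.split(",")]


def as_int_list(value):
    """Accept either representation: the old string or the new list."""
    if isinstance(value, str):
        return parse_int_array(value)
    return list(value)


old_style = as_int_list("{1,2}")  # value as a string
new_style = as_int_list([1, 2])   # value already cast to a list
```

Once every caller receives real lists, helpers like these can simply be deleted, which is consistent with the commit describing such changes as the bulk of the work.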
      
      Updating to pygresql 5.1.2 provides numerous benefits, including the
      following:
      
      - CVE-2018-1058 was addressed in pygresql 5.1.1
      
      - We can save notices in the pgdb module, rather than relying on importing
        the pg module, thanks to the new "set_notices()"
      
      - pygresql 5 supports python3
      
      - Thanks to a change in the cursor, using a "with" syntax guarantees a
        "commit" on the close of the with block.
      
      This commit is a starting point for additional changes, including
      refactoring the dbconn module.
      
      Additionally, since isolation2 uses pygresql, some pl/python scripts
      were updated, and isolation2 SQL output is further decoupled from
      pygresql. The output of a psql command should be similar enough to
      isolation2's pg output that minimal or no modification is needed to
      ensure gpdiff can recognize the output.
      Co-authored-by: Tyler Ramer <tramer@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
  3. 10 Mar, 2020 1 commit
    • Remove gp_elog() function. · 4f0a5ced
      Heikki Linnakangas committed
      It was only used for one message in gprecoverseg, and it doesn't seem
      important.
      
      The second argument to the function didn't do anything, since the removal
      of email and SNMP alerts in commit 65822b80. And the NULL checks for the
      arguments were pointless, because the function was marked as strict. But
      rather than clean those up, let's just remove it altogether.
      Reviewed-by: Asim R P <apraveen@pivotal.io>
      Reviewed-by: Jimmy Yih <jyih@pivotal.io>
  4. 30 May, 2019 1 commit
    • parseutils: cluster management input file refactoring · 66448a3c
      David Krieger committed
      We simplify and refactor the parsing for gpaddmirrors, gpmovemirrors,
      gprecoverseg and gpexpand.  This eliminates a few hundred lines of code.
      In addition, this commit changes the format of the input lines for the
      input files to these routines.  The separator is now '|' instead of ':'.
      
      <Co-Authored-By> Mark Sliva <msliva@pivotal.io>
      <Co-Authored-By> Jamie McAtamney <jmcatamney@pivotal.io>
  5. 06 Apr, 2019 3 commits
    • gprecoverseg: attempt a workaround for FTS probe races · 18d02286
      Jacob Champion committed
      replication_slots tests are hitting frequent intermittent failures
      during gprecoverseg operation. Some failures are because mirrors aren't
      being marked 'up' after gprecoverseg exits cleanly, and we already know
      that FTS probes have a race that occasionally causes them to do
      nothing on the first call.
      
      As a stop-gap before the probe race is fixed, try a double-call to
      gp_request_fts_probe_scan() to see if that helps the situation any.
      
      Originally introduced in commit 02cccd85,
      which was reverted due to unexpected unit test failures. Those are fixed
      in this patch.
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
    • Revert "gprecoverseg: attempt a workaround for FTS probe races" · 443895e0
      Jacob Champion committed
      Unexpected unit test failures; reverting for now.
      
      This reverts commit 02cccd85.
    • gprecoverseg: attempt a workaround for FTS probe races · 02cccd85
      Jacob Champion committed
      replication_slots tests are hitting frequent intermittent failures
      during gprecoverseg operation. Some failures are because mirrors aren't
      being marked 'up' after gprecoverseg exits cleanly, and we already know
      that FTS probes have a race that occasionally causes them to do
      nothing on the first call.
      
      As a stop-gap before the probe race is fixed, try a double-call to
      gp_request_fts_probe_scan() to see if that helps the situation any.
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
  6. 29 Mar, 2019 1 commit
  7. 22 Mar, 2019 2 commits
    • gprecoverseg: Add --no-progress flag. · eb064718
      Shoaib Lari committed
      For some areas of the ICW test framework -- isolation2 in particular --
      the additional data written to stdout by gprecoverseg's progress
      increased the load on the system significantly. (Some tests are
      buffering stdout without bound, for instance.)  Additionally, the
      updates were coming at ten times a second, which is an order of
      magnitude more than the update interval we get from pg_basebackup
      itself.
      
      To help with this, we have added a --no-progress flag that
      suppresses the output of pg_basebackup.  We have also changed the
      pg_basebackup progress update rate to once per second to minimize I/O.
      
      The impacted regression/isolation2 tests utilizing gprecoverseg have
      also been modified to use the --no-progress flag.
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
      Co-authored-by: Jacob Champion <pchampion@pivotal.io>
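
The effect of dropping from ten updates a second to one can be sketched with a small rate limiter. This is an illustration only: the ProgressThrottle class is hypothetical and is not how gprecoverseg implements its update interval. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class ProgressThrottle:
    """Hypothetical helper: allow at most one update per `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_emit = None

    def should_emit(self):
        """Return True when enough time has passed since the last update."""
        now = self.clock()
        if self.last_emit is None or now - self.last_emit >= self.interval:
            self.last_emit = now
            return True
        return False


# Simulate updates arriving ten times a second for three seconds.
ticks = [i / 10 for i in range(30)]
fake_time = iter(ticks)
throttle = ProgressThrottle(interval=1.0, clock=lambda: next(fake_time))
emitted = sum(1 for _ in ticks if throttle.should_emit())
```

With this input, thirty incoming updates turn into three written ones, which is the order-of-magnitude reduction the commit message refers to.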
    • Add gprecoverseg -s to show progress sequentially · f04c206a
      Kalen Krempely committed
      When -s is present, show pg_basebackup progress sequentially instead
      of in place. Useful when writing to a file, or if a tty does not support
      escape sequences. Defaults to showing the progress in place.
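
The difference between in-place and sequential progress can be sketched as follows. The function name and the exact control sequences (a carriage return plus the ANSI erase-to-end-of-line sequence) are assumptions for illustration; the commit message does not say which escape sequences gprecoverseg emits.

```python
import io


def write_progress(stream, message, sequential=False):
    """Write one progress update on its own line or over the previous one."""
    if sequential:
        stream.write(message + "\n")
    else:
        # "\r" returns to column 0; "\x1b[K" erases the rest of the line.
        stream.write("\r" + message + "\x1b[K")
    stream.flush()


# Sequential mode: every update survives, which suits log files.
log = io.StringIO()
for pct in (10, 50, 100):
    write_progress(log, "%d%% done" % pct, sequential=True)
sequential_output = log.getvalue()

# In-place mode: a terminal would show only the latest update.
tty = io.StringIO()
for pct in (10, 50, 100):
    write_progress(tty, "%d%% done" % pct, sequential=False)
inplace_output = tty.getvalue()
```

Written to a file, the in-place form leaves raw control characters in the output, which is the situation the -s flag is meant to avoid.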
  8. 18 Mar, 2019 1 commit
    • Gpexpand minor fix (#7159) · cea594f0
      Jialun committed
      - get port from MASTER_DATA_DIRECTORY, so there is no confusion if
        PGPORT and MASTER_DATA_DIRECTORY are set to different clusters
      - delete tmp status file 'gpexpand.standby.status' and copy the
        status file to standby directly
      - get standby data directory from catalog instead of assuming it's
        the same as the master's
      - copy gp_segment_configuration backup file to standby also, so
        standby can restore this catalog if master is down
  9. 09 Mar, 2019 2 commits
    • Revert recent changes to gpinitstandby and gprecoverseg · 659f0ee5
      Jacob Champion committed
      One of these changes appears to have possibly introduced a serious
      performance regression in the master pipeline. To avoid destabilizing
      work over the weekend, I'm reverting for now and we can investigate more
      fully next week.
      
      This reverts the following commits:
      "gprecoverseg: Show progress of pg_basebackup on each segment"
          1b38c6e8
      "Add gprecoverseg -s to show progress sequentially"
          9e89b5ad
      "gpinitstandby: guide the user on single-host systems"
          c9c3c351
      "gpinitstandby: rename -F to -S and document it"
          ba3eb5b4
    • Add gprecoverseg -s to show progress sequentially · 9e89b5ad
      Kalen Krempely committed
      When -s is present, show pg_basebackup progress sequentially instead
      of in place. Useful when writing to a file, or if a tty does not support
      escape sequences. Defaults to showing the progress in place.
  10. 02 Oct, 2018 1 commit
    • Remove gppylib/testold · 83c1d270
      Daniel Gustafsson committed
      The tests, and the test harness code, have seemingly been unused for
      quite some time. Remove all the dead code and the invocations of
      testUtils functions, which were all no-ops as the global variable
      written to hadn't been initialized.
  11. 22 Jun, 2018 1 commit
    • Make gprecoverseg use new pg_backup flag --force-overwrite · e029720d
      Jimmy Yih committed
      This is needed during gprecoverseg full to preserve important files
      such as pg_log files. We pass this flag down the call stack to prevent
      other utilities such as gpinitstandby or gpaddmirrors from using the
      new flag. The new flag can be dangerous if not used properly and
      should only be used when data directory file preservation is
      necessary.
  12. 16 Feb, 2018 1 commit
  13. 01 Feb, 2018 2 commits
    • Make gprecoverseg incremental work · 6922219f
      Jimmy Yih committed
      The gprecoverseg tool has been broken since filerep and persistent
      tables were removed. This commit cleans it up a little bit and makes
      simple incremental mirror recovery work. This does not mean that
      incremental recovery with pg_rewind is implemented yet, though.
    • Remove primary-mirror related tests. · 825f5bdc
      Xin Zhang committed
      Author: Xin Zhang <xzhang@pivotal.io>
      Author: Asim R P <apraveen@pivotal.io>
  14. 13 Jan, 2018 4 commits
  15. 06 Dec, 2017 1 commit
  16. 15 Sep, 2017 1 commit
    • Remove gp_fault_strategy catalog table and corresponding code. · f5b5c218
      Ashwin Agrawal committed
      The gp_segment_configuration catalog table can easily tell whether mirrors
      exist, so no special table is needed to communicate the same. Earlier,
      gp_fault_strategy conveyed 'n' for a mirrorless system, 'f' for replication
      and 's' for SAN mirrors. Since support for 's' was removed in 5.0, the only
      purpose gp_fault_strategy served was to say whether the system was mirrored.
      Hence, delete the gp_fault_strategy table and use gp_segment_configuration
      at the required places to find the required info.
  17. 08 Sep, 2017 1 commit
    • Fix gprecoverseg recursive behavior (#3165) · 6af7082e
      Shoaib Lari committed
      * whitespace reformat
      Signed-off-by: Shoaib Lari <slari@pivotal.io>
      
      * gprecoverseg unit test: remove redundancy
      Signed-off-by: Shoaib Lari <slari@pivotal.io>
      
      * Fix gprecoverseg recursive behavior
      
      gprecoverseg called itself during a rebalance, through a Command object.
      But this command didn't signal failures through a non-zero ret-code. So
      the top-level gprecoverseg didn't check its stdout/stderr for error
      messages and didn't echo them to its own stdout, though they were being
      logged.
      
      This commit changes the behavior so that gprecoverseg makes another
      object and invokes its run() method. Any errors are now shown to the
      user and logged.
      Signed-off-by: Marbin Tan <mtan@pivotal.io>
      Signed-off-by: Nadeem Ghani <nghani@pivotal.io>
  18. 26 Aug, 2017 2 commits
    • gprecoverseg: validate checksum setting · e0e331f2
      Larry Hamel committed
      As part of the validation phase of gprecoverseg, before proceeding,
      validate that the setting for GUC data_checksums is the same between
      master and segments. The validation is done by comparing pg_control file content.
      Fail fast if the settings are not the same.
      
      If no segments are able to report their settings, then gprecoverseg
      fails. (This failure to report would be unexpected since there is already a check for at least
      one segment alive to progress to the validation phase.)
      Signed-off-by: Marbin Tan <mtan@pivotal.io>
      Signed-off-by: Nadeem Ghani <nghani@pivotal.io>
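
The fail-fast comparison described in this commit can be sketched as below. This is not the gprecoverseg implementation: the function name, the dictionary input, the use of None for a segment that could not report, and the exception types are all assumptions, and the step that reads the value from pg_control is left out.

```python
def validate_checksums(master_value, segment_values):
    """Compare each segment's data_checksums value with the master's.

    segment_values maps a segment name to its reported value, or to None
    when the segment could not report. Returns the consistent segments.
    """
    reported = {seg: v for seg, v in segment_values.items() if v is not None}
    if not reported:
        raise RuntimeError("no segment reported its data_checksums setting")
    inconsistent = sorted(s for s, v in reported.items() if v != master_value)
    if inconsistent:
        raise ValueError(
            "data_checksums differs from master on: " + ", ".join(inconsistent)
        )
    return sorted(reported)


# One segment matches the master, one could not report: validation passes.
ok_segments = validate_checksums(1, {"seg0": 1, "seg1": None})
```

Raising before any recovery work starts keeps a mismatch from surfacing partway through the operation.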
    • Whitespace formatting · c325604e
      Larry Hamel committed
  19. 02 May, 2017 1 commit
    • Remove dead code from catalog.py · 2c8a44fc
      Daniel Gustafsson committed
      While looking at other things, spotted that there were quite a few
      functions in catalog.py that were unused. Remove these, and also
      remove imports of gppylib.db.catalog when unused.
  20. 27 Apr, 2017 1 commit
    • Retire SAN failover strategy · a8f956c6
      Daniel Gustafsson committed
      The SAN FTS failover strategy is deprecated and no longer maintained,
      as it's no longer needed. Retire the dead code to clean up. Since we
      currently are in catalog freeze, the gp_san_configuration catalog is
      left in place but will be removed once we open up for the next major
      version cycle. Since the gp_san_config header file contains catalogs
      for general FTS fault strategies, rename the file for clarity.
  21. 13 Dec, 2016 1 commit
  22. 10 Dec, 2016 3 commits
  23. 20 Oct, 2016 1 commit
  24. 14 Oct, 2016 3 commits
  25. 11 Oct, 2016 1 commit
    • Ignore stdout that precede desired content. (#1176) · ae90706b
      Chris Hajas committed
      When using a docker image, we noticed that gprecoverseg will fail due to
      parsing stdout that has an SSH warning line such as "Warning: Permanently
      added [host] to the list ...". We now ignore these lines until the
      desired output begins.
      
      Authors: Larry Hamel and Chris Hajas
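
Skipping output until the expected content begins can be sketched as follows. The helper name, the BEGIN_RESULTS marker, and the sample host name are hypothetical; the commit message does not say how gprecoverseg recognises the start of the desired output.

```python
def strip_leading_noise(output, marker):
    """Drop every line before the first one that starts with `marker`."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.startswith(marker):
            return lines[index:]
    raise ValueError("marker %r not found in output" % marker)


raw = (
    "Warning: Permanently added 'sdw1' (ECDSA) to the list of known hosts.\n"
    "BEGIN_RESULTS\n"
    "segment 0 ok\n"
)
cleaned = strip_leading_noise(raw, "BEGIN_RESULTS")
```

Anchoring on a known start marker is more robust than filtering specific warning strings, since it also tolerates warnings nobody has seen yet.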