1. 04 Dec 2018, 2 commits
  2. 30 Nov 2018, 1 commit
  3. 29 Nov 2018, 1 commit
    • Compare with None using the is operator · e39047b5
      Daniel Gustafsson committed
      While == None works for comparison, it is a wasteful operation because it
      performs type conversion and expansion. Instead, move to using the
      "is" operator, which is the documented best practice for Python code.
      
      Reviewed-by: Jacob Champion
      e39047b5
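      A minimal, generic illustration of the pattern this commit adopts (not code
      taken from the repository):

        def describe(value):
            # "is" checks identity against the None singleton; it skips the
            # __eq__ machinery that "== None" would invoke.
            if value is None:
                return "missing"
            return str(value)

        print(describe(None))  # missing
        print(describe(42))    # 42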
  4. 25 Oct 2018, 3 commits
    • Unify the way to fetch/manage the number of segments (#6034) · 8eed4217
      Tang Pengzhou committed
      * Don't use GpIdentity.numsegments directly for the number of segments
      
      Use getgpsegmentCount() instead.
      
      * Unify the way to fetch/manage the number of segments
      
      Commit e0b06678 lets us expand a GPDB cluster without a restart, so the
      number of segments may now change during a transaction and we need to
      handle numsegments with care.
      
      We now have two ways to get the number of segments: 1) from GpIdentity.numsegments,
      and 2) from gp_segment_configuration (cdb_component_dbs), which the dispatcher
      uses to decide the range of segments to dispatch to. Commit e0b06678 did a lot
      of hard work to keep GpIdentity.numsegments up to date, which made the
      management of segments more complicated; now we want a simpler way
      to do it:
      
      1. Segment info (including the number of segments) may only be obtained through
      gp_segment_configuration, which always holds the newest segment info. There is
      no need to update GpIdentity.numsegments anymore; GpIdentity.numsegments is
      left only for debugging and can be removed entirely in the future.
      
      2. Each global transaction fetches the newest snapshot of
      gp_segment_configuration and never changes it until the end of the transaction
      unless a writer gang is lost, so a global transaction sees a consistent
      state of the segments. We used gxidDispatched for the same purpose before; now
      it can be removed.
      
      * Remove GpIdentity.numsegments
      
      GpIdentity.numsegments has no effect now, so remove it. This commit
      does not remove gp_num_contents_in_cluster because that requires
      modifying utilities like gpstart, gpstop, gprecoverseg, etc.; let's
      do that cleanup work in another PR.
      
      * Exchange the default UP/DOWN value in fts cache
      
      Previously, the FTS prober read gp_segment_configuration, checked the
      status of the segments, and then recorded that status in the shared
      memory array ftsProbeInfo->fts_status[], so other components
      (mainly the dispatcher) could detect that a segment was down.
      
      All segments were initialized as DOWN and then updated to UP in the
      most common case, which brings two problems:
      
      1. fts_status is invalid until FTS completes its first loop, so the QD
      needs to check ftsProbeInfo->fts_statusVersion > 0.
      2. When gpexpand adds a new segment to gp_segment_configuration, the
      newly added segment may be marked DOWN if FTS has not scanned it
      yet.
      
      This commit changes the default value from DOWN to UP, which resolves
      both problems mentioned above.
      
      * FTS should not be used to notify backends that a gpexpand has occurred
      
      As Ashwin mentioned in PR #5679, "I don't think giving FTS responsibility to
      provide new segment count is right. FTS should only be responsible for HA
      of the segments. The dispatcher should independently figure out the count
      based on catalog. gp_segment_configuration should be the only way to get
      the segment count", so FTS should be decoupled from gpexpand.
      
      * Access gp_segment_configuration inside a transaction
      
      * Upgrade the log level from ERROR to FATAL if the expand version has changed
      
      * Modify gpexpand test cases according to the new design
      8eed4217
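      A hedged sketch of the new rule that gp_segment_configuration is the single
      source of truth for the segment count; this is illustrative PyGreSQL-style
      Python, not the backend C change (which goes through getgpsegmentCount()):

        import pg

        db = pg.DB(dbname="postgres")
        # content >= 0 excludes the master (content = -1); role = 'p' selects
        # primaries, so this count is the number of segments in the cluster.
        numsegments = db.query(
            "SELECT count(*) FROM gp_segment_configuration "
            "WHERE content >= 0 AND role = 'p'").getresult()[0][0]
        print(numsegments)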
    • Fix bad variable name · dcd6e301
      Jamie McAtamney committed
      Authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
      dcd6e301
    • Update gp_distribution_policy in gpexpand · 20d55451
      Jamie McAtamney committed
      After commit 4eb65a53 added the numsegments column to gp_distribution_policy,
      gpexpand would no longer correctly redistribute tables over the new segments
      after an expansion.  ALTER TABLE ... SET WITH(REORGANIZE=true) now relies upon
      the numsegments value, and that value was not being updated by gpexpand.
      
      This commit adds an UPDATE statement before every redistribution statement to
      set numsegments to the proper value for the corresponding table so that the
      data is correctly redistributed over the new segments.
      Authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
      20d55451
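      A hedged sketch of the per-table fix described above; the table name and new
      cluster size are placeholders, and gpexpand generates the real statements itself:

        import pg

        db = pg.DB(dbname="postgres")
        table, new_size = "public.sales", 3  # hypothetical example values
        # Record the new cluster size for this table first ...
        db.query("UPDATE gp_distribution_policy SET numsegments = %d "
                 "WHERE localoid = '%s'::regclass" % (new_size, table))
        # ... then redistribute the data over the new segments.
        db.query("ALTER TABLE %s SET WITH(REORGANIZE=true)" % table)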
  5. 17 Oct 2018, 1 commit
  6. 15 Oct 2018, 1 commit
    • Online expand without restarting gpdb cluster (#5679) · e0b06678
      Ning Yu committed
      * Protect catalog changes on master.
      
      To allow gpexpand to do its job without restarting the cluster we need
      to prevent concurrent catalog changes on the master. A catalog lock is
      provided for this purpose: all inserts/updates/deletes to catalog tables
      must hold this lock in shared mode before making changes; gpexpand
      holds this lock in exclusive mode to 1) wait for in-progress catalog
      updates to commit or roll back and 2) prevent concurrent catalog updates.
      
      Add UDF to hold catalog lock in exclusive mode.
      
      Add test cases for the catalog lock.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * Numsegments management.
      
      GpIdentity.numsegments is a global variable that stores the cluster size in
      each process. It is important to the cluster: a change of the cluster size
      means the expansion has finished and the new segments have taken effect.
      
      FTS counts the number of primary segments from gp_segment_configuration
      and records it in shared memory. Later, other processes, including the master,
      FTS, GDD, and QDs, update GpIdentity.numsegments with this information.
      
      So far it is not easy to make old transactions run with new segments,
      so on the QD GpIdentity.numsegments can only be updated at the beginning
      of a transaction. Old transactions can only see the old segments. The QD
      dispatches GpIdentity.numsegments to the QEs, so they see the same cluster size.
      
      Catalog changes in old transactions are disallowed.
      
      Consider the workflow below:
      
              A: begin;
              B: gpexpand from N segments to M (M > N);
              A: create table t1 (c1 int);
              A: commit;
              C: drop table t1;
      
      Transaction A began when the cluster size was still N, so all its commands
      are dispatched to only N segments, even after the cluster has been expanded
      to M segments. As a result, t1 is created on only N segments; not only the
      data distribution but also the catalog records are affected. This causes the
      later DROP TABLE to fail on the new segments.
      
      To prevent this issue we currently disable all catalog updates in old
      transactions once the expansion is done. New transactions can update the
      catalog as they are already running on all M segments.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * Online gpexpand implementation.
      
      Do not restart the cluster during expansion.
      - Lock the catalog
      - Create the new segments from master segment
      - Add new segments to cluster
      - Reload the cluster so new transactions and background processes
        can see the new segments
      - Unlock the catalog
      
      Add test cases.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * New job to run ICW after expansion.
      
      It is better to run ICW after online expansion to see whether the
      cluster still works well, so we add a new job that first creates a
      cluster with two segments, then expands it to three segments and runs
      all the ICW cases. Restarting the cluster is forbidden because we must
      be sure all the cases pass after an online expansion, so any case
      that restarts the cluster must be removed from this job.
      
      The new job is icw_planner_centos7_online_expand; it is the same as
      icw_planner_centos7 but adds two new parameters, EXCLUDE_TESTS and
      ONLINE_EXPAND. If ONLINE_EXPAND is set, the ICW shell takes a
      different branch: creating a cluster of size two, expanding it, etc.
      EXCLUDE_TESTS lists the cases that restart the cluster, as well as
      cases that would fail once the restarting cases are excluded.
      
      After the whole run, the pid of the master is checked; if it has
      changed, the cluster must have been restarted, so the job fails.
      As a result, any new test case that restarts the cluster should be
      added to EXCLUDE_TESTS.
      
      * Add README.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      
      * Small changes per review comments.
      
      - Delete confusing test case
      - Change function name updateBackendGpIdentityNumsegments to
        updateSystemProcessGpIdentityNumsegments
      - Fix some typos
      
      * Remove useless Assert
      
      These two asserts cause failures in isolation2/uao/
      compaction_utility_insert. That case checks AO tables in utility
      mode, but numsegments is meaningless for a single segment, so the
      asserts must be removed.
      e0b06678
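      A very rough sketch of the flow listed under "Online gpexpand implementation"
      above; every helper below is a hypothetical stand-in for gpexpand internals,
      shown only to make the ordering (lock, create, add, reload, unlock) explicit:

        def acquire_catalog_lock_exclusive():
            print("catalog locked in exclusive mode")

        def create_segments_from_master(hosts):
            print("creating new segments on %s from the master segment" % hosts)
            return hosts

        def add_segments_to_cluster(segments):
            print("registering %s in gp_segment_configuration" % segments)

        def reload_cluster_configuration():
            print("reloading so new transactions and background processes "
                  "see the new segments")

        def release_catalog_lock():
            print("catalog unlocked")

        def online_expand(new_hosts):
            acquire_catalog_lock_exclusive()
            try:
                add_segments_to_cluster(create_segments_from_master(new_hosts))
                reload_cluster_configuration()
            finally:
                release_catalog_lock()

        online_expand(["sdw3", "sdw4"])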
  7. 18 Sep 2018, 1 commit
    • Fix gpexpand: start new mirror segments after expansion. (#5773) · 018525e4
      Jialun committed
      We found that gpexpand on master does not start the new mirror segments after expansion, and the new mirrors are marked as down in gp_segment_configuration. This is because the function sync_new_mirrors was removed in commit 28eec592, and the test cases did not cover the status check. So we re-add this function and add a status check to the test cases.
      018525e4
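      A hedged sketch of the status check the re-added test coverage implies:
      after expansion, every mirror row in gp_segment_configuration should report
      status 'u' (up), not 'd' (down). Connection details are placeholders:

        import pg

        db = pg.DB(dbname="postgres")
        down_mirrors = db.query(
            "SELECT content, hostname FROM gp_segment_configuration "
            "WHERE role = 'm' AND status = 'd'").getresult()
        assert not down_mirrors, "mirrors still down after expansion: %s" % down_mirrors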
  8. 17 Aug 2018, 1 commit
  9. 15 Aug 2018, 1 commit
  10. 29 Mar 2018, 1 commit
    • Support replicated table in GPDB · 7efe3204
      Pengzhou Tang committed
      * Support replicated table in GPDB
      
      Currently, tables in GPDB are distributed across all segments by hash or randomly. There
      is a requirement to introduce a new table type, called a replicated table, in which every
      segment holds a full, duplicate copy of the table data.
      
      To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to mark
      a replicated table, and a new locus type named CdbLocusType_SegmentGeneral to describe
      the distribution of tuples of a replicated table. CdbLocusType_SegmentGeneral implies the
      data is generally available on all segments but not on the qDisp, so a plan node with
      this locus type can be flexibly planned to execute on either a single QE or all QEs. It is
      similar to CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral
      node can't be executed on the qDisp. To guarantee this, we try our best to add a gather motion
      on top of a CdbLocusType_SegmentGeneral node when planning motions for a join, even if the
      other rel has a bottleneck locus type. A problem is that such a motion may be redundant if the
      single QE is ultimately not promoted to execute on the qDisp, so we need to detect that case
      and omit the redundant motion at the end of apply_motion(). We don't reuse
      CdbLocusType_Replicated since it always implies a broadcast motion below it, and it's not easy
      to plan such a node as direct dispatch to avoid getting duplicate data.
      
      We don't support replicated tables with inherit/partition-by clauses yet; the main problem is
      that update/delete on multiple result relations can't work correctly, and we can fix this
      later.
      
      * Allow spi_* to access replicated table on QE
      
      Previously, GPDB didn't allow a QE to access non-catalog tables because their
      data is incomplete; we can remove this limitation now as long as the QE only
      accesses replicated tables.
      
      One problem is that a QE needs to know whether a table is replicated.
      Previously, QEs didn't maintain the gp_distribution_policy catalog, so we
      need to pass the policy info to the QEs for replicated tables.
      
      * Change schema of gp_distribution_policy to identify replicated table
      
      Previously, we used the magic number -128 in the gp_distribution_policy table
      to identify replicated tables, which was quite a hack, so we add a new column
      to gp_distribution_policy to identify replicated tables and partitioned
      tables.
      
      This commit also abandons the old way of using a 1-length NULL list and a
      2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED
      FULLY clauses.
      
      Besides that, this commit refactors the code to make the decision-making around
      distribution policy clearer.
      
      * support COPY for replicated table
      
      * Disable the row ctid unique path for replicated tables.
        Previously, GPDB used a special Unique path on the row id to handle queries
        like "x IN (subquery)". For example, for
        select * from t1 where t1.c2 in (select c2 from t2), the plan looks
        like:
         ->  HashAggregate
               Group By: t1.ctid, t1.gp_segment_id
               ->  Hash Join
                     Hash Cond: t2.c2 = t1.c2
                     ->  Seq Scan on t2
                     ->  Hash
                           ->  Seq Scan on t1
      
        Obviously, the plan is wrong if t1 is a replicated table because ctid
        + gp_segment_id can't identify a tuple; in a replicated table, a logical
        row may have different ctid and gp_segment_id values on different
        segments. So we disable such plans for replicated tables temporarily.
        This is not the best approach, because the rowid unique path may be
        cheaper than a normal hash semi join, so we leave a FIXME for later
        optimization.
      
      * ORCA-related fix
        Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
        Fall back to the legacy query optimizer for queries over replicated tables
      
      * Adapt pg_dump/gpcheckcat to replicated tables
        gp_distribution_policy is no longer a master-only catalog, so do the
        same checks as for other catalogs.
      
      * Support gpexpand on replicated tables && altering the dist policy of replicated tables
      7efe3204
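      A hedged usage sketch for the feature, assuming the DISTRIBUTED REPLICATED
      syntax this work introduces; the table and the final catalog lookup are only
      illustrative, and gp_distribution_policy columns vary across GPDB versions:

        import pg

        db = pg.DB(dbname="postgres")
        # Every segment keeps a full copy of a replicated table's data.
        db.query("CREATE TABLE spare_parts (id int, name text) DISTRIBUTED REPLICATED")
        # The policy row for the table records that it is replicated.
        print(db.query(
            "SELECT * FROM gp_distribution_policy "
            "WHERE localoid = 'spare_parts'::regclass").getresult())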
  11. 07 Feb 2018, 1 commit
    • gpstart: fix OOM issue · a993ef03
      Shoaib Lari committed
      gpstart did a cluster-wide check of heap_checksum settings and refused
      to start the cluster if this setting was inconsistent. This meant a
      round of ssh'ing across the cluster, which was causing OOM errors on
      large clusters.
      
      This commit moves the heap_checksum validation to gpsegstart.py, and
      changes the logic so that only those segments which have the same
      heap_checksum setting as master are started.
      
      Author: Jim Doty <jdoty@pivotal.io>
      Author: Nadeem Ghani <nghani@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
      a993ef03
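      A small sketch of the new start-up filtering logic, reduced to a pure
      function over (segment, heap_checksum) pairs; the real code gathers these
      values per host inside gpsegstart.py:

        def segments_to_start(master_checksum, segment_checksums):
            # Only segments whose heap_checksum setting matches the master's
            # are started; mismatched segments are skipped.
            return [seg for seg, checksum in segment_checksums
                    if checksum == master_checksum]

        print(segments_to_start(True, [("seg0", True), ("seg1", False), ("seg2", True)]))
        # ['seg0', 'seg2']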
  12. 30 Jan 2018, 2 commits
    • Address PR comments: · 73763d6c
      Shoaib Lari committed
      - Change gpdeletesystem to delete tablespaces before the datadir
      - Refactor SegmentStart.noWait to pg_ctrl_wait
      - Create PgBaseBackup class
      - Revert the change to default Mirror status
      - Assorted typos and bugfixes
      
      Author: Shoaib Lari <slari@pivotal.io>
      Author: Jim Doty <jdoty@pivotal.io>
      Author: Nadeem Ghani <nghani@pivotal.io>
      73763d6c
    • gpexpand: Make it work with segwalrep. · 4534d171
      Shoaib Lari committed
      The gpexpand utility and associated code are modified to work with the WALREP
      code. Previously, gpexpand only worked with the primaries and relied on Filerep
      to populate the mirrors. We are changing gpexpand so that it initializes the
      mirrors using pg_basebackup and sets them up for WAL replication.
      
      Also, since the WALREP branch removed filespaces, we have done the same
      and replaced references to filespaces with the data directory of the segments.
      
      Author: Marbin Tan <mtan@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
      Author: Nadeem Ghani <nghani@pivotal.io>
      4534d171
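      A hedged sketch of what "initializes the mirrors using pg_basebackup" means
      in practice; the flags shown are standard pg_basebackup options, while the
      real gpexpand code adds more GPDB-specific handling around the call:

        import subprocess

        def init_mirror(primary_host, primary_port, mirror_datadir):
            subprocess.check_call([
                "pg_basebackup",
                "-h", primary_host,
                "-p", str(primary_port),
                "-D", mirror_datadir,   # the mirror's data directory
                "-X", "stream",         # stream WAL during the base backup
                "-R",                   # write recovery.conf so it starts as a standby
            ])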
  13. 13 Jan 2018, 9 commits
  14. 09 Jan 2018, 2 commits
    • gppylib: refactor SegmentPair to not support multiple mirrors · a19f7327
      Shoaib Lari committed
      Long ago, we thought we might need to support multiple mirrors. But we
      don't, and don't foresee that coming soon. Simplify the code to only ever
      have one mirror, but still allow for the possibility of no mirrors.
      
      Author: Shoaib Lari <slari@pivotal.io>
      Author: C.J. Jameson <cjameson@pivotal.io>
      a19f7327
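      A tiny sketch of the simplified shape: a pair is one primary plus at most
      one mirror (illustrative only, not the actual gppylib class):

        class SegmentPair(object):
            # One primary plus at most one mirror; mirror is None when the
            # cluster is configured without mirrors.
            def __init__(self, primary, mirror=None):
                self.primary = primary
                self.mirror = mirror

            def has_mirror(self):
                return self.mirror is not None

        pair = SegmentPair("primary-seg0")
        print(pair.has_mirror())  # False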
    • gppylib: Rename gpArray variables and classes · bbc47080
      Marbin Tan committed
      The gpArray use of the GpDB and Segment classes was confusing. This change renames
      GpDB to Segment and Segment to SegmentPair to clarify usage. It's a big diff, but
      a simple, repetitive change.
      
      Author: Shoaib Lari <slari@pivotal.io>
      Author: Marbin Tan <mtan@pivotal.io>
      Author: C.J. Jameson <cjameson@pivotal.io>
      bbc47080
  15. 06 Dec 2017, 1 commit
  16. 14 Nov 2017, 1 commit
  17. 01 Sep 2017, 1 commit
    • unix.RemoveDirectories: optimize and isolate cases · 872c02b3
      Shoaib Lari committed
      Previously, RemoveDirectories and RemoveFiles used the unix command
      "rm -rf", but this is inefficient for huge numbers of files.
      Also, these functions accepted any globbed path.
      
      Instead, use "rsync" to optimize deletion of files in a directory.
      On a DCA using 1 million files, this increased speed by about 3x.
      
      Also, this commit breaks up the different use cases of deletion into
      separate methods, adding RemoveDirectoryContents(), RemoveFile(),
      and RemoveGlob() to help isolate the assumptions of each case and
      optimize for them.
      Signed-off-by: Larry Hamel <lhamel@pivotal.io>
      Signed-off-by: Shoaib Lari <slari@pivotal.io>
      872c02b3
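      A hedged sketch of the rsync trick behind the speed-up: syncing an empty
      directory over the target deletes its contents faster than "rm -rf" when
      there are huge numbers of files (the function name here is illustrative):

        import subprocess
        import tempfile

        def remove_directory_contents(path):
            # rsync deletes everything in `path` that is absent from the
            # (empty) source directory.
            empty_dir = tempfile.mkdtemp()
            subprocess.check_call(
                ["rsync", "-a", "--delete", empty_dir + "/", path + "/"])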
  18. 30 Aug 2017, 4 commits
  19. 02 Jun 2017, 1 commit
  20. 04 May 2017, 3 commits
  21. 08 Feb 2017, 2 commits