1. 23 Jan 2020 (14 commits)
    • Revert "basebackup: increase max count of --exclude args" · 7cd757d6
      Ning Yu committed
      This reverts commit 34803183.
    • Revert "gpexpand: correct scenario names and indents" · 24680df7
      Ning Yu committed
      This reverts commit cff087dc.
    • Revert "gpexpand: exclude master-only tables from the template" · 0348f712
      Ning Yu committed
      This reverts commit 62d53c21.
    • Revert "gppylib: remove duplicated entries in MASTER_ONLY_TABLES" · 844d4bd8
      Ning Yu committed
      This reverts commit e046859f.
    • GUC should be synchronized after change/restore. · c6079933
      Hubert Zhang committed
      A function declared with `SET search_path = t1` changes the
      search_path inside the function and restores it when the
      function finishes.

      After search_path is restored on the QD, we should also sync
      it to all the cached QEs.
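      A minimal sketch of the behavior being fixed; the schema and function
      names here are hypothetical:

      ```sql
      CREATE SCHEMA t1;
      CREATE FUNCTION f() RETURNS int AS $$
        SELECT 1;
      $$ LANGUAGE sql SET search_path = t1;

      SHOW search_path;  -- e.g. "$user",public
      SELECT f();        -- search_path is t1 inside f(), restored on return
      SHOW search_path;  -- restored on the QD; cached QEs must see it too
      ```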
      Reviewed-by: Weinan WANG <wewang@pivotal.io>
    • gppylib: remove duplicated entries in MASTER_ONLY_TABLES · e046859f
      Ning Yu committed
      Removed the duplicated 'gp_segment_configuration' entry in the
      MASTER_ONLY_TABLES list.  Also sorted the list in alphabetical order to
      prevent duplicates in the future.
    • gpexpand: exclude master-only tables from the template · 62d53c21
      Ning Yu committed
      Gpexpand creates new primary segments by first creating a template from
      the master datadir and then copying it to the new segments.  Some
      catalog tables, such as gp_segment_configuration, are only meaningful on
      the master; their contents are cleared on each new segment with
      "delete from ..." commands.
      
      This works but is slow: we have to include the content of the
      master-only tables in the archive, distribute them via network, and
      clear them via the slow "delete from ..." commands.  The "truncate"
      command is fast, but it is disallowed on catalog tables because their
      filenode must not change.
      
      To make it faster we now exclude these tables from the template
      directly, so less data is transferred and there is no need to "delete
      from" them explicitly.
    • gpexpand: correct scenario names and indents · cff087dc
      Ning Yu committed
      In the gpexpand behave tests we used to have the same name for multiple
      scenarios; now we give them distinct, descriptive names.
      
      Also correct some bad indents.
    • basebackup: increase max count of --exclude args · 34803183
      Ning Yu committed
      The --exclude option is a GPDB-specific option for pg_basebackup.  It
      specifies a path to exclude from the backup archive and can be provided
      multiple times to exclude multiple paths.
      
      We used to allow at most 255 excludes, which worked well in the past.
      Now, however, we plan to use the option to exclude the master-only
      catalog tables from the segment template of gpexpand, and we may exceed
      this limit easily: many catalog tables are per-database, so when there
      are enough databases we can have thousands of paths or more to exclude.
      
      Increase the limit to 65535, which should be enough in practice.  In the
      future we may want to remove the limit entirely, but for now we stick
      with a hard-coded value.
    • Bugfix: rows might be split into wrong partitions · 101922f1
      ggbq committed
      split_rows() scans tuples from T and routes them to the new parts (A, B)
      based on A's or B's constraints.  If T has one or more dropped columns
      before its partition key, T's partition key has a different attribute
      number from its new parts.  In this case the constraints check the wrong
      column, which can cause bad behavior.
      
      To fix it, each tuple iteration should reconstruct the partition tuple
      slot and assign it to econtext before the ExecQual calls.  The
      reconstruction can happen once or twice, because A and B may have two
      different tupdescs.
      
      One such bad behavior is that rows are split into the wrong partitions.
      To reproduce:
      
      ```sql
      DROP TABLE IF EXISTS users_test;
      
      CREATE TABLE users_test
      (
        id          INT,
        dd          TEXT,
        user_name   VARCHAR(40),
        user_email  VARCHAR(60),
        born_time   TIMESTAMP,
        create_time TIMESTAMP
      )
      DISTRIBUTED BY (id)
      PARTITION BY RANGE (create_time)
      (
        PARTITION p2019 START ('2019-01-01'::TIMESTAMP) END ('2020-01-01'::TIMESTAMP),
        DEFAULT PARTITION extra
      );
      
      /* Drop useless column dd for some reason */
      ALTER TABLE users_test DROP COLUMN dd;
      
      /* Forgot/Failed to split out new partitions beforehand */
      INSERT INTO users_test VALUES(1, 'A', 'A@abc.com', '1970-01-01', '2020-01-01 12:00:00');
      INSERT INTO users_test VALUES(2, 'B', 'B@abc.com', '1980-01-01', '2020-01-02 18:00:00');
      INSERT INTO users_test VALUES(3, 'C', 'C@abc.com', '1990-01-01', '2020-01-03 08:00:00');
      
      /* New partition arrives late */
      ALTER TABLE users_test SPLIT DEFAULT PARTITION START ('2020-01-01'::TIMESTAMP) END ('2021-01-01'::TIMESTAMP)
       INTO (PARTITION p2020, DEFAULT PARTITION);
      
      /*
       * - How many new users already in 2020?
       * - Wow, no one.
       */
      SELECT count(1) FROM users_test_1_prt_p2020;
      ```
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • cross-subnet: make gpexpand support cross-subnet expansion · cdd1e934
      Mark Sliva committed
      We update the pg_hba.conf file with replication entries for each
      hostname/address to enable cross-subnet cluster expansion. There are no tests
      for this change, but they can be added at a later time.
      Co-authored-by: Jacob Champion <pchampion@pivotal.io>
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
      Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
      Co-authored-by: David Krieger <dkrieger@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
    • cross-subnet: add its pipeline job · b74a20a9
      Mark Sliva committed
      We add a cli_cross_subnet job that creates a cross_subnet cluster, and then
      runs the cross_subnet behave tests. It tests that replication works for each of
      the affected cross-subnet utilities.
      
      We provision two CCP clusters in two different subnets, and the gpinitsystem task
      creates a cluster in which every segment pair (including master/standby)
      replicates across subnets.
      Co-authored-by: Jacob Champion <pchampion@pivotal.io>
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
      Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
      Co-authored-by: David Krieger <dkrieger@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
    • cross-subnet: fix replication on cross-subnet Greenplum Clusters · 79637980
      Mark Sliva committed
      The four CM utilities gpinitsystem, gpinitstandby, gpaddmirrors, and
      gpmovemirrors now have the relevant pg_hba.conf entries to allow WAL
      replication to mirrors from their respective primaries across subnets.
      
      There are two parts to this commit:
      1) modify the CM utilities to add the pg_hba.conf entries to
      allow WAL replication to mirrors across a subnet, and
      2) test the relevant CM utilities across subnets.
      
      The previous pg_hba.conf replication entry:
          'host replication $USER samenet trust'
      does not allow WAL replication connections across subnets. We keep this
      entry in order to support single-host development. To allow cross-subnet
      replication, we add to each new primary and mirror one replication line
      per primary and mirror interface address. It looks like:
          'host replication $USER $IP_ADDRESS trust'
      or, when HBA_HOSTNAMES=1:
          'host replication $USER $HOSTNAME trust'
      Further, if there is ever a failover and subsequent promotion,
      replication connections can be made to the newly promoted primary from
      the host on which the previous primary failed, because those addresses
      are copied over to the new mirror during a pg_basebackup. We also add
      similar logic to support cross-subnet replication between the master and
      standby. This behavior is tested in the cross_subnet behave tests.
      
      The cross_subnet behave tests assert that the replication connection is valid
      by manually making the connection in addition to relying on segments being
      synchronized, as a way to ensure that the pg_hba.conf file is being used.
      Co-authored-by: Jacob Champion <pchampion@pivotal.io>
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
      Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
      Co-authored-by: David Krieger <dkrieger@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
    • cross-subnet: add ifaddrs utility · 15a30510
      Mark Sliva committed
      We add a new internal utility, ifaddrs, that prints all of the host's
      interface addresses separated by newlines; it is used to discover the
      interface addresses used for replication. As an internal utility it is
      installed into $GPHOME/libexec. There is no Python 2 library that
      provides this functionality, so we add it ourselves.
      
      Also add a configure dependency on getifaddrs and inet_ntop, which are now
      required to build a functioning GPDB system. As far as we can tell, the
      other headers and functions are already handled through other configure
      checks.
      Co-authored-by: Jacob Champion <pchampion@pivotal.io>
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
      Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
      Co-authored-by: David Krieger <dkrieger@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
  2. 22 Jan 2020 (4 commits)
  3. 21 Jan 2020 (8 commits)
    • Fix CATALOG_VARLEN markings in header files. · 43690c89
      Heikki Linnakangas committed
      Some GPDB-added system tables, and columns in upstream catalogs, were
      missing CATALOG_VARLEN markings or had them wrong. It doesn't cause any
      ill effect, but it's a hazard if someone tries to access the fields
      through the Form_pg_* struct. Let's be tidy.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Remove obsolete optimized code from EncodeDateTime. · 8661e51e
      Heikki Linnakangas committed
      We had optimized this piece of code in Greenplum, but PostgreSQL made a
      similar optimization in 9.6 (commit aa2387e2). In the 9.6 merge, I
      picked up the PostgreSQL version but left the Greenplum version in a
      commented-out block, with the plan to do some performance testing to see
      whether we could switch to the PostgreSQL version.
      
      I did that performance testing now, and it seems that the old GPDB version
      was about the same speed as the new PostgreSQL version. I used this to
      test it:
      
          -- generate test data
          create table tstest  (distkey int4, ts timestamp) distributed by (distkey);
          insert into tstest
            select 1, g from generate_series(now(), now()+ '1 year', '1 second') g;
          vacuum tstest;
      
          -- test query
          \timing on
          select min(ts::text) from tstest;
      
      The query took about 15 seconds on my laptop, and 'perf' says that about
      10% of the CPU time was spent in EncodeDateTime, with both versions.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Fix commented-out code. · b49993a6
      Heikki Linnakangas committed
      I'm just about to remove it, but let's fix it first so that we have
      the fixed version in the git history, in case someone wants to revisit
      this.
    • Add sanity check for on conflict update to avoid wrong data distribution · 8bd2f1c2
      Zhenghua Lyu committed
      The statement `insert on conflict do update` will invoke update on the
      segments. If the ON CONFLICT update modifies the distribution keys of
      the table, this would lead to wrong data distribution.
      
      This commit avoids the issue by raising an error in transformInsertStmt
      if it finds that the ON CONFLICT update would touch the distribution
      keys of the table.
      
      Fixes github issue: https://github.com/greenplum-db/gpdb/issues/9444
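      A minimal sketch of the case that is now rejected (the table is
      hypothetical; note that in GPDB a unique index must include the
      distribution key):

      ```sql
      CREATE TABLE t (id int UNIQUE, val int) DISTRIBUTED BY (id);
      INSERT INTO t VALUES (1, 1);
      -- Updating the distribution key could leave the row on the wrong
      -- segment, so this now raises an error at parse analysis:
      INSERT INTO t VALUES (1, 2)
        ON CONFLICT (id) DO UPDATE SET id = 99;
      ```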
    • Fix potential global deadlock for upsert · 893c5293
      Zhenghua Lyu committed
      The statement `insert on conflict do update` may invoke ExecUpdate
      on the segments, so it should be treated as an update statement when
      choosing the lock mode.
      
      Fixes github issue https://github.com/greenplum-db/gpdb/issues/9449
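      For example, a statement of this shape (hypothetical table) now takes
      the same lock mode as a plain UPDATE:

      ```sql
      CREATE TABLE counters (id int UNIQUE, n int) DISTRIBUTED BY (id);
      -- May run ExecUpdate on the segments, so it is locked like an UPDATE
      -- to avoid a global deadlock between concurrent upserts:
      INSERT INTO counters VALUES (1, 0)
        ON CONFLICT (id) DO UPDATE SET n = counters.n + 1;
      ```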
    • Implementing multi-phase grouping sets. (#9219) · 4e00f481
      Richard Guo committed
      This work is based on the multi-phase aggregation in master. The idea is
      to perform the grouping sets aggregation in the partial phase and then
      perform normal aggregation in the final phase.
      
      In the partial phase, we attach a GROUPINGSET_ID to each transvalue. In
      the final phase, we include this GROUPINGSET_ID at the head of the sort
      keys and group keys for group aggregation. This ensures correctness in
      the presence of NULLs.
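      For example, a query like the following (hypothetical table) can now be
      split into a partial grouping-sets phase and a final phase keyed on the
      GROUPINGSET_ID:

      ```sql
      CREATE TABLE items_sold (brand text, size text, sales int)
        DISTRIBUTED BY (brand);
      -- NULLs produced by the grouping sets stay distinct from real NULL
      -- values because each transvalue carries its grouping set's id:
      SELECT brand, size, sum(sales)
      FROM items_sold
      GROUP BY GROUPING SETS ((brand), (size), ());
      ```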
    • Revert "ci: remove server-build-* resources from non-prod pipelines" · ea975b6e
      Ning Yu committed
      This reverts commit 6f9729c1.
    • ci: stick python modules to py2 compatible versions · 68a82378
      Ning Yu committed
      Python 2 reached its end of life on Jan 1st, 2020, and many upstream
      Python modules are in the process of dropping Python 2 support; their
      newer versions may not work for us.  We have encountered several such
      issues on the pipeline this year, so to make our lives easier we now pin
      the modules to Python 2 compatible versions.  For now we only pin the
      modules used by the concourse scripts; later we might want to pin more
      of those used by our utility scripts.
  4. 20 Jan 2020 (5 commits)
  5. 19 Jan 2020 (1 commit)
    • Fix gpperfmon tmid inconsistent across master and segments (#9450) · c9989a93
      Wang Hao committed
      The tmid should have the same value across the cluster; it increases
      only on a full gpdb cluster restart.  Tmid, gp_session_id, and
      gp_command_count together uniquely identify a single query execution,
      and monitoring agents such as gpperfmon rely on this uniqueness.
      
      Commit 58c7833d introduced a different way of computing gpmon_gettmid(),
      which brought in a problem: the tmid may differ between the master and
      the segments.  This commit fixes the problem.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
  6. 17 Jan 2020 (5 commits)
  7. 16 Jan 2020 (1 commit)
    • ForgetRelationFsyncRequests in mdunlink for AO during crash recovery · 1223ffac
      Ashwin Agrawal committed
      On a primary segment in normal mode, the backend performs the fsync for
      an AO table itself and doesn't delegate the work to the checkpointer;
      hence it doesn't need to register or forget fsync requests for AO
      tables.  The TRUNCATE command currently shares the same code for heap
      and AO via ExecuteTruncate() -> heap_truncate_one_rel(): it writes a
      generic file-truncate WAL record and registers the request with the
      checkpointer process.  In normal mode, the UNLINK request is registered
      with the checkpointer process and the backend doesn't perform the unlink
      of the base file itself, irrespective of AO or heap, so this works fine.
      
      But during crash recovery we can't determine from the file-truncate
      record whether it's a heap or an AO file, so an fsync request is
      registered either way.  If mdunlink() then skips sending the forget
      fsync request for AO and instead directly unlinks the file, it causes a
      PANIC with "could not fsync file.....No such file or directory".
      
      To avoid this situation and have defensive code, it is better to always
      forget the relation fsync request during crash recovery in mdunlink(),
      irrespective of the storage type.  This makes the system resilient in
      the presence of any such issue, as the only way to recover from "could
      not fsync file.....No such file or directory" during crash recovery is
      to reset the xlogs, which is very dangerous and causes data loss.  In
      normal mode we continue to avoid having mdunlink() forget the fsync
      request for AO, just to avoid overwhelming the fsync request queue.
      
      In the longer term, it is probably better to have separate truncate code
      for AO and heap, and not to use a generic file-truncate record for both.
      But regardless, as stated above, it seems better to have this change.
      Reviewed-by: Asim R P <apraveen@pivotal.io>
      Reviewed-by: Paul Guo <pguo@pivotal.io>
  8. 15 Jan 2020 (2 commits)