1. 06 Apr, 2019 (1 commit)
    • Do not modify a cached plan statement during ExecuteTruncate. · e3cf4f26
      Committed by Adam Berlin
      User-defined functions cache their plan; therefore, if we modify the
      plan during execution, we risk having invalid data during the next
      execution of the cached plan.
      
      ExecuteTruncate modifies the plan's relations when declaring partitions
      that need truncation.
      
      Instead, we copy the list of relations that need truncation and add the
      partition relations to the copy.
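      
      As an illustrative sketch of the problem shape (the table and function
      names below are hypothetical, not taken from the commit), a plpgsql
      function caches its TRUNCATE plan, so the second call must not observe
      partition relations appended to the cached statement during the first call:
      ```sql
      CREATE TABLE trunc_part (a int, b int)
          DISTRIBUTED BY (a)
          PARTITION BY RANGE (b) (START (1) END (4) EVERY (1));
      
      CREATE FUNCTION trunc_all() RETURNS void AS $$
      BEGIN
          -- the plan for this statement is cached by the function
          TRUNCATE trunc_part;
      END;
      $$ LANGUAGE plpgsql;
      
      SELECT trunc_all();
      SELECT trunc_all();  -- re-executes the cached plan
      ```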
      e3cf4f26
  2. 05 Apr, 2019 (1 commit)
    • Refactor old bugbuster metadata_track suite · 2bf45876
      Committed by Daniel Gustafsson
      The metadata_track suite was originally part of cdbfast, dating back
      around 10-11 years. It was later moved into bugbuster around six years
      ago, already then with doubts as to what it actually did. After the
      open sourcing we scrapped bugbuster, moving anything worthwhile (or
      just not analyzed for usefulness yet) into the normal regress schedule
      which is where metadata_track remained till now. This all according to
      memory and the old proprietary issue tracker.
      
      Looking at metadata_track, it's entirely duplicative, only issuing
      lots of DDL already tested elsewhere without verifying the results,
      most likely because it was originally testing the metadata tracking
      of the operations. Since the latter is no longer happening, move
      parts of the test into the existing pg_stat_last_operation test and
      remove the rest, as the remaining value of this quite slow test
      (spending ~10-12 minutes serially in the pipeline) is highly
      debatable.
      
      The existing pg_stat_last_operation was, and I quote, "underwhelming"
      so most of it is replaced herein. There is still work to be done in
      order to boost metadata tracking test coverage, but this is at least
      a start.
      
      Reviewed-by: Jimmy Yih
      2bf45876
  3. 26 Mar, 2019 (1 commit)
  4. 14 Mar, 2019 (1 commit)
  5. 13 Mar, 2019 (2 commits)
    • Move test partial_table to another test group to avoid deadlock · d9606d18
      Committed by Zhenghua Lyu
      Previously, the test cases `partial_table` and `subselect_gp2` were
      in the same test group, so they might run concurrently.
      
      `partial_table` contains the statement `update gp_distribution_policy`,
      and `subselect_gp2` contains the statement `VACUUM FULL pg_authid`. These
      two statements may lead to a local deadlock on the QD when running
      concurrently if GDD is disabled.
      
      If GDD is disabled,
      
         `update gp_distribution_policy`'s lock acquisition:
           1. at the parsing stage, lock `gp_distribution_policy` in Exclusive Mode
           2. later, when it needs to check authentication, lock `pg_authid` in
              AccessShare Mode
      
         `VACUUM FULL pg_authid`'s lock acquisition:
           1. lock `pg_authid` in Access Exclusive Mode
           2. later, when rebuilding the heap, it might delete some dependencies;
              this will do GpPolicyRemove, which locks `gp_distribution_policy`
              in RowExclusive Mode
      
      So there is a potential local deadlock.
      d9606d18
    • Tolerant misplaced tuples during data expansion · 28026210
      Committed by Ning Yu
      Add tests to ensure that a table can be expanded correctly even if it
      contains misplaced tuples.
      28026210
  6. 12 Mar, 2019 (1 commit)
  7. 11 Mar, 2019 (1 commit)
    • Retire the reshuffle method for table data expansion (#7091) · 1c262c6e
      Committed by Ning Yu
      This method was introduced to improve data redistribution
      performance during gpexpand phase 2; however, per benchmark results
      the effect did not meet our expectations.  For example, when expanding a
      table from 7 segments to 8 segments the reshuffle method is only 30%
      faster than the traditional CTAS method, and when expanding from 4 to 8
      segments reshuffle is even 10% slower than CTAS.  When there are indexes
      on the table the reshuffle performance can be worse, and an extra VACUUM is
      needed to actually free the disk space.  According to our experiments,
      the bottleneck of the reshuffle method is the tuple deletion operation,
      which is much slower than the insertion operation used by CTAS.
      
      The reshuffle method does have some benefits: it requires less extra
      disk space, and it also requires less network bandwidth (similar to the
      CTAS method with the new JCH reduce method, but less than CTAS + MOD).
      It can also be faster in some cases; however, as we cannot automatically
      determine when it is faster, it is not easy to benefit from it in
      practice.
      
      On the other hand, the reshuffle method is less tested and may
      have bugs in corner cases, so it is not production ready yet.
      
      Given all of this, we decided to retire it entirely for now; we might add it
      back in the future if we can get rid of the slow deletion or find
      reliable ways to automatically choose between the reshuffle and CTAS
      methods.
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8xknWag-SkI/5OsIhZWdDgAJ
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      1c262c6e
  8. 01 Mar, 2019 (1 commit)
    • Add gpexpand status check in some utility · 7f5705cd
      Committed by Zhenghua Lyu
      The following utilities do not work when the cluster is in gpexpand phase 1:
      
       * gppkg
       * gpconfig
       * gpcheckcat
      
      Add a check for them so that if the cluster is expanding in phase 1, they will
      print an error message and exit.
      7f5705cd
  9. 20 Feb, 2019 (1 commit)
  10. 13 Feb, 2019 (1 commit)
    • Test minimal explain formatting in explain_format test. · 0132983b
      Committed by Ekta Khanna
      The explain_format test validates memory-related information, but the
      printing of that information is not stable: it varies between ORCA and
      planner, assert-enabled and disabled builds, and whether the query reuses
      a slice or runs on a fresh session. In addition, future modifications
      unrelated to explain formatting can cause this test to fail. Hence, only
      the minimal explain format validation that is currently found to be
      stable is retained for this test.
      
      A better alternative needs to be found for full validation of explain
      formatting; the SQL-based approach seems too fragile for it.
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      0132983b
  11. 06 Feb, 2019 (1 commit)
    • Replace WorkFile stuff with plain BufFiles. · f1ef3668
      Committed by Heikki Linnakangas
      Temporary files have been somewhat inconsistent across different
      operations. Some operations used the Greenplum-specific "workfile" API,
      while others used the upstream BufFile API directly.
      
      The workfile API provides some extra features: workfiles are visible in
      the gp_toolkit views, and you can limit their size with the
      gp_workfile_limit_* GUCs. The temporary files that didn't go through the
      workfile API were exempt, which is not cool.
      
      To make things consistent, remove the workfile APIs and use BufFiles
      directly everywhere. Re-implement the user-facing views and the limit
      tracking on top of the BufFile API, so that those features are not lost.
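      
      For instance, the usual spill-file introspection should keep working; a
      minimal sketch, assuming the standard gp_toolkit view and GUC names:
      ```sql
      -- spill files created through BufFile should still show up in the gp_toolkit views
      SELECT * FROM gp_toolkit.gp_workfile_usage_per_query;
      
      -- and the per-query spill limit should still apply (value is in kB)
      SET gp_workfile_limit_per_query = 1048576;
      ```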
      
      The workfile API also supported compressing the temporary files using
      zlib. That feature is lost with this commit, but will be re-introduced by
      the next commit.
      
      Another feature that this removes is the checksumming of temporary files.
      That doesn't seem very useful, so we can probably live without it. But
      if it's still needed, then that should also be re-implemented on top
      of the BufFile API later.
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8Xe9MGor0pM/SjqiOo83BAAJ
      Reviewed-by: Mel Kiyama <mkiyama@pivotal.io>
      Reviewed-by: Yandong Yao <yyao@pivotal.io>
      f1ef3668
  12. 01 Feb, 2019 (1 commit)
    • Use normal hash operator classes for data distribution. · 242783ae
      Committed by Heikki Linnakangas
      Replace the use of the built-in hashing support for built-in datatypes, in
      cdbhash.c, with the normal PostgreSQL hash functions. Now is a good time
      to do this, since we've already made the change to use jump consistent
      hashing in GPDB 6, so we'll need to deal with the upgrade problems
      associated with changing the hash functions, anyway.
      
      It is no longer enough to track which columns/expressions are used to
      distribute data. You also need to know the hash function used. For that,
      a new field is added to gp_distribution_policy, to record the hash
      operator class used for each distribution key column. In the planner,
      a new opfamily field is added to DistributionKey, to track that throughout
      the planning.
      
      Normally, if you do "CREATE TABLE ... DISTRIBUTED BY (column)", the
      default hash operator class for the datatype is used. But this patch
      extends the syntax so that you can specify the operator class explicitly,
      like "... DISTRIBUTED BY (column opclass)". This is similar to how an
      operator class can be specified for each column in CREATE INDEX.
      
      To support upgrade, the old hash functions have been converted to special
      (non-default) operator classes, named cdbhash_*_ops. For example, if you
      want to use the old hash function for an integer column, you could do
      "DISTRIBUTED BY (intcol cdbhash_int4_ops)". The old hard-coded whitelist
      of operators that have "compatible" cdbhash functions has been replaced
      by putting the compatible hash opclasses in the same operator family. For
      example, the legacy integer operator classes cdbhash_int2_ops,
      cdbhash_int4_ops and cdbhash_int8_ops are all part of the
      cdbhash_integer_ops operator family.
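      
      To illustrate the two syntaxes (the table and column names here are made
      up for the example, not taken from the commit):
      ```sql
      -- default hash opclass for the distribution key's datatype
      CREATE TABLE t_default (id int, payload text) DISTRIBUTED BY (id);
      
      -- explicitly request the legacy (GPDB 5 style) hashing for the same column
      CREATE TABLE t_legacy (id int, payload text) DISTRIBUTED BY (id cdbhash_int4_ops);
      ```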
      
      This removes the pg_database.hashmethod field. The hash method is now
      tracked on a per-table and per-column basis, using the opclasses, so it's
      not needed anymore.
      
      To help with upgrade from GPDB 5, this introduces a new GUC called
      'gp_use_legacy_hashops'. If it's set, CREATE TABLE uses the legacy hash
      opclasses, instead of the default hash opclasses, if the opclass is not
      specified explicitly. pg_upgrade will set the new GUC, to force the use of
      legacy hashops, when restoring the schema dump. It will also set the GUC
      on all upgraded databases, as a per-database option, so any new tables
      created after upgrade will also use the legacy opclasses. It seems better
      to be consistent after upgrade so that, for example, collocation between old and new
      tables works. The idea is that some time after the upgrade, the
      admin can reorganize all tables to use the default opclasses instead. At
      that point, he should also clear the GUC on the converted databases. (Or
      rather, the automated tool that hasn't been written yet, should do that.)
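      
      A sketch of how that GUC is meant to be used, following the description
      above (the table name is illustrative):
      ```sql
      -- with the GUC on, CREATE TABLE falls back to the legacy cdbhash_*_ops
      -- opclasses whenever no opclass is specified explicitly
      SET gp_use_legacy_hashops = on;
      CREATE TABLE t_after_upgrade (id int) DISTRIBUTED BY (id);
      ```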
      
      ORCA doesn't know about hash operator classes, or the possibility that we
      might need to use a different hash function for two columns with the same
      datatype. Therefore, it cannot produce correct plans for queries that mix
      different distribution hash opclasses for the same datatype, in the same
      query. There are checks in the Query->DXL translation, to detect that
      case, and fall back to planner. As long as you stick to the default
      opclasses in all tables, we let ORCA create the plan without any regard
      to them, and use the default opclasses when translating the DXL plan to a
      Plan tree. We also allow the case that all tables in the query use the
      "legacy" opclasses, so that ORCA works after pg_upgrade. But a mix of the
      two, or using any non-default opclasses, forces ORCA to fall back.
      
      One curiosity with this is the "int2vector" and "aclitem" datatypes. They
      have a hash opclass, but no b-tree operators. GPDB 4 used to allow them
      as DISTRIBUTED BY columns, but we forbid that in GPDB 5, in commit
      56e7c16b. Now they are allowed again, so you can specify an int2vector
      or aclitem column in DISTRIBUTED BY, but it's still pretty useless,
      because the planner still can't form EquivalenceClasses on it, and will
      treat it as "strewn" distribution, and won't co-locate joins.
      
      Abstime, reltime, tinterval datatypes don't have default hash opclasses.
      They are being removed completely on PostgreSQL v12, and users shouldn't
      be using them in the first place, so instead of adding hash opclasses for
      them now, we accept that they can't be used as distribution key columns
      anymore. Add a check to pg_upgrade, to refuse upgrade if they are used
      as distribution keys in the old cluster. Do the same for 'money' datatype
      as well, although that's not being removed in upstream.
      
      The legacy hashing code for anyarray in GPDB 5 was actually broken. It
      could produce a different hash value for two arrays that are considered
      equal, according to the = operator, if there were differences in e.g.
      whether the null bitmap was stored or not. Add a check to pg_upgrade, to
      reject the upgrade if array types were used as distribution keys. The
      upstream hash opclass for anyarray works, though, so it is OK to use
      arrays as distribution keys in new tables. We just don't support binary
      upgrading them from GPDB 5. (See github issue
      https://github.com/greenplum-db/gpdb/issues/5467). The legacy hashing of
      'anyrange' had the same problem, but that was new in GPDB 6, so we don't
      need a pg_upgrade check for that.
      
      This also tightens the checks in ALTER TABLE ALTER COLUMN and CREATE UNIQUE
      INDEX, so that you can no longer create a situation where a non-hashable
      column becomes the distribution key. (Fixes github issue
      https://github.com/greenplum-db/gpdb/issues/6317)
      
      Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/4fZVeOpXllQ
      Co-authored-by: Mel Kiyama <mkiyama@pivotal.io>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
      Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
      Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      Reviewed-by: Simon Gao <sgao@pivotal.io>
      Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
      Reviewed-by: Yandong Yao <yyao@pivotal.io>
      242783ae
  13. 24 Jan, 2019 (1 commit)
    • Don't choose indexscan when you need a motion for a subplan (#6665) · cd055f99
      Committed by Melanie
      
      When you have a subquery under a SUBLINK that might get pulled up, you should
      not allow indexscans to be chosen for the relation which is the
      range table for the subquery. If that relation is distributed and the
      subquery is pulled up, you will need to redistribute or broadcast that
      relation and materialize it on the segments, and cdbparallelize will not
      add a motion and materialize an indexscan, so you cannot use an indexscan
      in these cases.
      You can't materialize an indexscan because it materializes only one
      tuple at a time, and when you compare that to the param you get from the
      relation on the segments, you can get wrong results.
      
      Because we don't pick indexscans very often, we don't see this issue very
      often. It requires a subquery referring to a distributed table in a subplan
      which gets pulled up during planning, and then, when adding paths, the
      indexscan happens to be cheapest.
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      cd055f99
  14. 23 Jan, 2019 (1 commit)
    • Run gp_toolkit early to reduce testing time of it due to less logs. · 5bc0bcb2
      Committed by Paul Guo
      The gp_toolkit test exercises various log-related views like gp_log_system(), etc.  If
      we run the test earlier, fewer logs have been generated and thus the test runs faster.
      In my test environment, the test time drops from ~22 seconds to 6.x seconds
      with this patch. I also checked the whole test case; this change does not affect
      the test coverage.
      5bc0bcb2
  15. 21 Jan, 2019 (1 commit)
  16. 14 Jan, 2019 (1 commit)
    • Set client_encoding to match DB encoding in QD->QE connections. · a6c9b436
      Committed by Heikki Linnakangas
      QE processes largely don't care about client_encoding, because query
      results are sent through the interconnect, except for a few internal
      commands, and the query text is presumed to already be in the database
      encoding, in QD->QE messages.
      
      But there were a couple of cases where it mattered. Error messages
      generated in QEs were being converted to client_encoding, but QD assumed
      that they were in server encoding.
      
      Now that the QEs don't know the user's client_encoding, COPY TO needs
      changes. In COPY TO, the QEs are responsible for forming the rows in the
      final client_encoding, so the QD now needs to explicitly use the COPY's
      ENCODING option when it dispatches the COPY to the QEs.
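      
      Roughly speaking, the dispatched command carries the encoding explicitly,
      as in this sketch (the table name is illustrative):
      ```sql
      -- the QEs form the output rows directly in the requested encoding
      COPY sjis_table TO STDOUT WITH (ENCODING 'SJIS');
      ```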
      
      The COPY TO handling wasn't quite right even before this patch. It showed
      up as a regression failure in the src/test/mb/mbregress.sh 'sjis' test: when
      client_encoding was set with the PGCLIENTENCODING environment variable,
      it wasn't set correctly in the QEs, which showed up as incorrectly encoded
      COPY output. Now that we always set it to match the database encoding in
      the QEs, that's moot.
      
      While we're at it, change the mbregress test so that it's not sensitive to
      row orderings. And make atmsort.pm more lenient, to recognize "COPY
      <tablename> TO STDOUT", even when the tablename contains non-ascii
      characters. These changes were needed to make the src/test/mb/ tests pass
      cleanly.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/5241.
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/WPmHXuU9T94/gvpNOE73FwAJ
      Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
      a6c9b436
  17. 10 Jan, 2019 (1 commit)
    • Resolve FIXME for validatepart by passing relpersistence of root · 263355cf
      Committed by Melanie Plageman
      MergeAttributes was used in atpxPart_validate_spec to get the schema and
      constraints to make a new leaf partition as part of ADD or SPLIT
      PARTITION. It was likely used as a convenience, since it already
      existed, and seems like the wrong function for the job.
      
      Previously, atpxPart_validate_spec simply hard-coded false for the relation
      persistence, since the parameter was simply `isTemp`. Once the options for
      relation persistence were expanded to include unlogged, this parameter was
      changed to take a relpersistence. In MergeAttributes, for the code path we
      actually hit when calling it from here (we pass in the schema as NIL and
      therefore hit only half of the MergeAttributes code), the `supers` parameter
      is that of the parent partition and includes its relpersistence. So, by
      passing in the parent's relpersistence here, the relpersistence checks are
      redundant: we are comparing the parent's relpersistence to its own.
      However, this function is only called when we are making a new relation
      that, because we don't allow a different persistence to be specified for
      the child, would just use the parent's relpersistence anyway; by passing in
      a hard-coded value we were incorrectly assuming that we always create a
      permanent relation.
      
      Since MergeAttributes was overkill, we wrote a new helper
      function, SetSchemaAndConstraints, to get the schema and constraints of
      a relation. This function doesn't do many of the special validation checks
      that may be required by callers using it in the context of partition
      tables (so user beware); however, it is probably only useful in the
      context of partition tables anyway, because it assumes constraints will
      be cooked, which isn't the case for all relations.
      We split it into two smaller inline functions for clarity. We also felt
      this would be a useful helper function in general, so we extern'd it.
      
      This commit also sets the relpersistence that is used to make the leaf
      partition when adding a new partition or splitting an existing partition.
      
      makeRangeVar is a function from upstream which is basically a
      constructor. It sets relpersistence in the RangeVar to the hard-coded
      value RELPERSISTENCE_PERMANENT. However, because we use the root
      partition to get the constraints and column information for the new
      leaf, after we use the default construction of the RangeVar we need to
      set the relpersistence to that of the parent.
      
      This commit specifically only sets it back for the case in which we are
      adding a partition with `ADD PARTITION` or through `SPLIT PARTITION`.
      
      Without this commit, a leaf partition of an unlogged table created
      through `ADD PARTITION` or `SPLIT PARTITION` would incorrectly have its
      relpersistence set to permanent.
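      
      The fixed behaviour can be pictured with a small sketch; the table names
      here are made up and not taken from the commit:
      ```sql
      CREATE UNLOGGED TABLE upart (a int, b int)
          DISTRIBUTED BY (a)
          PARTITION BY RANGE (b) (START (1) END (11) EVERY (5));
      
      -- the new leaf should inherit relpersistence = 'u' instead of silently
      -- becoming a permanent relation
      ALTER TABLE upart ADD PARTITION extra START (11) END (21);
      SELECT relname, relpersistence FROM pg_class WHERE relname LIKE 'upart%';
      ```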
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      263355cf
  18. 05 Dec, 2018 (1 commit)
  19. 30 Nov, 2018 (1 commit)
    • pg_dump: address DROP COLUMN FIXME and add regression tests · b93d631d
      Committed by Jim Doty
      After reviewing the use of ALTER TABLE DROP COLUMN in the pg_dump code,
      we feel that the ordering of DDL statements in the dump does not raise any
      concerns with respect to vanilla table inheritance hierarchies.
      
      In the current implementation, we need to cascade the DROP COLUMN to
      partitions, but we do not want to cascade to "normal" inherited child
      tables. Because the child tables are not hooked into the inheritance
      hierarchy until after the DROP COLUMN is performed, it looks like ALTER
      TABLE and ALTER TABLE ONLY are equivalent for vanilla inheritance.
      
      We've added some regression tests that will hopefully catch any changes
      to the above assumptions. The added sql test file is run from not only
      pg_regress, but also during the standalone pg_upgrade cluster test. It
      is intended to contain corner cases specific to pg_upgrade where the
      state that needs to be tested would otherwise be destroyed during a dump
      and restore (such as dropped columns).
      Co-authored-by: Jacob Champion <pchampion@pivotal.io>
      Co-authored-by: Jim Doty <jdoty@pivotal.io>
      b93d631d
  20. 29 Nov, 2018 (1 commit)
  21. 23 Nov, 2018 (1 commit)
    • Reduce differences between reshuffle tests · 2eef2ba2
      Committed by Ning Yu
      There are 3 reshuffle tests: the ao one, the co one, and the heap one.
      They share almost the same cases, but differ in table names and
      create table options.  There are also some differences introduced when
      adding regression tests, which were only added to one file but not the others.
      
      We want to keep the differences between these tests minimal, so that a
      regression test for ao also covers the similar case for heap, and
      once we understand one of the test files we have almost the same
      knowledge of the others.
      
      Here is a list of changes to these tests:
      - reduce differences on table names by using schema;
      - reduce differences on CREATE TABLE options by setting default storage
        options;
      - simplify the creation of partially distributed tables by using the
        gp_debug_numsegments extension;
      - copy some regression tests to all the tests;
      - retire the no longer used helper function;
      - move the tests into an existing parallel test group;
      
      The pg_regress test framework provides some @@ tokens for ao/co tests;
      however, we still cannot merge the ao and co tests into one file, as
      WITH (OIDS) is only supported by ao but not co.
      2eef2ba2
  22. 22 Nov, 2018 (1 commit)
    • New extension to debug partially distributed tables · 3119009a
      Committed by Ning Yu
      Introduced a new debugging extension gp_debug_numsegments to get / set
      the default numsegments when creating tables.
      
      gp_debug_get_create_table_default_numsegments() gets the default
      numsegments.
      
      gp_debug_set_create_table_default_numsegments(text) sets the default
      numsegments in text format, valid values are:
      - 'FULL': all the segments;
      - 'RANDOM': pick a random set of segments each time;
      - 'MINIMAL': the minimal set of segments;
      
      gp_debug_set_create_table_default_numsegments(integer) sets the default
      numsegments directly, valid range is [1, gp_num_contents_in_cluster].
      
      gp_debug_reset_create_table_default_numsegments(text) or
      gp_debug_reset_create_table_default_numsegments(integer) reset the
      default numsegments to the specified value, and the value can be reused
      later.
      
      gp_debug_reset_create_table_default_numsegments() resets the default
      numsegments to the value passed last time; if there is no previous call
      to it, the value is 'FULL'.
      
      Refactored ICG test partial_table.sql to create partial tables with this
      extension.
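      
      A minimal usage sketch, assuming the extension is installed via CREATE
      EXTENSION (the table name is illustrative):
      ```sql
      CREATE EXTENSION gp_debug_numsegments;
      
      -- create the next tables on the minimal set of segments
      SELECT gp_debug_set_create_table_default_numsegments('MINIMAL');
      CREATE TABLE t_partial (c1 int, c2 int) DISTRIBUTED BY (c1);
      
      SELECT gp_debug_get_create_table_default_numsegments();
      
      -- go back to the previously used default
      SELECT gp_debug_reset_create_table_default_numsegments();
      ```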
      3119009a
  23. 19 Nov, 2018 (1 commit)
    • Correct behavior for reshuffling partition tables · 8bf413d6
      Committed by ZhangJackey
      The previous code made an UPDATE statement for the root
      and its child partitions when we reshuffle a partition
      table. That not only involves redundant work but also
      leads to an error while reshuffling a two-level partition
      table (because the mid-level partitions have no data).
      
      The commit does the following work:
      
       * Only make the UPDATE statement for leaf partitions or
          non-partitioned tables.
       * Refactor the reshuffle test cases. We remove the
          python udf code and use `gp_execute_on_server`
          and `gp_dist_random` to test replicated tables.
      
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
      8bf413d6
  24. 16 Nov, 2018 (1 commit)
    • QE writer should cancel QE readers before aborting · 206ffa6c
      Committed by David Kimura
      If a QE writer marks the transaction as aborted while a reader is
      still executing the query, the reader may affect shared memory state.
      E.g. consider a transaction being aborted where a table needs to be
      dropped as part of abort processing.  Suppose the writer has dropped all
      the buffers belonging to the table from the shared buffer cache and is
      about to unlink the file for the table.  Concurrently, a reader,
      unaware of the writer's abort, is still executing the query.  The
      reader may bring a page from the file that the writer is about to
      unlink back into the shared buffer cache.
      
      In order to prevent such situations the writer walks procArray to find
      the readers and sends SIGINT to them.
      
      Walking procArray is expensive, and is avoided as much as possible.
      The patch walks the procArray only if the command being aborted (or
      the last command in a transaction that is being aborted) performed a
      write and at least one reader slice was part of the query plan.
      
      To avoid confusion on the QD due to "canceling MPP operation" error
      messages emitted by the readers upon receiving the SIGINT, the readers
      do not emit them on the libpq channel.
      
      Discussion:
      https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/S2WL1FtJEJ0/Wh6DfJ-RBwAJ
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      206ffa6c
  25. 14 Nov, 2018 (2 commits)
  26. 12 Nov, 2018 (1 commit)
    • Refactor reshufle regression test cases. · f79a10c3
      Committed by Zhenghua Lyu
      This commit does three things:
        - move the reshuffle tests to greenplum_schedule
        - add cases to test that reshuffle can correctly abort
        - remove some redundant cases (I think in this regression
          test it is enough to just test expanding from 2 to 3)
      f79a10c3
  27. 07 Nov, 2018 (1 commit)
    • Adjust GANG size according to numsegments · 6dd2759a
      Committed by ZhangJackey
      Now that we have partial tables and a flexible gang API, we can allocate
      gangs according to numsegments.
      
      With commit 4eb65a53, GPDB supports tables distributed on a subset of segments,
      and with the series of commits (a3ddac06, 576690f2), GPDB supports a flexible
      gang API. Now is a good time to combine the two new features; the goal is
      to create gangs only on the necessary segments for each slice. This commit
      also improves singleQE gang scheduling and does some code cleanup. However,
      if ORCA is enabled, the behavior is just like before.
      
      The outline of this commit is:
      
         * Modify the FillSliceGangInfo API so that gang_size is truly flexible.
         * Remove the numOutputSegs and outputSegIdx fields in the motion node. Add a
            new field isBroadcast to mark whether the motion is a broadcast motion.
         * Remove the global variable gp_singleton_segindex and choose the singleQE
            segment_id randomly (by gp_sess_id).
         * Remove the field numGangMembersToBeActive in Slice because it is now
            exactly slice->gangsize.
         * Modify the message printed if the GUC Test_print_direct_dispatch_info
            is set.
         * An explicit BEGIN now creates a full gang.
         * Format and remove destSegIndex.
         * The isReshuffle flag in ModifyTable is useless, because it is only used
            when we want to insert a tuple into a segment that is outside the range
            of numsegments.
      
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
      6dd2759a
  28. 24 Oct, 2018 (1 commit)
  29. 23 Oct, 2018 (1 commit)
    • Convert interconnect udp tests from tinc to regress · 78bbcf8c
      Committed by Ning Yu
      All existing interconnect udp tests are converted from tinc to regress.
      They need about 16 minutes to finish, so we put them in a new test
      target, installcheck-icudp, to avoid slowing down ICW.  The fast ones are
      also put in ICW to get more verification.
      
      On the pipeline we also made a few changes to the running environment: we
      used to run the tests in a multi-segment cluster, now we run them
      directly on a demo cluster.
      78bbcf8c
  30. 19 Oct, 2018 (1 commit)
    • Improve Oid sync logic for both QD and QE in wraparound cases · b3805d3c
      Committed by Jimmy Yih
      When the Oid counters on the QD and QEs are not synced as expected, there are
      cases where object creation could take a long time due to Oid synchronization.
      
      These are the two that we found:
      1. If the QD wrapped around and the QE did not, the QD syncing with the highest
      QE Oid counter value could result in a long loop. Additionally, there's a hack
      in how the QD advances its Oid counter above the highest QE Oid counter value
      where it will increment the Oid counter 10 more times to be *safe*. However,
      this hack introduces holes in the Oid count and makes this bug more probable.
      2. The QE advances its Oid counter when creating objects with dispatched
      pre-assigned Oids. If the pre-assigned Oid dispatched by the QD was not from
      wraparound and the QE had already wrapped around, it would result in a long
      loop for the QE to synchronize with the QD.
      
      To prevent long Oid synchronization loops, we add logic to handle wraparound
      cases in both QD and QE.
      
      This can be easily reproduced by running `pg_resetxlog -o <Oid>` to set the Oid
      counter value on QD and/or QE to simulate the scenarios.
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      b3805d3c
  31. 16 Oct, 2018 (1 commit)
    • DROP DATABASE should remove shared buffer cache entries · 4db039a0
      Committed by Jimmy Yih
      This used to be handled by persistent tables. When persistent tables were
      removed, we forgot to add back the dropping of shared buffer cache entries as
      part of the DROP DATABASE operation. This commit adds it back along with a test
      so we do not ever forget.
      
      Some relevant comments from Heikki/Jesse:
      The reason this issue was not seen is that there is a RequestCheckpoint()
      near the end of dropdb(). So before dropdb() actually removes the files for the
      database, all dirty buffers have already been flushed to disk. The buffer
      manager will not try to write clean buffers back to disk, so they will just
      age out of the buffer cache over time.
      
      One way this issue could have shown itself, had we not caught it, is the
      rare scenario where the same database OID is later reused for a different
      database, which could cause false positives in future buffer cache lookups.
      4db039a0
  32. 13 Oct, 2018 (1 commit)
  33. 02 Oct, 2018 (1 commit)
    • Avoid PANIC on multiple function execution when using ORCA · 04e43e64
      Committed by Adam Berlin
      A cached query planned statement contains information that is
      freed after the first execution of a function. The second execution
      uses the cached planned statement to populate the execution state
      from a freed pointer and throws a segmentation fault.
      
      To resolve, we copy the contents out of the planned statement rather
      than copying a pointer to the planned statement, so that when our
      executor state gets freed, it does not also free the cached data.
      
      Note: `numSelectorsPerScanId` is the problematic property, and is only
      used for partition tables.
      Co-authored-by: David Kimura <dkimura@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      04e43e64
  34. 01 Oct, 2018 (1 commit)
    • Also support event triggers on CREATE PROTOCOL. · d920bf31
      Committed by Heikki Linnakangas
      We don't particularly care about event triggers on CREATE PROTOCOL as such,
      but we were tripping the assertion in EventTriggerCommonSetup(), because
      it was being called for CREATE PROTOCOL even though it was listed as
      not supported. It seems to actually work fine if we just mark it as
      supported, so we might as well.
      
      Add a test case, also for the CREATE EXTERNAL TABLE support.
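      
      For example, a generic DDL logger of the following shape (the names are
      illustrative, not the test's) should now also fire for CREATE PROTOCOL and
      CREATE EXTERNAL TABLE:
      ```sql
      CREATE FUNCTION log_ddl() RETURNS event_trigger AS $$
      BEGIN
          RAISE NOTICE 'DDL command executed: %', tg_tag;
      END;
      $$ LANGUAGE plpgsql;
      
      CREATE EVENT TRIGGER log_ddl_trigger ON ddl_command_end
          EXECUTE PROCEDURE log_ddl();
      ```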
      d920bf31
  35. 29 Sep, 2018 (1 commit)
    • Fix bitmap index recovery issue · 47369d3c
      Committed by xiong-gang
      Bitmap indexes don't write backup pages to xlog, so if the system crashes
      after a checkpoint and the index has been dropped, bitmap_redo will find that
      the page is invalid and won't be able to recover.
      
      For example:
      create table bm_test (a int4, b int4);
      create index bm_b_idx on bm_test using bitmap(b);
      checkpoint;
      insert into bm_test select i,i from generate_series(1,100000)i;
      drop table bm_test;
      
      pkill -9 postgres
      gpstart -a
      47369d3c
  36. 28 Sep, 2018 (1 commit)
    • Allow tables to be distributed on a subset of segments · 4eb65a53
      Committed by ZhangJackey
      There was an assumption in gpdb that a table's data is always
      distributed on all segments, but this is not always true: for example,
      when a cluster is expanded from M segments to N (N > M), all the tables
      are still on M segments. To work around the problem we used to have to
      alter all the hash-distributed tables to randomly distributed to get
      correct query results, at the cost of bad performance.
      
      Now we support table data to be distributed on a subset of segments.
      
      A new column `numsegments` is added to the catalog table
      `gp_distribution_policy` to record how many segments a table's data is
      distributed on.  By doing so we can allow DML on M-segment tables, and
      joins between M-segment and N-segment tables are also supported.
      
      ```sql
      -- t1 and t2 are both distributed on (c1, c2),
      -- one on 1 segments, the other on 2 segments
      select localoid::regclass, attrnums, policytype, numsegments
          from gp_distribution_policy;
       localoid | attrnums | policytype | numsegments
      ----------+----------+------------+-------------
       t1       | {1,2}    | p          |           1
       t2       | {1,2}    | p          |           2
      (2 rows)
      
      -- t1 and t1 have exactly the same distribution policy,
      -- join locally
      explain select * from t1 a join t1 b using (c1, c2);
                         QUERY PLAN
      ------------------------------------------------
       Gather Motion 1:1  (slice1; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Seq Scan on t1 b
       Optimizer: legacy query optimizer
      
      -- t1 and t2 are both distributed on (c1, c2),
      -- but as they have different numsegments,
      -- one has to be redistributed
      explain select * from t1 a join t2 b using (c1, c2);
                                QUERY PLAN
      ------------------------------------------------------------------
       Gather Motion 1:1  (slice2; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Redistribute Motion 2:1  (slice1; segments: 2)
                           Hash Key: b.c1, b.c2
                           ->  Seq Scan on t2 b
       Optimizer: legacy query optimizer
      ```
      4eb65a53
  37. 26 Sep, 2018 (1 commit)
    • Move zstd ICG test into its gpcontrib module · 3dc4b156
      Committed by David Kimura
      The test is included in "installcheck" under the zstd module. It should
      eventually be included as part of ICW.
      
      CAVEAT: the test for zstd, as it turns out, has been wrong (nondeterministic)
      since its inclusion in commit 724f9d27.
      
      This move eliminates the need to have a separate "error: zstd not
      supported" answer file in ICG.
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      3dc4b156
  38. 21 Sep, 2018 (1 commit)
    • Fix pg_stat_activity show wrong session id after session reset bug (#5757) · ac54faad
      Committed by Teng Zhang
      * Fix pg_stat_activity show wrong session id after session reset bug
      
      Currently, if a session is reset because of some error such as an OOM,
      after calling CheckForResetSession, gp_session_id is bumped to a new one,
      but sess_id in pg_stat_activity remains unchanged and shows the wrong number.
      This commit updates sess_id in pg_stat_activity once a session is reset.
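      
      In other words, after a reset, a check along these lines (a sketch; the
      exact column set varies by version) should report the new session id:
      ```sql
      -- sess_id here is expected to follow the backend's new gp_session_id
      SELECT sess_id, usename FROM pg_stat_activity;
      ```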
      
      * Refactor test using gp_execute_on_server to trigger session reset
      ac54faad