1. 28 Aug 2017, 12 commits
    • H
      Use ereport() rather than elog(), for "expected" ERRORs. · 522c7c09
      Committed by Heikki Linnakangas
      Use ereport(), with a proper error code, for errors that are "expected" to
      happen sometimes, like missing configuration files, or network failure.
      
      That's nicer in general, but the reason I bumped into this is that internal
      error messages include a source file name and line number in GPDB, and if
      those error messages are included in the expected output of regression
      tests, those tests will fail any time a change to the file moves the
      elog() around and the line number changes.
      522c7c09
    • D
      Avoid side effects in assertions · 288dde95
      Committed by Daniel Gustafsson
      An assertion with a side effect may alter the main codepath when
      the tree is built with --enable-cassert, which in turn may lead
      to subtle differences due to compiler optimizations and/or straight
      bugs in the side effect. Rewrite the assertions without side
      effects to leave the main codepath intact.
      288dde95
    • P
      Enlarge the range of memory_shared_quota in tests · 46eb2c73
      Committed by Pengzhou Tang
      46eb2c73
    • P
      Add test cases under utility mode for resource group · 910036b0
      Committed by Pengzhou Tang
      This commit verifies that connections in utility mode are not
      governed by resource groups.
      910036b0
    • P
      Add resource group tests for cursors, pl functions and prepare/execute statements · 9392d17b
      Committed by Pengzhou Tang
      This commit verifies that cursors, PREPARE/EXECUTE, and statements within
      PL functions do not trigger a self-deadlock under resource group
      concurrency control.
      9392d17b
    • P
      Use index scan on pg_resgroupcapability for resource group · 3fd8a618
      Committed by Pengzhou Tang
      We used to use a sequential scan on pg_resgroupcapability in functions
      that need to do a full scan of the table. The problem is that after
      pg_resgroupcapability has been updated/deleted millions of times, as our
      stress tests do, the table fills up with invalid blocks, and a sequential
      scan wastes a lot of time skipping over them. Using an index scan
      resolves this.
      3fd8a618
    • P
      Add parallel tests for resource group · ee1d30c9
      Committed by Pengzhou Tang
      This commit contains all kinds of parallel tests, including combinations
      of CREATE, DROP, ALTER and queries; it depends on the dblink component to
      run queries concurrently.
      Signed-off-by: Richard Guo <riguo@pivotal.io>
      ee1d30c9
    • A
      Optimize `COPY TO ON SEGMENT` result processing · 266355d3
      Committed by Adam Lee
      Don't send nonsense '\n' characters just for counting; let segments
      report how many rows were processed instead.
      Signed-off-by: Ming LI <mli@apache.org>
      266355d3
    • X
      Add GUC to control the distribution key checking for "COPY FROM ON SEGMENT" · 6566d48c
      Committed by Xiaoran Wang
      The GUC is `gp_enable_segment_copy_checking`, and its default value is
      true. Users can disable the distribution key check by setting it to false.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      6566d48c
    • X
      Check distribution key restriction for `COPY FROM ON SEGMENT` · 65321259
      Committed by Xiaoran Wang
      When the command `COPY FROM ON SEGMENT` is used, we copy data from a
      local file directly into the table on the segment. When copying
      data, we need to apply the distribution policy to each record to compute
      the target segment. If the target segment ID isn't equal to the
      current segment ID, we report an error to preserve the distribution
      key restriction.
      
      Because the segment has no metadata about the table's distribution policy
      and partition policy, we copy the distribution policy of the main table
      from the master to the segment in the query plan. When the parent table
      and a partitioned sub-table have different distribution policies, it is
      difficult to check the distribution key restriction across all sub-tables.
      In this case, we report an error.
      
      When a partitioned table's distribution policy is RANDOMLY and differs
      from the parent table's, users can disable this check with the GUC
      `gp_enable_segment_copy_checking`.
      
      Check the distribution key restriction as follows:
      
      1) Table isn't partitioned:
          Compute the row's target segment. If the row doesn't belong to
          this segment, report an error.
      
      2) Table is partitioned and the distribution policy of the partitioned
      table is the same as the main table's:
          Compute the row's target segment. If the row doesn't belong to
          this segment, report an error.
      
      3) Table is partitioned and the distribution policy of the partitioned
      table is different from the main table's:
          Checking is not supported; report an error.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
      Signed-off-by: Ming LI <mli@apache.org>
      Signed-off-by: Adam Lee <ali@pivotal.io>
      65321259
    • H
      Add new GPDB_EXTRA_COL mechanism to process_col_defaults.pl · 215283a6
      Committed by Heikki Linnakangas
      This allows setting one of the GPDB-added attributes on a line, without
      modifying the original line. This reduces the diff of pg_proc.h from
      upstream.
      
      This has no effect on the resulting BKI file, except for whitespace. IOW,
      there are no catalog changes in this commit. I checked that by diffing the
      resulting BKI file, before and after this patch, with "diff -w".
      
      I still left the TODO comment in pg_proc.h in place, which pointed out
      that it'd be nice if we could automatically use prodataaccess = 'c' as the
      default for SQL-language functions, and 'n' for others. I actually wrote a
      more flexible prototype at first that could do that. In that prototype, you
      could provide an arbitrary perl expression that was evaluated on every row,
      and could compute a value based on other columns. But that was more
      complicated, and at the same time, not as flexible, because you could still
      not specify particular values for just one row. So I think this is better
      in the end.
      
      Also, I noticed that we haven't actually marked all SQL-language functions
      with prodataaccess = 'c'. Tsk tsk. It's too late for catalog changes, so
      not fixing that now. At some point, we should discuss whether we should
      do something different with prodataaccess, like change the code so that
      it's ignored for SQL language functions altogether. Or perhaps just remove
      the column, the only useful value for it is the magic 's' value, which
      can only be used in built-in functions because there's no DDL syntax for
      it. But that's a whole different story.
      215283a6
    • H
      Don't strip semicolon from re-constructed DATA lines in BKI sources. · e6569903
      Committed by Heikki Linnakangas
      It's harmless, as the genbki script ignores them, but seems a bit untidy.
      It also makes it harder to diff between the re-constructed DATA lines and
      the originals.
      e6569903
  2. 27 Aug 2017, 1 commit
    • D
      Ensure directory cleanup in workfile manager · 6aecdb98
      Committed by Daniel Gustafsson
      Calling elog(ERROR ..) will exit the current context, so the
      cleanup function would never run. Shift around to ensure the
      cleanup is called. This codepath was recently introduced in
      commit 00ce2c14 where a surrounding try/catch block was removed.
      
      Also change to ereport from elog since this is an error that
      a user could run into.
      6aecdb98
  3. 26 Aug 2017, 12 commits
  4. 25 Aug 2017, 13 commits
    • H
      Change a few error messages to match upstream again. · ceb602da
      Committed by Heikki Linnakangas
      I don't understand why these were modified in GPDB in the first place.
      I dug into the old git history, from before Greenplum was open sourced, and
      traced the change to a massive commit from 2011, which added support for
      (non-recursive) WITH clause. I think the change was just collateral damage
      in that patch; I don't see any relationship between WITH clause support and
      these error messages.
      
      These errors can be reproduced with queries like this:
      
          (select 'foobar' order by 1) order by 1;
          (select 'foobar' limit 1) limit 2;
          (select 'foobar' offset 1) offset 2;
      ceb602da
    • H
      Use ereport, rather than elog, for performance. · 01dff3ba
      Committed by Heikki Linnakangas
      ereport() has one subtle but important difference from elog(): it doesn't
      evaluate its arguments, if the log level says that the message doesn't
      need to be printed. This makes a small but measurable difference in
      performance, if the arguments contain more complicated expressions, like
      function calls.
      
      While performance testing a workload with very short queries, I saw some
      CPU time being used in DtxContextToString. Those calls were coming from the
      arguments to elog() statements, and the result was always thrown away,
      because the log level was not high enough to actually log anything. Turn
      those elog()s into ereport()s, for speed.
      
      The problematic case here was a few elogs containing DtxContextToString
      calls, in hot codepaths, but I changed a few surrounding ones too, for
      consistency.
      
      Simplify the mock test, to not bother mocking elog(), while we're at it.
      The real elog/ereport work just fine in the mock environment.
      01dff3ba
    • H
      Fix assertion failure in single-user mode. · a29aecf7
      Committed by Heikki Linnakangas
      In single-user mode, MyQueueId isn't set. But there was an assertion for
      that in ResourceQueueGetQueryMemoryLimit. To fix, don't apply memory limits
      in single-user mode.
      a29aecf7
    • P
      Cleanup specialized test cases for UDP type of interconnect · 61b82012
      Committed by Pengzhou Tang
      The UDP type of interconnect has been removed from the code base; we
      also need to clean up its specialized test cases.
      61b82012
    • A
      e61a0134
    • X
      Make pg_resetxlog consistent with gpinitsystem checksum setting · 0bc37e07
      Committed by Xin Zhang
      By default, gpinitsystem will turn on HEAP_CHECKSUM by calling initdb
      with --data-checksum.
      
      Originally, pg_resetxlog will set data_checksum_version to 0 if
      pg_control is not readable.
      
      In this fix, we make pg_resetxlog set the data_checksum_version to
      PG_DATA_CHECKSUM_VERSION instead, hence its default behavior will be
      consistent with gpinitsystem.
      Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
      0bc37e07
    • Z
      Fix bug: resgroup decrease concurrency_limit not work correctly · 79f3d357
      Committed by Zhenghua Lyu
      In the previous code, when a user decreases a resource group's
      concurrency_limit to a value less than the number of currently running
      jobs in that group, the calculation of the memory that needs to be
      returned to SYSPOOL is incorrect, which can cause an assertion failure.
      
      We add code to handle this situation. The logic
      is that when we decide to return some memory to SYSPOOL, we
      only return `Min(total_memory_should_return, max_mem_can_return)`.
      
      And since the alter-memory command has gradually-take-effect semantics,
      when a job is about to end and finds that its ending could provide a
      free slot for others (not blocked by concurrency_limit), it will not
      return all the memory it could; it reserves some to make sure that the
      new job would not be blocked because of memory_quota.
      Signed-off-by: Gang Xiong <gxiong@pivotal.io>
      79f3d357
    • M
      DOCS: update gpconfig. Updated, simplified quoting syntax for GUC values (#3016) · 0aed7aef
      Committed by Mel Kiyama
      * DOCS: update gpconfig. Updated, simplified quoting syntax for GUC values
      
      * docs: update gpconfig examples in docs to use updated quoting syntax.
      
      * docs: gpconfig -fix typos
      0aed7aef
    • L
      docs - misc edits (#3019) · e53d773f
      Committed by Lisa Owen
      * misc doc edits
      
      - pg_proc prorows column
      - pg_locks virtualxid, virtualtransaction columns
      - update a few queries to remove pg_locks transaction
      - add TRUNCATE to LOCK sql page ACCESS RESTRICTIVE command list
      
      * fix pg_locks intro wording
      e53d773f
    • X
      Update PR pipeline with CONFIGURE_FLAGS · 8409d622
      Committed by Xin Zhang
      Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
      8409d622
    • H
      Drop test view after test, to work around failure in binary swap test. · fef69349
      Committed by Heikki Linnakangas
      Commit a7de4d60 added a test view, which is causing trouble for the
      binary swap test. The binary swap test runs the regression tests using
      the new binaries, but then swaps out an old binary, and runs pg_dump
      against the regression database. That fails with this test view, because
      the old version still has the bug that commit a7de4d60 fixed, and
      cannot parse the view definition stored in the catalogs correctly.
      
      So, the code is fine, and we are still binary-compatible. To silence the
      false failure in the binary swap test, drop the test view after the test,
      so that it won't be present in the regression database, when the binaries
      are swapped.
      
      Analysis by Jesse Zhang. See discussion on the gpdb-dev mailing list.
      
      Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/UJ3U_yqs38A
      fef69349
    • S
      122186bc
    • A
      40391f6d
  5. 24 Aug 2017, 2 commits
    • H
      Revert the premature optimization in readfuncs.c. · a7de4d60
      Committed by Heikki Linnakangas
      We had replaced the upstream code in readfuncs.c that checks what kind of
      a Node we're reading, with a seemingly smarter binary search. However, that's
      a premature optimization. Firstly, the linear search is pretty darn fast,
      because after compiler optimizations, it will check for the string length
      first. Secondly, the binary search implementation required an extra
      palloc+pfree, which is expensive enough that it almost surely destroys any
      performance gain from using a binary search. Thirdly, this isn't a very
      performance-sensitive codepath anyway. This is used e.g. to read view
      definitions from the catalog, which doesn't happen very often. The
      serialization code used when dispatching a query from QD to QEs is a more
      hot codepath, but that path uses a different method, in readfast.c.
      
      So, revert the code to the way it is in upstream. This hopefully reduces
      merge conflicts in the future.
      
      Also, there was in fact a silly bug in the old implementation. It used
      the wrong identifier string for the RowCompare expression. Because of that, if
      you tried to use a row comparison in a view, you got an error. Fix that,
      and also add a regression test for it.
      a7de4d60
    • H
      Enable forgotten test. · e1cbe31b
      Committed by Heikki Linnakangas
      Commit 0be2e5b0 added a regression test to check that if a view contains
      an ORDER BY, that ORDER BY is obeyed on selecting from the view. However,
      it forgot to add the test to the schedule, so it was never run. Add it.
      
      There is actually a problem with the test as it is written: gpdiff masks
      out the differences in the row order, so this test won't catch the problem
      that it was originally written for. Nevertheless, it seems better to
      enable the test and run it than not to run it at all. But add a comment
      to point that out.
      e1cbe31b