1. 28 December 2017 (7 commits)
• Able to cancel COPY PROGRAM ON SEGMENT if the program hangs · 110b825f
  Committed by Adam Lee
There are two places where the QD keeps trying to get data, ignores SIGINT, and does not send a signal to the QEs. If the program on a segment has no input or output, the COPY command hangs.
      
      To fix it, this commit:
      
1. lets the QD wait for connections to become readable before calling PQgetResult(), and cancels the queries if it gets an interrupt signal while waiting (see the sketch below)
2. sets DF_CANCEL_ON_ERROR when dispatching in cdbcopy.c
3. completes COPY error handling
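
A minimal sketch of the wait-then-cancel loop from item 1, using only stock libpq calls; the helper name and the interrupt flag are hypothetical, not the actual cdbcopy.c code:

#include <signal.h>
#include <stdbool.h>
#include <sys/select.h>
#include "libpq-fe.h"

/* Hypothetical QD-side helper: wait until the QE connection is readable
 * before handing control to PQgetResult(), so it never blocks forever. */
static PGresult *
get_result_interruptibly(PGconn *conn, volatile sig_atomic_t *interrupted)
{
    int     sock = PQsocket(conn);
    bool    cancel_sent = false;

    while (PQisBusy(conn))
    {
        fd_set          readfds;
        struct timeval  tv = { 1, 0 };  /* wake up periodically to poll the flag */

        FD_ZERO(&readfds);
        FD_SET(sock, &readfds);

        if (select(sock + 1, &readfds, NULL, NULL, &tv) > 0 &&
            !PQconsumeInput(conn))
            return NULL;                /* connection is broken */

        if (*interrupted && !cancel_sent)
        {
            /* Ask the QE to cancel the query instead of waiting forever. */
            char      errbuf[256];
            PGcancel *cancel = PQgetCancel(conn);

            if (cancel != NULL)
            {
                PQcancel(cancel, errbuf, sizeof(errbuf));
                PQfreeCancel(cancel);
            }
            cancel_sent = true;
        }
    }

    return PQgetResult(conn);           /* input already buffered; returns at once */
}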
      
      -- prepare
      create table test(t text);
      copy test from program 'yes|head -n 655360';
      
      -- could be canceled
      copy test from program 'sleep 100 && yes test';
      copy test from program 'sleep 100 && yes test<SEGID>' on segment;
      copy test from program 'yes test';
      copy test to '/dev/null';
      copy test to program 'sleep 100 && yes test';
      copy test to program 'sleep 100 && yes test<SEGID>' on segment;
      
      -- should fail
      copy test from program 'yes test<SEGID>' on segment;
      copy test to program 'sleep 0.1 && cat > /dev/nulls';
      copy test to program 'sleep 0.1<SEGID> && cat > /dev/nulls' on segment;
• Make btdexppages parameter to function gp_bloat_diag() numeric. · 152783e1
  Committed by Nadeem Ghani
The gp_bloat_expected_pages.btdexppages column is numeric, but it was passed to the function gp_bloat_diag() as an integer in the definition of the view of the same name, gp_bloat_diag. This caused integer-overflow errors when the number of expected pages exceeded the maximum integer value for columns with very large widths.
      
This changes the function signature and call to use numeric for the
btdexppages parameter.
      
Adds a simple test to mimic the customer issue.
      
      Author: Nadeem Ghani <nghani@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
• CI: deprecate CCP remote state path convention · 3520f6c8
  Committed by Jim Doty
      Going forward any pipeline with a terraform resource should specify a
      BUCKET_PATH of `clusters/`.
      
As a static value, it can be hardcoded in place of being an interpolated value from a secrets file. The key `tf-bucket-path` in any secrets yaml files is now deprecated and should be removed.
      
We are dropping the convention where we had teams place their clusters' tfstate files in a path. In the past this convention was necessary, but now that we tag the clusters with much richer metadata, it is no longer strictly necessary.
      
Balanced against the need to keep the bucket organized was the possibility of two clusters attempting to register their AWS public key under the same name. This collision was the result of a mismatch in scope: the terraform resource that provisions the clusters and selects names only looks in the given path for name collisions, but AWS key names must be unique for the whole account.
      
By collapsing all of the clusters into an account-wide bucket, we will rely on the terraform resource to check for name conflicts going forward.
      
Addresses bug: https://www.pivotaltracker.com/story/show/153928527
Signed-off-by: Kris Macoskey <kmacoskey@pivotal.io>
• CI: deprecate CCP remote state path convention · 6442a16a
  Committed by Kris Macoskey
      Going forward any pipeline with a terraform resource should specify a
      BUCKET_PATH of `clusters/`.
      
As a static value, it can be hardcoded in place of being an interpolated value from a secrets file. The key `tf-bucket-path` in any secrets yaml files is now deprecated and should be removed.
      
We are dropping the convention where we had teams place their clusters' tfstate files in a path. In the past this convention was necessary, but now that we tag the clusters with much richer metadata, it is no longer strictly necessary.
      
Balanced against the need to keep the bucket organized was the possibility of two clusters attempting to register their AWS public key under the same name. This collision was the result of a mismatch in scope: the terraform resource that provisions the clusters and selects names only looks in the given path for name collisions, but AWS key names must be unique for the whole account.
      
By collapsing all of the clusters into an account-wide bucket, we will rely on the terraform resource to check for name conflicts going forward.
      
Addresses bug: https://www.pivotaltracker.com/story/show/153928527
Signed-off-by: Jim Doty <jdoty@pivotal.io>
• Update ORCA version to 2.53.2 · a808030f
  Committed by Bhuvnesh Chaudhary
Test files are also updated in this commit, since we no longer generate a cross-join alternative when a cross join was present in the input.

A cross join has CScalarConst(1) as its join condition. Suppose the input expression is as below, with a cross join at the top level between the CLogicalInnerJoin and CLogicalGet "t3":
      
      +--CLogicalInnerJoin
         |--CLogicalInnerJoin
         |  |--CLogicalGet "t1"
         |  |--CLogicalGet "t2"
         |  +--CScalarCmp (=)
         |     |--CScalarIdent "a" (0)
         |     +--CScalarIdent "b" (9)
         |--CLogicalGet "t3"
         +--CScalarConst (1)
For the above expression, the predicate generated for the (lower) cross join between t1 and t3 would be CScalarConst (1). Only in such cases do we not generate the alternative with the lower join as a cross join; for example:
      
      +--CLogicalInnerJoin
         |--CLogicalInnerJoin
         |  |--CLogicalGet "t1"
         |  |--CLogicalGet "t3"
         |  +--CScalarConst (1)
         |--CLogicalGet "t2"
         +--CScalarCmp (=)
            |--CScalarIdent "a" (0)
            +--CScalarIdent "b" (9)
Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>
• Fix bug in ALTER TABLE ADD COLUMN for AOCS · 7e7ee7e2
  Committed by Xin Zhang
If the first insert into an AOCS table aborted, the first row number of the visible blocks in the block directory should be greater than 1. But by default we initialize the `DatumStreamWriter` with `blockFirstRowNumber=1` for newly added columns, so the first row numbers are not consistent with the visible blocks. This caused inconsistency between a scan of the base table and a scan using indexes through the block directory.
      
This wrong-result issue only happened with the first invisible blocks. The current code (`aocs_addcol_endblock()`, called in `ATAocsWriteNewColumns()`) already handles other gaps after the first visible blocks.
      
The fix updates `blockFirstRowNumber` with `expectedFRN`, and hence fixes the misalignment of the visible blocks.
      
      Author: Xin Zhang <xzhang@pivotal.io>
      Author: Ashwin Agrawal <aagrawal@pivotal.io>
• Fix recursive chown for pxf_automation (#4162) · 8624a163
  Committed by Lav Jain
2. 27 December 2017 (2 commits)
3. 22 December 2017 (12 commits)
4. 21 December 2017 (17 commits)
• Remove code to pass locale data to gpsegstart.py. · 2faf8885
  Committed by Heikki Linnakangas
The locale information is not used for anything, so there is no need to collect it and pass it around.
• CI: Remove no longer necessary package installs · e79198a0
  Committed by Kris Macoskey
Installing packages on every execution of a test exposes the test to any upstream flakiness. Therefore the installation of generic packages is being moved into the underlying OS, in this case the AMI used for the CCP job.
      
Rather than outright removing the package installation, a much better pattern is to replace it with a validation of the assumptions made about the packages installed on the underlying OS the test will run within.
      
The call `yum --cacheonly list installed [List of Packages]` does two things:
      
  1. For the given list of packages, the command returns 0 if all of them
     are installed, and 1 if any are not.
      
  2. The `--cacheonly` flag prevents the call from issuing an upstream
     repository-metadata refresh. This is not a requirement, but it is an
     easy optimization that avoids upstream flakiness even further.
      
Note: `--cacheonly` assumes the repository metadata cache has been refreshed at least once; if it has not, the flag causes the command to fail. We assume this happened in the underlying OS, since it was required to install the packages in the first place.
Signed-off-by: Alexandra Wang <lewang@pivotal.io>
Signed-off-by: Divya Bhargov <dbhargov@pivotal.io>
• tinc: gpexpand move port range lower than ephemeral ports · 4c7ea6dc
  Committed by C.J. Jameson
Fix gpexpand the same way as commit 4b439bc9:
      
Port numbers used by GPDB should be below the kernel's ephemeral port range.
      
The ephemeral port range is given by the net.ipv4.ip_local_port_range kernel parameter; it is set to 32768 through 60999. If GPDB uses port numbers in this range, an FTS probe request may not get a response, resulting in FTS incorrectly marking a primary down.
      
We change the example configuration files to lower the port numbers to the proper range.
      
      Author: Marbin Tan <mtan@pivotal.io>
      Author: C.J. Jameson <cjameson@pivotal.io>
• docs - review zstd doc additions, misc related edits (#4170) · ce407afa
  Committed by Lisa Owen
      * docs - review zstd doc additions, misc related edits
      
      * modify ddl storage example 1 to use zstd
• Pass Row Number and Rank Oid to Optimizer Config · 1c85430f
  Committed by Bhuvnesh Chaudhary
• Updates to work with the new PXF Repo · cdef7a30
  Committed by Shivram Mani
• docs - add note about MaxStartups to relevant utility cmds (#4186) · 0bf54573
  Committed by Lisa Owen
      * docs - add note about MaxStartups to relevant utility cmds
      
      * uses ...
• Refactor the lists of fault injector identifiers, for easier maintenance. · 24991ba9
  Committed by Heikki Linnakangas
Maintaining these lists, in particular the list of fault injector identifiers, or "fault points", is a bit tedious, because the names and enum values of the identifiers were kept in separate lists in two different files, which had to be kept in sync. To ease that pain, merge the two lists into a single file, with the identifier name and ID of each entry on the same line. Use C macro magic to generate the enum and the array of strings from the single header file, similar to how the keyword list in src/include/parser/kwlist.h works.
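
A minimal sketch of that kwlist.h-style trick; the file and identifier names here are hypothetical:

/* fault_ids.h -- the single authoritative list (hypothetical names) */
FI_IDENT(FaultIdNotSpecified, "not_specified")
FI_IDENT(FaultCheckpointStart, "checkpoint_start")
FI_IDENT(FaultFtsProbe, "fts_probe")

/* In the header: expand the list into an enum. */
typedef enum FaultInjectorType_e
{
#define FI_IDENT(id, str) id,
#include "fault_ids.h"
#undef FI_IDENT
    FaultIdMax
} FaultInjectorType_e;

/* In the .c file: expand the same list into the matching name array. */
static const char *FaultIdStrings[] =
{
#define FI_IDENT(id, str) str,
#include "fault_ids.h"
#undef FI_IDENT
};

Because both expansions read the same file, the enum and the string array can no longer drift apart.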
      
Remove the list of fault injection points from the --help message of clsInjectFault.py; it's just too easy to forget to update it. We could write some Perl magic or something to create that list from the authoritative list in the header file at compilation time, but it hardly seems worth the effort, as this is just a developer tool. Advise the user to look directly into the header file instead.
• Remove dead code · b386e9e8
  Committed by Nadeem Ghani
There was a step defined but not used, and a helper method called only from that step. This commit removes both.
      
      Author: Nadeem Ghani <nghani@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
• behave: Allow gptransfer test to run at midnight without flaking. · 66e16369
  Committed by Nadeem Ghani
The gptransfer test used a step that looked for a logfile with a date in the name. If that logfile existed at 11:59 PM the day before, and the test looked for it at 12:00 AM the next day, it "wouldn't be there".
      
      Refactor the test so that assertions about using the typical
      gpAdminLogs directory are as banal as possible.
      
      Also see https://github.com/greenplum-db/gpdb/pull/4072/commits/84bf83d5013f891547dc21576d04a281cfd2faf7.
      
      Author: Nadeem Ghani <nghani@pivotal.io>
      Author: Shoaib Lari <slari@pivotal.io>
• Bump ORCA version · 6df10e43
  Committed by sambitesh
• Update comment · 24fa7505
  Committed by Haisheng Yuan
• Update regression test sql file to analyze table · 99162ec0
  Committed by Haisheng Yuan
• Revert "Update ICG test expected output files" · 8ffb673b
  Committed by Haisheng Yuan
      This reverts commit 4fac169fb1204de54a05ac14fba1a5e4d9f82c08.
• Update ICG test expected output files · 67788bb1
  Committed by Haisheng Yuan
• Tune bitmap scan cost model by updating to nonlinear cost per page · 982bcd4e
  Committed by Haisheng Yuan
Some customers reported that the planner generated a plan using seqscan instead of bitmapscan, and the execution of the seqscan was 5x slower than the bitmapscan. The statistics were updated and quite accurate.
      
Bitmap table scan uses a formula to interpolate between random_page_cost and seq_page_cost to determine the cost per page. But the default value of random_page_cost is 100x the value of seq_page_cost, so under the original cost formula random_page_cost predominates in the final result. Even though the formula is declared to be non-linear, it behaves almost linearly, and it cannot reflect the real cost per page when a majority of the pages are fetched.
      
Therefore, the cost formula is updated to a truly non-linear function that reflects both random_page_cost and seq_page_cost for different percentages of pages fetched.
      
For example, with the defaults random_page_cost = 100 and seq_page_cost = 1, if 80% of the pages are fetched, the cost per page under the old formula is 11.45, more than 10x that of a seqscan, because the cost is dominated by random_page_cost. Under the new formula the cost per page is 1.63, which reflects the real cost better, in my opinion.
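
The 11.45 figure matches an arithmetic interpolation with weight sqrt(fraction fetched), and a geometric interpolation with the same weight reproduces the 1.63 figure, so the sketch below uses that reading; the exact formula in the commit may differ:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double random_page_cost = 100.0;    /* defaults from the example above */
    double seq_page_cost = 1.0;
    double frac = 0.8;                  /* fraction of pages fetched */
    double w = sqrt(frac);              /* interpolation weight */

    /* Old: arithmetic interpolation; random_page_cost dominates. */
    double old_cost = random_page_cost - (random_page_cost - seq_page_cost) * w;

    /* New (assumed form): geometric interpolation with the same weight. */
    double new_cost = pow(random_page_cost, 1.0 - w) * pow(seq_page_cost, w);

    printf("old: %.2f, new: %.2f\n", old_cost, new_cost);  /* old: 11.45, new: 1.63 */
    return 0;
}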
      
      [#151934601]
• Use strcmp for string comparison. · a7d11627
  Committed by Heikki Linnakangas
On my laptop, with gcc -O, these comparisons didn't work as intended. As a result, the FTS probe messages were not processed, and gp_segment_configuration always claimed the mirrors were down, even though they were running OK.
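
The pitfall is the classic pointer-versus-contents one; a contrived sketch (not the actual FTS code):

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *expected = "PROBE";
    char        received[16];

    /* In the real case the message arrives over the wire into a buffer. */
    snprintf(received, sizeof(received), "%s", "PROBE");

    /* Wrong: == compares addresses, not contents. Here it always fails even
     * though the strings match; with two pointers into pooled literals it can
     * accidentally succeed, which is why gcc -O changes the behavior. */
    if (received == expected)
        printf("pointer comparison matched\n");

    /* Right: compare the bytes. */
    if (strcmp(received, expected) == 0)
        printf("strcmp matched\n");

    return 0;
}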
5. 20 December 2017 (2 commits)
• Bump ORCA version to 2.52.0 · 322770c5
  Committed by Shreedhar Hardikar
• Reimplement ORCA interrupts using a callback function · a6283c82
  Committed by Shreedhar Hardikar
As pointed out by Heikki, maintaining another variable to match one in the database system would be error-prone and cumbersome, especially while merging with upstream. This commit instead initializes ORCA with a pointer to a GPDB function that returns true when QueryCancelPending or ProcDiePending is set. This way we no longer have to micro-manage setting and resetting an internal ORCA variable, or touch signal handlers.
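
A sketch of the callback's shape; the registration call is hypothetical, while the two flags are the real backend globals:

#include <stdbool.h>

/* Backend globals set by GPDB's signal handlers. */
extern volatile bool QueryCancelPending;
extern volatile bool ProcDiePending;

/* Handed to ORCA at initialization; ORCA polls it at safe points and
 * aborts optimization when it returns true. */
static bool
gpdb_abort_requested(void)
{
    return QueryCancelPending || ProcDiePending;
}

/* Hypothetical registration at optimizer startup: */
/* InitGPOPT(gpdb_abort_requested); */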
      
      This commit also reverts commit 0dfd0ebc "Support optimization interrupts
      in ORCA" and reuses tests already pushed by 916f460f and 0dfd0ebc.