1. 17 Jul 2020 (2 commits)
    • docs - update utility docs with IP/hostname information. (#10379) · 54dbd926
      Authored by Mel Kiyama
      * docs - update utility docs with IP/hostname information.
      
      Add information to gpinitsystem, gpaddmirrors, and gpexpand ref. docs
      --Information about using hostnames vs. IP addresses
      --Information about configuring hosts that have multiple NICs
      
      Also updated some examples in gpinitsystem
      
      * docs - review comment updates. Add more information from dev.
      
      * docs - change examples to show valid configurations that support failover.
      Also fix typos and make minor edits.
      
      * docs - updates based on review comments.
    • docs - greenplumr input.signature (#10477) · 1c294e95
      Authored by Lisa Owen
  2. 16 Jul 2020 (3 commits)
  3. 15 Jul 2020 (5 commits)
    • Remove dead code contain_ctid_var_reference. · d229288a
      Authored by Zhenghua Lyu
      It was used to implement the dedup plan, which was
      refactored away by commit 9628a332.
      
      So this commit removes the now-unused function.
    • Fix flaky test case 'gpcopy' · 9480d631
      Authored by Pengzhou Tang
      The failing test case verifies that the command "copy lineitem to '/tmp/abort.csv'"
      can be cancelled after the COPY is dispatched to the QEs. To verify this, it checks
      that /tmp/abort.csv has fewer rows than lineitem.
      
      The cancellation logic in the code is:
      
      The QD dispatches the COPY command to the QEs; if the QD then gets a cancel
      interrupt, it sends a cancel request to the QEs. However, the QD keeps
      receiving data from the QEs even after getting the cancel interrupt: it
      relies on the QEs to receive the cancel request and explicitly stop copying
      data to the QD.
      
      Obviously, the QEs may already have copied out all of their data to the QD
      before they get the cancel request, so the test case cannot guarantee that
      /tmp/abort.csv has fewer rows than lineitem.
      
      To fix this, we just verify that the COPY command can be aborted with the
      message 'ERROR:  canceling statement due to user request'; the row-count
      verification is pointless here.
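      
      A hedged sketch of the revised check (the two-session layout and the
      pg_stat_activity filter are illustrative, not the actual isolation test):
      
      ```
      -- Session 1: start a COPY that writes out every row of lineitem.
      COPY lineitem TO '/tmp/abort.csv';
      
      -- Session 2: cancel it while the COPY is still running.
      SELECT pg_cancel_backend(pid)
      FROM pg_stat_activity
      WHERE query LIKE 'COPY lineitem%';
      
      -- Session 1 then fails with the message the test asserts on:
      -- ERROR:  canceling statement due to user request
      ```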
    • Cleanup idle reader gang after utility statements · d1ba4da5
      Authored by Hubert Zhang
      Reader gangs use a local snapshot to access the catalog; as a result, they
      do not synchronize with the sharedSnapshot from the writer gang, which
      leads to inconsistent visibility of catalog tables on an idle reader gang.
      Consider the case:
      
      select * from t, t t1; -- create a reader gang.
      begin;
      create role r1;
      set role r1;  -- the set command is also dispatched to the idle reader gang
      
      When the set role command is dispatched to the idle reader gang, the reader
      gang cannot see the new tuple for role r1 in the catalog table pg_authid.
      To fix this issue, we drop the idle reader gangs after each utility
      statement that may modify the catalog.
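      
      Continuing the example, a hedged sketch of how the stale snapshot could
      surface (the final query is illustrative, not taken from the commit):
      
      ```
      select * from t, t t1;  -- creates a reader gang
      begin;
      create role r1;
      set role r1;            -- also dispatched to the idle reader gang
      -- A later multi-slice query reuses the idle reader gang, whose local
      -- snapshot may not include the pg_authid tuple for r1, so it can fail
      -- even though the writer gang sees the new role.
      select * from t, t t1;
      ```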
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
    • Correct plan of general & segmentGeneral path with volatile functions. · d1f9b96b
      Authored by Zhenghua Lyu
      General and segmentGeneral locus imply that the corresponding slice, if
      executed on many different segments, should produce the same result data
      set. Thus, in some cases, General and segmentGeneral paths can be treated
      like broadcast.
      
      But what if a segmentGeneral or general locus path contains volatile
      functions? Volatile functions, by definition, do not guarantee the same
      result across invocations. So in such cases the path loses this property
      and cannot be treated as *general. Previously, the Greenplum planner did
      not handle these cases correctly. A Limit on a general or segmentGeneral
      path has the same issue.
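      
      A hedged illustration of the hazard (the query is invented for this
      example): a function scan has general locus, so the planner may evaluate
      it on every segment as if it were broadcast, but a volatile qual keeps a
      different subset on each segment:
      
      ```
      create table dist_t (a int) distributed by (a);
      -- Without the fix, treating the general-locus subquery as broadcast
      -- would let each segment keep its own random() subset of g, producing
      -- inconsistent join results across segments.
      select *
      from dist_t
      join (select g from generate_series(1, 10) g where random() < 0.5) s
        on dist_t.a = s.g;
      ```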
      
      The idea of the fix in this commit: when we find the pattern (a general or
      segmentGeneral locus path contains volatile functions), we create a motion
      path above it to turn its locus into singleQE, and then create a
      projection path. The core job then becomes choosing the places to check:
      
        1. For a single base rel, we only need to check its restrictions; this is
           at the bottom of the planner, in the function set_rel_pathlist
        2. When creating a join path, if the join locus is general or segmentGeneral,
           check its joinqual to see if it contains volatile functions
        3. When handling a subquery, we invoke the set_subquery_pathlist function;
           at the end of this function, check the targetlist and havingQual
        4. When creating a limit path, the same check-and-change algorithm should be used
        5. Correctly handle make_subplan
      
      The OrderBy clause and Group Clause are included in the targetlist and
      handled by Step 3 above.
      
      This commit also fixes DMLs on replicated tables. Update and Delete
      statements on a replicated table are special: they have to be dispatched
      to every segment to execute. So if they contain volatile functions in
      their targetList or where clause, we should reject such statements (see
      the sketch after the list):
      
        1. For the targetList, we check it in the function create_motion_path_for_upddel
        2. For the where clause, it is handled in the query planner: when we find the
           pattern and want to fix it, do an extra check for whether we are updating or
           deleting a replicated table, and if so reject the statement.
        3. The upsert case is handled in the transform stage.
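      
      A hedged illustration of these replicated-table rules (the table is
      invented for the example):
      
      ```
      create table rep_t (a int, b float) distributed replicated;
      
      -- Volatile function in the targetList: each segment would compute its
      -- own random() value and the replicas would diverge; rejected per
      -- check 1 above.
      update rep_t set b = random();
      
      -- Volatile function in the where clause: rejected per check 2.
      delete from rep_t where b < random();
      ```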
    • Fix uninitialized variable in pgrowlocks · 75283bc7
      Authored by Japin
      Because the variable rel is only used in the if (SRF_IS_FIRSTCALL()) branch,
      we move its declaration into that branch (suggested by Hubert Zhang).
  4. 14 Jul 2020 (3 commits)
  5. 13 Jul 2020 (4 commits)
    • Docs - remove HCI warning · 9eb9c2ac
      Authored by David Yozie
    • Update linux installation guide · ba5792fa
      Authored by Tyler Ramer
      Issue #10069 noted some problems with the Linux documentation.
      
      Update the documentation to be more accurate and to direct configuration
      steps to the appropriate documentation.
      Co-authored-by: Tyler Ramer <tramer@vmware.com>
      Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
    • Remove unused function pathnode_walk_node. · 7339a178
      Authored by Zhenghua Lyu
      Previously, `cdbpath_dedup_fixup` was the only function that invoked
      `pathnode_walk_node`, and it was removed by commit 9628a332.
      
      So this commit removes the now-unused function.
    • Fix flaky test for replication_keeps_crash. (#10423) · db60b003
      Authored by (Jerome)Junfeng Yang
      Remove the setting of `gp_fts_probe_retries` to 1, which may cause the FTS
      probe to fail. It was first added to reduce the test time, but such a low
      retry value may cause the test to fail while FTS is probing and updating
      the segment configuration. Since reducing `gp_fts_replication_attempt_count`
      also saves test time, skip altering `gp_fts_probe_retries`.
      
      Also found an assertion that may not hold when marking the mirror down
      happens before the walsender exits: this frees the replication status
      before the walsender exits and tries to record the disconnect info, which
      leads the segment to crash and start recovery.
  6. 10 Jul 2020 (11 commits)
    • ic-proxy: enable ic-proxy with --enable-ic-proxy · 81810a20
      Authored by Ning Yu
      We used to use the option --with-libuv to enable ic-proxy, but it was not
      straightforward to understand the purpose of that option. So we renamed it
      to --enable-ic-proxy, and the default setting is changed to "disable".
      
      Suggested by Kris Macoskey <kmacoskey@pivotal.io>
    • ic-proxy: let backends connect to the proxy bgworker · 94c9d996
      Authored by Ning Yu
      Only in proxy mode, of course. Currently the ic-proxy mode shares most of
      the backend logic with the ic-tcp mode, so instead of copying the code we
      embed the ic-proxy-specific logic in ic_tcp.c.
    • ic-proxy: launch as a bgworker · 5b60069c
      Authored by Ning Yu
    • ic-proxy: new value "proxy" in GUC gp_interconnect_type · 245ca266
      Authored by Ning Yu
      It is for the ic-proxy mode.
    • ic-proxy: make gp_interconnect_proxy_addresses a GUC · 3140a44f
      Authored by Ning Yu
    • ic-proxy: implement the core logic · 6188fb1f
      Authored by Ning Yu
      The interconnect proxy mode, a.k.a. ic-proxy, is a new interconnect mode
      in which all the backends communicate via a proxy bgworker. All the
      backends on the same segment share the same proxy bgworker, so every two
      segments need only one network connection between them, which reduces the
      network flows as well as the number of ports.
      
      To enable the proxy mode we first need to configure the GUC
      gp_interconnect_proxy_addresses, for example:
      
          gpconfig \
            -c gp_interconnect_proxy_addresses \
            -v "'1:-1:10.0.0.1:2000,2:0:10.0.0.2:2001,3:1:10.0.0.3:2002'" \
            --skipvalidation
      
      Then restart the cluster for the setting to take effect.
    • Store dbid in CdbProcess · 8804bf39
      Authored by Ning Yu
      It is a preparation for the ic-proxy mode: we need this information to
      distinguish a primary segment from its mirror.
    • Fix pyyaml windows build (#10451) · 3daafd2f
      Authored by Peifeng Qiu
      The local fork at gpMgmt/bin/ext/yaml was removed by 8d6c3059. Unpack it
      from gpMgmt/bin/pythonSrc/ext, just like pygresql.
    • [Refactor] Pull out KHeap into CKHeap.h · 9e8f261d
      Authored by Ashuka Xue
      Pull out the implementation of the binary heap into its own templated
      header file.
    • Make histograms commutative when merging · 9b427611
      Authored by Ashuka Xue
      Prior to this commit, merging two histograms was not commutative:
      histogram1->Union(histogram2) could result in a row estimate of 1500 rows,
      while histogram2->Union(histogram1) could result in a row estimate of
      600 rows.
      
      Now, MakeBucketMerged has been renamed to SplitAndMergeBuckets. This
      function, which calculates the statistics for the merged bucket, now
      consistently returns the same histogram buckets regardless of the order
      of its inputs. This, in turn, makes MakeUnionHistogramNormalize and
      MakeUnionAllHistogramNormalize commutative.
      
      Once we have successfully split the buckets and merged them as necessary,
      we may have generated up to 3X the number of buckets that were originally
      present. Thus we cap the number of buckets at either the max size of the
      two incoming histograms or 100 buckets.
      
      CombineBuckets then reduces the size of the histogram by combining
      consecutive buckets that carry similar information. It does this using a
      combination of two ratios, freq/ndv and freq/bucket_width: when both
      ratios are similar across consecutive buckets, merging preserves the
      estimates for equality and range predicates respectively. These two
      ratios were chosen based on the following examples:
      
      Assuming that we calculate row counts for selections like the following:
      - For a predicate col = const: rows * freq / NDVs
      - For a predicate col < const: rows * (sum of full or fractional frequencies)
      
      Example 1 (rows = 100), where freq/width, ndv/width and freq/ndv are all the same:
        ```
        Bucket 1: [0, 4)   freq .2  NDVs 2  width 4  freq/width = .05 ndv/width = .5 freq/ndv = .1
        Bucket 2: [4, 12)  freq .4  NDVs 4  width 8  freq/width = .05 ndv/width = .5 freq/ndv = .1
        Combined: [0, 12)  freq .6  NDVs 6  width 12
        ```
      
      This should give the same estimates for various predicates, with separate or combined buckets:
      ```
      pred          separate buckets         combined bucket   result
      -------       ---------------------    ---------------   -----------
      col = 3  ==>  100 * .2 / 2           = 100 * .6 / 6    = 10 rows
      col = 5  ==>  100 * .4 / 4           = 100 * .6 / 6    = 10 rows
      col < 6  ==>  100 * (.2 + .25 * .4)  = 100 * .5 * .6   = 30 rows
      ```
      
      Example 2 (rows = 100), freq and ndvs are the same, but width is different:
      ```
      Bucket 1: [0, 4)   freq .4  NDVs 4  width 4  freq/width = .1 ndv/width = 1 freq/ndv = .1
      Bucket 2: [4, 12)  freq .4  NDVs 4  width 8  freq/width = .05 ndv/width = .5 freq/ndv = .1
      Combined: [0, 12)  freq .8  NDVs 8  width 12
      ```
      
      This will give different estimates with the combined bucket, but only for non-equal preds:
      ```
      pred          separate buckets         combined bucket   results
      -------       ---------------------    ---------------   --------------
      col = 3  ==>  100 * .4 / 4           = 100 * .8 / 8    = 10 rows
      col = 5  ==>  100 * .4 / 4           = 100 * .8 / 8    = 10 rows
      col < 6  ==>  100 * (.4 + .25 * .4) != 100 * .5 * .8     50 vs. 40 rows
      ```
      
      Example 3 (rows = 100), now NDVs / freq is different:
      ```
      Bucket 1: [0, 4)   freq .2  NDVs 4  width 4  freq/width = .05 ndv/width = 1 freq/ndv = .05
      Bucket 2: [4, 12)  freq .4  NDVs 4  width 8  freq/width = .05 ndv/width = .5 freq/ndv = .1
      Combined: [0, 12)  freq .6  NDVs 8  width 12
      ```
      
      This will give different estimates with the combined bucket, but only for equal preds:
      ```
      pred          separate buckets         combined bucket   results
      -------       ---------------------    ---------------   ---------------
      col = 3  ==>  100 * .2 / 4          != 100 * .6 / 8      5 vs. 7.5 rows
      col = 5  ==>  100 * .4 / 4          != 100 * .6 / 8      10 vs. 7.5 rows
      col < 6  ==>  100 * (.2 + .25 * .4)  = 100 * .5 * .6   = 30 rows
      ```
      
      This commit also adds an attribute to the statsconfig for MaxStatsBuckets
      and changes the scaling method when creating singleton buckets.
    • [Refactor] Update MakeStatsFilter, Rename CreateHistMashMapAfterMergingDisjPreds -> MergeHistogramMapsforDisjPreds · c14fbb92
      Authored by Ashuka Xue
      
      This commit refactors MakeStatsFilter to use
      MakeHistHashMapConjOrDisjFilter instead of individually calling
      MakeHistHashMapConj and MakeHistHashMapDisj.
      
      This commit also modifies MergeHistogramMapsForDisjPreds to avoid copying
      and creating unnecessary histogram buckets.
  7. 09 Jul 2020 (4 commits)
  8. 08 Jul 2020 (8 commits)