1. 18 May 2020 (2 commits)
  2. 16 May 2020 (4 commits)
  3. 15 May 2020 (8 commits)
    • Guard against possible memory allocation botch in batchmemtuples(). · 706f7483
      Committed by Tom Lane
      Negative availMemLessRefund would be problematic.  It's not entirely
      clear whether the case can be hit in the code as it stands, but this
      seems like good future-proofing in any case.  While we're at it,
      insist that the value be not merely positive but not tiny, so as to
      avoid doing a lot of repalloc work for little gain.
      
      Peter Geoghegan
      
      Discussion: <CAM3SWZRVkuUB68DbAkgw=532gW0f+fofKueAMsY7hVYi68MuYQ@mail.gmail.com>
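The guard described above can be sketched in Python (a minimal model of the C logic; the function name and the one-eighth "not tiny" threshold are assumptions for illustration, not the actual constants in tuplesort.c):

```python
# Hypothetical model of the batchmemtuples() guard: a negative
# availMemLessRefund must never be used, and tiny positive values are
# rejected too, so the repalloc work is worth the gain.
def worth_batching(avail_mem_less_refund: int, total_mem: int) -> bool:
    if avail_mem_less_refund <= 0:
        return False  # possible allocation botch: bail out entirely
    # assumed threshold: require at least 1/8 of the memory budget
    return avail_mem_less_refund >= total_mem // 8

assert not worth_batching(-1024, 1 << 20)  # negative: guarded
assert not worth_batching(100, 1 << 20)    # tiny: not worth repalloc
assert worth_batching(1 << 19, 1 << 20)    # substantial: proceed
```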
    • Change Material back to using upstream tuplestore. · 2e4d99fa
      Committed by Heikki Linnakangas
      Now that ShareInputScan manages its own tuplestore, Material doesn't need
      the extra features that tuplestorenew.c provides.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Have ShareInputScan manage the tuplestore by itself. · 5bdf2c8b
      Committed by Heikki Linnakangas
      Seems a bit silly to have the Material node involved. Just create and
      manage the tuplestore in ShareInputScan node itself, and leave out the
      Material node.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Remove GPDB changes in tuplesort that were used by ShareInputScans. · 0b233b59
      Committed by Heikki Linnakangas
      ShareInputScan no longer tries to share the sort tapes between processes,
      so all this infrastructure to track multiple read positions is no longer
      needed.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Remove optimization to share Sort tapes directly in ShareInputScan. · 611fa500
      Committed by Heikki Linnakangas
      Previously, ShareInputScan could cooperate with a Sort node to share
      the final sort tape directly with other processes. Remove that support. If
      a Sort node is shared, we now put a Materialize node on top of the Sort,
      like all other nodes.
      
      That is obviously less performant than sharing the sort tape directly.
      However, I don't believe that is significant in practice. Firstly, if you
      consider how a ShareInputScan is used, having a Sort below the
      ShareInputScan should be rare. A ShareInputScan is used to implement CTEs,
      and in order to have a Sort node just below the ShareInputScan, you need
      to have an ORDER BY in the CTE. For example (from the 'sisc_sort_spill'
      test):
      
          select avg(i3) from (
            with ctesisc as (select * from testsisc order by i2)
            select t1.i3, t2.i2
            from ctesisc as t1, ctesisc as t2
            where t1.i1 = t2.i2
          ) foo;
      
      However, in a query like that, the ORDER BY is actually useless; the
      order is not guaranteed to be preserved. In fact, ORCA optimizes it away
      completely.
      
      Secondly, even if you have a query like that, I don't think optimizing
      away the Material is very significant. If the number of rows is small
      enough to fit in memory, the Sort can be performed in memory, so you're
      still writing it to disk only once, in the Material node. If it's large
      enough to spill, the Material node will shield the Sort node from needing
      to support random access, which enables the "on-the-fly" final merge
      optimization in the tuplesort. So I believe you'll do roughly the same
      amount of I/O in that case, too. One way to think about this is that the
      final merge will be written out to the Material's tuplestore instead of
      the tuplesort's file.
      
      There is one drawback to that: the Material node won't be able to reuse
      the disk space used by the sort tapes, as the final merge is performed,
      so you'll momentarily need twice the disk space. I think that's
      acceptable. If you don't like that, don't put superfluous ORDER BYs in
      your WITH queries.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Fix the gpload error of missing attribute "staging_table" or "fast_path" · 524f3105
      Committed by Wen Lin
      While gpload is loading data, if the configuration file contains "error_table"
      but does not contain "preload", an error of no attribute "staging_table" or
      "fast_path" occurs.
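The failure mode and the usual fix pattern can be sketched as follows (a hedged Python sketch; the class and attribute names model the bug, they are not gpload's actual code):

```python
# Hypothetical sketch: initialize the "preload"-derived attributes
# unconditionally, so later code can read them even when the YAML
# config has "error_table" but no "preload" section.
class GpLoadOptions:
    def __init__(self, config: dict):
        preload = config.get("preload") or {}
        self.staging_table = preload.get("staging_table")
        self.fast_path = preload.get("fast_path", False)
        self.error_table = config.get("error_table")

# config with "error_table" and no "preload" no longer raises
# AttributeError when these fields are accessed:
opts = GpLoadOptions({"error_table": "err_tbl"})
assert opts.staging_table is None
assert opts.fast_path is False
```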
    • Docs - remove Beta designation from Greenplum R docs · 7e4bbafb
      Committed by David Yozie
  4. 14 May 2020 (14 commits)
    • Remove gpsys1 · eee440e6
      Committed by Tyler Ramer
      I'm not quite sure of the purpose of this utility, nor, apparently, is
      any readme or historical repo.
      
      Apart from a small fix provided in
      commit 71d67305,
      there has been no modification to this file since at least 2008. More
      importantly, I'm not quite sure of any reasonable use for this file. The
      supported platforms are only linux, darwin, or sunos5, and the listed
      use, of printing the memory size in bytes, is trivial on any of those
      systems without resorting to some python script that wraps a command line
      call.
      
      Given that it hasn't been updated since 2008, it's still compatible with
      some ancient version of python, which means that it's yet another file to
      upgrade to python 3 - in this case, let's drop the program, rather than
      bother upgrading it.
      Authored-by: Tyler Ramer <tramer@pivotal.io>
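For reference, the utility's one listed use, printing physical memory in bytes, is indeed nearly a one-liner on Linux-style systems (assuming the sysconf names below are available on the platform, which they are on Linux but not everywhere):

```python
import os

# physical memory in bytes = page size * number of physical pages
mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
assert mem_bytes > 0
print(mem_bytes)
```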
    • gpexpand: fix the template creation · f91e2d81
      Committed by Ning Yu
      The pg_partition_oid_index of template0 is used as a template of empty
      indices; its path, however, is not fixed, so we need to determine it at
      runtime.
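A minimal sketch of the runtime-path idea (the helper name is hypothetical; the database oid and relfilenode would come from catalog lookups at runtime, and the example values below are the ones mentioned in a related commit in this series):

```python
# PostgreSQL lays out per-database relation files as
# "base/<database oid>/<relfilenode>"; the relfilenode of
# pg_partition_oid_index in template0 is not fixed, so the path
# must be built from values looked up at runtime.
def template_index_path(db_oid: int, relfilenode: int) -> str:
    return "base/%d/%d" % (db_oid, relfilenode)

assert template_index_path(13199, 5112) == "base/13199/5112"
```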
    • Remove useless materialize plannode in left tree of join · 7dcecec5
      Committed by Zhenghua Lyu
      When handling a join, if the inner path's locus is CdbLocusType_OuterQuery,
      the function cdbpath_motion_for_join also sets the outer path's locus to
      CdbLocusType_OuterQuery. Later, cdbpath_create_motion_path adds a
      materialize path over the motion path if the original path's locus is
      CdbLocusType_OuterQuery. If the whole subquery can never be rescanned,
      this materialize path is useless and leads to a performance loss. A typical
      plan before this patch (from the case regress/expected/join.out) is:
      
        explain (verbose, costs off)
        select * from int4_tbl a,
          lateral (
            select * from int4_tbl b left join int8_tbl c on (b.f1 = q1 and a.f1 = q2)
          ) ss;
                                     QUERY PLAN
        ------------------------------------------------------------------------
         Nested Loop
           Output: a.f1, b.f1, c.q1, c.q2
           ->  Materialize  **< this plannode is useless >**
                 Output: a.f1
                 ->  Gather Motion 3:1  (slice1; segments: 3)
                       Output: a.f1
                       ->  Seq Scan on public.int4_tbl a
                             Output: a.f1
           ->  Materialize
                 Output: b.f1, c.q1, c.q2
                 ->  Hash Right Join
                       Output: b.f1, c.q1, c.q2
                       Hash Cond: (c.q1 = b.f1)
                       ->  Result
                             Output: c.q1, c.q2
                             Filter: (a.f1 = c.q2)
      
      There are two conditions under which we can do this safely:
      
        * not in a SubPlan
        * not in a lateral join's inner subquery
      
      This commit adds a flag in PlannerInfo->config to guide
      whether we can remove the useless Materialize plannode in the join's left tree.
    • Retire the disableSystemIndexes option of SegmentStart · bebc77af
      Committed by Ning Yu
      It is no longer needed; the correct approach is to install meta-only
      index files on the new segments.
    • Transfer meta-only index files instead of empty ones · effae659
      Committed by Ning Yu
      An empty b-tree index file is not truly empty; it contains the meta page.
      By transferring meta-only index files to the new segments, they can be
      launched directly without the "ignore_system_indexes" setting, and we do
      not need an extra relaunch of the new segments.
      
      We use base/13199/5112 as the template of meta-only index files; it is
      pg_partition_oid_index of template0.
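The "an empty index still contains the meta page" observation can be sketched as a file-size check (a hypothetical helper, assuming the standard 8 kB PostgreSQL block size):

```python
import os
import tempfile

BLCKSZ = 8192  # assumed standard PostgreSQL block size

def is_meta_only_index(path: str) -> bool:
    # a b-tree index with no entries is exactly one block: the meta page
    return os.path.getsize(path) == BLCKSZ

# simulate a meta-only index file for the sketch
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * BLCKSZ)
assert is_meta_only_index(f.name)
```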
    • Exclude files with the --exclude-from option · b0b0b958
      Committed by Ning Yu
      The option was introduced to exclude a large number of paths.
      
      Also changed the excluding logic of './db_dumps' and './promote'.  They
      were excluded only when an empty 'excludePaths' was specified by the
      caller, which is odd, so the logic now always excludes these
      two paths.
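The approach can be sketched as follows (a hedged illustration, not gpexpand's actual code: the function name is hypothetical, and tar's real `--exclude-from` flag is used to pass many paths through one file instead of a long chain of `--exclude` flags):

```python
import tempfile

# './db_dumps' and './promote' are now always excluded,
# regardless of what the caller supplies.
ALWAYS_EXCLUDE = ["./db_dumps", "./promote"]

def build_tar_command(archive: str, exclude_paths: list) -> list:
    # write the (potentially large) exclude list to a file once
    f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".exclude")
    for p in ALWAYS_EXCLUDE + exclude_paths:
        f.write(p + "\n")
    f.close()
    return ["tar", "-czf", archive, "--exclude-from=" + f.name, "."]

cmd = build_tar_command("template.tar.gz", ["./pg_log"])
assert cmd[0] == "tar"
assert any(a.startswith("--exclude-from=") for a in cmd)
```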
    • Updates according to PR comments · 43b95607
      Committed by Ning Yu
      - be careful when creating placeholders of the master-only files in the
        template, raise an error if they already exist;
      - increase code readability slightly;
    • gpexpand: exclude master-only tables from the template · 1d04cab0
      Committed by Ning Yu
      Gpexpand creates new primary segments by first creating a template from
      the master datadir and then copying it to the new segments.  Some
      catalog tables are only meaningful on the master, such as
      gp_segment_configuration; their contents are then cleared on each new
      segment with "delete from ..." commands.
      
      This works but is slow because we have to include the content of the
      master-only tables in the archive, distribute them via network, and
      clear them via the slow "delete from ..." commands -- the fast "truncate"
      command is disallowed on catalog tables, because the filenode of a
      catalog table must not change.
      
      To make it faster we now exclude these tables from the template
      directly, so less data are transferred and there is no need to "delete
      from" them explicitly.
    • gpexpand: cleanup new segments in parallel · 857763ae
      Committed by Ning Yu
      When cleaning up the master-only files on the new segments we used to do
      the job one by one; when there are tens or hundreds of segments this can
      be very slow.
      
      Now we clean up in parallel.
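The shape of the change can be sketched with a worker pool (a hedged illustration; the segment names and the cleanup body are placeholders, not gpexpand's actual code):

```python
from concurrent.futures import ThreadPoolExecutor

def cleanup_segment(seg: str) -> str:
    # placeholder for "remove master-only files on this segment"
    return "cleaned %s" % seg

segments = ["seg0", "seg1", "seg2"]

# one worker per new segment instead of a sequential loop
with ThreadPoolExecutor(max_workers=len(segments)) as pool:
    results = list(pool.map(cleanup_segment, segments))

assert results == ["cleaned seg0", "cleaned seg1", "cleaned seg2"]
```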
    • gppylib: remove duplicated entries in MASTER_ONLY_TABLES · c0b05f8d
      Committed by Ning Yu
      Removed the duplicated 'gp_segment_configuration' entry in the
      MASTER_ONLY_TABLES list.  Also sorted the list in alphabetical order to
      prevent duplicates in the future.
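The fix pattern in one line (a sketch; only the duplicated `gp_segment_configuration` entry is taken from the commit, the other list contents are illustrative):

```python
# sort the list and drop duplicates, so a repeated entry
# cannot creep back in unnoticed
MASTER_ONLY_TABLES = sorted(set([
    "gp_configuration_history",
    "gp_segment_configuration",
    "gp_segment_configuration",  # the duplicate this commit removed
]))

assert MASTER_ONLY_TABLES == ["gp_configuration_history",
                              "gp_segment_configuration"]
```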
    • gpexpand: correct scenario names and indents · 5e7a4e5c
      Committed by Ning Yu
      In the gpexpand behave tests we used to have the same name for multiple
      scenarios, now we give them different and descriptive names.
      
      Also correct some bad indents.
    • Fix flaky test dtm_recovery_on_standby · 45328e5e
      Committed by xiong-gang
      It takes time to start the walsender after gpinitstandby, so this commit
      adds a wait loop to reduce the flakiness. It also fixes the next test,
      commit_blocking_on_standby.
    • Address PR feedback · d90ceb45
      Committed by Ashuka Xue
    • Allow stats estimation for text-like types only for histograms containing singleton buckets · ecefcc1c
      Committed by Ashuka Xue
      In commit `Improve statistics calculation for exprs like "var = ANY
      (ARRAY[...])"`, we improved performance in cardinality estimation for
      ArrayCmp. However, it caused ArrayCmp expressions with text-like types
      to fall back to NDV-based cardinality estimation in spite of present
      and valid histograms.
      
      This commit re-enables using histograms for text-like types provided it
      is safe to do so.
      
      Removed because non-singleton buckets for text are not valid:
      - src/backend/gporca/data/dxl/minidump/CTE-12.mdp
      - src/backend/gporca/data/dxl/statistics/Join-Statistics-Text-Input.xml
      - src/backend/gporca/data/dxl/statistics/Join-Statistics-Text-Output.xml
      Co-authored-by: Ashuka Xue <axue@pivotal.io>
      Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
  5. 13 May 2020 (12 commits)
    • Fix memory accounting. · 87c905e9
      Committed by Heikki Linnakangas
      This showed up as bogus "Executor memory" lines in EXPLAIN ANALYZE
      output.
      Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
    • postgres_fdw: disable UPDATE/DELETE on foreign Greenplum servers · c577eb5e
      Committed by Adam Lee
      Greenplum only supports INSERT, because UPDATE/DELETE requires the
      hidden column gp_segment_id, and there is also the "ModifyTable mixes
      distributed and entry-only tables" issue.
    • Tag libpq connections with postgres version number by default · 4b65eb59
      Committed by Adam Lee
      GPDB uses the high bits of the major version to indicate special
      internal libpq communications, and that was the default before.
      
      This commit tags it the opposite way: it uses the PG version by default,
      but tags the QD-QE, FTS, WAL and fault injector connections as the
      special internal ones.
    • Use the same warning for locus not matched modification · 4e367cdd
      Committed by Adam Lee
      Otherwise, inheritance_planner() and grouping_planner() report
      differently.
    • Add binary (de)serialize functions for RestrictInfo node · 8caca5fe
      Committed by Adam Lee
      PG doesn't read it, because it would be transformed. But Greenplum
      dispatches it.
      
      ```
      16   0x8d6730 postgres nodeToBinaryStringFast (outfast.c:2165)
      17   0xc9ceea postgres serializeNode (cdbsrlz.c:52)
      18   0xc5cf8c postgres <symbol not found> (cdbdisp_query.c:585)
      19   0xc5ddbc postgres <symbol not found> (cdbdisp_query.c:1081)
      20   0xc5c761 postgres CdbDispatchPlan (cdbdisp_query.c:234)
      21   0x7bcee8 postgres standard_ExecutorStart (execMain.c:658)
      22   0x7bc126 postgres ExecutorStart (execMain.c:214)
      23   0xa1ece6 postgres PortalStart (pquery.c:743)
      24   0x746f76 postgres PerformCursorOpen (portalcmds.c:164)
      25   0xa21364 postgres standard_ProcessUtility (utility.c:554)
      26   0xa20e0c postgres ProcessUtility (utility.c:363)
      27   0xa1fc80 postgres <symbol not found> (discriminator 4)
      28   0xa1ff3b postgres <symbol not found> (pquery.c:1552)
      29   0xa1f3a9 postgres PortalRun (pquery.c:1022)
      ```
    • 65d02fa6
    • postgres_fdw: ignore Greenplum's XACT_EVENT_PRE_PREPARE · 16d9de17
      Committed by Adam Lee
      Greenplum fires that event on the QEs, which is reasonable for an MPP
      system, but extensions should not panic on it.
    • Don't use GenericTupStore where the genericity is not needed. · a7a86494
      Committed by Heikki Linnakangas
      Everywhere except ShareInputScanState, we're always dealing with either
      a tuplestore or a tuplesort, so we don't need to use GenericTupStore.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Remove sles12 jobs (#10073) · f0f0c4e3
      Committed by Tingfang Bao
      The future Greenplum 7 (master) may not ever target SLES 12 as a
      supported platform. We can backport this to 6X_STABLE as well because
      it is not yet a supported platform; it will be at some point in the
      future.
      
      [#172595616]
      Authored-by: Tingfang Bao <baotingfang@gmail.com>
    • gpexpand: behave: allow pkill to fail · a92e0a33
      Committed by Ning Yu
      We use "pkill postgres" to clean up leaked segments in the behave tests;
      if the postgres processes have already exited, the pkill command fails
      with code 1, "No processes matched or none of them could be signalled".
      
      Fixed by ignoring the return code of pkill.
    • Removing xerces patch (#10091) · 2448be9b
      Committed by Hans Zeller
      The scripts we use in Concourse pipelines download Apache xerces-c-3.1.2
      and then apply a patch that is part of our source code tree. Abhijit has
      pointed out that this is no longer necessary. This commit removes the
      patch and uses the vanilla xerces-c-3.1.2 source code instead.
      
      Eventually, we want to stop including xerces in our releases and rely on
      the natively installed xerces. See also
      https://github.com/greenplum-db/gpdb/pull/10068.
    • gpexpand: behave: cleanup leaked segs after rollback · a5e32530
      Committed by Ning Yu
      In the scenario "inject a fail and test if rollback is ok" the expansion
      is canceled after the new segment is launched; the segment must be shut
      down in time to prevent port conflicts in following scenarios.