1. 27 Sep 2017, 15 commits
    • Replace JOIN_LASJ by JOIN_ANTI · 6e7b4722
      Authored by Ekta Khanna
      After merging with e006a24a, Anti Semi Join is
      denoted by `JOIN_ANTI` instead of `JOIN_LASJ`.
      
      Ref [#142355175]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Remove InClauseInfo and OuterJoinInfo · 8b63aafb
      Authored by Ekta Khanna
      Since `InClauseInfo` and `OuterJoinInfo` are now combined into
      `SpecialJoinInfo` after merging with e006a24a, this commit removes them
      from the relevant places.
      
      Access `join_info_list` instead of `in_info_list` and `oj_info_list`.
      
      Previously, `CdbRelDedupInfo` contained a list of `InClauseInfo`s. While
      making join decisions and during overall join processing, we traversed this
      list and invoked the cdb-specific functions `cdb_make_rel_dedup_info()` and
      `cdbpath_dedup_fixup()`.
      
      Since `InClauseInfo` is no longer available, `CdbRelDedupInfo` now contains
      a list of `SpecialJoinInfo`s. All the cdb-specific routines previously
      called for the `InClauseInfo` list are now called if `CdbRelDedupInfo` has
      a valid `SpecialJoinInfo` list and the join type in the `SpecialJoinInfo`
      is `JOIN_SEMI`. A new helper routine, `hasSemiJoin()`, traverses the
      `SpecialJoinInfo` list to check whether it contains a `JOIN_SEMI`.
      
      Ref [#142355175]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
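      The helper described above might look roughly like the sketch below. The
      types here are illustrative stand-ins (the real code walks a PostgreSQL
      `List` of the planner's full `SpecialJoinInfo` structs), so only the shape
      of the check is meant to match the commit:

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Stand-ins for the planner types: the real code uses PostgreSQL's
       * List cells and the full SpecialJoinInfo from the planner headers. */
      typedef enum { JOIN_INNER, JOIN_LEFT, JOIN_SEMI, JOIN_ANTI } JoinType;

      typedef struct SpecialJoinInfo
      {
          JoinType jointype;
          struct SpecialJoinInfo *next;   /* stand-in for a List link */
      } SpecialJoinInfo;

      /* Walk the SpecialJoinInfo list and report whether any entry is a
       * semi-join; this mirrors the check hasSemiJoin() is said to perform. */
      static int
      hasSemiJoin(const SpecialJoinInfo *sjinfo_list)
      {
          for (const SpecialJoinInfo *sj = sjinfo_list; sj != NULL; sj = sj->next)
          {
              if (sj->jointype == JOIN_SEMI)
                  return 1;
          }
          return 0;
      }
      ```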
    • Replace JOIN_IN by JOIN_SEMI in ORCA translator · db853de9
      Authored by Ekta Khanna
      After merging with e006a24a, the join type JOIN_IN has been renamed to
      JOIN_SEMI. This commit makes the corresponding changes in ORCA.
      
      Ref [#142355175]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Add pullup decisions in `convert_ANY_sublink_to_join` · 7be2c0ad
      Authored by Ekta Khanna
      Add the CDB-specific pullup decisions from `convert_IN_to_join`.
      
      Ref [#142355175]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Add pullup decisions in `convert_EXISTS_sublink_to_join` · d91f0efb
      Authored by Dhanashree Kashid
      After merging with e006a24a, this commit adds CDB-specific restrictions
      as follows:
      
      1. Add the CDB-specific pullup decisions from
      `convert_EXISTS_to_join` and `convert_NOT_EXISTS_to_join`.
      
      2. Before this cherry-pick, we used to generate extra quals
      for a NOT EXISTS query by calling `cdbpullup_expr()`
      in `convert_NOT_EXISTS_to_join()`.
      However, for the exact same query with EXISTS, we never generated these
      extra quals:
      ```
      create table foo(t text, n numeric, i int, v varchar(10)) distributed by (t);
      explain select * from foo t0 where not exists (select 1 from foo t1 where t0.i=t1.i + 1);
                                                 QUERY PLAN
      -------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=1.08..2.12 rows=4 width=19)
         ->  Hash Left Anti Semi Join  (cost=1.08..2.12 rows=2 width=19)
               Hash Cond: t0.i = (t1.i + 1)
               ->  Seq Scan on foo t0  (cost=0.00..1.00 rows=1 width=19)
               ->  Hash  (cost=1.04..1.04 rows=1 width=4)
                     ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..1.04 rows=1 width=4)
                           ->  Seq Scan on foo t1  (cost=0.00..1.00 rows=1 width=4)
                                 Filter: (i + 1) IS NOT NULL  -> extra filter
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (10 rows)
      
      explain select * from foo t0 where exists (select 1 from foo t1 where t0.i=t1.i + 1);
                                                 QUERY PLAN
      -------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=1.08..2.12 rows=4 width=19)
         ->  Hash EXISTS Join  (cost=1.08..2.12 rows=2 width=19)
               Hash Cond: t0.i = (t1.i + 1)
               ->  Seq Scan on foo t0  (cost=0.00..1.00 rows=1 width=19)
               ->  Hash  (cost=1.04..1.04 rows=1 width=4)
                     ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..1.04 rows=1 width=4)
                           ->  Seq Scan on foo t1  (cost=0.00..1.00 rows=1 width=4)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (9 rows)
      
      ```
      With this commit, the combined pull-up code for EXISTS and NOT EXISTS
      does not generate the extra filters; restoring them is a future TODO.
      
      3. Use `is_simple_subquery` in `simplify_EXISTS_query` to check whether
      the subquery can be pulled up.
      
      Ref [#142355175]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Remove old pullup functions after merging e006a24a · 2b5c1b9e
      Authored by Dhanashree Kashid
      With the new flow, we no longer need the following functions:
      
       - pull_up_IN_clauses
       - convert_EXISTS_to_join
       - convert_NOT_EXISTS_to_antijoin
       - not_null_inner_vars
       - safe_to_convert_NOT_EXISTS
       - convert_sublink_to_join
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
    • CDBlize the cherry-pick e006a24a · 0feb1bd9
      Authored by Ekta Khanna
      Original Flow:
      cdb_flatten_sublinks
      	+--> pull_up_IN_clauses
      		+--> convert_sublink_to_join
      
      New Flow:
      cdb_flatten_sublinks
      	+--> pull_up_sublinks
      
      This commit contains the relevant changes for the above flow.
      
      Previously, `try_join_unique` was part of `InClauseInfo`; it was set
      in `convert_IN_to_join()` and used in `cdb_make_rel_dedup_info()`.
      Now that `InClauseInfo` is gone, we construct a `FlattenedSublink`
      instead in `convert_ANY_sublink_to_join()`, and later in the flow we
      build a `SpecialJoinInfo` from the `FlattenedSublink` in
      `deconstruct_sublink_quals_to_rel()`. Hence, `try_join_unique` is added
      to both `FlattenedSublink` and `SpecialJoinInfo`.
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
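      The hand-off of `try_join_unique` can be pictured with the sketch below.
      Both structs are trimmed to the single field under discussion, and the
      function name `build_sjinfo_from_sublink` is hypothetical; the real work
      happens inside `deconstruct_sublink_quals_to_rel()` alongside many other
      fields:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <string.h>

      /* Hypothetical, trimmed-down versions of the two structs; the real
       * definitions carry many more fields (quals, relids, join type, ...). */
      typedef struct FlattenedSublink
      {
          bool try_join_unique;   /* set while flattening the ANY sublink */
      } FlattenedSublink;

      typedef struct SpecialJoinInfo
      {
          bool try_join_unique;   /* later read by cdb_make_rel_dedup_info() */
      } SpecialJoinInfo;

      /* Sketch of the hand-off performed when a SpecialJoinInfo is built
       * from a FlattenedSublink later in the planning flow. */
      static void
      build_sjinfo_from_sublink(const FlattenedSublink *fsl, SpecialJoinInfo *sjinfo)
      {
          memset(sjinfo, 0, sizeof(*sjinfo));
          sjinfo->try_join_unique = fsl->try_join_unique;
      }
      ```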
    • Implement SEMI and ANTI joins in the planner and executor. · fe2eb2c9
      Authored by Ekta Khanna
      commit e006a24a
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Thu Aug 14 18:48:00 2008 +0000
      
          Implement SEMI and ANTI joins in the planner and executor.  (Semijoins replace
          the old JOIN_IN code, but antijoins are new functionality.)  Teach the planner
          to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
          joins respectively.  Also, LEFT JOINs with suitable upper-level IS NULL
          filters are recognized as being anti joins.  Unify the InClauseInfo and
          OuterJoinInfo infrastructure into "SpecialJoinInfo".  With that change,
          it becomes possible to associate a SpecialJoinInfo with every join attempt,
          which permits some cleanup of join selectivity estimation.  That needs to be
          taken much further than this patch does, but the next step is to change the
          API for oprjoin selectivity functions, which seems like material for a
          separate patch.  So for the moment the output size estimates for semi and
          especially anti joins are quite bogus.
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
    • b68bcd89
    • docs - restructure admin guide top level topics (#3389) · 56cdb519
      Authored by Lisa Owen
      * admin guide working w/dbs section - pull some topics up a level
      * promote ddl, crud topics; move querying topic up
    • Force behave to use older version of its dependency · b4cfe392
      Authored by Nadeem Ghani
      By default, behave just grabs the latest parse_type package as part of
      its setup requirements. However, the newest parse_type release (0.4.2)
      uses a new packaging convention that is not supported by Python versions
      before 2.7.13. Since we are on Python 2.7.12, we break.
      
      Force the requirement to an older version as a hack to bypass this
      issue.
      Signed-off-by: Marbin Tan <mtan@pivotal.io>
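      The commit message does not show the exact requirement line, so the
      fragment below is only a guess at its shape: a version specifier that
      excludes the broken 0.4.2 release.

      ```
      # hypothetical pin: accept any parse_type older than the broken 0.4.2
      parse_type<0.4.2
      ```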
    • 2bc5401c
    • f1bfd7d9
    • e397112d
  2. 26 Sep 2017, 13 commits
  3. 25 Sep 2017, 8 commits
    • Remove the concept of window "key levels". · b1651a43
      Authored by Heikki Linnakangas
      It wasn't very useful. ORCA and Postgres both just stack WindowAgg nodes
      on top of each other, and no one has been unhappy about that, so we might
      as well do that too. This reduces the difference between GPDB and the
      upstream implementation, and will hopefully make the switch smoother.
      
      Rename the Window plan node type to WindowAgg, to match upstream, now
      that it is fairly close to the upstream version.
    • Rename WindowRef et al. to WindowFunc. · 9e82d83d
      Authored by Heikki Linnakangas
      To match upstream.
    • Avoid Division by Zero error. · ee7f7a9e
      Authored by Heikki Linnakangas
      This test case could throw either "ROWS parameter cannot be negative" or
      "Division By Zero", depending on which gets evaluated first. Remove the
      division by zero error to make the outcome more predictable.
    • Refactor test case for concurrent query cancellation/termination. · 286b431c
      Authored by Richard Guo
      This case tests cancelling/terminating queries concurrently while they
      are running or waiting in a resource group.
    • Remove row order information from Flow. · 7e268107
      Authored by Heikki Linnakangas
      A Motion node often needs to "merge" the incoming streams to preserve the
      overall sort order. Instead of carrying sort order information throughout
      the later stages of planning in the Flow struct, pass it as an argument
      directly to make_motion() and the other functions where a Motion node is
      created. This simplifies things.
      
      To make that work, we can no longer rely on apply_motion() to add the final
      Motion on top of the plan when the (sub-)query contains an ORDER BY. That's
      because we no longer have that information available at apply_motion(). Add
      the Motion node in grouping_planner() instead, where we still have that
      information as a path key.
      
      When I started to work on this, it also fixed a bug where the sortColIdx
      of a plan's Flow node could refer to the wrong resno. A test case for that
      is included. However, that case was since fixed by other coincidental
      changes to partition elimination, so now this is just refactoring.
    • Add pipeline support for AIX clients and loaders · 68362b41
      Authored by Peifeng Qiu
      Concourse doesn't support AIX natively, so we clone the repo at the
      corresponding commit on a remote machine, compile the packages, and
      download them back to the concourse container as output.
      
      Testing the client and loader on a platform without a gpdb server is
      another challenge. We set up a GPDB server on the concourse container,
      just like most installcheck tests, and use an SSH tunnel to forward
      ports from and to the remote host. This way both the CL tools and the
      GPDB server believe they are on the same machine, and the test can run
      normally.
    • Report COPY PROGRAM's error output · 2b51c16b
      Authored by Adam Lee
      Replace popen() with popen_with_stderr(), which is also used by external
      web tables, to collect the stderr output of the program.
      
      Since popen_with_stderr() forks a `sh` process it is almost always
      successful, so this commit catches errors that happen in fwrite().
      
      Also pass variables the same way external web tables do.
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
    • Fix cgroup mount point detection in gpconfig. · 37e3e66d
      Authored by Zhenghua Lyu
      The previous code used the python package psutil to get the system's
      mount information, which reads the content of /etc/mtab. In some
      environments /etc/mtab does not contain the cgroup mount point
      information, so this commit scans /proc/self/mounts to find the cgroup
      mount point instead.
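      Each line of /proc/self/mounts has the form "device mountpoint fstype
      options dump pass", so the detection amounts to scanning for a line whose
      fstype is "cgroup". The actual fix lives in gpconfig's Python code; the C
      sketch below, with the hypothetical name `find_cgroup_mount`, parses a
      captured copy of the file's contents just to illustrate the idea:

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /* Scan mounts text (the contents of /proc/self/mounts) line by line
       * and copy the mount point of the first "cgroup" filesystem into buf.
       * Returns buf on success, NULL if no cgroup mount is present. */
      static const char *
      find_cgroup_mount(const char *mounts, char *buf, size_t buflen)
      {
          char dev[64], mnt[128], fstype[32];
          const char *line = mounts;

          while (line != NULL && *line != '\0')
          {
              /* "device mountpoint fstype ..." -- three leading tokens */
              if (sscanf(line, "%63s %127s %31s", dev, mnt, fstype) == 3 &&
                  strcmp(fstype, "cgroup") == 0)
              {
                  snprintf(buf, buflen, "%s", mnt);
                  return buf;
              }
              line = strchr(line, '\n');   /* advance to the next line */
              if (line != NULL)
                  line++;
          }
          return NULL;
      }
      ```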
  4. 23 Sep 2017, 4 commits
    • Coverity fix: elog string formatting · d4a707c7
      Authored by Kavinder Dhaliwal
    • Add a long living account for Relinquished Memory · 1822c826
      Authored by Kavinder Dhaliwal
      There are cases where, during execution, a Memory Intensive (MI)
      operator may not use all the memory allocated to it. The extra memory
      (quota - allocated) can then be relinquished for other MI nodes to use
      during execution of the statement. For example:
      
      ->  Hash Join
               ->  HashAggregate
               ->  Hash
      
      In the above plan fragment the Hash Join operator has an MI operator in
      both its inner and outer subtrees. If the Hash node uses much less
      memory than it was given as its quota, it will now call
      MemoryAccounting_DeclareDone(), and the difference between its quota
      and its allocated amount is added to the allocated amount of the
      RelinquishedPool. This enables HashAggregate to request memory from the
      RelinquishedPool if it exhausts its quota, to prevent spilling.
      
      This PR adds two new APIs to the MemoryAccounting framework:
      
      MemoryAccounting_DeclareDone(): add the difference between a memory
      account's quota and its allocated amount to the long living
      RelinquishedPool.
      
      MemoryAccounting_RequestQuotaIncrease(): retrieve all relinquished
      memory by incrementing an operator's operatorMemKb and resetting the
      RelinquishedPool to 0.
      
      Note: this PR introduces the facility for Hash to relinquish memory to
      the RelinquishedPool memory account and for the Agg operator
      (specifically HashAgg) to request an increase to its quota before it
      builds its hash table. This commit does not generally apply this
      paradigm to all MI operators.
      Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
      Signed-off-by: Melanie Plageman <mplageman@pivotal.io>
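      The two APIs can be modeled with the minimal sketch below. The function
      names mirror the commit message, but the struct, the global pool
      variable, and the bodies are illustrative assumptions, not the actual
      MemoryAccounting implementation:

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Illustrative long-living pool: unused quota donated by finished
       * operators accumulates here until another operator claims it. */
      static uint64_t relinquishedPoolKb = 0;

      /* Trimmed-down stand-in for a memory account. */
      typedef struct MemoryAccount
      {
          uint64_t quotaKb;       /* memory quota assigned by the planner */
          uint64_t allocatedKb;   /* memory actually used by the operator */
      } MemoryAccount;

      /* Operator is done allocating: donate its unused quota to the pool. */
      static void
      MemoryAccounting_DeclareDone(const MemoryAccount *acct)
      {
          if (acct->quotaKb > acct->allocatedKb)
              relinquishedPoolKb += acct->quotaKb - acct->allocatedKb;
      }

      /* Another operator claims the whole pool on top of its own quota;
       * returns the increased operatorMemKb and empties the pool. */
      static uint64_t
      MemoryAccounting_RequestQuotaIncrease(uint64_t operatorMemKb)
      {
          uint64_t granted = operatorMemKb + relinquishedPoolKb;
          relinquishedPoolKb = 0;
          return granted;
      }
      ```

      In the Hash Join example above, Hash would call DeclareDone() after
      building its table, and HashAgg would call RequestQuotaIncrease() before
      building its own, picking up whatever Hash left unused.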
    • Cherry-pick 'ae47eb1' from upstream to fix Nested CTE errors (#3360) · 009b1809
      Authored by sambitesh
      Before this cherry-pick, the query below would have errored out:
      
      WITH outermost(x) AS (
        SELECT 1
        UNION (WITH innermost as (SELECT 2)
               SELECT * FROM innermost
               UNION SELECT 3)
      )
      SELECT * FROM outermost;
      Signed-off-by: Melanie Plageman <mplageman@pivotal.io>
    • Update 5.json with catalog changes (amgetmulti -> amgetbitmap) · 4daa7c5f
      Authored by Tom Meyer
      To update 5.json, we ran:
      
      cat src/include/catalog/*.h | perl src/backend/catalog/process_foreign_keys.pl > gpMgmt/bin/gppylib/data/5.json
      Signed-off-by: Jacob Champion <pchampion@pivotal.io>