1. 09 Nov 2017 (1 commit)
  2. 08 Nov 2017 (3 commits)
  3. 07 Nov 2017 (1 commit)
    • Merge conditions into corresponding Bitmap Index Probes · 39d0145b
      Sambitesh Dash authored
      There was a bug in the way Bitmap Index Probes were being merged.
      
      Consider the query below:
      
      SELECT * FROM foo WHERE b = 2 AND c >=6 AND c <= 6;
      
      Let's assume there are indexes on columns 'b' and 'c'. The Bitmap Index Probes
      on the second and third conditions ('c >= 6' and 'c <= 6') are mergeable because
      they are on the same indexed column. Due to the bug, Orca would instead merge the
      second Index Probe with the first Index Probe's condition ('b = 2'). This led to a
      wrong 'Recheck Condition' which, if selected as a filter, led to a wrong plan. This
      commit fixes that bug.
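      As a rough, self-contained illustration of the intended behavior (a hypothetical
      sketch, not the actual ORCA code; `ProbeCondition` and `MergeProbes` are invented
      names), a merge guard that only combines probes on the same indexed column might
      look like this:

      ```
      #include <iostream>
      #include <optional>
      #include <string>

      struct ProbeCondition {
          std::string column;     // indexed column the probe applies to
          std::string predicate;  // e.g. "c >= 6"
      };

      // Returns the merged probe when both conditions are on the same column,
      // otherwise std::nullopt so the probes stay separate.
      std::optional<ProbeCondition> MergeProbes(const ProbeCondition &a,
                                                const ProbeCondition &b) {
          if (a.column != b.column) {
              return std::nullopt;  // the guard the buggy code effectively skipped
          }
          return ProbeCondition{a.column, a.predicate + " AND " + b.predicate};
      }

      int main() {
          ProbeCondition p1{"b", "b = 2"}, p2{"c", "c >= 6"}, p3{"c", "c <= 6"};
          std::cout << std::boolalpha
                    << MergeProbes(p2, p3).has_value() << " "    // true: same column
                    << MergeProbes(p1, p2).has_value() << "\n";  // false: different columns
      }
      ```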
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
      39d0145b
  4. 03 Nov 2017 (2 commits)
  5. 02 Nov 2017 (1 commit)
    • Slight README clean up (#250) · b40be5f7
      Jesse Zhang authored
      Cleans up the README and simplifies the build steps.
      
      1. Remove leftover mentions of `make` in the context of building ORCA
      1. Because ninja is parallel by default, remove mentions of how to
         parallelize the build
      
      * Simplify build steps in README
      
      Notably, we no longer require the two most-hated steps: mkdir the build
      directory, then cd into it. Instead, `cmake` will create the build
      directory itself if it doesn't exist.
      
      * [ci skip]
      This fixes #248
      b40be5f7
  6. 28 Oct 2017 (1 commit)
  7. 24 Oct 2017 (3 commits)
    • Bump ORCA Version · 18ebd876
      Dhanashree Kashid authored
      Bump ORCA version after commit b111104f
      18ebd876
    • Remove tautological undefined comparisons · b111104f
      Jesse Zhang authored
      A "tautological" comparison is a comparison that's only meaningfully
      necessary when we consider "undefined" C++ behaviors.
      
      For context, in well-formed C++ code:
      
        1. references can never be bound to NULL; and
        1. the `this` pointer in a member function can never be NULL
      
      Historically ORCA has relied on implementation-specific (undefined,
      actually) behavior where
      
        1. we might call a member function on a potentially NULL object with
        the `->` operator, or
        1. some callers may bind a (possibly-NULL) pointer to a reference with
        the '*' operator and try to print it into an IOStream
      
      While doing so gives the benefit of centralizing the check, depending on
      undefined behavior means we risk producing wrong code. Indeed, modern
      compilers aggressively optimize against undefined behavior: e.g. by
      eliminating the `NULL` checks, or by assuming the variable used for
      indexing an array is never out of bounds.
      
      Commit ee5ef334 is a "tick" towards
      reducing such undefined comparisons. This commit is the "tock" that
      eliminates them.
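
      As a rough, self-contained illustration (a hypothetical sketch, not the actual
      ORCA sources, although the guarded return mirrors the line GCC flags in the log
      below):

      ```
      #include <cstddef>
      #include <iostream>

      // Stand-in for a printable ORCA object such as CCTEMap.
      struct CPrintable {
          std::ostream &OsPrint(std::ostream &os) const { return os << "printable"; }
      };

      // Before: the guard is "tautological" because a reference can never be bound
      // to NULL in well-formed C++; GCC 6+ rejects it under -Werror=address.
      std::ostream &PrintGuarded(std::ostream &os, const CPrintable &obj) {
          return (NULL == &obj) ? os : obj.OsPrint(os);
      }

      // After: drop the guard; callers must check their pointers before forming a
      // reference, which is where the responsibility belongs.
      std::ostream &PrintUnguarded(std::ostream &os, const CPrintable &obj) {
          return obj.OsPrint(os);
      }

      int main() {
          CPrintable obj;
          PrintUnguarded(std::cout, obj) << "\n";
      }
      ```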
      
      For more context, GCC 6+ chokes without this change:
      
      Running on a macOS iMac:
      
      ```
      env CC='gcc-6' CXX='g++-6' cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -H. -Bbuild.gcc6.debug
      ninja -C build.gcc6.debug
      ```
      
      GCC produces this error:
      
      ```
      ninja: Entering directory `build.gcc6.debug'
      [548/1027] Building CXX object libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEMap.cpp.o
      FAILED: libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEMap.cpp.o
      ccache /usr/local/bin/g++-6  -Dgpopt_EXPORTS -I/usr/local/include -I../libgpos/include -I../libgpopt/include -I../libgpopt/../libgpcost/include -I../libgpopt/../libnaucrates/include -Ilibgpos/include -Wall -Werror -Wextra -pedantic-errors -Wno-variadic-macros -Wno-tautological-undefined-compare -fno-omit-frame-pointer -g -g3 -fPIC -MD -MT libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEMap.cpp.o -MF libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEMap.cpp.o.d -o libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEMap.cpp.o -c ../libgpopt/src/base/CCTEMap.cpp
      ../libgpopt/src/base/CCTEMap.cpp: In function 'gpos::IOstream& gpopt::operator<<(gpos::IOstream&, gpopt::CCTEMap&)':
      ../libgpopt/src/base/CCTEMap.cpp:432:18: error: the compiler can assume that the address of 'cm' will never be NULL [-Werror=address]
           return (NULL == &cm) ? os : cm.OsPrint(os);
                        ^
      At global scope:
      cc1plus: all warnings being treated as errors
      [549/1027] Building CXX object libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEReq.cpp.o
      FAILED: libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEReq.cpp.o
      ccache /usr/local/bin/g++-6  -Dgpopt_EXPORTS -I/usr/local/include -I../libgpos/include -I../libgpopt/include -I../libgpopt/../libgpcost/include -I../libgpopt/../libnaucrates/include -Ilibgpos/include -Wall -Werror -Wextra -pedantic-errors -Wno-variadic-macros -Wno-tautological-undefined-compare -fno-omit-frame-pointer -g -g3 -fPIC -MD -MT libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEReq.cpp.o -MF libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEReq.cpp.o.d -o libgpopt/CMakeFiles/gpopt.dir/src/base/CCTEReq.cpp.o -c ../libgpopt/src/base/CCTEReq.cpp
      ../libgpopt/src/base/CCTEReq.cpp: In function 'gpos::IOstream& gpopt::operator<<(gpos::IOstream&, gpopt::CCTEReq&)':
      ../libgpopt/src/base/CCTEReq.cpp:569:18: error: the compiler can assume that the address of 'cter' will never be NULL [-Werror=address]
           return (NULL == &cter) ? os : cter.OsPrint(os);
                        ^
      At global scope:
      cc1plus: all warnings being treated as errors
      [557/1027] Linking CXX shared library libnaucrates/libnaucrates.2.47.0.dylib
      ninja: build stopped: subcommand failed.
      ```
      b111104f
  8. 20 Oct 2017 (2 commits)
    • Add a configuration option to disable enforcement of constraints. · 04293f5a
      Heikki Linnakangas authored
      Add a new EnforceConstraints hint configuration option. If it's set,
      ORCA will not add Assert nodes to enforce CHECK and NOT NULL constraints
      on INSERT and UPDATE statements.
      
      This is useful for GPDB, which is prepared to enforce the constraints on
      its own. In theory, the optimizer could prove that the inserted rows always
      satisfy the constraints, in which case it could optimize away the checks,
      whereas the executor can't do that on its own. But ORCA doesn't currently
      attempt to do that. The reason I'd like to enforce the constraints on the
      GPDB side, instead of with Assertion nodes, is that the assertion node
      produces a different error message. That's annoying, because we then have
      to maintain alternative expected output files for tests that hit CHECK or
      NOT NULL constraints.
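
      As a hypothetical sketch of the idea (the flag name and types below are
      invented, not ORCA's actual hint API):

      ```
      #include <iostream>

      // Invented stand-in for ORCA's optimizer hint/configuration object.
      struct OptimizerHints {
          // When set, ORCA skips generating Assert nodes for CHECK / NOT NULL
          // constraints on INSERT and UPDATE, leaving enforcement to the executor.
          bool skip_constraint_enforcement = false;
      };

      struct DmlPlan { bool has_assert_nodes = false; };

      DmlPlan PlanDml(const OptimizerHints &hints) {
          DmlPlan plan;
          plan.has_assert_nodes = !hints.skip_constraint_enforcement;
          return plan;
      }

      int main() {
          OptimizerHints gpdb;                      // GPDB enforces constraints itself,
          gpdb.skip_constraint_enforcement = true;  // so ask ORCA not to add Asserts
          std::cout << std::boolalpha
                    << PlanDml(gpdb).has_assert_nodes << "\n";  // false
      }
      ```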
      
      Bump version number to 2.48.0, since this is an incompatible change.
      04293f5a
    • Use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR · 46d3a54b
      Jesse Zhang authored
      This is semantically more precise, and it also enables ORCA to be included
      in other CMake projects.
      
      [ci skip]
      46d3a54b
  9. 18 Oct 2017 (3 commits)
  10. 12 Oct 2017 (3 commits)
    • Bump ORCA Version · 42aeca48
      Shreedhar Hardikar authored
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      42aeca48
    • Enable partition selection when a cast is present · 268a8e62
      Shreedhar Hardikar authored
      This is enabled by including CScalarCmp expressions that contain a CScalarCast
      in the types of expressions considered for partition filters. Observe
      that if the expression contains a CScalarCast over a CScalarIdent, the cast
      must be preserved all the way to the final plan. That is, we cannot
      "peek" through the cast and extract the identifier underneath it. For this
      reason, when it is an equality comparison with a cast, levelEqExprs can no
      longer be used.
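
      A hypothetical, much-simplified sketch of the "keep the cast" rule (the Expr
      struct and QualifiesAsPartFilter below are invented, not ORCA's expression
      classes):

      ```
      #include <iostream>
      #include <memory>
      #include <string>

      // Minimal stand-in for a scalar expression tree: Cmp(Cast(Ident)).
      struct Expr {
          enum Kind { Ident, Cast, Cmp } kind;
          std::string name;             // Ident: column name; Cmp: operator
          std::shared_ptr<Expr> child;  // Cast/Cmp: wrapped expression
      };

      // A comparison qualifies as a partition filter when its operand is the
      // partition key, either directly or under a cast. We only look through the
      // cast to decide eligibility; the filter expression itself keeps the cast.
      bool QualifiesAsPartFilter(const Expr &cmp, const std::string &part_key) {
          if (cmp.kind != Expr::Cmp || !cmp.child) return false;
          const Expr *operand = cmp.child.get();
          if (operand->kind == Expr::Cast && operand->child) operand = operand->child.get();
          return operand->kind == Expr::Ident && operand->name == part_key;
      }

      int main() {
          auto pk   = std::make_shared<Expr>(Expr{Expr::Ident, "pk", nullptr});
          auto cast = std::make_shared<Expr>(Expr{Expr::Cast, "::int8", pk});
          Expr cmp{Expr::Cmp, "=", cast};
          std::cout << std::boolalpha << QualifiesAsPartFilter(cmp, "pk") << "\n";  // true
      }
      ```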
      
      Also, convert a cast on list part filters to an array coerce.
      
      During Expr to DXL translation, construct a CDXLScalarArrayCoerceExpr
      operator on top of CDXLScalarPartListValues when a cast is present on
      top of the partition key in the partition filter expression.
      
      Also, refactor SplitPartPredicates() to make it easier to read; and
      refactor methods around PdxlnListFilterScCmp() by extracting out the
      generation of Part Key expression for LIST partition filters.
      
      However, ORCA won't be able to generate a partition filter from an
      expression of the form pk::int8 IS DISTINCT FROM 5. This is because
      IDF expressions are handled by the CConstraintInterval framework, which
      converts them to a corresponding CScalarCmp + CScalarNullTest. This
      framework cannot preserve cast information on the partkey since it stores
      only a CColRef.
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      268a8e62
    • Fix incorrect check in PciIntervalFromScalarIDF() · 160d60b2
      Shreedhar Hardikar authored
      When using CConstraintInterval to derive partition filters, we cannot
      use a casted Ident, since the cast information is lost. This tripped an
      assertion. Also implement FIdentIDFConst().
      160d60b2
  11. 06 Oct 2017 (4 commits)
  12. 05 Oct 2017 (1 commit)
    • Fixing incorrect asserts in CXformUtils · bded4a81
      Venkatesh Raghavan authored
      There was a fairly classic bug / typo where the assertion would never
      fail because we put an enum member (non-zero) in a boolean context.
      
      Even though the method in question is generically named
      `TransformImplementBinaryOp`, it's actually only on the code path of
      transforming a physical nested loop (non-correlated, non-indexed) join.
      
      This commit adds back all types of eligible nested loop joins into the
      assertion.
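
      A minimal, hypothetical reproduction of this bug class (not the actual
      CXformUtils code; the enum and function names are invented):

      ```
      #include <cassert>

      enum EJoinType { EjtInner = 1, EjtLeftOuter = 2, EjtNLJ = 3 };

      void TransformJoin(EJoinType ejt) {
          // Buggy pattern: EjtNLJ is a non-zero enum member, so the assertion is
          // always true no matter what ejt actually is.
          assert(EjtNLJ);
          // Intended check: the operator is one of the eligible join types.
          assert(ejt == EjtInner || ejt == EjtLeftOuter || ejt == EjtNLJ);
          (void)ejt;
      }

      int main() { TransformJoin(EjtInner); }
      ```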
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
      bded4a81
  13. 29 Sep 2017 (1 commit)
  14. 27 Sep 2017 (2 commits)
    • Update space size of a minidump · f6a453c1
      Omer Arap authored
      f6a453c1
    • Reorder preprocessing steps to avoid false trimming of existential subqueries [#150988530] · bec7b6af
      Omer Arap authored
      Orca removes outer references from the grouping columns of a GbAgg. Orca also
      trims an existential subquery whose inner expression is a GbAgg with no grouping
      columns by replacing it with a Boolean constant. However, in some cases
      applying the first preprocessing step affects the latter and produces an
      unintended trimming of existential subqueries.

      This commit changes the order of these two preprocessing steps to avoid
      that complication.
      
      E.g.:
      ```
      SELECT * from foo where not exists (SELECT sum(bar.a) from bar where
      foo.a = bar.a GROUP BY foo.a);
      ```
      
      In this example the grouping column is an outer reference, so it is
      removed by `PexprRemoveSuperfluousOuterRefs`. The next preprocessing
      step, `PexprTrimExistentialSubqueries`, then sees that the GbAgg has no
      grouping columns and simplifies the `NOT EXISTS` to `false`.

      Reordering the two steps fixes the problem.
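
      As a toy, hypothetical sketch of why the ordering matters (the pass names
      below are simplified stand-ins, not the actual preprocessor functions):

      ```
      #include <iostream>

      // Toy model of the two preprocessing passes operating on a subquery.
      struct Subquery {
          int num_grouping_cols = 1;  // one grouping column: an outer reference
          bool trimmed_to_const = false;
      };

      void TrimExistentialSubqueries(Subquery &q) {
          // Only legal when the GbAgg truly has no grouping columns.
          if (q.num_grouping_cols == 0) q.trimmed_to_const = true;  // NOT EXISTS -> false
      }

      void RemoveSuperfluousOuterRefs(Subquery &q) { q.num_grouping_cols = 0; }

      int main() {
          Subquery old_order, new_order;
          // Old (buggy) order: outer refs removed first, so the trim fires wrongly.
          RemoveSuperfluousOuterRefs(old_order);
          TrimExistentialSubqueries(old_order);
          // New order: the trim runs first and leaves the subquery alone.
          TrimExistentialSubqueries(new_order);
          RemoveSuperfluousOuterRefs(new_order);
          std::cout << std::boolalpha << old_order.trimmed_to_const << " "  // true (wrong)
                    << new_order.trimmed_to_const << "\n";                  // false (correct)
      }
      ```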
      
      Bump version to 2.46.3
      bec7b6af
  15. 23 Sep 2017 (1 commit)
    • Push ORCA src to bintray repository for conan · f395a029
      Bhuvnesh authored
      The ORCA source will be pushed to the bintray repository for each version bump.
      Developers can use the conan dependency manager to build ORCA before
      building GPDB, with minimal steps and without worrying about version
      dependencies.
      Whenever we bump the ORCA version used by GPDB, we will ensure that
      conanfile.txt gets updated.
      
      To build ORCA from GPDB:
      ```
      cd <path_to_gpdb>/depends
      env CONAN_CMAKE_GENERATOR=Ninja conan install -s build_type=Debug --build
      ```
      where build_type can be `Debug` or `Release`
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
      f395a029
  16. 20 Sep 2017 (1 commit)
  17. 19 Sep 2017 (3 commits)
  18. 16 Sep 2017 (1 commit)
    • Incorrect Decorrelation results in wrong plan · 94e509d0
      Bhuvnesh Chaudhary authored
      While attempting to decorrelate the subquery, we were incorrectly pulling
      up the join before computing the window function over the results of the
      join. In cases where the subquery contains a window function and has outer
      references, we should not attempt to decorrelate it.
      
      Ex: select C.j from C where C.i in (select rank() over (order by B.i) from B where B.i=C.i) order by C.j;
      The above subquery has outer references, and the result of the window
      function is projected from the subquery.

      There are further optimizations that can be done for existential
      queries, but this PR fixes the plan.
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
      94e509d0
  19. 15 Sep 2017 (4 commits)
    • GPORCA incorrectly collapsing FOJ with condition false · 30cfe889
      Venkatesh Raghavan authored
      Prior to this fix, the logic that calculated the max cardinality for each
      logical operator assumed that a Full Outer Join with a false condition
      always returns an empty set. This was used by the following preprocessing step

      CExpressionPreprocessor::PexprPruneEmptySubtrees

      to eliminate the FOJ subtrees (since it thought they had zero output
      cardinality), replacing them with a const table get with zero tuples
      
      ```
      vraghavan=# explain select * from foo a full outer join foo b on false;
                         QUERY PLAN
      ------------------------------------------------
       Result  (cost=0.00..0.00 rows=0 width=8)
         ->  Result  (cost=0.00..0.00 rows=0 width=8)
               One-Time Filter: false
       Optimizer status: PQO version 2.43.1
      (4 rows)
      ```
      This collapsing is incorrect. In this CL, the max cardinality logic has been
      fixed to ensure that GPORCA generates a correct plan.
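
      As a hypothetical sketch of the corrected reasoning (not ORCA's actual max
      cardinality code; the function below is invented):

      ```
      #include <iostream>

      // A full outer join with a condition that is always false still returns
      // every row of both children, null-extended on the other side, so its max
      // cardinality is the sum of the children's max cardinalities, not zero.
      unsigned long MaxCardFojFalseCondition(unsigned long left_maxcard,
                                             unsigned long right_maxcard) {
          // Old (wrong) assumption: return 0, letting the preprocessor prune the FOJ.
          return left_maxcard + right_maxcard;
      }

      int main() {
          // e.g. two rows estimated on each side of the join
          std::cout << MaxCardFojFalseCondition(2, 2) << "\n";  // 4, not 0
      }
      ```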
      
      After the fix:
      
      ```
      vraghavan=# explain select * from foo a full outer join foo b on false;
                                                                  QUERY PLAN
      ----------------------------------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..2207585.75 rows=35 width=8)
         ->  Result  (cost=0.00..2207585.75 rows=12 width=8)
               ->  Sequence  (cost=0.00..2207585.75 rows=12 width=8)
                     ->  Shared Scan (share slice:id 2:0)  (cost=0.00..431.00 rows=2 width=1)
                           ->  Materialize  (cost=0.00..431.00 rows=2 width=1)
                                 ->  Table Scan on foo  (cost=0.00..431.00 rows=2 width=34)
                     ->  Sequence  (cost=0.00..2207154.75 rows=12 width=8)
                           ->  Shared Scan (share slice:id 2:1)  (cost=0.00..431.00 rows=2 width=1)
                                 ->  Materialize  (cost=0.00..431.00 rows=2 width=1)
                                       ->  Table Scan on foo  (cost=0.00..431.00 rows=2 width=34)
                           ->  Append  (cost=0.00..2206723.75 rows=12 width=8)
                                 ->  Nested Loop Left Join  (cost=0.00..882691.26 rows=10 width=68)
                                       Join Filter: false
                                       ->  Shared Scan (share slice:id 2:0)  (cost=0.00..431.00 rows=2 width=34)
                                       ->  Result  (cost=0.00..0.00 rows=0 width=34)
                                             One-Time Filter: false
                                 ->  Result  (cost=0.00..1324032.49 rows=2 width=68)
                                       ->  Nested Loop Left Anti Semi Join  (cost=0.00..1324032.49 rows=2 width=34)
                                             Join Filter: false
                                             ->  Shared Scan (share slice:id 2:1)  (cost=0.00..431.00 rows=2 width=34)
                                             ->  Materialize  (cost=0.00..431.00 rows=5 width=1)
                                                   ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=5 width=1)
                                                         ->  Result  (cost=0.00..431.00 rows=2 width=1)
                                                               ->  Shared Scan (share slice:id 1:0)  (cost=0.00..431.00 rows=2 width=1)
       Optimizer status: PQO version 2.43.1
      (25 rows)
      ```
      30cfe889
    • Bump Orca version to 2.44.0 · f55e6f70
      Omer Arap authored
      f55e6f70
    • Minidump updates · 8df82709
      Omer Arap authored
      8df82709
    • Only request stats of columns needed for cardinality estimation [#150424379] · 05a26924
      Omer Arap authored
      GPORCA should not spend time extracting column statistics that are not
      needed for cardinality estimation. This commit eliminates the overhead
      of unnecessarily requesting and generating statistics for columns that
      are not used in cardinality estimation.
      
      E.g.:
      `CREATE TABLE foo (a int, b int, c int);`

      For table foo, the query below only needs stats for column `a`, which is
      the distribution column, and column `c`, which is used in the where clause.
      `select * from foo where c=2;`

      However, prior to this commit, the column statistics for column `b` were
      also calculated and passed in for cardinality estimation. The only
      information the optimizer needs is the `width` of column `b`, yet we
      transferred all of the stats information for that column.
      
      This commit and its counterpart commit in GPDB ensure that the column
      width information is passed and extracted via the `dxl:Relation` metadata.

      Preliminary results for short-running queries show up to a 65x
      performance improvement.
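
      A hypothetical, simplified sketch of the idea (the types and function below
      are invented, not the actual GPORCA stats API):

      ```
      #include <iostream>
      #include <set>
      #include <string>
      #include <vector>

      struct ColumnStatsRequest { std::string column; bool full_stats; };

      // Request full statistics only for columns that cardinality estimation
      // actually uses (distribution and predicate columns); width-only otherwise.
      std::vector<ColumnStatsRequest>
      BuildStatsRequests(const std::vector<std::string> &all_cols,
                         const std::set<std::string> &used_in_estimation) {
          std::vector<ColumnStatsRequest> reqs;
          for (const auto &col : all_cols) {
              reqs.push_back({col, used_in_estimation.count(col) > 0});
          }
          return reqs;
      }

      int main() {
          // Mirrors the example: foo(a, b, c), distributed by a, filtered on c.
          auto reqs = BuildStatsRequests({"a", "b", "c"}, {"a", "c"});
          for (const auto &r : reqs)
              std::cout << r.column << (r.full_stats ? ": full stats\n" : ": width only\n");
      }
      ```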
      Signed-off-by: Jemish Patel <jpatel@pivotal.io>
      05a26924
  20. 12 Sep 2017 (2 commits)
    • Use col position from table descriptor for index key retrieval · 7ccedc81
      Jemish Patel authored
      Index keys are relative to the relation and the list of columns in
      `pdrgpcr` is relative to the table descriptor and does not include any
      dropped columns. Consider the case where we had 20 columns in a table. We
      create an index that covers col # 20 as one of its keys. Then we drop
      columns 10 through 15. Now the index key still points to col #20 but the
      column ref list in `pdrgpcr` will only have 15 elements in it, causing
      ORCA to crash with an `Out of Bounds` exception when
      `CLogical::PosFromIndex()` gets called.
      
      This commit fixes that issue by using the index key as the position to
      retrieve the column from the relation, then using that column's `attno`
      to get its current position `ulPos` relative to the table descriptor.
      This `ulPos` is then used to retrieve the `CColRef` of the index key
      column, as shown below:
      
      ```
      CColRef *pcr = (*pdrgpcr)[ulPos];
      ```
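
      A hypothetical, self-contained sketch of the attno-to-position mapping (the
      Column struct and PosFromIndexKey below are invented, not the actual ORCA code):

      ```
      #include <iostream>
      #include <vector>

      struct Column { int attno; bool dropped; };

      // Maps an index key (an attno relative to the relation, which still counts
      // dropped columns) to its position among the non-dropped columns of the
      // table descriptor; returns -1 if the attno no longer exists.
      int PosFromIndexKey(const std::vector<Column> &relation_cols, int key_attno) {
          int ulPos = 0;
          for (const auto &col : relation_cols) {
              if (col.dropped) continue;  // the table descriptor skips dropped cols
              if (col.attno == key_attno) return ulPos;
              ++ulPos;
          }
          return -1;
      }

      int main() {
          // 20-column relation with columns 10..15 dropped; the index covers col 20.
          std::vector<Column> cols;
          for (int attno = 1; attno <= 20; ++attno)
              cols.push_back({attno, attno >= 10 && attno <= 15});
          std::cout << PosFromIndexKey(cols, 20) << "\n";  // 13, safely within bounds
      }
      ```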
      
      We also added the two test cases below to cover this scenario:
      
      1. `DynamicIndexGetDroppedCols`
      2. `LogicalIndexGetDroppedCols`
      
      Both of the above test cases use a table with 30 columns; create a btree
      index on 6 columns and then drop 7 columns so the table only has 23
      columns.
      
      This commit also bumps ORCA to version 2.43.1
      7ccedc81
    • Bump ORCA version · 9dec59e0
      Bhuvnesh Chaudhary authored
      9dec59e0