1. 17 Mar 2017 (1 commit)
    • H
      Fix bug that can't generate equality partition filters for multilevel partition table · e99b9fef
      Committed by Haisheng Yuan
      The equality partition filter of the partition selector works well for
      single-level partition tables, but not for multilevel ones. For example,
      with 2 levels using pk1 and pk2 as the partition keys for levels 1 and 2,
      if the query contains an equality predicate such as pk1 = 2, the level-2
      equality filter is null, so the function `FEqPartFiltersAllLevels` returns
      false, causing the equality predicate to be put into PartFilters instead
      of PartEqFilters. This patch fixes that bug.
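A minimal sketch of the fixed check (hypothetical Python; the function name mirrors `FEqPartFiltersAllLevels`, but the filter representation is illustrative, not ORCA's actual code): a level with no equality filter at all should not disqualify the predicate.

```python
def eq_part_filters_all_levels(level_filters):
    """Sketch of the fixed all-levels check: a level whose equality filter
    is None (no predicate on that partitioning key) is acceptable; only a
    non-equality filter disqualifies the predicate from PartEqFilters."""
    has_eq = False
    for f in level_filters:
        if f is None:
            continue                      # no filter at this level: fine
        if not f.get("is_equality", False):
            return False                  # non-equality filter: disqualify
        has_eq = True
    return has_eq

# Two-level table, equality predicate only on the level-1 key (pk1 = 2):
filters = [{"is_equality": True, "expr": "pk1 = 2"}, None]
print(eq_part_filters_all_levels(filters))  # True with the fix
```

Before the fix, by this reading, the null level-2 filter made the check return false even though the level-1 predicate was a pure equality.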
      
      [#141826453]
      e99b9fef
  2. 16 Mar 2017 (2 commits)
  3. 11 Mar 2017 (1 commit)
    • O
      [#141511349] Improve HashMap iterator implementation · 4d9e03a8
      Committed by Omer Arap
      Currently, the HashMapIter implementation scans through all hash map
      buckets to get the next existing hash chain. This degrades performance
      significantly.
      
      This commit improves the iterator implementation by maintaining a
      dynamic key array which holds the existing keys in HashMap. The
      iteration is done using this array.
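The idea can be sketched as follows (a hypothetical Python analogue; ORCA's actual `HashMapIter` is C++ and differs in detail):

```python
class HashMapWithKeyArray:
    """Hash map that also maintains a dynamic array of inserted keys,
    so iteration visits only existing entries instead of scanning
    every (possibly empty) bucket."""
    def __init__(self):
        self._buckets = {}
        self._keys = []   # dynamic key array maintained on insert

    def insert(self, key, value):
        if key not in self._buckets:
            self._keys.append(key)
        self._buckets[key] = value

    def __iter__(self):
        # Iterate via the key array: O(number of entries),
        # independent of the bucket count.
        for key in self._keys:
            yield key, self._buckets[key]

m = HashMapWithKeyArray()
m.insert("a", 1)
m.insert("b", 2)
print(list(m))  # [('a', 1), ('b', 2)]
```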
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
      4d9e03a8
  4. 08 Mar 2017 (5 commits)
    • O
      Bump ORCA version to 2.10 · 8263c6b4
      Committed by Omer Arap
      8263c6b4
    • H
      Remove wrappers over standard C99 math functions. · c779beb7
      Committed by Heikki Linnakangas
      Makes the code simpler for humans, and also allows the compiler to "see"
      what the operations are and optimize accordingly.
      
      I hoped that the GPDB compiler warnings about failed inlining with
      -Winline would go away with commit 5f774da5. It turns out that commit
      was not enough, unfortunately, but this commit does the trick.
      c779beb7
    • B
      Bump ORCA version to 2.9 · a8865d7d
      Committed by Bhuvnesh Chaudhary
      Signed-off-by: Omer Arap <oarap@pivotal.io>
      a8865d7d
    • B
      Added test cases for equivalence classes · 83aac2cb
      Committed by Bhuvnesh Chaudhary
      Added a utility function to compare two Equivalence Class
      arrays.
      Signed-off-by: Omer Arap <oarap@pivotal.io>
      83aac2cb
    • O
      [#140601033] Optimize equivalence classes intersection · 66c3c843
      Committed by Omer Arap
      This commit introduces a better algorithm for computing the intersection
      of two sets of equivalence classes. The input sets are arrays of column
      reference sets, each of which denotes an individual equivalence class.
      
      Since equivalence classes are disjoint by definition, the running time
      can be reduced from quadratic to linear by using a HashMap.
      
      We build a hash map using the columns as keys and the equivalence
      classes as the values.
      
      Below is a sample input and output scenario.
      
      `Classes1`: `[(a,b),(c,d,e),(f,g)]`
      `Classes2`: `[(a),(b,c,d),(e),(f,g)]`
      
      The hashmap after first iteration on `Classes1`:
      `a->(a,b)`
      `b->(a,b)`
      `c->(c,d,e)`
      `d->(c,d,e)`
      `e->(c,d,e)`
      `f->(f,g)`
      `g->(f,g)`
      
      In the probe iteration, once we detect an intersection, we replace the
      intersected columns' entries in the HashMap with an empty set to avoid
      duplication.
      
      First, `(a)` is read and column `a` is probed: `map[a] = (a,b)`. The
      intersection of `(a,b)` and `(a)` is `(a)`, which is added to the result
      list. `a`'s entry in the map is emptied. The hash map after this step
      looks like this:
      `a->()`
      `b->(a,b)`
      `c->(c,d,e)`
      `d->(c,d,e)`
      `e->(c,d,e)`
      `f->(f,g)`
      `g->(f,g)`
      
      Then, `(b,c,d)` is processed. First, column `b` is used to probe the hash
      map, which returns `(a,b)`. The intersection is `(b)`, which is appended
      to the result list, and `b`'s entry in the map is emptied. Hash map after
      this step:
      `a->()`
      `b->()`
      `c->(c,d,e)`
      `d->(c,d,e)`
      `e->(c,d,e)`
      `f->(f,g)`
      `g->(f,g)`
      
      Then column `c` is looked up and the hash map returns `(c,d,e)`. The
      intersection of `(b,c,d)` and `(c,d,e)` is `(c,d)`, which is added to the
      result list. `c`'s and `d`'s entries are also emptied, leaving the hash
      map as below:
      `a->()`
      `b->()`
      `c->()`
      `d->()`
      `e->(c,d,e)`
      `f->(f,g)`
      `g->(f,g)`
      
      So, when we then look up column `d`, the last column of `(b,c,d)`, its
      entry in the hash map is empty, and the duplicate intersection result
      `(c,d)` is not produced again.
      
      The final result will be `[(a),(b),(c,d),(e),(f,g)]`
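The walkthrough above can be condensed into a short sketch (hypothetical Python, using lists of column names in place of ORCA's column reference sets; not the actual implementation):

```python
def intersect_equivalence_classes(classes1, classes2):
    """Linear-time intersection of two disjoint equivalence-class
    partitions. Build a hash map from each column to its class in
    classes1, then probe it with the columns of classes2, emptying
    intersected entries so no duplicate intersection is produced."""
    col_to_class = {}
    for ec in classes1:
        members = set(ec)
        for col in ec:
            col_to_class[col] = members

    result = []
    for ec in classes2:
        probe = set(ec)
        for col in ec:
            common = col_to_class.get(col, set()) & probe
            if common:
                result.append(sorted(common))
                for c in common:
                    col_to_class[c] = set()  # avoid duplicate results
    return result

classes1 = [["a", "b"], ["c", "d", "e"], ["f", "g"]]
classes2 = [["a"], ["b", "c", "d"], ["e"], ["f", "g"]]
print(intersect_equivalence_classes(classes1, classes2))
# [['a'], ['b'], ['c', 'd'], ['e'], ['f', 'g']]
```

Each column is inserted once and probed at most once, which is the linear running time the commit describes.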
      Signed-off-by: Taylor Vesely <tvesely@pivotal.io>
      66c3c843
  5. 07 Mar 2017 (6 commits)
  6. 23 Feb 2017 (2 commits)
    • B
      Bump ORCA version to 2.7 [#139042295] · aa466bfe
      Committed by Bhuvnesh Chaudhary
      Signed-off-by: Taylor Vesely <tvesely@pivotal.io>
      aa466bfe
    • O
      [#139042295] Dedup constraints in linear time (#143) · 319a6eb9
      Committed by Omer Arap
      * [#139042295] Dedup constraints in linear time
      
      GP Orca rearranges the constraints list so that, as much as possible,
      each constraint refers to only a single column reference. To do that,
      the previous implementation traversed the list for every single
      constraint, which results in O(n^2) (quadratic) running time.
      
      If the number of constraints in the list is higher than some threshold
      (hardcoded as 5 for now), this causes significant performance
      degradation.
      
      This commit uses a hash map from column references to the constraints
      that refer only to that single column. This reduces the deduplication
      process from quadratic to linear running time. However, creating a hash
      map for a short list of constraints also results in performance
      degradation.
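A minimal sketch of the approach (hypothetical Python; constraints are modeled as (columns, predicate) pairs, which is not ORCA's actual representation):

```python
def dedup_single_column_constraints(constraints):
    """Group single-column constraints by their column in one pass (O(n))
    instead of re-scanning the list for every constraint (O(n^2)).
    Constraints over more than one column are left alone."""
    by_column = {}
    others = []
    for cols, pred in constraints:
        if len(cols) == 1:
            by_column.setdefault(cols[0], []).append(pred)
        else:
            others.append((cols, pred))
    return by_column, others

constraints = [
    (["a"], "a > 1"),
    (["a"], "a < 10"),
    (["b"], "b = 3"),
    (["a", "b"], "a < b"),
]
grouped, rest = dedup_single_column_constraints(constraints)
print(grouped)  # {'a': ['a > 1', 'a < 10'], 'b': ['b = 3']}
print(rest)     # [(['a', 'b'], 'a < b')]
```

The single hash-map pass replaces the per-constraint list traversal; for very short lists the map's setup cost can outweigh the savings, as the commit notes.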
      Signed-off-by: Omer Arap <oarap@pivotal.io>
      319a6eb9
  7. 16 Feb 2017 (1 commit)
    • H
      Fix bug that ORCA cannot do DPE for array coerce predicate · f112264b
      Committed by Haisheng Yuan
      Before this patch, Orca could not extract the array elements inside a
      ScalarArrayCmp if the array was wrapped in an ArrayCoerceExpr. This patch
      extracts the array inside, letting Orca expand the array into
      disjunctions and do DPE for predicates with array coerce expressions.
      
      [#138085391]
      f112264b
  8. 15 Feb 2017 (1 commit)
    • H
      Remove unused support for dynamic min/max absolute values in CDouble. · 5f774da5
      Committed by Heikki Linnakangas
      It was possible to specify a minimum and maximum absolute value for a
      CDouble in the constructor. That would allow using different limits in
      different places. However, that facility was unused; no caller passed
      non-default min/max values. So remove the unnecessary flexibility, and
      just use the same default limits everywhere.
      
      I was seeing compiler warnings about failed inlining in GPDB from
      CDouble's CheckValidity function, with -Winline. As well as saving some
      memory and some cycles in all operations on CDoubles, and making the code
      simpler, I'm hoping that this will make those warnings go away.
      5f774da5
  9. 21 Jan 2017 (9 commits)
    • E
      Bump ORCA version to 2.5 [#138076347] · 51779e96
      Committed by Ekta Khanna and Jesse Zhang
      This version should bring much-needed fix that unblocks everybody on a
      Mac. Kudos to Heikki Linnakangas and Ekta Khanna.
      51779e96
    • E
      3a0921d8
    • H
      Fix populating config.h. · c60f5ea5
      Committed by Heikki Linnakangas
      Commit 76af8194, which generates a config.h with all the required GPOS_*
      flags that affect the API of the resulting binaries (most notably
      GPOS_DEBUG), was broken. It didn't derive the GPOS_DEBUG flag from
      CMAKE_BUILD_TYPE as it should have. It did the right thing if you set the
      "GPOS_DEBUG=1" property on the cmake command line, but that property was
      not set automatically with CMAKE_BUILD_TYPE=Debug. CMAKE_BUILD_TYPE=Debug
      added GPOS_DEBUG=1 to the COMPILE_DEFINITIONS property, but that's
      different from having a stand-alone GPOS_DEBUG property.
      
      Likewise, the GPOS_<arch> and GPOS_32BIT/64BIT variables were also not
      set correctly. The reason I didn't notice this while testing was that the
      flags in COMPILE_DEFINITIONS were still set, even though, had config.h
      been generated correctly, they would not have been necessary anymore.
      
      To fix, remove those flags from COMPILE_DEFINITIONS so that they are no
      longer passed on the compiler command line. The defines are now always
      read from the config.h file, where they are set correctly.
      
      Per report off-list by Jesse Zhang.
      c60f5ea5
    • V
      Orca should skip collapsing projects when it has multiple set returning... · 9aa15351
      Committed by Venkatesh Raghavan
      Orca should skip collapsing projects when it has multiple set returning functions and one of the project elements cannot be collapsed [#137642207]
      
      If a project element of a subquery has a set returning function, it should not be merged with its child, since that may generate wrong results under the following conditions:
      
      * The child's project list contains a set returning function (this was fixed in a prior fix).
      * The current project list has other set returning functions that may not be merged with the child because they use the child's output; see the example below. Splitting these project elements would change the semantics of the query (fixed by the current fix).
      
      Example SQL:
      
      select generate_series(0,2) rn, unnest(arr) cnt_div from (select ARRAY['0','1','2', '3'] as arr) arra;
      
      The SQL has two project nodes:
      * The top project node with two set returning functions.
      * The lower ARRAY function over a constant table get.
      
      ```
      
      +--CLogicalProject
         |--CLogicalProject
         |  |--CLogicalConstTableGet Columns: ["" (0)] Values: [(1)]
         |  +--CScalarProjectList
         |     +--CScalarProjectElement "arr" (1)
         |        +--CScalarArray: {eleMDId: (25,1.0), arrayMDId: (1009,1.0)}
         |           |--CScalarConst (161088044.000)
         |           |--CScalarConst (161096236.000)
         |           +--CScalarConst (161104428.000)
         +--CScalarProjectList
            |--CScalarProjectElement "rn" (2)
            |  +--CScalarFunc (generate_series)
            |     |--CScalarConst (0)
            |     +--CScalarConst (2)
            +--CScalarProjectElement "cnt_div" (3)
               +--CScalarFunc (unnest)
                  +--CScalarIdent "arr" (1)
      ```
      
      The top project node has two set returning functions:
      * generate_series - has no dependence on the child's project list, so it can be collapsed.
      * unnest(arr) - takes as input `arr`, which comes from the child's project list, so it cannot be collapsed.
      
      Before this fix, we collapsed the first function but not the second, like below.
      
      ```
      Algebrized preprocessed query:
      +--CLogicalProject
         |--CLogicalProject
         |  |--CLogicalConstTableGet Columns: ["" (0)] Values: [(1)]
         |  +--CScalarProjectList
         |     |--CScalarProjectElement "rn" (2)
         |     |  +--CScalarFunc (generate_series)
         |     |     |--CScalarConst (0)
         |     |     +--CScalarConst (2)
         |     +--CScalarProjectElement "arr" (1)
         |        +--CScalarArray: {eleMDId: (25,1.0), arrayMDId: (1009,1.0)}
         |           |--CScalarConst (161088044.000)
         |           |--CScalarConst (161096236.000)
         |           +--CScalarConst (161104428.000)
         +--CScalarProjectList
            +--CScalarProjectElement "cnt_div" (3)
               +--CScalarFunc (unnest)
                  +--CScalarIdent "arr" (1)
      ```
      
      This caused wrong results, since it changes the SQL semantics when there
      are multiple set returning functions in a single project list. It is all
      or nothing: either all project elements are collapsed, or none are.
      9aa15351
    • O
      Bump orca version 2.3.0 [#117186547] · 5ab1e794
      Committed by Omer Arap
      Signed-off-by: Haisheng Yuan <hyuan@pivotal.io>
      5ab1e794
    • E
    • E
    • B
      aad4e7fa
    • J
      Append `.0` to version string · 78b6ccd5
      Committed by Jesse Zhang
      This patch adds a `.0` (or as they call it over semver.org, the "patch"
      number) to our version. This makes our version semver-compliant. It also
      makes it abundantly clear that our version number (say `2.2.0`) is not a
      decimal floating point number. ("What is newer? 2.2 or 2.19?")
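The "2.2 or 2.19" point is easy to demonstrate (a small illustrative snippet, not part of the commit):

```python
# As decimal numbers, 2.2 > 2.19; as versions, 2.19.0 is newer than 2.2.0.
print(2.2 > 2.19)  # True

def version_tuple(v):
    """Compare versions component-wise, as semver does."""
    return tuple(int(part) for part in v.split("."))

print(version_tuple("2.19.0") > version_tuple("2.2.0"))  # True
```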
      
      N.B. our practice is to bump only the minor and major version numbers.
      This change does not signal a change in that practice.
      
      I'm also hiding this commit from CI so that it only impacts the next
      push (so it will be, say, `2.3.0` if we are currently on `2.2`).
      
      [ci skip]
      78b6ccd5
  10. 20 Jan 2017 (1 commit)
    • J
      Append `.0` to version string · 8bd624bd
      Committed by Jesse Zhang
      This patch adds a `.0` (or as they call it over semver.org, the "patch"
      number) to our version. This makes our version semver-compliant. It also
      makes it abundantly clear that our version number (say `2.2.0`) is not a
      decimal floating point number. ("What is newer? 2.2 or 2.19?")
      
      N.B. our practice is to bump only the minor and major version numbers.
      This change does not signal a change in that practice.
      
      I'm also hiding this commit from CI so that it only impacts the next
      push (so it will be, say, `2.3.0` if we are currently on `2.2`).
      
      [ci skip]
      8bd624bd
  11. 19 Jan 2017 (1 commit)
    • H
      Add config.h, for options that affect binary compatibility. · 76af8194
      Committed by Heikki Linnakangas
      Before this patch, consumers of ORCA had to know out-of-band which
      flags were used to compile ORCA, because e.g. ORCA compiled with
      GPOS_DEBUG would only work if the application using ORCA was also
      compiled with GPOS_DEBUG. This is because many of the structs differ
      depending on GPOS_DEBUG. Same for the architecture flags, like GPOS_i386.
      
      The new config.h file is #included from a few central header files, to
      make sure it gets included in any application that uses other gpos
      headers. We should probably include config.h from all other gpos header
      files, to be safe, but this seems to be enough for ORCA itself and GPDB,
      at least.
      
      Bump version number to 2.2.
      76af8194
  12. 07 Jan 2017 (2 commits)
  13. 06 Jan 2017 (5 commits)
  14. 05 Jan 2017 (3 commits)