- 08 Jun 2017, 1 commit
-
-
Committed by Jemish Patel
[ci skip] Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
- 18 May 2017, 1 commit
-
-
Committed by Venkatesh Raghavan
-
- 16 May 2017, 2 commits
-
-
Committed by Jesse Zhang
[ci skip]
-
Committed by Jesse Zhang
Todd broke our CI in greenplum-db/gpdb@398534a9. [ci skip]
-
- 15 May 2017, 2 commits
-
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
* Make sure the intent of the traceflags is clear
* Remove double negation where possible
* Update comments
-
- 12 May 2017, 1 commit
-
-
Committed by C.J. Jameson
-
- 09 May 2017, 6 commits
-
-
Committed by Dhanashree Kashid
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
For a query using a correlated subselect as a predicate, such as:

```
CREATE TABLE partitioned_table (a int, pk int)
  DISTRIBUTED BY (a)
  PARTITION BY range(pk) (end(5), end(10));
CREATE TABLE other_table (c int, d int) DISTRIBUTED BY (c);
INSERT INTO partitioned_table VALUES (1, 1), (2, 4), (3, 9);
ANALYZE ROOTPARTITION partitioned_table;
EXPLAIN SELECT pk FROM partitioned_table
  WHERE a > (SELECT d FROM other_table WHERE c = a) AND pk < 12;
```

ORCA will generate a Correlated Nested Loop Left Outer Join, which should translate to a DXL Scan under a DXL SubPlan filter. However, the translation happened in the following order:

1. Translate the outer child (which has a filter) of the correlated NLJ.
2. Build a DXL SubPlan using the inner child, which is intended to serve as an "additional filter condition" on top of the outer child.
3. Now, based on the DXL plan for the outer child, decide whether or not to use the additional condition (generated in #2 or #3) as a filter in the final result.
4. If the outer child is a Physical Sequence, we discarded the condition, assuming that the filter condition is already present in the partition selector.
5. This code to discard the subplan was added in e99325cc because, previously, we always inserted the additional filter as "the second child", based on the assumption that every DXL node has a filter child in the second place. As it turned out, the DXL Sequence node is one counterexample: it has no filter, and its second child is expected to be a partition selector.
6. We didn't catch this error in e99325cc because the test cases had a trivial additional filter condition of a constant true, so dropping it didn't really raise any eyebrows.

However, the generally correct approach is to retain this additional condition, either as an additional DXL Result node on top of the DXL for the outer child, or, for appropriate types of nodes, to inline the condition into existing filters. This patch set fixes that.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
We ensure all three cases of PdxlnCorrelatedNLJoin take ownership of the DXLProperties object
-
-
-
1. There are cases where the additional scalar condition cannot be combined with the original condition in the DXL plan. Case in point: when the outer child gets translated into a DXL Sequence, we cannot put the "combined condition" into the sequence.
2. Deferring the combination of conditions also gives us an optimization opportunity to reduce double translations.
-
- 30 Apr 2017, 1 commit
-
-
Committed by Jesse Zhang
installcheck has slowly bloated over the last year. This test runs about 32 minutes when nothing else is happening in Concourse. Given that we also run the planner ICG in parallel, we'd better err on the safe side and extend this to an hour. [ci skip]
-
- 26 Apr 2017, 11 commits
-
-
Committed by Omer Arap
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
Committed by Heikki Linnakangas
If an error occurs while serializing an exception's error context, don't recurse into Serialize. Serializing the context is likely to just fail again, leading to infinite recursion. Also disable the abort signal while serializing the error context, to avoid recursing.
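The pattern above can be sketched with a re-entrancy guard. This is a minimal, hypothetical stand-in (not the actual GPOS exception serializer, and class/member names are invented for illustration): a flag prevents re-entering context serialization, and a failure there is swallowed rather than allowed to trigger another serialization attempt.

```cpp
#include <string>

// Hypothetical sketch of the guard: if serializing the error context
// fails, skip the context instead of recursing into Serialize() again.
class ErrorSerializer {
  bool m_serializing_context = false;  // re-entrancy guard

 public:
  std::string Serialize(const std::string &msg) {
    std::string out = "error: " + msg;
    if (!m_serializing_context) {
      m_serializing_context = true;
      try {
        out += " [context: " + SerializeContext() + "]";
      } catch (...) {
        // Context serialization failed; drop the context rather than
        // retry, since retrying would likely fail the same way.
      }
      m_serializing_context = false;
    }
    return out;
  }

 private:
  std::string SerializeContext() { return "ctx"; }
};
```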
-
Committed by Heikki Linnakangas
If an error occurs, a worker is no longer executing the given task. This was causing problems later, when the Task object had already been destroyed, but m_ptsk was left dangling.
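The fix described above amounts to clearing the worker's task pointer on the error path. A minimal sketch, with hypothetical `Worker`/`Task` shapes (the real GPOS classes differ):

```cpp
#include <stdexcept>

// Hypothetical task: throws when execution fails.
struct Task {
  bool failed;
  void Execute() const {
    if (failed) throw std::runtime_error("task failed");
  }
};

class Worker {
  const Task *m_ptsk = nullptr;  // task currently being executed

 public:
  bool Run(const Task &task) {
    m_ptsk = &task;
    try {
      task.Execute();
    } catch (...) {
      // On error the worker is no longer executing this task; clear the
      // pointer so it can never dangle after the Task is destroyed.
      m_ptsk = nullptr;
      return false;
    }
    m_ptsk = nullptr;
    return true;
  }

  bool HasTask() const { return m_ptsk != nullptr; }
};
```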
-
Committed by Heikki Linnakangas
The hash table iterator holds a spinlock on the hash table, and there's a GPOS_ASSERT_NO_SPINLOCK in PvMalloc. Writing to the dumper stream can cause allocations, so avoid doing that while iterating.
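The pattern can be sketched as follows, with `std::mutex` standing in for the GPOS spinlock and invented names throughout: snapshot the entries while the lock is held, then do the allocation-heavy stream writing only after it is released.

```cpp
#include <map>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: keep the critical section free of writes to the
// dump stream, since those writes may allocate.
class DumpableTable {
  std::map<int, std::string> m_entries;
  mutable std::mutex m_lock;  // stand-in for the hash table spinlock

 public:
  void Insert(int k, std::string v) {
    std::lock_guard<std::mutex> g(m_lock);
    m_entries[k] = std::move(v);
  }

  std::string Dump() const {
    std::vector<std::pair<int, std::string>> snapshot;
    {
      // Copy entries out under the lock; no stream writes happen here.
      std::lock_guard<std::mutex> g(m_lock);
      snapshot.assign(m_entries.begin(), m_entries.end());
    }
    // Stream formatting (which may allocate) runs with the lock released.
    std::ostringstream os;
    for (const auto &e : snapshot) os << e.first << "=" << e.second << ";";
    return os.str();
  }
};
```

In GPOS the concern is stricter (a `GPOS_ASSERT_NO_SPINLOCK` fires inside the allocator), but the shape of the fix is the same: iterate and copy first, write second.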
-
Committed by Heikki Linnakangas
This isn't strictly necessary, but considerably speeds up the dumping of large queries. This brought down the time needed to process and dump a 600 MB minidump from about 90 s to 30 s on my laptop.
-
Committed by Heikki Linnakangas
Instead of having a fixed-size buffer to serialize minidumps to, refactor the serialization functions to write to a stream. That simplifies the serialization functions, as they no longer need to reserve space in the buffer ahead of time (i.e. the UlpRequiredSpace() functions are gone), reduces memory usage when dumping small queries, and makes it possible to minidump large queries without running out of memory. This removes the arbitrary 16 MB limit on minidump size. I'm not sure it's a good idea to create multi-gigabyte minidumps in practice, but that's a case of "if it hurts, don't do it". At least it's now possible, if you need to.
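The refactoring described above can be sketched like this (hypothetical `Element`/`SerializeAll` names, not ORCA's actual DXL serializer): each object writes itself to a stream, so there is no `UlpRequiredSpace()`-style reservation step and no fixed buffer to overflow.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch of stream-based serialization: the stream grows
// (or flushes to a file) as we write, so no up-front size is needed.
struct Element {
  std::string tag;
  std::string value;

  void Serialize(std::ostream &os) const {
    os << "<" << tag << ">" << value << "</" << tag << ">";
  }
};

std::string SerializeAll(const std::vector<Element> &elems) {
  std::ostringstream os;
  os << "<dump>";
  for (const auto &e : elems) e.Serialize(os);
  os << "</dump>";
  return os.str();
}
```

With a file-backed stream instead of `ostringstream`, the same code serializes arbitrarily large dumps without holding them in memory, which is what removes the fixed size limit.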
-
Committed by Omer Arap
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
-
Committed by Jesse Zhang
Signed-off-by: Omer Arap <oarap@pivotal.io>
-
Committed by Omer Arap
-
Committed by Omer Arap
-
Committed by Omer Arap
This commit provides a better implementation of `CHashSet`. The improvements follow the same logic as `CHashMap`, and a new iterator, `CHashSetIter`, is provided. Even though `CHashSet` existed in the code base, it had no real usage in the Orca codebase.
-
- 22 Apr 2017, 6 commits
-
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-
- 21 Apr 2017, 2 commits
-
-
Committed by Bhuvnesh Chaudhary
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Bhuvnesh Chaudhary
While constructing the propagation expression for queries involving default list partitions that have indexes, we ended up with NULLs, crashing the database. Currently, we fall back to the planner, but we need to handle this better in the future.
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
- 13 Apr 2017, 4 commits
-
-
Committed by Omer Arap
-
Committed by Omer Arap
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Omer Arap
With the improvement in the expression deduplication algorithm, we now also retain the order of expressions, so the test cases required the expression order to be fixed. We see a lot of plan diffs and plan size changes; the reason is that we now always preserve the order of the deduped list. Since we output a different deduped list, the plan changes are expected, specifically in the way we hash join and hash distribute.
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
-
Committed by Omer Arap
This commit introduces a performance improvement and a stability fix for the deduplication of a passed expression list. Earlier, the running time of this operation was O(n^2); this commit reduces it to O(n) using a hash map. In addition, the order of expressions was previously not preserved, as we used to add the last entry for an expression to the output list; now we insert the first entry, which keeps the order as is. E.g., assume everything in the lists below is an expression. Given A = `a,a,b,b,c,c`, running `dedup(A)` produces B = `a,b,c`. If you then add another `a` to list B, it becomes B' = `a,b,c,a`, and in the old code `dedup(B')` generates `b,c,a`. With this change, it always generates `a,b,c`.
Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
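The O(n) order-preserving dedup described above can be sketched as follows. This is a hypothetical free function over strings for illustration (ORCA's version operates on expression arrays with its own hash map): a hash set gives an O(1) average membership test, and keeping only the first occurrence of each element preserves input order.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical sketch of order-preserving deduplication in O(n):
// keep the FIRST occurrence of each element, skip later duplicates.
std::vector<std::string> Dedup(const std::vector<std::string> &in) {
  std::unordered_set<std::string> seen;  // O(1) average lookup
  std::vector<std::string> out;
  for (const auto &s : in) {
    // insert().second is true only for a first occurrence.
    if (seen.insert(s).second) out.push_back(s);
  }
  return out;
}
```

This reproduces the behavior in the commit's example: deduping `a,a,b,b,c,c` yields `a,b,c`, and deduping `a,b,c,a` also yields `a,b,c` rather than `b,c,a`.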
-
- 12 Apr 2017, 3 commits
-
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-
Committed by Venkatesh Raghavan
-