src/test/regress/expected/gporca.out · efb2777a84b255fa7bd3f82d7c532fc7caa7ddcd · Greenplum / Gpdb

Implement CDB like pre-join deduplication · efb2777a

Committed by Dhanashree Kashid, Ekta Khanna and Omer Arap on June 20, 2017

For flattened IN or EXISTS sublinks, if we choose an INNER JOIN path instead
of a SEMI JOIN, then we need to apply duplicate suppression.

The deduplication can be done in two ways:
1. post-join dedup
unique-ify the inner join results. try_postjoin_dedup in CdbRelDedupInfo denotes
whether we need to go for post-join dedup.

2. pre-join dedup
unique-ify the rows coming from the rel containing the subquery result,
before it is joined with any other rels. join_unique_ininfo in
CdbRelDedupInfo denotes whether we need to go for pre-join dedup.
semi_operators and semi_rhs_exprs are used for this. We ported a
function from 9.5 to compute these in make_outerjoininfo().
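
As a rough illustration of the two strategies (not the GPDB planner code), the
sketch below models a query of the form "SELECT * FROM t WHERE t.a IN (SELECT
s.b FROM s)" over small in-memory lists. The table contents, the helper names,
and the use of a row's list position as a stand-in for a row identifier are all
assumptions made for the example. Both dedup strategies should produce the same
rows as a true semi join.

```python
# Illustrative only: hypothetical data and helpers, not GPDB internals.

t = [(1, "x"), (2, "y"), (2, "z"), (3, "w")]  # outer rel rows: (a, payload)
s = [2, 2, 3, 3, 3, 4]                        # subquery result, with duplicates


def semi_join(outer, inner):
    """Reference answer: each outer row is emitted at most once."""
    inner_keys = set(inner)
    return [row for row in outer if row[0] in inner_keys]


def inner_join(outer, inner):
    """Plain inner join: one output row per matching (outer, inner) pair.

    The outer row's list position is carried along as an assumed row id.
    """
    return [
        (rid, row)
        for rid, row in enumerate(outer)
        for b in inner
        if row[0] == b
    ]


def postjoin_dedup(outer, inner):
    """Inner join first, then unique-ify the join result on the outer row id."""
    seen, out = set(), []
    for rid, row in inner_join(outer, inner):
        if rid not in seen:
            seen.add(rid)
            out.append(row)
    return out


def prejoin_dedup(outer, inner):
    """Unique-ify the subquery rows first, then do a plain inner join."""
    unique_inner = list(dict.fromkeys(inner))  # order-preserving distinct
    return [row for _, row in inner_join(outer, unique_inner)]


expected = semi_join(t, s)
```

Without either dedup step, the plain inner join here would emit an outer row
once per matching subquery row, which is why duplicate suppression is needed
whenever the INNER JOIN path is chosen.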

Upstream has a completely different implementation of this. Upstream explores JOIN_UNIQUE_INNER
and JOIN_UNIQUE_OUTER paths for this, and deduplication is done in create_unique_path().
GPDB does this differently since JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER are obsolete
for us. Hence we have kept the GPDB-style deduplication mechanism as is in this merge.

Post-join dedup has been implemented in previous merge commits.

Ref [#146890743]


gporca.out 233.6 KB
