    Implement CDB-like pre-join deduplication · efb2777a
    Committed by Dhanashree Kashid, Ekta Khanna and Omer Arap
    For flattened IN or EXISTS sublinks, if we choose the INNER JOIN path instead
    of the SEMI JOIN path, then we need to apply duplicate suppression.
    
    The deduplication can be done in two ways:
    1. post-join dedup
    unique-ify the inner join results. try_postjoin_dedup in CdbRelDedupInfo denotes
    whether we need to go for post-join dedup.
    
    2. pre-join dedup
    unique-ify the rows coming from the rel containing the subquery result,
    before that is joined with any other rels. join_unique_ininfo in
    CdbRelDedupInfo denotes if we need to go for pre-join dedup.
    semi_operators and semi_rhs_exprs are used for this. We ported a
    function from 9.5 to compute these in make_outerjoininfo().
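
    The two strategies above can be illustrated with a minimal Python sketch.
    It is not GPDB code: the row layout, the function names, and the use of an
    outer row id as the post-join dedup key are assumptions made for the
    example. It only shows that both strategies return the same rows as a
    semi join.

```python
def inner_join(outer, inner):
    """Plain inner join on outer key == inner value; duplicate inner values multiply rows."""
    return [(oid, k) for (oid, k) in outer for v in inner if k == v]


def semi_join(outer, inner):
    """Reference result: each outer row appears at most once if any inner value matches."""
    return [(oid, k) for (oid, k) in outer if any(k == v for v in inner)]


def prejoin_dedup(outer, inner):
    """Unique-ify the subquery rows first, then run an ordinary inner join."""
    unique_inner = list(dict.fromkeys(inner))  # order-preserving dedup
    return inner_join(outer, unique_inner)


def postjoin_dedup(outer, inner):
    """Run the inner join first, then drop repeats of the same outer row.

    The dedup key is the outer row id (hypothetical stand-in for a row
    identifier), so outer rows that merely share a value are all kept.
    """
    seen = set()
    result = []
    for oid, k in inner_join(outer, inner):
        if oid not in seen:
            seen.add(oid)
            result.append((oid, k))
    return result


# Outer rows are (row_id, key); rows 1 and 2 share a key on purpose.
outer_rows = [(1, 10), (2, 10), (3, 20), (4, 30)]
# Subquery output with duplicates.
inner_vals = [10, 10, 30, 30, 30, 40]

expected = semi_join(outer_rows, inner_vals)
pre = prejoin_dedup(outer_rows, inner_vals)
post = postjoin_dedup(outer_rows, inner_vals)
```

    In this sketch the un-deduplicated inner join yields 7 rows, while both
    dedup variants yield the 3 rows a semi join would produce.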
    
    Upstream has a completely different implementation of this. Upstream explores the
    JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER paths, and deduplication is done in
    create_unique_path(). GPDB does this differently, since JOIN_UNIQUE_INNER and
    JOIN_UNIQUE_OUTER are obsolete for us. Hence we have kept the GPDB-style
    deduplication mechanism as is in this merge.
    
    Post-join dedup was implemented in previous merge commits.
    
    Ref [#146890743]
gporca.out 233.6 KB