Unverified commit a4362cba, authored by Hans Zeller, committed by GitHub

Support "NDV-preserving" function and op property (#10247)

Orca uses this property for cardinality estimation of joins.
For example, the join foo join bar on foo.a = upper(bar.b)
will have a cardinality estimate similar to foo join bar on foo.a = bar.b.

Other functions, such as substring in foo join bar on foo.a = substring(bar.b, 1, 1),
won't be treated that way, since they are more likely to reduce the NDV and
therefore have a greater effect on join cardinalities.

Since this is specific to ORCA, we use logic in the translator to determine
whether a function or operator is NDV-preserving. Right now, we consider
a very limited set of operators; we may add more at a later time.

Let's assume that we join tables R and S and that f is a function or
expression that refers to a single column and does not preserve
NDVs. Let's also assume that p is a function or expression that also
refers to a single column and that does preserve NDVs:

join predicate       card. estimate                         comment
-------------------  -------------------------------------  -----------------------------
col1 = col2          |R| * |S| / max(NDV(col1), NDV(col2))  build an equi-join histogram
f(col1) = p(col2)    |R| * |S| / NDV(col2)                  use NDV-based estimation
f(col1) = col2       |R| * |S| / NDV(col2)                  use NDV-based estimation
p(col1) = col2       |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
p(col1) = p(col2)    |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
otherwise            |R| * |S| * 0.4                        this is an unsupported pred

Note that adding casts to these expressions is ok, as well as switching left and right side.
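
The table above can be read as a small decision procedure. The following C++
sketch is only an illustration under stated assumptions, not ORCA's
implementation: JoinSide, EstimateJoinCardinality and
kUnsupportedPredSelectivity are hypothetical names, and a side carries an NDV
only when it is a plain column or an NDV-preserving expression over a single
column. The sketch ignores the histogram-based path used for col1 = col2.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Hypothetical model of one side of an equality join predicate.
// ndv holds the NDV of the underlying column when the side is a plain column
// or an NDV-preserving expression; it is empty for any other expression.
struct JoinSide
{
    std::optional<double> ndv;
};

// Selectivity used for unsupported predicates (the 0.4 from the table).
constexpr double kUnsupportedPredSelectivity = 0.4;

// Estimate |R join S| for an equality predicate, following the table rows.
double EstimateJoinCardinality(double rows_r, double rows_s,
                               const JoinSide &left, const JoinSide &right)
{
    // Assumption of this sketch: row counts are non-negative, NDVs positive.
    assert(rows_r >= 0 && rows_s >= 0);
    assert(!left.ndv || *left.ndv > 0);
    assert(!right.ndv || *right.ndv > 0);

    const double cross_product = rows_r * rows_s;
    if (left.ndv && right.ndv)
    {
        // col = col, p(col) = col, p(col) = p(col)
        return cross_product / std::max(*left.ndv, *right.ndv);
    }
    if (left.ndv)
    {
        // only the left side preserves NDVs, e.g. col1 = f(col2)
        return cross_product / *left.ndv;
    }
    if (right.ndv)
    {
        // only the right side preserves NDVs, e.g. f(col1) = col2
        return cross_product / *right.ndv;
    }
    // neither side has a usable NDV: unsupported predicate
    return cross_product * kUnsupportedPredSelectivity;
}
```

With |R| = 1000, |S| = 100, NDV(col1) = 50 and NDV(col2) = 20, the sketch
gives 100000 / 50 = 2000 rows for p(col1) = col2 and 100000 / 20 = 5000 rows
for f(col1) = col2.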

Here is a list of expressions that we currently treat as NDV-preserving:

coalesce(col, const)
col || const
lower(col)
trim(col)
upper(col)
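
To make the list concrete, here is a simplified, self-contained C++ model of
how such expressions could be recognized by walking down an expression tree.
It is a sketch under assumptions: Expr, Kind, Make and UnderlyingColumn are
hypothetical names, functions and operators are matched by name rather than
by OID, and the real check in this commit (CUtils::IsExprNDVPreserving) works
on ORCA's CExpression and metadata accessor instead.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Column, Const, Cast, Coalesce, Func, Op };

struct Expr
{
    Kind kind;
    std::string name;  // column, function or operator name
    std::vector<std::shared_ptr<Expr>> children;
};
using ExprPtr = std::shared_ptr<Expr>;

// Convenience builder for the sketch.
ExprPtr Make(Kind kind, std::string name, std::vector<ExprPtr> children = {})
{
    return std::make_shared<Expr>(Expr{kind, std::move(name), std::move(children)});
}

// Does the subtree reference any column?
bool ReferencesColumn(const Expr &e)
{
    if (e.kind == Kind::Column) return true;
    for (const auto &child : e.children)
        if (ReferencesColumn(*child)) return true;
    return false;
}

// Assumed allow-list, mirroring lower/upper and the trim variants.
bool IsNdvPreservingFunc(const std::string &name)
{
    return name == "lower" || name == "upper" || name == "ltrim" ||
           name == "btrim" || name == "rtrim";
}

// Return the single underlying column if the expression consists only of
// casts and NDV-preserving constructs on top of it, otherwise nullptr.
const Expr *UnderlyingColumn(const Expr &root)
{
    const Expr *curr = &root;
    while (true)
    {
        std::size_t next = 0;  // index of the child that leads to the column
        switch (curr->kind)
        {
            case Kind::Column:
                return curr;
            case Kind::Const:
                return nullptr;
            case Kind::Cast:
                if (curr->children.size() != 1) return nullptr;
                break;
            case Kind::Coalesce:
                // coalesce(col, const1, ..., constn): only arg 0 may use a column
                if (curr->children.empty()) return nullptr;
                for (std::size_t i = 1; i < curr->children.size(); i++)
                    if (ReferencesColumn(*curr->children[i])) return nullptr;
                break;
            case Kind::Func:
                if (!IsNdvPreservingFunc(curr->name) || curr->children.size() != 1)
                    return nullptr;
                break;
            case Kind::Op:
                if (curr->name != "||" || curr->children.size() != 2) return nullptr;
                if (!ReferencesColumn(*curr->children[1]))
                    next = 0;  // col || const
                else if (!ReferencesColumn(*curr->children[0]))
                    next = 1;  // const || col
                else
                    return nullptr;  // col1 || col2
                break;
        }
        assert(next < curr->children.size());
        curr = curr->children[next].get();
    }
}
```

In this model upper(b) and a || 'x' resolve to their underlying column, while
a || b and substring(b, 1, 1) yield nullptr.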

One more note: We need the NDVs of the inner side of Semi and
Anti-joins for cardinality estimation, so only normal columns and
NDV-preserving functions are allowed in that case.
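
One possible reading of this rule, written as a small C++ sketch: the names
JoinPredSides, kInvalidColId and IsSupportedForNdvEstimation are hypothetical,
and the sentinel for "no usable column on this side" is an assumption modeled
on the HasValidColIdOuter/HasValidColIdInner accessors added in this commit.

```cpp
#include <cassert>
#include <limits>

// Assumed sentinel: the side is not a plain column or an NDV-preserving
// expression over one column, so no NDV can be looked up for it.
constexpr unsigned long kInvalidColId = std::numeric_limits<unsigned long>::max();

struct JoinPredSides
{
    unsigned long outer_colid;
    unsigned long inner_colid;
};

// Decide whether an equality join predicate can use NDV-based estimation.
bool IsSupportedForNdvEstimation(const JoinPredSides &pred, bool is_semi_or_anti_join)
{
    const bool has_outer = pred.outer_colid != kInvalidColId;
    const bool has_inner = pred.inner_colid != kInvalidColId;

    // two valid sides must refer to different columns
    assert(!(has_outer && has_inner) || pred.outer_colid != pred.inner_colid);

    if (is_semi_or_anti_join)
    {
        // semi and anti-joins need the NDV of the inner side
        return has_inner;
    }
    // other joins can fall back to the NDV of whichever side has one
    return has_outer || has_inner;
}
```

Under this reading, f(outer_col) = inner_col stays supported for a semi-join,
while outer_col = f(inner_col) is treated as an unsupported predicate.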

This is a port of these GPDB 5X and GPOrca PRs:
https://github.com/greenplum-db/gporca/pull/585
https://github.com/greenplum-db/gpdb/pull/10090

This is take 2, after reverting the first attempt due to a merge conflict that
caused a test to fail.
Parent 560ffcb1
......@@ -639,6 +639,28 @@ gpdb::FuncStrict
return false;
}
bool
gpdb::IsFuncNDVPreserving
(
Oid funcid
)
{
// Given a function oid, return whether it's one of a list of NDV-preserving
// functions (estimated NDV of output is similar to that of the input)
switch (funcid)
{
// for now, these are the functions we consider for this optimization
case LOWER_OID:
case LTRIM_SPACE_OID:
case BTRIM_SPACE_OID:
case RTRIM_SPACE_OID:
case UPPER_OID:
return true;
default:
return false;
}
}
char
gpdb::FuncStability
(
......@@ -2128,6 +2150,24 @@ gpdb::IsOpStrict
return false;
}
bool
gpdb::IsOpNDVPreserving
(
Oid opno
)
{
switch (opno)
{
// for now, we consider only the concatenation op as NDV-preserving
// (note that we do additional checks later, e.g. col || 'const' is
// NDV-preserving, while col1 || col2 is not)
case OIDTextConcatenateOperator:
return true;
default:
return false;
}
}
void
gpdb::GetOpInputTypes
(
......
......@@ -1750,6 +1750,7 @@ CTranslatorRelcacheToDXL::RetrieveScOp
}
BOOL returns_null_on_null_input = gpdb::IsOpStrict(op_oid);
BOOL is_ndv_preserving = gpdb::IsOpNDVPreserving(op_oid);
CMDIdGPDB *mdid_hash_opfamily = NULL;
OID distr_opfamily = gpdb::GetCompatibleHashOpFamily(op_oid);
......@@ -1781,7 +1782,8 @@ CTranslatorRelcacheToDXL::RetrieveScOp
returns_null_on_null_input,
RetrieveScOpOpFamilies(mp, mdid),
mdid_hash_opfamily,
mdid_legacy_hash_opfamily
mdid_legacy_hash_opfamily,
is_ndv_preserving
);
return md_scalar_op;
}
......@@ -1802,12 +1804,14 @@ CTranslatorRelcacheToDXL::LookupFuncProps
IMDFunction::EFuncStbl *stability, // output: function stability
IMDFunction::EFuncDataAcc *access, // output: function data access
BOOL *is_strict, // output: is function strict?
BOOL *is_ndv_preserving, // output: preserves NDVs of inputs
BOOL *returns_set // output: does function return set?
)
{
GPOS_ASSERT(NULL != stability);
GPOS_ASSERT(NULL != access);
GPOS_ASSERT(NULL != is_strict);
GPOS_ASSERT(NULL != is_ndv_preserving);
GPOS_ASSERT(NULL != returns_set);
*stability = GetFuncStability(gpdb::FuncStability(func_oid));
......@@ -1818,6 +1822,7 @@ CTranslatorRelcacheToDXL::LookupFuncProps
*returns_set = gpdb::GetFuncRetset(func_oid);
*is_strict = gpdb::FuncStrict(func_oid);
*is_ndv_preserving = gpdb::IsFuncNDVPreserving(func_oid);
}
......@@ -1886,7 +1891,8 @@ CTranslatorRelcacheToDXL::RetrieveFunc
IMDFunction::EFuncDataAcc access = IMDFunction::EfdaNoSQL;
BOOL is_strict = true;
BOOL returns_set = true;
LookupFuncProps(func_oid, &stability, &access, &is_strict, &returns_set);
BOOL is_ndv_preserving = true;
LookupFuncProps(func_oid, &stability, &access, &is_strict, &is_ndv_preserving, &returns_set);
mdid->AddRef();
CMDFunctionGPDB *md_func = GPOS_NEW(mp) CMDFunctionGPDB
......@@ -1899,7 +1905,8 @@ CTranslatorRelcacheToDXL::RetrieveFunc
returns_set,
stability,
access,
is_strict
is_strict,
is_ndv_preserving
);
return md_func;
......
......@@ -530,7 +530,7 @@
<dxl:Plan Id="0" SpaceSize="21">
<dxl:GatherMotion InputSegments="0,1" OutputSegments="-1">
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="641930.734375" Rows="1000.000000" Width="4"/>
<dxl:Cost StartupCost="0" TotalCost="3219.015625" Rows="1000.000000" Width="4"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="27" Alias="?column?">
......@@ -541,7 +541,7 @@
<dxl:SortingColumnList/>
<dxl:Result>
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="641927.781250" Rows="1000.000000" Width="4"/>
<dxl:Cost StartupCost="0" TotalCost="3216.062500" Rows="1000.000000" Width="4"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="27" Alias="?column?">
......@@ -552,7 +552,7 @@
</dxl:ParamList>
<dxl:Result>
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="1919.921875" Rows="80.000000" Width="4"/>
<dxl:Cost StartupCost="0" TotalCost="1608.203125" Rows="0.200000" Width="4"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="9" Alias="i">
......@@ -681,7 +681,7 @@
<dxl:OneTimeFilter/>
<dxl:TableScan>
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="641922.875000" Rows="1000.000000" Width="4"/>
<dxl:Cost StartupCost="0" TotalCost="3211.156250" Rows="1000.000000" Width="4"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="i">
......
<?xml version="1.0" encoding="UTF-8"?>
<dxl:DXLMessage xmlns:dxl="http://greenplum.com/dxl/2010/12/">
<dxl:Comment><![CDATA[
Test case: Left outer join with outer refs in join predicate
drop table if exists x,y,z;
create table x(i int, j int);
create table y(i int, j int);
create table z(i int, j int);
insert into x select i, i%2 from generate_series(1, 10) i;
insert into y select i, i%2 from generate_series(1, 10) i;
insert into z select i, i%2 from generate_series(1, 1000) i;
analyze x;
analyze y;
analyze z;
set optimizer_enumerate_plans = on;
set optimizer_segments = 2;
explain select (select x.i from x left outer join y on x.i+y.i = z.i) from z;
]]>
</dxl:Comment>
<dxl:Thread Id="0">
<dxl:OptimizerConfig>
<dxl:EnumeratorConfig Id="0" PlanSamples="0" CostThreshold="0"/>
......
......@@ -173,7 +173,7 @@
<dxl:SumAgg Mdid="0.0.0.0"/>
<dxl:CountAgg Mdid="0.2147.1.0"/>
</dxl:Type>
<dxl:GPDBScalarOp Mdid="0.97.1.0" Name="&lt;" ComparisonType="LT" ReturnsNullOnNullInput="true">
<dxl:GPDBScalarOp Mdid="0.97.1.0" Name="&lt;" ComparisonType="LT" ReturnsNullOnNullInput="true" IsNDVPreserving="false">
<dxl:LeftType Mdid="0.23.1.0"/>
<dxl:RightType Mdid="0.23.1.0"/>
<dxl:ResultType Mdid="0.16.1.0"/>
......@@ -185,14 +185,14 @@
<dxl:Opfamily Mdid="0.3027.1.0"/>
</dxl:Opfamilies>
</dxl:GPDBScalarOp>
<dxl:GPDBFunc Mdid="0.274.1.0" Name="timeofday" ReturnsSet="false" Stability="Volatile" DataAccess="NoSQL" IsStrict="true">
<dxl:GPDBFunc Mdid="0.274.1.0" Name="timeofday" ReturnsSet="false" Stability="Volatile" DataAccess="NoSQL" IsStrict="true" IsNDVPreserving="false">
<dxl:ResultType Mdid="0.25.1.0"/>
</dxl:GPDBFunc>
<dxl:GPDBAgg Mdid="0.2101.1.0" Name="avg" IsSplittable="true" HashAggCapable="true">
<dxl:ResultType Mdid="0.1700.1.0"/>
<dxl:IntermediateResultType Mdid="0.17.1.0"/>
</dxl:GPDBAgg>
<dxl:GPDBFunc Mdid="0.17135.1.0" Name="fooro" ReturnsSet="true" Stability="Volatile" DataAccess="ReadsSQLData" IsStrict="false">
<dxl:GPDBFunc Mdid="0.17135.1.0" Name="fooro" ReturnsSet="true" Stability="Volatile" DataAccess="ReadsSQLData" IsStrict="false" IsNDVPreserving="false">
<dxl:ResultType Mdid="0.2249.1.0"/>
<dxl:OutputColumns TypeMdids="0.23.1.0,0.23.1.0"/>
</dxl:GPDBFunc>
......
......@@ -1077,11 +1077,9 @@ namespace gpopt
static
BOOL FCrossJoin(CExpression *pexpr);
// extract scalar ident column reference from scalar expression containing
// only one scalar ident in the tree
const static
CColRef *PcrExtractFromScExpression(CExpression *pexpr);
// is this scalar expression an NDV-preserving function (used for join stats derivation)
static
BOOL IsExprNDVPreserving(CExpression *pexpr, const CColRef **underlying_colref);
// search the given array of predicates for predicates with equality or IS NOT
// DISTINCT FROM operators that have one side equal to the given expression
......
......@@ -5116,18 +5116,112 @@ CUtils::FCrossJoin
return fCrossJoin;
}
// extract scalar ident column reference from scalar expression containing
// only one scalar ident in the tree
const CColRef *
CUtils::PcrExtractFromScExpression
// Determine whether a scalar expression consists only of a scalar id and NDV-preserving
// functions plus casts. If so, return the corresponding CColRef.
BOOL
CUtils::IsExprNDVPreserving
(
CExpression *pexpr
CExpression *pexpr,
const CColRef **underlying_colref
)
{
if (pexpr->DeriveUsedColumns()->Size() == 1)
return pexpr->DeriveUsedColumns()->PcrFirst();
CExpression *curr_expr = pexpr;
*underlying_colref = NULL;
// go down the expression tree, visiting the child containing a scalar ident until
// we found the ident or until we found a non-NDV-preserving function (at which point there
// is no more need to check)
while (1)
{
COperator *pop = curr_expr->Pop();
ULONG child_with_scalar_ident = 0;
switch (pop->Eopid())
{
case COperator::EopScalarIdent:
{
// we reached the bottom of the expression, return the ColRef
CScalarIdent *cr = CScalarIdent::PopConvert(pop);
*underlying_colref = cr->Pcr();
GPOS_ASSERT(1 == pexpr->DeriveUsedColumns()->Size());
return true;
}
case COperator::EopScalarCast:
// skip over casts
// Note: We might in the future investigate whether there are some casts
// that reduce NDVs by too much. Most, if not all, casts that have that potential are
// converted to functions, though. Examples: timestamp -> date, double precision -> int.
break;
case COperator::EopScalarCoalesce:
{
// coalesce(col, const1, ... constn) is treated as an NDV-preserving function
for (ULONG c=1; c<curr_expr->Arity(); c++)
{
if (0 < (*curr_expr)[c]->DeriveUsedColumns()->Size())
{
// this coalesce has a ColRef in the second or later arguments, assume for
// now that this doesn't preserve NDVs (we could add logic to support this case later)
return false;
}
}
break;
}
case COperator::EopScalarFunc:
{
// check whether the function is NDV-preserving
CMDAccessor *md_accessor = COptCtxt::PoctxtFromTLS()->Pmda();
CScalarFunc *sf = CScalarFunc::PopConvert(pop);
const IMDFunction *pmdfunc = md_accessor->RetrieveFunc(sf->FuncMdId());
if (!pmdfunc->IsNDVPreserving() || 1 != curr_expr->Arity())
{
return false;
}
break;
}
case COperator::EopScalarOp:
{
CMDAccessor *md_accessor = COptCtxt::PoctxtFromTLS()->Pmda();
CScalarOp *so = CScalarOp::PopConvert(pop);
const IMDScalarOp *pmdscop = md_accessor->RetrieveScOp(so->MdIdOp());
if (!pmdscop->IsNDVPreserving() || 2 != curr_expr->Arity())
{
return false;
}
return NULL;
// col <op> const is NDV-preserving, and so is const <op> col
if (0 ==(*curr_expr)[1]->DeriveUsedColumns()->Size())
{
// col <op> const
child_with_scalar_ident = 0;
}
else if (0 ==(*curr_expr)[0]->DeriveUsedColumns()->Size())
{
// const <op> col
child_with_scalar_ident = 1;
}
else
{
// give up for now, both children reference a column,
// e.g. col1 <op> col2
return false;
}
break;
}
default:
// anything else we see is considered non-NDV-preserving
return false;
}
curr_expr = (*curr_expr)[child_with_scalar_ident];
}
}
......
......@@ -182,7 +182,8 @@ CLogicalDifference::PstatsDerive
exprhdl,
pexprScCond,
output_colrefsets,
outer_refs
outer_refs,
true // is an LASJ
);
IStatistics *LASJ_stats = outer_stats->CalcLASJoinStats
(
......
......@@ -179,7 +179,8 @@ CLogicalDifferenceAll::PstatsDerive
exprhdl,
pexprScCond,
output_colrefsets,
outer_refs
outer_refs,
true // is an LASJ
);
IStatistics *LASJ_stats = outer_stats->CalcLASJoinStats
(
......
......@@ -200,7 +200,8 @@ CLogicalIntersectAll::PstatsDerive
exprhdl,
pexprScCond,
output_colrefsets,
outer_refs
outer_refs,
true // is a semi-join
);
IStatistics *pstatsSemiJoin = CLogicalLeftSemiJoin::PstatsDerive(mp, join_preds_stats, outer_stats, inner_side_stats);
......
......@@ -149,7 +149,7 @@ CLogicalLeftAntiSemiJoin::PstatsDerive
GPOS_ASSERT(Esp(exprhdl) > EspNone);
IStatistics *outer_stats = exprhdl.Pstats(0);
IStatistics *inner_side_stats = exprhdl.Pstats(1);
CStatsPredJoinArray *join_preds_stats = CStatsPredUtils::ExtractJoinStatsFromExprHandle(mp, exprhdl);
CStatsPredJoinArray *join_preds_stats = CStatsPredUtils::ExtractJoinStatsFromExprHandle(mp, exprhdl, true /*LASJ*/);
IStatistics *pstatsLASJoin = outer_stats->CalcLASJoinStats
(
mp,
......
......@@ -171,7 +171,7 @@ CLogicalLeftSemiJoin::PstatsDerive
GPOS_ASSERT(Esp(exprhdl) > EspNone);
IStatistics *outer_stats = exprhdl.Pstats(0);
IStatistics *inner_side_stats = exprhdl.Pstats(1);
CStatsPredJoinArray *join_preds_stats = CStatsPredUtils::ExtractJoinStatsFromExprHandle(mp, exprhdl);
CStatsPredJoinArray *join_preds_stats = CStatsPredUtils::ExtractJoinStatsFromExprHandle(mp, exprhdl, true/*semi-join*/);
IStatistics *pstatsSemiJoin = PstatsDerive(mp, join_preds_stats, outer_stats, inner_side_stats);
join_preds_stats->Release();
......
......@@ -60,6 +60,8 @@ namespace gpdxl
// function strictness (i.e. whether func returns NULL on NULL input)
BOOL m_is_strict;
BOOL m_is_ndv_preserving;
// private copy ctor
CParseHandlerMDGPDBFunc(const CParseHandlerMDGPDBFunc &);
......
......@@ -65,6 +65,9 @@ namespace gpdxl
IMDId *m_mdid_hash_opfamily;
IMDId *m_mdid_legacy_hash_opfamily;
// preserves NDVs of inputs
BOOL m_is_ndv_preserving;
// private copy ctor
CParseHandlerMDGPDBScalarOp(const CParseHandlerMDGPDBScalarOp &);
......
......@@ -573,6 +573,7 @@ namespace gpdxl
EdxltokenCmpOther,
EdxltokenReturnsNullOnNullInput,
EdxltokenIsNDVPreserving,
EdxltokenTriggers,
EdxltokenTrigger,
......@@ -598,6 +599,7 @@ namespace gpdxl
EdxltokenGPDBFuncResultTypeId,
EdxltokenGPDBFuncReturnsSet,
EdxltokenGPDBFuncStrict,
EdxltokenGPDBFuncNDVPreserving,
EdxltokenGPDBCast,
EdxltokenGPDBCastBinaryCoercible,
......
......@@ -50,7 +50,7 @@ namespace gpmd
IMDId *m_mdid_type_result;
// output argument types
IMdIdArray *m_mdid_types_array;
IMdIdArray *m_mdid_types_array;
// whether function returns a set of values
BOOL m_returns_set;
......@@ -64,6 +64,10 @@ namespace gpmd
// function strictness (i.e. whether func returns NULL on NULL input)
BOOL m_is_strict;
// function result has very similar number of distinct values as the
// single function argument (used for cardinality estimation)
BOOL m_is_ndv_preserving;
// dxl token array for stability
Edxltoken m_dxl_func_stability_array[EfsSentinel];
......@@ -97,7 +101,8 @@ namespace gpmd
BOOL ReturnsSet,
EFuncStbl func_stability,
EFuncDataAcc func_data_access,
BOOL is_strict
BOOL is_strict,
BOOL is_ndv_preserving
);
virtual
......@@ -133,6 +138,12 @@ namespace gpmd
return m_is_strict;
}
virtual
BOOL IsNDVPreserving() const
{
return m_is_ndv_preserving;
}
// function stability
virtual
EFuncStbl GetFuncStability() const
......
......@@ -71,7 +71,7 @@ namespace gpmd
// does operator return NULL when all inputs are NULL?
BOOL m_returns_null_on_null_input;
// operator classes this operator belongs to
IMdIdArray *m_mdid_opfamilies_array;
......@@ -81,6 +81,10 @@ namespace gpmd
// compatible legacy hash op family using legacy (cdbhash) opclass
IMDId *m_mdid_legacy_hash_opfamily;
// does operator preserve the NDV of its input(s)
// (used for cardinality estimation)
BOOL m_is_ndv_preserving;
CMDScalarOpGPDB(const CMDScalarOpGPDB &);
public:
......@@ -101,7 +105,8 @@ namespace gpmd
BOOL returns_null_on_null_input,
IMdIdArray *mdid_opfamilies_array,
IMDId *m_mdid_hash_opfamily,
IMDId *mdid_legacy_hash_opfamily
IMDId *mdid_legacy_hash_opfamily,
BOOL is_ndv_preserving
);
~CMDScalarOpGPDB();
......@@ -155,6 +160,10 @@ namespace gpmd
virtual
BOOL ReturnsNullOnNullInput() const;
// preserves NDVs of its inputs?
virtual
BOOL IsNDVPreserving() const;
// comparison type
virtual
IMDType::ECmpType ParseCmpType() const;
......
......@@ -65,6 +65,10 @@ namespace gpmd
virtual
BOOL IsStrict() const = 0;
// does function preserve NDVs of input (for cardinality estimation)
virtual
BOOL IsNDVPreserving() const = 0;
// does function return a set of values
virtual
BOOL ReturnsSet() const = 0;
......
......@@ -75,6 +75,10 @@ namespace gpmd
virtual
BOOL ReturnsNullOnNullInput() const = 0;
// preserves NDVs of its inputs?
virtual
BOOL IsNDVPreserving() const = 0;
virtual
IMDType::ECmpType ParseCmpType() const = 0;
......
......@@ -55,9 +55,8 @@ namespace gpnaucrates
EstatscmptINDF, // is not distinct from
EstatscmptLike, // LIKE predicate comparison
EstatscmptNotLike, // NOT LIKE predicate comparison
// NDV comparision for equality predicate on columns with functions, ex f(a) = b or a = f(b)
EstatscmptEqNDVOuter, // use Outer NDV on inner side also
EstatscmptEqNDVInner, // use Inner NDV on outer side also
// NDV comparison for equality predicate on columns with functions, ex f(a) = b or a = f(b)
EstatscmptEqNDV,
EstatscmptOther
};
......
......@@ -64,6 +64,11 @@ namespace gpnaucrates
{}
// accessors
BOOL HasValidColIdOuter() const
{
return gpos::ulong_max != m_colidOuter;
}
ULONG ColIdOuter() const
{
return m_colidOuter;
......@@ -75,6 +80,11 @@ namespace gpnaucrates
return m_stats_cmp_type;
}
BOOL HasValidColIdInner() const
{
return gpos::ulong_max != m_colidInner;
}
ULONG ColIdInner() const
{
return m_colidInner;
......
......@@ -140,32 +140,40 @@ namespace gpopt
static
CStatsPred::EStatsCmpType GetStatsCmpType(IMDId *mdid);
// derive whether it is EstatscmptEqNDVInner or EstatscmptEqNDVOuter
static
CStatsPred::EStatsCmpType DeriveStatCmpEqNDVType ( ULONG left_index, ULONG right_index, BOOL left_is_null, BOOL right_is_null);
// helper function to extract statistics join filter from a given join predicate
static
CStatsPredJoin *ExtractJoinStatsFromJoinPred
(
CMemoryPool *mp,
CExpression *join_predicate_expr,
CColRefSetArray *join_output_col_refset, // array of output columns of join's relational inputs
CColRefSetArray *join_output_col_refset, // array of output columns of join's relational inputs
CColRefSet *outer_refs,
BOOL is_semi_or_anti_join,
CExpressionArray *unsupported_predicates_expr
);
// is the expression a comparison of scalar idents (or casted scalar idents).
// If so, extract relevant info.
// Is the expression a comparison of scalar idents (or casted scalar idents),
// or of other supported expressions? If so, extract relevant info.
static
BOOL IsPredCmpColsOrIgnoreCast
BOOL IsJoinPredSupportedForStatsEstimation
(
CExpression *expr,
const CColRef **col_ref1,
CColRefSetArray *output_col_refsets, // array of output columns of join's relational inputs
BOOL is_semi_or_anti_join,
CStatsPred::EStatsCmpType *stats_pred_cmp_type,
const CColRef **col_ref2,
BOOL &left_is_null,
BOOL &right_is_null
const CColRef **col_ref_outer,
const CColRef **col_ref_inner
);
// find out which input expression refers only to the inner table and which
// refers only to the outer table, and return accordingly
static BOOL AssignExprsToOuterAndInner
(
CColRefSetArray *output_col_refsets, // array of output columns of join's relational inputs
CExpression *expr_1,
CExpression *expr_2,
CExpression **outer_expr,
CExpression **inner_expr
);
public:
......@@ -180,14 +188,20 @@ namespace gpopt
(
CMemoryPool *mp,
CExpression *scalar_expr,
CColRefSetArray *output_col_refset, // array of output columns of join's relational inputs
CColRefSetArray *output_col_refset, // array of output columns of join's relational inputs
CColRefSet *outer_refs,
BOOL is_semi_or_anti_join,
CStatsPred **unsupported_pred_stats
);
// helper function to extract array of statistics join filter from an expression handle
static
CStatsPredJoinArray *ExtractJoinStatsFromExprHandle(CMemoryPool *mp, CExpressionHandle &expr_handle);
CStatsPredJoinArray *ExtractJoinStatsFromExprHandle
(
CMemoryPool *mp,
CExpressionHandle &expr_handle,
BOOL is_semi_or_anti_join
);
// helper function to extract array of statistics join filter from an expression
static
......@@ -197,7 +211,8 @@ namespace gpopt
CExpressionHandle &expr_handle,
CExpression *scalar_expression,
CColRefSetArray *output_col_refset,
CColRefSet *outer_refs
CColRefSet *outer_refs,
BOOL is_semi_or_anti_join
);
// is the predicate a conjunctive or disjunctive predicate
......
......@@ -38,7 +38,8 @@ CMDFunctionGPDB::CMDFunctionGPDB
BOOL ReturnsSet,
EFuncStbl func_stability,
EFuncDataAcc func_data_access,
BOOL is_strict
BOOL is_strict,
BOOL is_ndv_preserving
)
:
m_mp(mp),
......@@ -49,7 +50,8 @@ CMDFunctionGPDB::CMDFunctionGPDB
m_returns_set(ReturnsSet),
m_func_stability(func_stability),
m_func_data_access(func_data_access),
m_is_strict(is_strict)
m_is_strict(is_strict),
m_is_ndv_preserving(is_ndv_preserving)
{
GPOS_ASSERT(m_mdid->IsValid());
GPOS_ASSERT(EfsSentinel > func_stability);
......@@ -228,6 +230,7 @@ CMDFunctionGPDB::Serialize
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenGPDBFuncStability), GetFuncStabilityStr());
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenGPDBFuncDataAccess), GetFuncDataAccessStr());
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenGPDBFuncStrict), m_is_strict);
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenGPDBFuncNDVPreserving), m_is_ndv_preserving);
SerializeMDIdAsElem(xml_serializer, CDXLTokens::GetDXLTokenStr(EdxltokenGPDBFuncResultTypeId), m_mdid_type_result);
......
......@@ -43,7 +43,8 @@ CMDScalarOpGPDB::CMDScalarOpGPDB
BOOL returns_null_on_null_input,
IMdIdArray *mdid_opfamilies_array,
IMDId *mdid_hash_opfamily,
IMDId *mdid_legacy_hash_opfamily
IMDId *mdid_legacy_hash_opfamily,
BOOL is_ndv_preserving
)
:
m_mp(mp),
......@@ -59,7 +60,8 @@ CMDScalarOpGPDB::CMDScalarOpGPDB
m_returns_null_on_null_input(returns_null_on_null_input),
m_mdid_opfamilies_array(mdid_opfamilies_array),
m_mdid_hash_opfamily(mdid_hash_opfamily),
m_mdid_legacy_hash_opfamily(mdid_legacy_hash_opfamily)
m_mdid_legacy_hash_opfamily(mdid_legacy_hash_opfamily),
m_is_ndv_preserving(is_ndv_preserving)
{
GPOS_ASSERT(NULL != mdid_opfamilies_array);
m_dxl_str = CDXLUtils::SerializeMDObj(m_mp, this, false /*fSerializeHeader*/, false /*indentation*/);
......@@ -236,6 +238,12 @@ CMDScalarOpGPDB::ReturnsNullOnNullInput() const
}
BOOL
CMDScalarOpGPDB::IsNDVPreserving() const
{
return m_is_ndv_preserving;
}
//---------------------------------------------------------------------------
// @function:
// CMDScalarOpGPDB::ParseCmpType
......@@ -272,6 +280,7 @@ CMDScalarOpGPDB::Serialize
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenName), m_mdname->GetMDName());
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenGPDBScalarOpCmpType), IMDType::GetCmpTypeStr(m_comparision_type));
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenReturnsNullOnNullInput), m_returns_null_on_null_input);
xml_serializer->AddAttribute(CDXLTokens::GetDXLTokenStr(EdxltokenIsNDVPreserving), m_is_ndv_preserving);
Edxltoken dxl_token_array[8] = {
EdxltokenGPDBScalarOpLeftTypeId, EdxltokenGPDBScalarOpRightTypeId,
......
......@@ -105,6 +105,17 @@ CParseHandlerMDGPDBFunc::StartElement
EdxltokenGPDBFunc
);
// parse whether func is NDV-preserving
m_is_ndv_preserving = CDXLOperatorFactory::ExtractConvertAttrValueToBool
(
m_parse_handler_mgr->GetDXLMemoryManager(),
attrs,
EdxltokenGPDBFuncNDVPreserving,
EdxltokenGPDBFunc,
true, // optional
false // default is false
);
// parse func stability property
const XMLCh *xmlszStbl = CDXLOperatorFactory::ExtractAttrValue
(
......@@ -190,7 +201,8 @@ CParseHandlerMDGPDBFunc::EndElement
m_returns_set,
m_func_stability,
m_func_data_access,
m_is_strict);
m_is_strict,
m_is_ndv_preserving);
// deactivate handler
m_parse_handler_mgr->DeactivateHandler();
......
......@@ -53,7 +53,8 @@ CParseHandlerMDGPDBScalarOp::CParseHandlerMDGPDBScalarOp
m_comparision_type(IMDType::EcmptOther),
m_returns_null_on_null_input(false),
m_mdid_hash_opfamily(NULL),
m_mdid_legacy_hash_opfamily(NULL)
m_mdid_legacy_hash_opfamily(NULL),
m_is_ndv_preserving(false)
{
}
......@@ -122,6 +123,17 @@ CParseHandlerMDGPDBScalarOp::StartElement
);
}
// ndv-preserving property is optional
m_is_ndv_preserving = CDXLOperatorFactory::ExtractConvertAttrValueToBool
(
m_parse_handler_mgr->GetDXLMemoryManager(),
attrs,
EdxltokenIsNDVPreserving,
EdxltokenGPDBScalarOp,
true, // is optional
false // default value
);
}
else if (0 == XMLString::compareString(CDXLTokens::XmlstrToken(EdxltokenGPDBScalarOpLeftTypeId), element_local_name))
{
......@@ -292,7 +304,8 @@ CParseHandlerMDGPDBScalarOp::EndElement
m_returns_null_on_null_input,
mdid_opfamilies_array,
m_mdid_hash_opfamily,
m_mdid_legacy_hash_opfamily
m_mdid_legacy_hash_opfamily,
m_is_ndv_preserving
)
;
......
......@@ -216,6 +216,7 @@ CJoinStatsProcessor::CalcAllJoinStats
join_preds_available,
output_colrefsets,
outer_refs,
is_a_left_join, // left joins use an anti-semijoin internally
&unsupported_pred_stats
);
......@@ -307,8 +308,11 @@ CJoinStatsProcessor::SetResultingJoinStats
{
CStatsPredJoin *join_stats = (*join_pred_stats_info)[i];
(void) join_colids->ExchangeSet(join_stats->ColIdOuter());
if (!semi_join)
if (join_stats->HasValidColIdOuter())
{
(void) join_colids->ExchangeSet(join_stats->ColIdOuter());
}
if (!semi_join && join_stats->HasValidColIdInner())
{
(void) join_colids->ExchangeSet(join_stats->ColIdInner());
}
......@@ -331,30 +335,43 @@ CJoinStatsProcessor::SetResultingJoinStats
for (ULONG i = 0; i < num_join_conds; i++)
{
CStatsPredJoin *pred_info = (*join_pred_stats_info)[i];
CStatsPred::EStatsCmpType stats_cmp_type = pred_info->GetCmpType();
ULONG colid1 = pred_info->ColIdOuter();
ULONG colid2 = pred_info->ColIdInner();
GPOS_ASSERT(colid1 != colid2);
// find the histograms corresponding to the two columns
const CHistogram *outer_histogram = outer_stats->GetHistogram(colid1);
// are column id1 and 2 always in the order of outer inner?
const CHistogram *inner_histogram = inner_side_stats->GetHistogram(colid2);
GPOS_ASSERT(NULL != outer_histogram);
GPOS_ASSERT(NULL != inner_histogram);
const CHistogram *outer_histogram = NULL;
const CHistogram *inner_histogram = NULL;
BOOL is_input_empty = CStatistics::IsEmptyJoin(outer_stats, inner_side_stats, IsLASJ);
CDouble local_scale_factor(1.0);
CHistogram *outer_histogram_after = NULL;
CHistogram *inner_histogram_after = NULL;
// find the histograms corresponding to the two columns
// are column id1 and 2 always in the order of outer inner?
if (pred_info->HasValidColIdOuter())
{
outer_histogram = outer_stats->GetHistogram(colid1);
GPOS_ASSERT(NULL != outer_histogram);
}
if (pred_info->HasValidColIdInner())
{
inner_histogram = inner_side_stats->GetHistogram(colid2);
GPOS_ASSERT(NULL != inner_histogram);
}
// When we have any form of equi join with join condition of type f(a)=b,
// we calculate the NDV of such a join as NDV(b) ( from Selinger et al.)
if (CStatsPred::EstatscmptEqNDVOuter == stats_cmp_type)
if (NULL == outer_histogram)
{
inner_histogram = outer_histogram;
GPOS_ASSERT(CStatsPred::EstatscmptEqNDV == pred_info->GetCmpType());
outer_histogram = inner_histogram;
colid1 = colid2;
}
else if (CStatsPred::EstatscmptEqNDVInner == stats_cmp_type)
else if (NULL == inner_histogram)
{
outer_histogram = inner_histogram;
GPOS_ASSERT(CStatsPred::EstatscmptEqNDV == pred_info->GetCmpType());
inner_histogram = outer_histogram;
colid2 = colid1;
}
JoinHistograms
......@@ -377,7 +394,7 @@ CJoinStatsProcessor::SetResultingJoinStats
output_is_empty = JoinStatsAreEmpty(outer_stats->IsEmpty(), output_is_empty, outer_histogram, inner_histogram, outer_histogram_after, join_type);
CStatisticsUtils::AddHistogram(mp, colid1, outer_histogram_after, result_col_hist_mapping);
if (!semi_join)
if (!semi_join && colid1 != colid2)
{
CStatisticsUtils::AddHistogram(mp, colid2, inner_histogram_after, result_col_hist_mapping);
}
......@@ -385,6 +402,7 @@ CJoinStatsProcessor::SetResultingJoinStats
GPOS_DELETE(outer_histogram_after);
GPOS_DELETE(inner_histogram_after);
// remember which tables the columns came from, this info is used to combine scale factors
CColumnFactory *col_factory = COptCtxt::PoctxtFromTLS()->Pcf();
CColRef *colref_outer = col_factory->LookupColRef(colid1);
......@@ -401,6 +419,9 @@ CJoinStatsProcessor::SetResultingJoinStats
// there should only be two tables involved in a join condition
// if the predicate is more complex (i.e. more than 2 tables involved in the predicate such as t1.a=t2.a+t3.a),
// the mdid of the base table will be NULL:
// Note that we hash on the pointer to the Mdid, not the value of the Mdid,
// but we know that CColRef::GetMdidTable() will always return the same
// pointer for a given table.
mdid_pair = GPOS_NEW(mp) IMdIdArray(mp, 2);
mdid_outer->AddRef();
mdid_inner->AddRef();
......
......@@ -97,11 +97,14 @@ CLeftOuterJoinStatsProcessor::MakeLOJHistogram
GPOS_ASSERT(NULL != inner_join_stats);
// build a bitset with all outer child columns contributing to the join
CBitSet *outer_side_cols = GPOS_NEW(mp) CBitSet(mp);
CBitSet *outer_side_join_cols = GPOS_NEW(mp) CBitSet(mp);
for (ULONG j = 0; j < join_preds_stats->Size(); j++)
{
CStatsPredJoin *join_stats = (*join_preds_stats)[j];
(void) outer_side_cols->ExchangeSet(join_stats->ColIdOuter());
if (join_stats->HasValidColIdOuter())
{
(void) outer_side_join_cols->ExchangeSet(join_stats->ColIdOuter());
}
}
// for the columns in the outer child, compute the buckets that do not contribute to the inner join
......@@ -129,7 +132,7 @@ CLeftOuterJoinStatsProcessor::MakeLOJHistogram
const CHistogram *inner_join_histogram = inner_join_stats->GetHistogram(colid);
GPOS_ASSERT(NULL != inner_join_histogram);
if (outer_side_cols->Get(colid))
if (outer_side_join_cols->Get(colid))
{
// add buckets from the outer histogram that do not contribute to the inner join
const CHistogram *LASJ_histogram = LASJ_stats->GetHistogram(colid);
......@@ -167,7 +170,7 @@ CLeftOuterJoinStatsProcessor::MakeLOJHistogram
// clean up
inner_colids_with_stats->Release();
outer_colids_with_stats->Release();
outer_side_cols->Release();
outer_side_join_cols->Release();
return LOJ_histograms;
}
......
......@@ -34,8 +34,11 @@ CLeftSemiJoinStatsProcessor::CalcLSJoinStatsStatic
ULongPtrArray *inner_colids = GPOS_NEW(mp) ULongPtrArray(mp);
for (ULONG ul = 0; ul < length; ul++)
{
ULONG colid = ((*join_preds_stats)[ul])->ColIdInner();
inner_colids->Append(GPOS_NEW(mp) ULONG(colid));
if ((*join_preds_stats)[ul]->HasValidColIdInner())
{
ULONG colid = ((*join_preds_stats)[ul])->ColIdInner();
inner_colids->Append(GPOS_NEW(mp) ULONG(colid));
}
}
// dummy agg columns required for group by derivation
......
......@@ -1180,6 +1180,7 @@ CStatisticsUtils::DeriveStatsForDynamicScan
scalar_expr,
output_colrefs,
outer_refs,
true, // semi-join
&unsupported_pred_stats
);
......@@ -1863,9 +1864,7 @@ CStatisticsUtils::IsStatsCmpTypeNdvEq
CStatsPred::EStatsCmpType stats_cmp_type
)
{
return (CStatsPred::EstatscmptEqNDVOuter == stats_cmp_type ||
CStatsPred::EstatscmptEqNDVInner == stats_cmp_type
);
return (CStatsPred::EstatscmptEqNDV == stats_cmp_type);
}
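
For readers following the change to `IsStatsCmpTypeNdvEq`, here is a minimal sketch of the NDV-based estimate described in the commit message's table. It is not ORCA's implementation. The function name, the parameters, and the "NDV known" flags are hypothetical, and a side counts as "known" only when it is a plain column or an NDV-preserving expression on one.

```cpp
// Hypothetical sketch of the NDV-based join cardinality rules from the
// commit message; none of these names exist in ORCA.
//
// rows_r, rows_s : input cardinalities |R| and |S|
// ndv1, ndv2     : NDVs of the column referenced on each side
// known1, known2 : true if that side is a plain column or an
//                  NDV-preserving expression, so its NDV can be trusted
constexpr double MaxSketch(double a, double b)
{
    return a > b ? a : b;
}

constexpr double EstimateJoinCardSketch(double rows_r, double rows_s,
                                        double ndv1, bool known1,
                                        double ndv2, bool known2)
{
    return (known1 && known2)
               // both sides usable: |R| * |S| / max(NDV(col1), NDV(col2))
               ? rows_r * rows_s / MaxSketch(ndv1, ndv2)
           : known2
               // only the second side usable: |R| * |S| / NDV(col2)
               ? rows_r * rows_s / ndv2
           : known1
               // sides switched: |R| * |S| / NDV(col1)
               ? rows_r * rows_s / ndv1
               // unsupported predicate: default selectivity of 0.4
               : rows_r * rows_s * 0.4;
}
```

For example, with |R| = 1000, |S| = 100, NDV(col1) = 20 and NDV(col2) = 50, the predicate `p(col1) = col2` divides by max(20, 50), while `f(col1) = col2` divides by NDV(col2) alone.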
//---------------------------------------------------------------------------
// @function:
......
......@@ -613,6 +613,7 @@ CDXLTokens::Init
{EdxltokenCmpOther, GPOS_WSZ_LIT("Other")},
{EdxltokenReturnsNullOnNullInput, GPOS_WSZ_LIT("ReturnsNullOnNullInput")},
{EdxltokenIsNDVPreserving, GPOS_WSZ_LIT("IsNDVPreserving")},
{EdxltokenTriggers, GPOS_WSZ_LIT("Triggers")},
{EdxltokenTrigger, GPOS_WSZ_LIT("Trigger")},
......@@ -638,7 +639,8 @@ CDXLTokens::Init
{EdxltokenGPDBFuncResultTypeId, GPOS_WSZ_LIT("ResultType")},
{EdxltokenGPDBFuncReturnsSet, GPOS_WSZ_LIT("ReturnsSet")},
{EdxltokenGPDBFuncStrict, GPOS_WSZ_LIT("IsStrict")},
{EdxltokenGPDBFuncNDVPreserving, GPOS_WSZ_LIT("IsNDVPreserving")},
{EdxltokenGPDBAgg, GPOS_WSZ_LIT("GPDBAgg")},
{EdxltokenGPDBIsAggOrdered, GPOS_WSZ_LIT("IsOrdered")},
{EdxltokenGPDBAggResultTypeId, GPOS_WSZ_LIT("ResultType")},
......
......@@ -141,7 +141,7 @@ SingleColumnHomogenousIndexOnRoot-AO SingleColumnHomogenousIndexOnRoot-HEAP;
CStatsTest:
Stat-Derivation-Leaf-Pattern MissingBoolColStats JoinColWithOnlyNDV UnsupportedStatsPredicate
StatsFilter-AnyWithNewColStats;
StatsFilter-AnyWithNewColStats EquiJoinOnExpr-Supported EquiJoinOnExpr-Unsupported;
CICGMiscTest:
BroadcastSkewedHashjoin OrderByNullsFirst ConvertHashToRandomSelect ConvertHashToRandomInsert HJN-DeeperOuter CTAS CTAS-Random CheckAsUser
......
......@@ -536,6 +536,7 @@ DATA(insert OID = 643 ( "<>" PGNSP PGUID b f f 19 19 16 643 93 namene neqsel
DESCR("not equal");
DATA(insert OID = 654 ( "||" PGNSP PGUID b f f 25 25 25 0 0 textcat - - ));
DESCR("concatenate");
#define OIDTextConcatenateOperator 654
DATA(insert OID = 660 ( "<" PGNSP PGUID b f f 19 19 16 662 663 namelt scalarltsel scalarltjoinsel ));
DESCR("less than");
......
......@@ -1912,8 +1912,10 @@ DATA(insert OID = 868 ( strpos PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2
DESCR("position of substring");
DATA(insert OID = 870 ( lower PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ lower _null_ _null_ _null_ ));
DESCR("lowercase");
#define LOWER_OID 870
DATA(insert OID = 871 ( upper PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ upper _null_ _null_ _null_ ));
DESCR("uppercase");
#define UPPER_OID 871
DATA(insert OID = 872 ( initcap PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ initcap _null_ _null_ _null_ ));
DESCR("capitalize each word");
DATA(insert OID = 873 ( lpad PGNSP PGUID 12 1 0 0 0 f f f f t f i s 3 0 25 "25 23 25" _null_ _null_ _null_ _null_ _null_ lpad _null_ _null_ _null_ ));
......@@ -1936,14 +1938,17 @@ DATA(insert OID = 880 ( rpad PGNSP PGUID 14 1 0 0 0 f f f f t f i s 2 0 25
DESCR("right-pad string to length");
DATA(insert OID = 881 ( ltrim PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ ltrim1 _null_ _null_ _null_ ));
DESCR("trim spaces from left end of string");
#define LTRIM_SPACE_OID 881
DATA(insert OID = 882 ( rtrim PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ rtrim1 _null_ _null_ _null_ ));
DESCR("trim spaces from right end of string");
#define RTRIM_SPACE_OID 882
DATA(insert OID = 883 ( substr PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 25 "25 23" _null_ _null_ _null_ _null_ _null_ text_substr_no_len _null_ _null_ _null_ ));
DESCR("extract portion of string");
DATA(insert OID = 884 ( btrim PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 25 "25 25" _null_ _null_ _null_ _null_ _null_ btrim _null_ _null_ _null_ ));
DESCR("trim selected characters from both ends of string");
DATA(insert OID = 885 ( btrim PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ btrim1 _null_ _null_ _null_ ));
DESCR("trim spaces from both ends of string");
#define BTRIM_SPACE_OID 885
DATA(insert OID = 936 ( substring PGNSP PGUID 12 1 0 0 0 f f f f t f i s 3 0 25 "25 23 23" _null_ _null_ _null_ _null_ _null_ text_substr _null_ _null_ _null_ ));
DESCR("extract portion of string");
......
......@@ -204,6 +204,9 @@ namespace gpdb {
// is the given function strict
bool FuncStrict(Oid funcid);
// does this preserve the NDVs of its inputs?
bool IsFuncNDVPreserving(Oid funcid);
// stability property of given function
char FuncStability(Oid funcid);
......@@ -480,6 +483,9 @@ namespace gpdb {
// is the given operator strict
bool IsOpStrict(Oid opno);
// does it preserve the NDVs of its inputs
bool IsOpNDVPreserving(Oid opno);
// get input types for a given operator
void GetOpInputTypes(Oid opno, Oid *lefttype, Oid *righttype);
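
The two declarations above are not accompanied by their definitions in this excerpt. As a rough sketch only, assuming the translator simply checks the OIDs this commit adds `#define`s for, the checks might look like the following. The `...Sketch` function names are hypothetical, and the constants mirror the macro values shown in the catalog hunks.

```cpp
// Hypothetical sketch; the real gpdb::IsFuncNDVPreserving and
// gpdb::IsOpNDVPreserving definitions are not shown in this diff.
// The OID values are copied from the catalog hunks of this commit.
constexpr unsigned int LOWER_OID = 870;        // lower(text)
constexpr unsigned int UPPER_OID = 871;        // upper(text)
constexpr unsigned int LTRIM_SPACE_OID = 881;  // ltrim(text)
constexpr unsigned int RTRIM_SPACE_OID = 882;  // rtrim(text)
constexpr unsigned int BTRIM_SPACE_OID = 885;  // btrim(text)
constexpr unsigned int OIDTextConcatenateOperator = 654;  // text || text

// Returns true for the functions the commit message lists as
// NDV-preserving (lower, upper, and the single-argument trim variants).
constexpr bool IsFuncNDVPreservingSketch(unsigned int funcid)
{
    return funcid == LOWER_OID || funcid == UPPER_OID ||
           funcid == LTRIM_SPACE_OID || funcid == RTRIM_SPACE_OID ||
           funcid == BTRIM_SPACE_OID;
}

// Returns true for text concatenation. The commit message only treats
// "col || const" as NDV-preserving; checking that the other argument is
// a constant is left out of this sketch.
constexpr bool IsOpNDVPreservingSketch(unsigned int opno)
{
    return opno == OIDTextConcatenateOperator;
}
```

Under this sketch, `substring` (OID 936) and the two-argument `btrim` (OID 884) are not NDV-preserving, which matches the commit message's remark that functions such as `substring(bar.b, 1, 1)` are excluded.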
......
......@@ -165,6 +165,7 @@ namespace gpdxl
IMDFunction::EFuncStbl *stability, // output: function stability
IMDFunction::EFuncDataAcc *access, // output: function data access
BOOL *is_strict, // output: is function strict?
BOOL *is_ndv_preserving, // output: preserves NDVs of inputs
BOOL *ReturnsSet // output: does function return set?
);
......
......@@ -12264,32 +12264,35 @@ WHERE L1.lid = int4in(unknownout(meta.load_id));
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy entry.
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..437.37 rows=134 width=8)
Result (cost=0.00..431.10 rows=1 width=8)
Output: c, lid
-> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.12 rows=134 width=8)
-> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.08 rows=1 width=8)
Output: c, lid
-> HashAggregate (cost=0.00..431.12 rows=134 width=8)
-> GroupAggregate (cost=0.00..431.08 rows=1 width=8)
Output: c, lid
Group Key: t55.c, t55.lid
-> Hash Join (cost=0.00..431.08 rows=134 width=8)
-> Sort (cost=0.00..431.08 rows=1 width=8)
Output: c, lid
Hash Cond: (t55.lid = int4in(unknownout(('99'))))
-> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.02 rows=334 width=8)
Sort Key: t55.c, t55.lid
-> Hash Join (cost=0.00..431.08 rows=1 width=8)
Output: c, lid
Hash Key: lid
-> Seq Scan on orca.t55 (cost=0.00..431.01 rows=334 width=8)
Hash Cond: (t55.lid = int4in(unknownout(('99'))))
-> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.02 rows=334 width=8)
Output: c, lid
-> Hash (cost=0.00..0.00 rows=1 width=8)
Output: ('99')
-> Result (cost=0.00..0.00 rows=1 width=8)
Hash Key: lid
-> Seq Scan on orca.t55 (cost=0.00..431.01 rows=334 width=8)
Output: c, lid
-> Hash (cost=0.00..0.00 rows=1 width=8)
Output: ('99')
-> Result (cost=0.00..0.00 rows=1 width=8)
Output: ('99'), int4in(unknownout(('99')))
-> Result (cost=0.00..0.00 rows=1 width=1)
Output: '99'
Output: ('99')
-> Result (cost=0.00..0.00 rows=1 width=8)
Output: ('99'), int4in(unknownout(('99')))
-> Result (cost=0.00..0.00 rows=1 width=1)
Output: '99'
Optimizer: Pivotal Optimizer (GPORCA)
Settings: optimizer=on, optimizer_cte_inlining_bound=1000, optimizer_join_order=query, optimizer_metadata_caching=on
(25 rows)
(28 rows)
CREATE TABLE TP AS
WITH META AS (SELECT '2020-01-01' AS VALID_DT, '99' AS LOAD_ID)
......
......@@ -4246,23 +4246,26 @@ select * from
(tenk1 as a1 full join (select 1 as id) as yy on (a1.unique1 = yy.id))
on (xx.id = coalesce(yy.id));
QUERY PLAN
------------------------------------------------------
Hash Left Join
Hash Cond: ((1) = COALESCE((1)))
-> Result
-> Hash
-> Gather Motion 3:1 (slice1; segments: 3)
-> Merge Full Join
Merge Cond: (unique1 = (1))
-> Sort
Sort Key: unique1
-> Seq Scan on tenk1
-> Sort
Sort Key: (1)
-> Result
------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3)
-> Hash Left Join
Hash Cond: ((1) = COALESCE((1)))
-> Result
-> Result
-> Hash
-> Redistribute Motion 3:3 (slice2; segments: 3)
Hash Key: COALESCE((1))
-> Merge Full Join
Merge Cond: (unique1 = (1))
-> Sort
Sort Key: unique1
-> Seq Scan on tenk1
-> Sort
Sort Key: (1)
-> Result
Optimizer: Pivotal Optimizer (GPORCA) version 3.83.0
(15 rows)
-> Result
Optimizer: Pivotal Optimizer (GPORCA)
(18 rows)
select * from
(select 1 as id) as xx
......