Unverified commit 9363718d authored by Hans Zeller, committed by GitHub

Experimental cost model update (port from 6X) (#11115)

This is a cherry-pick of the change from PR https://github.com/greenplum-db/gporca/pull/607

Avoid costing change for IN predicates on btree indexes

Commit e5f1716 changed the way we handle IN predicates on indexes: we
now use a more efficient array comparison instead of treating them like
an OR predicate. A side effect is that the cost function,
CCostModelGPDB::CostBitmapTableScan, now goes through a different code
path, using the "small NDV" or "large NDV" costing method. This produces
very high cost estimates once the NDV increases beyond 2, so we
basically never choose an index for these cases, even though a btree
index used in a bitmap scan isn't very sensitive to the NDV.

To avoid this, we go back to the old formula we used before commit e5f1716.
The fix is restricted to IN predicates on btree indexes used in a bitmap
scan.

Add an MDP for a larger IN list, using a btree index on an AO table

Misc. changes to the calibration test program

- Added tests for btree indexes (btree_scan_tests).
- Changed the data distribution so that all column values range from 1...n.
- Parameter values for test queries are now proportional to selectivity;
  a parameter value of 0 produces a selectivity of 0%.
- Changed the logic for faking statistics somewhat; hopefully this will
  lead to more precise estimates. Incorporated the changes to the
  data distribution, which no longer contains 0 values. Added fake stats
  for unique columns.
- Headers of tests now use semicolons to separate parts, to give
  nicer output when pasting into Google Docs.
- Some formatting changes.
- Log fallbacks.
- When using existing tables, the program now determines the table
  structure (heap or append-only) and the row count.
- Split off two very slow tests into separate test units. These are
  not included when running "all" tests; they have to be run
  explicitly.
- Add btree join tests; rename "bitmap_join_tests" to "index_join_tests"
  and run both bitmap and btree joins.
- Update min and max parameter values to cover a range that includes,
  or at least is closer to, the cross-over between index and table scans.
- Remove the "high NDV" tests, since the ranges in the general tests
  now include both low and high NDV cases (<= and > 200).
- Print out the selectivity of each query, if available.
- Suppress standard deviation output when we execute queries only once.
- Set the search path when connecting.
- Decrease the parameter range when running bitmap scan tests on
  heap tables.
- Run btree scan tests only on AO tables; these tests are not designed
  for testing index scans.

Updates to the experimental cost model, new calibration

1. Simplify some of the formulas; the calibration process seemed to justify
   that. We might have to revisit this if problems come up. Changes:
   - Rewrite some of the formulas so that the costs per row and costs per byte
     are easier to see
   - Make the cost directly proportional to the width
   - Unify the formula for scans and joins: use the same per-byte costs
     and make NDV-dependent costs proportional to num_rebinds * dNDV,
     except for the logic in item 3.

   That makes the cost for the new experimental cost model a simple linear formula:

   num_rebinds * ( rows * c1 + rows * width * c2 + ndv * c3 + bitmap_union_cost + c4 ) + c5

   We have 5 constants, c1 ... c5, plus a bitmap_union_cost term:

   c1: cost per row (rows on one segment)
   c2: cost per byte
   c3: cost per distinct value (total NDV on all segments)
   c4: cost per rebind
   c5: initialization cost
   bitmap_union_cost: see item 3 below
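
   As a rough sketch of how this formula behaves (c1 and c2 take the values
   of c1_cost_per_row and c2_cost_per_byte from this patch, and c3 the new
   DBitmapPageCost default of 10; c4 and c5 are placeholders, not the
   calibrated values):

      # hypothetical helper, mirroring the linear formula above
      def experimental_bitmap_cost(num_rebinds, rows, width, ndv,
                                   bitmap_union_cost=0.0,
                                   c1=0.03,    # cost per row
                                   c2=0.0001,  # cost per byte
                                   c3=10.0,    # cost per distinct value
                                   c4=1.0,     # cost per rebind (placeholder)
                                   c5=431.0):  # initialization cost (placeholder)
          return (num_rebinds * (rows * c1 + rows * width * c2 +
                                 ndv * c3 + bitmap_union_cost + c4) + c5)

      # e.g. a single bind over 10000 rows of width 8 with an NDV of 5:
      print(experimental_bitmap_cost(1, 10000.0, 8.0, 5.0))  # 790.0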

2. Recalibrate some of the cost parameters, using the updated calibration
   program src/backend/gporca/scripts/cal_bitmap_test.py

3. Add a cost penalty for bitmap index scans on heap tables. The added
   cost takes the form bitmap_union_cost = <base table rows> * (NDV-1) * c6.

   The reason for this is, as others have pointed out, that heap tables
   lead to much larger bit vectors, since their CTIDs are more spaced out
   than those of AO tables. The main factor seems to be the cost of unioning
   these bit vectors, and that cost is proportional to the number of bitmaps
   minus one and the size of the bitmaps, which is approximated here by the
   number of rows in the table.

   Note that because we use (NDV-1) in the formula, this penalty does not
   apply to usual index joins, which have an NDV of 1 per rebind. This is
   consistent with what we see in measurements and it also seems reasonable,
   since we don't have to union bitmaps in this case.
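
   A minimal sketch of this penalty term (c6 corresponds to the
   bitmap_union_cost_per_distinct_value constant of 0.000027 in this patch;
   the function name is illustrative):

      def bitmap_union_cost(base_table_rows, ndv, is_ao_table, c6=0.000027):
          # no penalty for AO tables; the (ndv - 1) factor makes the term
          # vanish when NDV == 1, e.g. for the inner side of an index join
          if is_ao_table:
              return 0.0
          return max(0.0, ndv - 1.0) * base_table_rows * c6

      # e.g. a heap table with 1 million rows and an IN list of 10 values:
      print(bitmap_union_cost(1e6, 10.0, False))  # 243.0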

4. Fix to select CostModelGPDB for the 'experimental' model, as we do in 5X.

5. Calibrate the constants involved (c1 ... c6), using the calibration program
   and running experiments with heap and append-only tables on a laptop and
   also on a Linux cluster with 24 segments. Also run some other workloads
   for validation.

6. Give a small initial advantage to bitmap scans, so they will be chosen over
   table scans for small tables. Otherwise, small queries would
   have more or less random plans, all of which cost around 431, the value
   of the initial cost. Added a 10% advantage for the bitmap scan.
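
   As a sketch of the arithmetic (0.9 is the advantage factor added in this
   patch; 431 is the approximate initial cost mentioned above):

      init_cost_advantage_for_bitmap_scan = 0.9  # 10% advantage, from this patch
      approx_init_cost = 431.0                   # approximate initial cost cited above
      c5_init_scan = approx_init_cost * init_cost_advantage_for_bitmap_scan  # ~388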

* Port calibration program to Python 3

- Used the 2to3 program to do the basics.
- The version parameter in argparse is no longer supported.
- Needs an additional option in the connection string to keep the search path.
- The dbconn.execSQL call can no longer be used to get a cursor;
  this was probably a non-observable defect in the Python 2 version.
- Needed to use // (floor division) in some cases.
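
  For example, the floor-division issue looks like this (variable names here
  are illustrative, not taken from the script):

     # Python 2's "/" truncated for ints; Python 3's "/" returns a float,
     # so values used as counts or indexes need "//"
     total_rows = 1000
     num_steps = 3
     step = total_rows // num_steps  # was "total_rows / num_steps" in Python 2
     assert isinstance(step, int)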
Co-authored-by: David Kimura <dkimura@vmware.com>
Parent: bfcc63e1
......@@ -477,7 +477,7 @@ ICostModel *
COptTasks::GetCostModel(CMemoryPool *mp, ULONG num_segments)
{
ICostModel *cost_model = NULL;
-	if (OPTIMIZER_GPDB_CALIBRATED >= optimizer_cost_model)
+	if (optimizer_cost_model >= OPTIMIZER_GPDB_CALIBRATED)
{
cost_model = GPOS_NEW(mp) CCostModelGPDB(mp, num_segments);
}
......
......@@ -608,7 +608,7 @@
<dxl:Plan Id="0" SpaceSize="2">
<dxl:GatherMotion InputSegments="0,1,2" OutputSegments="-1">
<dxl:Properties>
-<dxl:Cost StartupCost="0" TotalCost="431.757244" Rows="983.264660" Width="4"/>
+<dxl:Cost StartupCost="0" TotalCost="400.107345" Rows="983.264660" Width="4"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="a">
......@@ -619,7 +619,7 @@
<dxl:SortingColumnList/>
<dxl:BitmapTableScan>
<dxl:Properties>
-<dxl:Cost StartupCost="0" TotalCost="431.740148" Rows="983.264660" Width="4"/>
+<dxl:Cost StartupCost="0" TotalCost="400.090249" Rows="983.264660" Width="4"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="a">
......
......@@ -26,6 +26,7 @@
#include "gpopt/operators/CPhysicalMotion.h"
#include "gpopt/operators/CPhysicalPartitionSelector.h"
#include "gpopt/operators/CPredicateUtils.h"
#include "gpopt/operators/CScalarBitmapIndexProbe.h"
#include "naucrates/statistics/CStatisticsUtils.h"
#include "gpopt/operators/CExpression.h"
#include "gpdbcost/CCostModelGPDB.h"
......@@ -1618,6 +1619,18 @@ CCostModelGPDB::CostBitmapTableScan(CMemoryPool *mp, CExpressionHandle &exprhdl,
CColRefSet *pcrsUsed = pexprIndexCond->DeriveUsedColumns();
CColRefSet *outerRefs = exprhdl.DeriveOuterReferences();
CColRefSet *pcrsLocalUsed = GPOS_NEW(mp) CColRefSet(mp, *pcrsUsed);
+	IMDIndex::EmdindexType indexType = IMDIndex::EmdindSentinel;
+
+	if (COperator::EopScalarBitmapIndexProbe == pexprIndexCond->Pop()->Eopid())
+	{
+		indexType = CScalarBitmapIndexProbe::PopConvert(pexprIndexCond->Pop())
+						->Pindexdesc()
+						->IndexType();
+	}
+
+	BOOL isInPredOnBtreeIndex =
+		(IMDIndex::EmdindBtree == indexType &&
+		 COperator::EopScalarArrayCmp == (*pexprIndexCond)[0]->Pop()->Eopid());
// subtract outer references from the used colrefs, so we can see
// how many colrefs are used for this table
......@@ -1632,9 +1645,17 @@ CCostModelGPDB::CostBitmapTableScan(CMemoryPool *mp, CExpressionHandle &exprhdl,
if (COperator::EopScalarBitmapIndexProbe !=
pexprIndexCond->Pop()->Eopid() ||
-		1 < pcrsLocalUsed->Size())
+		1 < pcrsLocalUsed->Size() ||
+		(isInPredOnBtreeIndex && rows > 2.0 &&
+		 !GPOS_FTRACE(EopttraceCalibratedBitmapIndexCostModel)))
{
-		// child is Bitmap AND/OR, or we use Multi column index
+		// Child is Bitmap AND/OR, or we use Multi column index or this is an IN predicate
+		// that's used with the "calibrated" cost model.
+		// Handling the IN predicate in this code path is to avoid plan regressions from
+		// earlier versions of the code that treated IN predicates like ORs and therefore
+		// also handled them in this code path. This is especially noticeable for btree
+		// indexes that often have a high NDV, because the small/large NDV cost model
+		// produces very high cost for cases with a higher NDV.
const CDouble dIndexFilterCostUnit =
pcmgpdb->GetCostModelParams()
->PcpLookup(CCostModelParamsGPDB::EcpIndexFilterCostUnit)
......@@ -1671,6 +1692,11 @@ CCostModelGPDB::CostBitmapTableScan(CMemoryPool *mp, CExpressionHandle &exprhdl,
// if the expression is const table get, the pcrsUsed is empty
// so we use minimum value MinDistinct for dNDV in that case.
CDouble dNDV = CHistogram::MinDistinct;
+	CDouble dNDVThreshold =
+		pcmgpdb->GetCostModelParams()
+			->PcpLookup(CCostModelParamsGPDB::EcpBitmapNDVThreshold)
+			->Get();
if (rows < 1.0)
{
// if we aren't accessing a row every rebind, then don't charge a cost for those cases where we don't have a row
......@@ -1698,10 +1724,7 @@ CCostModelGPDB::CostBitmapTableScan(CMemoryPool *mp, CExpressionHandle &exprhdl,
if (!GPOS_FTRACE(EopttraceCalibratedBitmapIndexCostModel))
{
-		CDouble dNDVThreshold =
-			pcmgpdb->GetCostModelParams()
-				->PcpLookup(CCostModelParamsGPDB::EcpBitmapNDVThreshold)
-				->Get();
+		// optimizer_cost_model = 'calibrated'
if (dNDVThreshold <= dNDV)
{
result = CostBitmapLargeNDV(pcmgpdb, pci, dNDV);
......@@ -1713,44 +1736,66 @@ CCostModelGPDB::CostBitmapTableScan(CMemoryPool *mp, CExpressionHandle &exprhdl,
}
else
{
+			// optimizer_cost_model = 'experimental'
			CDouble dBitmapIO =
				pcmgpdb->GetCostModelParams()
					->PcpLookup(CCostModelParamsGPDB::EcpBitmapIOCostSmallNDV)
					->Get();
-			CDouble dInitScan =
+			CDouble c5_dInitScan =
				pcmgpdb->GetCostModelParams()
					->PcpLookup(CCostModelParamsGPDB::EcpInitScanFactor)
					->Get();
+			CDouble c3_dBitmapPageCost =
+				pcmgpdb->GetCostModelParams()
+					->PcpLookup(CCostModelParamsGPDB::EcpBitmapPageCost)
+					->Get();
+			BOOL isAOTable = CPhysicalScan::PopConvert(exprhdl.Pop())
+								 ->Ptabdesc()
+								 ->IsAORowOrColTable();
+
+			// some cost constants determined with the cal_bitmap_test.py script
+			CDouble c1_cost_per_row(0.03);
+			CDouble c2_cost_per_byte(0.0001);
+			CDouble bitmap_union_cost_per_distinct_value(0.000027);
+			CDouble init_cost_advantage_for_bitmap_scan(0.9);
+
-			if (1 < pcrsUsed->Size())  // it is a join
+			if (IMDIndex::EmdindBtree == indexType)
			{
-				// The numbers below were experimentally determined using regression analysis in the cal_bitmap_test.py script
-				// The following dSizeCost is in the form C1 * rows + C2 * rows * width. This is because the width should have
-				// significantly less weight than rows as the execution time does not grow as fast in regards to width
-				CDouble dSizeCost =
-					rows * (1 + std::max(width * 0.005, 1.0)) * 0.05;
-				result = CCost(  // cost for each byte returned by the index scan plus cost for incremental rebinds
-					pci->NumRebinds() * (dBitmapIO * dSizeCost + dInitRebind) +
-					// the BitmapPageCost * dNDV takes into account the idea of multiple tuples being on the same page.
-					// If you have a small NDV, the likelihood of multiple tuples matching on one page is high and so the
-					// page cost is reduced. Even though the page cost will decrease, the cost of accessing each tuple will
-					// dominate. Likewise, if the NDV is large, the num of tuples matching per page is lower so the page
-					// cost should be higher
-					dInitScan * dNDV);
+				// btree indexes are not sensitive to the NDV, since they don't have any bitmaps
+				c3_dBitmapPageCost = 0.0;
			}
-			else
+
+			// Give the index scan a small initial advantage over the table scan, so we use indexes
+			// for small tables - this should avoid having table scan and index scan costs being
+			// very close together for many small queries.
+			c5_dInitScan = c5_dInitScan * init_cost_advantage_for_bitmap_scan;
+
+			// The numbers below were experimentally determined using regression analysis in the cal_bitmap_test.py script
+			// The following dSizeCost is in the form C1 * rows + C2 * rows * width. This is because the width should have
+			// significantly less weight than rows as the execution time does not grow as fast in regards to width
+			CDouble dSizeCost = dBitmapIO * (rows * c1_cost_per_row +
+											 rows * width * c2_cost_per_byte);
+			CDouble bitmapUnionCost = 0;
+
+			if (!isAOTable && indexType == IMDIndex::EmdindBitmap && dNDV > 1.0)
			{
-				// The numbers below were experimentally determined using regression analysis in the cal_bitmap_test.py script
-				CDouble dSizeCost =
-					rows * (1 + std::max(width * 0.005, 1.0)) * 0.001;
-				result =
-					CCost(  // cost for each byte returned by the index scan plus cost for incremental rebinds
-						pci->NumRebinds() *
-							(dBitmapIO * dSizeCost + 10 * dInitRebind) * dNDV +
-						// similar to above, the dInitScan * dNDV takes into account the likelihood of multiple tuples per page
-						dInitScan * dNDV);
+				CDouble baseTableRows = CPhysicalScan::PopConvert(exprhdl.Pop())
+											->PstatsBaseTable()
+											->Rows();
+
+				// for bitmap index scans on heap tables, we found that there is an additional cost
+				// associated with unioning them that is proportional to the number of bitmaps involved
+				// (dNDV-1) times the width of the bitmap (proportional to the number of rows in the table)
+				bitmapUnionCost = std::max(0.0, dNDV.Get() - 1.0) *
+								  baseTableRows *
+								  bitmap_union_cost_per_distinct_value;
			}
+
+			result = CCost(pci->NumRebinds() *
+							   (dSizeCost + dNDV * c3_dBitmapPageCost +
+								dInitRebind + bitmapUnionCost) +
+						   c5_dInitScan);
}
}
......
......@@ -169,7 +169,7 @@ const CDouble CCostModelParamsGPDB::DBitmapPageCostLargeNDV(83.1651);
const CDouble CCostModelParamsGPDB::DBitmapPageCostSmallNDV(204.3810);
// default bitmap page cost with no assumption about NDV
-const CDouble CCostModelParamsGPDB::DBitmapPageCost(50.4381);
+const CDouble CCostModelParamsGPDB::DBitmapPageCost(10);
// default threshold of NDV for bitmap costing
const CDouble CCostModelParamsGPDB::DBitmapNDVThreshold(200);
......
......@@ -76,7 +76,7 @@ using namespace gpopt;
// predicates less selective than this threshold
// (selectivity is greater than this number) lead to
// disqualification of a btree index on an AO table
-#define AO_TABLE_BTREE_INDEX_SELECTIVITY_THRESHOLD 0.05
+#define AO_TABLE_BTREE_INDEX_SELECTIVITY_THRESHOLD 0.10
//---------------------------------------------------------------------------
// @function:
......
......@@ -87,7 +87,7 @@ CTypeModifierTest:
TypeModifierColumn TypeModifierCast TypeModifierConst TypeModifierDoubleMappableConst TypeModifierArrayRef;
CIndexScanTest:
-	BTreeIndex-Against-InList BTreeIndex-Against-ScalarSubquery
+	BTreeIndex-Against-InList BTreeIndex-Against-InListLarge BTreeIndex-Against-ScalarSubquery
IndexScan-AOTable IndexScan-DroppedColumns IndexScan-BoolTrue IndexScan-BoolFalse
IndexScan-Relabel IndexGet-OuterRefs LogicalIndexGetDroppedCols NewBtreeIndexScanCost
IndexScan-ORPredsNonPart IndexScan-ORPredsAOPart IndexScan-AndedIn;
......