Commit eed6c9ed, authored by Tom Lane

Add a GUC parameter seq_page_cost, and use that everywhere we formerly
assumed that a sequential page fetch has cost 1.0.  This patch doesn't
in itself change the system's behavior at all, but it opens the door to
people adopting other units of measurement for EXPLAIN costs.  Also, if
we ever decide it's worth inventing per-tablespace access cost settings,
this change provides a workable intellectual framework for that.
Parent: a837851d
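As a quick sketch of how the new knob can be exercised once this patch is applied (the settings and the 10x factor are illustrative only; tenk1 is the regression-test table used in the perform.sgml examples below):

    -- The new parameter, with its default from DEFAULT_SEQ_PAGE_COST:
    SHOW seq_page_cost;            -- 1.0

    -- Scale all five cost parameters by the same factor.  Only relative
    -- values matter, so plan choices are unchanged, but EXPLAIN now
    -- reports costs in the new, larger units.
    SET seq_page_cost = 10.0;
    SET random_page_cost = 40.0;
    SET cpu_tuple_cost = 0.1;
    SET cpu_index_tuple_cost = 0.01;
    SET cpu_operator_cost = 0.025;
    EXPLAIN SELECT * FROM tenk1;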
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
-<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.59 2006/05/21 20:10:42 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.60 2006/06/05 02:49:58 tgl Exp $ -->
 <chapter Id="runtime-config">
  <title>Server Configuration</title>
@@ -1739,40 +1739,39 @@ archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows
      Planner Cost Constants
     </title>

+    <para>
+     The <firstterm>cost</> variables described in this section are measured
+     on an arbitrary scale.  Only their relative values matter, hence
+     scaling them all up or down by the same factor will result in no change
+     in the planner's choices.  Traditionally, these variables have been
+     referenced to sequential page fetches as the unit of cost; that is,
+     <varname>seq_page_cost</> is conventionally set to <literal>1.0</>
+     and the other cost variables are set with reference to that.  But
+     you can use a different scale if you prefer, such as actual execution
+     times in milliseconds on a particular machine.
+    </para>
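The scale-invariance claim in this added paragraph can be sketched precisely: every path cost the planner computes is (to within the fixed disable_cost penalty) a nonnegative combination of the five parameters,

    \mathrm{cost}(P) = n_{seq}\,c_{seq} + n_{rand}\,c_{rand} + n_{tup}\,c_{tup} + n_{idx}\,c_{idx} + n_{op}\,c_{op}

so multiplying all five by the same k > 0 multiplies every cost(P) by k and leaves the cost ordering of competing paths, and hence the chosen plan, unchanged.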
    <note>
     <para>
-     Unfortunately, there is no well-defined method for determining
-     ideal values for the family of <quote>cost</quote> variables that
-     appear below.  You are encouraged to experiment and share
-     your findings.
+     Unfortunately, there is no well-defined method for determining ideal
+     values for the cost variables.  They are best treated as averages over
+     the entire mix of queries that a particular installation will get.  This
+     means that changing them on the basis of just a few experiments is very
+     risky.
     </para>
    </note>

    <variablelist>

-    <varlistentry id="guc-effective-cache-size" xreflabel="effective_cache_size">
-     <term><varname>effective_cache_size</varname> (<type>floating point</type>)</term>
+    <varlistentry id="guc-seq-page-cost" xreflabel="seq_page_cost">
+     <term><varname>seq_page_cost</varname> (<type>floating point</type>)</term>
      <indexterm>
-      <primary><varname>effective_cache_size</> configuration parameter</primary>
+      <primary><varname>seq_page_cost</> configuration parameter</primary>
      </indexterm>
      <listitem>
       <para>
-       Sets the planner's assumption about the effective size of the
-       disk cache that is available to a single index scan.  This is
-       factored into estimates of the cost of using an index; a
-       higher value makes it more likely index scans will be used, a
-       lower value makes it more likely sequential scans will be
-       used.  When setting this parameter you should consider both
-       <productname>PostgreSQL</productname>'s shared buffers and the
-       portion of the kernel's disk cache that will be used for
-       <productname>PostgreSQL</productname> data files.  Also, take
-       into account the expected number of concurrent queries using
-       different indexes, since they will have to share the available
-       space.  This parameter has no effect on the size of shared
-       memory allocated by <productname>PostgreSQL</productname>, nor
-       does it reserve kernel disk cache; it is used only for
-       estimation purposes.  The value is measured in disk pages,
-       which are normally 8192 bytes each.  The default is 1000.
+       Sets the planner's estimate of the cost of a disk page fetch
+       that is part of a series of sequential fetches.  The default is 1.0.
       </para>
      </listitem>
     </varlistentry>

@@ -1785,12 +1784,27 @@ archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows
      <listitem>
       <para>
        Sets the planner's estimate of the cost of a
-       nonsequentially fetched disk page.  This is measured as a
-       multiple of the cost of a sequential page fetch.  A higher
-       value makes it more likely a sequential scan will be used, a
-       lower value makes it more likely an index scan will be
-       used.  The default is four.
+       non-sequentially-fetched disk page.  The default is 4.0.
+       Reducing this value relative to <varname>seq_page_cost</>
+       will cause the system to prefer index scans; raising it will
+       make index scans look relatively more expensive.  You can raise
+       or lower both values together to change the importance of disk I/O
+       costs relative to CPU costs, which are described by the following
+       parameters.
       </para>
+
+      <tip>
+       <para>
+        Although the system will let you set <varname>random_page_cost</> to
+        less than <varname>seq_page_cost</>, it is not physically sensible
+        to do so.  However, setting them equal makes sense if the database
+        is entirely cached in RAM, since in that case there is no penalty
+        for touching pages out of sequence.  Also, in a heavily-cached
+        database you should lower both values relative to the CPU parameters,
+        since the cost of fetching a page already in RAM is much smaller
+        than it would normally be.
+       </para>
+      </tip>
      </listitem>
     </varlistentry>
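A concrete reading of the new tip, for an installation whose data is known to fit entirely in RAM (the 0.1 figure is illustrative, not a recommendation):

    -- No seek penalty when every page is already cached: make the two
    -- page costs equal, and small relative to the CPU cost parameters.
    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;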
@@ -1802,8 +1816,8 @@ archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows
      <listitem>
       <para>
        Sets the planner's estimate of the cost of processing
-       each row during a query.  This is measured as a fraction of
-       the cost of a sequential page fetch.  The default is 0.01.
+       each row during a query.
+       The default is 0.01.
       </para>
      </listitem>
     </varlistentry>

@@ -1816,9 +1830,8 @@ archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows
      <listitem>
       <para>
        Sets the planner's estimate of the cost of processing
-       each index row during an index scan.  This is measured as a
-       fraction of the cost of a sequential page fetch.  The default
-       is 0.001.
+       each index entry during an index scan.
+       The default is 0.001.
       </para>
      </listitem>
     </varlistentry>

@@ -1831,8 +1844,35 @@ archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows
      <listitem>
       <para>
        Sets the planner's estimate of the cost of processing each
-       operator in a <literal>WHERE</> clause.  This is measured as a fraction of
-       the cost of a sequential page fetch.  The default is 0.0025.
+       operator or function executed during a query.
+       The default is 0.0025.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry id="guc-effective-cache-size" xreflabel="effective_cache_size">
+     <term><varname>effective_cache_size</varname> (<type>floating point</type>)</term>
+     <indexterm>
+      <primary><varname>effective_cache_size</> configuration parameter</primary>
+     </indexterm>
+     <listitem>
+      <para>
+       Sets the planner's assumption about the effective size of the
+       disk cache that is available to a single index scan.  This is
+       factored into estimates of the cost of using an index; a
+       higher value makes it more likely index scans will be used, a
+       lower value makes it more likely sequential scans will be
+       used.  When setting this parameter you should consider both
+       <productname>PostgreSQL</productname>'s shared buffers and the
+       portion of the kernel's disk cache that will be used for
+       <productname>PostgreSQL</productname> data files.  Also, take
+       into account the expected number of concurrent queries using
+       different indexes, since they will have to share the available
+       space.  This parameter has no effect on the size of shared
+       memory allocated by <productname>PostgreSQL</productname>, nor
+       does it reserve kernel disk cache; it is used only for
+       estimation purposes.  The value is measured in disk pages,
+       which are normally 8192 bytes each.  The default is 1000.
       </para>
      </listitem>
     </varlistentry>
...
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
-<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.12 2006/05/24 11:01:39 teodor Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.13 2006/06/05 02:49:58 tgl Exp $ -->
 <chapter id="indexam">
  <title>Index Access Method Interface Definition</title>
@@ -771,14 +771,14 @@ amcostestimate (PlannerInfo *root,
   </para>

   <para>
-   The index access costs should be computed in the units used by
-   <filename>src/backend/optimizer/path/costsize.c</filename>: a sequential
-   disk block fetch has cost 1.0, a nonsequential fetch has cost
-   <varname>random_page_cost</>, and the cost of processing one index row
-   should usually be taken as <varname>cpu_index_tuple_cost</>.  In addition,
-   an appropriate multiple of <varname>cpu_operator_cost</> should be charged
-   for any comparison operators invoked during index processing (especially
-   evaluation of the indexQuals themselves).
+   The index access costs should be computed using the parameters used by
+   <filename>src/backend/optimizer/path/costsize.c</filename>: a sequential
+   disk block fetch has cost <varname>seq_page_cost</>, a nonsequential fetch
+   has cost <varname>random_page_cost</>, and the cost of processing one index
+   row should usually be taken as <varname>cpu_index_tuple_cost</>.  In
+   addition, an appropriate multiple of <varname>cpu_operator_cost</> should
+   be charged for any comparison operators invoked during index processing
+   (especially evaluation of the indexQuals themselves).
   </para>

   <para>
@@ -788,10 +788,10 @@ amcostestimate (PlannerInfo *root,
   </para>

   <para>
-   The <quote>start-up cost</quote> is the part of the total scan cost that must be expended
-   before we can begin to fetch the first row.  For most indexes this can
-   be taken as zero, but an index type with a high start-up cost might want
-   to set it nonzero.
+   The <quote>start-up cost</quote> is the part of the total scan cost that
+   must be expended before we can begin to fetch the first row.  For most
+   indexes this can be taken as zero, but an index type with a high start-up
+   cost might want to set it nonzero.
   </para>

   <para>
@@ -850,13 +850,13 @@ amcostestimate (PlannerInfo *root,
<programlisting>
    /*
     * Our generic assumption is that the index pages will be read
-    * sequentially, so they have cost 1.0 each, not random_page_cost.
+    * sequentially, so they cost seq_page_cost each, not random_page_cost.
     * Also, we charge for evaluation of the indexquals at each index row.
     * All the costs are assumed to be paid incrementally during the scan.
     */
    cost_qual_eval(&amp;index_qual_cost, indexQuals);
    *indexStartupCost = index_qual_cost.startup;
-   *indexTotalCost = numIndexPages +
+   *indexTotalCost = seq_page_cost * numIndexPages +
        (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;
</programlisting>
   </para>
...
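Plugging the defaults into the generic estimate above, for a hypothetical index of 100 pages and 10000 matching tuples whose quals charge one cpu_operator_cost per tuple:

    \mathrm{indexTotalCost} = 1.0 \times 100 + (0.001 + 0.0025) \times 10000 = 135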
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
-<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.56 2006/03/10 19:10:48 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.57 2006/06/05 02:49:58 tgl Exp $ -->
 <chapter id="performance-tips">
  <title>Performance Tips</title>
@@ -60,7 +60,7 @@
   <footnote>
    <para>
     Examples in this section are drawn from the regression test database
-    after doing a <command>VACUUM ANALYZE</>, using 8.1 development sources.
+    after doing a <command>VACUUM ANALYZE</>, using 8.2 development sources.
     You should be able to get similar results if you try the examples yourself,
     but your estimated costs and row counts will probably vary slightly
     because <command>ANALYZE</>'s statistics are random samples rather
@@ -114,12 +114,13 @@ EXPLAIN SELECT * FROM tenk1;
   </para>

   <para>
-    The costs are measured in units of disk page fetches; that is, 1.0
-    equals one sequential disk page read, by definition.  (CPU effort
-    estimates are made too; they are converted into disk-page units using some
-    fairly arbitrary fudge factors.  If you want to experiment with these
-    factors, see the list of run-time configuration parameters in
-    <xref linkend="runtime-config-query-constants">.)
+    The costs are measured in arbitrary units determined by the planner's
+    cost parameters (see <xref linkend="runtime-config-query-constants">).
+    Traditional practice is to measure the costs in units of disk page
+    fetches; that is, <xref linkend="guc-seq-page-cost"> is conventionally
+    set to <literal>1.0</> and the other cost parameters are set relative
+    to that.  The examples in this section are run with the default cost
+    parameters.
   </para>

   <para>
@@ -164,9 +165,9 @@ SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
    you will find out that <classname>tenk1</classname> has 358 disk
    pages and 10000 rows.  So the cost is estimated at 358 page
-   reads, defined as costing 1.0 apiece, plus 10000 * <xref
-   linkend="guc-cpu-tuple-cost"> which is
-   typically 0.01 (try <command>SHOW cpu_tuple_cost</command>).
+   reads, costing <xref linkend="guc-seq-page-cost"> apiece (1.0 by
+   default), plus 10000 * <xref linkend="guc-cpu-tuple-cost"> which is
+   0.01 by default.
   </para>

   <para>
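Written out with the default parameters, the estimate described above is:

    358 \times \mathtt{seq\_page\_cost} + 10000 \times \mathtt{cpu\_tuple\_cost} = 358 \times 1.0 + 10000 \times 0.01 = 458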
@@ -400,8 +401,9 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t
    Note that the <quote>actual time</quote> values are in milliseconds of
    real time, whereas the <quote>cost</quote> estimates are expressed in
-   arbitrary units of disk fetches; so they are unlikely to match up.
-   The thing to pay attention to is the ratios.
+   arbitrary units; so they are unlikely to match up.
+   The thing to pay attention to is whether the ratios of actual time and
+   estimated costs are consistent.
   </para>

   <para>
@@ -427,7 +429,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t
    may be considerably larger, because it includes the time spent processing
    the result rows.  In these commands, the time for the top plan node
    essentially is the time spent computing the new rows and/or locating the
-   old ones, but it doesn't include the time spent making the changes.
+   old ones, but it doesn't include the time spent applying the changes.
    Time spent firing triggers, if any, is also outside the top plan node,
    and is shown separately for each trigger.
   </para>
...
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -3,14 +3,19 @@
  * costsize.c
  *	  Routines to compute (and set) relation sizes and path costs
  *
- * Path costs are measured in units of disk accesses: one sequential page
- * fetch has cost 1.  All else is scaled relative to a page fetch, using
- * the scaling parameters
+ * Path costs are measured in arbitrary units established by these basic
+ * parameters:
  *
+ *	seq_page_cost		Cost of a sequential page fetch
 *	random_page_cost	Cost of a non-sequential page fetch
 *	cpu_tuple_cost		Cost of typical CPU time to process a tuple
 *	cpu_index_tuple_cost	Cost of typical CPU time to process an index tuple
- *	cpu_operator_cost	Cost of CPU time to process a typical WHERE operator
+ *	cpu_operator_cost	Cost of CPU time to execute an operator or function
+ *
+ * We expect that the kernel will typically do some amount of read-ahead
+ * optimization; this in conjunction with seek costs means that seq_page_cost
+ * is normally considerably less than random_page_cost.  (However, if the
+ * database is fully cached in RAM, it is reasonable to set them equal.)
  *
  * We also use a rough estimate "effective_cache_size" of the number of
  * disk pages in Postgres + OS-level disk cache.  (We can't simply use
@@ -49,7 +54,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.155 2006/03/05 15:58:28 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.156 2006/06/05 02:49:58 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -85,12 +90,14 @@
 	(path)->parent->rows)

-double		effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
+double		seq_page_cost = DEFAULT_SEQ_PAGE_COST;
 double		random_page_cost = DEFAULT_RANDOM_PAGE_COST;
 double		cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double		cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double		cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
+double		effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;

 Cost		disable_cost = 100000000.0;

 bool		enable_seqscan = true;
@@ -156,14 +163,8 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	/*
 	 * disk costs
-	 *
-	 * The cost of reading a page sequentially is 1.0, by definition.  Note
-	 * that the Unix kernel will typically do some amount of read-ahead
-	 * optimization, so that this cost is less than the true cost of reading a
-	 * page from disk.  We ignore that issue here, but must take it into
-	 * account when estimating the cost of non-sequential accesses!
 	 */
-	run_cost += baserel->pages; /* sequential fetches with cost 1.0 */
+	run_cost += seq_page_cost * baserel->pages;

 	/* CPU costs */
 	startup_cost += baserel->baserestrictcost.startup;
@@ -194,20 +195,21 @@ cost_seqscan(Path *path, PlannerInfo *root,
 * with the entirely ad-hoc equations (writing relsize for
 * relpages/effective_cache_size):
 *	if relsize >= 1:
- *		random_page_cost - (random_page_cost-1)/2 * (1/relsize)
+ *		random_page_cost - (random_page_cost-seq_page_cost)/2 * (1/relsize)
 *	if relsize < 1:
- *		1 + ((random_page_cost-1)/2) * relsize ** 2
- * These give the right asymptotic behavior (=> 1.0 as relpages becomes
- * small, => random_page_cost as it becomes large) and meet in the middle
- * with the estimate that the cache is about 50% effective for a relation
- * of the same size as effective_cache_size.  (XXX this is probably all
- * wrong, but I haven't been able to find any theory about how effective
- * a disk cache should be presumed to be.)
+ *		seq_page_cost + ((random_page_cost-seq_page_cost)/2) * relsize ** 2
+ * These give the right asymptotic behavior (=> seq_page_cost as relpages
+ * becomes small, => random_page_cost as it becomes large) and meet in the
+ * middle with the estimate that the cache is about 50% effective for a
+ * relation of the same size as effective_cache_size.  (XXX this is probably
+ * all wrong, but I haven't been able to find any theory about how effective
+ * a disk cache should be presumed to be.)
 */
 static Cost
 cost_nonsequential_access(double relpages)
 {
 	double		relsize;
+	double		random_delta;

 	/* don't crash on bad input data */
 	if (relpages <= 0.0 || effective_cache_size <= 0.0)
@@ -215,19 +217,17 @@ cost_nonsequential_access(double relpages)
 	relsize = relpages / effective_cache_size;

+	random_delta = (random_page_cost - seq_page_cost) * 0.5;
 	if (relsize >= 1.0)
-		return random_page_cost - (random_page_cost - 1.0) * 0.5 / relsize;
+		return random_page_cost - random_delta / relsize;
 	else
-		return 1.0 + (random_page_cost - 1.0) * 0.5 * relsize * relsize;
+		return seq_page_cost + random_delta * relsize * relsize;
 }
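With the default parameters (seq_page_cost = 1.0, random_page_cost = 4.0), random_delta is 1.5 and the two branches of the rewritten function meet continuously at relsize = 1:

    4.0 - 1.5/1 = 2.5 = 1.0 + 1.5 \times 1^2

and the function tends to seq_page_cost as relsize → 0 and to random_page_cost as relsize → ∞, exactly the asymptotic behavior the comment promises.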
 /*
  * cost_index
  *	  Determines and returns the cost of scanning a relation using an index.
  *
- * NOTE: an indexscan plan node can actually represent several passes,
- * but here we consider the cost of just one pass.
- *
  * 'index' is the index to be used
  * 'indexQuals' is the list of applicable qual clauses (implicit AND semantics)
  * 'is_injoin' is T if we are considering using the index scan as the inside
@@ -327,9 +327,9 @@ cost_index(IndexPath *path, PlannerInfo *root,
 	 * be just sT.  What's more, these will be sequential fetches, not the
 	 * random fetches that occur in the uncorrelated case.  So, depending on
 	 * the extent of correlation, we should estimate the actual I/O cost
-	 * somewhere between s * T * 1.0 and PF * random_cost.  We currently
-	 * interpolate linearly between these two endpoints based on the
-	 * correlation squared (XXX is that appropriate?).
+	 * somewhere between s * T * seq_page_cost and PF * random_page_cost.
+	 * We currently interpolate linearly between these two endpoints based on
+	 * the correlation squared (XXX is that appropriate?).
 	 *
 	 * In any case the number of tuples fetched is Ns.
 	 *----------
@@ -346,8 +346,10 @@ cost_index(IndexPath *path, PlannerInfo *root,
 	{
 		pages_fetched =
 			(2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
-		if (pages_fetched > T)
+		if (pages_fetched >= T)
 			pages_fetched = T;
+		else
+			pages_fetched = ceil(pages_fetched);
 	}
 	else
 	{
@@ -364,6 +366,7 @@ cost_index(IndexPath *path, PlannerInfo *root,
 			pages_fetched =
 				b + (tuples_fetched - lim) * (T - b) / T;
 		}
+		pages_fetched = ceil(pages_fetched);
 	}
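A worked instance of the Mackert-Lohman fraction above, using the 358-page tenk1 from the docs (T = 358) and a fetch of 100 tuples:

    \mathrm{pages\_fetched} = \frac{2 \times 358 \times 100}{2 \times 358 + 100} = \frac{71600}{816} \approx 87.75

which the new ceil() call rounds up to 88 whole pages instead of charging for a fractional page.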
 	/*
@@ -373,7 +376,7 @@ cost_index(IndexPath *path, PlannerInfo *root,
 	 * rather than using cost_nonsequential_access, since we've already
 	 * accounted for caching effects by using the Mackert model.
 	 */
-	min_IO_cost = ceil(indexSelectivity * T);
+	min_IO_cost = ceil(indexSelectivity * T) * seq_page_cost;
 	max_IO_cost = pages_fetched * random_page_cost;

 	/*
@@ -461,19 +464,21 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
 	pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
-	if (pages_fetched > T)
+	if (pages_fetched >= T)
 		pages_fetched = T;
+	else
+		pages_fetched = ceil(pages_fetched);

 	/*
 	 * For small numbers of pages we should charge random_page_cost apiece,
 	 * while if nearly all the table's pages are being read, it's more
-	 * appropriate to charge 1.0 apiece.  The effect is nonlinear, too. For
-	 * lack of a better idea, interpolate like this to determine the cost per
-	 * page.
+	 * appropriate to charge seq_page_cost apiece.  The effect is nonlinear,
+	 * too.  For lack of a better idea, interpolate like this to determine the
+	 * cost per page.
 	 */
 	if (pages_fetched >= 2.0)
 		cost_per_page = random_page_cost -
-			(random_page_cost - 1.0) * sqrt(pages_fetched / T);
+			(random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);
 	else
 		cost_per_page = random_page_cost;
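Endpoints of this interpolation with the default parameters: fetching every page (pages_fetched = T) costs seq_page_cost apiece, while a sparse fetch stays near random_page_cost:

    4.0 - 3.0\sqrt{1} = 1.0, \qquad 4.0 - 3.0\sqrt{0.01} = 3.7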
@@ -833,9 +838,9 @@ cost_sort(Path *path, PlannerInfo *root,
 	else
 		log_runs = 1.0;
 	npageaccesses = 2.0 * npages * log_runs;
-	/* Assume half are sequential (cost 1), half are not */
+	/* Assume half are sequential, half are not */
 	startup_cost += npageaccesses *
-		(1.0 + cost_nonsequential_access(npages)) * 0.5;
+		(seq_page_cost + cost_nonsequential_access(npages)) * 0.5;
 }

 /*
@@ -871,8 +876,8 @@ cost_material(Path *path,
 	double		npages = ceil(nbytes / BLCKSZ);

 	/* We'll write during startup and read during retrieval */
-	startup_cost += npages;
-	run_cost += npages;
+	startup_cost += seq_page_cost * npages;
+	run_cost += seq_page_cost * npages;
 }
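For instance, materializing a hypothetical 1MB result with the standard 8192-byte BLCKSZ:

    \mathrm{npages} = \lceil 1048576 / 8192 \rceil = 128

so the rewritten code charges 128 × seq_page_cost = 128.0 to write the tuplestore at startup and the same again to read it back during the run.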
 /*
...
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -15,7 +15,7 @@
  *
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.205 2006/05/02 11:28:55 teodor Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.206 2006/06/05 02:49:58 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -4555,9 +4555,9 @@ genericcostestimate(PlannerInfo *root,
 	 * Compute the index access cost.
 	 *
 	 * Disk cost: our generic assumption is that the index pages will be read
-	 * sequentially, so they have cost 1.0 each, not random_page_cost.
+	 * sequentially, so they cost seq_page_cost each, not random_page_cost.
 	 */
-	*indexTotalCost = numIndexPages;
+	*indexTotalCost = seq_page_cost * numIndexPages;

 	/*
 	 * CPU cost: any complex expressions in the indexquals will need to be
...
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -10,7 +10,7 @@
 * Written by Peter Eisentraut <peter_e@gmx.net>.
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.320 2006/05/21 20:10:42 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.321 2006/06/05 02:49:58 tgl Exp $
 *
 *--------------------------------------------------------------------
 */
@@ -1595,56 +1595,62 @@ static struct config_int ConfigureNamesInt[] =
 static struct config_real ConfigureNamesReal[] =
 {
 	{
-		{"effective_cache_size", PGC_USERSET, QUERY_TUNING_COST,
-			gettext_noop("Sets the planner's assumption about size of the disk cache."),
-			gettext_noop("That is, the portion of the kernel's disk cache that "
-						 "will be used for PostgreSQL data files. This is measured in disk "
-						 "pages, which are normally 8 kB each.")
+		{"seq_page_cost", PGC_USERSET, QUERY_TUNING_COST,
+			gettext_noop("Sets the planner's estimate of the cost of a "
+						 "sequentially fetched disk page."),
+			NULL
 		},
-		&effective_cache_size,
-		DEFAULT_EFFECTIVE_CACHE_SIZE, 1, DBL_MAX, NULL, NULL
+		&seq_page_cost,
+		DEFAULT_SEQ_PAGE_COST, 0, DBL_MAX, NULL, NULL
 	},
 	{
 		{"random_page_cost", PGC_USERSET, QUERY_TUNING_COST,
-			gettext_noop("Sets the planner's estimate of the cost of a nonsequentially "
-						 "fetched disk page."),
-			gettext_noop("This is measured as a multiple of the cost of a "
-						 "sequential page fetch. A higher value makes it more likely a "
-						 "sequential scan will be used, a lower value makes it more likely an "
-						 "index scan will be used.")
+			gettext_noop("Sets the planner's estimate of the cost of a "
+						 "nonsequentially fetched disk page."),
+			NULL
 		},
 		&random_page_cost,
 		DEFAULT_RANDOM_PAGE_COST, 0, DBL_MAX, NULL, NULL
 	},
 	{
 		{"cpu_tuple_cost", PGC_USERSET, QUERY_TUNING_COST,
-			gettext_noop("Sets the planner's estimate of the cost of processing each tuple (row)."),
-			gettext_noop("This is measured as a fraction of the cost of a "
-						 "sequential page fetch.")
+			gettext_noop("Sets the planner's estimate of the cost of "
+						 "processing each tuple (row)."),
+			NULL
 		},
 		&cpu_tuple_cost,
 		DEFAULT_CPU_TUPLE_COST, 0, DBL_MAX, NULL, NULL
 	},
 	{
 		{"cpu_index_tuple_cost", PGC_USERSET, QUERY_TUNING_COST,
-			gettext_noop("Sets the planner's estimate of processing cost for each "
-						 "index tuple (row) during index scan."),
-			gettext_noop("This is measured as a fraction of the cost of a "
-						 "sequential page fetch.")
+			gettext_noop("Sets the planner's estimate of the cost of "
+						 "processing each index entry during an index scan."),
+			NULL
 		},
 		&cpu_index_tuple_cost,
 		DEFAULT_CPU_INDEX_TUPLE_COST, 0, DBL_MAX, NULL, NULL
 	},
 	{
 		{"cpu_operator_cost", PGC_USERSET, QUERY_TUNING_COST,
-			gettext_noop("Sets the planner's estimate of processing cost of each operator in WHERE."),
-			gettext_noop("This is measured as a fraction of the cost of a sequential "
-						 "page fetch.")
+			gettext_noop("Sets the planner's estimate of the cost of "
+						 "processing each operator or function call."),
+			NULL
 		},
 		&cpu_operator_cost,
 		DEFAULT_CPU_OPERATOR_COST, 0, DBL_MAX, NULL, NULL
 	},
+	{
+		{"effective_cache_size", PGC_USERSET, QUERY_TUNING_COST,
+			gettext_noop("Sets the planner's assumption about size of the disk cache."),
+			gettext_noop("That is, the portion of the kernel's disk cache that "
+						 "will be used for PostgreSQL data files. This is measured in disk "
+						 "pages, which are normally 8 kB each.")
+		},
+		&effective_cache_size,
+		DEFAULT_EFFECTIVE_CACHE_SIZE, 1, DBL_MAX, NULL, NULL
+	},
 	{
 		{"geqo_selection_bias", PGC_USERSET, QUERY_TUNING_GEQO,
 			gettext_noop("GEQO: selective pressure within the population."),
...
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -175,12 +175,12 @@
 # - Planner Cost Constants -

-#effective_cache_size = 1000		# typically 8KB each
-#random_page_cost = 4			# units are one sequential page fetch
-					# cost
-#cpu_tuple_cost = 0.01			# (same)
-#cpu_index_tuple_cost = 0.001		# (same)
-#cpu_operator_cost = 0.0025		# (same)
+#seq_page_cost = 1.0			# measured on an arbitrary scale
+#random_page_cost = 4.0			# same scale as above
+#cpu_tuple_cost = 0.01			# same scale as above
+#cpu_index_tuple_cost = 0.001		# same scale as above
+#cpu_operator_cost = 0.0025		# same scale as above
+#effective_cache_size = 1000		# typically 8KB each

 # - Genetic Query Optimizer -
...
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/optimizer/cost.h,v 1.73 2006/03/05 15:58:57 momjian Exp $
+ * $PostgreSQL: pgsql/src/include/optimizer/cost.h,v 1.74 2006/06/05 02:49:58 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -21,12 +21,14 @@
 /* defaults for costsize.c's Cost parameters */
 /* NB: cost-estimation code should use the variables, not these constants! */
 /* If you change these, update backend/utils/misc/postgresql.sample.conf */
-#define DEFAULT_EFFECTIVE_CACHE_SIZE  1000.0	/* measured in pages */
+#define DEFAULT_SEQ_PAGE_COST  1.0
 #define DEFAULT_RANDOM_PAGE_COST  4.0
 #define DEFAULT_CPU_TUPLE_COST  0.01
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.001
 #define DEFAULT_CPU_OPERATOR_COST  0.0025
+#define DEFAULT_EFFECTIVE_CACHE_SIZE  1000.0	/* measured in pages */

 /*
 * prototypes for costsize.c
@@ -34,11 +36,12 @@
 */

 /* parameter variables and flags */
-extern double effective_cache_size;
-extern double random_page_cost;
-extern double cpu_tuple_cost;
+extern DLLIMPORT double seq_page_cost;
+extern DLLIMPORT double random_page_cost;
+extern DLLIMPORT double cpu_tuple_cost;
 extern DLLIMPORT double cpu_index_tuple_cost;
-extern double cpu_operator_cost;
+extern DLLIMPORT double cpu_operator_cost;
+extern double effective_cache_size;
 extern Cost disable_cost;
 extern bool enable_seqscan;
 extern bool enable_indexscan;
...