<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">

<topic id="topic39" xml:lang="en">
  <title id="in198649">Query Profiling</title>
  <abstract>
    <shortdesc>Examine the query plans of poorly performing queries to identify possible performance
      tuning opportunities.</shortdesc>
  </abstract>
  <body>
    <p>Greenplum Database devises a <i>query plan</i> for each query. Choosing the
      right query plan to match the query and data structure is necessary for good performance. A
      query plan defines how Greenplum Database will run the query in the parallel
      execution environment. </p>
    <p>The query optimizer uses data statistics maintained by the database to choose a query plan
      with the lowest possible cost. Cost is measured in disk I/O, shown as units of disk page
      fetches. The goal is to minimize the total execution cost for the plan.</p>
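    <p>Because the optimizer's cost estimates depend on these statistics, stale statistics can
      lead to a poor plan choice. As a minimal sketch, assuming the <codeph>names</codeph> table
      and its <codeph>id</codeph> column that appear in the examples in this topic, you might
      refresh the statistics with the <codeph>ANALYZE</codeph> command before you examine a
      plan:</p>
    <codeblock>-- Refresh optimizer statistics for every column of the names table.
ANALYZE names;

-- Or refresh statistics only for a column used in query predicates.
ANALYZE names (id);
</codeblock>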
    <p>View the plan for a given query with the <codeph>EXPLAIN</codeph> command.
        <codeph>EXPLAIN</codeph> shows the query optimizer's estimated cost for the query plan. For
      example:</p>
    <p>
      <codeblock>EXPLAIN SELECT * FROM names WHERE id=22;
</codeblock>
    </p>
    <p><codeph>EXPLAIN ANALYZE</codeph> runs the statement in addition to displaying its plan. This
      is useful for determining how close the optimizer's estimates are to reality. For example:</p>
    <p>
      <codeblock>EXPLAIN ANALYZE SELECT * FROM names WHERE id=22;
</codeblock>
    </p>
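    <p>Because <codeph>EXPLAIN ANALYZE</codeph> actually runs the statement, profiling a
      statement that modifies data also modifies the data. One common approach, sketched here
      with an illustrative <codeph>DELETE</codeph> against the <codeph>names</codeph> table, is
      to run the command inside a transaction and then roll the transaction back:</p>
    <codeblock>BEGIN;
-- The DELETE really runs, so the reported row counts and timings are actual values.
EXPLAIN ANALYZE DELETE FROM names WHERE id=22;
-- Discard the changes that were made while profiling.
ROLLBACK;
</codeblock>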
    <note>In Greenplum Database, the default GPORCA optimizer co-exists with the Postgres query
      optimizer. The <cmdname>EXPLAIN</cmdname> output generated by GPORCA is different from the
      output generated by the Postgres query optimizer. <p>By default, Greenplum Database uses GPORCA
        to generate an execution plan for a query when possible. </p><p>When the <codeph>EXPLAIN
          ANALYZE</codeph> command uses GPORCA, the <codeph>EXPLAIN</codeph> plan shows only the
        number of partitions that are being eliminated. The scanned partitions are not shown. To
        show the names of the scanned partitions in the segment logs, set the server configuration
        parameter <codeph>gp_log_dynamic_partition_pruning</codeph> to <codeph>on</codeph>. This
        example <codeph>SET</codeph> command enables the parameter.</p><p>
        <codeblock>SET gp_log_dynamic_partition_pruning = on;</codeblock>
      </p><p>For information about GPORCA, see <xref href="query.xml#topic1"/>. </p></note>
  </body>
  <topic id="topic40" xml:lang="en">
    <title>Reading EXPLAIN Output</title>
    <body>
      <p>A query plan is a tree of nodes. Each node in the plan represents a single operation, such
        as a table scan, join, aggregation, or sort.</p>
      <p>Read plans from the bottom to the top: each node feeds rows into the node directly above
        it. The bottom nodes of a plan are usually table scan operations: sequential, index, or
        bitmap index scans. If the query requires joins, aggregations, sorts, or other operations on
        the rows, there are additional nodes above the scan nodes to perform these operations. The
        topmost plan nodes are usually Greenplum Database motion nodes: redistribute,
        explicit redistribute, broadcast, or gather motions. These operations move rows between
        segment instances during query processing.</p>
      <p>The output of <codeph>EXPLAIN</codeph> has one line for each node in the plan tree and
        shows the basic node type and the following execution cost estimates for that plan node:</p>
      <ul>
        <li id="in182482"><b>cost</b> —Measured in units of disk page fetches. 1.0 equals one
          sequential disk page read. The first estimate is the start-up cost of getting the first
          row and the second is the total cost of getting all rows. The total cost assumes
          all rows will be retrieved, which is not always true; for example, if the query uses
            <codeph>LIMIT</codeph>, not all rows are retrieved.<note>The cost values generated by
            GPORCA and the Postgres query optimizer are not directly comparable. The
            two optimizers use different cost models, as well as different algorithms, to determine
            the cost of an execution plan. Nothing can or should be inferred by comparing cost
            values between the two optimizers.<p>In addition, the cost generated by any given
              optimizer is valid only for comparing plan alternatives for a single query and
              set of statistics. Different queries can generate plans with different costs, even
              when the optimizer is held constant.</p><p>To summarize, the cost is essentially an
              internal number used by a given optimizer, and nothing should be inferred by examining
              only the cost value displayed in the <codeph>EXPLAIN</codeph> plans.</p></note></li>
        <li id="in182483"><b>rows</b> —The total number of rows output by this plan node. This
          number is usually less than the number of rows processed or scanned by the plan node,
          reflecting the estimated selectivity of any <codeph>WHERE</codeph> clause conditions.
          Ideally, the estimate for the topmost node approximates the number of rows that the query
          actually returns<ph>, updates, or deletes</ph>.</li>
        <li id="in182484"><b>width</b> —The estimated average width, in bytes, of the rows that
          this plan node outputs.</li>
      </ul>
      <p>Note the following:</p>
      <ul>
        <li id="in203106">The cost of a node includes the cost of its child nodes. The topmost plan
          node has the estimated total execution cost for the plan. This is the number the optimizer
          intends to minimize.</li>
        <li id="in203119">The cost reflects only the aspects of plan execution that the query
          optimizer takes into consideration. For example, the cost does not reflect time spent
          transmitting result rows to the client.</li>
      </ul>
    </body>
    <topic id="topic41" xml:lang="en">
      <title id="in182487">EXPLAIN Example</title>
      <body>
        <p>The following example describes how to read an <codeph>EXPLAIN</codeph> query plan for a
          query:</p>
        <p>
          <codeblock><b>EXPLAIN</b> SELECT * FROM names WHERE name = 'Joelle';
                     QUERY PLAN
------------------------------------------------------------
Gather Motion 2:1 (slice1) (cost=0.00..20.88 rows=1 width=13)
   -&gt; Seq Scan on names (cost=0.00..20.88 rows=1 width=13)
         Filter: name = 'Joelle'::text
</codeblock>
        </p>
        <p>Read the plan from the bottom to the top. To start, Greenplum Database sequentially
          scans the <i>names</i> table. Notice the <codeph>WHERE</codeph> clause is applied as a
            <i>filter</i> condition. This means the scan operation checks the condition for each row
          it scans and outputs only the rows that satisfy the condition.</p>
        <p>The results of the scan operation are passed to a <i>gather motion</i> operation. In
            Greenplum Database, a gather motion is when segments send rows to the
          master. In this example, we have two segment instances that send to one master instance.
          This operation is working on <codeph>slice1</codeph> of the parallel query execution plan.
          A query plan is divided into <i>slices</i> so the segments can work on portions of the
          query plan in parallel.</p>
        <p>The estimated startup cost for this plan is <codeph>0.00</codeph> (no cost) and the
          estimated total cost is <codeph>20.88</codeph> disk page fetches. The optimizer estimates
          this query will return one row.</p>
      </body>
    </topic>
  </topic>
  <topic id="topic42" xml:lang="en">
    <title>Reading EXPLAIN ANALYZE Output</title>
    <body>
      <p><codeph>EXPLAIN ANALYZE</codeph> plans and runs the statement. The <codeph>EXPLAIN
          ANALYZE</codeph> plan shows the actual execution cost along with the optimizer's
        estimates. This allows you to see if the optimizer's estimates are close to reality.
          <codeph>EXPLAIN ANALYZE</codeph> also shows the following:</p>
      <ul>
        <li id="in209045">The total runtime (in milliseconds) in which the query executed.</li>
        <li id="in209046">The memory used by each slice of the query plan, as well as the memory
          reserved for the whole query statement.</li>
        <li id="in209047">The number of <i>workers</i> (segments) involved in a plan node operation.
          Only segments that return rows are counted.</li>
        <li id="in209048">The maximum number of rows returned by the segment that produced the most
          rows for the operation. If multiple segments produce an equal number of rows,
            <codeph>EXPLAIN ANALYZE</codeph> shows the segment with the longest <i>&lt;time&gt; to
            end</i>.</li>
        <li id="in209049">The segment id of the segment that produced the most rows for an
          operation.</li>
        <li id="in209050">For relevant operations, the amount of memory (<codeph>work_mem</codeph>)
          used by the operation. If the <codeph>work_mem</codeph> was insufficient to perform the
          operation in memory, the plan shows the amount of data spilled to disk for the
          lowest-performing segment. For example:<p>
            <codeblock>Work_mem used: 64K bytes avg, 64K bytes max (seg0).
Work_mem wanted: 90K bytes avg, 90K bytes max (seg0) to lessen
workfile I/O affecting 2 workers.
</codeblock>
          </p></li>
        <li id="in209053">The time (in milliseconds) in which the segment that produced the most
          rows retrieved the first row, and the time taken for that segment to retrieve all rows.
          The result may omit <i>&lt;time&gt; to first row</i> if it is the same as the
            <i>&lt;time&gt; to end</i>.</li>
      </ul>
    </body>
    <topic id="topic43" xml:lang="en">
      <title>EXPLAIN ANALYZE Examples</title>
      <body>
        <p>This example describes how to read an <codeph>EXPLAIN ANALYZE</codeph> query plan using
          the same query. The bold parts of the plan show actual timing and rows
          returned for each plan node, as well as memory and time statistics for the whole
          query.</p>
        <codeblock><b>EXPLAIN ANALYZE</b> SELECT * FROM names WHERE name = 'Joelle';
                     QUERY PLAN
------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..20.88 rows=1 width=13)
    <b>Rows out: 1 rows at destination with 0.305 ms to first row, 0.537 ms to end, start offset by 0.289 ms.</b>
        -&gt; Seq Scan on names (cost=0.00..20.88 rows=1 width=13)
             <b>Rows out: Avg 1 rows x 2 workers. Max 1 rows (seg0) with 0.255 ms to first row, 0.486 ms to end, start offset by 0.968 ms.</b>
                 Filter: name = 'Joelle'::text
Slice statistics:
  (slice0) Executor memory: 135K bytes.
  (slice1) Executor memory: 151K bytes avg x 2 workers, 151K bytes max (seg0).
<b>Statement statistics:</b>
 <b>Memory used: 128000K bytes</b>
 <b>Total runtime: 22.548 ms</b>
</codeblock>
        <p>Read the plan from the bottom to the top. The total elapsed time to run this query was
            <i>22.548</i> milliseconds.</p>
        <p>The <i>sequential scan</i> operation had only one segment (<i>seg0</i>) that returned
          rows, and it returned just <i>1 row</i>. It took <i>0.255</i> milliseconds to find the
          first row and <i>0.486</i> milliseconds to scan all rows. This result is close to the
          optimizer's estimate: the query optimizer estimated it would return one row for this
          query. The <i>gather motion</i> (segments sending data to the master) received 1 row. The
          total elapsed time for this operation was <i>0.537</i> milliseconds.</p>
      </body>
      <topic id="topic_idt_2ll_gr">
        <title>Determining the Query Optimizer</title>
        <body>
          <p>You can view <cmdname>EXPLAIN</cmdname> output to determine if GPORCA is enabled for
            the query plan and whether GPORCA or the Postgres query optimizer generated the explain
            plan. The information appears at the end of the <cmdname>EXPLAIN</cmdname> output. The
              <codeph>Settings</codeph> line displays the setting of the server configuration
            parameter <codeph>OPTIMIZER</codeph>. The <codeph>Optimizer status</codeph> line
            displays whether GPORCA or the Postgres query optimizer generated the explain plan.</p>
          <p>For these two example query plans, GPORCA is enabled: the server
            configuration parameter <codeph>OPTIMIZER</codeph> is <codeph>on</codeph>. For the first
            plan, GPORCA generated the <cmdname>EXPLAIN</cmdname> plan. For the
            second plan, Greenplum Database fell back to the Postgres query optimizer to
            generate the query plan.</p>
          <p>
            <codeblock>explain select count(*) from part;

                       QUERY PLAN
------------------------------------------------------------------------------------
 Aggregate  (cost=0.00..296.14 rows=1 width=8)
   ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..295.10 rows=1 width=8)
         ->  Aggregate  (cost=0.00..294.10 rows=1 width=8)
               ->  Seq Scan on part  (cost=0.00..97.69 rows=100040 width=1)
 Settings:  <b>optimizer=on</b>
 Optimizer status: <b>Pivotal Optimizer (GPORCA) version 1.584</b>
(5 rows)</codeblock>
            <codeblock>explain select count(*) from part;

                       QUERY PLAN
----------------------------------------------------------------------------------------
 Aggregate  (cost=3519.05..3519.06 rows=1 width=8)
   ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=3518.99..3519.03 rows=1 width=8)
         ->  Aggregate  (cost=3518.99..3519.00 rows=1 width=8)
               ->  Seq Scan on part  (cost=0.00..3018.79 rows=100040 width=1)
 Settings:  <b>optimizer=on</b>
 Optimizer status: <b>Postgres query optimizer</b>
(5 rows)</codeblock>
          </p>
          <p>For this query, the server configuration parameter <codeph>OPTIMIZER</codeph> is
              <codeph>off</codeph>.<codeblock>explain select count(*) from part;

                       QUERY PLAN
----------------------------------------------------------------------------------------
 Aggregate  (cost=3519.05..3519.06 rows=1 width=8)
   ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=3518.99..3519.03 rows=1 width=8)
         ->  Aggregate  (cost=3518.99..3519.00 rows=1 width=8)
               ->  Seq Scan on part  (cost=0.00..3018.79 rows=100040 width=1)
 Settings: <b>optimizer=off</b>
 Optimizer status: <b>Postgres query optimizer</b>
(5 rows)</codeblock></p>
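          <p>To see the plans that the two optimizers produce for the same query, you can change the
              <codeph>optimizer</codeph> parameter for the current session. The following is a
            minimal sketch that reuses the <codeph>part</codeph> table from the examples in this
            section. It assumes that your role is permitted to set the parameter at the session
            level.</p>
          <codeblock>-- Show the current setting of the optimizer parameter.
SHOW optimizer;

-- Plan the query with the Postgres query optimizer for this session only.
SET optimizer = off;
EXPLAIN SELECT count(*) FROM part;

-- Re-enable GPORCA and plan the same query again.
SET optimizer = on;
EXPLAIN SELECT count(*) FROM part;

-- Return the parameter to its default value.
RESET optimizer;
</codeblock>
          <p>In each case, the <codeph>Optimizer status</codeph> line at the end of the output
            reports which optimizer generated the plan. Because cost values from the two optimizers
            are not directly comparable, compare the plan shapes rather than the cost numbers.</p>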
        </body>
      </topic>
    </topic>
  </topic>
  <topic id="topic44" xml:lang="en">
    <title>Examining Query Plans to Solve Problems</title>
    <body>
      <p>If a query performs poorly, examine its query plan and ask the following questions:</p>
      <ul>
        <li id="in182530"><b>Do operations in the plan take an exceptionally long time?</b> Look for
          an operation that consumes the majority of query processing time. For example, if an index
          scan takes longer than expected, the index could be out-of-date and need to be reindexed.
          Or, adjust <codeph>enable_&lt;operator&gt;</codeph> parameters to see if you can force the
          Postgres query optimizer (planner) to choose a different plan by disabling a particular
          query plan operator for that query.</li>
        <li id="in182538"><b>Are the optimizer's estimates close to reality?</b> Run <codeph>EXPLAIN
            ANALYZE</codeph> and see if the number of rows the optimizer estimates is close to the
          number of rows the query operation actually returns. If there is a large discrepancy,
          collect more statistics on the relevant columns. <p>See the <i>Greenplum
              Database Reference Guide</i> for more information on the <codeph>EXPLAIN
              ANALYZE</codeph> and <codeph>ANALYZE</codeph> commands.</p></li>
        <li id="in182542"><b>Are selective predicates applied early in the plan?</b> Apply the most
          selective filters early in the plan so fewer rows move up the plan tree. If the query plan
          does not correctly estimate query predicate selectivity, collect more statistics on the
          relevant columns. <ph>See the <codeph>ANALYZE</codeph> command in the
              <i>Greenplum Database Reference Guide</i> for more information about collecting
              statistics.
          </ph>You can also try reordering the <codeph>WHERE</codeph> clause of your SQL
          statement.</li>
        <li id="in182546"><b>Does the optimizer choose the best join order?</b> When you have a
          query that joins multiple tables, make sure that the optimizer chooses the most selective
          join order. Joins that eliminate the largest number of rows should be done earlier in the
          plan so fewer rows move up the plan tree. <p>If the plan is not choosing the optimal join
            order, set <codeph>join_collapse_limit=1</codeph> and use explicit <codeph>JOIN</codeph>
            syntax in your SQL statement to force the Postgres query optimizer (planner) to use the
            specified join order. You can also collect more statistics on the relevant join columns.
            </p><p>See the <codeph>ANALYZE</codeph> command in the <i>Greenplum
              Database Reference Guide</i> for more information about collecting statistics.</p></li>
        <li id="in182550"><b>Does the optimizer selectively scan partitioned tables?</b> If you use
          table partitioning, is the optimizer selectively scanning only the child tables required
          to satisfy the query predicates? Scans of the parent tables should return 0 rows since the
          parent tables do not contain any data. See <xref
            href="../../ddl/ddl-partition.xml#topic74"/> for an example of a query plan that shows a
          selective partition scan.</li>
        <li id="in182554"><b>Does the optimizer choose hash aggregate and hash join operations where
            applicable?</b> Hash operations are typically much faster than other types of joins or
          aggregations. Row comparison and sorting are done in memory rather than by reading from
          and writing to disk. To enable the query optimizer to choose hash operations, there must be
          sufficient memory available to hold the estimated number of rows. Try increasing work
          memory to improve performance for a query. If possible, run an <codeph>EXPLAIN
            ANALYZE</codeph> for the query to show which plan operations spilled to disk, how much
          work memory they used, and how much memory was required to avoid spilling to disk. For
              example:<p><codeph>Work_mem used: 23430K bytes avg, 23430K bytes max (seg0). Work_mem
              wanted: 33649K bytes avg, 33649K bytes max (seg0) to lessen workfile I/O affecting 2
              workers.</codeph></p><p>The <codeph>Work_mem wanted</codeph> message from <codeph>EXPLAIN
              ANALYZE</codeph> is based on the amount of data written to work files and is not
            exact. The minimum <codeph>work_mem</codeph> needed can differ from the suggested
            value.</p></li>
      </ul>
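      <p>Several of the checks in the list above can be tried in a single session. The following
        is a sketch under stated assumptions, not a tuning recipe. The <codeph>names</codeph> table
        comes from the earlier examples, and an index on its <codeph>id</codeph> column is assumed.
        The <codeph>customers</codeph> and <codeph>orders</codeph> tables and their columns are
        hypothetical. <codeph>enable_indexscan</codeph> stands in for whichever
          <codeph>enable_&lt;operator&gt;</codeph> parameter is relevant to your plan. These
        parameters influence the Postgres query optimizer (planner).</p>
      <codeblock>-- Collect statistics on the columns used in predicates and joins
-- (customers and orders are hypothetical tables).
ANALYZE names (id);
ANALYZE customers (customer_id);
ANALYZE orders (customer_id);

-- Disable one plan operator for this session to see whether the planner
-- chooses a different plan, then restore the default setting.
-- Assumes an index exists on names.id.
SET enable_indexscan = off;
EXPLAIN ANALYZE SELECT * FROM names WHERE id=22;
RESET enable_indexscan;

-- Make the planner follow the join order written in the query.
SET join_collapse_limit = 1;
EXPLAIN SELECT c.customer_name, o.order_total
FROM customers c
  JOIN orders o ON o.customer_id = c.customer_id
WHERE o.order_total &gt; 1000;
RESET join_collapse_limit;
</codeblock>
      <p>Compare the <codeph>EXPLAIN</codeph> output before and after each change, and keep a
        setting only if the resulting plan actually runs faster.</p>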
    </body>
  </topic>
</topic>