• A
    Adding GiST support for GPORCA · ec3693e6
    Ashuka Xue 提交于
    Prior to this commit, there was no support for GiST indexes in GPORCA.
    For queries involving GiST indexes, ORCA was selecting Table Scan paths
    as the optimal plan. These plans could take up to 300+ times longer than
    Planner, which generated a index scan plan using the GiST index.
    
    Example:
    ```
    CREATE TABLE gist_tbl (a int, p polygon);
    CREATE TABLE gist_tbl2 (b int, p polygon);
    CREATE INDEX poly_index ON gist_tbl USING gist(p);
    
    INSERT INTO gist_tbl SELECT i, polygon(box(point(i, i+2),point(i+4,
    i+6))) FROM generate_series(1,50000)i;
    INSERT INTO gist_tbl2 SELECT i, polygon(box(point(i+1, i+3),point(i+5,
    i+7))) FROM generate_series(1,50000)i;
    
    ANALYZE;
    ```
    With the query `SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE
    gist_tbl.p <@ gist_tbl2.p;`, we see a performance increase with the
    support of GiST.
    
    Before:
    ```
    EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                         QUERY PLAN
    ---------------------------------------------------------------------------------------------------------------------
     Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
       ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..171401912.12 rows=1 width=8)
             ->  Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
                   ->  Nested Loop  (cost=0.00..171401912.12 rows=335499869 width=1)
                         Join Filter: gist_tbl.p <@ gist_tbl2.p
                         ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                         ->  Materialize  (cost=0.00..530.81 rows=49997 width=101)
                               ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..525.76 rows=49997 width=101)
                                     ->  Table Scan on gist_tbl  (cost=0.00..432.24 rows=16666 width=101)
     Optimizer status: PQO version 2.65.1
    (10 rows)
    
    Time: 170.172 ms
    SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
     count
    -------
     49999
    (1 row)
    
    Time: 546028.227 ms
    ```
    
    After:
    ```
    EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                      QUERY PLAN
    ---------------------------------------------------------------------------------------------------------------
     Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
       ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..21749053.24 rows=1 width=8)
             ->  Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
                   ->  Nested Loop  (cost=0.00..21749053.24 rows=335499869 width=1)
                         Join Filter: true
                         ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..526.39 rows=50328 width=101)
                               ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                         ->  Bitmap Table Scan on gist_tbl  (cost=0.00..21746725.48 rows=6667 width=1)
                               Recheck Cond: gist_tbl.p <@ gist_tbl2.p
                               ->  Bitmap Index Scan on poly_index  (cost=0.00..0.00 rows=0 width=0)
                                     Index Cond: gist_tbl.p <@ gist_tbl2.p
     Optimizer status: PQO version 2.65.1
    (12 rows)
    
    Time: 617.489 ms
    
    SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
     count
    -------
     49999
    (1 row)
    
    Time: 7779.198 ms
    ```
    
    GiST support was implemented by sending over GiST index information to
    GPORCA in the metadata using a new index enum specifically for GiST.
    Signed-off-by: NBhuvnesh Chaudhary <bchaudhary@pivotal.io>
    Signed-off-by: NEkta Khanna <ekhanna@pivotal.io>
    ec3693e6
cdbpartindex.c 43.7 KB