• B
    Bump Orca to v3.5.0 and Remove fix-me · 19f36828
    Bhuvnesh Chaudhary 提交于
    Update the tests and bump orca version corresponding to the below fix in
    ORCA.
    
    If the Scalar Agg node has child which has volatile function and has
    universal distribution for relational child, instead of deriving the
    spec as that of the child i.e Universal Spec, we should derive the
    distribution spec as singleton.
    
    Deriving singleton distribution spec will ensure that the tree is
    executed on one node and the results will be distributed to the
    remaining segments.
    
    Consider the below setup:
    
    create table f (a text);
    explain insert into f select sum(random()) from (select 1);
    
    Physical plan:
    +--CPhysicalDML (Insert, "foo"), Source Columns: ["f1" (2)], Action: ("ColRef_0004" (4))   rows:1   width:1  rebinds:1   cost:0.016658   origin: [Grp:12, GrpExpr:2]
       +--CPhysicalComputeScalar   rows:1   width:1  rebinds:1   cost:0.001033   origin: [Grp:23, GrpExpr:3]
          |--CPhysicalMotionHashDistribute HASHED: [ +--CScalarIdent "f1" (2)
     , nulls colocated, duplicate sensitive ]   rows:1   width:1  rebinds:1   cost:0.001029   origin: [Grp:11, GrpExpr:2]
          |  +--CPhysicalComputeScalar   rows:1   width:1  rebinds:1   cost:0.001008   origin: [Grp:11, GrpExpr:1]
          |     |--CPhysicalScalarAgg( Global ) Grp Cols: [], Minimal Grp Cols:[], Generates Duplicates :[ 0 ]    rows:1   width:8  rebinds:1   cost:0.001000   origin: [Grp:6, GrpExpr:3]
          |     |  |--CPhysicalTVF   rows:1000   width:1  rebinds:1   cost:0.001000   origin: [Grp:1, GrpExpr:1]
          |     |  |  |--CScalarConst (1)   origin: [Grp:0, GrpExpr:0]
          |     |  |  +--CScalarConst (1)   origin: [Grp:0, GrpExpr:0]
          |     |  +--CScalarProjectList   origin: [Grp:5, GrpExpr:0]
          |     |     +--CScalarProjectElement "sum" (1)   origin: [Grp:4, GrpExpr:0]
          |     |        +--CScalarAggFunc (sum , Distinct: false , Aggregate Stage: Global)   origin: [Grp:3, GrpExpr:0]
          |     |           +--CScalarFunc (random)   origin: [Grp:2, GrpExpr:0]
          |     +--CScalarProjectList   origin: [Grp:10, GrpExpr:0]
          |        +--CScalarProjectElement "f1" (2)   origin: [Grp:9, GrpExpr:0]
          |           +--CScalarCoerceViaIO   origin: [Grp:8, GrpExpr:0]
          |              +--CScalarIdent "sum" (1)   origin: [Grp:7, GrpExpr:0]
          +--CScalarProjectList   origin: [Grp:22, GrpExpr:0]
             +--CScalarProjectElement "ColRef_0004" (4)   origin: [Grp:21, GrpExpr:0]
                +--CScalarConst (1)   origin: [Grp:0, GrpExpr:0]
    ",
                                                 QUERY PLAN
    ----------------------------------------------------------------------------------------------------
     Insert  (cost=0.00..0.02 rows=1 width=8)
       ->  Result  (cost=0.00..0.00 rows=1 width=12)
             ->  Result  (cost=0.00..0.00 rows=1 width=8)
                   ->  Result  (cost=0.00..0.00 rows=1 width=8)
                         ->  Aggregate  (cost=0.00..0.00 rows=1 width=8)
                               ->  Function Scan on generate_series  (cost=0.00..0.00 rows=334 width=1)
    
    Above CPhysicalMotionHashDistribute has a universal child (i.e
    CPhysicalTVF), and translator will convert it to a Result Node with Hash
    Filter. The hash filter is build on the projected column in this query
    i.e sum(random()).. Random being a volatile function, will cause the
    resulting sum to be non-deterministic i.e volatile in this case. So the
    hash filter may not appropriately filter out the segment and the insert
    may result in inserting non-deterministic number of rows in the table
    (with the maximum being equal to the number of segments).  So, we should
    derive singleton distribution if the relation child of the scalar agg
    node delivers universal spec and the project list has volatile
    functions.
    
    Resulting plan will be:
    
    pivotal=# explain insert into foo select sum(random()) from (select 1) f;
                                       QUERY PLAN
    ---------------------------------------------------------------------------------
     Insert  (cost=0.00..0.02 rows=1 width=8)
       ->  Result  (cost=0.00..0.00 rows=1 width=12)
             ->  Redistribute Motion 1:3  (slice1)  (cost=0.00..0.00 rows=1 width=8)
                   Hash Key: (((sum(random())))::text)
                   ->  Result  (cost=0.00..0.00 rows=1 width=8)
                         ->  Aggregate  (cost=0.00..0.00 rows=1 width=8)
                               ->  Result  (cost=0.00..0.00 rows=1 width=1)
    19f36828
configure 558.2 KB