1. 09 Jan 2019, 15 commits
    • Fix assertion failure in join planning. · 22288e8d
      Heikki Linnakangas committed
      cdbpath_motion_for_join() sometimes returned an incorrect locus for a
      join between SingleQE and Hashed loci. This happened when even the "last
      resort" strategy of moving the hashed side to the single QE failed. It
      can happen at least in the query added to the regression tests, which
      involves a nested loop join path where one side has a SingleQE locus,
      the other side has a Hashed locus, and there are no join predicates
      that can be used to determine the resulting locus.

      While we're at it, turn the assertion that this tripped, and some
      related ones at the same place, into elog()s. There's no need to crash
      the whole server if the planner screws up, and it's good to perform
      these sanity checks in production, too.
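      The Assert-to-elog change follows a common PostgreSQL pattern. As a
      hedged sketch (the macro and function below are simplified stand-ins,
      not GPDB's real definitions): Assert() is compiled out of production
      builds and takes down the whole server when it fires, while
      elog(ERROR, ...) reports the problem in every build and aborts only the
      current query.

```c
#include <stdio.h>

/* Simplified stand-ins for PostgreSQL's error machinery; in this sketch
 * elog(ERROR, ...) just reports instead of longjmp'ing out of the query. */
#define ERROR 20
#define elog(level, ...) \
    do { fprintf(stderr, __VA_ARGS__); fputc('\n', stderr); } while (0)

typedef enum
{
    CdbLocusType_Null,
    CdbLocusType_SingleQE,
    CdbLocusType_Hashed
} CdbLocusType;

/* Before: Assert(locustype != CdbLocusType_Null) -- enabled only in
 * assert builds, and a failure crashes the whole server.
 * After: an explicit check that also runs in production builds. */
static int
join_locus_is_valid(CdbLocusType locustype)
{
    if (locustype == CdbLocusType_Null)
    {
        elog(ERROR, "could not determine locus for join");
        return 0;
    }
    return 1;
}
```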
      
      The failure of the "last resort" codepath was left unhandled by commit
      0522e960. Fixes https://github.com/greenplum-db/gpdb/issues/6643.
      Reviewed-by: Paul Guo <pguo@pivotal.io>
    • fix typo and indent (#6653) · 1ffc362e
      Yandong Yao committed
    • Do not enforce join ordering for ANTI and LASJ. (#6625) · 29daab51
      Richard Guo committed
      The following identity holds true:
      
      	(A antijoin B on (Pab)) innerjoin C on (Pac)
          	= (A innerjoin C on (Pac)) antijoin B on (Pab)
      
      So we should not enforce join ordering for ANTI. Instead we need to
      collapse ANTI join nodes so that they participate fully in the join
      order search.
      
      For example:
      
      	select * from a join b on a.i = b.i where
      		not exists (select i from c where a.i = c.i);
      
      For this query, the original join order is "(a innerjoin b) antijoin c".
      If we enforce ANTI join ordering, this will be the final join order. But
      another join order, "(a antijoin c) innerjoin b", is also legal. We
      should take this order into consideration and pick the cheaper one.

      LASJ joins are handled the same way as ANTI joins.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
    • pg_rewind: parse bitmap wal records. · 161920e8
      Ashwin Agrawal committed
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • pg_rewind: add test for bitmap wal records. · a6913d2f
      Ashwin Agrawal committed
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • In maintenance_mode ignore distributed log. · 4eb48055
      Ashwin Agrawal committed
      With this commit, a QE in maintenance mode will ignore the distributed
      log and behave like a single-instance Postgres.

      Without this change, when a QE is started as a standalone instance, no
      distributed snapshot is created. As a result, the distributed oldest
      xmin points to the oldest datfrozen_xid in the system, so vacuuming any
      table reports HEAP_TUPLE_RECENTLY_DEAD and skips cleaning up dead rows.
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • Fix calculation of WorkfileMgrLock and WorkfileQuerySpaceLock · 1540eb1c
      Pengzhou Tang committed
      All lwlocks are stored in MainLWLockArray which is an array of
      LWLockPadded structures:
      
      typedef union LWLockPadded
      {
        LWLock lock;
        char pad[LWLOCK_PADDED_SIZE];
      } LWLockPadded;
      
      The calculation in SyncHTPartLockId that fetches an lwlock is incorrect
      because it indexes the array as if it were an array of bare LWLocks. In
      the current code base this happens to work because sizeof(LWLock) is 32,
      but if the LWLock structure is ever enlarged, the calculation will break.
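      A hedged sketch of the bug (the structures below are simplified
      stand-ins; SyncHTPartLockId's real definition is not shown in this
      log): indexing the padded array with the wrong element stride only
      works while sizeof(LWLock) happens to equal the padding size.

```c
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL's lock structures. */
#define LWLOCK_PADDED_SIZE 128

typedef struct LWLock
{
    int state;
    int waiters[20];            /* enlarged on purpose: sizeof(LWLock) != 128 */
} LWLock;

typedef union LWLockPadded
{
    LWLock lock;
    char   pad[LWLOCK_PADDED_SIZE];
} LWLockPadded;

static LWLockPadded MainLWLockArray[8];

/* Buggy: the stride is sizeof(LWLock), not LWLOCK_PADDED_SIZE, so for i > 0
 * this points into the middle of some padded slot. */
static LWLock *
fetch_lwlock_buggy(int i)
{
    return ((LWLock *) MainLWLockArray) + i;
}

/* Fixed: index the padded array first, then take the embedded lock. */
static LWLock *
fetch_lwlock_fixed(int i)
{
    return &MainLWLockArray[i].lock;
}
```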
    • fix according to comments · a8c2f7c4
      Pengzhou Tang committed
    • Dispatcher should use DISPATCH_WAIT_FINISH mode to wait QEs for init plans · b7bb5438
      Pengzhou Tang committed
      GPDB always sets the REWIND flag for subplans, including init plans. In
      6195b967 we tightened the restriction that a node which is not
      eager-free cannot be squelched early, including init plans. This
      exposed a few hidden bugs: if an init plan contains a motion node that
      needs to be squelched early, the whole query gets stuck in
      cdbdisp_checkDispatchResult(), because some QEs keep sending tuples.

      To resolve this, the dispatcher now uses DISPATCH_WAIT_FINISH mode to
      wait for the dispatch results of init plans. An init plan with a motion
      is always executed on the QD and should always be a SELECT-like plan,
      so it must already have fetched all the tuples it needs before the
      dispatcher waits for the QEs. DISPATCH_WAIT_FINISH is therefore the
      right mode for init plans.
    • bd7c4b1a
    • Fix distributed snapshot xmax check. · 2b4674a4
      Ekta Khanna committed
      As part of commit dc78e56c, the distributed snapshot logic was modified
      to use latestCompletedDxid. This changed xmax from an inclusive to an
      exclusive bound on the range of visible transactions in the snapshot.
      Hence, update the check to return
      DISTRIBUTEDSNAPSHOT_COMMITTED_INPROGRESS even for a transaction id
      equal to the global xmax. The other way to fix this would be to use
      latestCompletedDxid without the +1 for xmax, but it is better to keep
      the logic similar to the local snapshot check and not have xmax inside
      the inclusive range of visible transactions.
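      A hedged sketch of the boundary change (this is an illustrative
      function, not GPDB's real distributed-snapshot code): with xmax taken
      as latestCompletedDxid + 1, a transaction id equal to xmax must still
      be reported as in-progress, because xmax is now an exclusive bound,
      just as in local snapshots.

```c
#include <stdbool.h>

typedef unsigned long DistributedTransactionId;

/* Illustrative sketch: does the snapshot consider xid still in progress?
 * xmin is the oldest in-progress dxid, xmax = latestCompletedDxid + 1. */
static bool
dxid_in_progress(DistributedTransactionId xid,
                 DistributedTransactionId xmin,
                 DistributedTransactionId xmax)
{
    if (xid < xmin)
        return false;   /* completed before the snapshot was taken */
    if (xid >= xmax)    /* was "xid > xmax" when xmax was inclusive */
        return true;    /* DISTRIBUTEDSNAPSHOT_COMMITTED_INPROGRESS */
    return false;       /* the real code consults the in-progress array here */
}
```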
      
      This was exposed in CI by the
      isolation/heap-repeatable-read-vacuum-freeze test failing
      intermittently, because the isolation framework itself triggers a query
      on pg_locks to check for deadlocks. This commit adds an explicit test
      to cover the scenario.
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • Avoid calling CreateRestartPoint() from startup process. · 558c460e
      Ashwin Agrawal committed
      With commit 8a11bfff, aggressive restartpoint creation is no longer
      performed in GPDB either. Since CreateRestartPoint() is not written to
      be called from the startup process, a GPDB-specific exception was added
      in the past so that the earlier aggressive restartpoint creation, which
      could happen via the startup process, worked correctly.

      Now a restartpoint is created on a checkpoint record only when
      gp_replica_check is running, and that should be done via the
      checkpointer process. Eliminate any case of calling
      CreateRestartPoint() from the startup process, and thereby remove the
      GPDB-added exception in CreateRestartPoint() and realign with the
      upstream code.
    • Make gptransfer test's output understandable to gpdiff. · c3b9d927
      Heikki Linnakangas committed
      The gptransfer behave test was using gpdiff to compare data between the
      source and target systems, relying on gpdiff to mask row-order
      differences. However, after 1f44603a, gpdiff no longer recognized the
      results as psql result sets, because the test did not echo the SELECT
      statements to the output, and gpdiff expects to see those. Fix by
      echoing the statements, like in pg_regress. That also makes the output
      more readable if there are any differences.

      While we're at it, change the gpdiff invocation to produce a unified
      diff. If the test fails because there is a difference, that makes the
      output a lot more readable.
    • Fix regression failure on "gpmapreduce --help". · f8035a33
      Heikki Linnakangas committed
      The test was using "-- ignore" to make gpdiff ignore any differences in
      the test output. But after commit 1f44603a, gpdiff no longer considers
      the test's output a psql result set, so the "-- ignore" directive no
      longer works. Use the more common "-- start_ignore"/"-- end_ignore"
      block instead.

      (I'm not sure how useful the test is if we don't check the output, but
      that's a different story.)
  2. 08 Jan 2019, 16 commits
  3. 07 Jan 2019, 7 commits
    • Refactor executor Squelch and EagerFree functions. · 6195b967
      Heikki Linnakangas committed
      Merge the two concepts, squelching and eager-freeing. There is now only
      one function, ExecSquelchNode(), that you can call. It recurses to
      children, as before, but it now also performs eager-freeing of resources.
      Previously that was done as a separate call, but that was an
      unnecessary distinction, because all callers of ExecSquelchNode() also
      called ExecEagerFree().
      
      The concept of eager-freeing still lives on, as ExecEagerFree*() functions
      specific to many node types. But it no longer recurses! The pattern is
      that ExecSquelchNode() always performs eager freeing of the node, and also
      recurses. In addition to that, some node types also call the node-specific
      EagerFree function of the same node, after reaching the end of tuples.
      
      This makes it more clear which function should be called when.
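      The division of labor described above can be sketched as follows (a
      hedged, minimal illustration: the PlanState struct and function bodies
      are stand-ins, not the real executor code): ExecSquelchNode() eager-frees
      the node it is given and recurses into its children, while the
      node-specific ExecEagerFree*() functions do not recurse.

```c
#include <stddef.h>

/* Minimal stand-in for an executor plan node. */
typedef struct PlanState
{
    struct PlanState *lefttree;
    struct PlanState *righttree;
    int               freed;     /* set once resources are released */
} PlanState;

/* Node-specific eager free: releases this node's resources only.
 * It no longer recurses into children. */
static void
ExecEagerFreeNode(PlanState *node)
{
    node->freed = 1;
}

/* Squelch: eager-free this node, then recurse into the children. */
static void
ExecSquelchNode(PlanState *node)
{
    if (node == NULL)
        return;
    ExecEagerFreeNode(node);
    ExecSquelchNode(node->lefttree);
    ExecSquelchNode(node->righttree);
}
```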
      
      ExecEagerWalker() used to have special handling for the pattern of:
        Result
        -> Material
          -> Broadcast Motion
      
      I tried removing that, but then I started to get "illegal rescan of motion
      node" errors in the regression tests, from queries with a LIMIT node in a
      subquery. Upon closer look, I believe that was because the Limit node was
      calling ExecSquelchNode() on the input, even though the Limit node was
      marked as rescannable. To fix that, I added delayEagerFree logic to Limit
      node, to not call ExecSquelchNode() when the node might get rescanned
      later.
      
      The planstate_walk_kids() code did not know how to recurse into the
      children of a MergeAppend node. We missed adding that logic when we
      merged the MergeAppend node type from upstream, in 9.1. We no longer
      use that mechanism for recursing in ExecSquelchNode(), but it should
      probably still be fixed, as a separate commit later.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/6602 and
      https://github.com/greenplum-db/gpdb/issues/6074.
      Reviewed-by: Tang Pengzhou <ptang@pivotal.io>
    • Remove pointless mock test. · 45160699
      Heikki Linnakangas committed
      The test was verifying that when ExecEagerFree() is called on a
      ShareInputScanState node, it calls ExecEagerFreeShareInputScan(). But
      that is trivially true: there is a direct call to
      ExecEagerFreeShareInputScan() in ExecEagerFree(). It seems pointless to
      have a test for it.
    • Fix struct name in comment. · 3ef7748e
      Heikki Linnakangas committed
    • Fix typos in comments. · 76797520
      Heikki Linnakangas committed
      Mostly misspellings of "function".
    • Do not use hardcoded TemplateDbOid for database access during standby promotion. (#6601) · f13bc209
      Paul Guo committed
      TemplateDbOid is the oid of database template1, but template1 can be
      recreated, and then its oid is no longer TemplateDbOid.

      Also change to use the database postgres to find the database oid and
      tablespace, like FTS and GDD do; we had better use the same database
      for this purpose. I thought about which database GPDB should use:
      template1 or postgres. Both are template databases, but some users or
      customers customize their own template1 database, and template1 is the
      default template for database creation, so if auxiliary processes used
      it, the "create database" command would fail. Using the database
      postgres seems more reasonable. Users could, of course, drop the
      database postgres, but that is a rare case and they could easily
      recreate it for our purpose. We really do not need to over-design for
      such a rare case.
    • remove unused field within CdbComponentDatabases · 9b6c40de
      Pengzhou Tang committed
    • Fallback 'default_transaction_isolation' guc from serializable to repeatable read (#6572) · 00527685
      Jinbao Chen committed
      
      Serializable isolation is not yet supported, so we need to fall back
      the GUCs 'transaction_isolation' and 'default_transaction_isolation'
      from serializable to repeatable read. Previously we only applied the
      fallback correctly to transaction_isolation; for
      default_transaction_isolation, the fallback only took effect via the
      'SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL
      serializable' statement, not via "SET default_transaction_isolation =
      'serializable'". Add a check hook, check_DefaultXactIsoLevel, to fix
      this.
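      The shape of such a check hook can be sketched as follows (a minimal
      illustration; the constant names mirror PostgreSQL's XACT_* levels, but
      the simplified signature and body are assumptions, not the actual GPDB
      source): the hook rewrites the requested value before it is stored, so
      every assignment path gets the same fallback.

```c
#include <stdbool.h>

/* Isolation levels, mirroring PostgreSQL's XACT_* constants. */
enum
{
    XACT_READ_UNCOMMITTED,
    XACT_READ_COMMITTED,
    XACT_REPEATABLE_READ,
    XACT_SERIALIZABLE
};

/* GUC check hook sketch: called for every assignment to
 * default_transaction_isolation, regardless of which SET syntax was used.
 * Returning true accepts the (possibly adjusted) value. */
static bool
check_DefaultXactIsoLevel(int *newval)
{
    if (*newval == XACT_SERIALIZABLE)
        *newval = XACT_REPEATABLE_READ;  /* serializable unsupported: fall back */
    return true;
}
```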
  4. 05 Jan 2019, 2 commits