1. 13 Aug, 2020 6 commits
    • H
      Make Fault Injection sites cheaper, when no faults have been activated. · 5e2c9e8b
      Heikki Linnakangas committed
      Fault injection is expected to be *very* cheap; we even enable it in
      production builds. That's why I was very surprised when I saw 'perf' report
      that FaultInjector_InjectFaultIfSet() was consuming about 10% of CPU time
      in a performance test I was running on my laptop. I tracked it to the
      FaultInjector_InjectFaultIfSet() call in standard_ExecutorRun(). It gets
      called for every tuple between 10000 and 1000000, on every segment.
      
      Why is FaultInjector_InjectFaultIfSet() so expensive? It has a quick exit
      in it, when no faults have been activated, but before reaching the quick
      exit it calls strlen() on the arguments. That's not cheap. And the function
      call isn't completely negligible on hot code paths, either.
      
      To fix, turn FaultInjector_InjectFaultIfSet() into a macro that's only
      a few instructions long in the fast path. That should be cheap enough.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
      Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
      Reviewed-by: Asim R P <pasim@vmware.com>
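The fast-path idea in this commit message can be illustrated with a minimal sketch. It is not the actual Greenplum macro: `num_active_faults`, `slow_path_calls`, `inject_fault_slow()` and `INJECT_FAULT_IF_SET()` are hypothetical names chosen for illustration. The point is only that the common case (no fault activated) costs one load and one compare, and the expensive name handling happens solely in the slow path.

```c
#include <stdbool.h>

/* Hypothetical counter of activated faults; zero in normal operation. */
static int num_active_faults = 0;

/* Hypothetical counter, used here only to observe slow-path calls. */
static int slow_path_calls = 0;

/*
 * Slow path: reached only when at least one fault has been activated.
 * A real implementation would look the name up here (strlen, hashing,
 * locking); this stand-in just records the call.
 */
static bool
inject_fault_slow(const char *fault_name)
{
	(void) fault_name;
	slow_path_calls++;
	return true;
}

/* Fast path: a single integer test when no fault is active. */
#define INJECT_FAULT_IF_SET(name) \
	(num_active_faults > 0 ? inject_fault_slow(name) : false)
```

With `num_active_faults` at zero, `INJECT_FAULT_IF_SET("some_fault")` evaluates to `false` without a function call or any string processing.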
      5e2c9e8b
    • H
      Remove comment about an argument that was removed earlier. · 5edd8625
      Heikki Linnakangas committed
      'GpPolicy' argument was removed in commit c892d95f.
      5edd8625
    • H
      42ba6b0a
    • P
      Fix query string truncation while dispatching to QE · 889ba39e
      Polina Bungina committed
      Executing a sufficiently long query that contains multi-byte characters can truncate the query string incorrectly. The truncation occasionally cuts through the middle of a multi-byte character and, with log_min_duration_statement set to 0, an invalid byte sequence is then written to the segment logs. Such a broken character in the logs causes problems when fetching log information from the gp_toolkit.__gp_log_segment_ext table: queries fail with the following error: «ERROR: invalid byte sequence for encoding…».
      This is caused by the buildGpQueryString function in `cdbdisp_query.c`, which prepares the query text for dispatch to the QE. It does not take character length into account when truncation is necessary (that is, when the text is longer than QUERY_STRING_TRUNCATE_SIZE).
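To illustrate what character-aware truncation means, here is a minimal sketch that assumes the text is valid UTF-8. The helper name `utf8_clip_len()` is hypothetical and this is not the code from the commit; PostgreSQL itself provides the encoding-aware `pg_mbcliplen()` for this kind of clipping, and whether the fix uses it is not stated here.

```c
#include <stddef.h>

/*
 * Return the largest length <= limit at which the string can be cut
 * without splitting a UTF-8 multi-byte character.
 * Assumes s holds valid UTF-8 of length len.
 */
static size_t
utf8_clip_len(const char *s, size_t len, size_t limit)
{
	size_t		cut = limit;

	if (len <= limit)
		return len;				/* nothing to truncate */

	/*
	 * If the byte at the cut point is a continuation byte (10xxxxxx),
	 * cutting here would split a character, so back up to its lead byte.
	 */
	while (cut > 0 && ((unsigned char) s[cut] & 0xC0) == 0x80)
		cut--;

	return cut;
}
```

For the string `"a\xC3\xA9" "b"` (the letter a, a two-byte character, the letter b), a limit of 2 would fall inside the two-byte character, so the helper backs up to 1.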
      889ba39e
    • P
      Coverity: Cleanup minor memory leak issue · c0f0ba08
      Pengzhou Tang committed
      c0f0ba08
    • D
      Allow static partition selection for lossy casts in ORCA · 4bddeeff
      Divyesh Vanjare committed
      For a table partitioned by a timestamp column, a query such as
        SELECT * FROM my_table WHERE ts::date = '2020-05-10'
      should only scan a few partitions.
      
      ORCA previously supported only implicit casts for partition selection.
      This commit extends ORCA to also support a subset of lossy (assignment)
      casts that are order-preserving (increasing) functions. This will
      improve ORCA's ability to eliminate partitions and produce faster plans.
      
      To ensure correctness, the additional supported functions are captured
      in an allow-list in gpdb::IsFuncAllowedForPartitionSelection(), which
      includes some built-in lossy casts such as ts::date and float::int.
      
      Details:
       - For list partitions, we compare our predicate with each distinct
         value in the list to determine whether the partition has to be
         selected or eliminated. Hence, none of the operators need to be
         changed for list partition selection.
      
       - For range partition selection, we check the bounds of each partition
         and compare them with the predicates to determine whether the
         partition has to be selected or eliminated.
      
         A partition such as [1, 2) shouldn't be selected for float = 2.0, but
         should be selected for float::int = 2. We therefore handle equality
         predicates differently when lossy casts are present (ub: upper
         bound, lb: lower bound):
      
         if (lossy cast on partition col):
           (lb::int <= 2) and (ub::int >= 2)
         else:
           ((lb <= 2 and inclusive lb) or (lb < 2))
           and
           ((ub >= 2 and inclusive ub ) or (ub > 2))
      
        - CMDFunctionGPDB now captures whether or not a cast is a lossy cast
          supported by ORCA for partition selection. This is then used in
          Expr2DXL translation to identify how partitions should be selected.
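The equality rule for range partitions given in the pseudocode above can be written out as a small C sketch. This is an illustration only, not ORCA code: `RangePartition`, `cast_to_int()` and `partition_may_match_eq()` are hypothetical names, and `cast_to_int()` is a simple rounding stand-in for the float-to-int cast (any order-preserving function would serve).

```c
#include <stdbool.h>

/* Hypothetical description of one range partition. */
typedef struct
{
	double		lb;				/* lower bound */
	bool		lb_inclusive;
	double		ub;				/* upper bound */
	bool		ub_inclusive;
} RangePartition;

/* Stand-in for a lossy, order-preserving cast: round to nearest integer. */
static long
cast_to_int(double x)
{
	return x >= 0 ? (long) (x + 0.5) : (long) (x - 0.5);
}

/*
 * Decide whether a partition may contain rows satisfying "col = value"
 * (no cast) or "col::int = value" (lossy cast on the partition column).
 */
static bool
partition_may_match_eq(const RangePartition *p, double value, bool lossy_cast)
{
	if (lossy_cast)
		return cast_to_int(p->lb) <= value && cast_to_int(p->ub) >= value;

	return ((p->lb <= value && p->lb_inclusive) || p->lb < value) &&
		((p->ub >= value && p->ub_inclusive) || p->ub > value);
}
```

For the partition [1, 2), the non-cast predicate with value 2.0 is rejected because the upper bound is exclusive, while the lossy-cast predicate with value 2 selects the partition, since values such as 1.6 in that range round to 2.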
      4bddeeff
  2. 12 Aug, 2020 2 commits
    • H
      Fix compilation without libuv's uv.h header. · 7858128f
      Heikki Linnakangas committed
      ic_proxy_backend.h includes libuv's uv.h header, and ic_proxy_backend.h
      was being included in ic_tcp.c, even when compiling with
      --disable-ic-proxy.
      7858128f
    • H
      ic-proxy: support parallel backend registration to proxy · 608514c5
      Hubert Zhang committed
      Previously, when backends connected to a proxy, we had to set up the
      domain socket pipe and send the HELLO message (and receive the ack
      message) in a blocking, non-parallel way. This made it hard for ICPROXY
      to introduce check_for_interrupt during backend registration.
      
      By utilizing the libuv loop, we can register backends in parallel. Note
      that this is one step towards replacing all the ic_tcp backend logic
      currently reused by ic_proxy. In the future, we should use libuv to
      replace all the backend logic, from registration to sending and
      receiving data.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
      608514c5
  3. 10 Aug, 2020 2 commits
    • P
      Cleanup idle gangs after releasing lock table's partition lock · 22308d6a
      ppggff committed
      When the GUC parameter resource_cleanup_gangs_on_wait is on, the backend will clean up idle gangs before waiting for the resource queue lock. The cleanup operation involves network IO, so it takes a while.
      
      In the current code, the cleanup operation still holds a partition lock that would normally only be held for a short period of time, which will prevent normal lock table operations by other backends. The cleanup operation should be moved to after releasing the partition lock and before the backend starts waiting.
      22308d6a
    • N
      ic-proxy: type checking in ic_proxy_new() · a3ef623d
      Ning Yu committed
      A typical mistake when allocating typed memory looks like this:
      
          int64 *ptr = malloc(sizeof(int32));
      
      To prevent this, ic_proxy_new() is now a typed allocator: it always
      returns a pointer of the specified type, for example:
      
          int64 *p1 = ic_proxy_new(int64); /* good */
          int64 *p2 = ic_proxy_new(int32); /* bad, gcc will raise a warning */
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
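One common way to build such a typed allocator in C is a macro that casts the result to a pointer of the same type it passes to `sizeof`. The sketch below uses the hypothetical name `typed_new()` on top of plain `malloc()`; it is an assumption about the general technique, not the definition of ic_proxy_new() itself.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical typed allocator: the size and the pointer type both come
 * from the same macro argument, so they cannot disagree. Assigning the
 * result to a pointer of a different type is an incompatible-pointer
 * assignment, which the compiler can diagnose.
 */
#define typed_new(type) ((type *) malloc(sizeof(type)))
```

With this macro, `int64_t *p = typed_new(int64_t);` compiles cleanly, while `int64_t *q = typed_new(int32_t);` mixes `int32_t *` and `int64_t *` and draws a compiler diagnostic.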
      a3ef623d
  4. 08 Aug, 2020 4 commits
  5. 07 Aug, 2020 5 commits
    • H
      Simplify responsibility for 'rel' in SET DISTRIBUTED BY and EXTEND TABLE. · 8518ed4f
      Heikki Linnakangas committed
      SET DISTRIBUTED BY and EXTEND TABLE subcommands worked differently from
      all other ALTER TABLE subcommands in who's responsible for closing the
      relcache reference. In all other subcommands, ATRewriteCatalogs opens and
      closes the 'rel', but for these two commands, the ATExec*() function
      closed it. I don't see any good reason for that. There were very old
      comments about forcing the relcache entry to be forgotten, but that
      explanation doesn't make sense to me, and everything seems to work without
      the early closing. Maybe it was needed a long time ago, but the code has
      changed a lot since it was written. Simplify, by closing the relation in
      ATRewriteCatalogs(), like with all other ALTER TABLE subcommands.
      Reviewed-by: Asim R P <pasim@vmware.com>
      8518ed4f
    • N
      ic-proxy: pass the copydml test · 0b46f726
      Ning Yu committed
      The copydml test creates both BEFORE and AFTER triggers on a table and
      checks the execution order of the client output.  The original output
      order is "BEFORE -> RESULT -> AFTER", this is the order produced in
      ic-tcp mode; in ic-udpifc mode the order has a chance to become "RESULT
      -> BEFORE -> AFTER", it is not determined.  There are 2 variants of the
      answer files for these 2 orders, so the test can pass on any order.
      
      In ic-proxy mode, the order can also be "BEFORE -> AFTER -> RESULT", so
      we also need a 3rd variant of the answer file for this order.  With
      this variant the test passes in ic-proxy mode, so we can re-enable the
      copydml test, which was previously disabled on ic-proxy pipeline jobs.
      Reviewed-by: Asim R P <pasim@vmware.com>
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      0b46f726
    • H
      Fix the flaky test of walreceiver (#10588) · ef949df2
      Hao Wu committed
      walreceiver is quite sensitive to any WAL write. After creating a table
      and inserting some tuples, it doesn't run a vacuum. There may be other
      cases that cause some WAL traffic. One of those WAL records is the
      RUNNING_XACTS record, which is relevant to hot standby.
      
      The last test_receive() only tests that there is no WAL traffic
      after receiving the WAL records up to the current xlog location. That
      still leaves a window in which a new WAL record can be transmitted to
      the walreceiver. Running test_receive() is not meaningful, since
      test_receive_and_verify() has already verified that all the WAL traffic
      up to the latest xlog location is received. So, remove test_receive().
      Reviewed-by: Paul Guo <paulguo@gmail.com>
      Reviewed-by: (Jerome)Junfeng Yang <jeyang@pivotal.io>
      ef949df2
    • A
      Fix connection string for gpsd · e4bde277
      Abhijit Subramanya committed
      gpsd was failing because the connectionString that we passed to pgdb.connect
      had the parameters in the wrong order. It started failing after upgrading to a
      higher version of PyGreSQL. So use a dictionary instead in order to avoid
      sending in the parameters incorrectly.
      Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
      e4bde277
    • L
      docs - update pxf xrefs to v5.14 (#10591) · 7f79304d
      Lisa Owen committed
      7f79304d
  6. 06 Aug, 2020 4 commits
    • P
    • P
      Fix potential panic in visibility check code. (#10589) · 85811692
      Paul Guo committed
      We've seen a panic case on gpdb 6 with stack as below,
      
      3  markDirty (isXmin=0 '\000', tuple=0x7effe221b3c0, relation=0x0, buffer=16058) at tqual.c:105
      4  SetHintBits (xid=<optimized out>, infomask=1024, rel=0x0, buffer=16058, tuple=0x7effe221b3c0) at tqual.c:199
      5  HeapTupleSatisfiesMVCC (relation=0x0, htup=<optimized out>, snapshot=0x15f0dc0 <CatalogSnapshotData>, buffer=16058) at tqual.c:1200
      6  0x00000000007080a8 in systable_recheck_tuple (sysscan=sysscan@entry=0x2e85940, tup=tup@entry=0x2e859e0) at genam.c:462
      7  0x000000000078753b in findDependentObjects (object=0x2e856e0, flags=<optimized out>, stack=0x0, targetObjects=0x2e85b40, pendingObjects=0x2e856b0,
         depRel=0x7fff2608adc8) at dependency.c:793
      8  0x00000000007883c7 in performMultipleDeletions (objects=objects@entry=0x2e856b0, behavior=DROP_RESTRICT, flags=flags@entry=0) at dependency.c:363
      9  0x0000000000870b61 in RemoveRelations (drop=drop@entry=0x2e85000) at tablecmds.c:1313
      10 0x0000000000a85e48 in ExecDropStmt (stmt=stmt@entry=0x2e85000, isTopLevel=isTopLevel@entry=0 '\000') at utility.c:1765
      11 0x0000000000a87d03 in ProcessUtilitySlow (parsetree=parsetree@entry=0x2e85000,
      
      The reason is that we pass a NULL relation to the visibility check code, which
      might use the relation variable to determine whether a hint bit should be set.
      Let's pass the correct relation variable even though it might not end up being used.
      
      I'm not able to reproduce the issue locally, so I cannot provide a test case,
      but this is surely a potential issue.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
      85811692
    • Z
      Print CTID when we detect data distribution wrong for UPDATE|DELETE. · 3faf0b51
      Zhenghua Lyu committed
      When an UPDATE or DELETE statement errors out because the CTID does
      not belong to the local segment, we should also print the CTID of
      the tuple, so that it is much easier to locate the wrongly
      distributed data via:
        `select * from t where gp_segment_id = xxx and ctid='(aaa,bbb)'`.
      3faf0b51
    • D
      Docs - update component versions for 6.10 · 0a475980
      David Yozie committed
      0a475980
  7. 05 Aug, 2020 2 commits
  8. 04 Aug, 2020 2 commits
  9. 03 Aug, 2020 7 commits
  10. 01 Aug, 2020 5 commits
  11. 31 Jul, 2020 1 commit
    • T
      Remove fault injection from gpMgmt · 2f65547b
      Tyler Ramer committed
      The command execution framework shipped with fault injection in
      delivered code. See https://github.com/greenplum-db/gpdb/issues/10546
      for execution details and implications.
      
      It seems the fault injection framework was added in 2009 and used
      sparingly; it should be removed until it can be safely replaced.
      
      Additionally, the "gppylib/test/regress" folder used the fault injector,
      but the "check-regress" target seems not to have been called. This is
      evident because pygresql regression checks are present, yet pygresql has
      not been in master for some time, without causing any errors in these
      tests.
      Authored-by: Tyler Ramer <tramer@vmware.com>
      2f65547b