- 15 Aug 2020, 1 commit
-
-
Committed by Bhuvnesh Chaudhary
-
- 13 Aug 2020, 3 commits
-
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
Fault injection is expected to be *very* cheap; we even enable it on production builds. That's why I was very surprised when I saw 'perf' report that FaultInjector_InjectFaultIfSet() was consuming about 10% of CPU time in a performance test I was running on my laptop. I tracked it to the FaultInjector_InjectFaultIfSet() call in standard_ExecutorRun(), which gets called for every tuple between 10000 and 1000000, on every segment.

Why is FaultInjector_InjectFaultIfSet() so expensive? It has a quick exit for when no faults have been activated, but before reaching the quick exit it calls strlen() on its arguments. That's not cheap. And the function call itself isn't negligible on hot code paths, either.

To fix, turn FaultInjector_InjectFaultIfSet() into a macro that is only a few instructions long in the fast path. That should be cheap enough.

Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
Reviewed-by: Asim R P <pasim@vmware.com>
-
Committed by Divyesh Vanjare
For a table partitioned by a timestamp column, a query such as SELECT * FROM my_table WHERE ts::date = '2020-05-10' should only scan a few partitions. ORCA previously supported only implicit casts for partition selection. This commit extends ORCA to also support a subset of lossy (assignment) casts that are order-preserving (increasing) functions. This improves ORCA's ability to perform partition elimination and produce faster plans.

To ensure correctness, the additional supported functions are captured in an allow-list in gpdb::IsFuncAllowedForPartitionSelection(), which includes some built-in lossy casts such as ts::date, float::int, etc.

Details:
- For list partitions, we compare our predicate with each distinct value in the list to determine if the partition has to be selected/eliminated. Hence, none of the operators need to be changed for list partition selection.
- For range partition selection, we check the bounds of each partition and compare them with the predicates to determine if the partition has to be selected/eliminated. A partition such as [1, 2) shouldn't be selected for float = 2.0, but should be selected for float::int = 2. We change the logic to handle equality predicates differently when lossy casts are present (ub: upper bound, lb: lower bound):

  if (lossy cast on partition col):
      (lb::int <= 2) and (ub::int >= 2)
  else:
      ((lb <= 2 and inclusive lb) or (lb < 2)) and ((ub >= 2 and inclusive ub) or (ub > 2))

- CMDFunctionGPDB now captures whether or not a cast is a lossy cast supported by ORCA for partition selection. This is then used in Expr2DXL translation to identify how partitions should be selected.
-
- 12 Aug 2020, 3 commits
-
-
Committed by Heikki Linnakangas
ic_proxy_backend.h includes libuv's uv.h header, and ic_proxy_backend.h was being included in ic_tcp.c, even when compiling with --disable-ic-proxy.
-
Committed by Paul Guo
We've seen a panic case on gpdb 6 with a stack as below:

3  markDirty (isXmin=0 '\000', tuple=0x7effe221b3c0, relation=0x0, buffer=16058) at tqual.c:105
4  SetHintBits (xid=<optimized out>, infomask=1024, rel=0x0, buffer=16058, tuple=0x7effe221b3c0) at tqual.c:199
5  HeapTupleSatisfiesMVCC (relation=0x0, htup=<optimized out>, snapshot=0x15f0dc0 <CatalogSnapshotData>, buffer=16058) at tqual.c:1200
6  0x00000000007080a8 in systable_recheck_tuple (sysscan=sysscan@entry=0x2e85940, tup=tup@entry=0x2e859e0) at genam.c:462
7  0x000000000078753b in findDependentObjects (object=0x2e856e0, flags=<optimized out>, stack=0x0, targetObjects=0x2e85b40, pendingObjects=0x2e856b0, depRel=0x7fff2608adc8) at dependency.c:793
8  0x00000000007883c7 in performMultipleDeletions (objects=objects@entry=0x2e856b0, behavior=DROP_RESTRICT, flags=flags@entry=0) at dependency.c:363
9  0x0000000000870b61 in RemoveRelations (drop=drop@entry=0x2e85000) at tablecmds.c:1313
10 0x0000000000a85e48 in ExecDropStmt (stmt=stmt@entry=0x2e85000, isTopLevel=isTopLevel@entry=0 '\000') at utility.c:1765
11 0x0000000000a87d03 in ProcessUtilitySlow (parsetree=parsetree@entry=0x2e85000,

The reason is that we pass a NULL relation to the visibility check code, which might use the relation variable to determine whether the hint bit should be set or not. Let's pass the correct relation variable even if it might not be used in the end. I'm not able to reproduce the issue locally so I cannot provide a test case, but this is surely a potential issue.

Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
(cherry picked from commit 85811692)
-
Committed by Zhenghua Lyu
When an update or delete statement errors out because the CTID does not belong to the local segment, we should also print the CTID of the tuple, so that it is much easier to locate the wrongly-distributed data via: `select * from t where gp_segment_id = xxx and ctid='(aaa,bbb)'`.
-
- 10 Aug 2020, 2 commits
-
-
Committed by Wen Lin
Add a bool flag 'delim_off' to CopyStateData to indicate whether the delimiter is set to OFF or not.
-
- 08 Aug 2020, 5 commits
-
-
Committed by Mel Kiyama
* docs - support proxies for GPDB interconnect
  - New topic in Admin Guide.
  - New GUC gp_interconnect_proxy_addresses
  - Updated GUC gp_interconnect_type - new value PROXY
  Also added a note to the gpexpand documents - do not use a proxy during expand.
* docs - review comment updates
* docs - review comment updates.
  - update IC proxy example function
  - other minor edits.
* docs - add note about running the ic-proxy configuration function.
-
Committed by Mel Kiyama
* docs - new GUC max_slot_wal_keep_size to control WAL log disk size. Per segment instance.
* docs - minor edit
-
Committed by Lisa Owen
-
Committed by Lisa Owen
-
Committed by Lisa Owen
-
- 07 Aug 2020, 4 commits
-
-
Committed by David Yozie
* Update libevent requirement for RHEL/CentOS 6
* Add new features introduced in 1.9.x
* Add 1.11.x features
* Add 1.12.x feature
* Add 1.13.x features
* Update version in doc notice
-
Committed by Paul Guo
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Hao Wu <gfphoenix78@gmail.com>
(cherry picked from commit af0fac18)
-
Committed by Hubert Zhang
* Coverity: Resource leak
  Fix some resource leaks. (cherry picked from commit e27ec070)
* Coverity: Identical code for different branches
  Clean up identical code in heap.c and analyze.c (cherry picked from commit f076205c)
* Coverity: Variable unused
  Remove unused variable. For tuplesort.h, we don't support mksort-based cluster, so we should just set is_mk_tuplesortstate to false (cherry picked from commit e5c86775)
* Coverity: Logically dead code
  Remove dead code. insertDesc is always NULL in ao_vacuum_rel_compact() (cherry picked from commit 38552bf7)
* Coverity: Sizeof not portable
  sizeof(HeapTuple *) should be sizeof(HeapTuple) (cherry picked from commit cf23db49)
* Coverity: Behave test misformat
  conn = dbconn.connect() should be aligned with the if statement, or it will never be executed. (cherry picked from commit 126ba1c6)
* Coverity: Check return value of strcmp
  The return value of strcmp is not checked in some branches. (cherry picked from commit fd498cf9)
-
Committed by Lisa Owen
-
- 06 Aug 2020, 3 commits
-
-
Committed by (Jerome)Junfeng Yang
(cherry picked from commit 9d801056)
-
Committed by Richard Guo
When transforming a DISTRIBUTED BY clause in utility mode, NULL would be returned, as only the QD can have policies. As a result, the child table in a partitioned table would have a NULL distributedBy, which would trigger an assertion failure or cause a crash with the current code. This patch fixes that by emitting an error message in that case to avoid the crash.

Reviewed-by: (Jerome)Junfeng Yang <jeyang@pivotal.io>
-
Committed by David Yozie
-
- 05 Aug 2020, 4 commits
-
-
Committed by xiong-gang
In commit 3ef5e267, we changed the value of some macros in guc.h; that would introduce an ABI change if 3rd-party libraries are using the macros.
-
Committed by xiong-gang
When the mirror is down and the master is reset, the 'dtx recovery' process will hang because the primary can't sync the WAL to the mirror. The FTS should be able to probe and 'sync off' the mirror.
-
Committed by David Yozie
-
Committed by Lisa Owen
* docs - note how OSS users get gpbackup
* small edit
-
- 04 Aug 2020, 3 commits
-
-
Committed by xiong-gang
In the test, checking 'sync_state' in pg_stat_replication is unnecessary and it makes the test flaky, so remove it.
-
Committed by Ning Yu
Fixed the bug that the SIGHUP handler was installed for SIGINT by mistake, so the ic-proxy bgworkers would die on SIGHUP. By correcting the signal name, the ic-proxy bgworkers can now reload postgresql.conf when executing "gpstop -u".

Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
(cherry picked from commit a181655b)
-
Committed by David Yozie
-
- 03 Aug 2020, 7 commits
-
-
Committed by David Yozie
This reverts commit bd5b35b1.
-
Committed by Gang Xiong
-
Committed by Gang Xiong
1. Change the GUC unit from MB to KB, as 6X doesn't have GUC_UNIT_MB.
2. The upstream commit added 3 fields to the system view 'pg_replication_slots'; this commit removes that change since we cannot make catalog changes on 6X.
3. Upstream uses 'slot->active_pid' to identify the process that acquired the replication slot; this commit adds 'walsnd' to 'ReplicationSlot' to do the same.
4. Upstream uses a condition variable to wait for the walsender to exit; this commit uses WalSndWaitStoppingOneWalSender, as we don't have condition variables on 6X.
5. Add test cases.
-
Committed by Alvaro Herrera
Replication slots are useful to retain data that may be needed by a replication system. But experience has shown that allowing them to retain excessive data can lead to the primary failing because of running out of space. This new feature allows the user to configure a maximum amount of space to be reserved using the new option max_slot_wal_keep_size. Slots that overrun that space are invalidated at checkpoint time, enabling the storage to be released.

Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20170228.122736.123383594.horiguchi.kyotaro@lab.ntt.co.jp
-
Committed by (Jerome)Junfeng Yang
In some cases, the merge stats logic for a root partition table may consume very high memory in CacheMemoryContext. This may lead to `Canceling query because of high VMEM usage` when concurrently ANALYZEing partition tables.

For example, there are several root partition tables and they each have thousands of leaf tables. And these tables are all wide tables that may contain hundreds of columns. So when analyze()/auto_stats() runs on leaf tables concurrently, `leaf_parts_analyzed` will consume lots of memory (catalog cache for pg_statistic and pg_attribute) under CacheMemoryContext for each backend, which may hit the protective VMEM limit. In `leaf_parts_analyzed`, a single backend's leaf table analysis for a root partition table may add cache entries of up to number_of_leaf_tables * number_of_columns tuples from pg_statistic and number_of_leaf_tables * number_of_columns tuples from pg_attribute. Setting the GUC `optimizer_analyze_root_partition` or `optimizer_analyze_enable_merge_of_leaf_stats` to false skips merging stats for the root table, and `leaf_parts_analyzed` will not execute.

To resolve this issue:
1. When checking whether merge stats are available for a root table in `leaf_parts_analyzed`, check whether all leaf tables are ANALYZEd first; if un-ANALYZEd leaf tables still exist, return quickly to avoid touching the columns' pg_attribute and pg_statistic per leaf table (this saves lots of time). Also, don't rely on the system catalog cache; use the index to fetch the stats tuple to avoid one-time cache usage (in common cases).
2. When merging stats in `merge_leaf_stats`, don't rely on the system catalog cache; use the index to fetch the stats tuple.

There are side-effects of not relying on the system catalog cache (which are all **rare** situations):
1. Insert/update/copy on several leaf tables under the **same root partition** table in the **same session**, when all leaf tables are **analyzed**, will be much slower, since auto_stats calls `leaf_parts_analyzed` once the leaf table gets updated, and we no longer rely on the system catalog cache. (`set optimizer_analyze_enable_merge_of_leaf_stats=false` avoids this.)
2. ANALYZEing the same root table several times in the same session is much slower than before, since we no longer rely on the system catalog cache.

This solution improves the performance of ANALYZE, and it also means ANALYZE no longer hits the memory issue.

(cherry picked from commit 533a47dd)
-
Committed by Ning Yu
In a query that contains multiple init/sub plans, the packets of the second subplan might be received while the first is still being processed in ic-proxy mode; this is because in ic-proxy mode a local host handshake is used instead of the global one. To distinguish the packets of different subplans, especially the early-arriving ones, we must stop handling on the BYE immediately, and pass any unhandled early-arriving packets to the successor or the placeholder.

This fixes the random hanging during the ICW parallel group of qp_functions_in_from. No new test is added.

Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
Co-authored-by: Ning Yu <nyu@pivotal.io>
(cherry picked from commit 79ff4e62)
-
- 01 Aug 2020, 1 commit
-
-
Committed by bhuvnesh chaudhary
The initialization file (passed as gpinitsystem -I <file>) can have two formats: legacy (5-field) and new (6-field, which has the HOST_ADDRESS). This commit fixes a bug in which an internal sorting routine that matches a primary with its corresponding mirror assumed that <file> was always in the new format. The fix is to convert any input <file> to the new format by re-writing the QD_ARRAY, PRIMARY_ARRAY and MIRROR_ARRAY to have 6 fields. We also always use '~' as the separator instead of ':' for consistency.

The bug fixed is that a 5-field <file> was being sorted numerically, causing either the hostname (on a multi-host cluster) or the port (on a single-host cluster) to be used for sorting instead of the content. This could result in a primary and its corresponding mirror being created on different contents, which fortunately hit an internal error check.

Unit tests and a behave test have been added as well. The behave test uses a demo cluster to validate that a legacy gpinitsystem initialization file format (i.e. one that has 5 fields) successfully creates a Greenplum database.

Co-authored-by: David Krieger <dkrieger@vmware.com>
-
- 31 Jul 2020, 4 commits
-
-
Committed by Ashwin Agrawal
Adding pg_stat_clear_snapshot() in functions looping over gp_stat_replication / pg_stat_replication to refresh the result every time the query is run as part of the same transaction. Without pg_stat_clear_snapshot(), the query result is not refreshed for pg_stat_activity nor for the xx_stat_replication functions on multiple invocations inside a transaction, so in its absence the tests become flaky.

Also, the tests commit_blocking_on_standby and dtx_recovery_wait_lsn were initially committed with wrong expectations, and hence were missing the intended behavior. Now they reflect the correct expectation.

(cherry picked from commit c565e988)
-
Committed by Ashwin Agrawal
This was missed in commit 96b332c0. (cherry picked from commit 8ef5d722)
-
Committed by Chris Hajas
Orca's DP algorithms currently generate logical alternatives based only on cardinality; they do not take into account motions/partition selectors, as these are physical properties handled later in the optimization process. Since DPv2 doesn't generate all possible alternatives for the optimization stage, we end up generating alternatives that do not support partition selection or can only place poor partition selectors.

This PR introduces partition knowledge into the DPv2 algorithm. If there is a possible partition selector, it will generate an alternative that considers it, in addition to the previous alternatives. We introduce a new property, m_contain_PS, to indicate whether an SExpressionInfo contains a PS for a particular expression. We consider an expression to have a possible partition selector if the join expression columns and the partitioned table's partition key overlap. If they do, we mark this expression as containing a PS for a particular PT.

We consider a good PS to be one which is selective. Eg:
```
- DTS
- PS
- TS
- Pred
```
would be selective. However, if there is no selective predicate, we do not consider this a promising PS.

For now, we add just a single alternative that satisfies this property and only consider linear trees.

This is a backport of 9c445321
-
Committed by Ashuka Xue
This commit only affects cardinality estimation in ORCA when the user sets `optimizer_damping_factor_join = 0`. It improves the square root algorithm first introduced by commit ce453cf2.

In the original square root algorithm, we assumed that distribution column predicates would have some correlation with other predicates in the join and would therefore be accordingly damped when calculating join cardinality. However, distribution columns are ideally unique in order to get the best performance from GPDB. Under this assumption, distribution columns should not be correlated and thus need to be treated as independent when calculating join cardinality. This is a best guess, since we do not have a way to support correlated columns at this time.

Co-authored-by: Ashuka Xue <axue@vmware.com>
Co-authored-by: Chris Hajas <chajas@vmware.com>
-