- 31 Jul 2017 (2 commits)
-
-
Committed by Zhenghua Lyu
1. Detect the cgroup mount point in test code.
2. Fix a bug when buflen is 0.
3. Check cgroup status on the master in gpconfig.
4. Fix Coverity warnings.
-
Committed by Zhenghua Lyu
When GPDB is running in a container, the swap and RAM values read via sysinfo are those of the host machine. To find the correct swap and RAM values in the container context, we take into account both the value from sysinfo and the value from cgroup.
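The "take both values into account" step can be sketched as follows — a minimal Python sketch with illustrative helper names (the actual GPDB implementation is in C, and the cgroup v1 memory file path is an assumption):

```python
import os

def effective_ram(host_total_bytes, cgroup_limit_bytes):
    """Pick the smaller of the host total and the cgroup limit.

    cgroup v1 reports "no limit" as a huge number, so any limit at or
    above the host total (or a non-positive one) means "unlimited".
    """
    if cgroup_limit_bytes <= 0 or cgroup_limit_bytes >= host_total_bytes:
        return host_total_bytes
    return cgroup_limit_bytes

def container_ram(cgroup_mnt="/sys/fs/cgroup/memory"):
    # Equivalent of sysinfo()'s totalram: the host's physical memory.
    host_total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    try:
        with open(os.path.join(cgroup_mnt, "memory.limit_in_bytes")) as f:
            limit = int(f.read().strip())
    except OSError:
        return host_total  # no cgroup info available; trust sysinfo
    return effective_ram(host_total, limit)
```

The same min-of-two-sources logic would apply to swap as well.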
-
- 25 Jul 2017 (1 commit)
-
-
Committed by Ning Yu
* Fix the resgroup assertion failure on the CREATE INDEX CONCURRENTLY syntax. When resgroup is enabled, an assertion failure is encountered with the case below:

SET gp_create_index_concurrently TO true;
DROP TABLE IF EXISTS concur_heap;
CREATE TABLE concur_heap (f1 text, f2 text, dk text) distributed by (dk);
CREATE INDEX CONCURRENTLY concur_index1 ON concur_heap(f2,f1);

The root cause is that we assumed on the QD that a command is dispatched to the QEs when it is assigned to a resgroup, but this is false for the CREATE INDEX CONCURRENTLY syntax. To fix it we have to perform the necessary checks and cleanup on the QEs.

* Do not assign a resource group in the SIGUSR1 handler. When assigning a resource group on the master, we might call WaitLatch() to wait for a free slot. However, as WaitLatch() expects to be woken by the SIGUSR1 signal, it will wait endlessly when SIGUSR1 is blocked. One scenario is the catch-up handler: it is triggered and executed directly in the SIGUSR1 handler, so during its execution SIGUSR1 is blocked. As the catch-up handler begins a transaction, it will try to assign a resource group and trigger the endless wait. To fix this we add a check to not assign a resource group when running inside the SIGUSR1 handler. Since signal handlers are supposed to be light, short and safe, skipping the resource group in such a case is reasonable.
-
- 24 Jul 2017 (2 commits)
-
-
Committed by xiong-gang
The issue of hanging on recv() in internal_cancel() has been reported several times: the socket status shows 'ESTABLISHED' on the master, while the peer process on the segment has already exited. We are not sure how exactly this happens, but we are able to reproduce the hang by dropping packets or rebooting the system on the segment. This patch uses poll() to do a non-blocking recv() in internal_cancel(); the timeout of poll() is set to the max value of authentication_timeout to make sure the process on the segment has already exited before attempting another retry; and we expect a retry on connect() to detect network issues. Signed-off-by: Ning Yu <nyu@pivotal.io>
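The wait-for-readability-before-recv() pattern can be sketched in Python (the real fix is in C with poll(); `select.select` plays the same role here, and the function name is illustrative):

```python
import select
import socket

def recv_with_timeout(sock, nbytes, timeout_secs):
    """Wait for readability before recv(), so a vanished peer
    (e.g. a rebooted segment with an ESTABLISHED-looking socket)
    cannot hang us forever on a blocking recv()."""
    readable, _, _ = select.select([sock], [], [], timeout_secs)
    if not readable:
        raise TimeoutError(
            "no data within %s seconds; peer may be gone" % timeout_secs)
    return sock.recv(nbytes)
```

On timeout the caller can tear down the connection and retry the connect(), which is where a genuine network failure would then surface.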
-
Committed by Zhenghua Lyu
In the past we used the hard-coded path "/sys/fs/cgroup" as the cgroup mount point; this can be wrong when 1) running on old kernels or 2) the customer has special cgroup mount points. Now we detect the mount point at runtime by checking /proc/self/mounts. Signed-off-by: Ning Yu <nyu@pivotal.io>
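The runtime detection amounts to scanning /proc/self/mounts for a `cgroup` filesystem whose mount options name the wanted subsystem. A minimal Python sketch (function name is illustrative; GPDB does this in C):

```python
def find_cgroup_mount(mounts_text, subsystem):
    """Scan /proc/self/mounts content for the cgroup v1 mount point of
    a subsystem. Each mounts line has the form:
    <device> <mountpoint> <fstype> <options> <dump> <pass>"""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _dev, mountpoint, fstype, options = fields[:4]
        if fstype == "cgroup" and subsystem in options.split(","):
            return mountpoint
    return None
```

In real use the text would come from `open("/proc/self/mounts").read()`; passing the text in makes the parsing logic testable in isolation.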
-
- 22 Jul 2017 (4 commits)
-
-
Committed by Asim R P
regress.c cannot include fmgroids.h because the header file is generated during the build process. The ICW jobs in CI check out the gpdb source code and run make from within src/test/regress, which fails to find fmgroids.h. It seems we need a dedicated contrib module for gp_inject_fault. This reverts commit bd26a268.
-
Committed by Asim R P
This fixes the ICW breakage caused by the "postgres" binary not being loadable as a shared library. To run the gp_fault_inject() function manually, generate regress.so by running make in src/test/regress. Thereafter, a CREATE FUNCTION command can be used to create the function, as in create_fault_function.source.
-
Committed by Asim R P
The function gp_inject_fault() was defined in a test-specific contrib module (src/test/dtm). All tests can now make use of it. Two pg_regress tests (dispatch and cursor) are modified to demonstrate the usage. The function is also made capable of injecting a fault in any segment, specified by dbid. No more invoking the gpfaultinjector python script from SQL files.
- 18 Jul 2017 (1 commit)
-
-
Committed by Ming LI
If two external tables refer to the same PIPE file using the gpfdist or file protocol directly, concurrent reads result in a wrong data format, or a hang for gpfdist. Now, before reading the pipe, we first flock() the pipe file (Windows is not supported yet); other requests from gpdb will report an error. Signed-off-by: Ming LI <mli@apache.org> Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
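The flock-then-fail-fast behaviour can be sketched in Python on a regular file (the commit applies this to a FIFO in C; `fcntl.flock` wraps the same flock() syscall, and the function name is illustrative):

```python
import fcntl

def try_lock_exclusive(path):
    """Try to take an exclusive, non-blocking flock() on the file.
    Returns the open file object on success (the lock is held for as
    long as it stays open), or None if another reader already holds
    the lock — the caller should then report an error rather than
    interleave reads with the first reader."""
    f = open(path, "rb")
    try:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f
```

Because flock() locks the open file description, a second independent open of the same path fails the non-blocking lock until the first holder closes its file.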
-
- 13 Jul 2017 (3 commits)
-
-
Committed by Daniel Gustafsson
This removes code which is either unreachable, due to prior identical tests that break the codepath, or dead, due to always being true. Asserting that an unsigned integer is >= 0 will always hold, so it's pointless. Per "logically dead code" gripes by Coverity.
-
Committed by Abhijit Subramanya
If we try to inject certain faults when the system is initialized with filerep disabled, we get the following error:
```
gpfaultinjector error: Injection Failed: Failure: could not insert fault injection, segment not in primary or mirror role
Failure: could not insert fault injection, segment not in primary or mirror role
```
This patch removes the role check for non-filerep faults so that they don't fail on a cluster initialized without filerep.
-
Committed by Asim R P
The GUC gp_changetracking_max_rows replaces a compile-time constant. The resync worker obtains at most gp_changetracking_max_rows changed blocks from the changetracking log at one time. Controlling this with a GUC makes it possible to exercise bugs in the resync logic around this area.
-
- 10 Jul 2017 (2 commits)
-
-
Committed by xiong-gang
CREATE RESOURCE GROUP rg1 WITH (concurrency=1, cpu_rate_limit=10, memory_limit=10);
CREATE ROLE r1 RESOURCE GROUP rg1;

session 1:
    SET ROLE r1;
    BEGIN;

session 2:
    BEGIN;  <--- hangs; then cancel it
    BEGIN;  <--- assertion failure

Signed-off-by: Ning Yu <nyu@pivotal.io>
-
Committed by Richard Guo
The memory usage statistic in resource groups is defined as an unsigned integer. For a subtraction 'a - b' on memory usage, the atomic subtraction function 'pg_atomic_sub_fetch_*' will return the value of 'a' before the subtraction. This value is then asserted to be no less than 'b'.
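The distinction the message hinges on — fetch-then-subtract returns the old value, which can be asserted against the operand, while subtract-then-fetch returns the new (possibly wrapped) value — can be sketched in Python. The class below is purely illustrative (a lock standing in for the hardware atomic; none of these names are GPDB's):

```python
import threading

class AtomicU32:
    """Illustrative stand-in for pg_atomic_uint32 semantics."""
    MASK = 0xFFFFFFFF  # unsigned 32-bit wrap-around

    def __init__(self, value=0):
        self._value = value & self.MASK
        self._lock = threading.Lock()

    def fetch_sub(self, delta):
        """Like pg_atomic_fetch_sub_u32: subtract, return the OLD value."""
        with self._lock:
            old = self._value
            self._value = (old - delta) & self.MASK
            return old

    def sub_fetch(self, delta):
        """Like pg_atomic_sub_fetch_u32: subtract, return the NEW value."""
        with self._lock:
            self._value = (self._value - delta) & self.MASK
            return self._value

def sub_memory_usage(counter, delta):
    # The old value lets us detect underflow before it wraps.
    old = counter.fetch_sub(delta)
    assert old >= delta, "memory accounting underflow"
    return old - delta
```

Asserting on the value returned by the fetch-variant is what makes an unsigned underflow detectable; checking the post-subtraction value of an unsigned counter cannot distinguish a wrap from a legitimate large value.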
-
- 07 Jul 2017 (1 commit)
-
-
Committed by Ning Yu
Change the initial contents of pg_resgroupcapability:
* Remove memory_redzone_limit;
* Add memory_shared_quota, memory_spill_ratio.

Change the resgroup concurrency range to [1, 'max_connections']:
* The original range was [0, 'max_connections'], and -1 meant unlimited.
* Now the range is [1, 'max_connections'], and -1 is not supported.

Change resgroup limit types from float to int. The following resgroup resource limit types changed from float to int percentage values:
* cpu_rate_limit;
* memory_limit;
* memory_shared_quota;
* memory_spill_ratio.
-
- 06 Jul 2017 (2 commits)
-
-
Committed by Daniel Gustafsson
Commit a8f956c6 removed the old SAN failover code but left the catalogs in place due to the catalog change freeze. This removes the no-longer-used catalogs and the relevant doc entries.
-
Committed by Daniel Gustafsson
This adds the ability for the caller of pg_terminate_backend() or pg_cancel_backend() to include an optional message to the process being signalled. The message will be appended to the error message returned to the killed process. The new overloaded syntax is:

SELECT pg_terminate_backend(<pid> [, msg]);
SELECT pg_cancel_backend(<pid> [, msg]);
-
- 29 Jun 2017 (3 commits)
-
-
Committed by Heikki Linnakangas
Instead of meticulously recording the OIDs of each object in the pg_dump output, dump and load all OIDs as a separate step in pg_upgrade. We now only preserve the OIDs of types, relations and schemas from the old cluster. Other objects are assigned new OIDs as part of the restore. To ensure the OIDs are consistent between the QD and QEs, we dump the (new) OIDs of all objects to a file after upgrading the QD node, and use those OIDs when restoring the QE nodes. We were already using a similar mechanism for new array types, but we now do that for all objects.
-
Committed by Daniel Gustafsson
The backport of the data checksum catalog changes brought in the relevant GUC from a version which defines struct config_bool differently than GPDB. The reason an extra NULL in the config_bool array initialization wasn't causing a compilation failure is that there is an extra bool member at the end which is only set at runtime, reset_val. The extra NULL was "overflowing" into this member and thus only raised a warning under -Wint-conversion:

guc.c:1180:15: warning: incompatible pointer to integer conversion initializing 'bool' (aka 'char') with an expression of type 'void *'

Fix by removing the superfluous NULL. Since it was setting reset_val to NULL (and for a GUC which is yet to "do something"), this should have no effect.
-
Committed by Ning Yu
Implement the resgroup memory limit. In a resgroup we divide the memory into several slots; the number depends on the concurrency setting of the resgroup. Each slot has a reserved quota of memory, and all the slots also share some shared memory which can be acquired preemptively. Some GUCs and resgroup options are defined to adjust the exact allocation policy:

resgroup options:
- memory_shared_quota
- memory_spill_ratio

GUCs:
- gp_resource_group_memory_limit

Signed-off-by: Ning Yu <nyu@pivotal.io>
-
- 28 Jun 2017 (2 commits)
-
-
Committed by Kenan Yao
If the QD receives a SIGINT and calls CHECK_FOR_INTERRUPTS after finishing Gang creation, but before recording this Gang in global variables like primaryWriterGang, the Gang is not destroyed; hence the next time the QD wants to create a new writer Gang, it finds an existing writer Gang on the segments and reports a snapshot collision error.
-
Committed by Asim R P
This patch pulls in the addition of checksum version information to pg_control and a GUC to report the checksum version. The heap data checksum feature will be pulled in its entirety in subsequent patches. Upstream commits that this patch pulls from:

commit 96ef3b8f
Author: Simon Riggs <simon@2ndQuadrant.com>
Date: Fri Mar 22 13:54:07 2013 +0000
    Allow I/O reliability checks using 16-bit checksums

commit 44395174
Author: Simon Riggs <simon@2ndQuadrant.com>
Date: Tue Apr 30 12:27:12 2013 +0100
    Record data_checksum_version in control file.

commit 5a7e75849cb595943fc605c4532716e9dd69f8a0
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon Sep 16 14:36:01 2013 +0300
    Add a GUC to report whether data page checksums are enabled.
-
- 27 Jun 2017 (1 commit)
-
-
Committed by Ning Yu
Support the ALTER RESOURCE GROUP SET CPU_RATE_LIMIT syntax. The new cpu rate limit takes effect immediately at the end of the transaction.

Example 1:
CREATE RESOURCE GROUP g1 WITH (cpu_rate_limit=0.1, memory_limit=0.1);
ALTER RESOURCE GROUP g1 SET CPU_RATE_LIMIT 0.2;
The new cpu rate limit takes effect immediately.

Example 2:
BEGIN;
ALTER RESOURCE GROUP g1 SET CPU_RATE_LIMIT 0.2;
The new cpu rate limit doesn't take effect unless the transaction is committed.

Signed-off-by: Richard Guo <riguo@pivotal.io> Signed-off-by: Gang Xiong <gxiong@pivotal.io>
-
- 24 Jun 2017 (1 commit)
-
-
Committed by Ashwin Agrawal
In case of --enable-segwalrep, write-ahead logging should not be skipped for anything, as the mirror relies on that mechanism to reconstruct state. Write-ahead logging for these pieces was previously performed only for the master; with this commit it is enabled for segments as well.
-
- 22 Jun 2017 (3 commits)
-
-
Committed by Richard Guo
A dedicated list is maintained for resource group related callbacks. At transaction end, the callback functions are processed in FIFO order on COMMIT, and in LIFO order on ABORT. Signed-off-by: Pengzhou Tang <ptang@pivotal.io>
-
Committed by foyzur
In GPDB the dispatcher dispatches the entire plan tree to each query executor (QX). Each QX deserializes the entire plan tree and starts execution from the root of the plan tree. This begins by calling InitPlan on the QueryDesc, which blindly calls ExecInitNode on the root of the plan. Unfortunately, this is wasteful in terms of memory and CPU: each QX is in charge of a single slice, and there can be many slices. Looking into plan nodes that belong to other slices, and initializing them (e.g., creating PlanState for such nodes), is clearly wasteful. For large plans, particularly planner plans in the presence of partitions, this can add up to significant waste.

This PR proposes a fix: find the local root for each slice and start ExecInitNode there. There are a few special cases:

SubPlans are special, as they appear as expressions, but the expression holds the root of the sub-plan tree. All the subplans are bundled in plannedstmt->subplans, but confusingly as Plan pointers (i.e., we save the root of the SubPlan expression's Plan tree). Therefore, to find the relevant subplans, we need to first find the relevant expressions and extract their roots, and then iterate plannedstmt->subplans, but only call ExecInitNode on the ones that we can reach from some expression in the current slice.

InitPlans are no better, as they can appear anywhere in the Plan tree. Walking from a local motion is not sufficient to find them; we need to walk from the root of the plan tree and identify all the SubPlans. Note: unlike a regular subplan, an initplan may not appear in an expression as a subplan; rather it will appear as a parameter generator in some other part of the tree. We need to find these InitPlans and obtain the SubPlan for each of them. We can then use the SubPlan's setParam to copy precomputed parameter values from estate->es_param_list_info to estate->es_param_exec_vals.

We also found that origSliceIdInPlan is highly unreliable and cannot be used as an indicator of a plan node's slice information. Therefore, we precompute each plan node's slice information to correctly determine whether a Plan node is alien or not. This makes alien node identification more accurate. In successive PRs, we plan to use the alien memory account balance as a test of whether we have successfully eliminated all aliens. We will also use the alien account balance to determine memory savings.
-
Committed by foyzur
Detect dead parent accounts and replace them with the Rollover account during the memory accounting array-to-tree conversion. Also add a unit test to check that children of dead parents are serialized as children of the Rollover account.
-
- 21 Jun 2017 (1 commit)
-
-
Committed by Asim R P
ExclusiveLock should be acquired in place of RowExclusiveLock for DML on user tables. If RowExclusiveLock is acquired, we may have a local deadlock on the QD when concurrent UPDATE statements are executed from within UDFs. The problem is described in more detail here: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/PC0Z_zw-YaY/tsCmPdADBQAJ
-
- 20 Jun 2017 (2 commits)
-
-
Committed by Abhijit Subramanya
The test used to validate that the tmlock is not held after completing DTM recovery. The root cause for not releasing the lock was that, in case of an error during recovery, `elog_demote(WARNING)` was called, which demoted the error to a warning. This caused the abort processing code not to run, and hence the lock was not released. Adding a simple assert in the code once DTM recovery is complete is sufficient to make sure that the lock is released.
-
Committed by Asim R P
Otherwise there is a possibility of distributed deadlock. One such deadlock is caused by an ENTRY_DB_SINGLETON reader entering LockAcquire when the QD writer of the same MPP session already holds the lock. A backend from another MPP session is already waiting on the lock with a lockmode that conflicts with the reader's requested lockmode. This results in a waitMask conflict and the reader is enqueued in the wait queue. But the QD writer is never going to release the lock because it is waiting for tuples from the segments (QE writers/readers). And the QE writers/readers are in turn waiting for the ENTRY_DB_SINGLETON reader, completing the cycle necessary for deadlock. The fix is to avoid checking waitMask conflicts for a reader if the writer of the same MPP session already holds the lock. In such a case the reader is granted the lock as long as it does not conflict with existing holders of the lock. Two isolation2 tests are added. One simulates the above-mentioned deadlock and fails if it occurs. The other ensures that granting locks to readers without checking waitMask conflicts does not starve existing waiters. cf. https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/OS1-ODIK0P4/ZIzayBbMBwAJ

Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
- 19 Jun 2017 (2 commits)
-
-
Committed by Pengzhou Tang
The "%m%s" format style differs from upstream and broke the Coverity scan; keeping it did not add much value, so remove it.
-
Committed by Kenan Yao
-
- 17 Jun 2017 (1 commit)
-
-
This brought in postgres/postgres@44d5be0 pretty much wholesale, except:

1. We leave `WITH RECURSIVE` for a later commit. The code is brought in, but kept dormant by bailing out early in the parser whenever there is a recursive CTE.
2. We use `ShareInputScan` in the stead of `CteScan`. ShareInputScan is basically the parallel-capable `CteScan`. (See `set_cte_pathlist` and `create_ctescan_plan`.)
3. Consequently we do not put the sub-plan for the CTE in a pseudo-initplan: it is directly present in the main plan tree instead, hence we disable `SS_process_ctes` inside `subquery_planner`.
4. Another corollary is that all the new operators (`CteScan`, `RecursiveUnion`, and `WorkTableScan`) are dead code right now. But they will come to life once we bring in the parallel implementation of `WITH RECURSIVE`.

In general this commit reduces the divergence between Greenplum and upstream.

User-visible changes: the merge in the parser enables a corner case previously treated as an error: you can now specify fewer columns in your `WITH` clause than the actual projected columns in the body subquery of the `WITH`.

Original commit message:

> Implement SQL-standard WITH clauses, including WITH RECURSIVE.
>
> There are some unimplemented aspects: recursive queries must use UNION ALL
> (should allow UNION too), and we don't have SEARCH or CYCLE clauses.
> These might or might not get done for 8.4, but even without them it's a
> pretty useful feature.
>
> There are also a couple of small loose ends and definitional quibbles,
> which I'll send a memo about to pgsql-hackers shortly. But let's land
> the patch now so we can get on with other development.
>
> Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane

(cherry picked from commit 44d5be0e)
-
- 09 Jun 2017 (1 commit)
-
- 08 Jun 2017 (1 commit)
-
-
Committed by Asim R P
The original deadlock is caused by a reader waiting on a lock which is already held by the writer of the same MPP session, while another session is waiting for a conflicting mode on the same lock. The fix is to avoid checking waitMask conflicts for a reader (i.e. when MyProc is different from lockHolderProcPtr). A detailed discussion of the deadlock issue is at: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/OS1-ODIK0P4/ZIzayBbMBwAJ

Two isolation2 tests are added: one to validate that the deadlock does not occur, and another to ensure that granting locks to readers does not starve existing waiters. Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
- 07 Jun 2017 (3 commits)
-
-
Committed by Pengzhou Tang
This commit restores the TCP interconnect and fixes some hang issues:
* Restore the TCP interconnect code.
* Add a GUC called gp_interconnect_tcp_listener_backlog to control the backlog parameter of the listen() call.
* Use memmove() instead of memcpy() because the memory areas do overlap.
* Call checkForCancelFromQD() for the TCP interconnect if there is no data for a while; this prevents the QD from getting stuck.
* Revert the cancelUnfinished-related modification in 8d251945; otherwise some queries get stuck.
* Move and rename the fault injector "cursor_qe_reader_after_snapshot" to make test cases pass under the TCP interconnect.
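The memmove-vs-memcpy point is worth a concrete illustration: when source and destination ranges overlap, memcpy() is undefined behaviour in C while memmove() copies as if through a temporary buffer. A small Python sketch via `ctypes.memmove` (the function name and offsets are illustrative, not GPDB code):

```python
import ctypes

def shift_bytes_overlapping(buf, src_off, dst_off, length):
    """Move `length` bytes within the same bytearray, where the source
    and destination ranges may overlap — exactly the case where C code
    must use memmove() rather than memcpy()."""
    base = (ctypes.c_char * len(buf)).from_buffer(buf)
    ctypes.memmove(
        ctypes.byref(base, dst_off),   # destination
        ctypes.byref(base, src_off),   # source (may overlap destination)
        length,
    )
    return buf
```

Compacting the unread tail of a receive buffer down to its start, as an interconnect loop might, is precisely this overlapping-shift pattern.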
-
Committed by Pengzhou Tang
* Change the default level of gp_log_gang to off.
* Log the query plan size at level TERSE; it's useful for debugging.
-
Committed by Melanie Plageman
- Remove iteration-specific members of the qexec packet
- Remove the iterators_history table
- Remove the measures used to populate iterators_history
- Remove the iterator_aggregate flag

Signed-off-by: Nadeem Ghani <nghani@pivotal.io>
Signed-off-by: Melanie Plageman <mplageman@pivotal.io>
-
- 05 Jun 2017 (1 commit)
-
-
Committed by Richard Guo
Record memory usage for resource groups:
1. Update the total memory usage of a resource group when a session belonging to the group allocates/frees memory.
2. Update the total memory usage of the related resource groups when a session enters or leaves a resource group.
3. Dispatch the current resource group ID from the QD to the QEs to keep track of the current resource group.
4. Show the total memory usage of a resource group.
5. Add a test case for memory usage recording of resource groups.

Signed-off-by: xiong-gang <gxiong@pivotal.io>
Signed-off-by: Kenan Yao <kyao@pivotal.io>
Signed-off-by: Ning Yu <nyu@pivotal.io>
-