- 19 May 2017, 1 commit
-
-
Committed by Pengzhou Tang
Resource group cpu rate limitation is implemented with cgroups on Linux systems. When resource groups are enabled via GUC, we check whether cgroups are available and properly configured on the system. A sub cgroup is created for each resource group; the cpu quota and share weight are set depending on the resource group configuration. Queries run under these cgroups, and their cpu usage is restricted by the cgroup.

The cgroup directory structure:

* /sys/fs/cgroup/{cpu,cpuacct}/gpdb: the toplevel gpdb cgroup
* /sys/fs/cgroup/{cpu,cpuacct}/gpdb/*/: a cgroup for each resource group

The logic for cpu rate limitation:

* In the toplevel gpdb cgroup we set the cpu quota and share weight as:

  cpu.cfs_quota_us := cpu.cfs_period_us * 256 * gp_resource_group_cpu_limit
  cpu.shares := 1024 * ncores

* For each sub group we set the cpu quota and share weight as:

  sub.cpu.cfs_quota_us := -1
  sub.cpu.shares := top.cpu.shares * sub.cpu_rate_limit

The minimum and maximum cpu percentage for a sub cgroup:

  sub.cpu.min_percentage := gp_resource_group_cpu_limit * sub.cpu_rate_limit
  sub.cpu.max_percentage := gp_resource_group_cpu_limit

The actual percentage depends on how busy the system is.

gp_resource_group_cpu_limit is a GUC introduced to control the cpu percentage assigned to resource groups on each host:

  gpconfig -c gp_resource_group_cpu_limit -v '0.9'

A new pipeline is created to perform the tests, as we need privileged permissions to enable and set up cgroups on the system.

Signed-off-by: Ning Yu <nyu@pivotal.io>
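Below is a minimal Python sketch of this arithmetic, assuming only the constants and formulas quoted above; the function names are hypothetical, not the actual Greenplum implementation:

```
# Sketch of the cpu quota/share math described in the commit message.

def toplevel_settings(cfs_period_us, ncores, gp_resource_group_cpu_limit):
    """Settings for the toplevel /sys/fs/cgroup/cpu/gpdb cgroup."""
    return {
        "cpu.cfs_quota_us": int(cfs_period_us * 256 * gp_resource_group_cpu_limit),
        "cpu.shares": 1024 * ncores,
    }

def subgroup_settings(top_shares, cpu_rate_limit):
    """Settings for one per-resource-group sub cgroup."""
    return {
        "cpu.cfs_quota_us": -1,  # no hard quota on the sub group
        "cpu.shares": int(top_shares * cpu_rate_limit),
    }

def subgroup_cpu_range(gp_resource_group_cpu_limit, cpu_rate_limit):
    """(min, max) cpu percentage for a sub cgroup; the actual share
    depends on how busy the system is."""
    return (gp_resource_group_cpu_limit * cpu_rate_limit,
            gp_resource_group_cpu_limit)

top = toplevel_settings(cfs_period_us=100000, ncores=8,
                        gp_resource_group_cpu_limit=0.9)
sub = subgroup_settings(top["cpu.shares"], cpu_rate_limit=0.2)
print(top, sub, subgroup_cpu_range(0.9, 0.2))
```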
-
- 11 May 2017, 4 commits
-
-
Vacuum of AO tables happens in four phases: prepare, compact, drop and cleanup. When concurrent transactions vacuum AO tables and one of them is executing the drop phase, another transaction that also reaches the drop phase will not be able to open the relation. When the relation open fails, vacuum simply exits without performing the cleanup phase. The relfrozenxid in pg_class is only updated during the cleanup phase, so the age of the table never gets reduced. This is a problem because the age of the database then never reduces either, and eventually the xid wraparound issue is encountered. This is fixed by performing the cleanup phase even if opening the relation fails during the drop phase.
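A minimal Python sketch of the fixed control flow, assuming only the phase ordering described above; all names are hypothetical stand-ins for the actual C code:

```
# The point of the fix: cleanup runs even when the drop phase
# cannot open the relation.

class RelationOpenError(Exception):
    pass

def prepare(rel): print("prepare", rel)
def compact(rel): print("compact", rel)

def drop(rel, blocked=False):
    if blocked:  # a concurrent vacuum is already in its drop phase
        raise RelationOpenError(rel)
    print("drop", rel)

def cleanup(rel):
    # Updates relfrozenxid in pg_class, so the table (and hence the
    # database) age can actually go down.
    print("cleanup", rel)

def vacuum_ao(rel, drop_blocked=False):
    prepare(rel)
    compact(rel)
    try:
        drop(rel, blocked=drop_blocked)
    except RelationOpenError:
        pass  # before the fix, vacuum exited here and skipped cleanup
    cleanup(rel)

vacuum_ao("my_ao_table", drop_blocked=True)
```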
-
Committed by Jimmy Yih
Due to the different phases of VACUUM, the compaction phase could have committed while the drop was cancelled. This can leave some aoseg/aocsseg entries in the AOSEG_STATE_AWAITING_DROP state, which should then be cleaned up by a subsequent VACUUM as expected... but currently VACUUM just skips them. The master keeps a hash table (AppendOnlyHash) and uses it during VACUUM to check tuple count, state, and aborted/committed status. The issue here is that the state check only matched AWAITING_DROP_READY and did not include COMPACTED_AWAITING_DROP. This commit adds that check and updates the detailed comment block on the state transitions in appendonlywriter.h.
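Conceptually, the widened state check looks like this (the state names are from the commit message; the surrounding Python is a hypothetical sketch, not appendonlywriter.c):

```
AOSEG_STATE_AWAITING_DROP_READY = "AWAITING_DROP_READY"
AOSEG_STATE_COMPACTED_AWAITING_DROP = "COMPACTED_AWAITING_DROP"

def seg_awaiting_drop(state):
    # Before the fix only AWAITING_DROP_READY was matched, so segfiles
    # left in COMPACTED_AWAITING_DROP by a cancelled drop were skipped
    # by every subsequent VACUUM.
    return state in (AOSEG_STATE_AWAITING_DROP_READY,
                     AOSEG_STATE_COMPACTED_AWAITING_DROP)

print(seg_awaiting_drop(AOSEG_STATE_COMPACTED_AWAITING_DROP))  # True after the fix
```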
-
Committed by Jimmy Yih
When doing an ADD COLUMN operation on an AOCO table, the vpinfo byte array stored in the aocsseg auxiliary table for each column is updated to add an extra vpinfo index, to match the relnatts value in pg_class. However, entries in the aocsseg auxiliary table with state 2 (AOSEG_STATE_AWAITING_DROP) did not have their vpinfo byte arrays updated. Subsequent calls to the getAOCSVPEntry function would then fail because the vpinfo size does not match the relnatts value in pg_class (e.g. during a persistent table rebuild). Since we already hold an AccessExclusiveLock on the table during ADD COLUMN, we might as well schedule the drop before the vpinfos are updated, to prevent any such inconsistency.
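A hypothetical Python sketch of the failure mode (getAOCSVPEntry and the state names are from the commit message; everything else is illustrative):

```
# After ADD COLUMN, live aocsseg entries get one more vpinfo slot to
# match pg_class.relnatts, but state-2 entries were left untouched.

def get_aocs_vp_entry(vpinfo, column_no, relnatts):
    if len(vpinfo) != relnatts:
        raise ValueError("vpinfo size %d does not match relnatts %d"
                         % (len(vpinfo), relnatts))
    return vpinfo[column_no]

stale_vpinfo = [b"eof0", b"eof1"]   # AWAITING_DROP entry, never extended
try:
    get_aocs_vp_entry(stale_vpinfo, 2, relnatts=3)   # after ADD COLUMN
except ValueError as e:
    print(e)   # what e.g. a persistent table rebuild would hit
```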
-
Committed by Jimmy Yih
We expect lazy VACUUM to skip the drop phase if it cannot acquire the relation's AccessExclusiveLock. This commit properly sets the dontWait flag to true when calling try_relation_open during the drop phase. We also change the dontWait path of try_relation_open to not error out, because the callers should be the ones handling an invalid relation.
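A sketch of the intended behavior in illustrative Python, not the actual C code (try_relation_open and dontWait are from the commit message; the rest is hypothetical):

```
_locks = set()

def try_lock(rel, dont_wait):
    if rel in _locks:
        if dont_wait:
            return False     # give up immediately instead of queueing
        raise RuntimeError("would block")
    _locks.add(rel)
    return True

def try_relation_open(rel, dont_wait):
    # On failure, return None instead of erroring out: the caller is
    # the one who knows whether a failed open is fatal.
    return rel if try_lock(rel, dont_wait) else None

def vacuum_drop_phase(rel):
    if try_relation_open(rel, dont_wait=True) is None:
        return "skipped drop phase"   # lazy VACUUM just moves on
    return "dropped"

_locks.add("ao_table")                # someone else holds the lock
print(vacuum_drop_phase("ao_table"))  # -> skipped drop phase
```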
-
- 10 May 2017, 1 commit
-
-
Committed by xiong-gang
Previously, we removed the shared memory object when dropping a resource group, and restored it if the transaction aborted. Concurrent access to the shared memory object could fail under that scheme. Now, when a resource group is dropped, new transactions in this resource group are queued until the drop transaction finishes. Signed-off-by: Richard Guo <riguo@pivotal.io>
-
- 09 May 2017, 1 commit
-
-
Committed by Kenan Yao
We were not storing sourceTag in CachedPlanSource, so when a cached plan was used later that field was broken, which in turn broke, for example, the resource queue concurrency limit functionality. Reported in issue #2308 when using JDBC/ODBC.
-
- 03 May 2017, 1 commit
-
-
Committed by xiong-gang
Increasing the 'concurrency' limit can take effect immediately, and queued transactions can be woken up. Decreasing the 'concurrency' limit is different: if the new limit is smaller than the number of currently running transactions, the ALTER statement will not cancel running transactions to enforce the limit. Therefore, we use the column 'proposed' in pg_resgroupcapability to represent the effective limit, and the column 'value' to record the historical limit. For example, suppose we have a resource group with concurrency=3, 3 running transactions and 3 queued transactions. If we alter the concurrency to 2, 'proposed' is updated to 2 while 'value' stays at 3. When one running transaction finishes, it will not wake up any transaction in the queue, since the current concurrency is already 2. If we then execute the statement again to alter the concurrency to 2, the 'value' column is updated to 2 and becomes consistent with 'proposed' again. Signed-off-by: Richard Guo <riguo@pivotal.io>
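The bookkeeping can be modeled with a small Python sketch (the column names 'value' and 'proposed' are from the commit message; the class itself is a hypothetical toy model):

```
class ConcurrencyCap:
    def __init__(self, limit):
        self.value = limit       # historical limit
        self.proposed = limit    # effective limit
        self.running = 0

    def alter(self, new_limit):
        self.proposed = new_limit
        if self.running <= new_limit:
            self.value = new_limit   # takes effect (and is recorded) now

    def transaction_finished(self):
        self.running -= 1
        # Wake a queued transaction only if we are under the proposed limit.
        return self.running < self.proposed

cap = ConcurrencyCap(3)
cap.running = 3
cap.alter(2)                        # proposed=2, value stays 3
print(cap.value, cap.proposed)      # 3 2
print(cap.transaction_finished())   # False: running is now 2 == proposed
cap.alter(2)                        # running(2) <= 2, value catches up
print(cap.value, cap.proposed)      # 2 2
```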
-
- 21 April 2017, 1 commit
-
-
Committed by xiong-gang
1. Change the format of 'total_queue_duration' in 'gp_toolkit.gp_resgroup_status' to keep it consistent with 'rsgqueueduration' in 'pg_stat_activity'.
2. Show resource group information for running queries in 'pg_stat_activity'.

Signed-off-by: Richard Guo <riguo@pivotal.io>
-
- 18 April 2017, 1 commit
-
-
Committed by xiong-gang
1. Support num_running, num_queueing, num_queued, num_executed and total_queue_duration in gp_toolkit.gp_resgroup_status.
2. Reflect the status in pg_stat_activity when a transaction is queued in a resource group.

Signed-off-by: Richard Guo <riguo@pivotal.io>
Signed-off-by: Kenan Yao <kyao@pivotal.io>
-
- 05 April 2017, 1 commit
-
-
Committed by Jimmy Yih
These tests assumed OID == relfilenode; we updated them to no longer assume that.
-
- 13 March 2017, 1 commit
-
-
Committed by Heikki Linnakangas
I modified the test slightly to eliminate the sleep, making it run faster.
-
- 07 March 2017, 2 commits
-
-
-
First, it supports shell scripts and commands. The new syntax is:

```
! <some script>
```

This is required to run things like gpfaultinjector, gpstart, gpstop, etc.

Second, it supports blocking commands and joins in utility mode. The new syntax is:

```
2U&: <sql>
2U<:
```

The above example means:
- block in utility mode on dbid 2
- join back the previous session in utility mode on dbid 2

Fix the exception handling to allow the test to complete, and log the failed commands instead of aborting the test. This makes sure all the cleanup steps are executed and do not block the following tests. This also includes an init_file for diff comparison, to ignore the timestamp output from gpfaultinjector.
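For illustration, the new session prefixes could be parsed along these lines; this is a hypothetical sketch in the suite's own Python, not the actual framework code:

```
import re

# "2U&:" -> blocking command in utility mode on dbid 2,
# "2U<:" -> join back the previous utility-mode session on dbid 2,
# "!"    -> run a shell script/command such as gpfaultinjector.
SPEC = re.compile(r"^(?P<dbid>\d*)(?P<util>U?)(?P<op>[&<]?):")

def parse_line(line):
    if line.startswith("!"):
        return {"kind": "shell", "command": line[1:].strip()}
    m = SPEC.match(line)
    if not m:
        raise ValueError("not an isolation2 directive: %r" % line)
    return {
        "kind": {"&": "block", "<": "join", "": "run"}[m.group("op")],
        "utility": m.group("util") == "U",
        "dbid": int(m.group("dbid")) if m.group("dbid") else None,
        "sql": line[m.end():].strip(),
    }

print(parse_line("! gpstop -ar"))
print(parse_line("2U&: SELECT pg_sleep(3600);"))
print(parse_line("2U<:"))
```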
-
- 03 March 2017, 1 commit
-
-
Committed by Ashwin Agrawal
-
- 23 February 2017, 1 commit
-
-
Committed by Heikki Linnakangas
These are the same test queries for column-oriented append-only tables as those moved by commit 11a5a807 for row-oriented append-only tables. There were two additional tests that were never executed for row-oriented tables, though: phantom_reads_update_serializable and phantom_reads_delete_serializable. I believe that was an oversight in the original test suite; they are now also executed for row-oriented tables. We use the UAO templating mechanism to run the same test files against both row- and column-oriented tables. To make that work, fix a bug in the templating mechanism in pg_regress.c: if the --ao-dir argument was shorter than 7 characters, the uao directory was not detected correctly.
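A hypothetical Python illustration of this kind of prefix-length bug (the actual fix is in the C code of pg_regress.c; the 7-character constant is the only detail taken from the commit message):

```
AO_DIR = "uao"   # an --ao-dir value shorter than 7 characters

def is_uao_test_buggy(path):
    # Comparing a fixed 7-character slice against a shorter prefix
    # can never match, so the uao directory is never detected.
    return path[:7] == AO_DIR + "/"

def is_uao_test_fixed(path):
    prefix = AO_DIR + "/"
    return path[:len(prefix)] == prefix

print(is_uao_test_buggy("uao/phantom_reads.sql"))  # False: directory missed
print(is_uao_test_fixed("uao/phantom_reads.sql"))  # True
```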
-
- 19 February 2017, 3 commits
-
-
Committed by Heikki Linnakangas
ORCA acquires different locks. Same problem that was fixed in the main test suite by commit b3e13f6b.
-
Committed by Heikki Linnakangas
Change the queries that check tuple counts on a particular segment in utility mode to print a crude classification (0, 1, <5, or more tuples) instead of the exact tuple counts. That is less sensitive to how the tuples are distributed across segments. The locks_reindex test is moved to the regular regression suite. I rewrote it to use a more advanced "locktest" view, copied from the partition_locking test, which doesn't rely on utility mode. These changes should make the tests work with even larger clusters, but I've only tested with 1, 2, and 3 nodes.
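The bucketing could look like this in an illustrative Python sketch (the actual tests do the equivalent in SQL):

```
def classify_tuple_count(n):
    # Crude buckets: exact counts are too sensitive to how the rows
    # happen to be distributed across segments.
    if n == 0:
        return "0"
    if n == 1:
        return "1"
    if n < 5:
        return "<5"
    return ">=5"

assert [classify_tuple_count(n) for n in (0, 1, 3, 42)] == ["0", "1", "<5", ">=5"]
```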
-
Committed by Heikki Linnakangas
This new "isolation2" suite uses the same Python helper that TINC used to run these special isolation test cases.
-