Commits · 2650f728ee9338158098180cb26b7e31406101a2 · Greenplum / Gpdb

19 May, 2017 1 commit

Implement resource group cpu rate limitation. · 2650f728

Committed by Pengzhou Tang on May 01, 2017

Resource group cpu rate limitation is implemented with cgroup on Linux
systems. When resource group is enabled via GUC, we check whether cgroup
is available and properly configured on the system. A sub cgroup is
created for each resource group, and its cpu quota and share weight are set
according to the resource group configuration. The queries run under
these cgroups, and their cpu usage is restricted by cgroup.

The cgroups directory structures:
* /sys/fs/cgroup/{cpu,cpuacct}/gpdb: the toplevel gpdb cgroup
* /sys/fs/cgroup/{cpu,cpuacct}/gpdb/*/: cgroup for each resource group

The logic for cpu rate limitation:

* in toplevel gpdb cgroup we set the cpu quota and share weight as:

    cpu.cfs_quota_us := cpu.cfs_period_us * ncores * gp_resource_group_cpu_limit
    cpu.shares := 1024 * ncores

* for each sub group we set the cpu quota and share weight as:

    sub.cpu.cfs_quota_us := -1
    sub.cpu.shares := top.cpu.shares * sub.cpu_rate_limit

The minimum and maximum cpu percentage for a sub cgroup:

    sub.cpu.min_percentage := gp_resource_group_cpu_limit * sub.cpu_rate_limit
    sub.cpu.max_percentage := gp_resource_group_cpu_limit

The actual percentage depends on how busy the system is.
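
As a rough illustration of the share and percentage formulas above, here is a minimal Python sketch. The function name `cgroup_cpu_plan` and the dictionary layout are hypothetical and not part of gpdb; the sketch only evaluates the arithmetic and does not touch /sys/fs/cgroup. Truncating shares to an integer is also an assumption.

```python
def cgroup_cpu_plan(ncores, gp_resource_group_cpu_limit, cpu_rate_limits):
    """Evaluate the share/percentage formulas for a set of resource groups.

    cpu_rate_limits maps a (hypothetical) group name to its cpu_rate_limit,
    expressed as a fraction between 0 and 1.
    """
    top_shares = 1024 * ncores  # cpu.shares of the toplevel gpdb cgroup
    plan = {}
    for name, rate in cpu_rate_limits.items():
        plan[name] = {
            "cpu.cfs_quota_us": -1,  # sub groups have no hard quota of their own
            # Assumption: fractional shares are truncated to an integer.
            "cpu.shares": int(top_shares * rate),
            "min_percentage": gp_resource_group_cpu_limit * rate,
            "max_percentage": gp_resource_group_cpu_limit,
        }
    return top_shares, plan


top_shares, plan = cgroup_cpu_plan(4, 0.9, {"rg_a": 0.25, "rg_b": 0.5})
```

With 4 cores, a limit of 0.9 and a group rate of 0.25, the group gets 1024 shares, a guaranteed minimum of 22.5% and a ceiling of 90% of the host's cpu.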

gp_resource_group_cpu_limit is a GUC introduced to control the fraction of
cpu that resource groups may use on each host.

    gpconfig -c gp_resource_group_cpu_limit -v '0.9'

A new pipeline is created to perform the tests, as we need privileged
permission to enable and set up cgroups on the system.

Signed-off-by: Ning Yu <nyu@pivotal.io>

2650f728

11 May, 2017 4 commits

Perform cleanup phase even if open relation failed during drop phase of vacuum. · b3abb1b7

Committed by Ashwin Agrawal, Abhijit Subramanya and Xin Zhang on May 08, 2017

Vacuum of AO tables happens in four phases: prepare, compact, drop and cleanup.
When there are concurrent transactions executing vacuum on AO tables, if one of
the vacuum transactions is executing the drop phase, the other transaction will
not be able to open the relation if it is also executing the drop phase. When
the relation open fails, vacuum simply exits without performing the cleanup
phase. The relfrozenxid in pg_class is updated during the cleanup phase and
hence the age of the table does not get reduced. This becomes a problem because
the age of the database does not reduce and eventually the xid wraparound issue
will be encountered.

This problem is fixed by performing the cleanup phase even if the open relation
fails during drop phase.
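
The control flow described above can be sketched roughly as follows. This is an illustrative Python model, not the gpdb C code: `RelationBusy`, `vacuum_ao_table` and the `open_for_drop` callback are hypothetical names standing in for the failed relation open.

```python
class RelationBusy(Exception):
    """Hypothetical stand-in for a relation open that fails in the drop phase."""


def vacuum_ao_table(open_for_drop):
    """Run the four phases, returning the list of steps actually performed."""
    steps = ["prepare", "compact"]
    try:
        open_for_drop()  # may fail if another vacuum is in its drop phase
        steps.append("drop")
    except RelationBusy:
        steps.append("drop skipped")
    # Cleanup runs regardless, so relfrozenxid in pg_class is still updated.
    steps.append("cleanup")
    return steps
```

The point of the sketch is only that a failed open no longer short-circuits the cleanup step.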

b3abb1b7

VACUUM should cleanup aosegs/aocssegs in AOSEG_STATE_AWAITING_DROP state · cdc8172a

Committed by Jimmy Yih on May 08, 2017

Due to the different phases of VACUUM, the compaction phase could have
committed but the drop was cancelled. This could leave some
aoseg/aocsseg entries in AOSEG_STATE_AWAITING_DROP state, which should
then be cleaned up in a subsequent VACUUM as expected... but it
currently just skips them. The master keeps a hash table (AppendOnlyHash)
and uses it during VACUUM to check tuple count, state, and
aborted/committed. Our issue here is that state checking was only
being done against state AWAITING_DROP_READY. It was not including
COMPACTED_AWAITING_DROP in the state check. This commit adds the check
and updates the detailed comment block on the state transitions in
appendonlywriter.h.

cdc8172a

ADD COLUMN on an AOCO table should check and drop aocssegs in await drop · 930b2c38

Committed by Jimmy Yih on May 02, 2017

When doing an ADD COLUMN operation on an AOCO table, the vpinfo byte
array stored in the aocsseg auxiliary table for each column is updated
to add an extra vpinfo index to match the relnatts value in
pg_class. However, entries in the aocsseg auxiliary table with state 2
(AOSEG_STATE_AWAITING_DROP) will not have their vpinfo byte array
updated. Subsequent calls to the getAOCSVPEntry function will then fail
because the vpinfo size does not match the relnatts value in pg_class
(e.g. persistent table rebuild). Since we have the AccessExclusiveLock
on the table during ADD COLUMN, we might as well schedule the drop
before the vpinfos are updated to prevent any inconsistency issues.

930b2c38

Fix lazy VACUUM to not wait for AccessExclusiveLock during drop phase · 05f55e86

Committed by Jimmy Yih on May 04, 2017

We expect lazy VACUUM to skip drop phase if it cannot acquire the
relation's AccessExclusiveLock. This commit properly sets the dontWait
flag to True when calling try_relation_open during drop phase. We also
change the dontWait part of try_relation_open to not error out because
callers should be the ones to handle an invalid relation.
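
A loose analogue of the dontWait behaviour, using a Python lock in place of the relation's AccessExclusiveLock. The helper below only borrows the name `try_relation_open` for readability; it is a hypothetical model and not the gpdb function or its signature.

```python
import threading


def try_relation_open(lock, dont_wait):
    """Return True if the lock was obtained; with dont_wait, return False
    instead of blocking or raising when the lock is busy."""
    if dont_wait:
        return lock.acquire(blocking=False)
    lock.acquire()
    return True


def lazy_vacuum_drop_phase(lock):
    """The caller, not the open helper, decides what a failed open means."""
    if not try_relation_open(lock, dont_wait=True):
        return "skipped"
    try:
        return "dropped"
    finally:
        lock.release()
```

If another holder already has the lock, `lazy_vacuum_drop_phase` returns `"skipped"` immediately rather than waiting.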

05f55e86

10 May, 2017 1 commit

Remove the shared memory object only when transaction of drop resource group is committed · 04ffeec0

Committed by xiong-gang on May 10, 2017

Previously, we removed the shared memory object when dropping a resource group,
and restored it if the transaction aborted. Concurrent access to the
shared memory object would fail in this way.
If the resource group is being dropped, new transactions in this resource
group will be queued up until the drop transaction is finished.

Signed-off-by: Richard Guo <riguo@pivotal.io>

04ffeec0

09 May, 2017 1 commit

sourceTag missed in plan cache · 66abeaa4

Committed by Kenan Yao on May 09, 2017

We were not storing sourceTag in CachedPlanSource, so if a cached plan
is used later, that field is broken, and would break resource queue
concurrency limit functionality for example.

Reported by issue #2308 when using JDBC/ODBC

66abeaa4

03 May, 2017 1 commit

ALTER RESOURCE GROUP SET CONCURRENCY N · 043511e9

Committed by xiong-gang on May 03, 2017

Increasing the 'concurrency' limit takes effect immediately, and the
queued transactions can be woken up. Decreasing the 'concurrency' is
different: if the new limit is smaller than the number of currently running
transactions, the ALTER statement won't cancel running transactions to
meet the limit. Therefore, we use column 'proposed' in pg_resgroupcapability
to represent the effective limit, and use column 'value' to record the
historical limit.

For example, we have a resource group with concurrency=3, and there
are 3 running transactions and 3 queued transactions. If we alter the
concurrency to 2, 'proposed' will be updated to 2 and 'value' will stay
at 3. When one running transaction finishes, it won't wake up the
transactions in the queue, as the current concurrency is 2. If we execute
the statement again to alter the concurrency to 2, it will update the
'value' column to 2, and 'value' is consistent with 'proposed'
again.

Signed-off-by: Richard Guo <riguo@pivotal.io>
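
The worked example above can be replayed with a small Python model. The `ResGroup` class and its methods are hypothetical and exist only for illustration; in particular, the rule that 'value' follows 'proposed' whenever the running count already fits under the new limit is an assumption inferred from the example, not taken from the gpdb source.

```python
from collections import deque


class ResGroup:
    """Toy model of a resource group's concurrency slots."""

    def __init__(self, concurrency):
        self.value = concurrency      # historical limit
        self.proposed = concurrency   # effective limit
        self.running = 0
        self.queue = deque()

    def begin(self, txn):
        """Start txn if a slot is free under 'proposed', else queue it."""
        if self.running < self.proposed:
            self.running += 1
            return True
        self.queue.append(txn)
        return False

    def _wake(self):
        woken = []
        while self.queue and self.running < self.proposed:
            woken.append(self.queue.popleft())
            self.running += 1
        return woken

    def finish(self):
        """End one running transaction; return any transactions woken up."""
        self.running -= 1
        return self._wake()

    def alter_concurrency(self, n):
        """Set the effective limit; never cancels running transactions."""
        self.proposed = n
        if self.running <= n:  # assumption: 'value' catches up once it fits
            self.value = n
        return self._wake()
```

Starting six transactions in a group with concurrency 3 leaves three running and three queued; altering to 2 then sets `proposed` to 2 while `value` stays at 3, and a finishing transaction wakes nobody.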

043511e9

21 Apr, 2017 1 commit

Minor fix on resource group statistic information · d16226bd

Committed by xiong-gang on Apr 21, 2017

1. Change the format of 'total_queue_duration' in 'gp_toolkit.gp_resgroup_status' to keep
it consistent with 'rsgqueueduration' in 'pg_stat_activity'.
2. Show resource group information for running queries in 'pg_stat_activity'.

Signed-off-by: Richard Guo <riguo@pivotal.io>

d16226bd

18 Apr, 2017 1 commit

Concurrency statistic information of resource group · b6e2888c

Committed by xiong-gang on Apr 18, 2017

1. Support num_running, num_queueing, num_queued, num_executed and
total_queue_duration in gp_toolkit.gp_resgroup_status.
2. Reflect status in pg_stat_activity when transaction is queued in resource group.

Signed-off-by: Richard Guo <riguo@pivotal.io>
Signed-off-by: Kenan Yao <kyao@pivotal.io>

b6e2888c

05 Apr, 2017 1 commit

Fix ICW tests after introducing relfilenode counter change. · 974e78dc

Committed by Jimmy Yih on Mar 16, 2017

These tests assumed OID == relfilenode. We updated the tests to not assume it
anymore.

974e78dc

13 Mar, 2017 1 commit

Move test case for querying pg_views, while a view is dropped. · 59eb4ac1

Committed by Heikki Linnakangas on Mar 13, 2017

I modified the test slightly, to eliminate the sleep, making the test run
faster.

59eb4ac1

07 Mar, 2017 2 commits

Add test to validate checkpoint blocked by commit and commit prepared. · 961084ef

Committed by Ashwin Agrawal and Xin Zhang on Feb 27, 2017

961084ef

Extend isolation2 test framework. · a5a473c0

Committed by Ashwin Agrawal and Xin Zhang on Feb 27, 2017

First, it supports shell scripts and commands:

The new syntax is:
```
! <some script>
```
This is required to run something like gpfaultinjector, gpstart, gpstop, etc.

Second, it supports utility mode with blocking command and join:

The new syntax is:
```
2U&: <sql>
2U<:
```
The above example means:
- blocking in utility mode on dbid 2
- join back in previous session in utility mode on dbid 2
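
To make the two syntaxes concrete, here is a small, hypothetical Python parser for single test lines. It is not the isolation2 helper itself: the function name `parse_step`, the returned dictionary and the handling of a line with no `&` or `<` flag are all assumptions made for illustration.

```python
import re

# <session number>, optional U (utility mode), optional & or < flag, then ':'
_STEP = re.compile(r"^(\d+)(U?)([&<]?):\s*(.*)$")


def parse_step(line):
    """Classify one test line as a shell command or a session step."""
    line = line.strip()
    if line.startswith("!"):
        return {"kind": "shell", "command": line[1:].strip()}
    match = _STEP.match(line)
    if match is None:
        raise ValueError("unrecognized step: %r" % line)
    session, utility, flag, sql = match.groups()
    # Assumption: a missing flag means an ordinary, non-blocking step.
    mode = {"&": "blocking", "<": "join", "": "plain"}[flag]
    return {
        "kind": "sql",
        "session": int(session),   # dbid when in utility mode
        "utility": utility == "U",
        "mode": mode,
        "sql": sql,
    }
```

For instance, `parse_step("2U&: select 1;")` yields a blocking utility-mode step on dbid 2, and `parse_step("2U<:")` yields the matching join with an empty SQL string.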

Fix the exception handling to allow the test to complete, and log the failed
commands instead of aborting the test. This makes sure all the cleanup steps
are executed and do not block the following tests.

This also includes an init_file for diff comparison to ignore timestamp output
from gpfaultinjector.

a5a473c0

03 Mar, 2017 1 commit

Add isolation2 .gitignore · de646936

Committed by Ashwin Agrawal on Mar 02, 2017

de646936

23 Feb, 2017 1 commit

Move UAOCS isolation tests to the new isolation2 test suite. · f6215119

Committed by Heikki Linnakangas on Feb 23, 2017

These are the same test queries for column-oriented append-only tables
as those moved by commit 11a5a807 for row-oriented append-only tables.
There were two additional tests that were never executed for row-oriented
tables though: phantom_reads_update_serializable and
phantom_reads_delete_serializable. I believe that was an oversight in the
original test suite; they are now also executed for row-oriented tables.

We use the UAO templating mechanism, to run the same test files against
row- and column-oriented tables. To make that work, fix a bug in the
templating mechanism in pg_regress.c: if the --ao-dir argument was shorter
than 7 characters, the uao directory was not detected correctly.

f6215119

19 Feb, 2017 3 commits

Add ORCA-specific expected output files for new uao-isolation tests. · 302f5839

Committed by Heikki Linnakangas on Feb 19, 2017

ORCA acquires different locks. Same problem that was fixed in the main
test suite by commit b3e13f6b.

302f5839

Fix test cases to work with a three-segment cluster. · aada2649

Committed by Heikki Linnakangas on Feb 18, 2017

Change the queries that check tuple counts on a particular segment,
in utility mode, to not print out the exact tuple counts, but a crude
classification of 0, 1, <5 or more tuples. That's less sensitive to how
the tuples are distributed across segments.
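
The crude classification could look something like the Python function below. The bucket labels and the function name are hypothetical; the actual tests express this inside their queries, and the exact wording of the buckets there is not shown in this commit message.

```python
def classify_tuple_count(n):
    """Map an exact per-segment tuple count to a coarse bucket."""
    if n < 0:
        raise ValueError("tuple count cannot be negative")
    if n == 0:
        return "0"
    if n == 1:
        return "1"
    if n < 5:
        return "<5"
    return "5 or more"
```

Because segments holding 2, 3 or 4 tuples all report `"<5"`, the expected output stays stable when the same rows are spread over a different number of segments.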

The locks_reindex test is moved to the regular regression suite. I rewrote
it to use a more advanced "locktest" view, copied from the
partition_locking test, which doesn't rely on utility mode.

These changes should make the tests work with even larger clusters,
but I've only tested with 1, 2, and 3 nodes.

aada2649

Move UAO isolation test cases from TINC to a new pg_regress based suite. · 11a5a807

Committed by Heikki Linnakangas on Feb 18, 2017

This new "isolation2" suite uses the same Python helper that TINC used,
to run these special isolation test cases.

11a5a807