Using Resource Groups

Note: Resource groups are an experimental feature and are not intended for use in a production environment. Experimental features are subject to change without notice in future releases.

You can use resource groups to manage the number of active queries that may execute concurrently in your Greenplum Database cluster. With resource groups, you can also manage the amount of CPU and memory resources Greenplum allocates to each query.

When a user executes a query, Greenplum Database evaluates the query against the set of limits defined for the user's resource group. Greenplum Database executes the query immediately if the group's resource limits have not yet been reached and the query does not cause the group to exceed its concurrent transaction limit. If these conditions are not met, Greenplum Database queues the query. For example, if the maximum number of concurrent transactions for the resource group has already been reached, a subsequent query is queued and must wait until other queries complete before it runs. Greenplum Database may also execute a pending query when the resource group's concurrency and memory limits are altered to sufficiently large values.

Within a resource group, transactions are evaluated on a first in, first out basis. Greenplum Database periodically assesses the active workload of the system, reallocating resources and starting/queuing jobs as necessary.

When you create a resource group, you provide a set of limits that determine the amount of CPU and memory resources available to transactions executed within the group. These limits are:

Limit Type           Description
CONCURRENCY          The maximum number of concurrent transactions, including active and idle transactions, permitted for this resource group.
CPU_RATE_LIMIT       The percentage of CPU resources available to this resource group.
MEMORY_LIMIT         The percentage of memory resources available to this resource group.
MEMORY_SHARED_QUOTA  The percentage of memory to share across transactions submitted in this resource group.
MEMORY_SPILL_RATIO   The memory usage threshold for memory-intensive transactions. When a transaction reaches this threshold, it spills to disk.
Transaction Concurrency Limit

The CONCURRENCY limit controls the maximum number of concurrent transactions permitted for the resource group. Each resource group is logically divided into a fixed number of slots equal to the CONCURRENCY limit. Greenplum Database allocates these slots an equal, fixed percentage of memory resources.

The default CONCURRENCY limit value for a resource group is 20.

Greenplum Database queues any transactions submitted after the resource group reaches its CONCURRENCY limit. When a running transaction completes, Greenplum Database un-queues and executes the earliest queued transaction if sufficient memory resources exist.
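
For example, a minimal sketch of limiting a group to five concurrent transactions (the group name rg_reporting and the limit values shown are illustrative only):

=# CREATE RESOURCE GROUP rg_reporting WITH (CONCURRENCY=5, CPU_RATE_LIMIT=10, MEMORY_LIMIT=10);

A sixth transaction submitted by a role assigned to rg_reporting is queued until one of the five running transactions completes.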

CPU Limit

The gp_resource_group_cpu_limit server configuration parameter identifies the maximum percentage of system CPU resources to allocate to resource groups on each Greenplum Database segment node. The remaining CPU resources are used for the OS kernel and the Greenplum Database daemon processes. The default gp_resource_group_cpu_limit value is .9 (90%).

The default gp_resource_group_cpu_limit value may not leave sufficient CPU resources if you are running other workloads on your Greenplum Database cluster nodes, so be sure to adjust this server configuration parameter accordingly.
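
For example, a minimal sketch of lowering this limit with gpconfig (the value 0.7 is illustrative only, and the restart shown assumes the parameter change does not take effect until Greenplum Database is restarted):

gpconfig -c gp_resource_group_cpu_limit -v 0.7
gpstop
gpstart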

The Greenplum Database node CPU percentage is further divided equally among each segment on the Greenplum node. Each resource group reserves a percentage of the segment CPU for resource management. You identify this percentage via the CPU_RATE_LIMIT value you provide when you create the resource group.

The minimum CPU_RATE_LIMIT percentage you can specify for a resource group is 1; the maximum is 100.

The sum of CPU_RATE_LIMITs specified for all resource groups you define in your Greenplum Database cluster must not exceed 100.

CPU resource assignment is elastic in that Greenplum Database may allocate the CPU resources of an idle resource group to busier resource groups. In such situations, CPU resources are re-allocated to the previously idle resource group when that group next becomes active. If multiple resource groups are busy, they are allocated the CPU resources of any idle resource groups based on the ratio of their CPU_RATE_LIMITs. For example, a resource group created with a CPU_RATE_LIMIT of 40 is allocated twice as much extra CPU resource as a resource group created with a CPU_RATE_LIMIT of 20.
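
For example, a minimal sketch of two groups sized to share idle CPU in a 2:1 ratio (the group names and limit values are illustrative only, and they assume the cluster has sufficient CPU_RATE_LIMIT headroom remaining):

=# CREATE RESOURCE GROUP rg_etl WITH (CPU_RATE_LIMIT=20, MEMORY_LIMIT=20);
=# CREATE RESOURCE GROUP rg_dashboards WITH (CPU_RATE_LIMIT=10, MEMORY_LIMIT=10);

If another resource group sits idle while both of these groups are busy, rg_etl is allocated twice as much of the idle group's CPU as rg_dashboards.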

Memory Limits

When resource groups are enabled, memory usage is managed at the Greenplum Database node, segment, resource group, and transaction levels.

The gp_resource_group_memory_limit server configuration parameter identifies the maximum percentage of system memory resources to allocate to resource groups on each Greenplum Database segment node. The default gp_resource_group_memory_limit value is .9 (90%).

The memory resource available on a Greenplum Database node is further divided equally among each segment on the node. Each resource group reserves a percentage of the segment memory for resource management. You identify this percentage via the MEMORY_LIMIT value you specify when you create the resource group. The minimum MEMORY_LIMIT percentage you can specify for a resource group is 1; the maximum is 100.

The sum of MEMORY_LIMITs specified for all resource groups you define in your Greenplum Database cluster must not exceed 100.

The memory reserved by the resource group is divided into fixed and shared components. The MEMORY_SHARED_QUOTA value you specify when you create the resource group identifies the percentage of reserved resource group memory that may be shared among the currently running transactions. This memory is allotted on a first-come, first-served basis. A running transaction may use none, some, or all of the MEMORY_SHARED_QUOTA.

The minimum MEMORY_SHARED_QUOTA you can specify is 0; the maximum is 100. The default MEMORY_SHARED_QUOTA is 20.

As mentioned previously, CONCURRENCY identifies the maximum number of concurrently running transactions permitted in the resource group. The fixed memory reserved by a resource group is divided into CONCURRENCY number of transaction slots. Each slot is allocated a fixed, equal amount of resource group memory. Greenplum Database guarantees this fixed memory to each transaction.

Figure: Resource Group Memory Allotments

When a query's memory usage exceeds the fixed per-transaction memory usage amount, Greenplum Database allocates available resource group shared memory to the query. The maximum amount of resource group memory available to a specific transaction slot is the sum of the transaction's fixed memory and the full resource group shared memory allotment.
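
As an illustrative calculation only (the numbers are hypothetical): if a resource group reserves 10 GB of segment memory and its MEMORY_SHARED_QUOTA is 20, the group has 2 GB of shared memory and 8 GB of fixed memory. With a CONCURRENCY of 4, each transaction slot is guaranteed 2 GB of fixed memory, and a single transaction could use at most 4 GB: its 2 GB fixed allotment plus the full 2 GB of shared memory.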

Spill to Disk

MEMORY_SPILL_RATIO identifies the memory usage threshold for memory-intensive operators in a transaction. When the transaction reaches this memory threshold, it spills to disk. Greenplum Database uses the MEMORY_SPILL_RATIO to determine the initial memory to allocate to a transaction.

The minimum MEMORY_SPILL_RATIO percentage you can specify for a resource group is 0. The maximum is 100. The default MEMORY_SPILL_RATIO is 20.

You define the MEMORY_SPILL_RATIO when you create a resource group. You can selectively set this limit on a per-query basis at the session level with the memory_spill_ratio server configuration parameter.
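
For example, a minimal sketch of overriding the ratio for the current session before running a memory-intensive query (the value 30 is illustrative only):

=# SET memory_spill_ratio=30;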

Note: Significant Pivotal Greenplum Database performance degradation has been observed when enabling resource group-based workload management on supported versions of Red Hat 6.x, CentOS 6.x, and SuSE 11 systems. The issue is due to a Linux cgroup kernel bug. The kernel bug has been fixed on CentOS 7.x and Red Hat 7.x systems.
Prerequisite

Greenplum Database resource groups use Linux Control Groups (cgroups) to manage CPU resources. With cgroups, Greenplum isolates the CPU usage of your Greenplum processes from other processes on the node. This allows Greenplum to support CPU usage restrictions on a per-resource-group basis.

For detailed information about cgroups, refer to the Control Groups documentation for your Linux distribution.

Complete the following tasks on each node in your Greenplum Database cluster to set up cgroups for use with resource groups:

  1. If you are running the Suse 11+ operating system on your Greenplum Database cluster nodes, you must enable swap accounting on each node and restart your Greenplum Database cluster. The swapaccount kernel boot parameter governs the swap accounting setting on Suse 11+ systems. After setting this boot parameter, you must reboot your systems. For details, refer to the Cgroup Swap Control discussion in the Suse 11 release notes. You must be the superuser or have sudo access to configure kernel boot parameters and reboot systems.
  2. Create the Greenplum Database cgroups configuration file /etc/cgconfig.d/gpdb.conf. You must be the superuser or have sudo access to create this file:

     sudo vi /etc/cgconfig.d/gpdb.conf
  3. Add the following configuration information to /etc/cgconfig.d/gpdb.conf:

     group gpdb {
         perm {
             task {
                 uid = gpadmin;
                 gid = gpadmin;
             }
             admin {
                 uid = gpadmin;
                 gid = gpadmin;
             }
         }
         cpu {
         }
         cpuacct {
         }
     }

    This content configures CPU and CPU accounting control groups managed by the gpadmin user.

  4. If not already installed and running, install the Control Groups operating system package and start the cgroups service on each Greenplum Database node. The commands you run to perform these tasks will differ based on the operating system installed on the node. You must be the superuser or have sudo access to run these commands:
    • Redhat/CentOS 7.x systems:

      sudo yum install libcgroup-tools
      sudo cgconfigparser -l /etc/cgconfig.d/gpdb.conf
    • Redhat/CentOS 6.x systems:

      sudo yum install libcgroup
      sudo service cgconfig start
    • Suse 11+ systems:

      sudo zypper install libcgroup-tools
      sudo cgconfigparser -l /etc/cgconfig.d/gpdb.conf
  5. Identify the cgroup directory mount point for the node:

     grep cgroup /proc/mounts

    The first line of output identifies the cgroup mount point.

  6. Verify that you set up the Greenplum Database cgroups configuration correctly by running the following commands. Replace cgroup_mount_point with the mount point you identified in the previous step:

     ls -l cgroup_mount_point/cpu/gpdb
     ls -l cgroup_mount_point/cpuacct/gpdb

    If these directories exist and are owned by gpadmin:gpadmin, you have successfully configured cgroups for Greenplum Database CPU resource management.

Procedure

To use resource groups in your Greenplum Database cluster, you:

  1. Enable resource groups for your Greenplum Database cluster.
  2. Create resource groups.
  3. Assign the resource groups to one or more roles.
  4. Use resource management system views to monitor and manage the resource groups.
Enabling Resource Groups

When you install Greenplum Database, resource queues are enabled by default. To use resource groups instead of resource queues, you must set the gp_resource_manager server configuration parameter.

  1. Set the gp_resource_manager server configuration parameter to the value "group":

     gpconfig -s gp_resource_manager
     gpconfig -c gp_resource_manager -v "group"
  2. Restart Greenplum Database:

     gpstop
     gpstart

Once enabled, any transaction submitted by a role is directed to the resource group assigned to the role, and is governed by that resource group's concurrency, memory, and CPU limits.

Greenplum Database creates two default resource groups named admin_group and default_group. When you enable resource groups, any role that was not explicitly assigned a resource group is assigned the default group for the role's capability: SUPERUSER roles are assigned the admin_group, and non-admin roles are assigned the group named default_group.

The default resource groups admin_group and default_group are created with the following resource limits:

Limit Type           admin_group   default_group
CONCURRENCY          10            20
CPU_RATE_LIMIT       10            30
MEMORY_LIMIT         10            30
MEMORY_SHARED_QUOTA  50            50
MEMORY_SPILL_RATIO   20            20
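
For example, a hedged sketch of reducing the CPU reserved by default_group to leave headroom for groups you create (the new value is illustrative only):

=# ALTER RESOURCE GROUP default_group SET CPU_RATE_LIMIT 10;
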
Creating Resource Groups

When you create a resource group, you provide a name, a CPU limit, and a memory limit. You can optionally provide a concurrent transaction limit, a memory shared quota, and a memory spill ratio. Use the CREATE RESOURCE GROUP command to create a new resource group.

When you create a resource group, you must provide CPU_RATE_LIMIT and MEMORY_LIMIT limit values. These limits identify the percentage of Greenplum Database resources to allocate to this resource group. For example, to create a resource group named rgroup1 with a CPU limit of 20 and a memory limit of 25:

=# CREATE RESOURCE GROUP rgroup1 WITH (CPU_RATE_LIMIT=20, MEMORY_LIMIT=25);

The CPU limit of 20 is shared by every role to which rgroup1 is assigned. Similarly, the memory limit of 25 is shared by every role to which rgroup1 is assigned. rgroup1 utilizes the default CONCURRENCY setting of 20.
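
A sketch that also supplies the optional limits (the group name and all values shown are illustrative only):

=# CREATE RESOURCE GROUP rgroup2 WITH (CPU_RATE_LIMIT=10, MEMORY_LIMIT=15, CONCURRENCY=10, MEMORY_SHARED_QUOTA=30, MEMORY_SPILL_RATIO=30);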

The ALTER RESOURCE GROUP command updates the limits of a resource group. To change the limits of a resource group, specify the new values you want for the group. For example:

=# ALTER RESOURCE GROUP rg_light SET CONCURRENCY 7;
=# ALTER RESOURCE GROUP exec SET MEMORY_LIMIT 25;

You cannot set or alter the CONCURRENCY value for the admin_group to zero (0).

The DROP RESOURCE GROUP command drops a resource group. To drop a resource group, the group cannot be assigned to any role, nor can there be any transactions active or waiting in the resource group. To drop a resource group:

=# DROP RESOURCE GROUP exec;

Assigning a Resource Group to a Role

When you create a resource group, the group is available for assignment to one or more roles (users). You assign a resource group to a database role using the RESOURCE GROUP clause of the CREATE ROLE or ALTER ROLE commands. If you do not specify a resource group for a role, the role is assigned the default group for the role's capability: SUPERUSER roles are assigned the admin_group, and non-admin roles are assigned the group named default_group.

Use the ALTER ROLE or CREATE ROLE commands to assign a resource group to a role. For example:

=# ALTER ROLE bill RESOURCE GROUP rg_light;
=# CREATE ROLE mary RESOURCE GROUP exec;

You can assign a resource group to one or more roles. If you have defined a role hierarchy, assigning a resource group to a parent role does not propagate down to the members of that role group.
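
For example, assuming a hypothetical parent role dept_parent with member roles analyst1 and analyst2, assigning a resource group to dept_parent does not change the members' groups; assign each member explicitly:

=# ALTER ROLE dept_parent RESOURCE GROUP rg_light;
=# ALTER ROLE analyst1 RESOURCE GROUP rg_light;
=# ALTER ROLE analyst2 RESOURCE GROUP rg_light;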

If you wish to remove a resource group assignment from a role and assign the role the default group, change the role's group name assignment to NONE. For example:

=# ALTER ROLE mary RESOURCE GROUP NONE;

Monitoring Resource Group Status

Monitoring the status of your resource groups and queries may involve the following tasks:

Viewing Resource Group Limits

The gp_toolkit.gp_resgroup_config system view displays the current and proposed limits for a resource group. The proposed limit differs from the current limit when you alter a limit but the new value cannot be applied immediately. To view the limits of all resource groups:

=# SELECT * FROM gp_toolkit.gp_resgroup_config;

Viewing Resource Group Query Status and CPU/Memory Usage

The gp_toolkit.gp_resgroup_status system view enables you to view the status and activity of a resource group. The view displays the number of running and queued transactions. It also displays the real-time CPU and memory usage of the resource group. To view this information:

=# SELECT * FROM gp_toolkit.gp_resgroup_status;

Viewing the Resource Group Assigned to a Role

To view the resource group-to-role assignments, perform the following query on the pg_roles and pg_resgroup system catalog tables:

=# SELECT rolname, rsgname FROM pg_roles, pg_resgroup
     WHERE pg_roles.rolresgroup=pg_resgroup.oid;

Viewing a Resource Group's Running and Pending Queries

To view a resource group's running queries, pending queries, and how long the pending queries have been queued, examine the pg_stat_activity system view:

=# SELECT current_query, waiting, rsgname, rsgqueueduration FROM pg_stat_activity;

Canceling a Queued Transaction in a Resource Group

There may be cases when you want to cancel a queued transaction in a resource group. For example, you may want to remove a query that is waiting in the resource group queue but has not yet been executed. Or, you may want to stop a running query that is taking too long to execute, or one that is sitting idle in a transaction and taking up resource group transaction slots that are needed by other users.

To cancel a queued transaction, you must first determine the process id (pid) associated with the transaction. Once you have obtained the process id, you can invoke pg_cancel_backend() to end that process, as shown below.

For example, to view the process information associated with all statements currently active or waiting in all resource groups, run the following query. If the query returns no results, then there are no running or queued transactions in any resource group.

=# SELECT rolname, g.rsgname, procpid, waiting, current_query, datname
     FROM pg_roles, gp_toolkit.gp_resgroup_status g, pg_stat_activity
     WHERE pg_roles.rolresgroup=g.groupid
       AND pg_stat_activity.usename=pg_roles.rolname;

Sample partial query output:

 rolname | rsgname  | procpid | waiting |     current_query     | datname
---------+----------+---------+---------+-----------------------+---------
  sammy  | rg_light |   31861 | f       | <IDLE> in transaction | testdb
  billy  | rg_light |   31905 | t       | SELECT * FROM topten; | testdb

Use this output to identify the process id (procpid) of the transaction you want to cancel, and then cancel the process. For example, to cancel the pending query identified in the sample output above:

=# SELECT pg_cancel_backend(31905);

You can provide an optional message in a second argument to pg_cancel_backend() to indicate to the user why the process was cancelled.
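
For example, a sketch of canceling the same pending query with an explanatory message (the message text is illustrative only):

=# SELECT pg_cancel_backend(31905, 'Canceling pending query to free a rg_light transaction slot.');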

Do not use an operating system KILL command to cancel any Greenplum Database process.