1. 19 May, 2017 1 commit
    • P
      Implement resource group cpu rate limitation. · 2650f728
      Pengzhou Tang committed
      Resource group CPU rate limitation is implemented with cgroups on Linux
      systems. When resource groups are enabled via GUC, we check whether cgroup
      is available and properly configured on the system. A sub cgroup is
      created for each resource group, and its CPU quota and share weight are set
      according to the resource group configuration. Queries run under
      these cgroups, so their CPU usage is restricted by cgroup.
      
      The cgroups directory structures:
      * /sys/fs/cgroup/{cpu,cpuacct}/gpdb: the toplevel gpdb cgroup
      * /sys/fs/cgroup/{cpu,cpuacct}/gpdb/*/: cgroup for each resource group
      
      The logic for cpu rate limitation:
      
      * in toplevel gpdb cgroup we set the cpu quota and share weight as:
      
          cpu.cfs_quota_us := cpu.cfs_period_us * ncores * gp_resource_group_cpu_limit
          cpu.shares := 1024 * ncores
      
      * for each sub group we set the cpu quota and share weight as:
      
          sub.cpu.cfs_quota_us := -1
          sub.cpu.shares := top.cpu.shares * sub.cpu_rate_limit
      
      The minimum and maximum cpu percentage for a sub cgroup:
      
          sub.cpu.min_percentage := gp_resource_group_cpu_limit * sub.cpu_rate_limit
          sub.cpu.max_percentage := gp_resource_group_cpu_limit
      
      The actual percentage depends on how busy the system is.
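      The formulas above can be worked through with a small Python sketch. It is
      not the GPDB implementation: the function names are hypothetical, the quota
      is assumed to scale with the number of cores (ncores), which is what makes
      the maximum percentage equal gp_resource_group_cpu_limit, and
      cpu_rate_limit is assumed to be a fraction between 0 and 1.

```python
from dataclasses import dataclass

SHARES_PER_CORE = 1024  # per-core share weight from the formulas above


@dataclass
class TopCgroup:
    cfs_quota_us: int
    shares: int


def top_cgroup(cfs_period_us: int, ncores: int, cpu_limit: float) -> TopCgroup:
    """Quota and shares for the toplevel gpdb cgroup (hypothetical helper)."""
    # Assumption: the quota scales with the number of cores.
    quota = round(cfs_period_us * ncores * cpu_limit)
    return TopCgroup(cfs_quota_us=quota, shares=SHARES_PER_CORE * ncores)


def sub_cgroup_shares(top: TopCgroup, cpu_rate_limit: float) -> int:
    """Share weight for one resource group; its quota stays at -1 (unlimited)."""
    return round(top.shares * cpu_rate_limit)


def cpu_percentage_range(cpu_limit: float, cpu_rate_limit: float) -> tuple:
    """(minimum, maximum) CPU fraction a sub cgroup can get."""
    return (cpu_limit * cpu_rate_limit, cpu_limit)


top = top_cgroup(cfs_period_us=100_000, ncores=8, cpu_limit=0.9)
group_shares = sub_cgroup_shares(top, cpu_rate_limit=0.25)
```

      With 8 cores, a 100 ms period and a limit of 0.9, the toplevel quota is
      720000 us per period and a group with cpu_rate_limit 0.25 gets 2048 of the
      8192 shares. An idle system lets that group use up to the full toplevel
      limit, because its own quota is -1.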
      
      gp_resource_group_cpu_limit is a GUC introduced to control the share of
      CPU assigned to resource groups on each host.
      
          gpconfig -c gp_resource_group_cpu_limit -v '0.9'
      
      A new pipeline is created to perform the tests as we need privileged
      permission to enable and setup cgroups on the system.
      Signed-off-by: Ning Yu <nyu@pivotal.io>
  2. 11 May, 2017 4 commits
    • A
      Perform cleanup phase even if open relation failed during drop phase of vacuum. · b3abb1b7
      Vacuum of AO tables happens in four phases: prepare, compact, drop and cleanup.
      When concurrent transactions execute vacuum on AO tables and one of the
      vacuum transactions is in the drop phase, another transaction that is also
      in the drop phase will not be able to open the relation. When
      the relation open fails, vacuum simply exits without performing the cleanup
      phase. The relfrozenxid in pg_class is updated during the cleanup phase, so
      the age of the table is not reduced. This becomes a problem because
      the age of the database is not reduced either, and eventually the xid
      wraparound issue will be encountered.
      
      This problem is fixed by performing the cleanup phase even if opening the
      relation fails during the drop phase.
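      The fixed control flow can be sketched in Python. This is only an
      illustration of the phase ordering described above, not the actual C code:
      vacuum_ao_table and the open_for_drop callback are hypothetical names, and
      a return value of None stands in for a failed relation open.

```python
from typing import Callable, List, Optional


def vacuum_ao_table(open_for_drop: Callable[[], Optional[object]]) -> List[str]:
    """Run the four phases and return the list of phases actually performed."""
    performed = ["prepare", "compact"]

    # Drop phase: a concurrent vacuum in its own drop phase may prevent
    # us from opening the relation.
    relation = open_for_drop()
    if relation is not None:
        performed.append("drop")

    # Cleanup runs regardless, so relfrozenxid still gets updated.
    performed.append("cleanup")
    return performed


phases_when_open_fails = vacuum_ao_table(lambda: None)
phases_when_open_works = vacuum_ao_table(lambda: object())
```

      Before the fix, the failed-open case would have returned right after the
      compact phase, which is why the table age never went down.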
    • J
      VACUUM should cleanup aosegs/aocssegs in AOSEG_STATE_AWAITING_DROP state · cdc8172a
      Jimmy Yih committed
      Because VACUUM runs in several phases, the compaction phase could have
      committed while the drop was cancelled. This can leave some
      aoseg/aocsseg entries in AOSEG_STATE_AWAITING_DROP state, which a
      subsequent VACUUM is expected to clean up, but it currently just
      skips them. The master keeps a hash table (AppendOnlyHash)
      and uses it during VACUUM to check tuple count, state, and
      aborted/committed. The issue here is that state checking was only
      being done against state AWAITING_DROP_READY. It did not include
      COMPACTED_AWAITING_DROP in the state check. This commit adds the check
      and updates the detailed comment block on the state transitions in
      appendonlywriter.h.
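      The widened state check amounts to testing membership in a set of two
      states instead of comparing against one. A minimal Python sketch follows;
      the enum, the AVAILABLE placeholder and the needs_drop helper are
      hypothetical and do not mirror the definitions in appendonlywriter.h.

```python
from enum import Enum, auto


class SegfileState(Enum):
    AVAILABLE = auto()                # placeholder for any non-drop state
    AWAITING_DROP_READY = auto()
    COMPACTED_AWAITING_DROP = auto()


# Both states mean the segment file still has a pending drop.
DROP_PENDING_STATES = frozenset(
    {SegfileState.AWAITING_DROP_READY, SegfileState.COMPACTED_AWAITING_DROP}
)


def needs_drop(state: SegfileState) -> bool:
    """True if VACUUM should schedule this segment file for the drop phase."""
    return state in DROP_PENDING_STATES
```

      Checking only AWAITING_DROP_READY, as before the fix, would make
      needs_drop return False for a segment left in COMPACTED_AWAITING_DROP.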
    • J
      ADD COLUMN on an AOCO table should check and drop aocssegs in await drop · 930b2c38
      Jimmy Yih committed
      When doing an ADD COLUMN operation on an AOCO table, the vpinfo byte
      array stored in the aocsseg auxiliary table for each column is updated
      to add an extra vpinfo index to match the relnatts value in
      pg_class. However, entries in the aocsseg auxiliary table with state 2
      (AOSEG_STATE_AWAITING_DROP) will not have their vpinfo byte array
      updated. Subsequent calls to the getAOCSVPEntry function will then fail
      because the vpinfo size does not match the relnatts value in pg_class
      (e.g. persistent table rebuild). Since we hold the AccessExclusiveLock
      on the table during ADD COLUMN, we might as well schedule the drop
      before the vpinfos are updated to prevent any inconsistency issues.
    • J
      Fix lazy VACUUM to not wait for AccessExclusiveLock during drop phase · 05f55e86
      Jimmy Yih committed
      We expect lazy VACUUM to skip the drop phase if it cannot acquire the
      relation's AccessExclusiveLock. This commit properly sets the dontWait
      flag to True when calling try_relation_open during the drop phase. We also
      change the dontWait part of try_relation_open to not error out, because
      callers should be the ones to handle an invalid relation.
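      The behaviour described above, try the lock without waiting and let the
      caller decide what to do on failure, can be illustrated with Python's
      threading.Lock. This is an analogy only: try_open and
      lazy_vacuum_drop_phase are hypothetical names, and the lock stands in for
      the relation's AccessExclusiveLock.

```python
import threading


def try_open(lock: threading.Lock, dont_wait: bool) -> bool:
    """Acquire the lock; with dont_wait, return False instead of blocking."""
    if dont_wait:
        # No error is raised here: the caller handles the failure.
        return lock.acquire(blocking=False)
    lock.acquire()
    return True


def lazy_vacuum_drop_phase(lock: threading.Lock) -> str:
    """Skip the drop phase when the lock is not immediately available."""
    if not try_open(lock, dont_wait=True):
        return "skipped"
    try:
        return "dropped"
    finally:
        lock.release()


table_lock = threading.Lock()
result_free = lazy_vacuum_drop_phase(table_lock)

table_lock.acquire()          # simulate another session holding the lock
result_busy = lazy_vacuum_drop_phase(table_lock)
table_lock.release()
```

      Returning a failure indicator rather than raising keeps the decision with
      the caller, which is the division of responsibility the commit describes.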
  3. 10 May, 2017 1 commit
  4. 09 May, 2017 1 commit
    • K
      sourceTag missed in plan cache · 66abeaa4
      Kenan Yao committed
      We were not storing sourceTag in CachedPlanSource, so when a cached plan
      is reused later, that field is invalid, which breaks, for example, the
      resource queue concurrency limit functionality.
      
      Reported in issue #2308, seen when using JDBC/ODBC.
  5. 03 May, 2017 1 commit
    • X
      ALTER RESOURCE GROUP SET CONCURRENCY N · 043511e9
      xiong-gang committed
      Increasing the 'concurrency' limit takes effect immediately, and the
      queued transactions can be woken up. Decreasing 'concurrency' is
      different: if the new limit is smaller than the number of currently running
      transactions, the ALTER statement won't cancel running transactions to meet
      the limit. Therefore, we use column 'proposed' in pg_resgroupcapability
      to represent the effective limit, and column 'value' to record the
      historical limit.
      For example, suppose a resource group has concurrency=3, with
      3 running transactions and 3 queued transactions. If we alter the
      concurrency to 2, 'proposed' is updated to 2 and 'value' stays
      at 3. When one running transaction finishes, it won't wake up the
      transactions in the queue because the current concurrency is 2. If we execute
      the statement again to alter the concurrency to 2, it updates the
      'value' column to 2, and 'value' is consistent with 'proposed'
      again.
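      The example above can be replayed with a small Python model. It is a
      sketch under assumptions, not the catalog logic itself: the ResGroup class
      is hypothetical, and the rule that 'value' is only updated when the number
      of running transactions already fits under the new limit is inferred from
      the example rather than stated by the commit.

```python
from dataclasses import dataclass


@dataclass
class ResGroup:
    value: int      # historical limit
    proposed: int   # effective limit
    running: int = 0
    queued: int = 0

    def alter_concurrency(self, new_limit: int) -> None:
        self.proposed = new_limit
        # Assumption: 'value' catches up only once running transactions fit.
        if self.running <= new_limit:
            self.value = new_limit
        self._wake_up()

    def finish_one(self) -> None:
        self.running -= 1
        self._wake_up()

    def _wake_up(self) -> None:
        # Queued transactions start only while under the effective limit.
        while self.queued > 0 and self.running < self.proposed:
            self.queued -= 1
            self.running += 1


group = ResGroup(value=3, proposed=3, running=3, queued=3)
group.alter_concurrency(2)
after_first_alter = (group.value, group.proposed, group.running, group.queued)
group.finish_one()
after_finish = (group.value, group.proposed, group.running, group.queued)
group.alter_concurrency(2)
after_second_alter = (group.value, group.proposed, group.running, group.queued)
```

      After the first ALTER the model holds value=3 and proposed=2 with nothing
      cancelled; after one transaction finishes the queue stays untouched; the
      second ALTER brings 'value' down to 2.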
      Signed-off-by: Richard Guo <riguo@pivotal.io>
  6. 21 April, 2017 1 commit
    • X
      Minor fix on resource group statistic information · d16226bd
      xiong-gang committed
      1. Change the format of 'total_queue_duration' in 'gp_toolkit.gp_resgroup_status' to keep
      it consistent with 'rsgqueueduration' in 'pg_stat_activity'.
      2. Show resource group information for running queries in 'pg_stat_activity'.
      
      Signed-off-by: Richard Guo <riguo@pivotal.io>
  7. 18 April, 2017 1 commit
  8. 05 April, 2017 1 commit
  9. 13 March, 2017 1 commit
  10. 07 March, 2017 2 commits
    • A
    • A
      Extend isolation2 test framework. · a5a473c0
      Ashwin Agrawal and Xin Zhang committed
      First, it supports shell scripts and commands:
      
      The new syntax is:
      ```
      ! <some script>
      ```
      This is required to run something like gpfaultinjector, gpstart, gpstop, etc.
      
      Second, it supports utility mode with blocking commands and join:
      
      The new syntax is:
      ```
      2U&: <sql>
      2U<:
      ```
      The above example means:
      - blocking in utility mode on dbid 2
      - join back in previous session in utility mode on dbid 2
      
      Fix the exception handling to allow the test to complete, and log the failed
      commands instead of aborting the test. This makes sure all the cleanup steps
      are executed and do not block the following tests.
      
      This also includes an init_file for diff comparison, to ignore timestamp
      output from gpfaultinjector.
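      To make the two new line forms concrete, here is a small Python sketch that
      classifies them. It covers only the forms named in this message and is not
      the isolation2 framework's parser: the Step tuple, the regular expressions
      and parse_step are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Step(NamedTuple):
    kind: str             # "shell", "utility_background" or "utility_join"
    dbid: Optional[int]   # dbid for utility-mode steps, None for shell steps
    body: str             # shell command or SQL text (may be empty)


SHELL_RE = re.compile(r"^!\s*(?P<body>.+)$")
UTILITY_RE = re.compile(r"^(?P<dbid>\d+)U(?P<flag>[&<]):\s*(?P<body>.*)$")


def parse_step(line: str) -> Step:
    """Classify one test line; raise ValueError for forms not covered here."""
    line = line.strip()
    match = SHELL_RE.match(line)
    if match:
        return Step("shell", None, match.group("body"))
    match = UTILITY_RE.match(line)
    if match:
        kind = "utility_background" if match.group("flag") == "&" else "utility_join"
        return Step(kind, int(match.group("dbid")), match.group("body"))
    raise ValueError(f"unrecognized step: {line!r}")


shell_step = parse_step("! echo hello")
background_step = parse_step("2U&: SELECT 1;")
join_step = parse_step("2U<:")
```

      Here `2U&:` starts a blocking command in utility mode on dbid 2 and `2U<:`
      joins it back, matching the explanation given with the syntax.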
  11. 03 March, 2017 1 commit
  12. 23 February, 2017 1 commit
    • H
      Move UAOCS isolation tests to the new isolation2 test suite. · f6215119
      Heikki Linnakangas committed
      These are the same test queries for column-oriented append-only tables
      as those moved by commit 11a5a807 for row-oriented append-only tables.
      There were two additional tests that were never executed for row-oriented
      tables though: phantom_reads_update_serializable and
      phantom_reads_delete_serializable. I believe that was an oversight in the
      original test suite; they are now also executed for row-oriented tables.
      
      We use the UAO templating mechanism to run the same test files against
      row- and column-oriented tables. To make that work, fix a bug in the
      templating mechanism in pg_regress.c: if the --ao-dir argument was shorter
      than 7 characters, the uao directory was not detected correctly.
  13. 19 February, 2017 3 commits