1. 24 Apr, 2020 1 commit
  2. 05 Jun, 2019 1 commit
  3. 11 Jan, 2019 1 commit
  4. 22 Aug, 2018 1 commit
    • N
      resgroup: treat cpuset as -1 when it is missing in catalog. · dfcc5656
      Ning Yu authored
      When a resgroup rg1 is created on version 5.8 it has no cpuset setting
      in pg_resgroupcapability.  If the cluster is then upgraded to a version
      that supports the resgroup cpuset feature (5.9 for example), rg1's
      cpuset shows up as "" instead of "-1", which stands for an empty
      cpuset.  We then accidentally treat rg1's cpuset as non-empty, so rg1
      cannot be altered if the cpuset cgroup dir does not exist.
      
      To fix the issue we should set rg1's cpuset to "-1" when it's missing in
      pg_resgroupcapability.
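      
      A minimal sketch of how the stored setting could be inspected after such
      an upgrade; the psql invocation and the group name rg1 are illustrative
      only, and the exact rows depend on the release:
      
          # hypothetical check: list the capabilities stored for rg1; after an
          # upgrade from 5.8 the cpuset entry may be absent or empty, and this
          # fix treats that case the same as "-1" (an empty cpuset)
          psql -d postgres -c "
              SELECT g.rsgname, c.reslimittype, c.value
                FROM pg_resgroup g
                JOIN pg_resgroupcapability c ON c.resgroupid = g.oid
               WHERE g.rsgname = 'rg1';"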
      
      Also changed the binary swap tests to remove the cpuset cgroup dir
      before running the tests, so this issue could be triggered.
      
      Co-Author: Jialun Du <jdu@pivotal.io>
      Co-Author: Ning Yu <nyu@pivotal.io>
      dfcc5656
  5. 05 Jun, 2018 1 commit
    • J
      Implement CPUSET (#5023) · 0c0782fe
      Jialun authored
      * Implement CPUSET, a new way to manage cpu resources in resource
        groups: it reserves the specified cores exclusively for the specified
        resource group.  This ensures that cpu resources are always available
        for a group that has CPUSET set.  The most common scenario is
        allocating fixed cores for short queries.
      
      - One can use it by executing CREATE RESOURCE GROUP xxx WITH (
        cpuset='0-1', xxxx), where 0-1 are the cpu cores reserved for this
        group, or ALTER RESOURCE GROUP xxx SET CPUSET '0,1' to modify the
        value (see the sketch after this list).
      - The CPUSET value is a comma-separated list of entries, where each
        entry is either a single core number or an interval of core numbers,
        e.g. 0,1,2-3.  All cores in CPUSET must be available in the system,
        and the core numbers of different groups cannot overlap.
      - CPUSET and CPU_RATE_LIMIT are mutually exclusive. One cannot create a
        resource group with both CPUSET and CPU_RATE_LIMIT, but a group can
        be switched between them freely with ALTER; once one of them is set,
        the other is disabled.
      - The cpu cores are returned to GPDB when the group is dropped, when
        the CPUSET value is changed, or when CPU_RATE_LIMIT is set.
      - If some cores have been allocated to a resource group via CPUSET,
        then CPU_RATE_LIMIT in the other groups only indicates a percentage
        of the remaining cpu cores.
      - Even when GPDB is busy and all the cores that are not reserved by any
        resource group through CPUSET are exhausted, the reserved cores are
        still not handed out to other groups.
      - The cpu cores in CPUSET are used exclusively only at the GPDB level;
        other non-GPDB processes in the system may still use them.
      - Add test cases for this new feature.  The test environment must
        contain at least two cpu cores, so we upgrade the instance_type
        configuration of the resource_group jobs.
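      
      A minimal sketch of the statements described above, issued through psql;
      the group name, the extra limits, and the core numbers are illustrative
      assumptions (the cores must exist on the host):
      
          # hypothetical group that reserves cores 0-1 for short queries
          psql -d postgres -c "CREATE RESOURCE GROUP rg_short_query
              WITH (cpuset='0-1', memory_limit=10);"
          
          # change the reserved cores
          psql -d postgres -c "ALTER RESOURCE GROUP rg_short_query SET CPUSET '0,2-3';"
          
          # switching to CPU_RATE_LIMIT releases the reserved cores back to GPDB
          psql -d postgres -c "ALTER RESOURCE GROUP rg_short_query SET CPU_RATE_LIMIT 20;"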
      
      * - Stay compatible with the case where the cgroup directory
        cpuset/gpdb does not exist
      - Implement pg_dump support for cpuset & memory_auditor
      - Fix a typo
      - Change the default cpuset value from an empty string to -1, because
        the code in 5X assumes that every default value in resource group is
        an integer; a non-integer value would make the system fail to start
      0c0782fe
  6. 19 Apr, 2018 1 commit
    • N
      resgroup: backward compatibility for memory auditor · f2f86174
      Ning Yu authored
      Memory auditor is a new feature introduced to allow external components
      (e.g. pl/container) to be managed by resource groups.  This feature
      requires a new gpdb dir to be created in the cgroup memory controller;
      however, on the 5X branch, unless the users created this new dir
      manually, the upgrade from a previous version would fail.
      
      In this commit we provide backward compatibility by checking the release
      version:
      
      - on the 6X and master branches the memory auditor feature is always
        enabled, so the new gpdb dir is mandatory;
      - on the 5X branch the memory auditor feature can be enabled only if
        the new gpdb dir has been created with proper permissions; when it is
        disabled, `CREATE RESOURCE GROUP ... WITH (memory_auditor='cgroup')`
        fails with guidance on how to enable it (see the sketch after this
        list);
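      
      A hedged sketch of what satisfying the 5X check could look like; the
      cgroup mount point, the ownership, and the group definition below are
      assumptions for illustration, not taken from this commit:
      
          # assumed mount point: create the gpdb dir in the memory controller
          # and hand it to the gpadmin user so it has the proper permissions
          sudo mkdir -p /sys/fs/cgroup/memory/gpdb
          sudo chown -R gpadmin:gpadmin /sys/fs/cgroup/memory/gpdb
          
          # with the dir in place, a cgroup-audited group can be created;
          # the limit values here are placeholders
          psql -d postgres -c "CREATE RESOURCE GROUP rg_plcontainer
              WITH (memory_auditor='cgroup', concurrency=0,
                    cpu_rate_limit=10, memory_limit=10);"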
      
      Binary swap tests are also provided to verify backward compatibility in
      future releases.  As cgroup needs to be configured to enable resgroup,
      we split the resgroup binary swap tests into two parts:
      
      - resqueue-mode-only tests, which can be triggered in the
        icw_gporca_centos6 pipeline job after the ICW tests; these parts have
        no requirements on cgroup;
      - complete resqueue & resgroup mode tests, which can be triggered in
        the mpp_resource_group_centos{6,7} pipeline jobs after the resgroup
        tests; these parts need cgroup to be properly configured.
      f2f86174
  7. 03 Apr, 2018 1 commit
    • A
      CI: Upgrade to CCP 2.0 · b869221c
      Alexandra Wang authored
      CCP 2.0 includes the following changes:
      
      1) CCP migration from AWS to GOOGLE.
      
      CCP jobs (except for jobs that need a connection to ddboost and
      netbackup) no longer need external workers, therefore the ccp tags for
      external workers are removed.
      
      The tfstate backends for AWS and GOOGLE are stored separately in the s3
      bucket, `clusters-aws/` for aws and `clusters-google/` for google;
      set_failed is also different between the two cloud providers.
      
      2) Separate gpinitsystem from the gen_cluster task
      
      When failures occur in production for gpinitsystem itself, it is
      important for a developer to be able to quickly distinguish whether it
      is a CCP failure, or a problem with the binaries used to init the GPDB
      cluster. By separating the tasks, it is easier to see when gpinit itself
      has failed.
      
      3) The path to scripts used in CCP has changed
      
      Instead of all of the generic scripts living in `ccp_src/aws/`, they
      are now in a better location, `ccp_src/scripts/`.
      
      4) Parameter names have changed
      
      `platform` is now `PLATFORM` for all references in CCP jobs.
      
      5) NVME jobs
      
      Jobs that used NVME in AWS have been migrated to an equivalent NVME
      feature in GCP, but this does include a change to the terraform path
      specified in the job.
      
      6) Instance type mapping from EC2 to GCE
      
      The new parameter name for specifying the instance type in GCP jobs is
      `instance_type`.  There is not always a 1:1 match between instance
      types, so there are slight differences in available resources for some
      jobs.
      Signed-off-by: Kris Macoskey <kmacoskey@pivotal.io>
      b869221c
  8. 21 Mar, 2018 1 commit
  9. 20 Mar, 2018 1 commit
  10. 24 Jan, 2018 1 commit
    • A
      CI: Print only first 1000 lines of diff output. · 40b3b16c
      Ashwin Agrawal authored
      This is basically just a work-around for broken functionality of the
      current CI.  Ideally it should work with an arbitrary number of lines
      dumped, but since it currently cannot, and mostly fails to load if the
      diff output is very long, add a limit to the number of lines printed.
      The right upper limit is unknown, so try 1000; if a run is broken
      beyond 1000 lines of diff, looking at it from CI is not helpful anyway,
      and logging into the container to check is better.
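      
      A minimal sketch of the kind of limiting described; the diff file name
      is an assumption:
      
          # hypothetical: dump only the first 1000 lines of the regression diffs
          head -n 1000 regression.diffs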
      40b3b16c
  11. 13 Dec, 2017 1 commit
  12. 08 Nov, 2017 2 commits
    • N
      resgroup: concourse: do not let inner ssh hijack the input. · bf7d346e
      Ning Yu authored
      In the resgroup concourse pipeline we use a bash here doc to execute
      commands on a remote server, but once an ssh command is executed the
      remaining commands are all ignored.  This is because the here doc is
      hijacked, or consumed, by the inner ssh.
      
      Fixed by redirecting the inner ssh's stdin from /dev/null.
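      
      A minimal sketch of the fix; the host and the command are invented for
      illustration:
      
          # inside the here doc that feeds the outer ssh, an inner ssh must not
          # read from stdin, otherwise it consumes the rest of the here doc and
          # the commands written after it never reach the remote shell
          ssh gpadmin@segment-host "uptime" </dev/null
          # ssh -n is an equivalent spelling: it also takes stdin from /dev/null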
      bf7d346e
    • N
      resgroup: concourse: pass error code on ICR errors. · 09c4744f
      Ning Yu authored
      The installcheck-resgroup (ICR) error code was being replaced by the
      diff watcher's error code.  Now pass it to the shell correctly.
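      
      A hedged sketch of the pattern; the watcher and the file name are
      placeholders, not the actual pipeline scripts:
      
          # run a hypothetical diff watcher in the background while ICR runs
          touch regression.diffs
          tail -f regression.diffs & watcher_pid=$!
          
          make installcheck-resgroup; icr_status=$?   # remember ICR's own code
          
          kill "$watcher_pid" 2>/dev/null
          exit "$icr_status"     # exit with ICR's code, not the watcher's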
      09c4744f
  13. 07 Nov, 2017 2 commits
  14. 01 Nov, 2017 1 commit
  15. 28 Sep, 2017 1 commit
    • G
      Fix red pipeline job · d39f0a46
      Gang Xiong authored
      configure requires libcurl when pxf is enabled; pxf is not needed for
      the resource group regression tests, so disable it.
      d39f0a46
  16. 15 Sep, 2017 1 commit
  17. 07 Aug, 2017 2 commits
  18. 04 Aug, 2017 1 commit
  19. 19 May, 2017 1 commit
    • P
      Implement resource group cpu rate limitation. · 2650f728
      Pengzhou Tang authored
      Resource group cpu rate limitation is implemented with cgroup on Linux
      systems.  When resource group is enabled via GUC we check whether cgroup
      is available and properly configured on the system.  A sub cgroup is
      created for each resource group; its cpu quota and share weight are set
      depending on the resource group configuration.  Queries run under these
      cgroups, and their cpu usage is restricted by cgroup.
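      
      An illustrative sketch of turning the feature on; the GUC value and the
      restart step follow standard GPDB usage and are assumptions here, not
      part of this commit:
      
          # switch the resource manager from resource queues to resource
          # groups, then restart so the cgroup checks run at startup
          gpconfig -c gp_resource_manager -v "group"
          gpstop -ar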
      
      The cgroup directory structure:
      * /sys/fs/cgroup/{cpu,cpuacct}/gpdb: the toplevel gpdb cgroup
      * /sys/fs/cgroup/{cpu,cpuacct}/gpdb/*/: cgroup for each resource group
      
      The logic for cpu rate limitation:
      
      * in toplevel gpdb cgroup we set the cpu quota and share weight as:
      
          cpu.cfs_quota_us := cpu.cfs_period_us * 256 * gp_resource_group_cpu_limit
          cpu.shares := 1024 * ncores
      
      * for each sub group we set the cpu quota and share weight as:
      
          sub.cpu.cfs_quota_us := -1
          sub.cpu.shares := top.cpu.shares * sub.cpu_rate_limit
      
      The minimum and maximum cpu percentage for a sub cgroup:
      
          sub.cpu.min_percentage := gp_resource_group_cpu_limit * sub.cpu_rate_limit
          sub.cpu.max_percentage := gp_resource_group_cpu_limit
      
      The actual percentage depends on how busy the system is.
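      
      For concreteness, a worked example with assumed values (8 cores,
      gp_resource_group_cpu_limit = 0.9, one sub group with cpu_rate_limit =
      20%):
      
          top.cpu.shares         := 1024 * 8   = 8192
          sub.cpu.shares         := 8192 * 0.2 = 1638 (approx.)
          sub.cpu.min_percentage := 0.9 * 0.2  = 18%
          sub.cpu.max_percentage := 0.9        = 90%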
      
      gp_resource_group_cpu_limit is a GUC introduced to control the share of
      the cpu on each host that is assigned to resgroups.
      
          gpconfig -c gp_resource_group_cpu_limit -v '0.9'
      
      A new pipeline is created to perform the tests, as we need privileged
      permissions to enable and set up cgroups on the system.
      Signed-off-by: Ning Yu <nyu@pivotal.io>
      2650f728