188.md 4.7 KB
Newer Older
Lab机器人's avatar
readme  
Lab机器人 已提交
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
# Monitoring Kubernetes

> 原文:[https://docs.gitlab.com/ee/user/project/integrations/prometheus_library/kubernetes.html](https://docs.gitlab.com/ee/user/project/integrations/prometheus_library/kubernetes.html)

*   [Requirements](#requirements)
*   [Metrics supported](#metrics-supported)
*   [Configuring Prometheus to monitor for Kubernetes metrics](#configuring-prometheus-to-monitor-for-kubernetes-metrics)
*   [Specifying the Environment](#specifying-the-environment)
*   [Displaying Canary metrics](#displaying-canary-metrics-premium)
    *   [Canary metrics supported](#canary-metrics-supported)

# Monitoring Kubernetes[](#monitoring-kubernetes "Permalink")

在 GitLab 9.0 中[引入](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/8935) .

GitLab 支持自动检测和监视 Kubernetes 指标.

## Requirements[](#requirements "Permalink")

必须启用[Prometheus](../prometheus.html)[Kubernetes](../kubernetes.html)集成服务.

## Metrics supported[](#metrics-supported "Permalink")

*   平均内存使用量(MB):

    ```
    avg(sum(container_memory_usage_bytes{container_name!="POD",pod_name=~"^%{ci_environment_slug}-([^c].*|c([^a]|a([^n]|n([^a]|a([^r]|r[^y])))).*|)-(.*)",namespace="%{kube_namespace}"}) by (job)) without (job) / count(avg(container_memory_usage_bytes{container_name!="POD",pod_name=~"^%{ci_environment_slug}-([^c].*|c([^a]|a([^n]|n([^a]|a([^r]|r[^y])))).*|)-(.*)",namespace="%{kube_namespace}"}) without (job)) /1024/1024 
    ```

*   平均 CPU 使用率(%):

    ```
    avg(sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name=~"^%{ci_environment_slug}-([^c].*|c([^a]|a([^n]|n([^a]|a([^r]|r[^y])))).*|)-(.*)",namespace="%{kube_namespace}"}[15m])) by (job)) without (job) / count(sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name=~"^%{ci_environment_slug}-([^c].*|c([^a]|a([^n]|n([^a]|a([^r]|r[^y])))).*|)-(.*)",namespace="%{kube_namespace}"}[15m])) by (pod_name)) 
    ```

## Configuring Prometheus to monitor for Kubernetes metrics[](#configuring-prometheus-to-monitor-for-kubernetes-metrics "Permalink")

为了收集 Kubernetes 指标,需要将 Prometheus 部署到集群中并进行正确配置. GitLab 支持两种方法:

*   GitLab [与 Kubernetes 集成](../../clusters/index.html) ,并且可以[将 Prometheus 部署到连接的集群中](../prometheus.html#managed-prometheus-on-kubernetes) . 它被自动配置为收集 Kubernetes 指标.
*   要配置自己的 Prometheus 服务器,可以遵循[Prometheus 文档](https://s0prometheus0io.icopy.site/docs/introduction/overview/) .

## Specifying the Environment[](#specifying-the-environment "Permalink")

为了隔离并仅显示给定环境的相关 CPU 和内存指标,GitLab 需要一种方法来检测其正在运行的容器. 由于这些指标是在容器级别跟踪的,因此传统的 Kubernetes 标签不可用.

相反, [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/)[DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/)名称应以[CI_ENVIRONMENT_SLUG 开头](../../../../ci/variables/README.html#predefined-environment-variables) . 如果需要,可以在其后跟一个`-`和其他内容. 例如, `review-homepage-5620p5`的部署名称将与`review/homepage`环境匹配.

## Displaying Canary metrics[](#displaying-canary-metrics-premium "Permalink")

[GitLab 10.2 中](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/15201)引入.

GitLab 还收集了针对[Canary 部署的](../../canary_deployments.html) Kubernetes 指标,从而可以轻松比较当前部署的版本和 Canary.

这些度量标准期望[Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/)[DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/)名称以`$CI_ENVIRONMENT_SLUG-canary`开头,以隔离 canary 度量标准.

### Canary metrics supported[](#canary-metrics-supported "Permalink")

*   平均内存使用量(MB)

    ```
    avg(sum(container_memory_usage_bytes{container_name!="POD",pod_name=~"^%{ci_environment_slug}-canary-(.*)",namespace="%{kube_namespace}"}) by (job)) without (job) / count(avg(container_memory_usage_bytes{container_name!="POD",pod_name=~"^%{ci_environment_slug}-canary-(.*)",namespace="%{kube_namespace}"}) without (job)) /1024/1024 
    ```

*   平均 CPU 使用率(%)

    ```
    avg(sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name=~"^%{ci_environment_slug}-canary-(.*)",namespace="%{kube_namespace}"}[15m])) by (job)) without (job) / count(sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name=~"^%{ci_environment_slug}-canary-(.*)",namespace="%{kube_namespace}"}[15m])) by (pod_name)) 
    ```