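## Monitoring Paddle Serving with Prometheus

Paddle Serving can export performance metrics for Prometheus to collect. The default endpoint is `http://localhost:19393/metrics`. The data is served as plain text, so you can inspect it directly with the following command: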
```
curl http://localhost:19393/metrics
```
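
The same endpoint can also be read from a script. The following is a minimal sketch, assuming the endpoint returns the standard Prometheus text exposition format and that the service is listening on the default port; the helper names `parse_metrics` and `fetch_metrics` are hypothetical and not part of Paddle Serving.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format output into {sample name (with labels): value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labelled sample: keep everything up to the closing brace as the key.
            head, _, rest = line.rpartition("}")
            name = head + "}"
        else:
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            # The first field is the value; an optional timestamp may follow.
            samples[name] = float(fields[0])
        except ValueError:
            continue
    return samples


def fetch_metrics(url="http://localhost:19393/metrics", timeout=5.0):
    """Download the metrics page and return the parsed samples."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running service with Prometheus support enabled.
    for name, value in sorted(fetch_metrics().items()):
        print(name, value)
```

The parser is deliberately simple and ignores metric types and help text; for anything beyond a quick check, a dedicated Prometheus client library is likely a better fit.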

## Configuration

### C++ Server

For the C++ Server, add the following parameters when starting the service:

| Parameter | Description | Notes |
| :------- | :-------------------------- | :--------------------------------------------------------------- |
| enable_prometheus | Enable Prometheus | Turns on the Prometheus metrics feature |
| prometheus_port | Prometheus metrics port | Defaults to 19393 |

### Python Pipeline

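For the Python Pipeline, add the following parameters to the configuration file `config.yml` before starting the service:
```
dag:
    # Enable Prometheus
    enable_prometheus: True
    # Port on which Prometheus metrics are served
    prometheus_port: 19393
```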

### Metric types

The exported metrics are listed in the table below.

| Metric                                         | Frequency   | Description                                           |
| ---------------------------------------------- | ----------- | ----------------------------------------------------- |
| `pd_query_request_success_total`               | Per request | Number of successful query requests                         |
| `pd_query_request_failure_total`               | Per request | Number of failed query requests     |
| `pd_inference_count_total`                     | Per request | Number of inferences performed      |
| `pd_query_request_duration_us_total`           | Per request | Cumulative end-to-end query request handling time                            |
| `pd_inference_duration_us_total`               | Per request | Cumulative time requests spend executing the inference model               |
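
Because the duration metrics are cumulative, an average latency has to be derived by dividing a change in duration by a change in count. The sketch below illustrates that arithmetic on two snapshots of the counters. It assumes that the `_us_` metrics are in microseconds and that the request duration is best divided by the successful request count; neither point is confirmed by the table above, and the function names are hypothetical.

```python
def delta(before, after, name):
    """Increase of one counter between two snapshots (missing counters count as 0)."""
    return after.get(name, 0.0) - before.get(name, 0.0)


def summarize(before, after):
    """Derive averages from two snapshots of the pd_* counters.

    Each snapshot is a dict mapping metric name to its cumulative value.
    A result is None when its denominator is zero.
    """
    ok = delta(before, after, "pd_query_request_success_total")
    failed = delta(before, after, "pd_query_request_failure_total")
    inferences = delta(before, after, "pd_inference_count_total")
    request_us = delta(before, after, "pd_query_request_duration_us_total")
    inference_us = delta(before, after, "pd_inference_duration_us_total")

    total = ok + failed
    return {
        # Share of requests that failed in the interval.
        "failure_ratio": failed / total if total else None,
        # Assumption: cumulative request time is divided by successful requests.
        "avg_request_us": request_us / ok if ok else None,
        "avg_inference_us": inference_us / inferences if inferences else None,
    }


if __name__ == "__main__":
    earlier = {
        "pd_query_request_success_total": 10,
        "pd_query_request_failure_total": 0,
        "pd_inference_count_total": 10,
        "pd_query_request_duration_us_total": 1000,
        "pd_inference_duration_us_total": 500,
    }
    later = {
        "pd_query_request_success_total": 30,
        "pd_query_request_failure_total": 5,
        "pd_inference_count_total": 30,
        "pd_query_request_duration_us_total": 5000,
        "pd_inference_duration_us_total": 2500,
    }
    print(summarize(earlier, later))
```

Inside Prometheus itself, the same idea is normally expressed by dividing the `rate()` of the duration counter by the `rate()` of the matching count counter.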

## Monitoring example

This section gives a simple example of monitoring a service with Prometheus.

**1. Pull the images**

```
docker pull prom/node-exporter
docker pull prom/prometheus
```

**2. Run the image**

```
docker run -d -p 9100:9100 \
  -v "/proc:/host/proc:ro" \
  -v "/sys:/host/sys:ro" \
  -v "/:/rootfs:ro" \
  --net="host" \
  prom/node-exporter
```

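**3. Configure**

Edit the monitoring service's configuration file `/opt/prometheus/prometheus.yml` and add the nodes to monitor. Replace `$IP` with the IP address of the host running node-exporter.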

```
global:
  scrape_interval:     60s
  evaluation_interval: 60s
  
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus
  
  - job_name: linux
    static_configs:
      - targets: ['$IP:9100']
        labels:
          instance: localhost
```
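
As far as I know, Prometheus does not expand environment variables in its configuration file, so `$IP` has to be replaced with a real address before the file is used. The sketch below renders the file from a template with Python's `string.Template`. It also appends one extra scrape job for the Paddle Serving metrics port, since the configuration above only scrapes Prometheus itself and node-exporter; that job, its name `paddle_serving`, and the assumption that the service is reachable at `$IP:19393` are additions for illustration and are not part of the original example.

```python
from string import Template

# The configuration from this step, plus one assumed job for Paddle Serving.
PROMETHEUS_YML = Template("""\
global:
  scrape_interval:     60s
  evaluation_interval: 60s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus

  - job_name: linux
    static_configs:
      - targets: ['$IP:9100']
        labels:
          instance: localhost

  # Assumption: extra job for the Paddle Serving metrics endpoint.
  # The job name is hypothetical; 19393 is the documented default port.
  - job_name: paddle_serving
    static_configs:
      - targets: ['$IP:19393']
""")


def render_config(ip):
    """Return the prometheus.yml text with $IP replaced by the given address."""
    return PROMETHEUS_YML.substitute(IP=ip)


if __name__ == "__main__":
    # 10.0.0.5 is a placeholder address; write the output to
    # /opt/prometheus/prometheus.yml once it looks right.
    print(render_config("10.0.0.5"))
```

Prometheus scrapes the `/metrics` path by default, which matches the endpoint described at the top of this document, so the extra job needs no `metrics_path` setting.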

**4. Start the monitoring service**

```
docker run  -d \
  -p 9090:9090 \
  -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml  \
  prom/prometheus
```
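Then open `http://serverip:9090/graph` in a browser to view the monitoring data, where `serverip` is the address of the host running Prometheus.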
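
Besides the web UI, the running Prometheus server can be queried over its HTTP API. The following is a minimal sketch that sends an instant query to `/api/v1/query` and reads back the `up` series, which shows whether each scrape target is reachable. The base URL reuses the `serverip` placeholder from above, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def extract_values(payload):
    """Map "job/instance" to the sample value of an instant-vector response."""
    if payload.get("status") != "success":
        raise RuntimeError("query failed: %r" % (payload.get("error"),))
    values = {}
    for item in payload["data"]["result"]:
        labels = item["metric"]
        key = "%s/%s" % (labels.get("job", ""), labels.get("instance", ""))
        # Each sample is [timestamp, "value"]; the value arrives as a string.
        values[key] = float(item["value"][1])
    return values


def instant_query(base_url, promql, timeout=5.0):
    """Run an instant query against a Prometheus server and parse the result."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_values(json.load(resp))


if __name__ == "__main__":
    # Replace serverip with the real host; 1.0 means the target is up.
    print(instant_query("http://serverip:9090", "up"))
```

A value of `0.0` for a target usually means Prometheus could not reach it, which is a quick way to spot a wrong address in `prometheus.yml`.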