## Monitoring Paddle Serving with Prometheus

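Paddle Serving can expose performance metrics for Prometheus to scrape. By default the metrics endpoint is `http://localhost:19393/metrics`. The data is served as plain text, so you can inspect it directly with the following command: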
```
curl http://localhost:19393/metrics
```
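
The same endpoint can also be read programmatically. The following is a minimal sketch, using only the Python standard library, that fetches the text output and turns each sample line into a number. The function names `parse_metrics` and `fetch_metrics` are hypothetical helpers, and whether the real output carries labels or timestamps is an assumption; the parser simply tolerates both.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format samples into {name_with_labels: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # The key is the metric name plus its label set, up to the last "}".
            idx = line.rindex("}") + 1
            key, rest = line[:idx], line[idx:]
        else:
            # No labels: the key is everything before the first space.
            key, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        # The first field is the value; an optional timestamp may follow.
        samples[key] = float(fields[0])
    return samples


def fetch_metrics(url="http://localhost:19393/metrics", timeout=5.0):
    """Download the metrics page (default URL from this document) and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With a server running, `fetch_metrics()` would return a dictionary keyed by metric name, which is convenient for quick checks in scripts or tests.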

## Configuration

### C++ Server

For the C++ Server, add the following parameters when starting the service:

| Parameter         | Description          | Notes                                  |
| :---------------- | :------------------- | :------------------------------------- |
| enable_prometheus | Enable Prometheus    | Turns on the Prometheus metrics feature |
| prometheus_port   | Prometheus data port | Defaults to 19393                      |
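
As an illustration, the two parameters could be passed on the launch command line. The sketch below only builds the argument list. It assumes the server is started through the `paddle_serving_server.serve` module with `--model` and `--port` options, and that the parameters in the table are spelled as `--enable_prometheus` and `--prometheus_port` flags; none of this is stated in this document, so check it against your installed version. The model directory and serving port are hypothetical placeholders.

```python
import shlex


def build_serve_command(model_dir, port=9393, prometheus_port=19393):
    """Build an argv list for launching the C++ Server with Prometheus enabled.

    Assumption: flag spellings mirror the parameter names in the table above.
    """
    return [
        "python3", "-m", "paddle_serving_server.serve",
        "--model", model_dir,            # hypothetical model directory
        "--port", str(port),             # hypothetical serving port
        "--enable_prometheus",           # switch from the table above
        "--prometheus_port", str(prometheus_port),
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(shlex.join(build_serve_command("my_model_dir")))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Paddle Serving installed.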

### Python Pipeline

For the Python Pipeline, add the following parameters to the configuration file config.yml before starting the service:
```
dag:
    # Enable Prometheus
    enable_prometheus: True
    # Port on which Prometheus data is served
    prometheus_port: 19393
```

### Metric types

The following metrics are exported:

| Metric                               | Frequency   | Description                                                                    |
| ------------------------------------ | ----------- | ------------------------------------------------------------------------------ |
| `pd_query_request_success_total`     | Per request | Number of successful query requests                                            |
| `pd_query_request_failure_total`     | Per request | Number of failed query requests                                                |
| `pd_inference_count_total`           | Per request | Number of inferences performed                                                 |
| `pd_query_request_duration_us_total` | Per request | Cumulative end-to-end query request handling time, in microseconds             |
| `pd_inference_duration_us_total`     | Per request | Cumulative time requests spend executing the inference model, in microseconds  |
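
Because the duration metrics are cumulative totals, an average is obtained by dividing a duration total by the matching count. The sketch below shows that arithmetic on a dictionary of metric values. The helper names are hypothetical, and dividing the request duration by successes plus failures assumes the duration counter covers both kinds of request, which this document does not state.

```python
def average_us(duration_us_total, count_total):
    """Mean microseconds per event from two cumulative counters, or None if no events."""
    if count_total <= 0:
        return None
    return duration_us_total / count_total


def summarize(samples):
    """Derive averages and a failure ratio from a {metric_name: value} dict."""
    success = samples.get("pd_query_request_success_total", 0.0)
    failure = samples.get("pd_query_request_failure_total", 0.0)
    # Assumption: the request duration total includes failed requests too.
    requests = success + failure
    return {
        "avg_request_us": average_us(
            samples.get("pd_query_request_duration_us_total", 0.0), requests
        ),
        "avg_inference_us": average_us(
            samples.get("pd_inference_duration_us_total", 0.0),
            samples.get("pd_inference_count_total", 0.0),
        ),
        "failure_ratio": (failure / requests) if requests else None,
    }
```

These are lifetime averages since the server started. For averages over a recent window, take two readings and divide the difference in duration by the difference in count.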

## Monitoring example

The following is a simple example of monitoring a service with Prometheus.

**1. Pull the images**

```
docker pull prom/node-exporter
docker pull prom/prometheus
```

**2. Run the node-exporter image**

```
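# With --net="host" the exporter listens on host port 9100 directly,
# so no -p port mapping is needed (Docker discards it in host mode).
docker run -d \
  -v "/proc:/host/proc:ro" \
  -v "/sys:/host/sys:ro" \
  -v "/:/rootfs:ro" \
  --net="host" \
  prom/node-exporter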

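**3. Configure**

Edit the Prometheus configuration file /opt/prometheus/prometheus.yml and add the targets to monitor. Replace `$IP` with the address of the host running node-exporter.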

```
global:
  scrape_interval:     60s
  evaluation_interval: 60s
  
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus
  
  - job_name: linux
    static_configs:
      - targets: ['$IP:9100']
        labels:
          instance: localhost
```

**4. Start the monitoring service**

```
docker run  -d \
  -p 9090:9090 \
  -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml  \
  prom/prometheus
```
Then open `http://serverip:9090/graph` in a browser, where `serverip` is the address of the Prometheus host.
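
Besides the web UI, Prometheus offers an HTTP API for instant queries at `/api/v1/query`. The following is a minimal sketch of calling it with the Python standard library. The helper names are hypothetical, and querying a Paddle Serving metric assumes that a scrape job for the Paddle Serving endpoint (for example a target on port 19393) has been added to prometheus.yml, since the configuration in step 3 only lists Prometheus itself and node-exporter.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def extract_values(payload):
    """Return [(labels, value)] from an instant-vector API response."""
    if payload.get("status") != "success":
        raise RuntimeError("query failed: %s" % payload.get("error", "unknown error"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector, got %s" % data["resultType"])
    # Each item holds a label dict and a [timestamp, "value-as-string"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def instant_query(base_url, promql, timeout=5.0):
    """Run a PromQL instant query against a Prometheus server."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_values(json.load(resp))
```

For example, `instant_query("http://serverip:9090", "up")` would list which scrape targets are currently reachable.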