Unverified · Commit f866338c · Author: Wan Kai · Committer: GitHub

Add PromQL Service doc and how to use in Grafana. (#10459)

Parent 7334e6f2
# PromQL Service
The PromQL ([Prometheus Query Language](https://prometheus.io/docs/prometheus/latest/querying/basics/)) Service
exposes the Prometheus querying HTTP APIs, including the bundled PromQL expression system.
Third-party systems or visualization platforms that already support PromQL (such as Grafana)
can obtain metrics through the PromQL Service.
Because SkyWalking and Prometheus differ fundamentally in metrics classification, format, storage, etc.,
the PromQL Service supports only a subset of the complete PromQL.
## Details Of Supported Protocol
The following describes the details of the supported protocol and compares it to the official PromQL documentation.
Anything not mentioned here is not supported.
### Time series Selectors
#### [Instant Vector Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors)
For example: select the metric `service_cpm` where the service is `$service` and the layer is `$layer`.
```text
service_cpm{service='$service', layer='$layer'}
```
**Note: The label matching operators only support `=`; regular expression matching is not supported.**
#### [Range Vector Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors)
For example: select the metric `service_cpm` where the service is `$service` and the layer is `$layer`, within the last 5 minutes.
```text
service_cpm{service='$service', layer='$layer'}[5m]
```
#### [Time Durations](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations)
| Unit | Definition | Support |
|------|--------------|---------|
| ms | milliseconds | yes |
| s | seconds | yes |
| m | minutes | yes |
| h | hours | yes |
| d | days | yes |
| w | weeks | yes |
| y | years | **no** |
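
The unit table above can be read as a small lookup from unit suffix to milliseconds. The following is a minimal sketch, not the OAP server's parser: the class and method names are hypothetical, and it only accepts a single integer followed by one unit (compound durations such as `1h30m` are out of scope here).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: converts a duration such as "5m" to milliseconds.
final class PromDurationSketch {
    // Only the units marked "yes" in the table above; "y" is deliberately absent.
    private static final Map<String, Long> UNIT_MILLIS = Map.of(
        "ms", 1L,
        "s", 1_000L,
        "m", 60_000L,
        "h", 3_600_000L,
        "d", 86_400_000L,
        "w", 604_800_000L);

    // One integer followed by one lowercase unit suffix.
    private static final Pattern DURATION = Pattern.compile("(\\d+)([a-z]+)");

    static long toMillis(String text) {
        Matcher m = DURATION.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a duration: " + text);
        }
        Long unit = UNIT_MILLIS.get(m.group(2));
        if (unit == null) {
            // Reached for "y" and for any unknown suffix.
            throw new IllegalArgumentException("Unsupported unit: " + m.group(2));
        }
        return Long.parseLong(m.group(1)) * unit;
    }

    public static void main(String[] args) {
        System.out.println(toMillis("5m"));
    }
}
```

With this sketch, `toMillis("1y")` is rejected with an `IllegalArgumentException`, which matches the "no" entry for years in the table.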
### Binary operators
#### [Arithmetic binary operators](https://prometheus.io/docs/prometheus/latest/querying/operators/#arithmetic-binary-operators)
| Operator | Definition | Support |
|----------|----------------------|---------|
| + | addition | yes |
| - | subtraction | yes |
| * | multiplication | yes |
| / | division | yes |
| % | modulo | yes |
| ^ | power/exponentiation | **no** |
##### Between two scalars
For example:
```text
1 + 2
```
##### Between an instant vector and a scalar
For example:
```text
service_cpm{service='$service', layer='$layer'} / 100
```
##### Between two instant vectors
For example:
```text
service_cpm{service='$service', layer='$layer'} + service_cpm{service='$service', layer='$layer'}
```
**Note: The operations between vectors require the same metric and labels, and don't support [Vector matching](https://prometheus.io/docs/prometheus/latest/querying/operators/#vector-matching).**
#### [Comparison binary operators](https://prometheus.io/docs/prometheus/latest/querying/operators/#comparison-binary-operators)
| Operator | Definition | Support |
|----------|------------------|---------|
| == | equal | yes |
| != | not-equal | yes |
| \> | greater-than | yes |
| < | less-than | yes |
| \>= | greater-or-equal | yes |
| <= | less-or-equal | yes |
##### Between two scalars
For example:
```text
1 > bool 2
```
##### Between an instant vector and a scalar
For example:
```text
service_cpm{service='$service', layer='$layer'} > 1
```
##### Between two instant vectors
For example:
```text
service_cpm{service='service_A', layer='$layer'} > service_cpm{service='service_B', layer='$layer'}
```
### HTTP API
#### Expression queries
##### [Instant queries](https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries)
```text
GET|POST /api/v1/query
```
| Parameter | Definition | Support | Optional |
|-----------|-------------------------------------------------------------------------------------------------------------------------------------|---------|------------|
| query | prometheus expression | yes | no |
| time | **The latest metrics value from current time to this time is returned. If time is empty, the default look-back time is 2 minutes.** | yes | yes |
| timeout | evaluation timeout | **no** | **ignore** |
For example:
```text
/api/v1/query?query=service_cpm{service='agent::songs', layer='GENERAL'}
```
Result:
```json
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "service_cpm",
"layer": "GENERAL",
"scope": "Service",
"service": "agent::songs"
},
"value": [
1677548400,
"6"
]
}
]
}
}
```
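A client has to URL-encode the expression, since it contains characters such as `{`, `'` and spaces. The sketch below builds the instant query URI and computes the 2-minute look-back window described in the parameter table. It is an illustration under assumptions: the class and method names are hypothetical, `http://localhost:9090` is a placeholder address, and `time` is assumed to be a Unix timestamp in seconds.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper for the instant query endpoint.
final class InstantQuerySketch {
    // Default look-back of 2 minutes, as stated in the parameter table.
    static final long LOOK_BACK_MILLIS = 120_000L;

    // Builds "<baseUrl>/api/v1/query?query=<encoded expression>".
    static URI instantQueryUri(String baseUrl, String expression) {
        String encoded = URLEncoder.encode(expression, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/v1/query?query=" + encoded);
    }

    // Returns {startMillis, endMillis} for a `time` value given in seconds (an assumption).
    static long[] lookBackWindow(long timeSeconds) {
        long end = timeSeconds * 1000L;
        return new long[] {end - LOOK_BACK_MILLIS, end};
    }

    public static void main(String[] args) {
        System.out.println(instantQueryUri(
            "http://localhost:9090",
            "service_cpm{service='agent::songs', layer='GENERAL'}"));
    }
}
```

`URLEncoder` produces form-style encoding, so spaces become `+`; a server that decodes query parameters in the usual way reads them back as spaces.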
##### [Range queries](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries)
```text
GET|POST /api/v1/query_range
```
| Parameter | Definition | Support | Optional |
|-----------|--------------------------------------------------------------------------------------|---------|------------|
| query | prometheus expression | yes | no |
| start | start timestamp, **seconds** | yes | no |
| end | end timestamp, **seconds** | yes | no |
| step | **SkyWalking will automatically fit Step(DAY, HOUR, MINUTE) through start and end.** | **no** | **ignore** |
| timeout | evaluation timeout | **no** | **ignore** |
For example:
```text
/api/v1/query_range?query=service_cpm{service='agent::songs', layer='GENERAL'}&start=1677479336&end=1677479636
```
Result:
```json
{
"status": "success",
"data": {
"resultType": "matrix",
"result": [
{
"metric": {
"__name__": "service_cpm",
"layer": "GENERAL",
"scope": "Service",
"service": "agent::songs"
},
"values": [
[
1677479280,
"18"
],
[
1677479340,
"18"
],
[
1677479400,
"18"
],
[
1677479460,
"18"
],
[
1677479520,
"18"
],
[
1677479580,
"18"
]
]
}
]
}
}
```
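The automatic step fitting mentioned for the `step` parameter can be sketched from the thresholds in the `PromOpUtils` change of this commit: up to one hour maps to `MINUTE`, up to one day to `HOUR`. The final `DAY` branch is an assumption, since that part of the method is not shown in the diff, and the class and method names here are hypothetical.

```java
// Hypothetical re-statement of the step fitting, taking start/end in seconds.
final class StepFitSketch {
    enum Step { MINUTE, HOUR, DAY }

    static Step fit(long startSeconds, long endSeconds) {
        long durationMillis = (endSeconds - startSeconds) * 1000L;
        if (durationMillis <= 3_600_000L) {
            return Step.MINUTE; // range of at most 1 hour
        }
        if (durationMillis <= 86_400_000L) {
            return Step.HOUR; // range of at most 1 day
        }
        return Step.DAY; // assumed fallback for longer ranges
    }

    public static void main(String[] args) {
        // The range used in the example request above spans 300 seconds.
        System.out.println(fit(1677479336L, 1677479636L));
    }
}
```

For the example request above the range is 300 seconds, so the minute step applies, which is consistent with the 60-second spacing of the returned timestamps.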
#### Querying metadata
##### [Finding series by label matchers](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers)
```text
GET|POST /api/v1/series
```
| Parameter | Definition | Support | Optional |
|-----------|------------------------------|---------|----------|
| match[] | series selector | yes | no |
| start | start timestamp, **seconds** | yes | no |
| end | end timestamp, **seconds** | yes | no |
For example:
```text
/api/v1/series?match[]=service_traffic{layer='GENERAL'}&start=1677479336&end=1677479636
```
Result:
```json
{
"status": "success",
"data": [
{
"__name__": "service_traffic",
"service": "agent::songs",
"scope": "Service",
"layer": "GENERAL"
},
{
"__name__": "service_traffic",
"service": "agent::recommendation",
"scope": "Service",
"layer": "GENERAL"
},
{
"__name__": "service_traffic",
"service": "agent::app",
"scope": "Service",
"layer": "GENERAL"
},
{
"__name__": "service_traffic",
"service": "agent::gateway",
"scope": "Service",
"layer": "GENERAL"
},
{
"__name__": "service_traffic",
"service": "agent::frontend",
"scope": "Service",
"layer": "GENERAL"
}
]
}
```
**Note: SkyWalking's metadata exists in the following metrics(traffics):**
- service_traffic
- instance_traffic
- endpoint_traffic
#### [Getting label names](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names)
```text
GET|POST /api/v1/labels
```
| Parameter | Definition | Support | Optional |
|-----------|-----------------|---------|----------|
| match[] | series selector | yes | yes |
| start | start timestamp | **no** | yes |
| end | end timestamp | **no** | yes |
For example:
```text
/api/v1/labels?match[]=instance_jvm_cpu
```
Result:
```json
{
"status": "success",
"data": [
"layer",
"scope",
"top_n",
"order",
"service_instance",
"parent_service"
]
}
```
#### [Querying label values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values)
```text
GET /api/v1/label/<label_name>/values
```
| Parameter | Definition | Support | Optional |
|-----------|-----------------|---------|----------|
| match[] | series selector | yes | no |
| start | start timestamp | **no** | yes |
| end | end timestamp | **no** | yes |
For example:
```text
/api/v1/label/__name__/values
```
Result:
```json
{
"status": "success",
"data": [
"meter_mysql_instance_qps",
"service_cpm",
"envoy_cluster_up_rq_active",
"instance_jvm_class_loaded_class_count",
"k8s_cluster_memory_requests",
"meter_vm_memory_used",
"meter_apisix_sv_bandwidth_unmatched",
"meter_vm_memory_total",
"instance_jvm_thread_live_count",
"instance_jvm_thread_timed_waiting_state_thread_count",
"browser_app_page_first_pack_percentile",
"instance_clr_max_worker_threads",
...
]
}
```
#### [Querying metric metadata](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata)
```text
GET /api/v1/metadata
```
| Parameter | Definition | Support | Optional |
|-----------|---------------------------------------------|---------|----------|
| limit | maximum number of metrics to return | yes | **yes** |
| metric | **metric name, support regular expression** | yes | **yes** |
For example:
```text
/api/v1/metadata?limit=10
```
Result:
```json
{
"status": "success",
"data": {
"meter_mysql_instance_qps": [
{
"type": "gauge",
"help": "",
"unit": ""
}
],
"meter_apisix_sv_bandwidth_unmatched": [
{
"type": "gauge",
"help": "",
"unit": ""
}
],
"service_cpm": [
{
"type": "gauge",
"help": "",
"unit": ""
}
],
...
}
}
```
## Metrics Type For Query
### Supported Metrics [Scope](../../../oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/query/enumeration/Scope.java)(Catalog)
Not all scopes are supported; please check the following table:
| Scope | Support |
|-------------------------|---------|
| Service | yes |
| ServiceInstance | yes |
| Endpoint | yes |
| ServiceRelation | no |
| ServiceInstanceRelation | no |
| Process | no |
| ProcessRelation | no |
### General labels
Each metric contains general labels: [layer](../../../oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java).
Different metrics will have different labels depending on their Scope and metric value type.
| Query Labels | Scope | Expression Example |
|----------------------------------|-----------------|------------------------------------------------------------------------------------------------|
| layer, service | Service | service_cpm{service='$service', layer='$layer'} |
| layer, service, service_instance | ServiceInstance | service_instance_cpm{service='$service', service_instance='$service_instance', layer='$layer'} |
| layer, service, endpoint | Endpoint | endpoint_cpm{service='$service', endpoint='$endpoint', layer='$layer'} |
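
The table above can be turned into a small client-side check before a query is sent. This is a sketch under assumptions, not part of SkyWalking: the class, the method names, and the use of plain strings for scopes are all hypothetical, and label values are not escaped.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: checks required labels per scope and renders a selector.
final class ScopeLabelsSketch {
    // Required query labels per supported scope, copied from the table above.
    static final Map<String, List<String>> REQUIRED = Map.of(
        "Service", List.of("layer", "service"),
        "ServiceInstance", List.of("layer", "service", "service_instance"),
        "Endpoint", List.of("layer", "service", "endpoint"));

    // Returns the required labels that are absent from the given label map.
    static List<String> missingLabels(String scope, Map<String, String> labels) {
        List<String> required = REQUIRED.get(scope);
        if (required == null) {
            throw new IllegalArgumentException("Unsupported scope: " + scope);
        }
        return required.stream()
            .filter(name -> !labels.containsKey(name))
            .collect(Collectors.toList());
    }

    // Renders metric{k1='v1', k2='v2'}; only the `=` matcher is used.
    static String selector(String metric, Map<String, String> labels) {
        return labels.entrySet().stream()
            .map(e -> e.getKey() + "='" + e.getValue() + "'")
            .collect(Collectors.joining(", ", metric + "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, String> labels = Map.of("layer", "GENERAL");
        System.out.println(missingLabels("Service", labels));
    }
}
```

Passing an insertion-ordered map such as `LinkedHashMap` to `selector` keeps the labels in the order they were added.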
### Common Value Metrics
- Query Labels:
```text
{General labels}
```
- Expression Example:
```text
service_cpm{service='agent::songs', layer='GENERAL'}
```
- Result (Instant Query):
```json
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "service_cpm",
"layer": "GENERAL",
"scope": "Service",
"service": "agent::songs"
},
"value": [
1677490740,
"3"
]
}
]
}
}
```
### Labeled Value Metrics
- Query Labels:
```text
--{General labels}
--labels: Used to filter the value labels to be returned
--relabels: Used to rename the returned value labels
note: The number and order of labels must match the number and order of relabels.
```
- Expression Example:
```text
service_percentile{service='agent::songs', layer='GENERAL', labels='0,1,2', relabels='P50,P75,P90'}
```
- Result (Instant Query):
```json
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "service_percentile",
"label": "P50",
"layer": "GENERAL",
"scope": "Service",
"service": "agent::songs"
},
"value": [
1677493380,
"0"
]
},
{
"metric": {
"__name__": "service_percentile",
"label": "P75",
"layer": "GENERAL",
"scope": "Service",
"service": "agent::songs"
},
"value": [
1677493380,
"0"
]
},
{
"metric": {
"__name__": "service_percentile",
"label": "P90",
"layer": "GENERAL",
"scope": "Service",
"service": "agent::songs"
},
"value": [
1677493380,
"0"
]
}
]
}
}
```
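The pairing rule between `labels` and `relabels` (same number, same order) can be illustrated with a positional mapping. This is a minimal sketch with hypothetical names and does not claim to reproduce how the OAP server applies the rename internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: pairs each value label with its replacement name by position.
final class RelabelSketch {
    static Map<String, String> relabelMap(String labels, String relabels) {
        String[] from = labels.split(",");
        String[] to = relabels.split(",");
        if (from.length != to.length) {
            throw new IllegalArgumentException(
                "labels and relabels must have the same number of entries");
        }
        // LinkedHashMap keeps the order given in the expression.
        Map<String, String> mapping = new LinkedHashMap<>();
        for (int i = 0; i < from.length; i++) {
            mapping.put(from[i].trim(), to[i].trim());
        }
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(relabelMap("0,1,2", "P50,P75,P90"));
    }
}
```

In the expression example above, value label `0` is returned as `P50`, `1` as `P75`, and `2` as `P90`.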
### Sort Metrics
- Query Labels:
```text
--parent_service: <optional> Name of the parent service.
--top_n: The max number of the selected metric value
--order: ASC/DES
```
- Expression Example:
```text
service_instance_cpm{parent_service='agent::songs', layer='GENERAL', top_n='10', order='DES'}
```
- Result (Instant Query):
```json
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "service_instance_cpm",
"layer": "GENERAL",
"scope": "ServiceInstance",
"service_instance": "651db53c0e3843d8b9c4c53a90b4992a@10.4.0.28"
},
"value": [
1677494280,
"14"
]
},
{
"metric": {
"__name__": "service_instance_cpm",
"layer": "GENERAL",
"scope": "ServiceInstance",
"service_instance": "4c04cf44d6bd408880556aa3c2cfb620@10.4.0.232"
},
"value": [
1677494280,
"6"
]
},
{
"metric": {
"__name__": "service_instance_cpm",
"layer": "GENERAL",
"scope": "ServiceInstance",
"service_instance": "f5ac8ead31af4e6795cae761729a2742@10.4.0.236"
},
"value": [
1677494280,
"5"
]
}
]
}
}
```
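Since `parent_service` is optional for sort metrics, only `top_n` and `order` need to be present. The sketch below loosely mirrors the `checkLabels` method changed in this commit, but uses plain string keys and `StringJoiner` instead of the `LabelName` enum and a manual counter; the class name and signature are hypothetical.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: reports which required labels are missing, comma-separated.
final class SortLabelsSketch {
    static String missingLabels(Map<String, String> labelMap, String... required) {
        // StringJoiner inserts the comma only between entries.
        StringJoiner missing = new StringJoiner(",");
        for (String name : required) {
            if (labelMap.get(name) == null) {
                missing.add(name);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        Map<String, String> labels = Map.of("layer", "GENERAL", "order", "DES");
        // parent_service is not checked because it is optional for sort metrics.
        System.out.println(missingLabels(labels, "top_n", "order"));
    }
}
```

An empty result means the expression carries every required label.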
### Sampled Records
- Query Labels:
```text
--parent_service: Name of the parent service
--top_n: The max number of the selected records value
--order: ASC/DES
```
- Expression Example:
```text
top_n_database_statement{parent_service='localhost:-1', layer='VIRTUAL_DATABASE', top_n='10', order='DES'}
```
- Result (Instant Query):
```json
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "top_n_database_statement",
"layer": "VIRTUAL_DATABASE",
"scope": "Service",
"record": "select song0_.id as id1_0_, song0_.artist as artist2_0_, song0_.genre as genre3_0_, song0_.liked as liked4_0_, song0_.name as name5_0_ from song song0_ where song0_.liked>?"
},
"value": [
1677501360,
"1"
]
},
{
"metric": {
"__name__": "top_n_database_statement",
"layer": "VIRTUAL_DATABASE",
"scope": "Service",
"record": "select song0_.id as id1_0_, song0_.artist as artist2_0_, song0_.genre as genre3_0_, song0_.liked as liked4_0_, song0_.name as name5_0_ from song song0_ where song0_.liked>?"
},
"value": [
1677501360,
"1"
]
},
{
"metric": {
"__name__": "top_n_database_statement",
"layer": "VIRTUAL_DATABASE",
"scope": "Service",
"record": "select song0_.id as id1_0_, song0_.artist as artist2_0_, song0_.genre as genre3_0_, song0_.liked as liked4_0_, song0_.name as name5_0_ from song song0_ where song0_.liked>?"
},
"value": [
1677501360,
"1"
]
}
]
}
}
```
# Use Grafana As The UI
Since 9.4.0, SkyWalking provides the [PromQL Service](../../api/promql-service.md). You can choose [Grafana](https://grafana.com/)
as the SkyWalking UI. For installation and usage, please refer to the [official document](https://grafana.com/docs/grafana/v9.3/).
Notice <1>, Grafana is under the [AGPL-3.0 license](https://github.com/grafana/grafana/blob/main/LICENSE), which is very different from Apache 2.0.
Please follow the AGPL 3.0 license requirements.
Notice <2>, SkyWalking always treats its native UI as first class. All visualization features are only available on the native UI.
The Grafana UI is an extension built on our support of the PromQL APIs. We don't maintain or promise a complete Grafana UI dashboard setup.
## Configure Data Source
In the data source config panel, choose `Prometheus` and set the URL to the OAP server address. The default port is `9090`.
<img src="https://skywalking.apache.org/doc-graph/promql/grafana-datasource.jpg"/>
## Configure Dashboards
### Dashboards Settings
The following steps are an example of configuring a `General Service` dashboard:
1. Create a dashboard named `General Service`. Using one dashboard per [layer](../../../../oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java) is recommended.
2. Configure variables for the dashboard:
<img src="https://skywalking.apache.org/doc-graph/promql/grafana-variables.jpg"/>
After configuration, you can select the service/instance/endpoint at the top of the dashboard:
<img src="https://skywalking.apache.org/doc-graph/promql/grafana-variables2.jpg"/>
### Add Panels
The following shows how to add several typical metrics panels.
General settings:
1. Choose the metrics and chart.
2. Set `Query options --> Min interval = 1m`, because the minimum metrics time bucket in SkyWalking is 1 minute.
3. Add PromQL expressions and use the variables configured above for the labels, so that you can select the label values from the top of the dashboard.
**Note: Some metric values may require calculation to match units.**
4. Select the returned labels you want to show on the panel.
5. Test the query and save the panel.
#### Common Value Metrics
1. For example, choose `service_apdex` and the `Time series chart`.
2. Add a PromQL expression. The metric scope is `Service`, so add the labels `service` and `layer` for matching.
3. Set `Connect null values --> Always` and `Show points --> Always`, because when the query interval is longer than 1 hour or 1 day, SkyWalking returns
metric values at hour or day steps.
<img src="https://skywalking.apache.org/doc-graph/promql/grafana-panels.jpg"/>
#### Labeled Value Metrics
1. For example, choose `service_percentile` and the `Time series chart`.
2. Add PromQL expressions. The metric scope is `Service`, so add the labels `service` and `layer` for matching.
Because it is a labeled value metric, add `labels='0,1,2,3,4'` to filter the result labels and `relabels='P50,P75,P90,P95,P99'` to rename them.
3. Set `Connect null values --> Always` and `Show points --> Always`, because when the query interval is longer than 1 hour or 1 day, SkyWalking returns
metric values at hour or day steps.
<img src="https://skywalking.apache.org/doc-graph/promql/grafana-panels2.jpg"/>
#### Sort Metrics
1. For example, choose `service_instance_cpm` and the `Bar gauge chart`.
2. Add PromQL expressions. Add the labels `parent_service` and `layer` for matching, and add `top_n='10'` and `order='DES'` to filter the result.
3. Set `Calculation --> Latest*`.
<img src="https://skywalking.apache.org/doc-graph/promql/grafana-panels3.jpg"/>
#### Sampled Records
Same as the Sort Metrics.
......@@ -144,11 +144,17 @@ catalog:
- name: "Dynamic Configuration"
path: "/en/setup/backend/dynamic-config"
- name: "UI Setup"
path: "/en/setup/backend/ui-setup"
catalog:
- name: "Native UI"
catalog:
- name: "Setup"
path: "/en/setup/backend/ui-setup"
- name: "Customization"
path: "/en/ui/readme"
- name: "Grafana UI"
path: "/en/setup/backend/ui-grafana"
- name: "Official Dashboards"
catalog:
- name: "Overview"
path: "/en/ui/readme"
- name: "General Service"
catalog:
- name: "Server Agents"
......@@ -257,6 +263,8 @@ catalog:
path: "/en/api/event"
- name: "Profiling"
path: "/en/api/profiling-protocol"
- name: "PromQL Service"
path: "/en/api/promql-service"
- name: "Query APIs"
catalog:
- name: "GraphQL APIs"
......
......@@ -288,7 +288,7 @@ public class PromQLApiHandler {
if (time.isPresent()) {
endTS = formatTimestamp2Millis(time.get());
}
- long startTS = endTS - 900000; //look back 15m by default
+ long startTS = endTS - 120000; //look back 2m by default
Duration duration = timestamp2Duration(startTS, endTS);
ExprQueryRsp response = new ExprQueryRsp();
......@@ -347,7 +347,7 @@ public class PromQLApiHandler {
@Param("query") String query,
@Param("start") String start,
@Param("end") String end,
- @Param("step") String step,
+ @Param("step") Optional<String> step,
@Param("timeout") Optional<String> timeout) throws IOException {
long startTS = formatTimestamp2Millis(start);
long endTS = formatTimestamp2Millis(end);
......
......@@ -50,11 +50,11 @@ public class PromOpUtils {
long durationValue = endTS - startTS;
- if (durationValue < 3600000) {
+ if (durationValue <= 3600000) {
duration.setStep(Step.MINUTE);
duration.setStart(startDT.toString(DurationUtils.YYYY_MM_DD_HHMM));
duration.setEnd(endDT.toString(DurationUtils.YYYY_MM_DD_HHMM));
- } else if (durationValue < 86400000) {
+ } else if (durationValue <= 86400000) {
duration.setStep(Step.HOUR);
duration.setStart(startDT.toString(DurationUtils.YYYY_MM_DD_HH));
duration.setEnd(endDT.toString(DurationUtils.YYYY_MM_DD_HH));
......
......@@ -260,14 +260,11 @@ public class PromQLExprQueryVisitor extends PromQLParserBaseVisitor<ParseResult>
private void checkLabels(Map<LabelName, String> labelMap,
LabelName... labelNames) throws IllegalExpressionException {
StringBuilder missLabels = new StringBuilder();
+ int j = 0;
for (int i = 0; i < labelNames.length; i++) {
String labelName = labelNames[i].toString();
if (labelMap.get(labelNames[i]) == null) {
- if (i == 0) {
- missLabels.append(labelName);
- } else {
- missLabels.append(",").append(labelName);
- }
+ missLabels.append(j++ > 0 ? "," : "").append(labelName);
}
}
String result = missLabels.toString();
......@@ -426,7 +423,8 @@ public class PromQLExprQueryVisitor extends PromQLParserBaseVisitor<ParseResult>
Layer layer,
Scope scope,
Map<LabelName, String> labelMap) throws IllegalExpressionException {
- checkLabels(labelMap, LabelName.TOP_N, LabelName.PARENT_SERVICE, LabelName.ORDER);
+ //sortMetrics query ParentService could be null.
+ checkLabels(labelMap, LabelName.TOP_N, LabelName.ORDER);
TopNCondition topNCondition = new TopNCondition();
topNCondition.setName(metricName);
topNCondition.setParentService(labelMap.get(LabelName.PARENT_SERVICE));
......