未验证 提交 f9db9ba2 编写于 作者: Y yswdqz 提交者: GitHub

Support monitoring PostgreSQL slow SQLs. (#9673)

上级 4e73a30d
......@@ -19,6 +19,7 @@
* Remove physical index existing check and keep template existing check only to avoid meaningless `retry wait`
in `no-init` mode.
* Make sure instance list ordered in TTL processor to avoid TTL timer never runs.
* Support monitoring PostgreSQL slow SQLs.
#### UI
......@@ -40,6 +41,7 @@
* Remove `responseCode` fields of all OAL sources, as well as examples to avoid user's confusion.
* Remove All from the endpoints selector.
* Enhance menu configurations to make it easier to change.
* Update PostgreSQL dashboard to visualize collected slow SQLs.
#### Documentation
......
# PostgreSQL monitoring
## PostgreSQL server performance from `postgres-exporter`
SkyWalking leverages postgres-exporter for collecting metrics data from PostgreSQL. It leverages OpenTelemetry Collector to transfer the metrics to
[OpenTelemetry receiver](opentelemetry-receiver.md) and into the [Meter System](./../../concepts-and-designs/meter.md).
## Data flow
### Data flow
1. postgres-exporter collect metrics data from PostgreSQL.
2. OpenTelemetry Collector fetches metrics from postgres-exporter via Prometheus Receiver and pushes metrics to SkyWalking OAP Server via the OpenCensus gRPC Exporter or OpenTelemetry gRPC exporter.
3. The SkyWalking OAP Server parses the expression with [MAL](../../concepts-and-designs/mal.md) to filter/calculate/aggregate and store the results.
## Set up
### Set up
1. Set up [postgres-exporter](https://github.com/prometheus-community/postgres_exporter#quick-start).
2. Set up [OpenTelemetry Collector ](https://opentelemetry.io/docs/collector/getting-started/#docker). For details on Prometheus Receiver in OpenTelemetry Collector, refer to [here](../../../../test/e2e-v2/cases/postgresql/postgres-exporter/otel-collector-config.yaml).
3. Config SkyWalking [OpenTelemetry receiver](opentelemetry-receiver.md).
## PostgreSQL Monitoring
### PostgreSQL Monitoring
PostgreSQL monitoring provides monitoring of the status and resources of the PostgreSQL server.PostgreSQL server as a `Service` in OAP, and land on the `Layer: POSTGRESQL`.
### PostgreSQL Supported Metrics
#### Supported Metrics
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|-----|------|-----|-----|-----|
| Shared Buffers | MB | meter_pg_shared_buffers | The number of shared memory buffers used by the server | postgres-exporter|
......@@ -44,8 +45,33 @@ PostgreSQL monitoring provides monitoring of the status and resources of the Pos
| Idle Sessions | | meter_pg_idle_sessions | The number of connections which state is idle,idle in transaction or idle in transaction (aborted) | postgres-exporter|
| Locks Count | | meter_pg_locks_count | Number of locks | postgres-exporter|
## Customizations
### Customizations
You can customize your own metrics/expression/dashboard panel.
The metrics definition and expression rules are found in `/config/otel-rules/postgresql.yaml`.
The PostgreSQL dashboard panel configurations are found in `/config/ui-initialized-templates/postgresql`.
## Collect sampled slow SQLs
SkyWalking leverages [fluentbit](https://fluentbit.io/) or other log agents for collecting slow SQL statements from PostgreSQL.
### Data flow
1. fluentbit agent collects slow sql logs from PostgreSQL.
2. fluentbit agent sends data to SkyWalking OAP Server using native log APIs via HTTP.
3. The SkyWalking OAP Server parses the expression with [LAL](../../concepts-and-designs/lal.md) to parse/extract and store the results.
### Set up
1. Set up [fluentbit](https://docs.fluentbit.io/manual/installation/docker).
2. Config [fluentbit](../../../../test/e2e-v2/cases/postgresql/postgres-exporter/fluent-bit.conf)
3. Config PostgreSQL to enable slow log. [Example](../../../../test/e2e-v2/cases/postgresql/postgres-exporter/postgresql.conf).
### Slow SQL Monitoring
Slow SQL monitoring provides monitoring of the slow SQL statements of the PostgreSQL server. PostgreSQL server is cataloged as a `Layer: POSTGRESQL` `Service` in OAP.
#### Supported Metrics
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|-----|------|-----|-----|-----|
|Slow Statements | ms | top_n_database_statement | The latency and statement of PostgreSQL slow SQLs | fluentbit|
### Customizations
You can customize your own metrics/expression/dashboard panel.
The slowsql expression rules are found in `/config/lal/pgsql-slowsql.yaml`
The PostgreSQL dashboard panel configurations are found in `/config/ui-initialized-templates/postgresql`.
\ No newline at end of file
......@@ -241,7 +241,7 @@ agent-analyzer:
log-analyzer:
selector: ${SW_LOG_ANALYZER:default}
default:
lalFiles: ${SW_LOG_LAL_FILES:envoy-als,mysql-slowsql,default}
lalFiles: ${SW_LOG_LAL_FILES:envoy-als,mysql-slowsql,pgsql-slowsql,default}
malFiles: ${SW_LOG_MAL_FILES:""}
event-analyzer:
......
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
rules:
- name: pgsql-slowsql
layer: POSTGRESQL
dsl: |
filter {
json{
}
extractor{
layer parsed.layer as String
service parsed.service as String
timestamp parsed.time as String
if (tag("LOG_KIND") == "SLOW_SQL") {
slowSql {
id parsed.id as String
statement parsed.statement as String
latency parsed.query_time as Long
}
}
}
}
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
{{- contains . }}
- key: 0
value:
{{- contains .value }}
- key: {{ notEmpty .key }}
value: {{ notEmpty .value }}
{{- end }}
{{- end }}
......@@ -21,22 +21,40 @@ services:
file: ../../../script/docker-compose/base-compose.yml
service: oap
environment:
SW_OTEL_RECEIVER: default
SW_OTEL_RECEIVER_ENABLED_OC_RULES: postgresql
- TZ=Asia/Shanghai
- SW_STORAGE=postgresql
- SW_JDBC_URL=jdbc:postgresql://postgres:5432/skywalking
- SW_DATA_SOURCE_USER=postgres
- SW_DATA_SOURCE_PASSWORD=123456
- SW_CORE_TOPN_REPORT_PERIOD=2
ports:
- 12800
depends_on:
postgres:
condition: service_healthy
postgres:
image: postgres:14.1
environment:
- POSTGRES_PASSWORD=password
- TZ=Asia/Shanghai
- POSTGRES_PASSWORD=123456
- POSTGRES_DB=skywalking
ports:
- 5432
command: postgres -c config_file=/etc/postgresql.conf
volumes:
- ./postgresql.conf:/etc/postgresql.conf
- ./data:/var/lib/postgresql/data
networks:
- e2e
healthcheck:
test: ["CMD", "bash", "-c", "cat < /dev/null > /dev/tcp/127.0.0.1/5432"]
interval: 5s
timeout: 60s
retries: 120
postgres-load:
image: postgres:14.1
environment:
- PGPASSWORD=password
- PGPASSWORD=123456
depends_on:
- otel-collector
networks:
......@@ -46,7 +64,7 @@ services:
- ../postgres-exporter:/docker
command:
- -c
- "psql -f /docker/mock.sql -h postgres -p 5432 -U postgres postgres"
- "psql -f /docker/mock.sql -h postgres -p 5432 -U postgres"
postgres-exporter:
image: quay.io/prometheuscommunity/postgres-exporter:v0.11.0
depends_on:
......@@ -54,9 +72,23 @@ services:
ports:
- 9187
environment:
- DATA_SOURCE_NAME=postgresql://postgres:password@postgres:5432/postgres?sslmode=disable
- DATA_SOURCE_NAME=postgresql://postgres:123456@postgres:5432/postgres?sslmode=disable
networks:
- e2e
fluentbit:
image: fluent/fluent-bit:1.9
environment:
- TZ=Asia/Shanghai
volumes:
- ./data/log:/tmp/skywalking-logs
- ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
- ./fluent-bit-parser.conf:/fluent-bit/etc/fluent-bit-parser.conf
- ./fluent-bit-script.lua:/fluent-bit/etc/fluent-bit-script.lua
networks:
- e2e
depends_on:
postgres:
condition: service_healthy
otel-collector:
image: otel/opentelemetry-collector:0.50.0
......
......@@ -30,7 +30,7 @@ setup:
verify:
retry:
count: 20
count: 60
interval: 3s
cases:
- includes:
......
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[PARSER]
name my-log-format
format regex
regex \w*-\w*-\w* \w*:\w*:\w*.\w* UTCLOG: duration: \w*.\w* ms statement.*
--
-- Licensed to the Apache Software Foundation (ASF) under one or more
-- contributor license agreements. See the NOTICE file distributed with
-- this work for additional information regarding copyright ownership.
-- The ASF licenses this file to You under the Apache License, Version 2.0
-- (the "License"); you may not use this file except in compliance with
-- the License. You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
--
function rewrite_body(tag, timestamp, record)
log = record["log"]
record["log"] = nil
record["date"] = nil
record["tags"] = {data={{key="LOG_KIND", value="SLOW_SQL"}}}
arr = split(log,"\n")
re1 = {}
re1["time"] = os.time()
re1["layer"] = "POSTGRESQL"
record["layer"] = "POSTGRESQL"
_,durationIndex = string.find(log,"duration: ")
msIndex,_ = string.find(log," ms")
duration = string.sub(log,durationIndex+1,msIndex)
_,statementAf = string.find(log,"statement: ")
re1["statement"] = string.sub(log,statementAf+1)
duration = string.sub(log,durationIndex+1,msIndex-1)
d1 = math.floor(tonumber(duration))
re1["query_time"] = d1
record["service"]="postgresql::postgres:5432"
re1["service"]= "postgresql::postgres:5432"
re1["id"] = uuid()
jsonstr = table2json(re1)
record["body"]={json={}}
record["body"]["json"]["json"] = jsonstr
return 1, timestamp, record
end
function split(input, delimiter)
input = tostring(input)
delimiter = tostring(delimiter)
if (delimiter == "") then return false end
local pos, arr = 0, {}
for st, sp in function() return string.find(input, delimiter, pos, true) end do
table.insert(arr, string.sub(input, pos, st - 1))
pos = sp + 1
end
table.insert(arr, string.sub(input, pos))
return arr
end
function uuid()
local seed={'e','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'}
local tb={}
for i=1,32 do
table.insert(tb,seed[math.random(1,16)])
end
local sid=table.concat(tb)
return string.format('%s-%s-%s-%s-%s',
string.sub(sid,1,8),
string.sub(sid,9,12),
string.sub(sid,13,16),
string.sub(sid,17,20),
string.sub(sid,21,32)
)
end
function table2json(t)
local function serialize(tbl)
local tmp = {}
for k, v in pairs(tbl) do
local k_type = type(k)
local v_type = type(v)
local key = (k_type == "string" and '"' .. k .. '":') or (k_type == "number" and "")
local value =
(v_type == "table" and serialize(v)) or (v_type == "boolean" and tostring(v)) or
(v_type == "string" and '"' .. v .. '"') or
(v_type == "number" and v)
tmp[#tmp + 1] = key and value and tostring(key) .. tostring(value) or nil
end
if table.maxn(tbl) == 0 then
return "{" .. table.concat(tmp, ",") .. "}"
else
return "[" .. table.concat(tmp, ",") .. "]"
end
end
assert(type(t) == "table")
return serialize(t)
end
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[SERVICE]
flush 1
log_level info
parsers_File fluent-bit-parser.conf
[INPUT]
name tail
path /tmp/skywalking-logs/slow.log
parser my-log-format
[FILTER]
name grep
match *
regex log \w*-\w*-\w* \w*:\w*:\w*.\w* UTCLOG: duration: \w*.\w* ms statement.*
[FILTER]
name lua
match *
script fluent-bit-script.lua
call rewrite_body
[OUTPUT]
name stdout
match *
format json
[OUTPUT]
name http
match *
host oap
port 12800
uri /v3/logs
format json
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
listen_addresses = '*'
max_connections = 100
shared_buffers = 128MB
max_wal_size = 1GB
min_wal_size = 80MB
log_timezone = 'Etc/UTC'
datestyle = 'iso, mdy'
timezone = 'Etc/UTC'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'
logging_collector = on
log_directory = 'log'
log_filename = 'slow.log'
log_rotation_age = 1d
log_rotation_size = 10MB
log_min_messages = info
log_min_duration_statement = 1000
log_checkpoints = on
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%m'
\ No newline at end of file
......@@ -56,3 +56,6 @@
- query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics multiple-linear --name=meter_pg_active_sessions --labels=postgres --service-name=postgresql::postgres:5432 |yq e 'to_entries | with(.[] ; .value=(.value | to_entries))' -
expected: expected/metrics-has-value-label.yml
# slow sql
- query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics sampled-record --name=top_n_database_statement |yq e 'to_entries | with(.[] ; .value=(.value | to_entries))' -
expected: expected/db-has-value.yml
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册