DiDi / nightingale
Commit 544c93c7
Authored April 02, 2022 by Ulric Qin
Merge branch 'main' of github.com:didi/nightingale
Parents: 66bc023e, c5ea2d0d
Showing 5 changed files with 746 additions and 507 deletions
etc/alerts/linux_by_exporter.json       +344 -0
etc/alerts/node_by_exporter.json        +0 -310
etc/dashboards/linux_by_exporter.json   +223 -0
etc/dashboards/linux_by_telegraf.json   +179 -0
etc/dashboards/node_by_exporter.json    +0 -197
etc/alerts/linux_by_exporter.json
New file (mode 0 → 100644) @ 544c93c7
[
  {
    "name": "inode资源不足-使用率超过90",
    "note": "",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "(100 - ((node_filesystem_files_free * 100) / node_filesystem_files))>90",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "内存资源不足-利用率大于75%",
    "note": "需要扩容或者升级配置了",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100 > 75",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": ["dingtalk"],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "内存资源不足-利用率大于95%",
    "note": "需要扩容或者升级配置了",
    "severity": 1,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100 > 95",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": ["dingtalk"],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "文件句柄不足-使用率超过90%",
    "note": "可以将文件句柄limit调大,或者扩容",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "(node_filefd_allocated{instance=\"$node\"}/node_filefd_maximum{instance=\"$node\"}*100) > 90",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "某磁盘无法正常读写",
    "note": "",
    "severity": 1,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "(node_filesystem_device_error{instance=\"$node\",mountpoint!~\"/var/lib/.*\",mountpoint!~\"/run.*\"}) > 0",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "磁盘需要清理了-利用率达到92%",
    "note": "",
    "severity": 1,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "(100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes) ) > 92 ",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": ["dingtalk"],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "系统conntrack需要调整-使用率超过80%",
    "note": "",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "node_nf_conntrack_entries / node_nf_conntrack_entries_limit*100 > 80",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "系统出现oom",
    "note": "",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "increase(node_vmstat_oom_kill[1m]) > 0",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "网卡入方向丢包",
    "note": "",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "rate(node_network_receive_drop_total{device=~\"e.*\"}[1m]) > 3",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "网卡出方向丢包",
    "note": "",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "rate(node_network_transmit_drop_total{device=~\"e.*\"}[1m]) > 3",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "计算资源不足-机器每个核平均负载大于10",
    "note": "需要扩容或者升级配置了",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "avg (node_load1) by (instance)/count(count(node_cpu_seconds_total) by (cpu,instance)) by (instance) >10",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "运行进程数过多-超过3000",
    "note": "建议扩容",
    "severity": 2,
    "disabled": 1,
    "prom_for_duration": 60,
    "prom_ql": "node_procs_running > 3000",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  }
]
\ No newline at end of file
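Every rule object in this file carries the same set of fields, so a file like it can be sanity-checked before import. The following is a minimal sketch, not part of the commit: the helper name `summarize_rules`, the choice of `REQUIRED_FIELDS`, and the reading of `disabled: 1` as "rule is switched off" are all assumptions made for illustration. The sample data is a shortened copy of two rules from the file.

```python
import json

# Hypothetical subset of fields treated as mandatory for this sketch.
REQUIRED_FIELDS = {
    "name", "severity", "disabled",
    "prom_ql", "prom_for_duration", "prom_eval_interval",
}

def summarize_rules(text):
    """Parse a rule file and return (name, severity, is_disabled) per rule.

    Raises ValueError if any rule lacks one of REQUIRED_FIELDS.
    """
    rules = json.loads(text)
    summary = []
    for index, rule in enumerate(rules):
        missing = REQUIRED_FIELDS - set(rule)
        if missing:
            raise ValueError(f"rule {index} is missing fields: {sorted(missing)}")
        # Assumption: a non-zero "disabled" value means the rule is off.
        summary.append((rule["name"], rule["severity"], bool(rule["disabled"])))
    return summary

# Two rules copied from the file, trimmed to the fields checked above.
SAMPLE = """
[
  {"name": "系统出现oom", "severity": 2, "disabled": 1,
   "prom_for_duration": 60, "prom_eval_interval": 15,
   "prom_ql": "increase(node_vmstat_oom_kill[1m]) > 0"},
  {"name": "运行进程数过多-超过3000", "severity": 2, "disabled": 1,
   "prom_for_duration": 60, "prom_eval_interval": 15,
   "prom_ql": "node_procs_running > 3000"}
]
"""

if __name__ == "__main__":
    for name, severity, is_disabled in summarize_rules(SAMPLE):
        print(name, severity, is_disabled)
```

To run it against a real checkout, pass the contents of `etc/alerts/linux_by_exporter.json` instead of `SAMPLE`.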
etc/alerts/node_by_exporter.json
Deleted (mode 100644 → 0) @ 66bc023e
[
  {
    "name": "inode资源不足-使用率超过90",
    "note": "",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "(100 - ((node_filesystem_files_free * 100) / node_filesystem_files))>90",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "内存资源不足-利用率大于75%",
    "note": "需要扩容或者升级配置了",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Cached_bytes + node_memory_Buffers_bytes))/node_memory_MemTotal_bytes*100 > 75",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "文件句柄不足-使用率超过90%",
    "note": "可以将文件句柄limit调大,或者扩容",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "(node_filefd_allocated{instance=\"$node\"}/node_filefd_maximum{instance=\"$node\"}*100) > 90",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "某磁盘无法正常读写",
    "note": "",
    "severity": 1,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "(node_filesystem_device_error{instance=\"$node\",mountpoint!~\"/var/lib/.*\",mountpoint!~\"/run.*\"}) > 0",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "磁盘需要清理了-利用率达到92%",
    "note": "",
    "severity": 1,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "(100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes) ) > 92 ",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "系统conntrack需要调整-使用率超过80%",
    "note": "",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "node_nf_conntrack_entries / node_nf_conntrack_entries_limit*100 > 80",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "系统出现oom",
    "note": "",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "increase(node_vmstat_oom_kill[1m]) > 0",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "网卡入方向丢包",
    "note": "",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "rate(node_network_receive_drop_total{device=~\"e.*\"}[1m]) > 3",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "网卡出方向丢包",
    "note": "",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "rate(node_network_transmit_drop_total{device=~\"e.*\"}[1m]) > 3",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "计算资源不足-机器loadavg1大于15",
    "note": "需要扩容或者升级配置了",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "node_load1>15",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  },
  {
    "name": "运行进程数过多-超过3000",
    "note": "建议扩容",
    "severity": 2,
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "node_procs_running > 3000",
    "prom_eval_interval": 15,
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": ["1", "2", "3", "4", "5", "6", "0"],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": [],
    "notify_repeat_step": 60,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": []
  }
]
\ No newline at end of file
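Because the alert file is renamed rather than edited in place, the page shows it as one full deletion plus one full addition, which makes the rule-level change hard to see. The following is a rough sketch, not part of the commit, that compares the two files by rule name only. The function name `diff_rule_names` is hypothetical, and the name lists are copied from the `"name"` fields of the two files.

```python
# Rule names from the deleted etc/alerts/node_by_exporter.json, in file order.
OLD_NAMES = [
    "inode资源不足-使用率超过90",
    "内存资源不足-利用率大于75%",
    "文件句柄不足-使用率超过90%",
    "某磁盘无法正常读写",
    "磁盘需要清理了-利用率达到92%",
    "系统conntrack需要调整-使用率超过80%",
    "系统出现oom",
    "网卡入方向丢包",
    "网卡出方向丢包",
    "计算资源不足-机器loadavg1大于15",
    "运行进程数过多-超过3000",
]

# Rule names from the added etc/alerts/linux_by_exporter.json, in file order.
NEW_NAMES = [
    "inode资源不足-使用率超过90",
    "内存资源不足-利用率大于75%",
    "内存资源不足-利用率大于95%",
    "文件句柄不足-使用率超过90%",
    "某磁盘无法正常读写",
    "磁盘需要清理了-利用率达到92%",
    "系统conntrack需要调整-使用率超过80%",
    "系统出现oom",
    "网卡入方向丢包",
    "网卡出方向丢包",
    "计算资源不足-机器每个核平均负载大于10",
    "运行进程数过多-超过3000",
]

def diff_rule_names(old, new):
    """Return (added, removed) rule names, each kept in its file order."""
    old_set, new_set = set(old), set(new)
    added = [name for name in new if name not in old_set]
    removed = [name for name in old if name not in new_set]
    return added, removed

if __name__ == "__main__":
    added, removed = diff_rule_names(OLD_NAMES, NEW_NAMES)
    print("added:", added)
    print("removed:", removed)
```

A name-only comparison does not catch field changes inside rules that kept their name. In this commit, for instance, every carried-over rule changes `disabled` from 0 to 1, and three rules gain the `dingtalk` notify channel.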
etc/dashboards/linux_by_exporter.json
New file (mode 0 → 100644) @ 544c93c7
This diff is collapsed.
etc/dashboards/linux_by_telegraf.json
New file (mode 0 → 100644) @ 544c93c7
This diff is collapsed.
etc/dashboards/node_by_exporter.json
Deleted (mode 100644 → 0) @ 66bc023e
This diff is collapsed.