1. windows_exporter
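1.1 Introduction
Collects metrics from Windows machines. By default the service enables the following collectors: cpu, cpu_info, memory, process, tcp, cs, logical_disk, net, os, system, textfile, time.
- Default installation directory: C:\Program Files\windows_exporter
- Default listening port: 9182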
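The collector list and port above correspond to the exporter's `--collectors.enabled` and `--web.listen-address` command-line flags. The following is a minimal Python sketch that assembles such a command line. The executable file name under the install directory and the helper name `build_exporter_args` are assumptions made for illustration.

```python
# Sketch: assemble a windows_exporter command line from the defaults listed above.
# The executable path and the helper name are illustrative assumptions.

DEFAULT_COLLECTORS = [
    "cpu", "cpu_info", "memory", "process", "tcp", "cs",
    "logical_disk", "net", "os", "system", "textfile", "time",
]
DEFAULT_PORT = 9182


def build_exporter_args(exe_path, collectors=None, port=DEFAULT_PORT):
    """Return the argument list for starting windows_exporter."""
    names = collectors if collectors is not None else DEFAULT_COLLECTORS
    return [
        exe_path,
        "--collectors.enabled=" + ",".join(names),
        f"--web.listen-address=:{port}",
    ]


if __name__ == "__main__":
    exe = r"C:\Program Files\windows_exporter\windows_exporter.exe"  # assumed file name
    print(" ".join(build_exporter_args(exe, ["cpu", "memory", "os"])))
```

Passing an explicit collector list narrows what is collected; omitting it falls back to the list documented above.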
1.2 Deployment
1. Download
wget https://github.com/prometheus-community/windows_exporter/releases/download/v0.29.2/windows_exporter-0.29.2-amd64.msi
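2. Install
Double-click the MSI to run the installer. The default installation registers windows_exporter as a system service automatically.
- Installation path
- Port and configuration file
- Install
3. Start
Press Win+R and enter services.msc
If the windows_exporter service is not listed there, register it manually as a system service so that it starts at boot. Run the commands below from cmd; in Windows PowerShell, sc is an alias for Set-Content, so call sc.exe instead.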
Syntax:
powershell
C:\Users\Administrator>sc create
DESCRIPTION:
Creates a service entry in the registry and Service Database.
USAGE:
sc <server> create [service name] [binPath= ] <option1> <option2>...
OPTIONS:
NOTE: The option name includes the equal sign.
A space is required between the equal sign and the value.
type= <own|share|interact|kernel|filesys|rec>
  (default = own)
start= <boot|system|auto|demand|disabled|delayed-auto>
  (default = demand)
error= <normal|severe|critical|ignore>
  (default = normal)
binPath= <BinaryPathName>
group= <LoadOrderGroup>
tag= <yes|no>
depend= <Dependencies (separated by / (forward slash))>
obj= <AccountName|ObjectName>
  (default = LocalSystem)
DisplayName= <display name>
password= <password>
- Create the service
C:\Users\Administrator>sc create windows_exporter binpath= C:\windows_exporter.exe type= own start= auto displayname= windows_exporter
[SC] CreateService SUCCESS
❌ Note
The space between the equal sign "=" and the value is mandatory. If it is omitted, the command fails.
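The spacing rule is easy to get wrong when the command is generated by a script. The following is a minimal Python sketch that builds an `sc create` command string with each option emitted as `key= value`. The helper names `build_sc_create` and `quote_if_needed` are hypothetical, not part of any Windows tooling.

```python
# Sketch: build an `sc create` command line with the mandatory "option= value" spacing.
# Helper names are illustrative assumptions.


def quote_if_needed(value):
    """Wrap values containing spaces in double quotes so they stay one token."""
    return f'"{value}"' if " " in value else value


def build_sc_create(service_name, bin_path, **options):
    """Return an `sc create` command string; each option is emitted as `key= value`."""
    parts = ["sc", "create", service_name, "binPath=", quote_if_needed(bin_path)]
    for key, value in options.items():
        # sc treats "key=" as the option name, so the value must be a separate token.
        parts.extend([f"{key}=", quote_if_needed(str(value))])
    return " ".join(parts)


if __name__ == "__main__":
    print(build_sc_create(
        "windows_exporter",
        r"C:\windows_exporter.exe",
        type="own",
        start="auto",
        DisplayName="windows_exporter",
    ))
```

Keeping `key=` and the value as separate tokens means the joined string always has the required space, and paths that contain spaces are quoted.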
- Delete the service
powershell
C:\Users\Administrator>sc delete windows_exporter
[SC] DeleteService SUCCESS
4. Access
ip:9182/metrics
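The endpoint returns plain text in the Prometheus exposition format, so a quick health check can be scripted. The following is a minimal Python sketch that lists collectors reporting `windows_exporter_collector_success` of 0. The `SAMPLE` text is made-up illustrative data, and the helper name `failed_collectors` is hypothetical.

```python
# Sketch: parse Prometheus text-format output and list collectors that failed.
# SAMPLE is made-up illustrative data, not captured from a real host.

SAMPLE = """\
# TYPE windows_exporter_collector_success gauge
windows_exporter_collector_success{collector="cpu"} 1
windows_exporter_collector_success{collector="os"} 1
windows_exporter_collector_success{collector="textfile"} 0
"""


def failed_collectors(metrics_text):
    """Return collector names whose windows_exporter_collector_success value is 0."""
    failed = []
    prefix = 'windows_exporter_collector_success{collector="'
    for line in metrics_text.splitlines():
        if not line.startswith(prefix):
            continue  # skips comment lines and unrelated metrics
        labels, value = line.rsplit(" ", 1)
        name = labels[len(prefix):].split('"', 1)[0]
        if float(value) == 0:
            failed.append(name)
    return failed


if __name__ == "__main__":
    print(failed_collectors(SAMPLE))
```

Against a real host, the text would come from an HTTP GET on the metrics URL instead of `SAMPLE`.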
1.3 Prometheus scraping
yaml
- job_name: Windows
  static_configs:
    - targets:
        - 10.103.236.129:9182
  scrape_interval: 1m
  scrape_timeout: 30s
  scheme: http
  metrics_path: "/metrics"
  honor_labels: true
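Prometheus rejects a scrape config whose scrape_timeout is longer than its scrape_interval. The job above uses 1m and 30s, which satisfies that rule. The following is a minimal Python sketch of the check, assuming only simple single-unit durations such as "30s" or "1m". The helper names are illustrative and not part of Prometheus.

```python
# Sketch: check that scrape_timeout does not exceed scrape_interval.
# Only single-unit durations such as "30s" or "1m" are handled here.

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def to_seconds(duration):
    """Convert a simple duration like '30s' or '1m' to seconds."""
    value, unit = duration[:-1], duration[-1:]
    if unit not in UNIT_SECONDS or not value.isdigit():
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(value) * UNIT_SECONDS[unit]


def timeout_fits_interval(scrape_interval, scrape_timeout):
    """Return True when the timeout is no longer than the interval."""
    return to_seconds(scrape_timeout) <= to_seconds(scrape_interval)


if __name__ == "__main__":
    print(timeout_fits_interval("1m", "30s"))
```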
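- Reload Prometheus (the /-/reload endpoint only works when Prometheus is started with --web.enable-lifecycle)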
bash
curl -XPOST http://prometheus.ikubernetes.net/-/reload
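The same reload can be triggered from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_reload_request` and `reload_prometheus` are hypothetical, and the base URL is the one from the curl command above.

```python
# Sketch: trigger a Prometheus config reload with an HTTP POST to /-/reload.
# Helper names are illustrative assumptions.
import urllib.request


def build_reload_request(base_url):
    """Build the POST request for the reload endpoint without sending it."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url, timeout=10):
    """Send the reload request and return the HTTP status code."""
    with urllib.request.urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Only prints the request; call reload_prometheus() to actually send it.
    req = build_reload_request("http://prometheus.ikubernetes.net")
    print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the URL logic checkable without network access.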
1. Alerting rules
yaml
windos.rules: |
  groups:
  - name: windos.rules
    rules:
    - alert: Windows_CollectorStatus
      expr: windows_exporter_collector_success == 0
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Windows Server collector Error (instance {{ $labels.instance }})
        description: "Collector {{ $labels.collector }} was not successful\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
    - alert: Windows_ServerServiceStatus
      expr: windows_service_state{name="MSSQLSERVER", state="running"} != 1
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: Windows Server service Status (instance {{ $labels.instance }})
        description: "Windows Service state is not OK\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
    - alert: Windows_cpu
      expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[2m])) * 100) > 80
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Windows Server CPU Usage (instance {{ $labels.instance }})
        description: "CPU Usage is more than 80%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
    - alert: Windows_mem
      expr: 100 - ((windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes) * 100) > 80
      for: 2m
      labels:
        severity: critical
      annotations:
        summary: Windows Server memory Usage (instance {{ $labels.instance }})
        description: "Memory usage is more than 80%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
    - alert: Windows_disk
      expr: 100.0 - 100 * ((windows_logical_disk_free_bytes / 1024 / 1024) / (windows_logical_disk_size_bytes / 1024 / 1024)) > 80
      for: 2m
      labels:
        severity: critical
      annotations:
        summary: Windows Server disk Space Usage (instance {{ $labels.instance }})
        description: "Disk usage is more than 80%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
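The CPU, memory and disk expressions all turn a "free" or "idle" share into a "used" percentage and compare it with 80. The following is a minimal Python sketch of that arithmetic. The function names are hypothetical and every input number is a made-up sample value.

```python
# Sketch: the arithmetic behind the CPU, memory and disk alert expressions.
# All input numbers below are made-up sample values.

THRESHOLD = 80  # percent, as used in the rules


def cpu_used_percent(idle_rate):
    """idle_rate: average idle fraction per core, i.e. rate(...{mode="idle"})."""
    return 100 - idle_rate * 100


def memory_used_percent(free_bytes, total_bytes):
    """Mirror of the Windows_mem expression."""
    return 100 - (free_bytes / total_bytes) * 100


def disk_used_percent(free_bytes, size_bytes):
    """Mirror of the Windows_disk expression; the 1024 / 1024 factors cancel out."""
    return 100.0 - 100 * ((free_bytes / 1024 / 1024) / (size_bytes / 1024 / 1024))


if __name__ == "__main__":
    gib = 1024 ** 3
    print(cpu_used_percent(0.25) > THRESHOLD)
    print(memory_used_percent(2 * gib, 16 * gib) > THRESHOLD)
    print(disk_used_percent(10 * gib, 100 * gib) > THRESHOLD)
```

Because the byte-to-MiB divisions appear in both numerator and denominator of the disk rule, they do not change the result; the ratio of raw bytes gives the same percentage.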
1.4 Grafana
Browse the dashboard catalog at https://grafana.com/grafana/dashboards/
Dashboard template 10467 is used here.