1. windows_exporter

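1.1 Introduction

windows_exporter collects metrics from Windows machines. By default, the service enables the following collectors: cpu, cpu_info, memory, process, tcp, cs, logical_disk, net, os, system, textfile, time.

  • Default installation directory: C:\Program Files\windows_exporter
  • Default listening port: 9182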

1.2 Deployment

1. Download

Official release:

wget https://github.com/prometheus-community/windows_exporter/releases/download/v0.29.2/windows_exporter-0.29.2-amd64.msi

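2. Install

Double-click the MSI to run the installer. With the default options, the installer registers windows_exporter as a system service automatically. The installer walks through the following screens:

  • Installation path
  • Port and configuration file
  • Install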

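3. Start

Press Win+R and run services.msc to open the Services console.

If windows_exporter is not listed there, register it as a system service so that it starts automatically at boot.

Syntax: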

powershell
C:\Users\Administrator>sc create
Description:
        Creates a service entry in the registry and the service database.
Usage:
        sc <server> create [service name] [binPath= ] <option1> <option2>...
 
Options:
Note: The option name includes the equal sign.
      A space is required between the equal sign and the value.
 type= <own|share|interact|kernel|filesys|rec>
       (default = own)
 start= <boot|system|auto|demand|disabled|delayed-auto>
       (default = demand)
 error= <normal|severe|critical|ignore>
       (default = normal)
 binPath= <BinaryPathName>
 group= <LoadOrderGroup>
 tag= <yes|no>
 depend= <Dependencies (separated by / (forward slash))>
 obj= <AccountName|ObjectName>
       (default = LocalSystem)
 DisplayName= <display name>
 password= <password>
  • Create the service
powershell
C:\Users\Administrator>sc create windows_exporter binpath= C:\windows_exporter.exe type= own start= auto displayname= windows_exporter
[SC] CreateService SUCCESS

❌ Note

The space between the equal sign "=" and the value is required (for example, start= auto). Without it, the command fails.
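
When the service is registered from a script instead of being typed by hand, the spacing rule is easier to see if each option name and its value are passed as separate arguments. The sketch below is only an illustration and not part of the original procedure: the helper names build_sc_create and create_service are hypothetical, and sc.exe is called explicitly on the assumption that the script might be started from PowerShell, where sc can resolve to the Set-Content alias.

```python
import subprocess

def build_sc_create(name, bin_path, start="auto", svc_type="own"):
    """Build the argument list for `sc create` (hypothetical helper).

    Each option name ends with '=' and its value is a separate argument,
    which is what the required space after '=' expresses on the command line.
    """
    return [
        "sc.exe", "create", name,
        "binPath=", bin_path,
        "type=", svc_type,
        "start=", start,
        "DisplayName=", name,
    ]

def create_service(name, bin_path):
    """Run sc.exe (Windows only, elevated prompt); raises if sc.exe fails."""
    subprocess.run(build_sc_create(name, bin_path), check=True)

# Example (Windows only, so it is left commented out):
# create_service("windows_exporter", r"C:\windows_exporter.exe")
```

Passing a list to subprocess.run lets Python handle quoting, which matters if the executable lives under a path that contains spaces.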

  • Delete the service
powershell
C:\Users\Administrator>sc delete windows_exporter
[SC] DeleteService SUCCESS

4. Access

Open http://<ip>:9182/metrics in a browser, replacing <ip> with the address of the Windows host.
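
A scripted check can complement the browser test. The following sketch is an illustration, not part of the original setup: it fetches the metrics page and reports the value of windows_exporter_collector_success for each collector. The function names and the 5-second timeout are assumptions, and the parser assumes that samples carry no trailing timestamp.

```python
import urllib.request

def fetch_metrics(host, port=9182, timeout=5):
    """Download the plain-text metrics page from a windows_exporter instance."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

def parse_collector_status(metrics_text):
    """Map collector name -> True if its last collection succeeded (value 1)."""
    prefix = "windows_exporter_collector_success{"
    status = {}
    for line in metrics_text.splitlines():
        if not line.startswith(prefix) or 'collector="' not in line:
            continue  # skip comments and unrelated metrics
        labels, _, value = line.rpartition(" ")
        name = labels.split('collector="', 1)[1].split('"', 1)[0]
        status[name] = float(value) == 1.0
    return status

# Example (needs network access to the host, so it is left commented out):
# status = parse_collector_status(fetch_metrics("10.103.236.129"))
# for name, ok in sorted(status.items()):
#     print(name, "ok" if ok else "FAILED")
```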

1.3 Prometheus scraping

yaml
    - job_name: Windows
      static_configs:
      - targets:
        - 10.103.236.129:9182
      scrape_interval: 1m
      scrape_timeout: 30s
      scheme: http
      metrics_path: "/metrics"
      honor_labels: true
  • Reload Prometheus (the /-/reload endpoint works only when Prometheus runs with --web.enable-lifecycle)
bash
curl -XPOST  http://prometheus.ikubernetes.net/-/reload
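
After a reload, it can be useful to confirm that Prometheus picked up the new job. This is a minimal sketch, assuming the Prometheus HTTP API is reachable at the same base URL as the reload endpoint: it queries /api/v1/targets and lists the health of the targets in the Windows job. The helper names are illustrative.

```python
import json
import urllib.request

def fetch_targets(base_url, timeout=5):
    """Return the decoded JSON body of Prometheus' /api/v1/targets endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)

def targets_for_job(payload, job):
    """List (scrapeUrl, health, lastError) for the active targets of one job."""
    rows = []
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("labels", {}).get("job") == job:
            rows.append((
                target.get("scrapeUrl"),
                target.get("health"),
                target.get("lastError", ""),
            ))
    return rows

# Example (needs network access to Prometheus, so it is left commented out):
# payload = fetch_targets("http://prometheus.ikubernetes.net")
# for url, health, err in targets_for_job(payload, "Windows"):
#     print(url, health, err)
```

A health value of "up" for 10.103.236.129:9182 indicates that the scrape job is working. Any other value comes with a lastError message that usually points at the cause.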

1. Alert rules

yaml

  windows.rules: |
    groups:
    - name: windows.rules
      rules:
      - alert: Windows_CollectorStatus
        expr: windows_exporter_collector_success == 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Windows Server collector Error (instance {{ $labels.instance }})
          description: "Collector {{ $labels.collector }} was not successful\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
      - alert: Windows_ServerServiceStatus
        expr: windows_service_state{ name="MSSQLSERVER", state="running"} != 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Windows Server service Status (instance {{ $labels.instance }})
          description: "Windows Service state is not OK\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
      - alert: Windows_cpu
        expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[2m])) * 100) > 80
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Windows Server CPU Usage (instance {{ $labels.instance }})
          description: "CPU Usage is more than 80%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
      - alert: Windows_mem
        expr: 100 - ((windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes) * 100) > 80
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: Windows Server memory Usage (instance {{ $labels.instance }})
          description: "Memory usage is more than 80%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
      - alert: Windows_disk
        expr: 100.0 - 100 * ((windows_logical_disk_free_bytes / 1024 / 1024 ) / (windows_logical_disk_size_bytes / 1024 / 1024)) > 80
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: Windows Server disk Space Usage (instance {{ $labels.instance }})
          description: "Disk usage is more than 80%\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

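
The memory and disk rules reduce to the same arithmetic: used percent = 100 - free / total * 100, compared against 80. In the disk rule, the / 1024 / 1024 factors appear in both the numerator and the denominator, so they cancel out. The sketch below replays that calculation in Python with made-up byte counts. It is only a worked example of the expressions, not something Prometheus runs.

```python
GIB = 1024 ** 3

def used_percent(free_bytes, total_bytes):
    """Same arithmetic as the memory rule: 100 - (free / total) * 100."""
    return 100.0 - (free_bytes / total_bytes) * 100.0

def disk_used_percent(free_bytes, size_bytes):
    """Same arithmetic as the disk rule, including the MiB conversion that cancels out."""
    return 100.0 - 100 * ((free_bytes / 1024 / 1024) / (size_bytes / 1024 / 1024))

# Hypothetical host: 16 GiB RAM with 2 GiB free, 200 GiB disk with 50 GiB free.
mem = used_percent(2 * GIB, 16 * GIB)           # 87.5, above the 80 threshold
disk = disk_used_percent(50 * GIB, 200 * GIB)   # 75.0, below the 80 threshold
print(mem > 80, disk > 80)                      # True False
```

With these numbers, only the Windows_mem condition is met, and the alert fires once the condition has held for the 2m set in its for clause.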

1.4 Grafana

Search for a dashboard at https://grafana.com/grafana/dashboards/.

This setup uses dashboard template ID 10467.