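1. Integrating custom resources
Prometheus monitors resources through exporters. An exporter is the agent side of monitoring: it collects the metrics of a given resource and exposes them over an HTTP endpoint that Prometheus scrapes.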
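As a rough illustration of the exporter idea, here is a minimal sketch using only the Python standard library. It is not how node-exporter or any real exporter is implemented; the metric name `demo_up`, port 9999, and all function names are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples: dict) -> str:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current samples on /metrics, 404 for anything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # A real exporter would collect fresh values here on every scrape.
        body = render_metrics({"demo_up": 1.0}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9999) -> None:
    """Block forever, answering scrapes on the given (hypothetical) port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` and then running `curl localhost:9999/metrics` should return the rendered text; Prometheus would read the same endpoint on each scrape.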
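2. Scraping ECS host metrics
2.1 Install and configure node-exporter
- Start the container
docker run -d -p 9100:9100 \
  -v "/proc:/host/proc" \
  -v "/sys:/host/sys" \
  -v "/:/rootfs" \
  -v "/etc/localtime:/etc/localtime" \
  prom/node-exporter \
  --path.procfs /host/proc \
  --path.sysfs /host/sys \
  --collector.filesystem.ignored-mount-points '^/(sys|proc|dev|host|etc)($|/)'
Note: newer node_exporter releases rename this flag to --collector.filesystem.mount-points-exclude; the old name is deprecated.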
Verify: curl localhost:9100/metrics
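When verifying by hand, it can help to pick individual values out of the /metrics output. The sketch below parses the Prometheus text format into a dictionary. It assumes each sample line ends with the value and carries no trailing timestamp, and the sample text is made up for illustration rather than captured from a real node-exporter.

```python
def parse_metrics(text: str) -> dict:
    """Map each sample (name plus labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated field.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Illustrative input only, not real exporter output.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.25
node_filesystem_avail_bytes{mountpoint="/"} 1.5e+09
"""

parsed = parse_metrics(SAMPLE)
```

Against a live exporter, the text could be fetched with `urllib.request.urlopen("http://localhost:9100/metrics").read().decode()` and passed to `parse_metrics`.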
- Add the scrape job
  - job_name: 'other-ECS'
    static_configs:
    - targets: ['10.103.236.199:9100']
      labels:
        hostname: 'test-node-exporter'
- Hot reload (requires Prometheus to be started with --web.enable-lifecycle)
curl -XPOST http://prometheus.ikubernetes.net/-/reload
3. Process monitoring with process-exporter
process-exporter monitors processes: for example, how many processes a service is running and how much CPU, memory, and other resources they consume.
3.0 Syntax
vim /opt/process-exporter/config/process-exporter.yml
process_names:
  # - name: "{{.Comm}}"
  #   cmdline:
  #   - '.+'
  - name: "{{.Matches}}"
    cmdline:
    - 'nginx'
  - name: "{{.Matches}}"
    cmdline:
    - '/opt/atlassian/confluence/bin/tomcat-juli.jar'
  - name: "{{.Matches}}"
    cmdline:
    - 'vsftpd'
  - name: "{{.Matches}}"
    cmdline:
    - 'redis-server'
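cmdline: a regular expression that uniquely identifies the target process. It is matched against the command line that ps -ef shows. If no such process exists, no data is collected for it.
Template variables such as {{.Comm}} must be wrapped in {{ }}.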
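Template variable | Example groupname | Description |
---|---|---|
.Comm | groupname="redis-server" | Name of the executable or sh script |
.ExeBase | groupname="redis-server *:6379" | / |
.ExeFull | groupname="/usr/bin/redis-server *:6379" | Full process information as shown by ps |
.Username | groupname="redis" | Groups processes by the user that owns them |
.Matches | groupname="map[:redis]" | The keyword matched by the configured pattern, here "redis" |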
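The `map[:redis]` group name in the last row is easier to understand with a small simulation. The sketch below is not process-exporter's code. It assumes each `cmdline` pattern is searched in the space-joined argument list, that the whole match is stored under an empty key, and that the result is printed the way Go formats a string map. The function names are hypothetical.

```python
import re


def match_cmdline(patterns, argv):
    """Return a Matches-style map if every pattern matches, else None."""
    joined = " ".join(argv)
    matches = {}
    for pattern in patterns:
        found = re.search(pattern, joined)
        if found is None:
            # One failing pattern means the process is not in this group.
            return None
        matches[""] = found.group(0)        # whole match under the empty key
        matches.update(found.groupdict())   # named groups under their names
    return matches


def render_matches(matches):
    """Format the map like Go prints map[string]string: map[k:v ...]."""
    body = " ".join(f"{key}:{value}" for key, value in sorted(matches.items()))
    return f"map[{body}]"


redis_argv = ["/usr/bin/redis-server", "*:6379"]
group = render_matches(match_cmdline(["redis"], redis_argv))
```

Under these assumptions the pattern `redis` yields the group name `map[:redis]`, while a pattern that does not match (for example `nginx`) returns `None`, which corresponds to no series being produced for that entry.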
3.1 Create the mount directory
mkdir -p /opt/process-exporter/config
cat /opt/process-exporter/config/process-exporter.yml
process_names:
  - name: "{{.Matches}}"  # name template
    cmdline:
    - 'api'  # change this to match your own process
3.2 Install and configure process-exporter
docker run -d --rm -p 9256:9256 --privileged \
  -v /proc:/host/proc \
  -v /opt/process-exporter/config:/config \
  ncabatoff/process-exporter \
  --procfs /host/proc \
  -config.path /config/process-exporter.yml
- Verify
curl localhost:9256/metrics
ps aux | grep -v grep | grep api
4. Process monitoring with process-exporter on k8s
Goal: monitor the docker and kubelet processes on every Linux server in the k8s cluster, and trigger an alert when either process becomes abnormal.
4.1 Create the configuration
- Create the ConfigMap
1.exporter-cofig.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: process-exporter-config
  namespace: monitor
data:
  process-exporter-config.yaml: |-
    process_names:
      - name: "{{.Matches}}"
        cmdline:
        - 'docker'
      - name: "{{.Matches}}"
        cmdline:
        - 'kubelet'
- Apply
kubectl apply -f 1.exporter-cofig.yaml
4.2 Install
- Create the DaemonSet
2.exporter-dp.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: process-exporter
  namespace: monitor
  labels:
    app: process-exporter
spec:
  selector:
    matchLabels:
      app: process-exporter
  template:
    metadata:
      labels:
        app: process-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: process-exporter
        image: registry.cn-zhangjiakou.aliyuncs.com/hsuing/process-exporter:latest
        args:
        - -config.path=/config/process-exporter-config.yaml
        ports:
        - containerPort: 9256
        resources:
          requests:
            cpu: 10m
            memory: 10Mi
          limits:
            cpu: 150m
            memory: 180Mi
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534
        volumeMounts:
        - name: proc
          mountPath: /proc
        - name: config
          mountPath: /config
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: config
        configMap:
          name: process-exporter-config
- apply
kubectl apply -f 2.exporter-dp.yaml
- Verify
curl <pod-ip>:9256/metrics
4.3 Configure the Prometheus scrape job
  - job_name: 'process-exporter'
    scrape_interval: 1m
    scrape_timeout: 1m
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - source_labels: [__address__]
      regex: '(.*):10250'
      replacement: '${1}:9256'
      target_label: __address__
      action: replace
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - source_labels: [__meta_kubernetes_node_address_InternalIP]
      action: replace
      target_label: ip
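The first relabel rule in this job rewrites the kubelet address discovered for each node (port 10250) to the process-exporter port 9256. The sketch below mimics that single `replace` action in Python. It is a simplified model, not Prometheus code: it handles only `${N}` references, and the node IP is a made-up example.

```python
import re


def relabel_replace(value: str, regex: str, replacement: str) -> str:
    """Apply one 'replace' rule; return the value unchanged on no match."""
    # Prometheus anchors relabel regexes, so the whole value must match.
    found = re.fullmatch(regex, value)
    if found is None:
        return value
    # Translate Prometheus-style ${1} references to Python's \g<1> form.
    template = re.sub(r"\$\{(\w+)\}", r"\\g<\1>", replacement)
    return found.expand(template)


# Hypothetical node address as discovered by the 'node' role.
rewritten = relabel_replace("10.0.0.5:10250", r"(.*):10250", "${1}:9256")
```

An address that does not end in `:10250` falls through unchanged, which is why the rule is safe to apply to every discovered target.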
- Hot reload
curl -XPOST http://prometheus.ikubernetes.net/-/reload
4.4 Alert rule
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rule
  labels:
    name: prometheus-rule
  namespace: monitoring
data:
  alert-rules.yaml: |-
    groups:
    - name: node-alert
      rules:
      - alert: service not running
        expr: namedprocess_namegroup_num_procs == 0
        for: 1m
        labels:
          severity: warning
          team: server
        annotations:
          summary: "{{$labels.ip}} service status not running"
          description: "{{$labels.ip}} {{$labels.groupname}} service status not running"
          value: "{{$labels.groupname}}"
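To see how `expr` and `for: 1m` interact, here is a simplified model of the alert state for one process group. It is a sketch under stated assumptions, not Prometheus's rule evaluator: `history` is a hypothetical list of `(timestamp_seconds, num_procs)` samples sorted by time, and the alert fires once the value has been 0 continuously for at least the `for` duration.

```python
def alert_state(history, for_seconds):
    """Return 'inactive', 'pending', or 'firing' for one group's samples."""
    # No samples at all (the series does not exist) or a non-zero latest
    # value means the expression returns nothing for this group.
    if not history or history[-1][1] != 0:
        return "inactive"
    # Walk backwards to find when the value first became 0 in this streak.
    latest_ts = history[-1][0]
    streak_start = latest_ts
    for timestamp, value in reversed(history):
        if value != 0:
            break
        streak_start = timestamp
    held = latest_ts - streak_start
    return "firing" if held >= for_seconds else "pending"


just_died = alert_state([(0, 2), (60, 0)], 60)
still_down = alert_state([(0, 2), (60, 0), (120, 0)], 60)
never_seen = alert_state([], 60)
```

The empty-history case matters in practice: as noted in section 3.0, a process that never existed produces no series, so `== 0` alone cannot catch it. An additional `absent()`-based rule may be worth considering for that case.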
The Grafana dashboard template ID is 249.
5. domain-exporter
Documentation: https://github.com/caarlos0/domain_exporter/releases
5.1 Create the Service
cat 1.domain-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    name: domain-exporter
  name: domain-exporter
  namespace: monitor
spec:
  ports:
  - name: domain-exporter
    protocol: TCP
    port: 9222
    targetPort: 9222
  selector:
    app: domain-exporter
- Run kubectl apply -f 1.domain-svc.yaml
5.2 Create the Deployment
cat 2.domain-exporter-dp.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: domain-exporter
  namespace: monitor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: domain-exporter
  template:
    metadata:
      name: domain-exporter
      labels:
        app: domain-exporter
    spec:
      containers:
      - name: domain-exporter
        image: registry.cn-zhangjiakou.aliyuncs.com/hsuing/domain_exporter:v1.23.0
        ports:
        - name: tcp
          containerPort: 9222
          protocol: TCP
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
          limits:
            cpu: 200m
            memory: 256Mi
        securityContext:
          runAsUser: 1000
          readOnlyRootFilesystem: true
          runAsNonRoot: true
        readinessProbe:
          tcpSocket:
            port: 9222
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
- Run kubectl apply -f 2.domain-exporter-dp.yaml
5.3 Add the scrape job to Prometheus
  - job_name: domain-exporter
    metrics_path: /probe
    relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - target_label: __address__
      replacement: domain-exporter:9222  # domain_exporter address
    static_configs:
    - targets:
      - baidu.com  # change to the domains you want to check
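The two relabel rules in this job follow the usual probe pattern: the listed target becomes the `target` query parameter, and the scrape itself is sent to the exporter. The sketch below traces those two steps and builds the resulting URL. It is a simplified model with a hypothetical function name, and it assumes plain HTTP.

```python
from urllib.parse import urlencode


def probe_url(target: str,
              exporter: str = "domain-exporter:9222",
              metrics_path: str = "/probe") -> str:
    """Trace the two relabel rules and return the URL Prometheus would scrape."""
    labels = {"__address__": target}
    # Rule 1: copy the original target into the 'target' URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: point the actual scrape at the exporter.
    labels["__address__"] = exporter
    query = urlencode({"target": labels["__param_target"]})
    return f"http://{labels['__address__']}{metrics_path}?{query}"


url = probe_url("baidu.com")
```

Requesting that URL by hand with curl from inside the cluster is a quick way to check the exporter before reloading Prometheus.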
- Apply the updated configuration
- Hot reload
curl -XPOST http://prometheus.ikubernetes.net/-/reload
5.4 Alert rules
domain.rules: |
  groups:
  - name: domain
    rules:
    - alert: Domain probe failed
      expr: domain_probe_success == 0
      for: 2h
      labels:
        severity: warning
      annotations:
        summary: '{{ $labels.instance }}: domain probe'
        description: '{{ $labels.domain }}: domain probe failed, please check promptly!'
    - alert: Domain expiring
      expr: domain_expiry_days < 15
      for: 2h
      labels:
        severity: warning
      annotations:
        summary: '{{ $labels.instance }}: domain expiring'
        description: '{{ $labels.domain }} will expire in less than 15 days, please check promptly!'
    - alert: Domain expiring
      expr: domain_expiry_days < 5
      for: 2h
      labels:
        severity: warning
      annotations:
        summary: '{{ $labels.instance }}: domain expiring'
        description: '{{ $labels.domain }} will expire in less than 5 days, please check promptly!'
- Apply
- Hot reload
6. redis-exporter
version: "3.2"
services:
  redis-exporter:
    image: oliver006/redis_exporter
    container_name: redis-exporter
    restart: unless-stopped
    command:
      - "-redis.password-file=/redis_passwd.json"
    volumes:
      - /usr/share/zoneinfo/PRC:/etc/localtime
      - /data/redis-exporter/redis_passwd.json:/redis_passwd.json
    expose:
      - 9121
    network_mode: "host"
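The compose file mounts `/data/redis-exporter/redis_passwd.json`, but the notes do not show what it contains. As far as I know, redis_exporter expects this file to be a JSON map from Redis URI to password; treat that as an assumption and check the exporter's README. The sketch below generates such a file body; the addresses and passwords are placeholders.

```python
import json


def build_password_file(instances: dict) -> str:
    """Serialize a {redis_uri: password} map as pretty-printed JSON."""
    return json.dumps(instances, indent=2, sort_keys=True)


# Placeholder instances; replace with real URIs and passwords.
content = build_password_file({
    "redis://10.0.0.1:6379": "example-password",
    "redis://10.0.0.2:6379": "",
})
```

The returned string can be written to `/data/redis-exporter/redis_passwd.json` before starting the container.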
7. mysql-exporter
https://github.com/prometheus/mysqld_exporter
https://github.com/starsliao/TenSunS/blob/main/docs/如何优雅的使用一个mysqld_exporter监控所有的MySQL实例.md
8. PG-exporter
https://github.com/prometheus-community/postgres_exporter
https://cloud.tencent.com/developer/article/1868937
https://pigsty.cc/zh/docs/pgsql/dashboard/
https://demo.pigsty.cc/dashboards/f/pgsql/pgsql