1. Register a WeChat Work (企业微信) account

1.1 Open the registration page

https://work.weixin.qq.com/wework_admin/register_wx?from=myhome_baidu

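1.2 Corp ID

Open "My Enterprise" (我的企业), scroll down to "企业ID/corpid", and note the value. It is used later as corp_id in the Alertmanager configuration.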

1.3 Create an application

1. Go to "App Management" (应用管理) ==>> "Self-built" (自建) ==>> "Create App" (创建应用).

2. Select the image created in the previous step and fill in the application information.

3. Note the "Agentid" and "Secret" values; both are needed later.

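4. Configure the enterprise trusted IP (企业可信IP)

This step requires a domain owned by the enterprise, with a file placed under that domain.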

Until the configuration above passes, API calls keep failing with:

{"errcode":60020,"errmsg":"not allow to access from your ip, hint: [***], from ip: ***, more info at https://open.work.weixin.qq.com/devtool/query?e=60020"}

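❌ Note

Early on, the enterprise trusted IP setting did not have to be enabled: error 60020 was still reported, but calls went through. It is now enabled automatically for newly created applications, so the API cannot be called until an IP is configured.

So getting a newly created application to work costs money.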
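
Before wiring the credentials into Alertmanager, it can help to check them directly against the WeChat Work API. The sketch below is only an illustration under stated assumptions: it uses the public `gettoken` and `message/send` endpoints, and the helper names (`gettoken_url`, `build_text_message`, `send_test_message`) are hypothetical, not part of the original setup. The values to pass in are the Corp ID, Secret, and Agentid noted in the steps above.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://qyapi.weixin.qq.com/cgi-bin"


def gettoken_url(corp_id: str, secret: str) -> str:
    """Build the access-token URL from the Corp ID and the app Secret."""
    query = urllib.parse.urlencode({"corpid": corp_id, "corpsecret": secret})
    return f"{API_BASE}/gettoken?{query}"


def build_text_message(agent_id: int, content: str, to_user: str = "@all") -> dict:
    """Build the JSON body for a plain-text application message."""
    return {
        "touser": to_user,
        "msgtype": "text",
        "agentid": agent_id,
        "text": {"content": content},
    }


def send_test_message(corp_id: str, secret: str, agent_id: int, content: str) -> dict:
    """Fetch a token, send one message, and return the API's JSON reply."""
    with urllib.request.urlopen(gettoken_url(corp_id, secret), timeout=10) as resp:
        token_reply = json.load(resp)
    if token_reply.get("errcode") != 0:
        return token_reply  # for example a wrong corp_id or secret
    query = urllib.parse.urlencode({"access_token": token_reply["access_token"]})
    body = json.dumps(build_text_message(agent_id, content)).encode("utf-8")
    request = urllib.request.Request(
        f"{API_BASE}/message/send?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        # errcode 60020 in a reply means the caller IP is not in the trusted list
        return json.load(resp)
```

Calling `send_test_message(corp_id, secret, agent_id, "hello")` from the machine that will run Alertmanager should show whether the trusted IP configuration is in effect: a reply with `errcode` 0 means the message was accepted, while 60020 points back to step 4.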

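2. WeChat Work as an alert channel

Real-time alert notification: instant-messaging tools such as WeChat Work and DingTalk deliver alerts in real time, so team members can respond to and resolve problems promptly.

Wider reach: alerts sent through WeChat Work or DingTalk can target groups and @-mention individuals, which reaches more recipients and reduces the chance that an alert is missed.

More readable alerts: these tools support richer presentation, such as text messages, links, images, and voice, which makes alerts more intuitive and easier to understand.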

3. Alertmanager configuration

cat 2.alertmanager-configmap-wechat.yaml

yaml
data:
  alertmanager.yml: |-
    global:
      resolve_timeout: 1m
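      smtp_smarthost: 'smtp.qq.com:465'     # SMTP server host and port
      smtp_from: 'xxx@qq.com'    # sender address
      smtp_auth_username: 'xxx@qq.com'      # login username
      smtp_auth_password: 'xxxx'    # the mailbox's third-party authorization code, not the account password
      smtp_require_tls: false           # some mail providers require this; the WeChat Work mailbox used here is for testing only, so it stays off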
    templates:
      - '/etc/alertmanager/*.tmpl'
    route:
      group_by: ['env','instance','type','group','job','alertname','cluster']
      group_wait: 10s
      group_interval: 2m
      repeat_interval: 10m
      receiver: 'wechat'
      routes:
      - receiver: 'email'
        match:
          severity: critical

      - receiver: 'wechat'
        match:
          severity: critical222

      - receiver: 'webhook'
        match:
          severity: critical

    receivers:
    - name: 'email'
      email_configs:
      - to: 'hxopensource@163.com'
        send_resolved: true
        html: '{{ template "email.to.html" . }}'
        headers: { Subject: "System monitoring alert{{- if gt (len .Alerts.Resolved) 0 -}}: resolved{{ end }}" }

    #- name: 'devops'
    #  email_configs:
    #  - to: 'hxopensource@163.com,xxx@qq.com'
    #    send_resolved: true
    #    html: '{{ template "email.to.html" . }}'

    - name: 'wechat'
      wechat_configs:
      - corp_id: 'wwe158xxx4275006'
        to_party: '1'
        to_user: '@all'
        agent_id: 1000004
        api_secret: 'eGORelIo1EqzLfxxxxsdtAOm5nkGELI-Ag3TTwo'
        send_resolved: true

    - name: 'webhook'
      webhook_configs:
      - url: 'http://webhook-dingtalk.monitor.svc.cluster.local:8060/dingtalk/webhook1/send'
        send_resolved: true

    inhibit_rules:
      - source_match:
          severity: 'critical'
        target_match:
          severity: 'warning'
        equal: ['alertname', 'dev', 'instance']

  wechat.tmpl: |-
    {{ define "wechat.default.message" }}
    {{- if gt (len .Alerts.Firing) 0 -}}
    {{- range $index, $alert := .Alerts -}}
    {{- if eq $index 0 }}
    ========= Monitoring Alert =========
    Alert status: {{   .Status }}
    Alert severity: {{ .Labels.severity }}
    Alert type: {{ $alert.Labels.alertname }}
    Faulty host: {{ $alert.Labels.instance }}
    Alert summary: {{ $alert.Annotations.summary }}
    Alert details: {{ $alert.Annotations.message }}{{ $alert.Annotations.description}};
    Trigger threshold: {{ .Annotations.value }}
    Failure time: {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
    ========= = end =  =========
    {{- end }}
    {{- end }}
    {{- end }}
    {{- if gt (len .Alerts.Resolved) 0 -}}
    {{- range $index, $alert := .Alerts -}}
    {{- if eq $index 0 }}
    ========= Alert Resolved =========
    Alert type: {{ .Labels.alertname }}
    Alert status: {{   .Status }}
    Alert summary: {{ $alert.Annotations.summary }}
    Alert details: {{ $alert.Annotations.message }}{{ $alert.Annotations.description}};
    Failure time: {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
    Recovery time: {{ ($alert.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
    {{- if gt (len $alert.Labels.instance) 0 }}
    Instance: {{ $alert.Labels.instance }}
    {{- end }}
    ========= = end =  =========
    {{- end }}
    {{- end }}
    {{- end }}
    {{- end }}

  email.tmpl: |-
    {{ define "email.from" }}xxx.com{{ end }}
    {{ define "email.to" }}xxx.com{{ end }}
    {{ define "email.to.html" }}
    {{- if gt (len .Alerts.Firing) 0 -}}
    {{ range .Alerts }}
    ========= Monitoring Alert =========<br>
    Alert source: prometheus_alert <br>
    Alert severity: {{ .Labels.severity }} <br>
    Alert type: {{ .Labels.alertname }} <br>
    Alert host: {{ .Labels.instance }} <br>
    Alert summary: {{ .Annotations.summary }}  <br>
    Alert details: {{ .Annotations.description }} <br>
    Triggered at: {{ .StartsAt.Format "2006-01-02 15:04:05" }} <br>
    ========= = end =  =========<br>
    {{ end }}{{ end -}}

    {{- if gt (len .Alerts.Resolved) 0 -}}
    {{ range .Alerts }}
    ========= Alert Resolved =========<br>
    Alert source: prometheus_alert <br>
    Alert severity: {{ .Labels.severity }} <br>
    Alert type: {{ .Labels.alertname }} <br>
    Alert host: {{ .Labels.instance }} <br>
    Alert summary: {{ .Annotations.summary }} <br>
    Alert details: {{ .Annotations.description }} <br>
    Triggered at: {{ .StartsAt.Format "2006-01-02 15:04:05" }} <br>
    Resolved at: {{ .EndsAt.Format "2006-01-02 15:04:05" }} <br>
    ========= = end =  =========<br>
    {{ end }}{{ end -}}

    {{- end }}
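
One detail of the `route` section is worth spelling out. Alertmanager checks child routes in order and stops at the first match unless that route sets `continue: true`. Under that rule, the `webhook` route above is never reached, because `email` already matches `severity: critical`. The Python sketch below is a simplified model of that first-match behaviour, not Alertmanager's actual code; `ROUTES` and `pick_receiver` are hypothetical names that mirror the configuration above.

```python
# Child routes in the same order as the configuration: (receiver, match labels).
ROUTES = [
    ("email", {"severity": "critical"}),
    ("wechat", {"severity": "critical222"}),
    ("webhook", {"severity": "critical"}),
]
DEFAULT_RECEIVER = "wechat"  # the top-level `receiver`


def pick_receiver(labels: dict) -> str:
    """Return the first child route whose match labels all equal the alert's."""
    for receiver, match in ROUTES:
        if all(labels.get(key) == value for key, value in match.items()):
            return receiver  # without `continue: true`, evaluation stops here
    return DEFAULT_RECEIVER


print(pick_receiver({"severity": "critical"}))     # email
print(pick_receiver({"severity": "critical222"}))  # wechat
print(pick_receiver({"severity": "warning"}))      # wechat (default receiver)
```

If critical alerts should reach both email and the DingTalk webhook, adding `continue: true` to the `email` route is one option. Whether that was the intent here is not stated in the configuration.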
  • apply
bash
kubectl apply -f  2.alertmanager-configmap-wechat.yaml
  • Hot reload
bash
curl -XPOST http://alertmanager.ikubernetes.net/-/reload
  • Send a test alert
bash
curl -XPOST -H 'Content-Type: application/json' http://alertmanager.ikubernetes.net/api/v1/alerts -d '[{"labels":{"severity":"critical222"},"annotations":{"summary":"This is a testalert"}}]'
  • Result (screenshot image-20240703120504375)
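
The test request above posts to `/api/v1/alerts`. Alertmanager 0.27.0 removed the v1 API, so on newer versions the same test has to go to `/api/v2/alerts`. The article does not say which version is deployed here. The following Python sketch shows one way to build and send an equivalent v2 payload. The function names and the `alertname` value `TestAlert` are assumptions added for illustration; the original test alert carries only a `severity` label and a `summary` annotation.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(severity: str, summary: str, alertname: str = "TestAlert") -> list:
    """Build the JSON array that POST /api/v2/alerts accepts."""
    return [{
        # alertname is an illustrative addition so templates have a name to show
        "labels": {"alertname": alertname, "severity": severity},
        "annotations": {"summary": summary},
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def post_alert(base_url: str, alerts: list) -> int:
    """Send the alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.status
```

For example, `post_alert("http://alertmanager.ikubernetes.net", build_test_alert("critical222", "This is a testalert"))` sends an alert carrying the `critical222` label that the `wechat` route matches.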