Overview

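Monitoring Nginx relies on three main components:

  • nginx-module-vts: the Nginx virtual host traffic status module. It collects Nginx traffic statistics and can output them as JSON.
  • nginx-vts-exporter: a simple server that scrapes Nginx vts stats and exports them via HTTP for Prometheus consumption. It collects the Nginx monitoring data and exposes it to Prometheus, by default on port 9913.
  • Prometheus: scrapes the Nginx data exposed by nginx-vts-exporter and stores it in its time-series database, where it can be queried and aggregated with PromQL.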

Adding the Nginx module

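Clone the module source and rebuild Nginx with --add-module pointing at the checkout. Afterwards, nginx -V should list the module among the configure arguments.

[root@localhost vhost]# cd /data/software
[root@localhost software]# git clone https://github.com/vozlt/nginx-module-vts.git
[root@localhost software]# /usr/local/nginx/sbin/nginx -V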
...
--add-module=/data/software/nginx-module-vts 
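
Before going further, it can help to confirm that the running binary really contains the module. The following is a minimal sketch, assuming the binary lives at /usr/local/nginx/sbin/nginx as in the transcript above; the helper name has_vts_module is made up for illustration. Note that nginx -V writes its output to stderr, hence the 2>&1.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin mentions nginx-module-vts.
has_vts_module() {
    grep -q 'nginx-module-vts'
}

# Run the check only when the binary from the transcript exists on this host.
NGINX_BIN=/usr/local/nginx/sbin/nginx
if [ -x "$NGINX_BIN" ]; then
    # nginx -V prints the configure arguments to stderr, so merge it into stdout.
    if "$NGINX_BIN" -V 2>&1 | has_vts_module; then
        echo "vts module is compiled in"
    else
        echo "vts module is missing: rebuild Nginx with --add-module" >&2
    fi
fi
```

If the check fails, the vhost_traffic_status_* directives used in the next step will be rejected by nginx -t as unknown directives.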

Configuring Nginx

Create a new vhost that exposes the status data to the monitoring server.

[root@localhost vhost]# vim monitor.conf 

server {
    listen 8088;    # must match the port in the exporter's scrape URI
    #allow 127.0.0.1;
    #allow 47.56.165.134;
    # Optional filters; uncomment the ones you need.
    #vhost_traffic_status_filter_by_set_key $uri uri::$server_name;                        # requests per URI
    #vhost_traffic_status_filter_by_set_key $geoip_country_code country::$server_name;     # requests per country/region (needs the GeoIP module)
    #vhost_traffic_status_filter_by_set_key $filter_user_agent agent::$server_name;        # requests per user agent ($filter_user_agent must be defined with a map)
    #vhost_traffic_status_filter_by_set_key $status $server_name;                          # HTTP status code statistics
    #vhost_traffic_status_filter_by_set_key $upstream_addr upstream::backend;              # statistics per upstream backend
    #vhost_traffic_status_filter_by_set_key $remote_port client::ports::$server_name;      # statistics per client port
    #vhost_traffic_status_filter_by_set_key $remote_addr client::addr::$server_name;       # statistics per client IP
    #location ~ ^/storage/(.+)/.*$ {
    #    set $volume $1;
    #    vhost_traffic_status_filter_by_set_key $volume storage::$server_name;             # statistics per request path
    #}
    location /status {
        vhost_traffic_status_display;
        vhost_traffic_status_display_format html;
    }
}

Reload Nginx so that the new configuration takes effect.

[root@localhost vhost]# /usr/local/nginx/sbin/nginx -t
nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful
[root@localhost vhost]# /usr/local/nginx/sbin/nginx -s reload

The Nginx status page is now available at http://<IP address>:<port>/status.
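
The same location also serves machine-readable output: appending /format/json to the status path returns the JSON document that the exporter scrapes later in this article. Here is a small sketch for building that URL and fetching it with curl; the helper name vts_json_url and the sample address are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON status URL for a given host and port.
# The /status path matches the location block configured above.
vts_json_url() {
    printf 'http://%s:%s/status/format/json\n' "$1" "$2"
}

# Example (replace the address and port with your own):
#   curl -s "$(vts_json_url 192.168.3.15 8088)"
```

Checking this URL by hand first makes later exporter problems easier to narrow down, since it separates Nginx-side issues from exporter-side ones.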

Configuring nginx-vts-exporter

[root@localhost vhost]# wget https://github.com/hnlq715/nginx-vts-exporter/releases/download/v0.9.1/nginx-vts-exporter-0.9.1.linux-amd64.tar.gz
[root@localhost vhost]# tar -zxvf nginx-vts-exporter-0.9.1.linux-amd64.tar.gz
[root@localhost vhost]# mv nginx-vts-exporter-0.9.1.linux-amd64 /usr/local/nginx-vts-exporter
[root@localhost vhost]# cd /usr/local/nginx-vts-exporter
[root@localhost nginx-vts-exporter]# ll
total 9756
 -rw-rw-r--. 1 2000 2000    1063 Feb 27  2018 LICENSE
 -rwxr-xr-x. 1 2000 2000 9982855 Feb 27  2018 nginx-vts-exporter

Manage the nginx-vts-exporter process with systemd (systemctl).

[root@localhost nginx-vts-exporter]# vim /usr/lib/systemd/system/nginx_vts_exporter.service 
[Unit]
Description=prometheus_nginx_vts
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/nginx-vts-exporter/nginx-vts-exporter -nginx.scrape_uri http://192.168.3.15:8088/status/format/json
Restart=on-failure

[Install]
WantedBy=multi-user.target
[root@localhost nginx-vts-exporter]# systemctl daemon-reload
[root@localhost nginx-vts-exporter]# systemctl enable  nginx_vts_exporter
[root@localhost nginx-vts-exporter]# systemctl start nginx_vts_exporter
[root@localhost nginx-vts-exporter]# systemctl status nginx_vts_exporter
● nginx_vts_exporter.service - prometheus_nginx_vts
   Loaded: loaded (/usr/lib/systemd/system/nginx_vts_exporter.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2020-04-10 22:13:02 EDT; 4 days ago
 Main PID: 90274 (nginx-vts-expor)
   CGroup: /system.slice/nginx_vts_exporter.service
           └─90274 /usr/local/nginx-vts-exporter/nginx-vts-exporter -nginx.scrape_uri http://192.168.3.15:8088/status/format/json
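
Once the service is running, a quick way to confirm that it actually reaches Nginx is to look at the exporter's own metrics endpoint. The sketch below assumes that the exporter listens on its default port 9913 and serves metrics under /metrics (assumed here to be its default telemetry path); the helper name count_server_request_samples is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: count the nginx_server_requests sample lines in
# Prometheus text-format input read from stdin. Comment lines start with '#'
# and are therefore not counted.
count_server_request_samples() {
    # grep -c exits non-zero when the count is 0; keep the function successful.
    grep -c '^nginx_server_requests' || true
}

# Example, run on the exporter host:
#   curl -s http://127.0.0.1:9913/metrics | count_server_request_samples
# A result of 0 means the metric used by the queries in this article is not
# being exported, so check the scrape URI in the unit file.
```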

Configuring the Prometheus server

[root@monitor ~]# vim /usr/local/prometheus/prometheus.yml 
...
  - job_name: 'nginx' # add a scrape job for the Nginx exporter
    static_configs:
      - targets:
        - '192.168.3.15:9913'


...

Restart the Prometheus server.

[root@monitor ~]# systemctl restart prometheus

Grafana dashboards

Import dashboard template 2949 and start monitoring.

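Monitoring Nginx with Prometheus

Commonly used monitoring expressions:
The $DomainName variable corresponds to server_name in the Nginx configuration. With it you can monitor QPS and the 2xx/3xx/4xx/5xx status codes separately for each server_name and upstream, as well as the QPS and response time of each backend server. If you do not need to distinguish by server_name, replace $DomainName with an asterisk: the vts module reports the aggregate of all server zones under a zone named *. A bare * is not a valid regular expression, so in that case use the exact matcher host="*" rather than host=~"*".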

Nginx QPS:
sum(irate(nginx_server_requests{code="total",host=~"$DomainName"}[5m]))
4xx responses per 10,000 requests (for 5xx, use code="5xx"):
(sum(irate(nginx_server_requests{code="4xx",host=~"$DomainName"}[5m])) / sum(irate(nginx_server_requests{code="total",host=~"$DomainName"}[5m]))) * 10000
Upstream QPS (example for the upstream group1):
sum(irate(nginx_upstream_requests{code="total",upstream="group1"}[5m]))
Response time of the upstream's backend servers (example for group1):
nginx_upstream_responseMsec{upstream="group1"}
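
These expressions can also be tried from the command line through the Prometheus HTTP API, which is handy when no Grafana dashboard is available yet. The following is a rough sketch, assuming Prometheus listens on its default port 9090; the helper names qps_expr and prom_query, the server address, and the domain www.example.com are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the QPS expression from this section for one host.
# $1 is substituted where the dashboards use $DomainName.
qps_expr() {
    printf 'sum(irate(nginx_server_requests{code="total",host=~"%s"}[5m]))\n' "$1"
}

# Hypothetical helper: run an instant query against a Prometheus server.
# $1 is the server base URL, $2 the PromQL expression.
prom_query() {
    # -G sends the URL-encoded data as a GET query string.
    curl -sG --data-urlencode "query=$2" "$1/api/v1/query"
}

# Example (the server address is an assumption):
#   prom_query http://127.0.0.1:9090 "$(qps_expr www.example.com)"
```

The response is a JSON document; its data.result array holds the value computed for the expression.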
