Monitoring my servers with Prometheus – Part 2
After sharing my previous post, I had some questions about which exporters I’m actually using, where, and why!
In this post I’ll tackle the exporters themselves, why I selected them and how they’re being used. If you haven’t read the previous article yet, I encourage you to take a look!
I have around a dozen Prometheus exporters running. Here’s what they are:
I’m using the Docker exporter for Prometheus. It lets me view a list of my containers along with their resource usage (processes, memory, storage consumption) in Grafana, and I can create alerts in Grafana on a per-container basis, which has been immensely helpful!
For my three cache servers (CDN work-in-progress), I run OpenResty, which is built on nginx. I use the default nginx exporter, which exposes all the metrics I need: total requests, active workers, etc.
After running into some queue issues with Postfix caused by a misconfigured Sendy instance, I decided to set up a Postfix exporter and give it its own Grafana dashboard. I set an alert for when the queue goes over 200 messages, and it sends me an email via a different relay.
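To make that threshold concrete, here is a minimal Python sketch of the same check done by hand. It assumes the queue is read from `postqueue -j` (Postfix 3.1 and later), which prints one JSON object per queued message. The function names and sample lines are made up for illustration, and this is not how the Postfix exporter itself works.

```python
import json
from collections import Counter

QUEUE_ALERT_THRESHOLD = 200  # the alert threshold mentioned in the post


def count_queued(postqueue_json_lines):
    """Count queued messages per Postfix queue name.

    Expects an iterable of lines as printed by `postqueue -j`,
    one JSON object per queued message.
    """
    counts = Counter()
    for line in postqueue_json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        counts[entry.get("queue_name", "unknown")] += 1
    return counts


def queue_over_threshold(counts, threshold=QUEUE_ALERT_THRESHOLD):
    """Return True when the total number of queued messages exceeds the threshold."""
    return sum(counts.values()) > threshold


# Made-up sample input, for illustration only.
sample = [
    '{"queue_name": "deferred", "queue_id": "A1"}',
    '{"queue_name": "active", "queue_id": "B2"}',
    '{"queue_name": "deferred", "queue_id": "C3"}',
]
sample_counts = count_queued(sample)
```

In practice the exporter publishes the queue size as a metric and the alert rule does the comparison, so a script like this is only useful for a quick sanity check on the box itself.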
Ever since I started running OpenLiteSpeed on my web hosting server (DirectAdmin), I wanted to be able to export the statistics. I know OLS has a web admin panel, but I’d prefer to have it unified in Grafana, so I’ve set up the exporter.
I also use the standard Prometheus Node Exporter for system metrics like bandwidth, CPU, RAM, and IO information. It’s been immensely helpful when I get high traffic loads and need to find where to optimize.
I’ve also written a handful of custom exporters using the Prometheus Python library.
The custom exporters I have running monitor database tables for specific values (e.g. fax logs), Asterisk logs (inbound/outbound call logs), and the uptime of custom services that require authentication.
The custom authentication modules are for my business’s applications that need two-factor authentication, so the test modules have their own tracked 2FA codes and do the whole sign-in process to ensure it’s working and stays up. I’ve set alerts for any time this process fails, and I’ve rigged Twilio to send me an SMS upon failure.
Custom exporters give me the freedom to monitor exactly what I need, and writing them is a fun learning experience at the same time. I’d encourage everyone to give it a try!
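Here is a small sketch of the idea behind such a check, assuming the 2FA codes are standard time-based one-time passwords (TOTP, RFC 6238 with HMAC-SHA1). That is an assumption for illustration, as is the `check_login` helper and the `sign_in` callable it takes, which stands in for whatever performs the real sign-in against the application.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at_time=None, step=30, digits=6):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a base32-encoded secret."""
    if at_time is None:
        at_time = time.time()
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(at_time) // step  # number of time steps since the epoch
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def check_login(sign_in, secret_b32):
    """Run the sign-in with a fresh code; return 1 if it succeeded, 0 otherwise.

    `sign_in` is a hypothetical callable that takes the 2FA code and
    returns a truthy value when the full sign-in flow worked.
    """
    try:
        return 1 if sign_in(totp(secret_b32)) else 0
    except Exception:
        # Any error during sign-in counts as the service being down.
        return 0
```

The 1 or 0 result maps naturally onto a gauge, so an alert can fire as soon as the value drops to 0.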
Next on tap for my monitoring is going to be aggregating total requests per second across all my systems, along with bandwidth consumption, and disk space usage. Then I’m going to look into exporting AWS SES statistics into Prometheus so I can display that relay alongside my Postfix outbound.
Why not InfluxDB?
InfluxDB is a great engine, but I’m just much more fond of, and experienced with, Prometheus. I enjoy the immense flexibility of Prometheus, which is easily available and well-documented.
I feel the Prometheus style matches my workflow better than the alternatives. I encourage you to find your own style and use what gives you the most flexibility, stability and extensibility.
Let me know your thoughts in the comments!