One of the most important aspects of a system is its monitoring. Monitoring lets us respond proactively, which helps us keep the system stable and reliable.
The main source of health information is typically log files. Today there is a myriad of tools that can help with this, ranging from costly enterprise toolsets like Splunk to open source solutions like logstash or fluentd.
My preferred tool is fluentd, mainly because it is simple to set up and configure. It has been endorsed by big tech giants like AWS, Google and Microsoft, and it provides a huge list of plugins (500+) that can be used to read logs from components like Docker, MongoDB and nginx, as well as data writers that persist the data to a central repository like Elasticsearch, MongoDB or Hadoop.
Another important aspect for me is that it can easily be customised to handle custom log files from legacy systems and, most importantly, it is cross-platform.
In my line of work I build a lot of Windows services that crunch jobs on databases or other data sources, and more often than not clients don’t have a monitoring system. In this situation fluentd comes in very handy, as it provides an email plugin that can be used to send an email alert.
This allows the client to be notified in the event of an error and lets me react to the error straight away, rather than waiting for the client to discover the issue months later and then having to sift through gigabytes of logs.
So let’s dive in.
Installation on Windows is done through the following MSI.
Once the installation is complete you will need to create a configuration file. In it you define the location of your log and the actions you want to perform on it.
The following is a typical config file I use to parse log files from .NET applications.
path C:\Projects\ProjectName\log.txt # log location
format /^(?<time>[^ ]* [^ ,]*)(?<message>.*)$/
count_interval 3 # The time window for counting errors (in secs)
input_key message # The field to apply the regular expression
regexp \[Error\] # The regular expression to be applied
threshold 1 # The minimum number of errors to trigger an alert
add_tag_prefix catastrofic_error # Prefix added to the tag of the emitted alert events
@type stdout # Print to stdout for debugging
host smtp.gmail.com # Change this to your SMTP server host
port 587 # Normally 25/587/465 are used for submission
user xxxx # Use your username to log in
password xxxxx # Use your login password
enable_starttls_auto true # Use this option to enable STARTTLS
from sender@example.com # Set the sender address
to recipient@example.com # Set the recipient address
subject ' SERVICE ERRORS ()'
message Total Service Staging error count: %s\n\nPlease check environment ASAP
message_out_keys count # Use the "count" field to replace "%s" above
The config file is very simple. The first section defines the location of the log file and tells fluentd to read it with the tail input. It also defines a regular expression with named groups for the different parts of each log line. I am assuming each line starts with the time, followed by the message that contains the error text.
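To see what the format expression captures, here is a minimal sketch in Python. It is only an illustration: fluentd evaluates the pattern with Ruby's regex engine, so the named groups are respelled as (?P<name>...) here, and the sample log line is a hypothetical one that I assume resembles a typical .NET logger output.

```python
import re

# Python spelling of the `format` pattern from the config above.
# Ruby's (?<name>...) named groups become (?P<name>...) in Python.
LINE_FORMAT = re.compile(r"^(?P<time>[^ ]* [^ ,]*)(?P<message>.*)$")


def parse_line(line):
    """Split one log line into its time and message fields, or return None."""
    match = LINE_FORMAT.match(line)
    if match is None:
        return None
    return match.groupdict()


# Hypothetical log line (assumed layout: "<date> <time>,<millis> [Level] text").
sample = "2016-03-01 10:15:42,123 [Error] Job 42 failed"
record = parse_line(sample)
print(record["time"])     # 2016-03-01 10:15:42
print(record["message"])  # ,123 [Error] Job 42 failed
```

Note that the time group stops at the first comma, so with a layout like the one assumed here the milliseconds end up at the start of the message field.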
The second section applies the grep counter to each log entry, which lets us filter the lines that contain [Error]. We can also define the number of errors that will trigger the alert.
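The idea behind count_interval, regexp and threshold can be sketched roughly as follows. This is a simplified model, not the plugin's actual implementation: I assume fixed windows keyed on a timestamp in seconds, and the function name and sample records are made up for illustration.

```python
import re

ERROR_PATTERN = re.compile(r"\[Error\]")  # same expression as `regexp` in the config


def count_alerts(records, count_interval=3, threshold=1):
    """Return (window_start, count) for every window that reaches the threshold.

    `records` is an iterable of (timestamp_in_seconds, message) pairs.
    Only messages matching ERROR_PATTERN are counted.
    """
    counts = {}
    for timestamp, message in records:
        if ERROR_PATTERN.search(message):
            # Assign the record to the fixed window it falls into.
            window_start = int(timestamp // count_interval) * count_interval
            counts[window_start] = counts.get(window_start, 0) + 1
    return [(start, n) for start, n in sorted(counts.items()) if n >= threshold]


# Hypothetical records: three errors, two of them inside the first 3-second window.
records = [
    (0.5, "[Info] Job started"),
    (1.0, "[Error] Connection lost"),
    (2.9, "[Error] Retry failed"),
    (7.2, "[Error] Job aborted"),
]
print(count_alerts(records))               # [(0, 2), (6, 1)]
print(count_alerts(records, threshold=2))  # [(0, 2)]
```

With threshold 1, as in the config, every window containing at least one error produces an alert. Raising the threshold suppresses isolated errors.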
The last section is the mail output, which sends the email notification to the address defined in its parameters.
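As a rough picture of what the mail settings amount to, the sketch below builds and sends a comparable message with Python's standard library. It is not how the plugin is implemented. The function names and the example.com addresses are hypothetical, and I am assuming that the values of message_out_keys are substituted into the %s placeholders in order and that a literal \n in the template becomes a line break.

```python
import smtplib
from email.message import EmailMessage

# Same template as the `message` setting; the raw string keeps "\n" as two characters.
TEMPLATE = r"Total Service Staging error count: %s\n\nPlease check environment ASAP"


def render_body(template, record, out_keys):
    """Fill the %s placeholders with the record fields named in out_keys."""
    values = tuple(record[key] for key in out_keys)
    return template.replace("\\n", "\n") % values


def build_alert(record, sender, recipient, subject):
    """Create the alert email for one counter record (nothing is sent here)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(render_body(TEMPLATE, record, ["count"]))
    return msg


def send_alert(msg, host, port, user, password):
    """Submit the message over SMTP with STARTTLS, mirroring host/port/user/password."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


# Hypothetical counter record and addresses; send_alert is deliberately not called.
alert = build_alert({"count": 3}, "sender@example.com",
                    "recipient@example.com", "SERVICE ERRORS ()")
print(alert["Subject"])
print(render_body(TEMPLATE, {"count": 3}, ["count"]))
```

Keeping the rendering separate from the sending makes it easy to check the wording of the alert before pointing it at a real SMTP server.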