Using Prometheus and Grafana to capture Alerts and visualize Metrics

This is the third tutorial on using the Prometheus server to capture metrics from a MicroProfile-compatible runtime such as WildFly or Quarkus. The first two tutorials covered how to set up Prometheus to scrape MicroProfile metrics from WildFly (Monitoring WildFly with Prometheus) and Quarkus (Monitoring Quarkus with Prometheus). We will now learn how to use the Alertmanager to capture, group or forward our alerts. Then we will learn how to visualize our metrics in a Grafana dashboard.

This tutorial assumes that the Prometheus server is already installed, either locally or as a container. We will now take care of the Alertmanager component.

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of the following tasks:

  • Grouping categorizes alerts of similar nature into a single notification. This is especially useful during larger outages when many systems fail at once and hundreds to thousands of alerts may be firing simultaneously.
  • Inhibition suppresses notifications for certain alerts if certain other alerts are already firing.
  • Silences are a straightforward way to mute alerts for a given time.
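
As a rough illustration of inhibition, the fragment below is a minimal sketch of an inhibit_rules section for the Alertmanager configuration file (alertmanager.yml, covered later in this tutorial). It assumes a hypothetical second severity level, critical, which this tutorial does not define, and it uses the source_matchers/target_matchers syntax of recent Alertmanager releases (older releases use source_match/target_match instead).

```yaml
# Sketch only: 'critical' is a hypothetical severity value,
# not something defined elsewhere in this tutorial.
inhibit_rules:
  - source_matchers:
      - 'severity = "critical"'
    target_matchers:
      - 'severity = "major"'
    # Only mute the 'major' alert when both alerts share the same instance label
    equal: ['instance']
```

With a rule like this, a firing critical alert for an instance would mute notifications for major alerts on that same instance.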

Download the Alertmanager from: https://prometheus.io/download/

Once you have unpacked it, first configure Prometheus to send alerts to the Alertmanager on its default HTTP host and port. Make sure your prometheus.yml contains the following alerting settings:

alerting:
   alertmanagers:
     - scheme: http
       static_configs:
         - targets: ['localhost:9093']

Also, within prometheus.yml we will load the alerting rules from the external file rules.yml:

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules.yml"

Here is a minimal rules.yml file that defines a single alert, which fires when a target of the job named “metrics” is down:

groups:
- name: example
  rules:

  - alert: service_down
    expr: up{job="metrics"} == 0
    labels:
      severity: major
    annotations:
      description: Service {{ $labels.instance }} is unavailable.
      value: DOWN ({{ $value }})
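
As written, the rule fires as soon as a single evaluation finds the target down. A possible refinement, shown here only as a sketch, is to add a for clause so that the condition has to hold for some time before the alert fires. The one-minute duration is an assumption chosen for illustration, not a value from this tutorial.

```yaml
groups:
- name: example
  rules:
  - alert: service_down
    expr: up{job="metrics"} == 0
    # Assumption: require the target to stay down for one minute,
    # so that a single failed scrape does not trigger a notification.
    for: 1m
    labels:
      severity: major
    annotations:
      description: Service {{ $labels.instance }} is unavailable.
```

While the condition is true but the duration has not yet elapsed, Prometheus shows the alert as pending rather than firing.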

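This alert relies on the up metric, which Prometheus generates itself for every scrape target (1 if the last scrape succeeded, 0 if it failed). Here the target is the MicroProfile application scraped by the “metrics” job (see this tutorial to learn more: Monitoring Quarkus with Prometheus).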
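
For reference, the job label used in the rule comes from the scrape configuration in prometheus.yml. A minimal sketch of a matching fragment could look like the one below. The target localhost:8080 and the /metrics path are assumptions, so keep whatever you configured in the earlier tutorials.

```yaml
scrape_configs:
  # The job name must match the job="metrics" selector used in rules.yml
  - job_name: 'metrics'
    # Assumption: the application exposes its metrics at this path
    metrics_path: /metrics
    static_configs:
      # Assumption: the application listens on localhost:8080
      - targets: ['localhost:8080']
```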

Configuring the Alertmanager

With Prometheus configured, we will now edit the Alertmanager configuration file (alertmanager.yml) so that incoming alerts are routed to a receiver named gmail-notifications, which forwards them by email:

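global:
  # Alerts without an end time are declared resolved if not updated within this time
  resolve_timeout: 1m

route:
  # Default receiver for all incoming alerts
  receiver: 'gmail-notifications'

receivers:
- name: 'gmail-notifications'
  email_configs:
  - to: monitoringinstances@gmail.com
    from: monitoringinstances@gmail.com
    smarthost: smtp.gmail.com:587
    auth_username: monitoringinstances@gmail.com
    auth_identity: monitoringinstances@gmail.com
    # Placeholder: replace with your real credentials
    auth_password: password
    # Also send a notification when the alert is resolved
    send_resolved: true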
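
The route above only names a receiver. If you want explicit control over how alerts are grouped, the route section also accepts grouping settings. The fragment below is a sketch of what that could look like. The chosen labels and timings are illustrative assumptions, not values taken from this tutorial.

```yaml
route:
  receiver: 'gmail-notifications'
  # Alerts that share these label values end up in the same notification
  group_by: ['alertname', 'job']
  # How long to wait before sending the first notification for a new group
  group_wait: 30s
  # How long to wait before notifying about new alerts added to an existing group
  group_interval: 5m
  # How long to wait before re-sending a notification for alerts that are still firing
  repeat_interval: 4h
```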

Now let’s start Prometheus from the folder where you unpacked it (by default it reads the prometheus.yml file found in the current directory):

$ ./prometheus

Then, start the Alertmanager from another shell, in the folder that contains alertmanager.yml:

$ ./alertmanager

If your Enterprise application is not available at the moment, the Alerts page of the Prometheus console (http://localhost:9090/alerts) will show that the service_down rule is firing.

Now let’s move to the Alertmanager console (http://localhost:9093). The alert should be listed there, which means that the Alertmanager has received it and will route it to the receiver’s email address. If you have configured a valid email server and credentials, you will receive an email describing the alert.

Visualize your application Metrics with Grafana

The next step is to visualize your metrics in a Grafana Dashboard. Grafana is an open source visualization and analytics tool. It allows you to query, visualize, alert on, and explore your metrics no matter where they are stored. In short, it provides the tools to turn the data in your time-series database into graphs and visualizations.

Start by downloading Grafana from https://grafana.com/grafana/download.

Unzip the file and from the bin folder start it with:

$ ./grafana-server

The Grafana web console will be available at http://localhost:3000. Log in with the default credentials (admin/admin). As soon as you are logged in, add a Datasource.

Choose Prometheus as the Datasource type.

The Datasource will be bound by default to Prometheus default host/port (localhost:9090). If you didn’t change that, you can just finish the Datasource configuration.
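
As an alternative to the web console, Grafana can also load Datasources from provisioning files at startup. The fragment below is a minimal sketch of such a file. The file name prometheus.yaml and its location under conf/provisioning/datasources are assumptions based on a default standalone installation, and the URL assumes Prometheus runs on its default host and port.

```yaml
# Hypothetical file: conf/provisioning/datasources/prometheus.yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # Grafana's backend queries Prometheus on behalf of the browser
    access: proxy
    # Assumption: Prometheus is reachable on its default host and port
    url: http://localhost:9090
    isDefault: true
```

Grafana reads provisioning files when it starts, so restart grafana-server after adding the file.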

Next, add a new Dashboard. A Dashboard is a set of one or more panels organized and arranged into one or more rows. Add a panel, bind it to your Prometheus Datasource and choose a Query expression. The Query expression can be any metric published by your Enterprise service. For example, choose “base_memory_usedHeap_bytes”, which is the amount of used heap memory in bytes.

The heap memory used by your Enterprise application will then be plotted in the Dashboard panel, which refreshes at the interval you have selected.
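
The same metric can also drive a Prometheus alert, which ties the two halves of this tutorial together. The rule below is only a sketch that could be added to rules.yml. The alert name, the 90% threshold and the five-minute duration are hypothetical, and it assumes that your runtime also exposes the base_memory_maxHeap_bytes metric with a meaningful value.

```yaml
groups:
- name: memory
  rules:
  # Hypothetical rule: name, threshold and duration are illustrative
  - alert: heap_usage_high
    # Ratio of used heap to maximum heap; assumes maxHeap is reported by the runtime
    expr: base_memory_usedHeap_bytes / base_memory_maxHeap_bytes > 0.9
    for: 5m
    labels:
      severity: major
    annotations:
      description: Heap usage on {{ $labels.instance }} is above 90%.
```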

Conclusion

We have discussed how to use the Alertmanager to group and forward the alerts that Prometheus raises for our Enterprise application. We have then learnt how to visualize our metrics in Grafana by adding a Prometheus Datasource and plotting them in a Dashboard.