How to monitor WildFly with Prometheus

In this tutorial we will get started with the Prometheus platform by learning how to monitor the new metrics subsystem of the WildFly application server.

Prometheus is an open-source monitoring and alerting platform. Its main features are:

  • a multi-dimensional data model, with time series data identified by metric name and key/value pairs
  • a flexible query language (PromQL) to leverage the data model
  • autonomous single server nodes, therefore no reliance on distributed storage
  • time series collection via a pull model over HTTP
  • pushing time series is also supported, via an intermediary gateway
  • targets discovered via service discovery or static configuration
  • multiple modes of graphing and dashboard support

In order to exploit these features, Prometheus relies on several components:

  • the core Prometheus server, which scrapes and stores the time series data
  • client libraries for instrumenting application code
  • a push gateway for supporting short-lived jobs
  • an alertmanager to handle alerts
  • special-purpose exporters for services like HAProxy, StatsD, Graphite, etc.

In order to start Prometheus you can either use the precompiled binaries or run its Docker image as follows:

docker run \
    -p 9090:9090 \
    -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
    prom/prometheus
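
Whichever way you start it, you can quickly check that the server is up through its built-in health endpoint (a quick check, assuming the default 9090 port):

$ curl -s http://localhost:9090/-/healthy
# returns HTTP 200 and a short message as long as the server is healthy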

In the next section we will show how to install Prometheus on your Linux machine.

Installing Prometheus

Firstly, download the binaries from: https://prometheus.io/download/

$ wget https://github.com/prometheus/prometheus/releases/download/v2.32.1/prometheus-2.32.1.linux-amd64.tar.gz
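
Then, extract the archive and move into the resulting directory:

$ tar -xzf prometheus-2.32.1.linux-amd64.tar.gz
$ cd prometheus-2.32.1.linux-amd64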

You will end up with the following directory structure:

prometheus-2.32.1.linux-amd64
├── console_libraries
│   ├── menu.lib
│   └── prom.lib
├── consoles
│   ├── index.html.example
│   ├── node-cpu.html
│   ├── node-disk.html
│   ├── node.html
│   ├── node-overview.html
│   ├── prometheus.html
│   └── prometheus-overview.html
├── LICENSE
├── NOTICE
├── prometheus
├── prometheus.yml
└── promtool
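
A quick way to verify the download is to ask the two bundled binaries for their version:

$ ./prometheus --version
$ ./promtool --version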

The prometheus binary is the core server application. As we want to monitor the WildFly metrics, which are exposed on the management interface at localhost:9990, we will edit prometheus.yml to include the WildFly endpoint:

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# The scrape configuration. Besides Prometheus itself, we add the WildFly
# metrics endpoint as a target.
scrape_configs:

  # this is the configuration to scrape metrics from the WildFly metrics endpoint
  - job_name: 'metrics'
    scrape_interval: 15s

    static_configs:
      - targets: ['localhost:9990']

  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']
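
Before starting the server you can optionally validate the edited file with the bundled promtool utility, which reports whether the configuration is syntactically valid:

$ ./promtool check config prometheus.yml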

Next, start Prometheus from the installation directory (by default it reads prometheus.yml from the current directory):

$ ./prometheus
level=info ts=2019-02-08T15:44:41.660432983Z caller=main.go:303 build_context="(go=go1.11.5, [email protected], date=20190131-11:16:59)"
level=info ts=2019-02-08T15:44:41.660553379Z caller=main.go:304 host_details="(Linux 4.8.12-300.fc25.x86_64 #1 SMP Fri Dec 2 17:52:11 UTC 2016 x86_64 new-host (none))"
level=info ts=2019-02-08T15:44:41.660650441Z caller=main.go:305 fd_limits="(soft=1024, hard=4096)"
level=info ts=2019-02-08T15:44:41.660738167Z caller=main.go:306 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-02-08T15:44:41.662856196Z caller=main.go:620 msg="Starting TSDB ..."
level=info ts=2019-02-08T15:44:41.663085426Z caller=web.go:416 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-02-08T15:44:41.705189659Z caller=main.go:635 msg="TSDB started"
level=info ts=2019-02-08T15:44:41.705401172Z caller=main.go:695 msg="Loading configuration file" filename=prometheus.yml
level=info ts=2019-02-08T15:44:41.718157999Z caller=main.go:722 msg="Completed loading of configuration file" filename=prometheus.yml
level=info ts=2019-02-08T15:44:41.718361843Z caller=main.go:589 msg="Server is ready to receive web requests."
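
At this point you can already inspect the state of the scrape targets, either on the Status > Targets page of the web console or through the HTTP API; the metrics job will be reported as down until WildFly is started in the next step:

$ curl -s http://localhost:9090/api/v1/targets
# returns a JSON document listing each target together with its health (up or down)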

In another shell, we will start the WildFly application server. Before that, check that the application server configuration includes the metrics subsystem:

<subsystem xmlns="urn:wildfly:metrics:1.0" security-enabled="false" exposed-subsystems="*" prefix="${wildfly.metrics.prefix:wildfly}"/>

With that in place, start WildFly:

$ ./standalone.sh
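
Once the server is up, you can also double-check the subsystem configuration with the jboss-cli tool from the same bin folder. This is a minimal sketch, assuming the subsystem is registered under the metrics name, as in recent WildFly releases:

$ ./jboss-cli.sh --connect --command="/subsystem=metrics:read-resource"
# prints the security-enabled, exposed-subsystems and prefix attributes shown above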

Next, check the metrics endpoint at http://localhost:9990/metrics:

$ curl --silent http://localhost:9990/metrics | head
# HELP base:classloader_total_loaded_class_count Displays the total number of classes that have been loaded since the Java virtual machine has started execution.
# TYPE base:classloader_total_loaded_class_count counter
base:classloader_total_loaded_class_count 23056.0
# HELP base:cpu_system_load_average Displays the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. The way in which the load average is calculated is operating system specific but is typically a damped time-dependent average. If the load average is not available, a negative value is displayed. This attribute is designed to provide a hint about the system load and may be queried frequently. The load average may be unavailable on some platform where it is expensive to implement this method.
# TYPE base:cpu_system_load_average gauge
base:cpu_system_load_average 1.15
# HELP base:thread_count Number of currently deployed threads
# TYPE base:thread_count counter
base:thread_count 69.0
# HELP base:classloader_current_loaded_class_count Displays the number of classes that are currently loaded in the Java virtual machine.
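
Besides the base scope shown above, the subsystem also exposes per-subsystem metrics, prefixed with the configured prefix (wildfly, given the configuration above). A quick way to peek at them, keeping in mind that the exact names vary with the deployed subsystems and the WildFly version:

$ curl --silent http://localhost:9990/metrics | grep '^wildfly_' | head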

Ok, that's just a first glimpse of the WildFly metrics. Let's jump into the Prometheus console, which is available at http://localhost:9090

Now let's enter a WildFly metric such as "base:cpu_system_load_average" in the expression field and click Execute:

[Image: Prometheus expression console showing the value of the WildFly metric]

As you can see, the value of the metric (0.48) has been retrieved. If you want a basic graph of this metric, just click the Add Graph button.
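
The same expression can also be evaluated outside the console, through the query endpoint of the HTTP API (a quick sanity check, assuming the default port):

$ curl -s 'http://localhost:9090/api/v1/query?query=base:cpu_system_load_average'
# returns a JSON result carrying the latest sampled value of the metric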

Finally, if you select the Graph tab, you will see that an XY chart is now available for the metric:

[Image: Prometheus Graph tab charting the WildFly metric]

Conclusion

In this tutorial we have learned how to get started with Prometheus, installing the server binary and capturing WildFly metrics. Keep learning about monitoring WildFly in this tutorial: Using Prometheus and Grafana to capture Alerts and visualize Metrics

If you want to learn how to monitor a Spring Boot application with Prometheus, we recommend the following article: Monitoring Spring Boot with Prometheus and Micrometer