How to configure Web application request limits in WildFly

In this article you will learn which strategies you can adopt on the WildFly application server to limit the maximum number of concurrent requests, using either a programmatic approach (MicroProfile Fault Tolerance) or a declarative one (Undertow configuration).

Limiting the number of concurrent requests is crucial to keep an overloaded or failing service from harming the entire system. There are several approaches you can follow; this article discusses the following ones:

  • Using the MicroProfile Fault Tolerance API, which is available in WildFly, JBoss EAP, and Quarkus
  • Using Undertow’s server configuration, which allows limiting concurrent requests for specific Web path prefixes

Setting resource limits with the MicroProfile API

The MicroProfile Fault Tolerance specification provides annotations for policies such as retries, timeouts, bulkheads, and circuit breakers, which you can apply to any CDI element, including CDI beans and the MicroProfile REST Client.

There are a couple of patterns you can adopt to implement Fault Tolerance and control the maximum number of concurrent requests:

The @Bulkhead annotation limits the number of invocations that can run at the same time; additional requests have to wait or are rejected until running invocations finish. For example, the following code limits the number of concurrent executions to 5:

@Bulkhead(5)
public String getHelloBulkhead() {
  return "hello";
}

In the above example, once five invocations are in progress, any further request fails with a BulkheadException. To avoid this scenario, you can add the @Asynchronous annotation, which allows additional requests to be placed in a waiting queue whose size is set by waitingTaskQueue. Example:

@Asynchronous
@Bulkhead(value = 5, waitingTaskQueue = 10)
public Future getHelloBulkhead(Consumer c) {
        return CompletableFuture.completedFuture(null);
}
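
To make the rejection behaviour of a bulkhead without a waiting queue more concrete, here is a minimal plain-Java sketch built on a Semaphore. It only illustrates the idea; it is not how the MicroProfile implementation in WildFly works internally, and the class name SimpleBulkhead and the IllegalStateException used for rejection are assumptions made for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper that mimics a semaphore-style bulkhead in plain Java.
class SimpleBulkhead {
    private final Semaphore permits;

    public SimpleBulkhead(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the task if a slot is free; otherwise rejects the call immediately.
    public <T> T call(Supplier<T> task) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead full");
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always free the slot, even if the task fails
        }
    }

    // Number of invocations that could start right now.
    public int availableSlots() {
        return permits.availablePermits();
    }
}
```

With new SimpleBulkhead(5), a sixth call that arrives while five are still running is rejected instead of waiting, which mirrors the BulkheadException case described above.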

Using a CircuitBreaker Pattern

The MicroProfile Fault Tolerance specification also includes the Circuit Breaker pattern, which avoids making unnecessary calls to a component that keeps failing. The circuit breaker helps prevent cascading failures by tracking the ratio of failed calls inside a rolling window of recent requests and rejecting further calls once that ratio is reached.

As an example, let’s define a circuit breaker policy that opens the circuit after 3 errors in a window of 4 requests (a failureRatio of 0.75) and keeps it open for a delay of 1000 milliseconds:

@CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.75, delay = 1000)
public String processWithCircuitBreaker() {
   return "OK";
}
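
The arithmetic behind requestVolumeThreshold and failureRatio can be sketched in plain Java. The class below is a hypothetical model written for this article, not the implementation used by WildFly: it only covers the rolling-window check and deliberately leaves out the delay and the half-open state.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the rolling-window check; delay and half-open state are omitted.
class RollingWindowBreaker {
    private final int requestVolumeThreshold;
    private final double failureRatio;
    private final Deque<Boolean> window = new ArrayDeque<>();
    private boolean open;

    public RollingWindowBreaker(int requestVolumeThreshold, double failureRatio) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.failureRatio = failureRatio;
    }

    // Records one outcome (true = failed) and reports whether the circuit is open afterwards.
    public boolean record(boolean failed) {
        if (open) {
            return true; // once open, this simplified model stays open
        }
        window.addLast(failed);
        if (window.size() > requestVolumeThreshold) {
            window.removeFirst(); // keep only the most recent outcomes
        }
        if (window.size() == requestVolumeThreshold) {
            long failures = window.stream().filter(f -> f).count();
            open = (double) failures / requestVolumeThreshold >= failureRatio;
        }
        return open;
    }

    public boolean isOpen() {
        return open;
    }
}
```

With a threshold of 4 and a ratio of 0.75, the model opens as soon as the last four recorded outcomes contain three failures, and stays closed when only two of them failed.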

You can also mix @CircuitBreaker with other annotations such as @Timeout, @Fallback, @Bulkhead, or @Retry, as in the following example:

@CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.75, delay = 1000)
@Fallback(fallbackMethod = "onOverload")
public Future processWithCircuitBreaker(Consumer resultListener) {
    return CompletableFuture.completedFuture(null);
}

// Fallback to this method if processWithCircuitBreaker is overloaded
public Future onOverload(Consumer resultListener) {
    System.out.println("Please call me next time later");
    return CompletableFuture.completedFuture(null);
}


However, keep in mind that:
• If you use a @Fallback method, the fallback logic also executes when the circuit is open and a CircuitBreakerOpenException is thrown.
• When using @Retry, each retry attempt is processed by the Circuit Breaker and counts as either a success or a failure.
• Finally, if you also use @Bulkhead, the Circuit Breaker has priority: it is checked before a request attempts to enter the Bulkhead.

To learn more about the MicroProfile Fault Tolerance API, we recommend checking this article: Getting started with MicroProfile FaultTolerance API

Configuring request limits with Undertow

In the second part of this tutorial, we will learn how to manage the number of concurrent requests declaratively, using Undertow’s server configuration.

To do that, you have to perform the following steps:

  1. Define a request-limit filter with a max-concurrent-requests and a queue-size
  2. Apply the filter to a host in the server configuration
  3. Optionally, restrict the filter to a specific path-prefix, which can be the root Web application context or a single part of it.

Here is how you can do that for the /rest-demo Web application:

/subsystem=undertow/configuration=filter/request-limit=rslimit:add(max-concurrent-requests=10, queue-size=20)
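
The command above only defines the filter. Judging from the filter-ref element in the resulting configuration shown in this article, the filter is also attached to default-host with a path-prefix predicate. A command along the following lines should create that reference; the server name default-server and the host name default-host come from the default WildFly configuration and are assumptions if your setup differs:

```
/subsystem=undertow/server=default-server/host=default-host/filter-ref=rslimit:add(predicate="path-prefix('/rest-demo/')")
```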



Here is the resulting configuration:

<subsystem xmlns="urn:jboss:domain:undertow:12.0" default-server="default-server" default-virtual-host="default-host" default-servlet-container="default" default-security-domain="other" statistics-enabled="${wildfly.undertow.statistics-enabled:${wildfly.statistics-enabled:false}}">

 . . .
    <server name="default-server">
        <http-listener name="default" socket-binding="http" redirect-socket="https" enable-http2="true"/>
        <https-listener name="https" socket-binding="https" ssl-context="applicationSSC" enable-http2="true"/>
        <host name="default-host" alias="localhost">
            <location name="/" handler="welcome-content"/>
            <filter-ref name="rslimit" predicate="path-prefix('/rest-demo/')"/>
            <http-invoker http-authentication-factory="application-http-authentication"/>
        </host>
    </server>
    <servlet-container name="default">
     . . .
    </servlet-container>
    <handlers>
        <file name="welcome-content" path="${jboss.home.dir}/welcome-content"/>
    </handlers>
    <filters>
        <request-limit name="rslimit" max-concurrent-requests="10" queue-size="20"/>
    </filters>
 . . .
</subsystem>

In the above example, we set a max-concurrent-requests of 10 and a queue-size of 20. Once 10 requests are being processed, all new requests are placed into the queue. If there is no more room in the queue, further requests are rejected with a 503 Service Unavailable error.
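
The interplay between the two settings can be followed with a small bookkeeping model in plain Java. This is a hypothetical sketch written for this article to illustrate the counting only; it is not Undertow's actual request-limiting code, and the class and enum names are assumptions.

```java
// Hypothetical bookkeeping model of a request limit with a bounded queue.
class RequestLimitModel {
    public enum Outcome { RUNNING, QUEUED, REJECTED_503 }

    private final int maxConcurrentRequests;
    private final int queueSize;
    private int running;
    private int queued;

    public RequestLimitModel(int maxConcurrentRequests, int queueSize) {
        this.maxConcurrentRequests = maxConcurrentRequests;
        this.queueSize = queueSize;
    }

    // Decides what happens to a newly arriving request.
    public synchronized Outcome arrive() {
        if (running < maxConcurrentRequests) {
            running++;
            return Outcome.RUNNING;
        }
        if (queued < queueSize) {
            queued++;
            return Outcome.QUEUED;
        }
        return Outcome.REJECTED_503; // no free slot and no room in the queue
    }

    // Called when a running request finishes: a queued request, if any, takes over its slot.
    public synchronized void complete() {
        if (queued > 0) {
            queued--;
        } else if (running > 0) {
            running--;
        }
    }
}
```

With new RequestLimitModel(10, 20), the first 10 arrivals run, the next 20 are queued, and the 31st is rejected until a running request completes and frees a place.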

It is worth mentioning that you can improve the error response in two ways:

  • Configuring an error page in web.xml: