How to run standalone Jakarta Batch Jobs

Jakarta Batch, formerly defined as JSR 352 (Batch Applications for the Java Platform), is a specification that provides a standardized approach for implementing batch processing in Java applications. It offers a robust and scalable framework for executing large-scale, long-running, and data-intensive tasks. In this tutorial, we will explore the process of running Jakarta Batch Jobs as standalone Java applications, discussing the essential steps, configuration, and use cases.

Overview of Jakarta Batch API

The Jakarta Batch API allows you to execute batch activities, defined with the Job Specification Language (JSL), using two main programming models:

  • A Chunk is a processing unit within a batch job that operates on a set of data records. It is designed for handling large volumes of data and is suitable for scenarios where data processing can be divided into discrete chunks.  You can read more about it here: Batch Applications tutorial on WildFly
  • A Batchlet, on the other hand, is a simpler processing unit within a batch job that performs a single, non-divisible task. It is typically used for non-data-oriented processing or when the processing logic doesn’t fit into a chunk-based model. You can read more about Batchlets here: How to run Jakarta Batchlets with WildFly

Although Jakarta Batch is traditionally a server-side component of an architecture, you can also use the Jakarta Batch API in a standalone Java application. Standalone Jakarta Batch tasks are ideal for executing long-running operations in the background, keeping the main application responsive and unaffected by the task's execution time. In addition, Jakarta Batch is resilient: you can persist job state via JDBC, so it offers several advantages over a basic cron-like job execution.
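To run batch jobs outside a container, you need a standalone batch runtime. A minimal sketch of the Maven dependencies, assuming the JBeret SE engine (the artifact ids are the real JBeret/Weld coordinates, but the version numbers are indicative and should be aligned with your JBeret release):

```xml
<!-- Jakarta Batch API -->
<dependency>
    <groupId>jakarta.batch</groupId>
    <artifactId>jakarta.batch-api</artifactId>
    <version>2.1.1</version>
</dependency>
<!-- JBeret batch engine and its Java SE bootstrap -->
<dependency>
    <groupId>org.jberet</groupId>
    <artifactId>jberet-core</artifactId>
    <version>2.1.1.Final</version>
</dependency>
<dependency>
    <groupId>org.jberet</groupId>
    <artifactId>jberet-se</artifactId>
    <version>2.1.1.Final</version>
</dependency>
<!-- CDI implementation, needed for @Inject in standalone mode -->
<dependency>
    <groupId>org.jboss.weld.se</groupId>
    <artifactId>weld-se-core</artifactId>
    <version>5.1.2.Final</version>
</dependency>
```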

Defining the Batch Job

The first step is defining the job via the Job Specification Language (JSL). Let's create a file named simplebatchlet.xml in the folder src/main/resources/META-INF/batch-jobs of a Maven project:

<job id="helloworldjob" xmlns="http://xmlns.jcp.org/xml/ns/javaee"
    version="1.0">
    <step id="step1">
        <properties>
            <property name="say" value="Hello World!" />
        </properties>
        <batchlet ref="com.sample.jberet.HelloWorldBatchlet" />
    </step>
</job>

In this simple JSL file we execute a Batchlet as part of "step1", passing it a property. The "say" property provides the message that our HelloWorldBatchlet will print.
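As a side note, a property value does not have to be hard-coded: the JSL supports substitution expressions, so the value could come from job parameters supplied at launch time. A hedged sketch using the specification's #{jobParameters[...]} syntax with a default value after the ?: operator:

```xml
<property name="say" value="#{jobParameters['say']}?:Hello World!;" />
```

With this variant, the step property resolves to the "say" job parameter if one is passed to JobOperator.start(), and falls back to "Hello World!" otherwise.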

Defining the Batchlet

Then, we will code a simple Batchlet, which is a concrete implementation of the abstract jakarta.batch.api.AbstractBatchlet class:

package com.sample.jberet;

import jakarta.batch.api.AbstractBatchlet;
import jakarta.batch.runtime.context.StepContext;
import jakarta.inject.Inject;

public class HelloWorldBatchlet extends AbstractBatchlet {
    @Inject
    StepContext stepContext;

    @Override
    public String process() {
        String say = stepContext.getProperties().getProperty("say");
        System.out.println(say);
        return "COMPLETED";
    }
}

Within it, we are using the @Inject annotation to inject the StepContext object into the HelloWorldBatchlet instance. StepContext provides contextual information about the current step, such as step properties and runtime information.

The process() method is an overridden method from the AbstractBatchlet class, and it contains the main logic of the Batchlet. Here’s what it does:

  • It retrieves the value of a property named “say” from the StepContext using getProperties().getProperty("say"). The value of this property is assumed to be a String.
  • It prints the value of “say” to the console using System.out.println(say).
  • Finally, it returns the string “COMPLETED” to indicate the completion status of the Batchlet.
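Besides process(), AbstractBatchlet also defines a stop() callback, which the batch runtime invokes when a running job is stopped via the JobOperator. For long-running Batchlets, a common pattern is to set a flag that the processing loop checks. The class below (LongRunningBatchlet is a hypothetical name, not part of the article's project) is a minimal sketch of that pattern:

```java
package com.sample.jberet;

import jakarta.batch.api.AbstractBatchlet;

public class LongRunningBatchlet extends AbstractBatchlet {

    // Flag checked by the processing loop; set asynchronously by stop()
    private volatile boolean stopRequested = false;

    @Override
    public String process() throws Exception {
        for (int i = 0; i < 100; i++) {
            if (stopRequested) {
                return "STOPPED";   // exit status reported for the step
            }
            Thread.sleep(100);      // placeholder for one unit of work
        }
        return "COMPLETED";
    }

    @Override
    public void stop() {
        // Called by the batch runtime when JobOperator.stop() is invoked
        stopRequested = true;
    }
}
```

Because process() runs on a batch thread while stop() is called from another thread, the flag is declared volatile so the update is visible across threads.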

Running the Batch from a Main Java Class

You can trigger the execution of your job with a simple main Java class:

package com.sample.jberet;

import java.util.concurrent.TimeUnit;

import org.jberet.runtime.JobExecutionImpl;

import jakarta.batch.operations.JobOperator;
import jakarta.batch.runtime.BatchRuntime;

public class Main {
    private static final String simpleJob = "simplebatchlet.xml";
    private static final JobOperator jobOperator = BatchRuntime.getJobOperator();
    public static void main(String[] args) {
        try {
            final long jobExecutionId = jobOperator.start(simpleJob, null);            
            final JobExecutionImpl jobExecution = (JobExecutionImpl) jobOperator.getJobExecution(jobExecutionId);
            jobExecution.awaitTermination(5, TimeUnit.MINUTES);
        } catch (Exception ex) {
            System.out.println("Error submitting Job! " + ex.getMessage());
            ex.printStackTrace();
        }
    }
}

The Main Java Class defines two constants: simpleJob and jobOperator.

  • simpleJob represents the name of the batch job configuration file (“simplebatchlet.xml”).
  • jobOperator is an instance of the JobOperator interface.  It provides operations for managing batch jobs, such as starting, stopping, and querying job executions.

Within the main method:

  • We are using the JobOperator instance (jobOperator) to start a batch job execution by calling the start() method.
  • Then, we retrieve the JobExecution instance corresponding to the started job execution using the getJobExecution() method of JobOperator, casting the returned instance to JobExecutionImpl, the JBeret-specific implementation class.
  • Finally, we call awaitTermination() on the JobExecutionImpl instance, specifying a timeout value of 5 minutes and the TimeUnit.MINUTES constant. This method blocks the main thread until the job execution completes or the specified timeout is reached.
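The second argument of start(), which we left as null, accepts a java.util.Properties object carrying job parameters. As a sketch, you could pass the message from the command line (the "greeting" parameter name is just an illustration and would need a matching #{jobParameters['greeting']} reference in the JSL file):

```java
package com.sample.jberet;

import java.util.Properties;

import jakarta.batch.operations.JobOperator;
import jakarta.batch.runtime.BatchRuntime;

public class MainWithParams {
    public static void main(String[] args) {
        JobOperator jobOperator = BatchRuntime.getJobOperator();

        // Job parameters are plain key/value pairs, resolvable in the
        // JSL via #{jobParameters['...']} substitution expressions
        Properties params = new Properties();
        params.setProperty("greeting",
                args.length > 0 ? args[0] : "Hello from job parameters!");

        long executionId = jobOperator.start("simplebatchlet.xml", params);
        System.out.println("Started job execution " + executionId);
    }
}
```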

Testing our Batchlet

In addition to the main Java Class, we can also add a JUnit Test class to verify the Batchlet execution:

package com.sample.jberet;

import java.util.concurrent.TimeUnit;

import org.jberet.runtime.JobExecutionImpl;
import org.junit.Assert;
import org.junit.Test;

import jakarta.batch.operations.JobOperator;
import jakarta.batch.runtime.BatchRuntime;
import jakarta.batch.runtime.BatchStatus;

public class SimpleIT {
    private static final String simpleJob = "simplebatchlet.xml";
    private static final JobOperator jobOperator = BatchRuntime.getJobOperator();
    @Test
    public void testSimpleBatchlet() throws Exception {
        final long jobExecutionId = jobOperator.start(simpleJob, null);
        final JobExecutionImpl jobExecution = (JobExecutionImpl) jobOperator.getJobExecution(jobExecutionId);
        jobExecution.awaitTermination(5, TimeUnit.MINUTES);
        Assert.assertEquals(BatchStatus.COMPLETED, jobExecution.getBatchStatus());
    }
}

You can test the Batchlet execution with:

mvn install

You should see the Hello World message in the Console output as a result of the Batchlet execution:

Hello World!

Alternatively, you can also execute the Main Java class with:

mvn install exec:java
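For the exec:java goal to pick up the right entry point, the exec-maven-plugin must be configured with the main class in the project's pom.xml. A minimal configuration sketch (the plugin version is indicative):

```xml
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <version>3.1.0</version>
    <configuration>
        <mainClass>com.sample.jberet.Main</mainClass>
    </configuration>
</plugin>
```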

Configuring JBeret engine

When using Batch Jobs within the WildFly container, you can configure job persistence and thread pools via the batch subsystem. When running as a standalone application, you can configure them via a file named jberet.properties, which has to be placed in src/main/resources of your Maven project.
Here is a sample jberet.properties file:

# Optional, valid values are jdbc (default), mongodb and in-memory
job-repository-type = jdbc

# Optional, default is jdbc:h2:~/jberet-repo for h2 database as the default job repository DBMS.
# For h2 in-memory database, db-url = jdbc:h2:mem:test;DB_CLOSE_DELAY=-1
# For mongodb, db-url includes all the parameters for MongoClientURI, including hosts, ports, username, password,

# Use the target directory to store the DB
db-url = jdbc:h2:./target/jberet-repo
db-user = sa
db-password = sa
db-properties =

# Configured: java.util.concurrent.ThreadPoolExecutor is created with thread-related properties as parameters.
thread-pool-type =

# New tasks are serviced first by creating core threads.
# Required for Configured type.
thread-pool-core-size =

# If all core threads are busy, new tasks are queued.
# int number indicating the size of the work queue. If 0 or negative, a java.util.concurrent.SynchronousQueue is used.
# Required for Configured type.
thread-pool-queue-capacity =

# If queue is full, additional non-core threads are created to service new tasks.
# int indicating the maximum size of the thread pool.
# Required for Configured type.
thread-pool-max-size =

# long number indicating the number of seconds a thread can stay idle.
# Required for Configured type.
thread-pool-keep-alive-time =

# Optional, valid values are true and false, defaults to false.
thread-pool-allow-core-thread-timeout =

# Optional, valid values are true and false, defaults to false.
thread-pool-prestart-all-core-threads =

# Optional, fully-qualified name of a class that implements java.util.concurrent.ThreadFactory.
# This property should not be needed in most cases.
thread-factory =

# Optional, fully-qualified name of a class that implements java.util.concurrent.RejectedExecutionHandler.
# This property should not be needed in most cases.
thread-pool-rejection-policy =

As you can see, this file largely relies on defaults for many settings, such as the thread pool. We have, however, changed the job-repository-type to persist jobs in a database (H2). In this case we need to add the H2 JDBC driver to our Maven project as follows:

<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>2.2.220</version>
    <scope>runtime</scope>
</dependency>

Source code: https://github.com/fmarchioni/mastertheboss/tree/master/batch/standalone

Acknowledgments: I'd like to express my gratitude to Cheng Fang (JBeret project Lead) for providing useful insights for writing this article.