Jakarta Batch, formerly known as Java Batch, is a specification that provides a standardized approach for implementing batch processing in Java applications. It offers a robust and scalable framework for executing large-scale, long-running, and data-intensive tasks. In this tutorial, we will explore the process of running Jakarta Batch Jobs as standalone Java applications, discussing the essential steps, configuration, and use cases.
Overview of Jakarta Batch API
The Jakarta Batch API executes batch jobs described in the Job Specification Language (JSL), using two main programming models:
- A Chunk is a processing unit within a batch job that operates on a set of data records. It is designed for handling large volumes of data and is suitable for scenarios where data processing can be divided into discrete chunks. You can read more about it here: Batch Applications tutorial on WildFly
- A Batchlet, on the other hand, is a simpler processing unit within a batch job that performs a single, non-divisible task. It is typically used for non-data-oriented processing or when the processing logic doesn’t fit into a chunk-based model. You can read more about Batchlets here: How to run Jakarta Batchlets with WildFly
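To make the difference between the two models concrete, here is a minimal sketch in plain Java. It deliberately does not use the Jakarta Batch API: the class ProcessingModelsSketch and its methods runChunkStyle and runBatchletStyle are hypothetical names, and upper-casing a String stands in for real item processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-JDK illustration only: none of these names belong to the Jakarta Batch API.
class ProcessingModelsSketch {

    // Chunk style: read items one by one, process each, and write them out in groups of itemCount.
    static List<List<String>> runChunkStyle(List<String> input, int itemCount) {
        List<List<String>> writes = new ArrayList<>();      // each element represents one "write" call
        List<String> buffer = new ArrayList<>();
        for (String item : input) {                          // "read"
            buffer.add(item.toUpperCase(Locale.ROOT));       // "process"
            if (buffer.size() == itemCount) {                // chunk is full: "write" it and start a new one
                writes.add(new ArrayList<>(buffer));
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {                             // write the last, partially filled chunk
            writes.add(new ArrayList<>(buffer));
        }
        return writes;
    }

    // Batchlet style: one indivisible task that reports an exit status when it is done.
    static String runBatchletStyle(Runnable task) {
        task.run();
        return "COMPLETED";
    }

    public static void main(String[] args) {
        System.out.println(runChunkStyle(List.of("a", "b", "c", "d", "e"), 2));
        System.out.println(runBatchletStyle(() -> System.out.println("Hello World!")));
    }
}
```

With five input items and a chunk size of two, the chunk-style method performs three writes, while the batchlet-style method simply runs its task once.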
While the Jakarta Batch API is traditionally a server-side component of an architecture, you can also use it in a standalone Java application. Standalone Jakarta Batch jobs are well suited to long-running operations in the background, so that the main application remains responsive regardless of the task’s execution time. Batch executions are also resilient, because you can persist their state via JDBC. These features give Jakarta Batch several advantages over a basic cron-like job execution.
Defining the Batch Job
The first step is defining the job via the Job Specification Language (JSL). Let’s create this file named simplebatchlet.xml in the folder src\main\resources\META-INF\batch-jobs of a Maven project:
<job id="helloworldjob" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
    <step id="step1">
        <properties>
            <property name="say" value="Hello World!" />
        </properties>
        <batchlet ref="com.sample.jberet.HelloWorldBatchlet" />
    </step>
</job>
In this simple JSL file, “step1” executes a Batchlet and defines a step-level property named “say”. Our HelloWorldBatchlet will read this property at runtime.
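To see how the “say” property is nested inside the step, the following sketch parses the same JSL with the JDK's DOM parser and collects the step properties into a java.util.Properties object. This is only an illustration of the file structure, not how JBeret loads jobs: the class JslPropertiesSketch and the method stepProperties are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustration only: a batch runtime has its own JSL loader, this merely walks the XML.
class JslPropertiesSketch {

    // Same content as simplebatchlet.xml, inlined so the sketch is self-contained.
    static final String JSL =
          "<job id=\"helloworldjob\" xmlns=\"http://xmlns.jcp.org/xml/ns/javaee\" version=\"1.0\">"
        + "<step id=\"step1\"><properties>"
        + "<property name=\"say\" value=\"Hello World!\" />"
        + "</properties>"
        + "<batchlet ref=\"com.sample.jberet.HelloWorldBatchlet\" />"
        + "</step></job>";

    // Collects the <property> elements found below the step with the given id.
    // Simplification: nested properties (for example under <batchlet>) would be collected too.
    static Properties stepProperties(String jsl, String stepId) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(jsl.getBytes(StandardCharsets.UTF_8)));
            Properties result = new Properties();
            NodeList steps = doc.getElementsByTagName("step");
            for (int i = 0; i < steps.getLength(); i++) {
                Element step = (Element) steps.item(i);
                if (!stepId.equals(step.getAttribute("id"))) {
                    continue;
                }
                NodeList props = step.getElementsByTagName("property");
                for (int j = 0; j < props.getLength(); j++) {
                    Element p = (Element) props.item(j);
                    result.setProperty(p.getAttribute("name"), p.getAttribute("value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse JSL", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stepProperties(JSL, "step1").getProperty("say"));
    }
}
```

The result is a Properties object, which is also the type that StepContext.getProperties() returns to a Batchlet.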
Defining the Batchlet
Then, we will code a simple Batchlet, which is a concrete implementation of the jakarta.batch.api.AbstractBatchlet Class:
package com.sample.jberet;

import jakarta.batch.api.AbstractBatchlet;
import jakarta.batch.runtime.context.StepContext;
import jakarta.inject.Inject;

public class HelloWorldBatchlet extends AbstractBatchlet {

    @Inject
    StepContext stepContext;

    @Override
    public String process() {
        String say = stepContext.getProperties().getProperty("say");
        System.out.println(say);
        return "COMPLETED";
    }
}
Within it, we are using the @Inject annotation to inject the StepContext object into the HelloWorldBatchlet instance. StepContext provides contextual information about the current step, such as step properties and runtime information.
The process() method is an overridden method from the AbstractBatchlet class, and it contains the main logic of the Batchlet. Here’s what it does:
- It retrieves the value of a property named “say” from the StepContext using getProperties().getProperty("say"). The value of this property is assumed to be a String.
- It prints the value of “say” to the console using System.out.println(say).
- Finally, it returns the string “COMPLETED” to indicate the completion status of the Batchlet.
Running the Batch from a Main Java Class
You can trigger the execution of your job with a simple main Java class:
package com.sample.jberet;

import java.util.concurrent.TimeUnit;

import org.jberet.runtime.JobExecutionImpl;

import jakarta.batch.operations.JobOperator;
import jakarta.batch.runtime.BatchRuntime;

public class Main {

    private static final String simpleJob = "simplebatchlet.xml";
    private static final JobOperator jobOperator = BatchRuntime.getJobOperator();

    public static void main(String[] args) {
        try {
            final long jobExecutionId = jobOperator.start(simpleJob, null);
            final JobExecutionImpl jobExecution = (JobExecutionImpl) jobOperator.getJobExecution(jobExecutionId);
            jobExecution.awaitTermination(5, TimeUnit.MINUTES);
        } catch (Exception ex) {
            System.out.println("Error submitting Job! " + ex.getMessage());
            ex.printStackTrace();
        }
    }
}
The Main Java Class defines two constants, simpleJob and jobOperator:
- simpleJob represents the name of the batch job configuration file (“simplebatchlet.xml”).
- jobOperator is an instance of the JobOperator interface. It provides operations for managing batch jobs, such as starting, stopping, and querying job executions.
Within the main method:
- We use the JobOperator instance (jobOperator) to start a batch job execution by calling the start() method.
- Then, we retrieve the JobExecution instance corresponding to the started job execution using the getJobExecution() method of JobOperator. We cast the returned instance to JobExecutionImpl, which is a JBeret-specific implementation class.
- Finally, we call awaitTermination() on the JobExecutionImpl instance, specifying a timeout value of 5 and the TimeUnit.MINUTES constant. This method blocks the main thread until the job execution completes or the specified timeout is reached.
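If you prefer not to depend on the JBeret-specific JobExecutionImpl class, one possible alternative is to poll the batch status until the execution reaches a terminal state. The sketch below shows that waiting logic in plain Java so that it runs without any batch runtime. The class StatusPollingSketch, the method awaitTerminal and the local Status enum are hypothetical: the enum only mirrors the constant names of jakarta.batch.runtime.BatchStatus, and in a real application the supplier could wrap a call such as jobOperator.getJobExecution(jobExecutionId).getBatchStatus().

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class StatusPollingSketch {

    // Local stand-in that mirrors the constant names of jakarta.batch.runtime.BatchStatus,
    // so the sketch compiles without the Jakarta Batch API on the classpath.
    enum Status { STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED }

    // Statuses after which this execution is no longer running.
    static final Set<Status> TERMINAL =
            EnumSet.of(Status.STOPPED, Status.FAILED, Status.COMPLETED, Status.ABANDONED);

    // Polls statusSource until it reports a terminal status or the timeout elapses,
    // and returns the last status that was observed.
    static Status awaitTerminal(Supplier<Status> statusSource, long timeout, TimeUnit unit, long pollMillis) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        Status current = statusSource.get();
        while (!TERMINAL.contains(current) && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                return current;
            }
            current = statusSource.get();
        }
        return current;
    }

    public static void main(String[] args) {
        // Fake status source: STARTED for the first two polls, COMPLETED afterwards.
        AtomicInteger polls = new AtomicInteger();
        Supplier<Status> fake = () -> polls.getAndIncrement() < 2 ? Status.STARTED : Status.COMPLETED;
        System.out.println(awaitTerminal(fake, 5, TimeUnit.SECONDS, 10));
    }
}
```

Polling trades a little latency for portability: the caller only needs a way to read the current status, whereas awaitTermination() is available only on the JBeret implementation class.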
Testing our Batchlet
In addition to the main Java Class, we can also add a JUnit Test class to verify the Batchlet execution:
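// JUnit 4 imports are assumed here, based on the use of Assert.assertEquals
import java.util.concurrent.TimeUnit;

import org.jberet.runtime.JobExecutionImpl;
import org.junit.Assert;
import org.junit.Test;

import jakarta.batch.operations.JobOperator;
import jakarta.batch.runtime.BatchRuntime;
import jakarta.batch.runtime.BatchStatus;

public class SimpleIT {

    private static final String simpleJob = "simplebatchlet.xml";
    private static final JobOperator jobOperator = BatchRuntime.getJobOperator();

    @Test
    public void testSimpleBatchlet() throws Exception {
        // Start the job and wait (up to 5 minutes) for it to finish
        final long jobExecutionId = jobOperator.start(simpleJob, null);
        final JobExecutionImpl jobExecution = (JobExecutionImpl) jobOperator.getJobExecution(jobExecutionId);
        jobExecution.awaitTermination(5, TimeUnit.MINUTES);
        // The job is expected to end with the COMPLETED batch status
        Assert.assertEquals(BatchStatus.COMPLETED, jobExecution.getBatchStatus());
    }
}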
You can test the Batchlet execution with:
mvn install
In the console output, you should see the Hello World message as a result of the Batchlet execution.
You can also execute the Main Java Class with:
mvn install exec:java
Configuring JBeret engine
When using Batch Jobs within the WildFly container, you can configure job persistence and thread pools via the batch subsystem. When running as a standalone application, you can do it via a file named jberet.properties, which has to be placed in src\main\resources of your Maven project.
Here follows a sample jberet.properties file:
# Optional, valid values are jdbc (default), mongodb and in-memory
job-repository-type = jdbc

# Optional, default is jdbc:h2:~/jberet-repo for h2 database as the default job repository DBMS.
# For h2 in-memory database, db-url = jdbc:h2:mem:test;DB_CLOSE_DELAY=-1
# For mongodb, db-url includes all the parameters for MongoClientURI, including hosts, ports, username, password,
# Use the target directory to store the DB
db-url = jdbc:h2:./target/jberet-repo
db-user =sa
db-password =sa
db-properties =

# Configured: java.util.concurrent.ThreadPoolExecutor is created with thread-related properties as parameters.
thread-pool-type =

# New tasks are serviced first by creating core threads.
# Required for Configured type.
thread-pool-core-size =

# If all core threads are busy, new tasks are queued.
# int number indicating the size of the work queue. If 0 or negative, a java.util.concurrent.SynchronousQueue is used.
# Required for Configured type.
thread-pool-queue-capacity =

# If queue is full, additional non-core threads are created to service new tasks.
# int indicating the maximum size of the thread pool.
# Required for Configured type.
thread-pool-max-size =

# long number indicating the number of seconds a thread can stay idle.
# Required for Configured type.
thread-pool-keep-alive-time =

# Optional, valid values are true and false, defaults to false.
thread-pool-allow-core-thread-timeout =

# Optional, valid values are true and false, defaults to false.
thread-pool-prestart-all-core-threads =

# Optional, fully-qualified name of a class that implements java.util.concurrent.ThreadFactory.
# This property should not be needed in most cases.
thread-factory =

# Optional, fully-qualified name of a class that implements java.util.concurrent.RejectedExecutionHandler.
# This property should not be needed in most cases.
thread-pool-rejection-policy =
As you can see, this file relies on defaults for most settings, such as the thread pool. We have set job-repository-type to jdbc and pointed db-url to an H2 database stored in the target directory, so that job data is persisted on a database. For this to work, we need to add the H2 JDBC driver to our Maven project as follows:
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>2.2.220</version>
    <scope>runtime</scope>
</dependency>
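If you want to double-check how the entries of jberet.properties are read, for example the unusual spacing in "db-user =sa" or the keys with an empty value, the sketch below parses a few of them with the standard java.util.Properties class. It is only an assumption that this matches what you need to verify: JBeret's own loading logic is not shown here, and the class JberetPropertiesSketch and its parse method are hypothetical names. In a real project you would read the file from the classpath instead of an inlined String.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class JberetPropertiesSketch {

    // A few entries copied from the sample jberet.properties shown earlier.
    static final String SAMPLE = String.join("\n",
            "# Optional, valid values are jdbc (default), mongodb and in-memory",
            "job-repository-type = jdbc",
            "db-url = jdbc:h2:./target/jberet-repo",
            "db-user =sa",
            "thread-pool-type =");

    // Parses the text with the standard .properties rules of java.util.Properties.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        // Print each key with its value in quotes, so that empty values are visible.
        props.forEach((key, value) -> System.out.println(key + " -> '" + value + "'"));
    }
}
```

Whitespace around the equals sign is ignored, and a key followed by nothing yields an empty String rather than null.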
Source code: https://github.com/fmarchioni/mastertheboss/tree/master/batch/standalone
Acknowledgments: I’d like to express my gratitude to Cheng Fang (JBeret project Lead) for providing useful insights for writing this article.