Spring Batch Run Multiple Jobs Example
Spring Batch is a powerful framework for handling large-volume batch processing. It provides tools for creating robust and scalable batch applications. Let us delve into understanding how to run multiple jobs in Spring Batch efficiently, exploring the techniques and configurations that make it possible to execute jobs concurrently or sequentially.
1. Introduction
A Spring Batch job consists of one or more steps, each step being a well-defined stage in the batch process. A typical chunk-oriented step involves three main components:
- Reader: The reader is responsible for reading input data from a specified source, such as a database, a file, or an API. It turns the input into items that can be processed further. Common implementations include FlatFileItemReader for reading from CSV files and JdbcCursorItemReader for database records.
- Processor: The processor applies business logic to the input data, transforming it into a desired output format. This might involve filtering, aggregating, or converting data. A processor can return null to indicate that a specific item should be filtered out and not passed to the writer.
- Writer: The writer takes the processed data and writes it to an output destination, such as another database, a file, or even another system. Examples include FlatFileItemWriter for writing to files and JdbcBatchItemWriter for writing to a database.
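The way these three components cooperate can be sketched in plain Java. This is a simplified model rather than Spring Batch code: the class ChunkSketch, its run method, and the chunk size of 2 are assumptions made for illustration. It reads a few items, drops any item for which the processor returns null, and hands the survivors to the "writer" one chunk at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Plain Java model of a chunk-oriented step; not Spring Batch API. */
class ChunkSketch {

    /**
     * Reads up to chunkSize items, processes them, and "writes" the survivors
     * as one chunk. Returns every chunk that was written, in order.
     */
    static <I, O> List<List<O>> run(Iterator<I> reader,
                                    Function<I, O> processor,
                                    int chunkSize) {
        List<List<O>> written = new ArrayList<>();
        while (reader.hasNext()) {
            // Read phase: collect up to chunkSize raw items.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: a null result filters the item out.
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                O result = processor.apply(item);
                if (result != null) {
                    processed.add(result);
                }
            }
            // Write phase: the writer receives the whole chunk at once.
            if (!processed.isEmpty()) {
                written.add(processed);
            }
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "skip", "b", "c");
        Function<String, String> upperUnlessSkip =
                s -> s.equals("skip") ? null : s.toUpperCase();
        System.out.println(run(input.iterator(), upperUnlessSkip, 2));
        // prints [[A], [B, C]]
    }
}
```

The first chunk contains only one item because "skip" was read as part of it and then filtered out by the processor.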
Jobs in Spring Batch can be executed in various ways depending on the requirements:
- Sequential Execution: In sequential execution, jobs are executed one after the other. This is useful when jobs have dependencies or need to be run in a specific order to maintain data consistency.
- Parallel Execution: Jobs can be configured to run concurrently, leveraging multi-threading or parallel processing to improve performance and reduce the total processing time. This is particularly useful for large datasets where each job operates independently.
- Conditional Execution: Jobs can be configured to execute based on specific conditions or decisions. For example, a job can decide the next step based on the outcome of the previous step. This is typically handled using decision components or job parameters.
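The conditional case can be sketched in plain Java as follows. This is only a model of the idea, assuming hypothetical job names ("load", "report", "cleanup") and a local Status enum; it does not use Spring Batch's own decision components.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Plain Java model of conditional execution; names are illustrative only. */
class ConditionalSketch {

    enum Status { COMPLETED, FAILED }

    /** Runs "load", then picks the follow-up based on its status. */
    static List<String> runPipeline(Supplier<Status> loadJob) {
        List<String> executed = new ArrayList<>();
        executed.add("load");
        Status status = loadJob.get();
        // The decision point: choose the next unit of work from the outcome.
        if (status == Status.COMPLETED) {
            executed.add("report");
        } else {
            executed.add("cleanup");
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(() -> Status.COMPLETED)); // prints [load, report]
        System.out.println(runPipeline(() -> Status.FAILED));    // prints [load, cleanup]
    }
}
```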
Understanding these core concepts is crucial for designing efficient batch processes that can handle large volumes of data with reliability and performance.
2. Configuration
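Spring Batch jobs are typically configured using Java-based configuration, which provides a clean and type-safe way of defining jobs, steps, and their components. The sample below uses JobBuilderFactory and StepBuilderFactory, the Spring Batch 4 style; Spring Batch 5 deprecates these factories in favor of JobBuilder and StepBuilder. Let’s break down the sample configuration: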
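    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.repeat.RepeatStatus;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    @EnableBatchProcessing
    public class BatchConfig {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;

        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        // job1: step1 followed by step2
        @Bean
        public Job job1() {
            return jobBuilderFactory.get("job1")
                    .start(step1())
                    .next(step2())
                    .build();
        }

        // job2: step3 only
        @Bean
        public Job job2() {
            return jobBuilderFactory.get("job2")
                    .start(step3())
                    .build();
        }

        @Bean
        public Step step1() {
            return stepBuilderFactory.get("step1")
                    .tasklet((contribution, chunkContext) -> {
                        System.out.println("Executing Step 1");
                        return RepeatStatus.FINISHED;
                    })
                    .build();
        }

        @Bean
        public Step step2() {
            return stepBuilderFactory.get("step2")
                    .tasklet((contribution, chunkContext) -> {
                        System.out.println("Executing Step 2");
                        return RepeatStatus.FINISHED;
                    })
                    .build();
        }

        @Bean
        public Step step3() {
            return stepBuilderFactory.get("step3")
                    .tasklet((contribution, chunkContext) -> {
                        System.out.println("Executing Step 3");
                        return RepeatStatus.FINISHED;
                    })
                    .build();
        }
    }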
2.1 Code Explanation
In this code, the @Configuration annotation marks the class as a source of bean definitions for the Spring application context. The @EnableBatchProcessing annotation enables Spring Batch features and provides a base configuration for setting up batch jobs. The @Autowired annotation is used to inject the JobBuilderFactory and StepBuilderFactory beans, which are used to create job and step definitions. In the job definition, job1 is defined to start with step1() and proceed to step2(), while job2 consists of step3() alone, without any subsequent steps. Each step serves as a building block of the job. In this configuration, step1 executes a tasklet that prints “Executing Step 1” and returns RepeatStatus.FINISHED to indicate completion; step2 and step3 do the same with “Executing Step 2” and “Executing Step 3”. A tasklet is a simple interface with a single method for executing a task, typically used for lightweight operations like logging or simple tasks that don’t require complex state management.
This configuration sets up two jobs, job1 and job2, with three steps in total. The jobs can be executed sequentially or based on specific requirements, providing flexibility in how batch processing is managed in a Spring application.
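To make the role of RepeatStatus more concrete, here is a minimal plain Java sketch of how a tasklet is driven: it is called again for as long as it reports that more work remains, and the step ends once it reports that it is finished. The RepeatStatus enum and Tasklet interface below are local stand-ins that only mirror the Spring Batch names; the real interface takes a StepContribution and a ChunkContext.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Plain Java model of how a tasklet is driven; local stand-ins, not Spring Batch types. */
class TaskletLoopSketch {

    enum RepeatStatus { CONTINUABLE, FINISHED }

    interface Tasklet {
        RepeatStatus execute();
    }

    /** Calls the tasklet until it reports FINISHED and returns the number of calls. */
    static int runStep(Tasklet tasklet) {
        int calls = 0;
        RepeatStatus status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == RepeatStatus.CONTINUABLE);
        return calls;
    }

    public static void main(String[] args) {
        // Finishes on the first call, like the three tasklets in BatchConfig.
        System.out.println(runStep(() -> RepeatStatus.FINISHED)); // prints 1

        // Asks to be called again until three units of work are done.
        AtomicInteger remaining = new AtomicInteger(3);
        System.out.println(runStep(() ->
                remaining.decrementAndGet() > 0
                        ? RepeatStatus.CONTINUABLE
                        : RepeatStatus.FINISHED)); // prints 3
    }
}
```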
3. Sequential Job Execution
In sequential execution, jobs are executed one after the other in a predefined order. This ensures that one job completes before the next one starts. Sequential execution is useful when jobs have dependencies or need to be run in a specific order to ensure data consistency or processing logic. In Spring Batch, this can be achieved by invoking jobs sequentially in the main application class. Let’s break down the example provided:
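    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    public class SpringBatchApplication implements CommandLineRunner {

        @Autowired
        private JobLauncher jobLauncher;

        // With two Job beans present, the field names select the bean to inject.
        @Autowired
        private Job job1;

        @Autowired
        private Job job2;

        public static void main(String[] args) {
            SpringApplication.run(SpringBatchApplication.class, args);
        }

        @Override
        public void run(String... args) throws Exception {
            // Each call returns only after the launched job has ended.
            jobLauncher.run(job1, new JobParameters());
            jobLauncher.run(job2, new JobParameters());
        }
    }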
3.1 Code Explanation and Output
In this code, the @SpringBootApplication annotation combines @Configuration, @EnableAutoConfiguration, and @ComponentScan, marking the main class of a Spring Boot application and enabling auto-configuration and component scanning. The CommandLineRunner interface is used to execute logic upon application startup, with the run method being overridden to define what should execute once the application starts. The @Autowired annotation injects the necessary beans, JobLauncher, job1, and job2, into the main application class. The JobLauncher is responsible for launching jobs, using its run method to execute a job with specific parameters. The JobParameters class is used to pass runtime parameters to the job, and in this case an empty JobParameters object is used. The main method triggers the application startup with SpringApplication.run, and the run method calls jobLauncher.run twice to execute job1 and job2 sequentially. Because the default JobLauncher runs a job synchronously on the calling thread, job2 starts only after job1 completes. Note that Spring Boot's batch auto-configuration also tries to launch Job beans at startup by default; setting spring.batch.job.enabled=false in application.properties leaves launching entirely to this runner.
When this application runs, the output will be the execution logs for job1 followed by the logs for job2. This confirms that the jobs are executed one after the other.
    Executing Step 1
    Executing Step 2
    Executing Step 3
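Why two consecutive launch calls produce strictly ordered output can be illustrated with a small plain Java model. It assumes a hypothetical LauncherSketch class that hands each job to an Executor; with a direct executor (Runnable::run) the work happens on the calling thread, which mirrors a synchronous launcher. It is not the Spring Batch JobLauncher itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executor;

/** Plain Java model of a launcher; not the Spring Batch JobLauncher. */
class LauncherSketch {

    private final Executor executor;
    private final List<String> log = Collections.synchronizedList(new ArrayList<>());

    LauncherSketch(Executor executor) {
        this.executor = executor;
    }

    /** Hands the job to the executor; with a direct executor this blocks until the job ends. */
    void run(String jobName) {
        executor.execute(() -> {
            log.add(jobName + " started");
            log.add(jobName + " finished");
        });
    }

    List<String> log() {
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        // Runnable::run executes the task on the calling thread, i.e. synchronously.
        LauncherSketch launcher = new LauncherSketch(Runnable::run);
        launcher.run("job1");
        launcher.run("job2");
        System.out.println(launcher.log());
        // prints [job1 started, job1 finished, job2 started, job2 finished]
    }
}
```

Passing a thread-pool-backed executor instead would let the two jobs overlap, at which point the order of the log entries is no longer guaranteed.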
4. Parallel Job Execution
Parallel execution in Spring Batch allows independent work to run concurrently, which can significantly speed up batch processing by utilizing multiple threads. This is especially useful when you have independent steps or flows that do not depend on each other. In Spring Batch, parallel execution can be achieved using a TaskExecutor (which manages thread pools) and split to run multiple flows in parallel. Below is an example that runs two flows in parallel within a single job:
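    // These beans are added to BatchConfig. Additional imports:
    import org.springframework.batch.core.job.builder.FlowBuilder;
    import org.springframework.batch.core.job.flow.Flow;
    import org.springframework.batch.core.job.flow.support.SimpleFlow;
    import org.springframework.core.task.TaskExecutor;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

    @Bean
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.setQueueCapacity(25);
        taskExecutor.initialize();
        return taskExecutor;
    }

    @Bean
    public Job parallelJob() {
        // step1 gets its own flow so that it finishes before the split starts;
        // chaining split(...) directly after start(step1()) can pull step1 into the split.
        Flow first = new FlowBuilder<SimpleFlow>("firstFlow").start(step1()).build();
        Flow parallel = new FlowBuilder<SimpleFlow>("splitFlow")
                .split(taskExecutor())
                .add(flow1(), flow2())
                .build();
        return jobBuilderFactory.get("parallelJob")
                .start(first)
                .next(parallel)
                .end()
                .build();
    }

    // The type argument is required; a raw FlowBuilder does not return a Flow from build().
    @Bean
    public Flow flow1() {
        return new FlowBuilder<SimpleFlow>("flow1").start(step2()).build();
    }

    @Bean
    public Flow flow2() {
        return new FlowBuilder<SimpleFlow>("flow2").start(step3()).build();
    }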
4.1 Code Explanation and Output
In this code, TaskExecutor is a Spring Framework abstraction for running tasks, here backed by a thread pool. In this example, a ThreadPoolTaskExecutor is configured with a core pool size of 4 threads, a maximum pool size of 10 threads, and a queue capacity of 25 tasks. The initialize() method is called to prepare the executor for use. The parallelJob bean defines a job named “parallelJob” that runs step1() first and then uses split to execute two flows, flow1() and flow2(), concurrently. The taskExecutor() bean supplies the threads on which these flows run. The flow1 and flow2 beans define separate flows that each consist of a single step: step2() for flow1() and step3() for flow2(). By using split, both flows can run in parallel on the threads provided by the TaskExecutor.
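How the three pool settings interact is easy to misread, so here is a small sketch using the JDK's ThreadPoolExecutor, to which ThreadPoolTaskExecutor delegates. The class PoolSizingSketch and its helper method are assumptions made for this illustration. Every submitted task blocks until released, which makes the pool size after a given number of submissions predictable: the pool stays at the core size while the queue fills, and only grows toward the maximum once the queue is full.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Shows how core size, max size, and queue capacity interact in a JDK thread pool. */
class PoolSizingSketch {

    /** Submits `tasks` blocking tasks and returns the resulting number of pool threads. */
    static int poolSizeAfter(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 10, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(25));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let every task finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(poolSizeAfter(4));  // prints 4: core threads only
        System.out.println(poolSizeAfter(29)); // prints 4: 25 tasks wait in the queue
        System.out.println(poolSizeAfter(30)); // prints 5: queue full, pool starts growing
        System.out.println(poolSizeAfter(35)); // prints 10: maximum reached
    }
}
```

With only two single-step flows, as in this article's job, the pool never needs more than two of its core threads; the larger limits matter when many flows or jobs share the executor.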
With this setup, step2() and step3() are executed in parallel after step1() finishes. The number of threads available for execution is controlled by the ThreadPoolTaskExecutor configuration. Because the two flows run on separate threads, the order of the last two lines can vary between runs:
    Executing Step 1
    Executing Step 2
    Executing Step 3
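The same "one step, then two flows side by side" shape can be modeled in plain Java with CompletableFuture. This is a sketch of the concurrency pattern only, assuming a hypothetical SplitSketch class and a two-thread pool; it is not how Spring Batch implements split internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Plain Java model of "one step, then two flows in parallel"; not Spring Batch API. */
class SplitSketch {

    static List<String> runJob() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // step1 runs first, on the calling thread.
            log.add("Executing Step 1");

            // The two flows are handed to the pool and may finish in either order.
            CompletableFuture<Void> flow1 =
                    CompletableFuture.runAsync(() -> log.add("Executing Step 2"), pool);
            CompletableFuture<Void> flow2 =
                    CompletableFuture.runAsync(() -> log.add("Executing Step 3"), pool);

            // The split is complete only when both flows have finished.
            CompletableFuture.allOf(flow1, flow2).join();
            return new ArrayList<>(log);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Step 1 is always first; the order of Step 2 and Step 3 is not fixed.
        System.out.println(runJob());
    }
}
```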
5. Using Job Scheduling
Spring Batch can be integrated with scheduling frameworks like Spring Scheduler or Quartz for periodic execution. Below is an example that uses Spring's @Scheduled annotation to launch job1 every hour:
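    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    public class JobScheduler {

        @Autowired
        private JobLauncher jobLauncher;

        @Autowired
        private Job job1;

        // Fires at second 0, minute 0 of every hour.
        @Scheduled(cron = "0 0 * * * ?")
        public void performJob() throws Exception {
            // A changing parameter makes each run a new job instance, so the steps
            // execute again. The name "startedAt" is an illustrative choice.
            JobParameters params = new JobParametersBuilder()
                    .addLong("startedAt", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(job1, params);
        }
    }
5.1 Code Explanation and Output
In this code, the @Component annotation marks the JobScheduler class as a Spring bean, enabling it for dependency injection and management by the Spring container. The @Autowired annotation injects the JobLauncher and job1 beans. The JobLauncher is responsible for launching the job, and job1 is the job to be executed. The @Scheduled annotation, with the cron expression "0 0 * * * ?", triggers the performJob() method every hour on the hour; for the annotation to take effect, scheduling must be enabled with @EnableScheduling on a configuration class. Inside performJob(), a JobParametersBuilder adds the current time as a parameter before jobLauncher.run(job1, params) launches the job. Spring Batch identifies a job instance by its name and identifying parameters, so a value that changes on every run lets each hourly launch execute the steps again rather than being treated as an instance that has already completed.
Once scheduled, the job runs automatically at the top of every hour, executing its steps as configured. An illustrative trace of one run, in which the start and completion lines stand in for the framework's own log messages:
    Job started at 00:00
    Executing Step 1
    Executing Step 2
    Job completed at 00:05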
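The effect of job parameters on repeated launches can be illustrated with a deliberately simplified plain Java model. The JobIdentitySketch class below is hypothetical and is not Spring Batch's JobRepository; it only captures the idea that a job name combined with its parameters identifies one instance, and that an instance which has already completed is not run a second time. The real framework reports this situation in its own way, for example by throwing JobInstanceAlreadyCompleteException.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Simplified model of job-instance identity; not Spring Batch's JobRepository. */
class JobIdentitySketch {

    private final Set<String> completed = new HashSet<>();

    /** Returns true if the job ran, false if this name + parameters already completed. */
    boolean launch(String jobName, Map<String, Object> params) {
        // TreeMap gives a stable key regardless of the map's iteration order.
        String instanceKey = jobName + new TreeMap<>(params);
        return completed.add(instanceKey);
    }

    public static void main(String[] args) {
        JobIdentitySketch repository = new JobIdentitySketch();
        System.out.println(repository.launch("job1", Map.of()));                // prints true
        System.out.println(repository.launch("job1", Map.of()));                // prints false
        System.out.println(repository.launch("job1", Map.of("startedAt", 1L))); // prints true
        System.out.println(repository.launch("job1", Map.of("startedAt", 2L))); // prints true
    }
}
```

This is why a scheduled launch benefits from a parameter that differs on every run, such as a timestamp.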
6. Dynamic Job Execution
Dynamic job execution involves determining the jobs to run at runtime based on specific conditions:
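    import java.util.Map;

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;

    @Component
    public class DynamicJobLauncher {

        @Autowired
        private JobLauncher jobLauncher;

        // Spring fills this map with every Job bean, keyed by bean name.
        @Autowired
        private Map<String, Job> jobs;

        public void launchJob(String jobName) throws Exception {
            Job job = jobs.get(jobName);
            if (job != null) {
                jobLauncher.run(job, new JobParameters());
            } else {
                throw new IllegalArgumentException("No such job configured: " + jobName);
            }
        }
    }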
6.1 Code Explanation
The DynamicJobLauncher class is a Spring-managed component that enables dynamic job execution based on a job name provided at runtime. It uses the @Autowired annotation to inject the JobLauncher and a map of jobs. Spring populates this map with all Job beans, using the bean names as keys and the job instances as values; in the configuration above the bean names job1 and job2 match the job names. The launchJob method retrieves a job from the jobs map using the provided job name. If the job is found, it is executed with jobLauncher.run, using an empty set of JobParameters. If the job name is invalid, an IllegalArgumentException is thrown, indicating that the job is not configured. This approach allows for the flexible and dynamic execution of jobs based on runtime conditions, making it well suited to cases where job execution needs to be decided programmatically.
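The lookup-and-launch pattern, including its failure path, can be tried out without a Spring context using the following plain Java sketch. It is a model under stated assumptions: the DynamicLaunchSketch class, its register method, and the Supplier stand-ins for jobs are hypothetical, and the error message is extended to list the known names as one possible refinement.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Plain Java model of name-based job lookup; jobs are stand-in Suppliers. */
class DynamicLaunchSketch {

    private final Map<String, Supplier<String>> jobs = new LinkedHashMap<>();

    void register(String name, Supplier<String> job) {
        jobs.put(name, job);
    }

    /** Runs the named job, or fails with a message listing the known names. */
    String launch(String jobName) {
        Supplier<String> job = jobs.get(jobName);
        if (job == null) {
            throw new IllegalArgumentException(
                    "No such job configured: " + jobName + " (known: " + jobs.keySet() + ")");
        }
        return job.get();
    }

    public static void main(String[] args) {
        DynamicLaunchSketch launcher = new DynamicLaunchSketch();
        launcher.register("job1", () -> "job1 done");
        launcher.register("job2", () -> "job2 done");
        // The name would typically come from a request or a command-line argument.
        String requested = args.length > 0 ? args[0] : "job2";
        System.out.println(launcher.launch(requested));
    }
}
```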
7. Conclusion
Running multiple jobs in Spring Batch can be achieved through various configurations such as sequential, parallel, scheduled, and dynamic executions. This flexibility allows developers to efficiently handle different batch processing requirements.