Spring Batch Run Multiple Jobs Example

Spring Batch is a powerful framework for handling large-volume batch processing. It provides tools for creating robust and scalable batch applications. Let us delve into understanding how to run multiple jobs in Spring Batch efficiently, exploring the techniques and configurations that make it possible to execute jobs concurrently or sequentially.

1. Introduction

A Spring Batch job consists of one or more steps, each a well-defined stage of the batch process. A typical chunk-oriented step has three main components:

  • Reader: The reader is responsible for reading input data from a specified source, such as a database, a file, or an API. It transforms the input into a format that can be processed further. Common implementations include FlatFileItemReader for reading from CSV files and JdbcCursorItemReader for database records.
  • Processor: The processor applies business logic to each item, transforming it into the desired output format. This might involve filtering, validating, or converting data. A processor can return null to signal that an item should be filtered out, in which case it is not passed to the writer.
  • Writer: The writer takes the processed data and writes it to an output destination, such as another database, a file, or even another system. Examples include FlatFileItemWriter for writing to files and JdbcBatchItemWriter for writing to a database.
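
The read-process-write cycle described above can be sketched in plain Java. This is only a simplified model of a chunk-oriented step, not Spring Batch's implementation: the class ChunkLoopSketch, the method runStep, and the chunk size of 2 are hypothetical names and values chosen for illustration, and transactions, restart, and error handling are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java model of a chunk-oriented step (hypothetical names, not Spring Batch code).
final class ChunkLoopSketch {

    // Reads up to chunkSize items, processes each one, then writes the surviving items together.
    static <I, O> void runStep(Iterator<I> reader,
                               Function<I, O> processor,
                               Consumer<List<O>> writer,
                               int chunkSize) {
        while (reader.hasNext()) {
            List<O> chunk = new ArrayList<>();
            for (int i = 0; i < chunkSize && reader.hasNext(); i++) {
                O processed = processor.apply(reader.next());
                if (processed != null) {   // null means "filter this item out"
                    chunk.add(processed);
                }
            }
            if (!chunk.isEmpty()) {
                writer.accept(chunk);      // one write call per chunk
            }
        }
    }

    // Upper-cases names, filters out empty ones, and collects what the writer receives.
    static List<List<String>> demo() {
        List<List<String>> written = new ArrayList<>();
        Function<String, String> processor = name -> name.isEmpty() ? null : name.toUpperCase();
        Consumer<List<String>> writer = written::add;
        runStep(List.of("alice", "", "bob", "carol").iterator(), processor, writer, 2);
        return written;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[ALICE], [BOB, CAROL]]
    }
}
```

The empty string is read as part of the first chunk but never reaches the writer, which mirrors what returning null from a processor is meant to achieve.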

Jobs in Spring Batch can be executed in various ways depending on the requirements:

  • Sequential Execution: In sequential execution, jobs are executed one after the other. This is useful when jobs have dependencies or need to be run in a specific order to maintain data consistency.
  • Parallel Execution: Jobs can be configured to run concurrently, leveraging multi-threading or parallel processing to improve performance and reduce the total processing time. This is particularly useful for large datasets where each job operates independently.
  • Conditional Execution: The flow of a job can branch on specific conditions or decisions. For example, a job can choose the next step based on the exit status of the previous step. This is typically handled with a JobExecutionDecider or with exit-status transitions in the job definition; job parameters can also feed into the decision.

Understanding these core concepts is crucial for designing efficient batch processes that can handle large volumes of data with reliability and performance.

2. Configuration

Spring Batch jobs are typically configured using Java-based configurations, which provide a clean and type-safe way of defining jobs, steps, and their components. Let’s break down the sample configuration provided:

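import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration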
@EnableBatchProcessing
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job job1() {
        return jobBuilderFactory.get("job1")
            .start(step1())
            .next(step2())
            .build();
    }

    @Bean
    public Job job2() {
        return jobBuilderFactory.get("job2")
            .start(step3())
            .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
            .tasklet((contribution, chunkContext) -> {
                System.out.println("Executing Step 1");
                return RepeatStatus.FINISHED;
            })
            .build();
    }

    @Bean
    public Step step2() {
        return stepBuilderFactory.get("step2")
            .tasklet((contribution, chunkContext) -> {
                System.out.println("Executing Step 2");
                return RepeatStatus.FINISHED;
            })
            .build();
    }

    @Bean
    public Step step3() {
        return stepBuilderFactory.get("step3")
            .tasklet((contribution, chunkContext) -> {
                System.out.println("Executing Step 3");
                return RepeatStatus.FINISHED;
            })
            .build();
    }
}

2.1 Code Explanation

In this code, the @Configuration annotation marks the class as a source of bean definitions for the Spring application context. The @EnableBatchProcessing annotation enables Spring Batch features and provides a base configuration for setting up batch jobs. The @Autowired annotation injects the JobBuilderFactory and StepBuilderFactory beans, which create the job and step definitions. Note that these factories belong to the Spring Batch 4.x API and are deprecated in Spring Batch 5, where JobBuilder and StepBuilder are used directly with a JobRepository. job1 starts with step1() and proceeds to step2(), while job2 consists of step3() alone. Each step is a tasklet step: step1 prints “Executing Step 1”, step2 prints “Executing Step 2”, and step3 prints “Executing Step 3”, and each returns RepeatStatus.FINISHED to indicate completion. Tasklet is a simple interface with a single execute method, typically used for lightweight, one-off operations such as logging, cleanup, or other work that does not fit the read-process-write model.

This configuration sets up two jobs, job1 and job2, built from three steps. Defining the jobs as beans does not determine how they are launched; the following sections show several ways to launch jobs like these.

3. Sequential Job Execution

In sequential execution, jobs are executed one after the other in a predefined order. This ensures that one job completes before the next one starts. Sequential execution is useful when jobs have dependencies or need to be run in a specific order to ensure data consistency or processing logic. In Spring Batch, this can be achieved by invoking jobs sequentially in the main application class. Let’s break down the example provided:

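import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication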
public class SpringBatchApplication implements CommandLineRunner {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job1;

    @Autowired
    private Job job2;

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchApplication.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        jobLauncher.run(job1, new JobParameters());
        jobLauncher.run(job2, new JobParameters());
    }
}

3.1 Code Explanation and Output

In this code, the @SpringBootApplication annotation combines @Configuration, @EnableAutoConfiguration, and @ComponentScan, marking the main class of a Spring Boot application and enabling auto-configuration and component scanning. Implementing CommandLineRunner lets the class run logic once the application context has started, through the overridden run method. The @Autowired annotation injects the JobLauncher, job1, and job2 beans; because several Job beans exist, each field is matched to a bean by its name. JobLauncher.run launches a job with the given JobParameters, here an empty set, and returns a JobExecution. With the default synchronous launcher each call blocks until the job has finished, so job2 starts only after job1 has ended. Note that a job which fails is normally reported through the status of the returned JobExecution rather than through an exception, so job2 is launched even if job1 fails unless that status is checked. Spring Boot can also launch Job beans automatically at startup (controlled by the spring.batch.job.enabled property), which you may want to disable when launching jobs yourself.

When this application runs, the output will be the execution logs for job1 followed by the logs for job2. This confirms that the jobs are executed one after the other.

Executing Step 1
Executing Step 2
Executing Step 3
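
When a later job depends on an earlier one, it is usually worth checking the outcome of each launch before starting the next. With Spring Batch this information is available from the JobExecution returned by jobLauncher.run. The plain-Java sketch below only models that control flow: SequentialLaunchSketch, Status, and runInOrder are hypothetical names, and each Supplier stands in for a blocking job launch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Plain-Java model of "launch jobs in order, stop at the first failure" (not Spring Batch code).
final class SequentialLaunchSketch {

    enum Status { COMPLETED, FAILED }

    // Runs each job in insertion order and returns the names of the jobs that were launched.
    static List<String> runInOrder(Map<String, Supplier<Status>> jobs) {
        List<String> launched = new ArrayList<>();
        for (Map.Entry<String, Supplier<Status>> entry : jobs.entrySet()) {
            launched.add(entry.getKey());
            Status status = entry.getValue().get(); // blocks until this job has finished
            if (status != Status.COMPLETED) {
                break;                               // do not start the dependent jobs
            }
        }
        return launched;
    }

    // job1 fails in this demo, so job2 is never launched.
    static List<String> demo() {
        Map<String, Supplier<Status>> jobs = new LinkedHashMap<>();
        jobs.put("job1", () -> Status.FAILED);
        jobs.put("job2", () -> Status.COMPLETED);
        return runInOrder(jobs);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [job1]
    }
}
```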

4. Parallel Job Execution

Parallel execution in Spring Batch can significantly speed up batch processing by utilizing multiple threads. This is especially useful when you have independent steps or flows that do not depend on each other. Within a single job, parallel execution can be achieved using a TaskExecutor (which manages a thread pool) together with split, which runs multiple flows in parallel. Separate jobs can also be launched concurrently by configuring the JobLauncher with an asynchronous TaskExecutor, which is not shown here. Below is an example that runs two flows of one job in parallel:

@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setCorePoolSize(4);
    taskExecutor.setMaxPoolSize(10);
    taskExecutor.setQueueCapacity(25);
    taskExecutor.initialize();
    return taskExecutor;
}

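@Bean
public Job parallelJob() {
    // Build the split as its own flow (the pattern used in the Spring Batch
    // reference documentation) so that only flow1 and flow2 run in parallel.
    Flow splitFlow = new FlowBuilder<Flow>("splitFlow")
        .split(taskExecutor())
        .add(flow1(), flow2())
        .build();

    // step1 runs first; the split starts once it has finished.
    Flow mainFlow = new FlowBuilder<Flow>("mainFlow")
        .start(step1())
        .next(splitFlow)
        .build();

    return jobBuilderFactory.get("parallelJob")
        .start(mainFlow)
        .end()
        .build();
}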

@Bean
public Flow flow1() {
    // The type argument is required: build() on a raw FlowBuilder returns Object.
    return new FlowBuilder<Flow>("flow1").start(step2()).build();
}

@Bean
public Flow flow2() {
    return new FlowBuilder<Flow>("flow2").start(step3()).build();
}

4.1 Code Explanation and Output

In this code, TaskExecutor is a Spring Framework abstraction for running tasks, typically backed by a thread pool. In this example, a ThreadPoolTaskExecutor is configured with a core pool size of 4 threads, a maximum pool size of 10 threads, and a queue capacity of 25 tasks. The initialize() method is called to prepare the executor for use. The parallelJob bean defines a job named “parallelJob” that runs step1() first and then uses split to execute two flows, flow1() and flow2(), concurrently on the threads supplied by taskExecutor(). The flow1 and flow2 beans define separate flows that each consist of a single step: step2() for flow1() and step3() for flow2().

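With this setup, step2() and step3() run in parallel after step1() finishes, and the ThreadPoolTaskExecutor configuration controls how many threads are available. Because the two steps run on different threads, their relative order in the output can vary between runs; the notes in parentheses below are annotations, not text printed by the steps.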

Executing Step 1
Executing Step 2 (Parallel with Step 3)
Executing Step 3 (Parallel with Step 2)
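
The control flow of a split (run one step, fan out to two flows, wait for both) can be illustrated with the JDK's own concurrency utilities. This is a rough analogy rather than how Spring Batch implements split: SplitSketch and its run method are hypothetical names, and a fixed pool of two threads stands in for the ThreadPoolTaskExecutor.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogy for "step1, then flow1 and flow2 in parallel" (not Spring Batch code).
final class SplitSketch {

    // Returns the messages in the order in which they were actually recorded.
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>(); // safe to append from several threads
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            log.add("Executing Step 1");                 // runs alone, before the split

            CompletableFuture<Void> flow1 =
                CompletableFuture.runAsync(() -> log.add("Executing Step 2"), pool);
            CompletableFuture<Void> flow2 =
                CompletableFuture.runAsync(() -> log.add("Executing Step 3"), pool);

            CompletableFuture.allOf(flow1, flow2).join(); // wait until both flows are done
        } finally {
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        // Step 1 is always first; the order of Step 2 and Step 3 may differ between runs.
        run().forEach(System.out::println);
    }
}
```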

5. Using Job Scheduling

Spring Batch can be integrated with scheduling frameworks like Spring Scheduler or Quartz for periodic execution. Below is an example that launches job1 on a schedule using Spring's @Scheduled annotation:

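import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class JobScheduler {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job1;

    @Scheduled(cron = "0 0 * * * ?")
    public void performJob() throws Exception {
        // A unique parameter ("launchTime" is an arbitrary name) gives each run its own job instance.
        JobParameters params = new JobParametersBuilder()
            .addLong("launchTime", System.currentTimeMillis())
            .toJobParameters();
        jobLauncher.run(job1, params);
    }
}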

5.1 Code Explanation and Output

In this code, the @Component annotation marks the JobScheduler class as a Spring bean, enabling it for dependency injection and management by the Spring container. The @Autowired annotation injects the JobLauncher, which launches the job, and job1, the job to be executed. The @Scheduled annotation, with the cron expression "0 0 * * * ?", triggers performJob() every hour on the hour; scheduling must be switched on with @EnableScheduling on a configuration class for the annotation to take effect. Inside performJob(), jobLauncher.run launches job1. Because Spring Batch identifies a job instance by the job name together with its identifying parameters, each scheduled launch should carry a unique parameter, such as a timestamp; launches that reuse the same parameters all target the same job instance, and an instance that has already completed is not executed again from scratch.

Once scheduled, the job runs automatically at the top of every hour, executing its steps as configured. An illustrative run is shown below; the start and completion lines stand in for the framework's own log output and are not printed by the sample steps.

Job started at 00:00
Executing Step 1
Executing Step 2
Job completed at 00:05
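
Spring cron expressions have six fields: second, minute, hour, day of month, month, and day of week. The expression "0 0 * * * ?" therefore matches second 0 of minute 0 of every hour. As a minimal sketch of what that means, the plain-Java helper below computes the next matching time for this one expression only; HourlyCronSketch and nextFireTime are hypothetical names, and this is not Spring's cron parser.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Computes the next top-of-the-hour instant, i.e. the next match for "0 0 * * * ?".
final class HourlyCronSketch {

    // Returns the first time strictly after 'now' whose minute and second are both zero.
    static LocalDateTime nextFireTime(LocalDateTime now) {
        return now.truncatedTo(ChronoUnit.HOURS).plusHours(1);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2024, 1, 1, 9, 17, 30);
        System.out.println(nextFireTime(now)); // prints 2024-01-01T10:00
    }
}
```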

6. Dynamic Job Execution

Dynamic job execution involves determining the jobs to run at runtime based on specific conditions:

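import java.util.Map;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component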
public class DynamicJobLauncher {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Map<String, Job> jobs;

    public void launchJob(String jobName) throws Exception {
        Job job = jobs.get(jobName);
        if (job != null) {
            jobLauncher.run(job, new JobParameters());
        } else {
            throw new IllegalArgumentException("No such job configured: " + jobName);
        }
    }
}

6.1 Code Explanation

The DynamicJobLauncher class is a Spring-managed component that enables dynamic job execution based on a job name provided at runtime. It uses the @Autowired annotation to inject the JobLauncher and a map of all Job beans. Spring populates such a map with bean names as keys; for the configuration above these are the @Bean method names (job1, job2), which happen to match the job names. The launchJob method retrieves a job from the map using the provided name. If the job is found, it is executed with jobLauncher.run, using an empty set of JobParameters. If no job is registered under that name, an IllegalArgumentException is thrown. This approach suits cases where the job to run is decided programmatically, for example from a command-line argument or a request parameter.
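
The lookup-by-name pattern itself does not depend on Spring. The following plain-Java sketch shows the same idea with a Runnable standing in for a job; JobRegistrySketch, register, and launch are hypothetical names. It also lists the known names in the error message, which can make a mistyped job name easier to diagnose.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of launching a job by name (not Spring Batch code).
final class JobRegistrySketch {

    private final Map<String, Runnable> jobs = new LinkedHashMap<>();

    // Registers a job under the name callers will use to launch it.
    void register(String name, Runnable job) {
        jobs.put(name, job);
    }

    // Runs the job registered under 'name', or fails with a message listing the known names.
    void launch(String name) {
        Runnable job = jobs.get(name);
        if (job == null) {
            throw new IllegalArgumentException(
                "No such job configured: " + name + " (known jobs: " + jobs.keySet() + ")");
        }
        job.run();
    }

    public static void main(String[] args) {
        JobRegistrySketch registry = new JobRegistrySketch();
        registry.register("job1", () -> System.out.println("Running job1"));
        registry.register("job2", () -> System.out.println("Running job2"));

        // The job name would normally come from outside, e.g. a command-line argument.
        String requested = args.length > 0 ? args[0] : "job2";
        registry.launch(requested);
    }
}
```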

7. Conclusion

Running multiple jobs in Spring Batch can be achieved through various configurations such as sequential, parallel, scheduled, and dynamic executions. This flexibility allows developers to efficiently handle different batch processing requirements.

Yatin Batra

An experienced full-stack engineer well versed in Core Java, Spring/Spring Boot, MVC, Security, AOP, Frontend (Angular & React), and cloud technologies (such as AWS, GCP, Jenkins, Docker, K8s).