From c08c1c4a270e7e4f5c1329a2056251d8bb0c3897 Mon Sep 17 00:00:00 2001 From: ZhangKai Date: Mon, 7 Mar 2022 21:39:22 +0800 Subject: [PATCH] #27 Spring batch --- docs/en/spring-batch/appendix.md | 6 +- docs/en/spring-batch/common-patterns.md | 20 +- docs/en/spring-batch/domain.md | 28 +-- docs/en/spring-batch/glossary.md | 4 +- docs/en/spring-batch/job.md | 56 +++--- docs/en/spring-batch/jsr-352.md | 40 ++-- .../en/spring-batch/monitoring-and-metrics.md | 8 +- docs/en/spring-batch/processor.md | 10 +- docs/en/spring-batch/readersAndWriters.md | 184 +++++++++--------- docs/en/spring-batch/repeat.md | 18 +- docs/en/spring-batch/retry.md | 20 +- docs/en/spring-batch/scalability.md | 16 +- docs/en/spring-batch/schema-appendix.md | 30 +-- .../spring-batch/spring-batch-integration.md | 30 +-- docs/en/spring-batch/spring-batch-intro.md | 12 +- docs/en/spring-batch/step.md | 84 ++++---- docs/en/spring-batch/testing.md | 14 +- docs/en/spring-batch/transaction-appendix.md | 18 +- docs/en/spring-batch/whatsnew.md | 32 +-- docs/spring-batch/appendix.md | 6 +- docs/spring-batch/common-patterns.md | 20 +- docs/spring-batch/domain.md | 28 +-- docs/spring-batch/glossary.md | 4 +- docs/spring-batch/job.md | 56 +++--- docs/spring-batch/jsr-352.md | 40 ++-- docs/spring-batch/monitoring-and-metrics.md | 8 +- docs/spring-batch/processor.md | 10 +- docs/spring-batch/readersAndWriters.md | 184 +++++++++--------- docs/spring-batch/repeat.md | 18 +- docs/spring-batch/retry.md | 20 +- docs/spring-batch/scalability.md | 16 +- docs/spring-batch/schema-appendix.md | 30 +-- docs/spring-batch/spring-batch-integration.md | 30 +-- docs/spring-batch/spring-batch-intro.md | 12 +- docs/spring-batch/step.md | 84 ++++---- docs/spring-batch/testing.md | 14 +- docs/spring-batch/transaction-appendix.md | 18 +- docs/spring-batch/whatsnew.md | 32 +-- 38 files changed, 630 insertions(+), 630 deletions(-) diff --git a/docs/en/spring-batch/appendix.md b/docs/en/spring-batch/appendix.md index 0ab894e..10e84b5 100644 --- a/docs/en/spring-batch/appendix.md +++ b/docs/en/spring-batch/appendix.md @@ -1,6 +1,6 @@ -## [](#listOfReadersAndWriters)Appendix A: List of ItemReaders and ItemWriters +## Appendix A: List of ItemReaders and ItemWriters -### [](#itemReadersAppendix)Item Readers +### Item Readers | Item Reader | Description | |----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| @@ -24,7 +24,7 @@ | StaxEventItemReader | Reads via StAX. see [`StaxEventItemReader`](readersAndWriters.html#StaxEventItemReader). | | JsonItemReader | Reads items from a Json document. see [`JsonItemReader`](readersAndWriters.html#JsonItemReader). 
| -### [](#itemWritersAppendix)Item Writers +### Item Writers | Item Writer | Description | |--------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| diff --git a/docs/en/spring-batch/common-patterns.md b/docs/en/spring-batch/common-patterns.md index ed4f833..f97609e 100644 --- a/docs/en/spring-batch/common-patterns.md +++ b/docs/en/spring-batch/common-patterns.md @@ -1,6 +1,6 @@ # Common Batch Patterns -## [](#commonPatterns)Common Batch Patterns +## Common Batch Patterns XMLJavaBoth @@ -15,7 +15,7 @@ to implement an `ItemWriter` or `ItemProcessor`. In this chapter, we provide a few examples of common patterns in custom business logic. These examples primarily feature the listener interfaces. It should be noted that an`ItemReader` or `ItemWriter` can implement a listener interface as well, if appropriate. -### [](#loggingItemProcessingAndFailures)Logging Item Processing and Failures +### Logging Item Processing and Failures A common use case is the need for special handling of errors in a step, item by item, perhaps logging to a special channel or inserting a record into a database. A @@ -73,7 +73,7 @@ public Step simpleStep() { | |if your listener does anything in an `onError()` method, it must be inside
a transaction that is going to be rolled back. If you need to use a transactional
resource, such as a database, inside an `onError()` method, consider adding a declarative
transaction to that method (see Spring Core Reference Guide for details), and giving its
propagation attribute a value of `REQUIRES_NEW`.| |---|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -### [](#stoppingAJobManuallyForBusinessReasons)Stopping a Job Manually for Business Reasons +### Stopping a Job Manually for Business Reasons Spring Batch provides a `stop()` method through the `JobOperator` interface, but this is really for use by the operator rather than the application programmer. Sometimes, it is @@ -178,7 +178,7 @@ public class CustomItemWriter extends ItemListenerSupport implements StepListene When the flag is set, the default behavior is for the step to throw a`JobInterruptedException`. This behavior can be controlled through the`StepInterruptionPolicy`. However, the only choice is to throw or not throw an exception, so this is always an abnormal ending to a job. -### [](#addingAFooterRecord)Adding a Footer Record +### Adding a Footer Record Often, when writing to flat files, a “footer” record must be appended to the end of the file, after all processing has be completed. This can be achieved using the`FlatFileFooterCallback` interface provided by Spring Batch. The `FlatFileFooterCallback`(and its counterpart, the `FlatFileHeaderCallback`) are optional properties of the`FlatFileItemWriter` and can be added to an item writer. @@ -224,7 +224,7 @@ public interface FlatFileFooterCallback { } ``` -#### [](#writingASummaryFooter)Writing a Summary Footer +#### Writing a Summary Footer A common requirement involving footer records is to aggregate information during the output process and to append this information to the end of the file. This footer often @@ -331,7 +331,7 @@ retrieves any existing `totalAmount` from the `ExecutionContext` and uses it as starting point for processing, allowing the `TradeItemWriter` to pick up on restart where it left off the previous time the `Step` was run. -### [](#drivingQueryBasedItemReaders)Driving Query Based ItemReaders +### Driving Query Based ItemReaders In the [chapter on readers and writers](readersAndWriters.html), database input using paging was discussed. Many database vendors, such as DB2, have extremely pessimistic @@ -360,7 +360,7 @@ An `ItemProcessor` should be used to transform the key obtained from the driving into a full `Foo` object. An existing DAO can be used to query for the full object based on the key. -### [](#multiLineRecords)Multi-Line Records +### Multi-Line Records While it is usually the case with flat files that each record is confined to a single line, it is common that a file might have records spanning multiple lines with multiple @@ -515,7 +515,7 @@ public Trade read() throws Exception { } ``` -### [](#executingSystemCommands)Executing System Commands +### Executing System Commands Many batch jobs require that an external command be called from within the batch job. Such a process could be kicked off separately by the scheduler, but the advantage of @@ -553,7 +553,7 @@ public SystemCommandTasklet tasklet() { } ``` -### [](#handlingStepCompletionWhenNoInputIsFound)Handling Step Completion When No Input is Found +### Handling Step Completion When No Input is Found In many batch scenarios, finding no rows in a database or file to process is not exceptional. 
The `Step` is simply considered to have found no work and completes with 0 @@ -584,7 +584,7 @@ The preceding `StepExecutionListener` inspects the `readCount` property of the`S is the case, an exit code `FAILED` is returned, indicating that the `Step` should fail. Otherwise, `null` is returned, which does not affect the status of the `Step`. -### [](#passingDataToFutureSteps)Passing Data to Future Steps +### Passing Data to Future Steps It is often useful to pass information from one step to another. This can be done through the `ExecutionContext`. The catch is that there are two `ExecutionContexts`: one at the`Step` level and one at the `Job` level. The `Step` `ExecutionContext` remains only as diff --git a/docs/en/spring-batch/domain.md b/docs/en/spring-batch/domain.md index 427ba65..2fcc89d 100644 --- a/docs/en/spring-batch/domain.md +++ b/docs/en/spring-batch/domain.md @@ -1,6 +1,6 @@ # The Domain Language of Batch -## [](#domainLanguageOfBatch)The Domain Language of Batch +## The Domain Language of Batch XMLJavaBoth @@ -38,7 +38,7 @@ The preceding diagram highlights the key concepts that make up the domain langua Spring Batch. A Job has one to many steps, each of which has exactly one `ItemReader`, one `ItemProcessor`, and one `ItemWriter`. A job needs to be launched (with`JobLauncher`), and metadata about the currently running process needs to be stored (in`JobRepository`). -### [](#job)Job +### Job This section describes stereotypes relating to the concept of a batch job. A `Job` is an entity that encapsulates an entire batch process. As is common with other Spring @@ -89,7 +89,7 @@ following example: ``` -#### [](#jobinstance)JobInstance +#### JobInstance A `JobInstance` refers to the concept of a logical job run. Consider a batch job that should be run once at the end of the day, such as the 'EndOfDay' `Job` from the preceding @@ -114,7 +114,7 @@ from previous executions is used. Using a new `JobInstance` means 'start from th beginning', and using an existing instance generally means 'start from where you left off'. -#### [](#jobparameters)JobParameters +#### JobParameters Having discussed `JobInstance` and how it differs from Job, the natural question to ask is: "How is one `JobInstance` distinguished from another?" The answer is:`JobParameters`. A `JobParameters` object holds a set of parameters used to start a batch @@ -133,7 +133,7 @@ a parameter of 01-02-2017. Thus, the contract can be defined as: `JobInstance` = | |Not all job parameters are required to contribute to the identification of a`JobInstance`. By default, they do so. However, the framework also allows the submission
of a `Job` with parameters that do not contribute to the identity of a `JobInstance`.| |---|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#jobexecution)JobExecution +#### JobExecution A `JobExecution` refers to the technical concept of a single attempt to run a Job. An execution may end in failure or success, but the `JobInstance` corresponding to a given @@ -213,7 +213,7 @@ in both the `JobInstance` and `JobParameters` tables and two extra entries in th | |Column names may have been abbreviated or removed for the sake of clarity and
formatting.| |---|---------------------------------------------------------------------------------------------| -### [](#step)Step +### Step A `Step` is a domain object that encapsulates an independent, sequential phase of a batch job. Therefore, every Job is composed entirely of one or more steps. A `Step` contains @@ -228,7 +228,7 @@ with a `Job`, a `Step` has an individual `StepExecution` that correlates with a Figure 4. Job Hierarchy With Steps -#### [](#stepexecution)StepExecution +#### StepExecution A `StepExecution` represents a single attempt to execute a `Step`. A new `StepExecution`is created each time a `Step` is run, similar to `JobExecution`. However, if a step fails to execute because the step before it fails, no execution is persisted for it. A`StepExecution` is created only when its `Step` is actually started. @@ -256,7 +256,7 @@ restart. The following table lists the properties for `StepExecution`: | filterCount | The number of items that have been 'filtered' by the `ItemProcessor`. | | writeSkipCount | The number of times `write` has failed, resulting in a skipped item. | -### [](#executioncontext)ExecutionContext +### ExecutionContext An `ExecutionContext` represents a collection of key/value pairs that are persisted and controlled by the framework in order to allow developers a place to store persistent @@ -351,7 +351,7 @@ ExecutionContext ecJob = jobExecution.getExecutionContext(); As noted in the comment, `ecStep` does not equal `ecJob`. They are two different`ExecutionContexts`. The one scoped to the `Step` is saved at every commit point in the`Step`, whereas the one scoped to the Job is saved in between every `Step` execution. -### [](#jobrepository)JobRepository +### JobRepository `JobRepository` is the persistence mechanism for all of the Stereotypes mentioned above. It provides CRUD operations for `JobLauncher`, `Job`, and `Step` implementations. When a`Job` is first launched, a `JobExecution` is obtained from the repository, and, during @@ -367,7 +367,7 @@ with the `` tag, as shown in the following example: When using Java configuration, the `@EnableBatchProcessing` annotation provides a`JobRepository` as one of the components automatically configured out of the box. -### [](#joblauncher)JobLauncher +### JobLauncher `JobLauncher` represents a simple interface for launching a `Job` with a given set of`JobParameters`, as shown in the following example: @@ -382,27 +382,27 @@ public JobExecution run(Job job, JobParameters jobParameters) It is expected that implementations obtain a valid `JobExecution` from the`JobRepository` and execute the `Job`. -### [](#item-reader)Item Reader +### Item Reader `ItemReader` is an abstraction that represents the retrieval of input for a `Step`, one item at a time. When the `ItemReader` has exhausted the items it can provide, it indicates this by returning `null`. More details about the `ItemReader` interface and its various implementations can be found in[Readers And Writers](readersAndWriters.html#readersAndWriters). -### [](#item-writer)Item Writer +### Item Writer `ItemWriter` is an abstraction that represents the output of a `Step`, one batch or chunk of items at a time. Generally, an `ItemWriter` has no knowledge of the input it should receive next and knows only the item that was passed in its current invocation. More details about the `ItemWriter` interface and its various implementations can be found in[Readers And Writers](readersAndWriters.html#readersAndWriters). 
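The following listing is a minimal sketch of this contract (the `ConsoleItemWriter` class is invented here for illustration and is not part of the framework): an `ItemWriter` that writes each chunk of items to standard output:

```
import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class ConsoleItemWriter implements ItemWriter<String> {

    @Override
    public void write(List<? extends String> items) throws Exception {
        // 'items' holds the whole chunk, so a writer can batch its output
        // (for example, one flush per chunk rather than one per item).
        for (String item : items) {
            System.out.println(item);
        }
    }
}
```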
-### [](#item-processor)Item Processor +### Item Processor `ItemProcessor` is an abstraction that represents the business processing of an item. While the `ItemReader` reads one item, and the `ItemWriter` writes them, the`ItemProcessor` provides an access point to transform or apply other business processing. If, while processing the item, it is determined that the item is not valid, returning`null` indicates that the item should not be written out. More details about the`ItemProcessor` interface can be found in[Readers And Writers](readersAndWriters.html#readersAndWriters). -### [](#batch-namespace)Batch Namespace +### Batch Namespace Many of the domain concepts listed previously need to be configured in a Spring`ApplicationContext`. While there are implementations of the interfaces above that can be used in a standard bean definition, a namespace has been provided for ease of diff --git a/docs/en/spring-batch/glossary.md b/docs/en/spring-batch/glossary.md index 61cbb23..6f43da1 100644 --- a/docs/en/spring-batch/glossary.md +++ b/docs/en/spring-batch/glossary.md @@ -1,8 +1,8 @@ # Glossary -## [](#glossary)Appendix A: Glossary +## Appendix A: Glossary -### [](#spring-batch-glossary)Spring Batch Glossary +### Spring Batch Glossary Batch diff --git a/docs/en/spring-batch/job.md b/docs/en/spring-batch/job.md index 705ea7e..7888611 100644 --- a/docs/en/spring-batch/job.md +++ b/docs/en/spring-batch/job.md @@ -1,6 +1,6 @@ # Configuring and Running a Job -## [](#configureJob)Configuring and Running a Job +## Configuring and Running a Job XMLJavaBoth @@ -19,7 +19,7 @@ how a `Job` will be run and how its meta-data will be stored during that run. This chapter will explain the various configuration options and runtime concerns of a `Job`. -### [](#configuringAJob)Configuring a Job +### Configuring a Job There are multiple implementations of the [`Job`](#configureJob) interface. However, builders abstract away the difference in configuration. @@ -70,7 +70,7 @@ In addition to steps a job configuration can contain other elements that help wi parallelization (``), declarative flow control (``) and externalization of flow definitions (``). -#### [](#restartability)Restartability +#### Restartability One key issue when executing a batch job concerns the behavior of a `Job` when it is restarted. The launching of a `Job` is considered to be a 'restart' if a `JobExecution`already exists for the particular `JobInstance`. Ideally, all jobs should be able to start @@ -130,7 +130,7 @@ This snippet of JUnit code shows how attempting to create a`JobExecution` the fi job will cause no issues. However, the second attempt will throw a `JobRestartException`. -#### [](#interceptingJobExecution)Intercepting Job Execution +#### Intercepting Job Execution During the course of the execution of a Job, it may be useful to be notified of various @@ -198,7 +198,7 @@ The annotations corresponding to this interface are: * `@AfterJob` -#### [](#inheritingFromAParentJob)Inheriting from a Parent Job +#### Inheriting from a Parent Job If a group of Jobs share similar, but not identical, configurations, then it may be helpful to define a "parent"`Job` from which the concrete @@ -229,7 +229,7 @@ it with its own list of listeners to produce a`Job` with two listeners and one`S Please see the section on [Inheriting from a Parent Step](step.html#inheritingFromParentStep)for more detailed information. 
-#### [](#jobparametersvalidator)JobParametersValidator +#### JobParametersValidator A job declared in the XML namespace or using any subclass of`AbstractJob` can optionally declare a validator for the job parameters at runtime. This is useful when for instance you need to assert that a job @@ -263,7 +263,7 @@ public Job job1() { } ``` -### [](#javaConfig)Java Config +### Java Config Spring 3 brought the ability to configure applications via java instead of XML. As of Spring Batch 2.2.0, batch jobs can be configured using the same java config. @@ -351,7 +351,7 @@ public class AppConfig { } ``` -### [](#configuringJobRepository)Configuring a JobRepository +### Configuring a JobRepository When using `@EnableBatchProcessing`, a `JobRepository` is provided out of the box for you. This section addresses configuring your own. @@ -407,7 +407,7 @@ will be used. They are shown above for awareness purposes. The max varchar length defaults to 2500, which is the length of the long `VARCHAR` columns in the[sample schema scripts](schema-appendix.html#metaDataSchemaOverview) -#### [](#txConfigForJobRepository)Transaction Configuration for the JobRepository +#### Transaction Configuration for the JobRepository If the namespace or the provided `FactoryBean` is used, transactional advice is automatically created around the repository. This is to ensure that the batch meta-data, @@ -490,7 +490,7 @@ public TransactionProxyFactoryBean baseProxy() { } ``` -#### [](#repositoryTablePrefix)Changing the Table Prefix +#### Changing the Table Prefix Another modifiable property of the `JobRepository` is the table prefix of the meta-data tables. By default they are all prefaced with `BATCH_`. `BATCH_JOB_EXECUTION` and`BATCH_STEP_EXECUTION` are two examples. However, there are potential reasons to modify this @@ -528,7 +528,7 @@ Given the preceding changes, every query to the meta-data tables is prefixed wit | |Only the table prefix is configurable. The table and column names are not.| |---|--------------------------------------------------------------------------| -#### [](#inMemoryRepository)In-Memory Repository +#### In-Memory Repository There are scenarios in which you may not want to persist your domain objects to the database. One reason may be speed; storing domain objects at each commit point takes extra @@ -574,7 +574,7 @@ transactional (such as RDBMS access). For testing purposes many people find the` | |The `MapJobRepositoryFactoryBean` and related classes have been deprecated in v4 and are scheduled
for removal in v5. If you want to use an in-memory job repository, you can use an embedded database
like H2, Apache Derby, or HSQLDB. There are several ways to create an embedded database and use it in&#13;
your Spring Batch application. One way to do that is by using the APIs from [Spring JDBC](https://docs.spring.io/spring-framework/docs/current/reference/html/data-access.html#jdbc-embedded-database-support):

```&#13;
@Bean&#13;
public DataSource dataSource() {&#13;
    return new EmbeddedDatabaseBuilder()&#13;
            .setType(EmbeddedDatabaseType.H2)&#13;
            .addScript("/org/springframework/batch/core/schema-drop-h2.sql")&#13;
            .addScript("/org/springframework/batch/core/schema-h2.sql")&#13;
            .build();&#13;
}&#13;
```&#13;

Once you have defined your embedded datasource as a bean in your application context, it should be picked
up automatically if you use `@EnableBatchProcessing`. Otherwise you can configure it manually using the
JDBC based `JobRepositoryFactoryBean` as shown in the [Configuring a JobRepository section](#configuringJobRepository).| |---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#nonStandardDatabaseTypesInRepository)Non-standard Database Types in a Repository +#### Non-standard Database Types in a Repository If you are using a database platform that is not in the list of supported platforms, you may be able to use one of the supported types, if the SQL variant is close enough. To do @@ -620,7 +620,7 @@ If even that doesn’t work, or you are not using an RDBMS, then the only option may be to implement the various `Dao`interfaces that the `SimpleJobRepository` depends on and wire one up manually in the normal Spring way. -### [](#configuringJobLauncher)Configuring a JobLauncher +### Configuring a JobLauncher When using `@EnableBatchProcessing`, a `JobRegistry` is provided out of the box for you. This section addresses configuring your own. @@ -709,7 +709,7 @@ public JobLauncher jobLauncher() { Any implementation of the spring `TaskExecutor`interface can be used to control how jobs are asynchronously executed. -### [](#runningAJob)Running a Job +### Running a Job At a minimum, launching a batch job requires two things: the`Job` to be launched and a`JobLauncher`. Both can be contained within the same context or different contexts. For example, if launching a job from the @@ -718,7 +718,7 @@ job will have its own `JobLauncher`. However, if running from within a web container within the scope of an`HttpRequest`, there will usually be one`JobLauncher`, configured for asynchronous job launching, that multiple requests will invoke to launch their jobs. -#### [](#runningJobsFromCommandLine)Running Jobs from the Command Line +#### Running Jobs from the Command Line For users that want to run their jobs from an enterprise scheduler, the command line is the primary interface. This is because @@ -729,7 +729,7 @@ to launch a Java process besides a shell script, such as Perl, Ruby, or even 'build tools' such as ant or maven. However, because most people are familiar with shell scripts, this example will focus on them. 
-##### [](#commandLineJobRunner)The CommandLineJobRunner +##### The CommandLineJobRunner Because the script launching the job must kick off a Java Virtual Machine, there needs to be a class with a main method to act @@ -832,7 +832,7 @@ The preceding example is overly simplistic, since there are many more requiremen run a batch job in Spring Batch in general, but it serves to show the two main requirements of the `CommandLineJobRunner`: `Job` and `JobLauncher`. -##### [](#exitCodes)ExitCodes +##### ExitCodes When launching a batch job from the command-line, an enterprise scheduler is often used. Most schedulers are fairly dumb and work only @@ -876,7 +876,7 @@ that needs to be done to provide your own`ExitCodeMapper` is to declare the impl as a root level bean and ensure that it is part of the`ApplicationContext` that is loaded by the runner. -#### [](#runningJobsFromWebContainer)Running Jobs from within a Web Container +#### Running Jobs from within a Web Container Historically, offline processing such as batch jobs have been launched from the command-line, as described above. However, there are @@ -891,7 +891,7 @@ job asynchronously: Figure 4. Asynchronous Job Launcher Sequence From Web Container The controller in this case is a Spring MVC controller. More -information on Spring MVC can be found here: [](https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc)[https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc](https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc). +information on Spring MVC can be found here: . The controller launches a `Job` using a`JobLauncher` that has been configured to launch[asynchronously](#runningJobsFromWebContainer), which immediately returns a `JobExecution`. The`Job` will likely still be running, however, this nonblocking behaviour allows the controller to return immediately, which @@ -915,7 +915,7 @@ public class JobLauncherController { } ``` -### [](#advancedMetaData)Advanced Meta-Data Usage +### Advanced Meta-Data Usage So far, both the `JobLauncher` and `JobRepository` interfaces have been discussed. Together, they represent simple launching of a job, and basic @@ -940,7 +940,7 @@ The `JobExplorer` and`JobOperator` interfaces, which will be discussed below, add additional functionality for querying and controlling the meta data. -#### [](#queryingRepository)Querying the Repository +#### Querying the Repository The most basic need before any advanced features is the ability to query the repository for existing executions. This functionality is @@ -1022,7 +1022,7 @@ public JobExplorer getJobExplorer() throws Exception { ... ``` -#### [](#jobregistry)JobRegistry +#### JobRegistry A `JobRegistry` (and its parent interface `JobLocator`) is not mandatory, but it can be useful if you want to keep track of which jobs are available in the context. It is also @@ -1059,7 +1059,7 @@ There are two ways to populate a `JobRegistry` automatically: using a bean post processor and using a registrar lifecycle component. These two mechanisms are described in the following sections. -##### [](#jobregistrybeanpostprocessor)JobRegistryBeanPostProcessor +##### JobRegistryBeanPostProcessor This is a bean post-processor that can register all jobs as they are created. @@ -1093,7 +1093,7 @@ example has been given an id so that it can be included in child contexts (e.g. as a parent bean definition) and cause all jobs created there to also be registered automatically. 
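When using Java configuration, an equivalent bean definition might look like the following sketch (assuming a `JobRegistry` bean is available, as it is out of the box with `@EnableBatchProcessing`):

```
@Bean
public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
    JobRegistryBeanPostProcessor postProcessor = new JobRegistryBeanPostProcessor();
    // Registers every Job bean with the injected JobRegistry as it is created
    postProcessor.setJobRegistry(jobRegistry);
    return postProcessor;
}
```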
-##### [](#automaticjobregistrar)`AutomaticJobRegistrar` +##### `AutomaticJobRegistrar` This is a lifecycle component that creates child contexts and registers jobs from those contexts as they are created. One advantage of doing this is that, while the job names in @@ -1162,7 +1162,7 @@ used as well). For instance this might be desirable if there are jobs defined in the main parent context as well as in the child locations. -#### [](#JobOperator)JobOperator +#### JobOperator As previously discussed, the `JobRepository`provides CRUD operations on the meta-data, and the`JobExplorer` provides read-only operations on the meta-data. However, those operations are most useful when used together @@ -1250,7 +1250,7 @@ The following example shows a typical bean definition for `SimpleJobOperator` in | |If you set the table prefix on the job repository, don’t forget to set it on the job explorer as well.| |---|------------------------------------------------------------------------------------------------------| -#### [](#JobParametersIncrementer)JobParametersIncrementer +#### JobParametersIncrementer Most of the methods on `JobOperator` are self-explanatory, and more detailed explanations can be found on the[javadoc of the interface](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/launch/JobOperator.html). However, the`startNextInstance` method is worth noting. This @@ -1319,7 +1319,7 @@ public Job footballJob() { } ``` -#### [](#stoppingAJob)Stopping a Job +#### Stopping a Job One of the most common use cases of`JobOperator` is gracefully stopping a Job: @@ -1336,7 +1336,7 @@ business service. However, as soon as control is returned back to the framework, it will set the status of the current`StepExecution` to`BatchStatus.STOPPED`, save it, then do the same for the `JobExecution` before finishing. -#### [](#aborting-a-job)Aborting a Job +#### Aborting a Job A job execution which is `FAILED` can be restarted (if the `Job` is restartable). A job execution whose status is`ABANDONED` will not be restarted by the framework. diff --git a/docs/en/spring-batch/jsr-352.md b/docs/en/spring-batch/jsr-352.md index 9ff35e5..6ccea00 100644 --- a/docs/en/spring-batch/jsr-352.md +++ b/docs/en/spring-batch/jsr-352.md @@ -1,15 +1,15 @@ # JSR-352 Support -## [](#jsr-352)JSR-352 Support +## JSR-352 Support XMLJavaBoth As of Spring Batch 3.0 support for JSR-352 has been fully implemented. This section is not a replacement for the spec itself and instead, intends to explain how the JSR-352 specific concepts apply to Spring Batch. Additional information on JSR-352 can be found via the -JCP here: [](https://jcp.org/en/jsr/detail?id=352)[https://jcp.org/en/jsr/detail?id=352](https://jcp.org/en/jsr/detail?id=352) +JCP here: -### [](#jsrGeneralNotes)General Notes about Spring Batch and JSR-352 +### General Notes about Spring Batch and JSR-352 Spring Batch and JSR-352 are structurally the same. They both have jobs that are made up of steps. They both have readers, processors, writers, and listeners. However, their interactions are subtly different. @@ -24,9 +24,9 @@ artifacts (readers, writers, etc) will work within a job configured with JSR-352 important to note that batch artifacts that have been developed against the JSR-352 interfaces will not work within a traditional Spring Batch job. -### [](#jsrSetup)Setup +### Setup -#### [](#jsrSetupContexts)Application Contexts +#### Application Contexts All JSR-352 based jobs within Spring Batch consist of two application contexts. 
A parent context, that contains beans related to the infrastructure of Spring Batch such as the `JobRepository`,`PlatformTransactionManager`, etc and a child context that consists of the configuration @@ -37,7 +37,7 @@ property. | |The base context is not processed by the JSR-352 processors for things like property injection so
no components requiring that additional processing should be configured there.| |---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#jsrSetupLaunching)Launching a JSR-352 based job +#### Launching a JSR-352 based job JSR-352 requires a very simple path to executing a batch job. The following code is all that is needed to execute your first batch job: @@ -67,7 +67,7 @@ first time `BatchRuntime.getJobOperator()` is called: | |None of the above beans are optional for executing JSR-352 based jobs. All may be overridden to
provide customized functionality as needed.| |---|-----------------------------------------------------------------------------------------------------------------------------------------------| -### [](#dependencyInjection)Dependency Injection +### Dependency Injection JSR-352 is based heavily on the Spring Batch programming model. As such, while not explicitly requiring a formal dependency injection implementation, DI of some kind implied. Spring Batch supports all three @@ -157,9 +157,9 @@ referenced requires a no argument constructor which will be used to create the b ``` -### [](#jsrJobProperties)Batch Properties +### Batch Properties -#### [](#jsrPropertySupport)Property Support +#### Property Support JSR-352 allows for properties to be defined at the Job, Step and batch artifact level by way of configuration in the JSL. Batch properties are configured at each level in the following way: @@ -173,7 +173,7 @@ configuration in the JSL. Batch properties are configured at each level in the f `Properties` may be configured on any batch artifact. -#### [](#jsrBatchPropertyAnnotation)@BatchProperty annotation +#### @BatchProperty annotation `Properties` are referenced in batch artifacts by annotating class fields with the`@BatchProperty` and `@Inject` annotations (both annotations are required by the spec). As defined by JSR-352, fields for properties must be String typed. Any type @@ -194,7 +194,7 @@ public class MyItemReader extends AbstractItemReader { The value of the field "propertyName1" will be "propertyValue1" -#### [](#jsrPropertySubstitution)Property Substitution +#### Property Substitution Property substitution is provided by way of operators and simple conditional expressions. The general usage is `#{operator['key']}`. @@ -220,7 +220,7 @@ example, the result will resolve to a value of the system property file.separato expressions can be resolved, an empty String will be returned. Multiple conditions can be used, which are separated by a ';'. -### [](#jsrProcessingModels)Processing Models +### Processing Models JSR-352 provides the same two basic processing models that Spring Batch does: @@ -229,7 +229,7 @@ JSR-352 provides the same two basic processing models that Spring Batch does: * Task based processing - Using a `javax.batch.api.Batchlet`implementation. This processing model is the same as the`org.springframework.batch.core.step.tasklet.Tasklet` based processing currently available. -#### [](#item-based-processing)Item based processing +#### Item based processing Item based processing in this context is a chunk size being set by the number of items read by an`ItemReader`. To configure a step this way, specify the`item-count` (which defaults to 10) and optionally configure the`checkpoint-policy` as item (this is the default). @@ -250,7 +250,7 @@ This sets a time limit for how long the number of items specified has to be proc the timeout is reached, the chunk will complete with however many items have been read by then regardless of what the `item-count` is configured to be. -#### [](#custom-checkpointing)Custom checkpointing +#### Custom checkpointing JSR-352 calls the process around the commit interval within a step "checkpointing". Item-based checkpointing is one approach as mentioned above. However, this is not robust @@ -273,7 +273,7 @@ implementation of `CheckpointAlgorithm`. ... ``` -### [](#jsrRunningAJob)Running a job +### Running a job The entrance to executing a JSR-352 based job is through the`javax.batch.operations.JobOperator`. 
Spring Batch provides its own implementation of this interface (`org.springframework.batch.core.jsr.launch.JsrJobOperator`). This @@ -307,7 +307,7 @@ based `JobOperator#start(String jobXMLName, Properties jobParameters)`, the fram will always create a new JobInstance (JSR-352 job parameters are non-identifying). In order to restart a job, a call to`JobOperator#restart(long executionId, Properties restartParameters)` is required. -### [](#jsrContexts)Contexts +### Contexts JSR-352 defines two context objects that are used to interact with the meta-data of a job or step from within a batch artifact: `javax.batch.runtime.context.JobContext` and`javax.batch.runtime.context.StepContext`. Both of these are available in any step @@ -328,7 +328,7 @@ In Spring Batch, the `JobContext` and `StepContext` wrap their corresponding execution objects (`JobExecution` and`StepExecution` respectively). Data stored through`StepContext#setPersistentUserData(Serializable data)` is stored in the Spring Batch `StepExecution#executionContext`. -### [](#jsrStepFlow)Step Flow +### Step Flow Within a JSR-352 based job, the flow of steps works similarly as it does within Spring Batch. However, there are a few subtle differences: @@ -348,7 +348,7 @@ However, there are a few subtle differences: sorted from most specific to least specific and evaluated in that order. JSR-352 jobs evaluate transition elements in the order they are specified in the XML. -### [](#jsrScaling)Scaling a JSR-352 batch job +### Scaling a JSR-352 batch job Traditional Spring Batch jobs have four ways of scaling (the last two capable of being executed across multiple JVMs): @@ -367,7 +367,7 @@ JSR-352 provides two options for scaling batch jobs. Both options support only a * Partitioning - Conceptually the same as Spring Batch however implemented slightly different. -#### [](#jsrPartitioning)Partitioning +#### Partitioning Conceptually, partitioning in JSR-352 is the same as it is in Spring Batch. Meta-data is provided to each worker to identify the input to be processed, with the workers reporting back to the manager the @@ -407,7 +407,7 @@ results upon completion. However, there are some important differences: |`javax.batch.api.partition.PartitionAnalyzer` |End point that receives the information collected by the`PartitionCollector` as well as the resulting
statuses from a completed partition.| | `javax.batch.api.partition.PartitionReducer` | Provides compensating logic for a partitioned
step. | -### [](#jsrTesting)Testing +### Testing Since all JSR-352 based jobs are executed asynchronously, it can be difficult to determine when a job has completed. To help with testing, Spring Batch provides the`org.springframework.batch.test.JsrTestUtils`. This utility class provides the diff --git a/docs/en/spring-batch/monitoring-and-metrics.md b/docs/en/spring-batch/monitoring-and-metrics.md index e2982ad..86c682c 100644 --- a/docs/en/spring-batch/monitoring-and-metrics.md +++ b/docs/en/spring-batch/monitoring-and-metrics.md @@ -1,12 +1,12 @@ # Monitoring and metrics -## [](#monitoring-and-metrics)Monitoring and metrics +## Monitoring and metrics Since version 4.2, Spring Batch provides support for batch monitoring and metrics based on [Micrometer](https://micrometer.io/). This section describes which metrics are provided out-of-the-box and how to contribute custom metrics. -### [](#built-in-metrics)Built-in metrics +### Built-in metrics Metrics collection does not require any specific configuration. All metrics provided by the framework are registered in[Micrometer’s global registry](https://micrometer.io/docs/concepts#_global_registry)under the `spring.batch` prefix. The following table explains all the metrics in details: @@ -23,7 +23,7 @@ by the framework are registered in[Micrometer’s global registry](https://micro | |The `status` tag can be either `SUCCESS` or `FAILURE`.| |---|------------------------------------------------------| -### [](#custom-metrics)Custom metrics +### Custom metrics If you want to use your own metrics in your custom components, we recommend using Micrometer APIs directly. The following is an example of how to time a `Tasklet`: @@ -59,7 +59,7 @@ public class MyTimedTasklet implements Tasklet { } ``` -### [](#disabling-metrics)Disabling metrics +### Disabling metrics Metrics collection is a concern similar to logging. Disabling logs is typically done by configuring the logging library and this is no different for metrics. diff --git a/docs/en/spring-batch/processor.md b/docs/en/spring-batch/processor.md index 78f56bd..a25e493 100644 --- a/docs/en/spring-batch/processor.md +++ b/docs/en/spring-batch/processor.md @@ -1,6 +1,6 @@ # Item processing -## [](#itemProcessor)Item processing +## Item processing XMLJavaBoth @@ -112,7 +112,7 @@ public Step step1() { A difference between `ItemProcessor` and `ItemReader` or `ItemWriter` is that an `ItemProcessor`is optional for a `Step`. -### [](#chainingItemProcessors)Chaining ItemProcessors +### Chaining ItemProcessors Performing a single transformation is useful in many scenarios, but what if you want to 'chain' together multiple `ItemProcessor` implementations? This can be accomplished using @@ -220,7 +220,7 @@ public CompositeItemProcessor compositeProcessor() { } ``` -### [](#filteringRecords)Filtering Records +### Filtering Records One typical use for an item processor is to filter out records before they are passed to the `ItemWriter`. Filtering is an action distinct from skipping. Skipping indicates that @@ -239,7 +239,7 @@ that the result is `null` and avoids adding that item to the list of records del the `ItemWriter`. As usual, an exception thrown from the `ItemProcessor` results in a skip. -### [](#validatingInput)Validating Input +### Validating Input In the [ItemReaders and ItemWriters](readersAndWriters.html#readersAndWriters) chapter, multiple approaches to parsing input have been discussed. Each major implementation throws an exception if it is not 'well-formed'. 
The`FixedLengthTokenizer` throws an exception if a range of data is missing. Similarly, @@ -337,7 +337,7 @@ public BeanValidatingItemProcessor beanValidatingItemProcessor() throws } ``` -### [](#faultTolerant)Fault Tolerance +### Fault Tolerance When a chunk is rolled back, items that have been cached during reading may be reprocessed. If a step is configured to be fault tolerant (typically by using skip or diff --git a/docs/en/spring-batch/readersAndWriters.md b/docs/en/spring-batch/readersAndWriters.md index 1711c29..8311405 100644 --- a/docs/en/spring-batch/readersAndWriters.md +++ b/docs/en/spring-batch/readersAndWriters.md @@ -1,6 +1,6 @@ # ItemReaders and ItemWriters -## [](#readersAndWriters)ItemReaders and ItemWriters +## ItemReaders and ItemWriters XMLJavaBoth @@ -8,7 +8,7 @@ All batch processing can be described in its most simple form as reading in larg of data, performing some type of calculation or transformation, and writing the result out. Spring Batch provides three key interfaces to help perform bulk reading and writing:`ItemReader`, `ItemProcessor`, and `ItemWriter`. -### [](#itemReader)`ItemReader` +### `ItemReader` Although a simple concept, an `ItemReader` is the means for providing data from many different types of input. The most general examples include: @@ -51,7 +51,7 @@ also worth noting that a lack of items to process by an `ItemReader` does not ca exception to be thrown. For example, a database `ItemReader` that is configured with a query that returns 0 results returns `null` on the first invocation of `read`. -### [](#itemWriter)`ItemWriter` +### `ItemWriter` `ItemWriter` is similar in functionality to an `ItemReader` but with inverse operations. Resources still need to be located, opened, and closed but they differ in that an`ItemWriter` writes out, rather than reading in. In the case of databases or queues, @@ -77,7 +77,7 @@ method. For example, if writing to a Hibernate DAO, multiple calls to write can one for each item. The writer can then call `flush` on the hibernate session before returning. -### [](#itemStream)`ItemStream` +### `ItemStream` Both `ItemReaders` and `ItemWriters` serve their individual purposes well, but there is a common concern among both of them that necessitates another interface. In general, as @@ -109,7 +109,7 @@ store the state of a particular execution, with the expectation that it is retur the same `JobInstance` is started again. For those familiar with Quartz, the semantics are very similar to a Quartz `JobDataMap`. -### [](#delegatePatternAndRegistering)The Delegate Pattern and Registering with the Step +### The Delegate Pattern and Registering with the Step Note that the `CompositeItemWriter` is an example of the delegation pattern, which is common in Spring Batch. The delegates themselves might implement callback interfaces, @@ -183,7 +183,7 @@ public BarWriter barWriter() { } ``` -### [](#flatFiles)Flat Files +### Flat Files One of the most common mechanisms for interchanging bulk data has always been the flat file. Unlike XML, which has an agreed upon standard for defining how it is structured @@ -192,7 +192,7 @@ structured. In general, all flat files fall into two types: delimited and fixed Delimited files are those in which fields are separated by a delimiter, such as a comma. Fixed Length files have fields that are a set length. 
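For illustration, the same three fields (values invented for this example) might appear as follows in each style, where the fixed-length version pads each field with spaces to its set width:

```
Delimited (comma):             1,Smith,Joe
Fixed length (widths 4,10,8):  1   Smith     Joe
```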
-#### [](#fieldSet)The `FieldSet` +#### The `FieldSet` When working with flat files in Spring Batch, regardless of whether it is for input or output, one of the most important classes is the `FieldSet`. Many architectures and @@ -218,7 +218,7 @@ consistent parsing of flat file input. Rather than each batch job parsing differ potentially unexpected ways, it can be consistent, both when handling errors caused by a format exception, or when doing simple data conversions. -#### [](#flatFileItemReader)`FlatFileItemReader` +#### `FlatFileItemReader` A flat file is any type of file that contains at most two-dimensional (tabular) data. Reading flat files in the Spring Batch framework is facilitated by the class called`FlatFileItemReader`, which provides basic functionality for reading and parsing flat @@ -255,7 +255,7 @@ interpreted, as described in the following table: |skippedLinesCallback | LineCallbackHandler |Interface that passes the raw line content of
the lines in the file to be skipped. If `linesToSkip` is set to 2, then this interface is
called twice.| | strict | boolean |In strict mode, the reader throws an exception from `open(ExecutionContext)` if
the input resource does not exist. Otherwise, it logs the problem and continues. | -##### [](#lineMapper)`LineMapper` +##### `LineMapper` As with `RowMapper`, which takes a low-level construct such as `ResultSet` and returns an `Object`, flat file processing requires the same construct to convert a `String` line @@ -276,7 +276,7 @@ unlike `RowMapper`, the `LineMapper` is given a raw line which, as discussed abo gets you halfway there. The line must be tokenized into a `FieldSet`, which can then be mapped to an object, as described later in this document. -##### [](#lineTokenizer)`LineTokenizer` +##### `LineTokenizer` An abstraction for turning a line of input into a `FieldSet` is necessary because there can be many formats of flat file data that need to be converted to a `FieldSet`. In @@ -304,7 +304,7 @@ the following `LineTokenizer` implementations: * `PatternMatchingCompositeLineTokenizer`: Determines which `LineTokenizer` among a list of tokenizers should be used on a particular line by checking against a pattern. -##### [](#fieldSetMapper)`FieldSetMapper` +##### `FieldSetMapper` The `FieldSetMapper` interface defines a single method, `mapFieldSet`, which takes a`FieldSet` object and maps its contents to an object. This object may be a custom DTO, a domain object, or an array, depending on the needs of the job. The `FieldSetMapper` is @@ -321,7 +321,7 @@ public interface FieldSetMapper { The pattern used is the same as the `RowMapper` used by `JdbcTemplate`. -##### [](#defaultLineMapper)`DefaultLineMapper` +##### `DefaultLineMapper` Now that the basic interfaces for reading in flat files have been defined, it becomes clear that three basic steps are required: @@ -363,7 +363,7 @@ into the reader itself (as was done in previous versions of the framework) to al greater flexibility in controlling the parsing process, especially if access to the raw line is needed. -##### [](#simpleDelimitedFileReadingExample)Simple Delimited File Reading Example +##### Simple Delimited File Reading Example The following example illustrates how to read a flat file with an actual domain scenario. This particular batch job reads in football players from the following file: @@ -438,7 +438,7 @@ Player player = itemReader.read(); Each call to `read` returns a new`Player` object from each line in the file. When the end of the file is reached, `null` is returned. -##### [](#mappingFieldsByName)Mapping Fields by Name +##### Mapping Fields by Name There is one additional piece of functionality that is allowed by both`DelimitedLineTokenizer` and `FixedLengthTokenizer` and that is similar in function to a JDBC `ResultSet`. The names of the fields can be injected into either of these`LineTokenizer` implementations to increase the readability of the mapping function. @@ -472,7 +472,7 @@ public class PlayerMapper implements FieldSetMapper { } ``` -##### [](#beanWrapperFieldSetMapper)Automapping FieldSets to Domain Objects +##### Automapping FieldSets to Domain Objects For many, having to write a specific `FieldSetMapper` is equally as cumbersome as writing a specific `RowMapper` for a `JdbcTemplate`. Spring Batch makes this easier by providing @@ -523,7 +523,7 @@ same way the Spring container looks for setters matching a property name. Each a field in the `FieldSet` is mapped, and the resultant `Player` object is returned, with no code required. -##### [](#fixedLengthFileFormats)Fixed Length File Formats +##### Fixed Length File Formats So far, only delimited files have been discussed in much detail. 
However, they represent only half of the file reading picture. Many organizations that use flat files use fixed @@ -594,7 +594,7 @@ Because the `FixedLengthLineTokenizer` uses the same `LineTokenizer` interface a discussed above, it returns the same `FieldSet` as if a delimiter had been used. This lets the same approaches be used in handling its output, such as using the`BeanWrapperFieldSetMapper`. -##### [](#prefixMatchingLineMapper)Multiple Record Types within a Single File +##### Multiple Record Types within a Single File All of the file reading examples up to this point have all made a key assumption for simplicity’s sake: all of the records in a file have the same format. However, this may @@ -704,7 +704,7 @@ It is also common for a flat file to contain records that each span multiple lin handle this situation, a more complex strategy is required. A demonstration of this common pattern can be found in the `multiLineRecords` sample. -##### [](#exceptionHandlingInFlatFiles)Exception Handling in Flat Files +##### Exception Handling in Flat Files There are many scenarios when tokenizing a line may cause exceptions to be thrown. Many flat files are imperfect and contain incorrectly formatted records. Many users choose to @@ -714,7 +714,7 @@ reason, Spring Batch provides a hierarchy of exceptions for handling parse excep thrown by the `FlatFileItemReader` when any errors are encountered while trying to read a file. `FlatFileFormatException` is thrown by implementations of the `LineTokenizer`interface and indicates a more specific error encountered while tokenizing. -###### [](#incorrectTokenCountException)`IncorrectTokenCountException` +###### `IncorrectTokenCountException` Both `DelimitedLineTokenizer` and `FixedLengthLineTokenizer` have the ability to specify column names that can be used for creating a `FieldSet`. However, if the number of column @@ -736,7 +736,7 @@ catch (IncorrectTokenCountException e) { Because the tokenizer was configured with 4 column names but only 3 tokens were found in the file, an `IncorrectTokenCountException` was thrown. -###### [](#incorrectLineLengthException)`IncorrectLineLengthException` +###### `IncorrectLineLengthException` Files formatted in a fixed-length format have additional requirements when parsing because, unlike a delimited format, each column must strictly adhere to its predefined @@ -778,13 +778,13 @@ The preceding example is almost identical to the one before it, except that`toke line lengths when tokenizing the line. A `FieldSet` is now correctly created and returned. However, it contains only empty tokens for the remaining values. -#### [](#flatFileItemWriter)`FlatFileItemWriter` +#### `FlatFileItemWriter` Writing out to flat files has the same problems and issues that reading in from a file must overcome. A step must be able to write either delimited or fixed length formats in a transactional manner. -##### [](#lineAggregator)`LineAggregator` +##### `LineAggregator` Just as the `LineTokenizer` interface is necessary to take an item and turn it into a`String`, file writing must have a way to aggregate multiple fields into a single string for writing to a file. In Spring Batch, this is the `LineAggregator`, shown in the @@ -800,7 +800,7 @@ public interface LineAggregator { The `LineAggregator` is the logical opposite of `LineTokenizer`. `LineTokenizer` takes a`String` and returns a `FieldSet`, whereas `LineAggregator` takes an `item` and returns a`String`. 
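As a minimal sketch of this contract (the `Customer` type and its getters are hypothetical), a custom implementation only needs to turn each item into a single output line:

```
public class CustomerLineAggregator implements LineAggregator<Customer> {

    @Override
    public String aggregate(Customer customer) {
        // Produce exactly one line of output per item
        return customer.getId() + "," + customer.getName();
    }
}
```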
-###### [](#PassThroughLineAggregator)`PassThroughLineAggregator` +###### `PassThroughLineAggregator` The most basic implementation of the `LineAggregator` interface is the`PassThroughLineAggregator`, which assumes that the object is already a string or that its string representation is acceptable for writing, as shown in the following code: @@ -818,7 +818,7 @@ The preceding implementation is useful if direct control of creating the string required but the advantages of a `FlatFileItemWriter`, such as transaction and restart support, are necessary. -##### [](#SimplifiedFileWritingExample)Simplified File Writing Example +##### Simplified File Writing Example Now that the `LineAggregator` interface and its most basic implementation,`PassThroughLineAggregator`, have been defined, the basic flow of writing can be explained: @@ -863,7 +863,7 @@ public FlatFileItemWriter itemWriter() { } ``` -##### [](#FieldExtractor)`FieldExtractor` +##### `FieldExtractor` The preceding example may be useful for the most basic uses of a writing to a file. However, most users of the `FlatFileItemWriter` have a domain object that needs to be @@ -901,14 +901,14 @@ Implementations of the `FieldExtractor` interface should create an array from th of the provided object, which can then be written out with a delimiter between the elements or as part of a fixed-width line. -###### [](#PassThroughFieldExtractor)`PassThroughFieldExtractor` +###### `PassThroughFieldExtractor` There are many cases where a collection, such as an array, `Collection`, or `FieldSet`, needs to be written out. "Extracting" an array from one of these collection types is very straightforward. To do so, convert the collection to an array. Therefore, the`PassThroughFieldExtractor` should be used in this scenario. It should be noted that, if the object passed in is not a type of collection, then the `PassThroughFieldExtractor`returns an array containing solely the item to be extracted. -###### [](#BeanWrapperFieldExtractor)`BeanWrapperFieldExtractor` +###### `BeanWrapperFieldExtractor` As with the `BeanWrapperFieldSetMapper` described in the file reading section, it is often preferable to configure how to convert a domain object to an object array, rather @@ -936,7 +936,7 @@ map. Just as the `BeanWrapperFieldSetMapper` needs field names to map fields on to map to getters for creating an object array. It is worth noting that the order of the names determines the order of the fields within the array. -##### [](#delimitedFileWritingExample)Delimited File Writing Example +##### Delimited File Writing Example The most basic flat file format is one in which all fields are separated by a delimiter. This can be accomplished using a `DelimitedLineAggregator`. The following example writes @@ -1020,7 +1020,7 @@ public FlatFileItemWriter itemWriter(Resource outputResource) th } ``` -##### [](#fixedWidthFileWritingExample)Fixed Width File Writing Example +##### Fixed Width File Writing Example Delimited is not the only type of flat file format. Many prefer to use a set width for each column to delineate between fields, which is usually referred to as 'fixed width'. @@ -1111,7 +1111,7 @@ public FlatFileItemWriter itemWriter(Resource outputResource) th } ``` -##### [](#handlingFileCreation)Handling File Creation +##### Handling File Creation `FlatFileItemReader` has a very simple relationship with file resources. When the reader is initialized, it opens the file (if it exists), and throws an exception if it does not. 
@@ -1125,7 +1125,7 @@ for this job is always the same? In this case, you would want to delete the file exists, unless it’s a restart. Because of this possibility, the `FlatFileItemWriter`contains the property, `shouldDeleteIfExists`. Setting this property to true causes an existing file with the same name to be deleted when the writer is opened. -### [](#xmlReadingWriting)XML Item Readers and Writers +### XML Item Readers and Writers Spring Batch provides transactional infrastructure for both reading XML records and mapping them to Java objects as well as writing Java objects as XML records. @@ -1159,7 +1159,7 @@ Figure 2. OXM Binding With an introduction to OXM and how one can use XML fragments to represent records, we can now more closely examine readers and writers. -#### [](#StaxEventItemReader)`StaxEventItemReader` +#### `StaxEventItemReader` The `StaxEventItemReader` configuration provides a typical setup for the processing of records from an XML input stream. First, consider the following set of XML records that @@ -1322,7 +1322,7 @@ while (hasNext) { } ``` -#### [](#StaxEventItemWriter)`StaxEventItemWriter` +#### `StaxEventItemWriter` Output works symmetrically to input. The `StaxEventItemWriter` needs a `Resource`, a marshaller, and a `rootTagName`. A Java object is passed to a marshaller (typically a @@ -1444,7 +1444,7 @@ trade.setCustomer("Customer1"); staxItemWriter.write(trade); ``` -### [](#jsonReadingWriting)JSON Item Readers And Writers +### JSON Item Readers And Writers Spring Batch provides support for reading and Writing JSON resources in the following format: @@ -1468,7 +1468,7 @@ Spring Batch provides support for reading and Writing JSON resources in the foll It is assumed that the JSON resource is an array of JSON objects corresponding to individual items. Spring Batch is not tied to any particular JSON library. -#### [](#JsonItemReader)`JsonItemReader` +#### `JsonItemReader` The `JsonItemReader` delegates JSON parsing and binding to implementations of the`org.springframework.batch.item.json.JsonObjectReader` interface. This interface is intended to be implemented by using a streaming API to read JSON objects @@ -1498,7 +1498,7 @@ public JsonItemReader jsonItemReader() { } ``` -#### [](#jsonfileitemwriter)`JsonFileItemWriter` +#### `JsonFileItemWriter` The `JsonFileItemWriter` delegates the marshalling of items to the`org.springframework.batch.item.json.JsonObjectMarshaller` interface. The contract of this interface is to take an object and marshall it to a JSON `String`. @@ -1527,7 +1527,7 @@ public JsonFileItemWriter jsonFileItemWriter() { } ``` -### [](#multiFileInput)Multi-File Input +### Multi-File Input It is a common requirement to process multiple files within a single `Step`. Assuming the files all have the same formatting, the `MultiResourceItemReader` supports this type of @@ -1575,7 +1575,7 @@ directories until completed successfully. | |Input resources are ordered by using `MultiResourceItemReader#setComparator(Comparator)`to make sure resource ordering is preserved between job runs in restart scenario.| |---|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -### [](#database)Database +### Database Like most enterprise application styles, a database is the central storage mechanism for batch. 
However, batch differs from other application styles due to the sheer size of the @@ -1587,7 +1587,7 @@ Spring Batch provides two types of solutions for this problem: * [Paging `ItemReader` Implementations](#pagingItemReaders) -#### [](#cursorBasedItemReaders)Cursor-based `ItemReader` Implementations +#### Cursor-based `ItemReader` Implementations Using a database cursor is generally the default approach of most batch developers, because it is the database’s solution to the problem of 'streaming' relational data. The @@ -1610,7 +1610,7 @@ completely mapped `Foo` object. Calling `read()` again moves the cursor to the n which is the `Foo` with an ID of 3. The results of these reads are written out after each`read`, allowing the objects to be garbage collected (assuming no instance variables are maintaining references to them). -##### [](#JdbcCursorItemReader)`JdbcCursorItemReader` +##### `JdbcCursorItemReader` `JdbcCursorItemReader` is the JDBC implementation of the cursor-based technique. It works directly with a `ResultSet` and requires an SQL statement to run against a connection @@ -1715,7 +1715,7 @@ public JdbcCursorItemReader itemReader() { } ``` -###### [](#JdbcCursorItemReaderProperties)Additional Properties +###### Additional Properties Because there are so many varying options for opening a cursor in Java, there are many properties on the `JdbcCursorItemReader` that can be set, as described in the following @@ -1731,7 +1731,7 @@ table: | driverSupportsAbsolute | Indicates whether the JDBC driver supports
setting the absolute row on a `ResultSet`. It is recommended that this is set to `true` for JDBC drivers that support `ResultSet.absolute()`, as it may improve performance,
especially if a step fails while working with a large data set. Defaults to `false`. | |setUseSharedExtendedConnection|Indicates whether the connection
used for the cursor should be used by all other processing, thus sharing the same
transaction. If this is set to `false`, then the cursor is opened with its own connection
and does not participate in any transactions started for the rest of the step processing.
If you set this flag to `true`, then you must wrap the DataSource in an `ExtendedConnectionDataSourceProxy` to prevent the connection from being closed and
released after each commit. When you set this option to `true`, the statement used to
open the cursor is created with both 'READ\_ONLY' and 'HOLD\_CURSORS\_OVER\_COMMIT' options.
This allows the cursor to be held open across the transaction starts and commits performed in the
step processing. To use this feature, you need a database that supports this and a JDBC
driver supporting JDBC 3.0 or later. Defaults to `false`.| -##### [](#HibernateCursorItemReader)`HibernateCursorItemReader` +##### `HibernateCursorItemReader` Just as normal Spring users make important decisions about whether or not to use ORM solutions, which affect whether or not they use a `JdbcTemplate` or a`HibernateTemplate`, Spring Batch users have the same options.`HibernateCursorItemReader` is the Hibernate implementation of the cursor technique. @@ -1796,7 +1796,7 @@ public HibernateCursorItemReader itemReader(SessionFactory sessionFactory) { } ``` -##### [](#StoredProcedureItemReader)`StoredProcedureItemReader` +##### `StoredProcedureItemReader` Sometimes it is necessary to obtain the cursor data by using a stored procedure. The`StoredProcedureItemReader` works like the `JdbcCursorItemReader`, except that, instead of running a query to obtain a cursor, it runs a stored procedure that returns a cursor. @@ -1990,13 +1990,13 @@ public StoredProcedureItemReader reader(DataSource dataSource) { In addition to the parameter declarations, we need to specify a `PreparedStatementSetter`implementation that sets the parameter values for the call. This works the same as for the `JdbcCursorItemReader` above. All the additional properties listed in[Additional Properties](#JdbcCursorItemReaderProperties) apply to the `StoredProcedureItemReader` as well. -#### [](#pagingItemReaders)Paging `ItemReader` Implementations +#### Paging `ItemReader` Implementations An alternative to using a database cursor is running multiple queries where each query fetches a portion of the results. We refer to this portion as a page. Each query must specify the starting row number and the number of rows that we want returned in the page. -##### [](#JdbcPagingItemReader)`JdbcPagingItemReader` +##### `JdbcPagingItemReader` One implementation of a paging `ItemReader` is the `JdbcPagingItemReader`. The`JdbcPagingItemReader` needs a `PagingQueryProvider` responsible for providing the SQL queries used to retrieve the rows making up a page. Since each database has its own @@ -2082,7 +2082,7 @@ query. If you use named parameters in the `where` clause, the key for each entry match the name of the named parameter. If you use a traditional '?' placeholder, then the key for each entry should be the number of the placeholder, starting with 1. -##### [](#JpaPagingItemReader)`JpaPagingItemReader` +##### `JpaPagingItemReader` Another implementation of a paging `ItemReader` is the `JpaPagingItemReader`. JPA does not have a concept similar to the Hibernate `StatelessSession`, so we have to use other @@ -2130,7 +2130,7 @@ described for the `JdbcPagingItemReader` above, assuming the `CustomerCredit` ob correct JPA annotations or ORM mapping file. The 'pageSize' property determines the number of entities read from the database for each query execution. -#### [](#databaseItemWriters)Database ItemWriters +#### Database ItemWriters While both flat files and XML files have a specific `ItemWriter` instance, there is no exact equivalent in the database world. This is because transactions provide all the needed functionality.`ItemWriter` implementations are necessary for files because they must act as if they’re transactional, @@ -2170,7 +2170,7 @@ implementations of `ItemWriter` is to flush on each call to `write()`. Doing so for items to be skipped reliably, with Spring Batch internally taking care of the granularity of the calls to `ItemWriter` after an error. 
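As an illustration of that advice, the following is a minimal sketch of a Hibernate-backed writer that flushes inside `write()` (the `CustomerCredit` entity and the constructor wiring are assumptions made for this example):

```
public class HibernateCreditItemWriter implements ItemWriter<CustomerCredit> {

    private final SessionFactory sessionFactory;

    public HibernateCreditItemWriter(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    @Override
    public void write(List<? extends CustomerCredit> items) {
        Session session = sessionFactory.getCurrentSession();
        for (CustomerCredit credit : items) {
            session.saveOrUpdate(credit);
        }
        // Flushing here surfaces any batched SQL errors within this call to
        // write(), which lets Spring Batch isolate and skip the failing item.
        session.flush();
    }
}
```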
-### [](#reusingExistingServices)Reusing Existing Services +### Reusing Existing Services Batch systems are often used in conjunction with other application styles. The most common is an online system, but it may also support integration or even a thick client @@ -2257,7 +2257,7 @@ public FooService fooService() { } ``` -### [](#process-indicator)Preventing State Persistence +### Preventing State Persistence By default, all of the `ItemReader` and `ItemWriter` implementations store their current state in the `ExecutionContext` before it is committed. However, this may not always be @@ -2320,7 +2320,7 @@ public JdbcCursorItemReader playerSummarizationSource(DataSource dataSource) { The `ItemReader` configured above does not make any entries in the `ExecutionContext` for any executions in which it participates. -### [](#customReadersWriters)Creating Custom ItemReaders and ItemWriters +### Creating Custom ItemReaders and ItemWriters So far, this chapter has discussed the basic contracts of reading and writing in Spring Batch and some common implementations for doing so. However, these are all fairly @@ -2328,7 +2328,7 @@ generic, and there are many potential scenarios that may not be covered by out-o implementations. This section shows, by using a simple example, how to create a custom`ItemReader` and `ItemWriter` implementation and implement their contracts correctly. The`ItemReader` also implements `ItemStream`, in order to illustrate how to make a reader or writer restartable. -#### [](#customReader)Custom `ItemReader` Example +#### Custom `ItemReader` Example For the purpose of this example, we create a simple `ItemReader` implementation that reads from a provided list. We start by implementing the most basic contract of`ItemReader`, the `read` method, as shown in the following code: @@ -2370,7 +2370,7 @@ assertEquals("3", itemReader.read()); assertNull(itemReader.read()); ``` -##### [](#restartableReader)Making the `ItemReader` Restartable +##### Making the `ItemReader` Restartable The final challenge is to make the `ItemReader` restartable. Currently, if processing is interrupted and begins again, the `ItemReader` must start at the beginning. This is @@ -2451,7 +2451,7 @@ to guarantee uniqueness. However, in the rare cases where two of the same type o output), a more unique name is needed. For this reason, many of the Spring Batch`ItemReader` and `ItemWriter` implementations have a `setName()` property that lets this key name be overridden. -#### [](#customWriter)Custom `ItemWriter` Example +#### Custom `ItemWriter` Example Implementing a Custom `ItemWriter` is similar in many ways to the `ItemReader` example above but differs in enough ways as to warrant its own example. However, adding @@ -2473,7 +2473,7 @@ public class CustomItemWriter implements ItemWriter { } ``` -##### [](#restartableWriter)Making the `ItemWriter` Restartable +##### Making the `ItemWriter` Restartable To make the `ItemWriter` restartable, we would follow the same process as for the`ItemReader`, adding and implementing the `ItemStream` interface to synchronize the execution context. In the example, we might have to count the number of items processed @@ -2487,12 +2487,12 @@ When you have a stateful writer you should probably be sure to implement `ItemSt well as `ItemWriter`. Remember also that the client of the writer needs to be aware of the `ItemStream`, so you may need to register it as a stream in the configuration. 
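Putting those pieces together, the following is one possible sketch of the restartable variant (the context key name and the choice to track a simple written-item count are illustrative):

```
public class CustomItemWriter<T> implements ItemWriter<T>, ItemStream {

    private static final String WRITTEN_COUNT_KEY = "custom.writer.written.count";

    private final List<T> output = TransactionAwareProxyFactory.createTransactionalList();
    private long writtenCount;

    @Override
    public void write(List<? extends T> items) {
        output.addAll(items);
        writtenCount += items.size();
    }

    @Override
    public void open(ExecutionContext executionContext) {
        // On a restart, resume from the previously persisted count.
        if (executionContext.containsKey(WRITTEN_COUNT_KEY)) {
            writtenCount = executionContext.getLong(WRITTEN_COUNT_KEY);
        }
    }

    @Override
    public void update(ExecutionContext executionContext) {
        // Called just before each chunk commit, so the persisted state always
        // reflects the items that were successfully written.
        executionContext.putLong(WRITTEN_COUNT_KEY, writtenCount);
    }

    @Override
    public void close() {
    }
}
```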
-### [](#itemReaderAndWriterImplementations)Item Reader and Writer Implementations +### Item Reader and Writer Implementations In this section, we will introduce you to readers and writers that have not already been discussed in the previous sections. -#### [](#decorators)Decorators +#### Decorators In some cases, a user needs specialized behavior to be appended to a pre-existing`ItemReader`. Spring Batch offers some out of the box decorators that can add additional behavior to to your `ItemReader` and `ItemWriter` implementations. @@ -2511,12 +2511,12 @@ Spring Batch includes the following decorators: * [`ClassifierCompositeItemProcessor`](#classifierCompositeItemProcessor) -##### [](#synchronizedItemStreamReader)`SynchronizedItemStreamReader` +##### `SynchronizedItemStreamReader` When using an `ItemReader` that is not thread safe, Spring Batch offers the`SynchronizedItemStreamReader` decorator, which can be used to make the `ItemReader`thread safe. Spring Batch provides a `SynchronizedItemStreamReaderBuilder` to construct an instance of the `SynchronizedItemStreamReader`. -##### [](#singleItemPeekableItemReader)`SingleItemPeekableItemReader` +##### `SingleItemPeekableItemReader` Spring Batch includes a decorator that adds a peek method to an `ItemReader`. This peek method lets the user peek one item ahead. Repeated calls to the peek returns the same @@ -2525,29 +2525,29 @@ item, and this is the next item returned from the `read` method. Spring Batch pr | |SingleItemPeekableItemReader’s peek method is not thread-safe, because it would not
be possible to honor the peek in multiple threads. Only one of the threads that peeked
would get that item in the next call to read.| |---|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -##### [](#synchronizedItemStreamWriter)`SynchronizedItemStreamWriter` +##### `SynchronizedItemStreamWriter` When using an `ItemWriter` that is not thread safe, Spring Batch offers the`SynchronizedItemStreamWriter` decorator, which can be used to make the `ItemWriter`thread safe. Spring Batch provides a `SynchronizedItemStreamWriterBuilder` to construct an instance of the `SynchronizedItemStreamWriter`. -##### [](#multiResourceItemWriter)`MultiResourceItemWriter` +##### `MultiResourceItemWriter` The `MultiResourceItemWriter` wraps a `ResourceAwareItemWriterItemStream` and creates a new output resource when the count of items written in the current resource exceeds the`itemCountLimitPerResource`. Spring Batch provides a `MultiResourceItemWriterBuilder` to construct an instance of the `MultiResourceItemWriter`. -##### [](#classifierCompositeItemWriter)`ClassifierCompositeItemWriter` +##### `ClassifierCompositeItemWriter` The `ClassifierCompositeItemWriter` calls one of a collection of `ItemWriter`implementations for each item, based on a router pattern implemented through the provided`Classifier`. The implementation is thread-safe if all delegates are thread-safe. Spring Batch provides a `ClassifierCompositeItemWriterBuilder` to construct an instance of the`ClassifierCompositeItemWriter`. -##### [](#classifierCompositeItemProcessor)`ClassifierCompositeItemProcessor` +##### `ClassifierCompositeItemProcessor` The `ClassifierCompositeItemProcessor` is an `ItemProcessor` that calls one of a collection of `ItemProcessor` implementations, based on a router pattern implemented through the provided `Classifier`. Spring Batch provides a`ClassifierCompositeItemProcessorBuilder` to construct an instance of the`ClassifierCompositeItemProcessor`. -#### [](#messagingReadersAndWriters)Messaging Readers And Writers +#### Messaging Readers And Writers Spring Batch offers the following readers and writers for commonly used messaging systems: @@ -2563,43 +2563,43 @@ Spring Batch offers the following readers and writers for commonly used messagin * [`KafkaItemWriter`](#kafkaItemWriter) -##### [](#amqpItemReader)`AmqpItemReader` +##### `AmqpItemReader` The `AmqpItemReader` is an `ItemReader` that uses an `AmqpTemplate` to receive or convert messages from an exchange. Spring Batch provides a `AmqpItemReaderBuilder` to construct an instance of the `AmqpItemReader`. -##### [](#amqpItemWriter)`AmqpItemWriter` +##### `AmqpItemWriter` The `AmqpItemWriter` is an `ItemWriter` that uses an `AmqpTemplate` to send messages to an AMQP exchange. Messages are sent to the nameless exchange if the name not specified in the provided `AmqpTemplate`. Spring Batch provides an `AmqpItemWriterBuilder` to construct an instance of the `AmqpItemWriter`. -##### [](#jmsItemReader)`JmsItemReader` +##### `JmsItemReader` The `JmsItemReader` is an `ItemReader` for JMS that uses a `JmsTemplate`. The template should have a default destination, which is used to provide items for the `read()`method. Spring Batch provides a `JmsItemReaderBuilder` to construct an instance of the`JmsItemReader`. -##### [](#jmsItemWriter)`JmsItemWriter` +##### `JmsItemWriter` The `JmsItemWriter` is an `ItemWriter` for JMS that uses a `JmsTemplate`. 
The template should have a default destination, which is used to send items in `write(List)`. Spring Batch provides a `JmsItemWriterBuilder` to construct an instance of the `JmsItemWriter`. -##### [](#kafkaItemReader)`KafkaItemReader` +##### `KafkaItemReader` The `KafkaItemReader` is an `ItemReader` for an Apache Kafka topic. It can be configured to read messages from multiple partitions of the same topic. It stores message offsets in the execution context to support restart capabilities. Spring Batch provides a`KafkaItemReaderBuilder` to construct an instance of the `KafkaItemReader`. -##### [](#kafkaItemWriter)`KafkaItemWriter` +##### `KafkaItemWriter` The `KafkaItemWriter` is an `ItemWriter` for Apache Kafka that uses a `KafkaTemplate` to send events to a default topic. Spring Batch provides a `KafkaItemWriterBuilder` to construct an instance of the `KafkaItemWriter`. -#### [](#databaseReaders)Database Readers +#### Database Readers Spring Batch offers the following database readers: @@ -2613,37 +2613,37 @@ Spring Batch offers the following database readers: * [`RepositoryItemReader`](#repositoryItemReader) -##### [](#Neo4jItemReader)`Neo4jItemReader` +##### `Neo4jItemReader` The `Neo4jItemReader` is an `ItemReader` that reads objects from the graph database Neo4j by using a paging technique. Spring Batch provides a `Neo4jItemReaderBuilder` to construct an instance of the `Neo4jItemReader`. -##### [](#mongoItemReader)`MongoItemReader` +##### `MongoItemReader` The `MongoItemReader` is an `ItemReader` that reads documents from MongoDB by using a paging technique. Spring Batch provides a `MongoItemReaderBuilder` to construct an instance of the `MongoItemReader`. -##### [](#hibernateCursorItemReader)`HibernateCursorItemReader` +##### `HibernateCursorItemReader` The `HibernateCursorItemReader` is an `ItemStreamReader` for reading database records built on top of Hibernate. It executes the HQL query and then, when initialized, iterates over the result set as the `read()` method is called, successively returning an object corresponding to the current row. Spring Batch provides a`HibernateCursorItemReaderBuilder` to construct an instance of the`HibernateCursorItemReader`. -##### [](#hibernatePagingItemReader)`HibernatePagingItemReader` +##### `HibernatePagingItemReader` The `HibernatePagingItemReader` is an `ItemReader` for reading database records built on top of Hibernate and reading only up to a fixed number of items at a time. Spring Batch provides a `HibernatePagingItemReaderBuilder` to construct an instance of the`HibernatePagingItemReader`. -##### [](#repositoryItemReader)`RepositoryItemReader` +##### `RepositoryItemReader` The `RepositoryItemReader` is an `ItemReader` that reads records by using a`PagingAndSortingRepository`. Spring Batch provides a `RepositoryItemReaderBuilder` to construct an instance of the `RepositoryItemReader`. -#### [](#databaseWriters)Database Writers +#### Database Writers Spring Batch offers the following database writers: @@ -2661,44 +2661,44 @@ Spring Batch offers the following database writers: * [`GemfireItemWriter`](#gemfireItemWriter) -##### [](#neo4jItemWriter)`Neo4jItemWriter` +##### `Neo4jItemWriter` The `Neo4jItemWriter` is an `ItemWriter` implementation that writes to a Neo4j database. Spring Batch provides a `Neo4jItemWriterBuilder` to construct an instance of the`Neo4jItemWriter`. 
-##### [](#mongoItemWriter)`MongoItemWriter` +##### `MongoItemWriter` The `MongoItemWriter` is an `ItemWriter` implementation that writes to a MongoDB store using an implementation of Spring Data’s `MongoOperations`. Spring Batch provides a`MongoItemWriterBuilder` to construct an instance of the `MongoItemWriter`. -##### [](#repositoryItemWriter)`RepositoryItemWriter` +##### `RepositoryItemWriter` The `RepositoryItemWriter` is an `ItemWriter` wrapper for a `CrudRepository` from Spring Data. Spring Batch provides a `RepositoryItemWriterBuilder` to construct an instance of the `RepositoryItemWriter`. -##### [](#hibernateItemWriter)`HibernateItemWriter` +##### `HibernateItemWriter` The `HibernateItemWriter` is an `ItemWriter` that uses a Hibernate session to save or update entities that are not part of the current Hibernate session. Spring Batch provides a `HibernateItemWriterBuilder` to construct an instance of the `HibernateItemWriter`. -##### [](#jdbcBatchItemWriter)`JdbcBatchItemWriter` +##### `JdbcBatchItemWriter` The `JdbcBatchItemWriter` is an `ItemWriter` that uses the batching features from`NamedParameterJdbcTemplate` to execute a batch of statements for all items provided. Spring Batch provides a `JdbcBatchItemWriterBuilder` to construct an instance of the`JdbcBatchItemWriter`. -##### [](#jpaItemWriter)`JpaItemWriter` +##### `JpaItemWriter` The `JpaItemWriter` is an `ItemWriter` that uses a JPA `EntityManagerFactory` to merge any entities that are not part of the persistence context. Spring Batch provides a`JpaItemWriterBuilder` to construct an instance of the `JpaItemWriter`. -##### [](#gemfireItemWriter)`GemfireItemWriter` +##### `GemfireItemWriter` The `GemfireItemWriter` is an `ItemWriter` that uses a `GemfireTemplate` that stores items in GemFire as key/value pairs. Spring Batch provides a `GemfireItemWriterBuilder`to construct an instance of the `GemfireItemWriter`. -#### [](#specializedReaders)Specialized Readers +#### Specialized Readers Spring Batch offers the following specialized readers: @@ -2708,26 +2708,26 @@ Spring Batch offers the following specialized readers: * [`AvroItemReader`](#avroItemReader) -##### [](#ldifReader)`LdifReader` +##### `LdifReader` The `LdifReader` reads LDIF (LDAP Data Interchange Format) records from a `Resource`, parses them, and returns a `LdapAttribute` object for each `read` executed. Spring Batch provides a `LdifReaderBuilder` to construct an instance of the `LdifReader`. -##### [](#mappingLdifReader)`MappingLdifReader` +##### `MappingLdifReader` The `MappingLdifReader` reads LDIF (LDAP Data Interchange Format) records from a`Resource`, parses them then maps each LDIF record to a POJO (Plain Old Java Object). Each read returns a POJO. Spring Batch provides a `MappingLdifReaderBuilder` to construct an instance of the `MappingLdifReader`. -##### [](#avroItemReader)`AvroItemReader` +##### `AvroItemReader` The `AvroItemReader` reads serialized Avro data from a Resource. Each read returns an instance of the type specified by a Java class or Avro Schema. The reader may be optionally configured for input that embeds an Avro schema or not. Spring Batch provides an `AvroItemReaderBuilder` to construct an instance of the `AvroItemReader`. 
-#### [](#specializedWriters)Specialized Writers +#### Specialized Writers Spring Batch offers the following specialized writers: @@ -2735,25 +2735,25 @@ Spring Batch offers the following specialized writers: * [`AvroItemWriter`](#avroItemWriter) -##### [](#simpleMailMessageItemWriter)`SimpleMailMessageItemWriter` +##### `SimpleMailMessageItemWriter` The `SimpleMailMessageItemWriter` is an `ItemWriter` that can send mail messages. It delegates the actual sending of messages to an instance of `MailSender`. Spring Batch provides a `SimpleMailMessageItemWriterBuilder` to construct an instance of the`SimpleMailMessageItemWriter`. -##### [](#avroItemWriter)`AvroItemWriter` +##### `AvroItemWriter` The `AvroItemWrite` serializes Java objects to a WriteableResource according to the given type or Schema. The writer may be optionally configured to embed an Avro schema in the output or not. Spring Batch provides an `AvroItemWriterBuilder` to construct an instance of the `AvroItemWriter`. -#### [](#specializedProcessors)Specialized Processors +#### Specialized Processors Spring Batch offers the following specialized processors: * [`ScriptItemProcessor`](#scriptItemProcessor) -##### [](#scriptItemProcessor)`ScriptItemProcessor` +##### `ScriptItemProcessor` The `ScriptItemProcessor` is an `ItemProcessor` that passes the current item to process to the provided script and the result of the script is returned by the processor. Spring diff --git a/docs/en/spring-batch/repeat.md b/docs/en/spring-batch/repeat.md index 45761a7..faf19b5 100644 --- a/docs/en/spring-batch/repeat.md +++ b/docs/en/spring-batch/repeat.md @@ -1,10 +1,10 @@ # Repeat -## [](#repeat)Repeat +## Repeat XMLJavaBoth -### [](#repeatTemplate)RepeatTemplate +### RepeatTemplate Batch processing is about repetitive actions, either as a simple optimization or as part of a job. To strategize and generalize the repetition and to provide what amounts to an @@ -61,7 +61,7 @@ considerations intrinsic to the work being done in the callback. Others are effe infinite loops as far as the callback is concerned and the completion decision is delegated to an external policy, as in the case shown in the preceding example. -#### [](#repeatContext)RepeatContext +#### RepeatContext The method parameter for the `RepeatCallback` is a `RepeatContext`. Many callbacks ignore the context. However, if necessary, it can be used as an attribute bag to store transient @@ -73,7 +73,7 @@ parent context is occasionally useful for storing data that need to be shared be calls to `iterate`. This is the case, for instance, if you want to count the number of occurrences of an event in the iteration and remember it across subsequent calls. -#### [](#repeatStatus)RepeatStatus +#### RepeatStatus `RepeatStatus` is an enumeration used by Spring Batch to indicate whether processing has finished. It has two possible `RepeatStatus` values, described in the following table: @@ -86,7 +86,7 @@ finished. It has two possible `RepeatStatus` values, described in the following `RepeatStatus` values can also be combined with a logical AND operation by using the`and()` method in `RepeatStatus`. The effect of this is to do a logical AND on the continuable flag. In other words, if either status is `FINISHED`, then the result is`FINISHED`. -### [](#completionPolicies)Completion Policies +### Completion Policies Inside a `RepeatTemplate`, the termination of the loop in the `iterate` method is determined by a `CompletionPolicy`, which is also a factory for the `RepeatContext`. 
The`RepeatTemplate` has the responsibility to use the current policy to create a`RepeatContext` and pass that in to the `RepeatCallback` at every stage in the iteration. @@ -99,7 +99,7 @@ Users might need to implement their own completion policies for more complicated decisions. For example, a batch processing window that prevents batch jobs from executing once the online systems are in use would require a custom policy. -### [](#repeatExceptionHandling)Exception Handling +### Exception Handling If there is an exception thrown inside a `RepeatCallback`, the `RepeatTemplate` consults an `ExceptionHandler`, which can decide whether or not to re-throw the exception. @@ -127,7 +127,7 @@ called `useParent`. It is `false` by default, so the limit is only accounted for current `RepeatContext`. When set to `true`, the limit is kept across sibling contexts in a nested iteration (such as a set of chunks inside a step). -### [](#repeatListeners)Listeners +### Listeners Often, it is useful to be able to receive additional callbacks for cross-cutting concerns across a number of different iterations. For this purpose, Spring Batch provides the`RepeatListener` interface. The `RepeatTemplate` lets users register `RepeatListener`implementations, and they are given callbacks with the `RepeatContext` and `RepeatStatus`where available during the iteration. @@ -149,14 +149,14 @@ The `open` and `close` callbacks come before and after the entire iteration. `be Note that, when there is more than one listener, they are in a list, so there is an order. In this case, `open` and `before` are called in the same order while `after`,`onError`, and `close` are called in reverse order. -### [](#repeatParallelProcessing)Parallel Processing +### Parallel Processing Implementations of `RepeatOperations` are not restricted to executing the callback sequentially. It is quite important that some implementations are able to execute their callbacks in parallel. To this end, Spring Batch provides the`TaskExecutorRepeatTemplate`, which uses the Spring `TaskExecutor` strategy to run the`RepeatCallback`. The default is to use a `SynchronousTaskExecutor`, which has the effect of executing the whole iteration in the same thread (the same as a normal`RepeatTemplate`). -### [](#declarativeIteration)Declarative Iteration +### Declarative Iteration Sometimes there is some business processing that you know you want to repeat every time it happens. The classic example of this is the optimization of a message pipeline. It is diff --git a/docs/en/spring-batch/retry.md b/docs/en/spring-batch/retry.md index 34e6443..1c4eadd 100644 --- a/docs/en/spring-batch/retry.md +++ b/docs/en/spring-batch/retry.md @@ -1,6 +1,6 @@ # Retry -## [](#retry)Retry +## Retry XMLJavaBoth @@ -9,7 +9,7 @@ automatically retry a failed operation in case it might succeed on a subsequent Errors that are susceptible to intermittent failure are often transient in nature. Examples include remote calls to a web service that fails because of a network glitch or a`DeadlockLoserDataAccessException` in a database update. -### [](#retryTemplate)`RetryTemplate` +### `RetryTemplate` | |The retry functionality was pulled out of Spring Batch as of 2.2.0.
It is now part of a new library, [Spring Retry](https://github.com/spring-projects/spring-retry).| |---|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------| @@ -75,7 +75,7 @@ Foo result = template.execute(new RetryCallback() { In the preceding example, we make a web service call and return the result to the user. If that call fails, then it is retried until a timeout is reached. -#### [](#retryContext)`RetryContext` +#### `RetryContext` The method parameter for the `RetryCallback` is a `RetryContext`. Many callbacks ignore the context, but, if necessary, it can be used as an attribute bag to store data for the @@ -85,7 +85,7 @@ A `RetryContext` has a parent context if there is a nested retry in progress in thread. The parent context is occasionally useful for storing data that need to be shared between calls to `execute`. -#### [](#recoveryCallback)`RecoveryCallback` +#### `RecoveryCallback` When a retry is exhausted, the `RetryOperations` can pass control to a different callback, called the `RecoveryCallback`. To use this feature, clients pass in the callbacks together @@ -106,7 +106,7 @@ Foo foo = template.execute(new RetryCallback() { If the business logic does not succeed before the template decides to abort, then the client is given the chance to do some alternate processing through the recovery callback. -#### [](#statelessRetry)Stateless Retry +#### Stateless Retry In the simplest case, a retry is just a while loop. The `RetryTemplate` can just keep trying until it either succeeds or fails. The `RetryContext` contains some state to @@ -115,7 +115,7 @@ to store it anywhere globally, so we call this stateless retry. The distinction stateless and stateful retry is contained in the implementation of the `RetryPolicy` (the`RetryTemplate` can handle both). In a stateless retry, the retry callback is always executed in the same thread it was on when it failed. -#### [](#statefulRetry)Stateful Retry +#### Stateful Retry Where the failure has caused a transactional resource to become invalid, there are some special considerations. This does not apply to a simple remote call because there is no @@ -154,7 +154,7 @@ The decision to retry or not is actually delegated to a regular `RetryPolicy`, s usual concerns about limits and timeouts can be injected there (described later in this chapter). -### [](#retryPolicies)Retry Policies +### Retry Policies Inside a `RetryTemplate`, the decision to retry or fail in the `execute` method is determined by a `RetryPolicy`, which is also a factory for the `RetryContext`. The`RetryTemplate` has the responsibility to use the current policy to create a`RetryContext` and pass that in to the `RetryCallback` at every attempt. After a callback @@ -206,7 +206,7 @@ Users might need to implement their own retry policies for more customized decis instance, a custom retry policy makes sense when there is a well-known, solution-specific classification of exceptions into retryable and not retryable. -### [](#backoffPolicies)Backoff Policies +### Backoff Policies When retrying after a transient failure, it often helps to wait a bit before trying again, because usually the failure is caused by some problem that can only be resolved by @@ -232,7 +232,7 @@ backoff with an exponentially increasing wait period, to avoid two retries getti lock step and both failing (this is a lesson learned from ethernet). 
For this purpose, Spring Batch provides the `ExponentialBackoffPolicy`. -### [](#retryListeners)Listeners +### Listeners Often, it is useful to be able to receive additional callbacks for cross cutting concerns across a number of different retries. For this purpose, Spring Batch provides the`RetryListener` interface. The `RetryTemplate` lets users register `RetryListeners`, and @@ -261,7 +261,7 @@ Note that, when there is more than one listener, they are in a list, so there is In this case, `open` is called in the same order while `onError` and `close` are called in reverse order. -### [](#declarativeRetry)Declarative Retry +### Declarative Retry Sometimes, there is some business processing that you know you want to retry every time it happens. The classic example of this is the remote service call. Spring Batch provides an diff --git a/docs/en/spring-batch/scalability.md b/docs/en/spring-batch/scalability.md index 83d9cae..d3fb0d6 100644 --- a/docs/en/spring-batch/scalability.md +++ b/docs/en/spring-batch/scalability.md @@ -1,6 +1,6 @@ # Scaling and Parallel Processing -## [](#scalability)Scaling and Parallel Processing +## Scaling and Parallel Processing XMLJavaBoth @@ -31,7 +31,7 @@ These break down into categories as well, as follows: First, we review the single-process options. Then we review the multi-process options. -### [](#multithreadedStep)Multi-threaded Step +### Multi-threaded Step The simplest way to start parallel processing is to add a `TaskExecutor` to your Step configuration. @@ -128,7 +128,7 @@ synchronizing delegator. You can synchronize the call to `read()` and as long as processing and writing is the most expensive part of the chunk, your step may still complete much faster than it would in a single threaded configuration. -### [](#scalabilityParallelSteps)Parallel Steps +### Parallel Steps As long as the application logic that needs to be parallelized can be split into distinct responsibilities and assigned to individual steps, then it can be parallelized in a @@ -203,7 +203,7 @@ aggregating the exit statuses and transitioning. See the section on [Split Flows](step.html#split-flows) for more detail. -### [](#remoteChunking)Remote Chunking +### Remote Chunking In remote chunking, the `Step` processing is split across multiple processes, communicating with each other through some middleware. The following image shows the @@ -233,7 +233,7 @@ the grid computing and shared memory product space. See the section on[Spring Batch Integration - Remote Chunking](spring-batch-integration.html#remote-chunking)for more detail. -### [](#partitioning)Partitioning +### Partitioning Spring Batch also provides an SPI for partitioning a `Step` execution and executing it remotely. In this case, the remote participants are `Step` instances that could just as @@ -302,7 +302,7 @@ Spring Batch creates step executions for the partitions called "step1:partition0 on. Many people prefer to call the manager step "step1:manager" for consistency. You can use an alias for the step (by specifying the `name` attribute instead of the `id`attribute). -#### [](#partitionHandler)PartitionHandler +#### PartitionHandler The `PartitionHandler` is the component that knows about the fabric of the remoting or grid environment. It is able to send `StepExecution` requests to the remote `Step`instances, wrapped in some fabric-specific format, like a DTO. It does not have to know @@ -371,7 +371,7 @@ copying large numbers of files or replicating filesystems into content managemen systems. 
It can also be used for remote execution by providing a `Step` implementation that is a proxy for a remote invocation (such as using Spring Remoting). -#### [](#partitioner)Partitioner +#### Partitioner The `Partitioner` has a simpler responsibility: to generate execution contexts as input parameters for new step executions only (no need to worry about restarts). It has a @@ -402,7 +402,7 @@ interface, then, on a restart, only the names are queried. If partitioning is ex this can be a useful optimization. The names provided by the `PartitionNameProvider` must match those provided by the `Partitioner`. -#### [](#bindingInputDataToSteps)Binding Input Data to Steps +#### Binding Input Data to Steps It is very efficient for the steps that are executed by the `PartitionHandler` to have identical configuration and for their input parameters to be bound at runtime from the`ExecutionContext`. This is easy to do with the StepScope feature of Spring Batch diff --git a/docs/en/spring-batch/schema-appendix.md b/docs/en/spring-batch/schema-appendix.md index 089b1bb..a4070b1 100644 --- a/docs/en/spring-batch/schema-appendix.md +++ b/docs/en/spring-batch/schema-appendix.md @@ -1,8 +1,8 @@ # Meta-Data Schema -## [](#metaDataSchema)Appendix A: Meta-Data Schema +## Appendix A: Meta-Data Schema -### [](#metaDataSchemaOverview)Overview +### Overview The Spring Batch Metadata tables closely match the Domain objects that represent them in Java. For example, `JobInstance`, `JobExecution`, `JobParameters`, and `StepExecution`map to `BATCH_JOB_INSTANCE`, `BATCH_JOB_EXECUTION`, `BATCH_JOB_EXECUTION_PARAMS`, and`BATCH_STEP_EXECUTION`, respectively. `ExecutionContext` maps to both`BATCH_JOB_EXECUTION_CONTEXT` and `BATCH_STEP_EXECUTION_CONTEXT`. The `JobRepository` is @@ -18,7 +18,7 @@ shows an ERD model of all 6 tables and their relationships to one another: Figure 1. Spring Batch Meta-Data ERD -#### [](#exampleDDLScripts)Example DDL Scripts +#### Example DDL Scripts The Spring Batch Core JAR file contains example scripts to create the relational tables for a number of database platforms (which are, in turn, auto-detected by the job @@ -27,7 +27,7 @@ modified with additional indexes and constraints as desired. The file names are form `schema-*.sql`, where "\*" is the short name of the target database platform. The scripts are in the package `org.springframework.batch.core`. -#### [](#migrationDDLScripts)Migration DDL Scripts +#### Migration DDL Scripts Spring Batch provides migration DDL scripts that you need to execute when you upgrade versions. These scripts can be found in the Core Jar file under `org/springframework/batch/core/migration`. @@ -37,7 +37,7 @@ Migration scripts are organized into folders corresponding to version numbers in * `4.1`: contains scripts needed if you are migrating from a version before `4.1` to version `4.1` -#### [](#metaDataVersion)Version +#### Version Many of the database tables discussed in this appendix contain a version column. This column is important because Spring Batch employs an optimistic locking strategy when @@ -47,7 +47,7 @@ back to save the value, if the version number has changed it throws an`Optimisti access. This check is necessary, since, even though different batch jobs may be running in different machines, they all use the same database tables. -#### [](#metaDataIdentity)Identity +#### Identity `BATCH_JOB_INSTANCE`, `BATCH_JOB_EXECUTION`, and `BATCH_STEP_EXECUTION` each contain columns ending in `_ID`. These fields act as primary keys for their respective tables. 
@@ -80,7 +80,7 @@ INSERT INTO BATCH_JOB_SEQ values(0); In the preceding case, a table is used in place of each sequence. The Spring core class,`MySQLMaxValueIncrementer`, then increments the one column in this sequence in order to give similar functionality. -### [](#metaDataBatchJobInstance)`BATCH_JOB_INSTANCE` +### `BATCH_JOB_INSTANCE` The `BATCH_JOB_INSTANCE` table holds all information relevant to a `JobInstance`, and serves as the top of the overall hierarchy. The following generic DDL statement is used @@ -109,7 +109,7 @@ The following list describes each column in the table: instances of the same job from one another. (`JobInstances` with the same job name must have different `JobParameters` and, thus, different `JOB_KEY` values). -### [](#metaDataBatchJobParams)`BATCH_JOB_EXECUTION_PARAMS` +### `BATCH_JOB_EXECUTION_PARAMS` The `BATCH_JOB_EXECUTION_PARAMS` table holds all information relevant to the`JobParameters` object. It contains 0 or more key/value pairs passed to a `Job` and serves as a record of the parameters with which a job was run. For each parameter that @@ -159,7 +159,7 @@ Note that there is no primary key for this table. This is because the framework use for one and, thus, does not require it. If need be, you can add a primary key may be added with a database generated key without causing any issues to the framework itself. -### [](#metaDataBatchJobExecution)`BATCH_JOB_EXECUTION` +### `BATCH_JOB_EXECUTION` The `BATCH_JOB_EXECUTION` table holds all information relevant to the `JobExecution`object. Every time a `Job` is run, there is always a new `JobExecution`, and a new row in this table. The following listing shows the definition of the `BATCH_JOB_EXECUTION`table: @@ -213,7 +213,7 @@ The following list describes each column: * `LAST_UPDATED`: Timestamp representing the last time this execution was persisted. -### [](#metaDataBatchStepExecution)`BATCH_STEP_EXECUTION` +### `BATCH_STEP_EXECUTION` The BATCH\_STEP\_EXECUTION table holds all information relevant to the `StepExecution`object. This table is similar in many ways to the `BATCH_JOB_EXECUTION` table, and there is always at least one entry per `Step` for each `JobExecution` created. The following @@ -293,7 +293,7 @@ The following list describes for each column: * `LAST_UPDATED`: Timestamp representing the last time this execution was persisted. -### [](#metaDataBatchJobExecutionContext)`BATCH_JOB_EXECUTION_CONTEXT` +### `BATCH_JOB_EXECUTION_CONTEXT` The `BATCH_JOB_EXECUTION_CONTEXT` table holds all information relevant to the`ExecutionContext` of a `Job`. There is exactly one `Job` `ExecutionContext` per`JobExecution`, and it contains all of the job-level data that is needed for a particular job execution. This data typically represents the state that must be retrieved after a @@ -319,7 +319,7 @@ The following list describes each column: * `SERIALIZED_CONTEXT`: The entire context, serialized. -### [](#metaDataBatchStepExecutionContext)`BATCH_STEP_EXECUTION_CONTEXT` +### `BATCH_STEP_EXECUTION_CONTEXT` The `BATCH_STEP_EXECUTION_CONTEXT` table holds all information relevant to the`ExecutionContext` of a `Step`. There is exactly one `ExecutionContext` per`StepExecution`, and it contains all of the data that needs to be persisted for a particular step execution. This data typically represents the @@ -345,7 +345,7 @@ The following list describes each column: * `SERIALIZED_CONTEXT`: The entire context, serialized. 
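Note that this column is simply the persisted form of the in-memory `ExecutionContext`. As a rough sketch, state stored as follows from step code is what ends up serialized here (the key name is hypothetical):

```
// Anything put into the step's ExecutionContext is serialized into the
// SERIALIZED_CONTEXT column when the execution is persisted.
ExecutionContext stepContext = stepExecution.getExecutionContext();
stepContext.putLong("lines.read.count", linesRead);
```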
-### [](#metaDataArchiving)Archiving +### Archiving Because there are entries in multiple tables every time a batch job is run, it is common to create an archive strategy for the metadata tables. The tables themselves are designed @@ -362,7 +362,7 @@ job, with a few notable exceptions pertaining to restart: this table for jobs that have not completed successfully prevents them from starting at the correct point if run again. -### [](#multiByteCharacters)International and Multi-byte Characters +### International and Multi-byte Characters If you are using multi-byte character sets (such as Chinese or Cyrillic) in your business processing, then those characters might need to be persisted in the Spring Batch schema. @@ -370,7 +370,7 @@ Many users find that simply changing the schema to double the length of the `VAR value of the `VARCHAR` column length. Some users have also reported that they use`NVARCHAR` in place of `VARCHAR` in their schema definitions. The best result depends on the database platform and the way the database server has been configured locally. -### [](#recommendationsForIndexingMetaDataTables)Recommendations for Indexing Meta Data Tables +### Recommendations for Indexing Meta Data Tables Spring Batch provides DDL samples for the metadata tables in the core jar file for several common database platforms. Index declarations are not included in that DDL, diff --git a/docs/en/spring-batch/spring-batch-integration.md b/docs/en/spring-batch/spring-batch-integration.md index b2ac804..f882c5e 100644 --- a/docs/en/spring-batch/spring-batch-integration.md +++ b/docs/en/spring-batch/spring-batch-integration.md @@ -1,10 +1,10 @@ # Spring Batch Integration -## [](#springBatchIntegration)Spring Batch Integration +## Spring Batch Integration XMLJavaBoth -### [](#spring-batch-integration-introduction)Spring Batch Integration Introduction +### Spring Batch Integration Introduction Many users of Spring Batch may encounter requirements that are outside the scope of Spring Batch but that may be efficiently and @@ -44,7 +44,7 @@ This section covers the following key concepts: * [Externalizing Batch Process Execution](#externalizing-batch-process-execution) -#### [](#namespace-support)Namespace Support +#### Namespace Support Since Spring Batch Integration 1.3, dedicated XML Namespace support was added, with the aim to provide an easier configuration @@ -97,7 +97,7 @@ could possibly create issues when updating the Spring Batch Integration dependencies, as they may require more recent versions of the XML schema. -#### [](#launching-batch-jobs-through-messages)Launching Batch Jobs through Messages +#### Launching Batch Jobs through Messages When starting batch jobs by using the core Spring Batch API, you basically have 2 options: @@ -138,7 +138,7 @@ message flow in order to start a Batch job. The[EIP (Enterprise Integration Patt Figure 1. Launch Batch Job -##### [](#transforming-a-file-into-a-joblaunchrequest)Transforming a file into a JobLaunchRequest +##### Transforming a file into a JobLaunchRequest ``` package io.spring.sbi; @@ -176,7 +176,7 @@ public class FileMessageToJobRequest { } ``` -##### [](#the-jobexecution-response)The `JobExecution` Response +##### The `JobExecution` Response When a batch job is being executed, a`JobExecution` instance is returned. This instance can be used to determine the status of an execution. If @@ -191,7 +191,7 @@ using the `JobExplorer`. 
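A minimal polling sketch, assuming the `jobExecutionId` was captured from the returned `JobExecution` and that a `JobExplorer` has been injected:

```
// Re-query the job repository for the current state of the (possibly
// still running) job.
JobExecution execution = jobExplorer.getJobExecution(jobExecutionId);
if (execution != null && execution.getStatus() == BatchStatus.COMPLETED) {
    // The job has finished; downstream processing can proceed.
}
```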
For more information, please refer to the Spring Batch reference documentation on[Querying the Repository](job.html#queryingRepository). -##### [](#spring-batch-integration-configuration)Spring Batch Integration Configuration +##### Spring Batch Integration Configuration Consider a case where someone needs to create a file `inbound-channel-adapter` to listen for CSV files in the provided directory, hand them off to a transformer @@ -263,7 +263,7 @@ public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) } ``` -##### [](#example-itemreader-configuration)Example ItemReader Configuration +##### Example ItemReader Configuration Now that we are polling for files and launching jobs, we need to configure our Spring Batch `ItemReader` (for example) to use the files found at the location defined by the job @@ -301,7 +301,7 @@ The main points of interest in the preceding example are injecting the value of` to have *Step scope*. Setting the bean to have Step scope takes advantage of the late binding support, which allows access to the`jobParameters` variable. -### [](#availableAttributesOfTheJobLaunchingGateway)Available Attributes of the Job-Launching Gateway +### Available Attributes of the Job-Launching Gateway The job-launching gateway has the following attributes that you can set to control a job: @@ -338,7 +338,7 @@ The job-launching gateway has the following attributes that you can set to contr * `order`: Specifies the order of invocation when this endpoint is connected as a subscriber to a `SubscribableChannel`. -### [](#sub-elements)Sub-Elements +### Sub-Elements When this `Gateway` is receiving messages from a`PollableChannel`, you must either provide a global default `Poller` or provide a `Poller` sub-element to the`Job Launching Gateway`. @@ -368,7 +368,7 @@ public JobLaunchingGateway sampleJobLaunchingGateway() { } ``` -#### [](#providing-feedback-with-informational-messages)Providing Feedback with Informational Messages +#### Providing Feedback with Informational Messages As Spring Batch jobs can run for long times, providing progress information is often critical. For example, stake-holders may want @@ -477,7 +477,7 @@ public Job importPaymentsJob() { } ``` -#### [](#asynchronous-processors)Asynchronous Processors +#### Asynchronous Processors Asynchronous Processors help you to scale the processing of items. In the asynchronous processor use case, an `AsyncItemProcessor` serves as a dispatcher, executing the logic of @@ -549,7 +549,7 @@ public AsyncItemWriter writer(ItemWriter itemWriter) { Again, the `delegate` property is actually a reference to your `ItemWriter` bean. -#### [](#externalizing-batch-process-execution)Externalizing Batch Process Execution +#### Externalizing Batch Process Execution The integration approaches discussed so far suggest use cases where Spring Integration wraps Spring Batch like an outer-shell. @@ -563,7 +563,7 @@ provides dedicated support for: * Remote Partitioning -##### [](#remote-chunking)Remote Chunking +##### Remote Chunking ![Remote Chunking](./images/remote-chunking-sbi.png) @@ -922,7 +922,7 @@ public class RemoteChunkingJobConfiguration { You can find a complete example of a remote chunking job[here](https://github.com/spring-projects/spring-batch/tree/master/spring-batch-samples#remote-chunking-sample). 
-##### [](#remote-partitioning)Remote Partitioning +##### Remote Partitioning ![Remote Partitioning](./images/remote-partitioning.png) diff --git a/docs/en/spring-batch/spring-batch-intro.md b/docs/en/spring-batch/spring-batch-intro.md index 336380f..e569085 100644 --- a/docs/en/spring-batch/spring-batch-intro.md +++ b/docs/en/spring-batch/spring-batch-intro.md @@ -1,6 +1,6 @@ # Spring Batch Introduction -## [](#spring-batch-intro)Spring Batch Introduction +## Spring Batch Introduction Many applications within the enterprise domain require bulk processing to perform business operations in mission critical environments. These business operations include: @@ -37,7 +37,7 @@ as complex, high volume use cases (such as moving high volumes of data between d transforming it, and so on). High-volume batch jobs can leverage the framework in a highly scalable manner to process significant volumes of information. -### [](#springBatchBackground)Background +### Background While open source software projects and associated communities have focused greater attention on web-based and microservices-based architecture frameworks, there has been a @@ -69,7 +69,7 @@ consistently leveraged by enterprise users when creating batch applications. Com and government agencies desiring to deliver standard, proven solutions to their enterprise IT environments can benefit from Spring Batch. -### [](#springBatchUsageScenarios)Usage Scenarios +### Usage Scenarios A typical batch program generally: @@ -125,7 +125,7 @@ Technical Objectives * Provide a simple deployment model, with the architecture JARs completely separate from the application, built using Maven. -### [](#springBatchArchitecture)Spring Batch Architecture +### Spring Batch Architecture Spring Batch is designed with extensibility and a diverse group of end users in mind. The figure below shows the layered architecture that supports the extensibility and ease of @@ -144,7 +144,7 @@ infrastructure. This infrastructure contains common readers and writers and serv writers, such as `ItemReader` and `ItemWriter`) and the core framework itself (retry, which is its own library). -### [](#batchArchitectureConsiderations)General Batch Principles and Guidelines +### General Batch Principles and Guidelines The following key principles, guidelines, and general considerations should be considered when building a batch solution. @@ -199,7 +199,7 @@ when building a batch solution. If the system depends on flat files, file backup procedures should not only be in place and documented but be regularly tested as well. -### [](#batchProcessingStrategy)Batch Processing Strategies +### Batch Processing Strategies To help design and implement batch systems, basic batch application building blocks and patterns should be provided to the designers and programmers in the form of sample diff --git a/docs/en/spring-batch/step.md b/docs/en/spring-batch/step.md index ba941e4..650f0a5 100644 --- a/docs/en/spring-batch/step.md +++ b/docs/en/spring-batch/step.md @@ -1,6 +1,6 @@ # Configuring a Step -## [](#configureStep)Configuring a `Step` +## Configuring a `Step` XMLJavaBoth @@ -17,7 +17,7 @@ processing, as shown in the following image: Figure 1. Step -### [](#chunkOrientedProcessing)Chunk-oriented Processing +### Chunk-oriented Processing Spring Batch uses a 'Chunk-oriented' processing style within its most common implementation. 
Chunk oriented processing refers to reading the data one at a time and @@ -73,7 +73,7 @@ itemWriter.write(processedItems); For more details about item processors and their use cases, please refer to the[Item processing](processor.html#itemProcessor) section. -#### [](#configuringAStep)Configuring a `Step` +#### Configuring a `Step` Despite the relatively short list of required dependencies for a `Step`, it is an extremely complex class that can potentially contain many collaborators. @@ -159,7 +159,7 @@ optional, since the item could be directly passed from the reader to the writer. It should be noted that `repository` defaults to `jobRepository` and `transactionManager`defaults to `transactionManager` (all provided through the infrastructure from`@EnableBatchProcessing`). Also, the `ItemProcessor` is optional, since the item could be directly passed from the reader to the writer. -#### [](#InheritingFromParentStep)Inheriting from a Parent `Step` +#### Inheriting from a Parent `Step` If a group of `Steps` share similar configurations, then it may be helpful to define a "parent" `Step` from which the concrete `Steps` may inherit properties. Similar to class @@ -193,7 +193,7 @@ reasons: * When creating job flows, as described later in this chapter, the `next` attribute should be referring to the step in the flow, not the standalone step. -##### [](#abstractStep)Abstract `Step` +##### Abstract `Step` Sometimes, it may be necessary to define a parent `Step` that is not a complete `Step`configuration. If, for instance, the `reader`, `writer`, and `tasklet` attributes are left off of a `Step` configuration, then initialization fails. If a parent must be @@ -217,7 +217,7 @@ were not declared to be abstract. The `Step`, "concreteStep2", has 'itemReader', ``` -##### [](#mergingListsOnStep)Merging Lists +##### Merging Lists Some of the configurable elements on `Steps` are lists, such as the `` element. If both the parent and child `Steps` declare a `` element, then the @@ -245,7 +245,7 @@ In the following example, the `Step` "concreteStep3", is created with two listen ``` -#### [](#commitInterval)The Commit Interval +#### The Commit Interval As mentioned previously, a step reads in and writes out items, periodically committing using the supplied `PlatformTransactionManager`. With a `commit-interval` of 1, it @@ -296,12 +296,12 @@ In the preceding example, 10 items are processed within each transaction. At the beginning of processing, a transaction is begun. Also, each time `read` is called on the`ItemReader`, a counter is incremented. When it reaches 10, the list of aggregated items is passed to the `ItemWriter`, and the transaction is committed. -#### [](#stepRestart)Configuring a `Step` for Restart +#### Configuring a `Step` for Restart In the "[Configuring and Running a Job](job.html#configureJob)" section , restarting a`Job` was discussed. Restart has numerous impacts on steps, and, consequently, may require some specific configuration. -##### [](#startLimit)Setting a Start Limit +##### Setting a Start Limit There are many scenarios where you may want to control the number of times a `Step` may be started. For example, a particular `Step` might need to be configured so that it only @@ -342,7 +342,7 @@ The step shown in the preceding example can be run only once. Attempting to run causes a `StartLimitExceededException` to be thrown. Note that the default value for the start-limit is `Integer.MAX_VALUE`. 
-##### [](#allowStartIfComplete)Restarting a Completed `Step` +##### Restarting a Completed `Step` In the case of a restartable job, there may be one or more steps that should always be run, regardless of whether or not they were successful the first time. An example might @@ -379,7 +379,7 @@ public Step step1() { } ``` -##### [](#stepRestartExample)`Step` Restart Configuration Example +##### `Step` Restart Configuration Example The following XML example shows how to configure a job to have steps that can be restarted: @@ -509,7 +509,7 @@ Run 3: the third execution of `playerSummarization`, and its limit is only 2. Either the limit must be raised or the `Job` must be executed as a new `JobInstance`. -#### [](#configuringSkip)Configuring Skip Logic +#### Configuring Skip Logic There are many scenarios where errors encountered while processing should not result in`Step` failure, but should be skipped instead. This is usually a decision that must be made by someone who understands the data itself and what meaning it has. Financial data, @@ -615,7 +615,7 @@ The order of the `` and `` elements does not matter. The order of the `skip` and `noSkip` method calls does not matter. -#### [](#retryLogic)Configuring Retry Logic +#### Configuring Retry Logic In most cases, you want an exception to cause either a skip or a `Step` failure. However, not all exceptions are deterministic. If a `FlatFileParseException` is encountered while @@ -658,7 +658,7 @@ public Step step1() { The `Step` allows a limit for the number of times an individual item can be retried and a list of exceptions that are 'retryable'. More details on how retry works can be found in[retry](retry.html#retry). -#### [](#controllingRollback)Controlling Rollback +#### Controlling Rollback By default, regardless of retry or skip, any exceptions thrown from the `ItemWriter`cause the transaction controlled by the `Step` to rollback. If skip is configured as described earlier, exceptions thrown from the `ItemReader` do not cause a rollback. @@ -699,7 +699,7 @@ public Step step1() { } ``` -##### [](#transactionalReaders)Transactional Readers +##### Transactional Readers The basic contract of the `ItemReader` is that it is forward only. The step buffers reader input, so that in the case of a rollback, the items do not need to be re-read @@ -738,7 +738,7 @@ public Step step1() { } ``` -#### [](#transactionAttributes)Transaction Attributes +#### Transaction Attributes Transaction attributes can be used to control the `isolation`, `propagation`, and`timeout` settings. More information on setting transaction attributes can be found in the[Spring @@ -782,7 +782,7 @@ public Step step1() { } ``` -#### [](#registeringItemStreams)Registering `ItemStream` with a `Step` +#### Registering `ItemStream` with a `Step` The step has to take care of `ItemStream` callbacks at the necessary points in its lifecycle (For more information on the `ItemStream` interface, see[ItemStream](readersAndWriters.html#itemStream)). This is vital if a step fails and might @@ -861,7 +861,7 @@ explicitly registered as a stream because it is a direct property of the `Step`. is now restartable, and the state of the reader and writer is correctly persisted in the event of a failure. -#### [](#interceptingStepExecution)Intercepting `Step` Execution +#### Intercepting `Step` Execution Just as with the `Job`, there are many events during the execution of a `Step` where a user may need to perform some functionality. 
For example, in order to write out to a flat @@ -918,7 +918,7 @@ custom implementations of chunk components such as `ItemReader` or `ItemWriter` as well as registered with the `listener` methods in the builders, so all you need to do is use the XML namespace or builders to register the listeners with a step. -##### [](#stepExecutionListener)`StepExecutionListener` +##### `StepExecutionListener` `StepExecutionListener` represents the most generic listener for `Step` execution. It allows for notification before a `Step` is started and after it ends, whether it ended @@ -943,7 +943,7 @@ The annotations corresponding to this interface are: * `@AfterStep` -##### [](#chunkListener)`ChunkListener` +##### `ChunkListener` A chunk is defined as the items processed within the scope of a transaction. Committing a transaction, at each commit interval, commits a 'chunk'. A `ChunkListener` can be used to @@ -976,7 +976,7 @@ A `ChunkListener` can be applied when there is no chunk declaration. The `Taskle responsible for calling the `ChunkListener`, so it applies to a non-item-oriented tasklet as well (it is called before and after the tasklet). -##### [](#itemReadListener)`ItemReadListener` +##### `ItemReadListener` When discussing skip logic previously, it was mentioned that it may be beneficial to log the skipped records, so that they can be dealt with later. In the case of read errors, @@ -1005,7 +1005,7 @@ The annotations corresponding to this interface are: * `@OnReadError` -##### [](#itemProcessListener)`ItemProcessListener` +##### `ItemProcessListener` Just as with the `ItemReadListener`, the processing of an item can be 'listened' to, as shown in the following interface definition: @@ -1033,7 +1033,7 @@ The annotations corresponding to this interface are: * `@OnProcessError` -##### [](#itemWriteListener)`ItemWriteListener` +##### `ItemWriteListener` The writing of an item can be 'listened' to with the `ItemWriteListener`, as shown in the following interface definition: @@ -1062,7 +1062,7 @@ The annotations corresponding to this interface are: * `@OnWriteError` -##### [](#skipListener)`SkipListener` +##### `SkipListener` `ItemReadListener`, `ItemProcessListener`, and `ItemWriteListener` all provide mechanisms for being notified of errors, but none informs you that a record has actually been @@ -1093,7 +1093,7 @@ The annotations corresponding to this interface are: * `@OnSkipInProcess` -###### [](#skipListenersAndTransactions)SkipListeners and Transactions +###### SkipListeners and Transactions One of the most common use cases for a `SkipListener` is to log out a skipped item, so that another batch process or even human process can be used to evaluate and fix the @@ -1107,7 +1107,7 @@ may be rolled back, Spring Batch makes two guarantees: to ensure that any transactional resources call by the listener are not rolled back by a failure within the `ItemWriter`. -### [](#taskletStep)`TaskletStep` +### `TaskletStep` [Chunk-oriented processing](#chunkOrientedProcessing) is not the only way to process in a`Step`. What if a `Step` must consist of a simple stored procedure call? You could implement the call as an `ItemReader` and return null after the procedure finishes. @@ -1145,7 +1145,7 @@ public Step step1() { | |`TaskletStep` automatically registers the
tasklet as a `StepListener` if it implements the `StepListener`interface.| |---|-----------------------------------------------------------------------------------------------------------------------| -#### [](#taskletAdapter)`TaskletAdapter` +#### `TaskletAdapter` As with other adapters for the `ItemReader` and `ItemWriter` interfaces, the `Tasklet`interface contains an implementation that allows for adapting itself to any pre-existing class: `TaskletAdapter`. An example where this may be useful is an existing DAO that is @@ -1181,7 +1181,7 @@ public MethodInvokingTaskletAdapter myTasklet() { } ``` -#### [](#exampleTaskletImplementation)Example `Tasklet` Implementation +#### Example `Tasklet` Implementation Many batch jobs contain steps that must be done before the main processing begins in order to set up various resources or after processing has completed to cleanup those @@ -1276,7 +1276,7 @@ public FileDeletingTasklet fileDeletingTasklet() { } ``` -### [](#controllingStepFlow)Controlling Step Flow +### Controlling Step Flow With the ability to group steps together within an owning job comes the need to be able to control how the job "flows" from one step to another. The failure of a `Step` does not @@ -1284,7 +1284,7 @@ necessarily mean that the `Job` should fail. Furthermore, there may be more than of 'success' that determines which `Step` should be executed next. Depending upon how a group of `Steps` is configured, certain steps may not even be processed at all. -#### [](#SequentialFlow)Sequential Flow +#### Sequential Flow The simplest flow scenario is a job where all of the steps execute sequentially, as shown in the following image: @@ -1329,7 +1329,7 @@ then the entire `Job` fails and 'step B' does not execute. | |With the Spring Batch XML namespace, the first step listed in the configuration is*always* the first step run by the `Job`. The order of the other step elements does not
matter, but the first step must always appear first in the xml.| |---|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#conditionalFlow)Conditional Flow +#### Conditional Flow In the example above, there are only two possibilities: @@ -1407,7 +1407,7 @@ transitions from most specific to least specific. This means that, even if the o were swapped for "stepA" in the example above, an `ExitStatus` of "FAILED" would still go to "stepC". -##### [](#batchStatusVsExitStatus)Batch Status Versus Exit Status +##### Batch Status Versus Exit Status When configuring a `Job` for conditional flow, it is important to understand the difference between `BatchStatus` and `ExitStatus`. `BatchStatus` is an enumeration that @@ -1503,7 +1503,7 @@ The above code is a `StepExecutionListener` that first checks to make sure the ` successful and then checks to see if the skip count on the `StepExecution` is higher than 0. If both conditions are met, a new `ExitStatus` with an exit code of`COMPLETED WITH SKIPS` is returned. -#### [](#configuringForStop)Configuring for Stop +#### Configuring for Stop After the discussion of [BatchStatus and ExitStatus](#batchStatusVsExitStatus), one might wonder how the `BatchStatus` and `ExitStatus` are determined for the `Job`. @@ -1547,7 +1547,7 @@ important to note that the stop transition elements have no effect on either the final statuses of the `Job`. For example, it is possible for every step in a job to have a status of `FAILED` but for the job to have a status of `COMPLETED`. -##### [](#endElement)Ending at a Step +##### Ending at a Step Configuring a step end instructs a `Job` to stop with a `BatchStatus` of `COMPLETED`. A`Job` that has finished with status `COMPLETED` cannot be restarted (the framework throws a `JobInstanceAlreadyCompleteException`). @@ -1590,7 +1590,7 @@ public Job job() { } ``` -##### [](#failElement)Failing a Step +##### Failing a Step Configuring a step to fail at a given point instructs a `Job` to stop with a`BatchStatus` of `FAILED`. Unlike end, the failure of a `Job` does not prevent the `Job`from being restarted. @@ -1632,7 +1632,7 @@ public Job job() { } ``` -##### [](#stopElement)Stopping a Job at a Given Step +##### Stopping a Job at a Given Step Configuring a job to stop at a particular step instructs a `Job` to stop with a`BatchStatus` of `STOPPED`. Stopping a `Job` can provide a temporary break in processing, so that the operator can take some action before restarting the `Job`. @@ -1668,7 +1668,7 @@ public Job job() { } ``` -#### [](#programmaticFlowDecisions)Programmatic Flow Decisions +#### Programmatic Flow Decisions In some situations, more information than the `ExitStatus` may be required to decide which step to execute next. In this case, a `JobExecutionDecider` can be used to assist @@ -1727,7 +1727,7 @@ public Job job() { } ``` -#### [](#split-flows)Split Flows +#### Split Flows Every scenario described so far has involved a `Job` that executes its steps one at a time in a linear fashion. 
In addition to this typical style, Spring Batch also allows @@ -1785,7 +1785,7 @@ public Job job(Flow flow1, Flow flow2) { } ``` -#### [](#external-flows)Externalizing Flow Definitions and Dependencies Between Jobs +#### Externalizing Flow Definitions and Dependencies Between Jobs Part of the flow in a job can be externalized as a separate bean definition and then re-used. There are two ways to do so. The first is to simply declare the flow as a @@ -1905,7 +1905,7 @@ jobs and steps. Using `JobStep` is also often a good answer to the question: "Ho create dependencies between jobs?" It is a good way to break up a large system into smaller modules and control the flow of jobs. -### [](#late-binding)Late Binding of `Job` and `Step` Attributes +### Late Binding of `Job` and `Step` Attributes Both the XML and flat file examples shown earlier use the Spring `Resource` abstraction to obtain a file. This works because `Resource` has a `getFile` method, which returns a`java.io.File`. Both XML and flat file resources can be configured using standard Spring @@ -2060,7 +2060,7 @@ public FlatFileItemReader flatFileItemReader(@Value("#{stepExecutionContext['inp | |If you are using Spring 3.0 (or above), the expressions in step-scoped beans are in the
Spring Expression Language, a powerful general-purpose language with many interesting
features. To provide backward compatibility, if Spring Batch detects the presence of
older versions of Spring, it uses a native expression language that is less powerful and
that has slightly different parsing rules. The main difference is that the map keys in
the example above do not need to be quoted with Spring 2.5, but the quotes are mandatory
in Spring 3.0.| |---|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#step-scope)Step Scope +#### Step Scope All of the late binding examples shown earlier have a scope of “step” declared on the bean definition. @@ -2114,7 +2114,7 @@ The following example includes the bean definition explicitly: ``` -#### [](#job-scope)Job Scope +#### Job Scope `Job` scope, introduced in Spring Batch 3.0, is similar to `Step` scope in configuration but is a Scope for the `Job` context, so that there is only one instance of such a bean diff --git a/docs/en/spring-batch/testing.md b/docs/en/spring-batch/testing.md index 043fd90..9a563a9 100644 --- a/docs/en/spring-batch/testing.md +++ b/docs/en/spring-batch/testing.md @@ -1,6 +1,6 @@ # Unit Testing -## [](#testing)Unit Testing +## Unit Testing XMLJavaBoth @@ -11,7 +11,7 @@ to think about how to 'end to end' test a batch job, which is what this chapter The spring-batch-test project includes classes that facilitate this end-to-end test approach. -### [](#creatingUnitTestClass)Creating a Unit Test Class +### Creating a Unit Test Class In order for the unit test to run a batch job, the framework must load the job’s ApplicationContext. Two annotations are used to trigger this behavior: @@ -51,7 +51,7 @@ Using XML Configuration public class SkipSampleFunctionalTests { ... } ``` -### [](#endToEndTesting)End-To-End Testing of Batch Jobs +### End-To-End Testing of Batch Jobs 'End To End' testing can be defined as testing the complete run of a batch job from beginning to end. This allows for a test that sets up a test condition, executes the job, @@ -135,7 +135,7 @@ public class SkipSampleFunctionalTests { } ``` -### [](#testingIndividualSteps)Testing Individual Steps +### Testing Individual Steps For complex batch jobs, test cases in the end-to-end testing approach may become unmanageable. It these cases, it may be more useful to have test cases to test individual @@ -148,7 +148,7 @@ results directly. The following example shows how to use the `launchStep` method JobExecution jobExecution = jobLauncherTestUtils.launchStep("loadFileStep"); ``` -### [](#testing-step-scoped-components)Testing Step-Scoped Components +### Testing Step-Scoped Components Often, the components that are configured for your steps at runtime use step scope and late binding to inject context from the step or job execution. These are tricky to test as @@ -243,7 +243,7 @@ int count = StepScopeTestUtils.doInStepScope(stepExecution, }); ``` -### [](#validatingOutputFiles)Validating Output Files +### Validating Output Files When a batch job writes to the database, it is easy to query the database to verify that the output is as expected. 
However, if the batch job writes to a file, it is equally @@ -260,7 +260,7 @@ AssertFile.assertFileEquals(new FileSystemResource(EXPECTED_FILE), new FileSystemResource(OUTPUT_FILE)); ``` -### [](#mockingDomainObjects)Mocking Domain Objects +### Mocking Domain Objects Another common issue encountered while writing unit and integration tests for Spring Batch components is how to mock domain objects. A good example is a `StepExecutionListener`, as diff --git a/docs/en/spring-batch/transaction-appendix.md b/docs/en/spring-batch/transaction-appendix.md index 4edda97..17ecb12 100644 --- a/docs/en/spring-batch/transaction-appendix.md +++ b/docs/en/spring-batch/transaction-appendix.md @@ -1,8 +1,8 @@ # Batch Processing and Transactions -## [](#transactions)Appendix A: Batch Processing and Transactions +## Appendix A: Batch Processing and Transactions -### [](#transactionsNoRetry)Simple Batching with No Retry +### Simple Batching with No Retry Consider the following simple example of a nested batch with no retries. It shows a common scenario for batch processing: An input source is processed until exhausted, and @@ -29,7 +29,7 @@ be either transactional or idempotent. If the chunk at `REPEAT` (3) fails because of a database exception at 3.2, then `TX` (2) must roll back the whole chunk. -### [](#transactionStatelessRetry)Simple Stateless Retry +### Simple Stateless Retry It is also useful to use a retry for an operation which is not transactional, such as a call to a web-service or other remote resource, as shown in the following example: @@ -50,7 +50,7 @@ access (2.1) eventually succeeds, the transaction, `TX` (0), commits. If the rem access (2.1) eventually fails, then the transaction, `TX` (0), is guaranteed to roll back. -### [](#repeatRetry)Typical Repeat-Retry Pattern +### Typical Repeat-Retry Pattern The most typical batch processing pattern is to add a retry to the inner block of the chunk, as shown in the following example: @@ -121,7 +121,7 @@ consecutive attempts but not necessarily at the same item. This is consistent wi overall retry strategy. The inner `RETRY` (4) is aware of the history of each item and can decide whether or not to have another attempt at it. -### [](#asyncChunkProcessing)Asynchronous Chunk Processing +### Asynchronous Chunk Processing The inner batches or chunks in the [typical example](#repeatRetry) can be executed concurrently by configuring the outer batch to use an `AsyncTaskExecutor`. The outer @@ -148,7 +148,7 @@ asynchronous chunk processing: | } ``` -### [](#asyncItemProcessing)Asynchronous Item Processing +### Asynchronous Item Processing The individual items in chunks in the [typical example](#repeatRetry) can also, in principle, be processed concurrently. In this case, the transaction boundary has to move @@ -179,7 +179,7 @@ This plan sacrifices the optimization benefit, which the simple plan had, of hav the transactional resources chunked together. It is only useful if the cost of the processing (5) is much higher than the cost of transaction management (3). -### [](#transactionPropagation)Interactions Between Batching and Transaction Propagation +### Interactions Between Batching and Transaction Propagation There is a tighter coupling between batch-retry and transaction management than we would ideally like. In particular, a stateless retry cannot be used to retry database @@ -241,7 +241,7 @@ What about non-default propagation? Consequently, the `NESTED` pattern is best if the retry block contains any database access. 
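

To make the `NESTED` recommendation concrete, the following is a minimal hand-rolled sketch (not a framework configuration) that pairs Spring Retry with a nested inner transaction. It assumes an outer transaction is already active, a `transactionManager` whose underlying `DataSource` supports JDBC savepoints, and hypothetical `repository` and `item` references:

```
RetryTemplate retryTemplate = new RetryTemplate();

TransactionTemplate nestedTx = new TransactionTemplate(transactionManager);
nestedTx.setPropagationBehavior(TransactionDefinition.PROPAGATION_NESTED);

retryTemplate.execute(retryContext -> nestedTx.execute(status -> {
    // Each attempt runs against a savepoint, so a failed attempt rolls back
    // only the inner work instead of forcing the outer transaction to roll back.
    return repository.update(item); // hypothetical database access
}));
```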
-### [](#specialTransactionOrthogonal)Special Case: Transactions with Orthogonal Resources +### Special Case: Transactions with Orthogonal Resources Default propagation is always OK for simple cases where there are no nested database transactions. Consider the following example, where the `SESSION` and `TX` are not @@ -264,7 +264,7 @@ starts. There is no database access outside the `RETRY` (2) block. If `TX` (3) f then eventually succeeds on a retry, `SESSION` (0) can commit (independently of a `TX`block). This is similar to the vanilla "best-efforts-one-phase-commit" scenario. The worst that can happen is a duplicate message when the `RETRY` (2) succeeds and the`SESSION` (0) cannot commit (for example, because the message system is unavailable). -### [](#statelessRetryCannotRecover)Stateless Retry Cannot Recover +### Stateless Retry Cannot Recover The distinction between a stateless and a stateful retry in the typical example above is important. It is actually ultimately a transactional constraint that forces the diff --git a/docs/en/spring-batch/whatsnew.md b/docs/en/spring-batch/whatsnew.md index bd5b942..b0ce201 100644 --- a/docs/en/spring-batch/whatsnew.md +++ b/docs/en/spring-batch/whatsnew.md @@ -1,19 +1,19 @@ # What’s New in Spring Batch 4.3 -## [](#whatsNew)What’s New in Spring Batch 4.3 +## What’s New in Spring Batch 4.3 This release comes with a number of new features, performance improvements, dependency updates and API deprecations. This section describes the most important changes. For a complete list of changes, please refer to the[release notes](https://github.com/spring-projects/spring-batch/releases/tag/4.3.0). -### [](#newFeatures)New features +### New features -#### [](#new-synchronized-itemstreamwriter)New synchronized ItemStreamWriter +#### New synchronized ItemStreamWriter Similar to the `SynchronizedItemStreamReader`, this release introduces a`SynchronizedItemStreamWriter`. This feature is useful in multi-threaded steps where concurrent threads need to be synchronized to not override each other’s writes. -#### [](#new-jpaqueryprovider-for-named-queries)New JpaQueryProvider for named queries +#### New JpaQueryProvider for named queries This release introduces a new `JpaNamedQueryProvider` next to the`JpaNativeQueryProvider` to ease the configuration of JPA named queries when using the `JpaPagingItemReader`: @@ -26,22 +26,22 @@ JpaPagingItemReader reader = new JpaPagingItemReaderBuilder() .build(); ``` -#### [](#new-jpacursoritemreader-implementation)New JpaCursorItemReader Implementation +#### New JpaCursorItemReader Implementation JPA 2.2 added the ability to stream results as a cursor instead of only paging. This release introduces a new JPA item reader that uses this feature to stream results in a cursor-based fashion similar to the `JdbcCursorItemReader`and `HibernateCursorItemReader`. -#### [](#new-jobparametersincrementer-implementation)New JobParametersIncrementer implementation +#### New JobParametersIncrementer implementation Similar to the `RunIdIncrementer`, this release adds a new `JobParametersIncrementer`that is based on a `DataFieldMaxValueIncrementer` from Spring Framework. -#### [](#graalvm-support)GraalVM Support +#### GraalVM Support This release adds initial support to run Spring Batch applications on GraalVM. The support is still experimental and will be improved in future releases. -#### [](#java-records-support)Java records Support +#### Java records Support This release adds support to use Java records as items in chunk-oriented steps. 
The newly added `RecordFieldSetMapper` supports data mapping from flat files to @@ -69,29 +69,29 @@ public record Person(int id, String name) { } The `FlatFileItemReader` uses the new `RecordFieldSetMapper` to map data from the `persons.csv` file to records of type `Person`. -### [](#performanceImprovements)Performance improvements +### Performance improvements -#### [](#use-bulk-writes-in-repositoryitemwriter)Use bulk writes in RepositoryItemWriter +#### Use bulk writes in RepositoryItemWriter Up to version 4.2, in order to use `CrudRepository#saveAll` in `RepositoryItemWriter`, it was required to extend the writer and override `write(List)`. In this release, the `RepositoryItemWriter` has been updated to use`CrudRepository#saveAll` by default. -#### [](#use-bulk-writes-in-mongoitemwriter)Use bulk writes in MongoItemWriter +#### Use bulk writes in MongoItemWriter The `MongoItemWriter` used `MongoOperations#save()` in a for loop to save items to the database. In this release, this writer has been updated to use `org.springframework.data.mongodb.core.BulkOperations` instead. -#### [](#job-startrestart-time-improvement)Job start/restart time improvement +#### Job start/restart time improvement The implementation of `JobRepository#getStepExecutionCount()` used to load all job executions and step executions in-memory to do the count on the framework side. In this release, the implementation has been changed to do a single call to the database with a SQL count query in order to count step executions. -### [](#dependencyUpdates)Dependency updates +### Dependency updates This release updates dependent Spring projects to the following versions: @@ -107,9 +107,9 @@ This release updates dependent Spring projects to the following versions: * Micrometer 1.5 -### [](#deprecation)Deprecations +### Deprecations -#### [](#apiDeprecation)API deprecation +#### API deprecation The following is a list of APIs that have been deprecated in this release: @@ -139,7 +139,7 @@ The following is a list of APIs that have been deprecated in this release: Suggested replacements can be found in the Javadoc of each deprecated API. -#### [](#sqlfireDeprecation)SQLFire support deprecation +#### SQLFire support deprecation SQLFire has been in [EOL](https://www.vmware.com/latam/products/pivotal-sqlfire.html)since November 1st, 2014. This release deprecates the support of using SQLFire as a job repository and schedules it for removal in version 5.0. 
\ No newline at end of file diff --git a/docs/spring-batch/appendix.md b/docs/spring-batch/appendix.md index 8ed32ff..c352884 100644 --- a/docs/spring-batch/appendix.md +++ b/docs/spring-batch/appendix.md @@ -1,6 +1,6 @@ -## [](#listOfReadersAndWriters)附录 A:条目阅读器和条目编写器列表 +## 附录 A:条目阅读器和条目编写器列表 -### [](#itemReadersAppendix)条目阅读器 +### 条目阅读器 | Item Reader |说明| |----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| @@ -24,7 +24,7 @@ | StaxEventItemReader |通过 stax 进行读取,参见[`StaxEventItemReader`]。| | JsonItemReader |从 JSON 文档中读取项目。参见[`JsonItemReader`]。| -### [](#itemWritersAppendix)条目编写者 +### 条目编写者 | Item Writer |说明| |--------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| diff --git a/docs/spring-batch/common-patterns.md b/docs/spring-batch/common-patterns.md index 7e33935..d3434e0 100644 --- a/docs/spring-batch/common-patterns.md +++ b/docs/spring-batch/common-patterns.md @@ -1,6 +1,6 @@ # 常见的批处理模式 -## [](#commonPatterns)常见的批处理模式 +## 常见的批处理模式 XMLJavaBoth @@ -8,7 +8,7 @@ XMLJavaBoth 在这一章中,我们提供了几个自定义业务逻辑中常见模式的示例。这些示例主要以侦听器接口为特征。应该注意的是,如果合适的话,`ItemReader`或`ItemWriter`也可以实现侦听器接口。 -### [](#loggingItemProcessingAndFailures)记录项目处理和失败 +### 记录项目处理和失败 一个常见的用例是需要在一个步骤中对错误进行特殊处理,逐项处理,可能是登录到一个特殊的通道,或者将一条记录插入到数据库中。面向块的`Step`(从 Step Factory Bean 创建)允许用户实现这个用例,它使用一个简单的`ItemReadListener`表示`read`上的错误,使用一个`ItemWriteListener`表示`write`上的错误。以下代码片段演示了记录读写失败的侦听器: @@ -61,7 +61,7 @@ public Step simpleStep() { | |如果你的侦听器在`onError()`方法中执行任何操作,则它必须位于
将被回滚的事务中。如果需要在`onError()`方法中使用事务性
资源,例如数据库,请考虑向该方法添加声明性
事务(有关详细信息,请参见 Spring Core Reference Guide),并给其
传播属性一个值`REQUIRES_NEW`。| |---|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -### [](#stoppingAJobManuallyForBusinessReasons)由于业务原因手动停止作业 +### 由于业务原因手动停止作业 Spring Batch 通过`JobOperator`接口提供了`stop()`方法,但这实际上是供操作员而不是应用程序程序员使用的。有时,从业务逻辑中停止作业执行更方便或更有意义。 @@ -154,7 +154,7 @@ public class CustomItemWriter extends ItemListenerSupport implements StepListene 设置标志时,默认的行为是抛出`JobInterruptedException`。这种行为可以通过`StepInterruptionPolicy`来控制。然而,唯一的选择是抛出或不抛出异常,因此这始终是工作的异常结束。 -### [](#addingAFooterRecord)添加页脚记录 +### 添加页脚记录 通常,当写入平面文件时,在所有处理完成后,必须在文件的末尾附加一个“页脚”记录。这可以使用由 Spring 批提供的`FlatFileFooterCallback`接口来实现。`FlatFileFooterCallback`(及其对应的`FlatFileHeaderCallback`)是`FlatFileItemWriter`的可选属性,可以添加到项编写器中。 @@ -198,7 +198,7 @@ public interface FlatFileFooterCallback { } ``` -#### [](#writingASummaryFooter)编写摘要页脚 +#### 编写摘要页脚 涉及页脚记录的一个常见要求是在输出过程中聚合信息,并将这些信息附加到文件的末尾。这个页脚通常用作文件的摘要或提供校验和。 @@ -293,7 +293,7 @@ public void update(ExecutionContext executionContext) { 更新方法将最新版本的`totalAmount`存储到`ExecutionContext`,就在该对象持久化到数据库之前。open 方法从`ExecutionContext`中检索任何已存在的`totalAmount`,并将其用作处理的起点,从而允许`TradeItemWriter`在重新启动时在上次运行`Step`时未启动的地方进行拾取。 -### [](#drivingQueryBasedItemReaders)基于项目阅读器的驾驶查询 +### 基于项目阅读器的驾驶查询 在[关于读者和作家的章节](readersAndWriters.html)中,讨论了利用分页进行数据库输入的问题。许多数据库供应商(例如 DB2)都有非常悲观的锁定策略,如果正在读取的表也需要由在线应用程序的其他部分使用,这些策略可能会导致问题。此外,在非常大的数据集上打开游标可能会导致某些供应商的数据库出现问题。因此,许多项目更喜欢使用“驱动查询”方法来读取数据。这种方法的工作原理是对键进行迭代,而不是对需要返回的整个对象进行迭代,如下图所示: @@ -309,7 +309,7 @@ public void update(ExecutionContext executionContext) { 应该使用`ItemProcessor`将从驱动查询中获得的键转换为完整的`Foo`对象。现有的 DAO 可以用于基于该键查询完整的对象。 -### [](#multiLineRecords)多行记录 +### 多行记录 虽然平面文件的情况通常是,每个记录都被限制在单行中,但一个文件的记录可能跨越多行,并具有多种格式,这是很常见的。下面摘自一个文件,展示了这种安排的一个例子: @@ -450,7 +450,7 @@ public Trade read() throws Exception { } ``` -### [](#executingSystemCommands)执行系统命令 +### 执行系统命令 许多批处理作业要求从批处理作业中调用外部命令。这样的进程可以由调度器单独启动,但是有关运行的公共元数据的优势将会丧失。此外,一个多步骤的工作也需要被分解成多个工作。 @@ -484,7 +484,7 @@ public SystemCommandTasklet tasklet() { } ``` -### [](#handlingStepCompletionWhenNoInputIsFound)未找到输入时的处理步骤完成 +### 未找到输入时的处理步骤完成 在许多批处理场景中,在数据库或文件中找不到要处理的行并不是例外情况。将`Step`简单地视为未找到工作,并在读取 0 项的情况下完成。所有的`ItemReader`实现都是在 Spring 批处理中提供的,默认为这种方法。如果即使存在输入,也没有写出任何内容,这可能会导致一些混乱(如果文件被错误命名或出现类似问题,通常会发生这种情况)。因此,应该检查元数据本身,以确定框架需要处理多少工作。然而,如果发现没有输入被认为是例外情况怎么办?在这种情况下,最好的解决方案是通过编程方式检查元数据,以确保未处理任何项目并导致失败。因为这是一个常见的用例, Spring Batch 提供了一个具有这种功能的侦听器,如`NoWorkFoundStepExecutionListener`的类定义所示: @@ -503,7 +503,7 @@ public class NoWorkFoundStepExecutionListener extends StepExecutionListenerSuppo 前面的`StepExecutionListener`在“afterstep”阶段检查`StepExecution`的`readCount`属性,以确定是否没有读取任何项。如果是这种情况,将返回一个退出代码`FAILED`,表示`Step`应该失败。否则,将返回`null`,这不会影响`Step`的状态。 -### [](#passingDataToFutureSteps)将数据传递给未来的步骤 +### 将数据传递给未来的步骤 将信息从一个步骤传递到另一个步骤通常是有用的。这可以通过`ExecutionContext`来完成。问题是有两个`ExecutionContexts`:一个在`Step`水平,一个在`Job`水平。`Step``ExecutionContext`只保留到步骤的长度,而`Job``ExecutionContext`则保留到整个`Job`。另一方面,`Step``ExecutionContext`每次`Step`提交一个块时都会更新`Job``ExecutionContext`,而`Step`只在每个`Step`的末尾更新。 diff --git a/docs/spring-batch/domain.md b/docs/spring-batch/domain.md index 78f4e91..2096c83 100644 --- a/docs/spring-batch/domain.md +++ b/docs/spring-batch/domain.md @@ -1,6 +1,6 @@ # 批处理的领域语言 -## [](#domainLanguageOfBatch)批处理的域语言 +## 批处理的域语言 
XMLJavaBoth @@ -22,7 +22,7 @@ XMLJavaBoth 前面的图表突出了构成 Spring 批处理的域语言的关键概念。一个作业有一个到多个步骤,每个步骤正好有一个`ItemReader`,一个`ItemProcessor`和一个`ItemWriter`。需要启动一个作业(使用`JobLauncher`),并且需要存储有关当前运行的进程的元数据(在`JobRepository`中)。 -### [](#job)工作 +### 工作 这一部分描述了与批处理作业的概念有关的刻板印象。`Job`是封装整个批处理过程的实体。与其他 Spring 项目一样,`Job`与 XML 配置文件或基于 Java 的配置连接在一起。这种配置可以称为“作业配置”。然而,`Job`只是整个层次结构的顶部,如下图所示: @@ -61,13 +61,13 @@ public Job footballJob() { ``` -#### [](#jobinstance)JobInstance +#### JobInstance a`JobInstance`指的是逻辑作业运行的概念。考虑应该在一天结束时运行一次的批处理作业,例如前面图表中的“endofday”`Job`。有一个“endofday”作业,但是`Job`的每个单独运行都必须单独跟踪。在这种情况下,每天有一个逻辑`JobInstance`。例如,有一个 1 月 1 日运行,1 月 2 日运行,以此类推。如果 1 月 1 日运行第一次失败,并在第二天再次运行,它仍然是 1 月 1 日运行。(通常,这也对应于它正在处理的数据,这意味着 1 月 1 日运行处理 1 月 1 日的数据)。因此,每个`JobInstance`都可以有多个执行(`JobExecution`在本章后面更详细地讨论),并且只有一个`JobInstance`对应于特定的`Job`和标识`JobParameters`的执行可以在给定的时间运行。 `JobInstance`的定义与要加载的数据完全无关。这完全取决于`ItemReader`实现来确定如何加载数据。例如,在 Endofday 场景中,数据上可能有一个列,该列指示数据所属的“生效日期”或“计划日期”。因此,1 月 1 日的运行将只加载 1 日的数据,而 1 月 2 日的运行将只使用 2 日的数据。因为这个决定很可能是一个商业决定,所以它是由`ItemReader`来决定的。然而,使用相同的`JobInstance`确定是否使用来自先前执行的’状态’(即`ExecutionContext`,这将在本章后面讨论)。使用一个新的`JobInstance`表示“从开始”,而使用一个现有的实例通常表示“从你停止的地方开始”。 -#### [](#jobparameters)JobParameters +#### JobParameters 在讨论了`JobInstance`以及它与约伯有何不同之后,我们自然要问的问题是:“一个`JobInstance`如何与另一个区分开来?”答案是:`JobParameters`。`JobParameters`对象持有一组用于启动批处理作业的参数。它们可以用于标识,甚至在运行过程中作为参考数据,如下图所示: @@ -80,7 +80,7 @@ a`JobInstance`指的是逻辑作业运行的概念。考虑应该在一天结束 | |并非所有作业参数都需要有助于识别`JobInstance`。在默认情况下,他们会这么做。但是,该框架还允许使用不影响`JobInstance`的恒等式的参数提交
的`Job`。| |---|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#jobexecution)jobexecution +#### jobexecution a`JobExecution`指的是一次尝试运行作业的技术概念。一次执行可能以失败或成功结束,但除非执行成功完成,否则对应于给定执行的`JobInstance`不被认为是完成的。以前面描述的 Endofday`Job`为例,考虑第一次运行时失败的 01-01-2017 的`JobInstance`。如果以与第一次运行(01-01-2017)相同的标识作业参数再次运行,则会创建一个新的`JobExecution`。然而,仍然只有一个`JobInstance`。 @@ -136,7 +136,7 @@ a`JobExecution`指的是一次尝试运行作业的技术概念。一次执行 | |列名可能已被缩写或删除,以求清楚和
格式。| |---|---------------------------------------------------------------------------------------------| -### [](#step)步骤 +### 步骤 `Step`是一个域对象,它封装了批处理作业的一个独立的、连续的阶段。因此,每一项工作都完全由一个或多个步骤组成。a`Step`包含定义和控制实际批处理所需的所有信息。这必然是一个模糊的描述,因为任何给定的`Step`的内容都是由编写`Job`的开发人员自行决定的。a`Step`可以是简单的,也可以是复杂的,正如开发人员所希望的那样。简单的`Step`可能会将文件中的数据加载到数据库中,只需要很少或不需要代码(取决于使用的实现)。更复杂的`Step`可能具有复杂的业务规则,这些规则作为处理的一部分被应用。与`Job`一样,`Step`具有与唯一的`StepExecution`相关的个体`StepExecution`,如下图所示: @@ -144,7 +144,7 @@ a`JobExecution`指的是一次尝试运行作业的技术概念。一次执行 图 4。带有步骤的工作层次结构 -#### [](#stepexecution)分步执行 +#### 分步执行 a`StepExecution`表示试图执行`Step`的一次尝试。每次运行`Step`都会创建一个新的`StepExecution`,类似于`JobExecution`。但是,如果一个步骤由于它失败之前的步骤而无法执行,则不会对它执行持久化。只有当它的`Step`实际启动时,才会创建`StepExecution`。 @@ -166,7 +166,7 @@ a`StepExecution`表示试图执行`Step`的一次尝试。每次运行`Step`都 | filterCount |已被`ItemProcessor`“过滤”的项数。| | writeSkipCount |失败的次数`write`,导致项目被跳过。| -### [](#executioncontext)ExecutionContext +### ExecutionContext `ExecutionContext`表示一组键/值对的集合,这些键/值对由框架持久化并控制,以便允许开发人员有一个存储持久状态的位置,该状态的作用域为`StepExecution`对象或`JobExecution`对象。对于那些熟悉 Quartz 的人来说,它与 JobDataMap 非常相似。最好的使用示例是方便重新启动。以平面文件输入为例,在处理单个行时,该框架会在提交点周期性地保存`ExecutionContext`。这样做允许`ItemReader`存储其状态,以防在运行过程中发生致命错误,甚至断电。所需要的只是将当前读取的行数放入上下文中,如下面的示例所示,框架将完成其余的工作: @@ -225,7 +225,7 @@ ExecutionContext ecJob = jobExecution.getExecutionContext(); 如注释中所指出的,`ecStep`不等于`ecJob`。它们是两个不同的`ExecutionContexts`。作用域为`Step`的一个被保存在`Step`中的每个提交点,而作用域为该作业的一个被保存在每个`Step`执行之间。 -### [](#jobrepository)JobRepository +### JobRepository `JobRepository`是上述所有刻板印象的持久性机制。它为`JobLauncher`、`Job`和`Step`实现提供增删改查操作。当`Job`首次启动时,将从存储库获得`JobExecution`,并且在执行过程中,通过将`StepExecution`和`JobExecution`实现传递到存储库来持久化它们。 @@ -237,7 +237,7 @@ Spring 批处理 XML 命名空间提供了对配置带有``标 当使用 Java 配置时,`@EnableBatchProcessing`注释提供了`JobRepository`作为自动配置的组件之一。 -### [](#joblauncher)joblauncher +### joblauncher `JobLauncher`表示用于启动`Job`具有给定的`JobParameters`集的`Job`的简单接口,如以下示例所示: @@ -252,19 +252,19 @@ public JobExecution run(Job job, JobParameters jobParameters) 期望实现从`JobRepository`获得有效的`JobExecution`并执行`Job`。 -### [](#item-reader)条目阅读器 +### 条目阅读器 `ItemReader`是一种抽象,表示对`Step`输入的检索,每次检索一项。当`ItemReader`已经耗尽了它可以提供的项时,它通过返回`null`来表示这一点。有关`ItemReader`接口及其各种实现方式的更多详细信息,请参见[读者和作家](readersAndWriters.html#readersAndWriters)。 -### [](#item-writer)item writer +### item writer `ItemWriter`是一种抽象,它表示`Step`的输出,一次输出一个批处理或一大块项目。通常,`ItemWriter`不知道下一步应该接收的输入,只知道当前调用中传递的项。有关`ItemWriter`接口及其各种实现方式的更多详细信息,请参见[读者和作家](readersAndWriters.html#readersAndWriters)。 -### [](#item-processor)项处理器 +### 项处理器 `ItemProcessor`是表示项目的业务处理的抽象。当`ItemReader`读取一个项,而`ItemWriter`写入它们时,`ItemProcessor`提供了一个接入点来转换或应用其他业务处理。如果在处理该项时确定该项无效,则返回`null`表示不应写出该项。有关`ItemProcessor`接口的更多详细信息,请参见[读者和作家](readersAndWriters.html#readersAndWriters)。 -### [](#batch-namespace)批处理名称空间 +### 批处理名称空间 前面列出的许多域概念需要在 Spring `ApplicationContext`中进行配置。虽然有上述接口的实现方式可以在标准 Bean 定义中使用,但提供了一个名称空间以便于配置,如以下示例所示: diff --git a/docs/spring-batch/glossary.md b/docs/spring-batch/glossary.md index fa5431f..1fa34f7 100644 --- a/docs/spring-batch/glossary.md +++ b/docs/spring-batch/glossary.md @@ -1,8 +1,8 @@ # 词汇表 -## [](#glossary)附录 A:术语表 +## 附录 A:术语表 -### [](#spring-batch-glossary) Spring 批处理术语表 +### Spring 批处理术语表 批处理 diff --git a/docs/spring-batch/job.md b/docs/spring-batch/job.md index 23b0424..09fa70f 100644 --- a/docs/spring-batch/job.md +++ b/docs/spring-batch/job.md @@ -1,6 +1,6 @@ # 配置和运行作业 -## [](#configureJob)配置和运行作业 +## 配置和运行作业 XMLJavaBoth @@ -12,7 +12,7 @@ XMLJavaBoth 虽然`Job`对象看起来像是一个用于步骤的简单容器,但开发人员必须了解许多配置选项。此外,对于如何运行`Job`以及在运行期间如何存储其元数据,有许多考虑因素。本章将解释`Job`的各种配置选项和运行时关注点。 -### 
[](#configuringAJob)配置作业 +### 配置作业 [`Job`](#configurejob)接口有多个实现方式。然而,构建者会抽象出配置上的差异。 @@ -53,7 +53,7 @@ public Job footballJob() { 除了步骤之外,作业配置还可以包含有助于并行(``)、声明性流控制(``)和流定义外部化(``)的其他元素。 -#### [](#restartability)可重启性 +#### 可重启性 执行批处理作业时的一个关键问题与`Job`重新启动时的行为有关。如果对于特定的`Job`已经存在`JobExecution`,则将`Job`的启动视为“重新启动”。理想情况下,所有的工作都应该能够在它们停止的地方启动,但是在某些情况下这是不可能的。* 完全由开发人员来确保在此场景中创建一个新的`JobInstance`。* 但是, Spring 批处理确实提供了一些帮助。如果`Job`永远不应该重新启动,而应该始终作为新的`JobInstance`的一部分运行,那么可重启属性可以设置为“false”。 @@ -103,7 +103,7 @@ catch (JobRestartException e) { 这段 JUnit 代码展示了如何在第一次为不可重启作业创建`JobExecution`时尝试创建`JobExecution`不会导致任何问题。但是,第二次尝试将抛出`JobRestartException`。 -#### [](#interceptingJobExecution)拦截作业执行 +#### 拦截作业执行 在作业的执行过程中,通知其生命周期中的各种事件可能是有用的,以便可以执行自定义代码。通过在适当的时间调用`JobListener`,`SimpleJob`允许这样做: @@ -167,7 +167,7 @@ public void afterJob(JobExecution jobExecution){ * `@AfterJob` -#### [](#inheritingFromAParentJob)继承父作业 +#### 继承父作业 如果一组作业共享相似但不相同的配置,那么定义一个“父”`Job`可能会有所帮助,具体的作业可以从该“父”中继承属性。与 Java 中的类继承类似,“child”`Job`将把它的元素和属性与父元素和属性结合在一起。 @@ -191,7 +191,7 @@ public void afterJob(JobExecution jobExecution){ 有关更多详细信息,请参见[从父步骤继承](step.html#inheritingFromParentStep)一节。 -#### [](#jobparametersvalidator)JobParametersValidator +#### JobParametersValidator 在 XML 命名空间中声明的作业或使用`AbstractJob`的任意子类可以在运行时为作业参数声明验证器。例如,当你需要断言一个作业是以其所有的强制参数启动时,这是有用的。有一个`DefaultJobParametersValidator`可以用来约束简单的强制参数和可选参数的组合,对于更复杂的约束,你可以自己实现接口。 @@ -218,7 +218,7 @@ public Job job1() { } ``` -### [](#javaConfig)Java 配置 +### Java 配置 Spring 3 带来了通过 Java 而不是 XML 配置应用程序的能力。从 Spring Batch2.2.0 开始,可以使用相同的 Java 配置配置来配置批处理作业。基于 Java 的配置有两个组件:`@EnableBatchProcessing`注释和两个构建器。 @@ -293,7 +293,7 @@ public class AppConfig { } ``` -### [](#configuringJobRepository)配置 JobRepository +### 配置 JobRepository 当使用`@EnableBatchProcessing`时,将为你提供一个`JobRepository`。本节讨论如何配置自己的配置。 @@ -336,7 +336,7 @@ protected JobRepository createJobRepository() throws Exception { 除了数据源和 TransactionManager 之外,上面列出的配置选项都不是必需的。如果没有设置,将使用上面显示的默认值。以上所示是为了提高认识。最大 VARCHAR 长度默认为 2500,这是`VARCHAR`中的长[示例模式脚本](schema-appendix.html#metaDataSchemaOverview)列的长度 -#### [](#txConfigForJobRepository)JobRepository 的事务配置 +#### JobRepository 的事务配置 如果使用了名称空间或提供的`FactoryBean`,则会在存储库周围自动创建事务建议。这是为了确保批处理元数据(包括在发生故障后重新启动所必需的状态)被正确地持久化。如果存储库方法不是事务性的,那么框架的行为就没有得到很好的定义。`create*`方法属性中的隔离级别是单独指定的,以确保在启动作业时,如果两个进程试图同时启动相同的作业,则只有一个进程成功。该方法的默认隔离级别是`SERIALIZABLE`,这是非常激进的。`READ_COMMITTED`同样有效。如果两个过程不太可能以这种方式碰撞,`READ_UNCOMMITTED`就可以了。然而,由于对`create*`方法的调用相当短,所以只要数据库平台支持它,`SERIALIZED`不太可能导致问题。然而,这一点可以被重写。 @@ -404,7 +404,7 @@ public TransactionProxyFactoryBean baseProxy() { } ``` -#### [](#repositoryTablePrefix)更改表格前缀 +#### 更改表格前缀 `JobRepository`的另一个可修改的属性是元数据表的表前缀。默认情况下,它们都以`BATCH_`开头。`BATCH_JOB_EXECUTION`和`BATCH_STEP_EXECUTION`是两个例子。然而,有潜在的理由修改这个前缀。如果需要将模式名称前置到表名,或者如果同一模式中需要多个元数据表集合,则需要更改表前缀: @@ -438,7 +438,7 @@ protected JobRepository createJobRepository() throws Exception { | |只有表前缀是可配置的。表和列名不是。| |---|--------------------------------------------------------------------------| -#### [](#inMemoryRepository)内存存储库 +#### 内存存储库 在某些情况下,你可能不希望将域对象持久化到数据库。原因之一可能是速度;在每个提交点存储域对象需要额外的时间。另一个原因可能是,你不需要为一份特定的工作坚持现状。出于这个原因, Spring 批处理提供了作业存储库的内存`Map`版本。 @@ -474,7 +474,7 @@ protected JobRepository createJobRepository() throws Exception { | |在 V4 中,`MapJobRepositoryFactoryBean`和相关的类已被弃用,并计划在 V5 中删除
。如果希望使用内存中的作业存储库,可以使用嵌入式数据库
,比如 H2、Apache Derby 或 HSQLDB。有几种方法可以创建嵌入式数据库并在
你的 Spring 批处理应用程序中使用它。一种方法是使用[Spring JDBC](https://docs.spring.io/spring-framework/docs/current/reference/html/data-access.html#jdbc-embedded-database-support)中的 API:

```
@Bean
public DataSource dataSource() {
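    // 使用 Spring Batch 自带的 H2 脚本(先删表、再建表)来初始化作业存储库的元数据表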
    return new EmbeddedDatabaseBuilder()
            .setType(EmbeddedDatabaseType.H2)
            .addScript("/org/springframework/batch/core/schema-drop-h2.sql")
            .addScript("/org/springframework/batch/core/schema-h2.sql")
            .build();
}
```
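以上示例使用 H2;若改用 Apache Derby 或 HSQLDB,换成 Spring Batch 同样提供的相应 `schema-*.sql` 脚本即可。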

一旦你在应用程序上下文中将嵌入式数据源定义为 Bean,如果你使用`@EnableBatchProcessing`,它就会被自动检测并使用
。否则,你可以使用
基于`JobRepositoryFactoryBean`的 JDBC 手动配置它,如[配置 JobRepository 部分](#configuringJobRepository)所示。| |---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#nonStandardDatabaseTypesInRepository)存储库中的非标准数据库类型 +#### 存储库中的非标准数据库类型 如果你使用的数据库平台不在受支持的平台列表中,那么如果 SQL 变量足够接近,则可以使用受支持的类型之一。要做到这一点,你可以使用 RAW`JobRepositoryFactoryBean`而不是名称空间快捷方式,并使用它将数据库类型设置为最接近的匹配。 @@ -509,7 +509,7 @@ protected JobRepository createJobRepository() throws Exception { 如果连这都不起作用,或者你没有使用 RDBMS,那么唯一的选择可能是实现`Dao`所依赖的各种`SimpleJobRepository`接口,并以正常的方式手动连接。 -### [](#configuringJobLauncher)配置一个 joblauncher +### 配置一个 joblauncher 当使用`@EnableBatchProcessing`时,将为你提供一个`JobRegistry`。本节讨论如何配置自己的配置。 @@ -588,15 +588,15 @@ public JobLauncher jobLauncher() { Spring `TaskExecutor`接口的任何实现都可以用来控制如何异步执行作业。 -### [](#runningAJob)运行作业 +### 运行作业 至少,启动批处理作业需要两个条件:启动`Job`和`JobLauncher`。两者都可以包含在相同的上下文中,也可以包含在不同的上下文中。例如,如果从命令行启动一个作业,将为每个作业实例化一个新的 JVM,因此每个作业都有自己的`JobLauncher`。但是,如果在`HttpRequest`范围内的 Web 容器中运行,通常会有一个`JobLauncher`,该配置用于异步作业启动,多个请求将调用以启动其作业。 -#### [](#runningJobsFromCommandLine)从命令行运行作业 +#### 从命令行运行作业 对于希望从 Enterprise 调度器运行作业的用户,命令行是主要的接口。这是因为大多数调度程序(Quartz 除外,除非使用 nativeJob)直接与操作系统进程一起工作,主要是通过 shell 脚本开始的。除了 shell 脚本之外,还有许多启动 Java 进程的方法,例如 Perl、Ruby,甚至是 Ant 或 Maven 之类的“构建工具”。但是,由于大多数人都熟悉 shell 脚本,因此本例将重点讨论它们。 -##### [](#commandLineJobRunner)The CommandlineJobrunner +##### The CommandlineJobrunner 因为启动作业的脚本必须启动一个 Java 虚拟机,所以需要有一个具有 main 方法的类来充当主要入口点。 Spring 批处理提供了一种实现,它仅服务于此目的:`CommandLineJobRunner`。需要注意的是,这只是引导应用程序的一种方法,但是启动 Java 进程的方法有很多,并且这个类绝不应该被视为确定的。`CommandLineJobRunner`执行四项任务: @@ -678,7 +678,7 @@ public class EndOfDayJobConfiguration { 前面的示例过于简单,因为在 Spring 批处理中运行一个批处理作业通常有更多的需求,但是它用于显示`CommandLineJobRunner`的两个主要需求:`Job`和`JobLauncher`。 -##### [](#exitCodes)exitcodes +##### exitcodes 当从命令行启动批处理作业时,通常使用 Enterprise 调度器。大多数调度器都相当笨拙,只能在流程级别工作。这意味着他们只知道一些操作系统进程,比如他们正在调用的 shell 脚本。在这种情况下,将工作的成功或失败反馈给调度程序的唯一方法是通过返回代码。返回代码是进程返回给调度程序的一个数字,它指示运行的结果。在最简单的情况下:0 是成功,1 是失败。然而,可能有更复杂的情况:如果作业 A 返回 4,则启动作业 B,如果它返回 5,则启动作业 C。这种类型的行为是在计划程序级别上配置的,但是重要的是, Spring 批处理框架提供了一种方法来返回用于特定批处理作业的“退出代码”的数字表示。在 Spring 批处理中,这被封装在`ExitStatus`中,这在第 5 章中有更详细的介绍。为了讨论退出代码,唯一需要知道的是`ExitStatus`具有一个退出代码属性,该属性由框架(或开发人员)设置,并作为从`JobLauncher`返回的`JobExecution`的一部分返回。`CommandLineJobRunner`使用`ExitCodeMapper`接口将这个字符串值转换为一个数字: @@ -692,7 +692,7 @@ public interface ExitCodeMapper { `ExitCodeMapper`的基本契约是,给定一个字符串退出代码,将返回一个数字表示。Job 
Runner 使用的默认实现是`SimpleJvmExitCodeMapper`,它返回 0 表示完成,1 表示泛型错误,2 表示任何 Job Runner 错误,例如无法在提供的上下文中找到`Job`。如果需要比上述 3 个值更复杂的值,则必须提供`ExitCodeMapper`接口的自定义实现。因为`CommandLineJobRunner`是创建`ApplicationContext`的类,因此不能“连线在一起”,所以需要重写的任何值都必须是自动连线的。这意味着,如果在`BeanFactory`中找到了`ExitCodeMapper`的实现,则将在创建上下文后将其注入到运行器中。要提供你自己的`ExitCodeMapper`,需要做的就是将实现声明为根级别 Bean,并确保它是由运行器加载的`ApplicationContext`的一部分。 -#### [](#runningJobsFromWebContainer)在 Web 容器中运行作业 +#### 在 Web 容器中运行作业 从历史上看,离线处理(如批处理作业)是从命令行启动的,如上文所述。然而,在许多情况下,从`HttpRequest`发射是更好的选择。许多这样的用例包括报告、临时作业运行和 Web 应用程序支持。因为按定义,批处理作业是长时间运行的,所以最重要的问题是确保异步启动该作业: @@ -700,7 +700,7 @@ public interface ExitCodeMapper { 图 4。来自 Web 容器的异步作业启动器序列 -在这种情况下,控制器是 Spring MVC 控制器。关于 Spring MVC 的更多信息可以在这里找到:[](https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc)[https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc](https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc)。控制器使用已配置为启动[异步](#runningJobsFromWebContainer)的`Job`启动`Job`,该控制器立即返回`JobExecution`。`Job`可能仍在运行,但是,这种非阻塞行为允许控制器立即返回,这是处理`HttpRequest`时所需的。以下是一个例子: +在这种情况下,控制器是 Spring MVC 控制器。关于 Spring MVC 的更多信息可以在这里找到:的`Job`启动`Job`,该控制器立即返回`JobExecution`。`Job`可能仍在运行,但是,这种非阻塞行为允许控制器立即返回,这是处理`HttpRequest`时所需的。以下是一个例子: ``` @Controller @@ -719,7 +719,7 @@ public class JobLauncherController { } ``` -### [](#advancedMetaData)高级元数据使用 +### 高级元数据使用 到目前为止,`JobLauncher`和`JobRepository`接口都已经讨论过了。它们一起表示作业的简单启动,以及批处理域对象的基本操作: @@ -735,7 +735,7 @@ a`JobLauncher`使用`JobRepository`来创建新的`JobExecution`对象并运行 下面将讨论`JobExplorer`和`JobOperator`接口,它们添加了用于查询和控制元数据的附加功能。 -#### [](#queryingRepository)查询存储库 +#### 查询存储库 在任何高级特性之前,最基本的需求是查询存储库中现有执行的能力。此功能由`JobExplorer`接口提供: @@ -811,7 +811,7 @@ public JobExplorer getJobExplorer() throws Exception { ... 
``` -#### [](#jobregistry)JobRegistry +#### JobRegistry a`JobRegistry`(及其父接口`JobLocator`)不是强制性的,但如果你想跟踪上下文中哪些作业可用,它可能会很有用。当工作在其他地方创建时(例如,在子上下文中),它对于在应用程序上下文中集中收集工作也很有用。还可以使用自定义`JobRegistry`实现来操作已注册作业的名称和其他属性。该框架只提供了一个实现,它基于从作业名称到作业实例的简单映射。 @@ -839,7 +839,7 @@ public JobRegistry jobRegistry() throws Exception { 有两种方法可以自动填充`JobRegistry`:使用 Bean 后处理器和使用注册商生命周期组件。这两种机制在下面的部分中进行了描述。 -##### [](#jobregistrybeanpostprocessor)jobregistrybeanpostprocessor +##### jobregistrybeanpostprocessor 这是一个 Bean 后处理器,它可以在创建所有作业时注册它们。 @@ -868,7 +868,7 @@ public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor() { 虽然这不是严格必要的,但是在示例中的后处理器已经被赋予了一个 ID,以便它可以被包括在子上下文中(例如作为父 Bean 定义),并导致在那里创建的所有作业也被自动注册。 -##### [](#automaticjobregistrar)`AutomaticJobRegistrar` +##### `AutomaticJobRegistrar` 这是一个生命周期组件,它创建子上下文,并在创建这些上下文时从这些上下文注册作业。这样做的一个好处是,虽然子上下文中的作业名称在注册表中仍然必须是全局唯一的,但它们的依赖项可能具有“自然”名称。因此,例如,你可以创建一组 XML 配置文件,每个配置文件只具有一个作业,但所有配置文件都具有具有具有相同 Bean 名称的`ItemReader`的不同定义,例如“reader”。如果将所有这些文件导入到相同的上下文中,则读写器定义将发生冲突并相互覆盖,但是使用自动注册器可以避免这种情况。这使得集成来自应用程序的独立模块的作业变得更加容易。 @@ -914,7 +914,7 @@ public AutomaticJobRegistrar registrar() { 如果需要,`AutomaticJobRegistrar`可以与`JobRegistryBeanPostProcessor`一起使用(只要`DefaultJobLoader`也可以使用)。例如,如果在主父上下文和子位置中定义了作业,那么这可能是可取的。 -#### [](#JobOperator)joboperator +#### joboperator 如前所述,`JobRepository`提供对元数据的增删改查操作,而`JobExplorer`提供对元数据的只读操作。然而,当这些操作一起用来执行常见的监视任务时,它们是最有用的,例如停止、重新启动或汇总作业,就像批处理操作符通常做的那样。 Spring 批处理通过`JobOperator`接口提供这些类型的操作: @@ -997,7 +997,7 @@ public interface JobOperator { | |如果你在作业存储库上设置了表前缀,请不要忘记在作业资源管理器上也设置它。| |---|------------------------------------------------------------------------------------------------------| -#### [](#JobParametersIncrementer)JobParametersIncrementer +#### JobParametersIncrementer 关于`JobOperator`的大多数方法都是不言自明的,更详细的解释可以在[接口的 Javadoc](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/launch/JobOperator.html)上找到。然而,`startNextInstance`方法是值得注意的。这个方法总是会启动一个作业的新实例。如果`JobExecution`中存在严重问题,并且需要从一开始就重新开始工作,那么这将非常有用。与`JobLauncher`不同,`JobLauncher`需要一个新的`JobParameters`对象,如果参数与以前的任何一组参数不同,则该对象将触发一个新的`JobInstance`,`startNextInstance`方法将使用绑定到`JobParametersIncrementer`的`Job`来强制将`Job`转换为一个新实例: @@ -1046,7 +1046,7 @@ public Job footballJob() { } ``` -#### [](#stoppingAJob)停止工作 +#### 停止工作 `JobOperator`最常见的用例之一是优雅地停止一项工作: @@ -1057,7 +1057,7 @@ jobOperator.stop(executions.iterator().next()); 关闭不是立即的,因为无法强制立即关闭,特别是如果当前执行的是框架无法控制的开发人员代码,例如业务服务。但是,一旦将控件返回到框架中,就会将当前`StepExecution`的状态设置为`BatchStatus.STOPPED`,保存它,然后在完成之前对`JobExecution`执行相同的操作。 -#### [](#aborting-a-job)终止作业 +#### 终止作业 可以重新启动`FAILED`的作业执行(如果`Job`是可重启的)。状态为`ABANDONED`的作业执行将不会被框架重新启动。在步骤执行中,`ABANDONED`状态也被用于在重新启动的作业执行中将其标记为可跳过的:如果作业正在执行,并且遇到了在上一个失败的作业执行中标记`ABANDONED`的步骤,它将进入下一个步骤(由作业流定义和步骤执行退出状态决定)。 diff --git a/docs/spring-batch/jsr-352.md b/docs/spring-batch/jsr-352.md index 2a4cb5f..7c1c6a0 100644 --- a/docs/spring-batch/jsr-352.md +++ b/docs/spring-batch/jsr-352.md @@ -1,25 +1,25 @@ # JSR-352 支援 -## [](#jsr-352)JSR-352 支持 +## JSR-352 支持 XMLJavaBoth -截至 Spring,对 JSR-352 的批处理 3.0 支持已经完全实现。本节不是规范本身的替代,而是打算解释 JSR-352 特定概念如何应用于 Spring 批处理。有关 JSR-352 的其他信息可以通过 JCP 在这里找到:[](https://jcp.org/en/jsr/detail?id=352)[https://jcp.org/en/jsr/detail?id=352](https://jcp.org/en/jsr/detail?id=352) +截至 Spring,对 JSR-352 的批处理 3.0 支持已经完全实现。本节不是规范本身的替代,而是打算解释 JSR-352 特定概念如何应用于 Spring 批处理。有关 JSR-352 的其他信息可以通过 JCP 在这里找到: -### [](#jsrGeneralNotes)关于 Spring 批和 JSR-352 的一般说明 +### 关于 Spring 批和 JSR-352 的一般说明 Spring Batch 和 JSR-352 在结构上是相同的。他们俩的工作都是由台阶组成的。它们都有读取器、处理器、编写器和监听器。然而,他们之间的互动却有微妙的不同。例如, Spring 
批处理中的`org.springframework.batch.core.SkipListener#onSkipInWrite(S item, Throwable t)`接收两个参数:被跳过的项和导致跳过的异常。相同方法的 JSR-352 版本(`javax.batch.api.chunk.listener.SkipWriteListener#onSkipWriteItem(List items, Exception ex)`)也接收两个参数。但是,第一个是当前块中所有项的`List`,第二个是导致跳过的`Exception`。由于这些差异,重要的是要注意,在 Spring 批处理中执行作业有两种路径:传统的 Spring 批处理作业或基于 JSR-352 的作业。虽然 Spring 批处理工件(读取器、编写器等)的使用将在使用 JSR-352 的 JSL 配置并使用`JsrJobOperator`执行的作业中进行,但它们的行为将遵循 JSR-352 的规则。还需要注意的是,针对 JSR-352 接口开发的批处理工件将不能在传统的批处理作业中工作。 -### [](#jsrSetup)设置 +### 设置 -#### [](#jsrSetupContexts)应用程序上下文 +#### 应用程序上下文 Spring 批处理中的所有基于 JSR-352 的作业都由两个应用程序上下文组成。父上下文,它包含与 Spring 批处理的基础结构相关的 bean,例如`JobRepository`、`PlatformTransactionManager`等,以及包含要运行的作业的配置的子上下文。父上下文是通过框架提供的`jsrBaseContext.xml`定义的。可以通过设置`JSR-352-BASE-CONTEXT`系统属性来重写此上下文。 | |对于属性注入之类的事情,JSR-352 处理器不会处理基本上下文,因此
不需要在此配置额外处理的组件。| |---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#jsrSetupLaunching)启动基于 JSR-352 的作业 +#### 启动基于 JSR-352 的作业 JSR-352 需要一个非常简单的路径来执行批处理作业。以下代码是执行第一批作业所需的全部内容: @@ -46,7 +46,7 @@ jobOperator.start("myJob", new Properties()); | |对于执行基于 JSR-352 的作业,上面的 bean 都不是可选的。所有这些都可以被重写到
,根据需要提供定制的功能。| |---|-----------------------------------------------------------------------------------------------------------------------------------------------| -### [](#dependencyInjection)依赖注入 +### 依赖注入 JSR-352 在很大程度上基于 Spring 批编程模型。因此,虽然没有显式地要求正式的依赖注入实现,但是隐含了某种类型的 DI。 Spring 批处理支持用于加载 JSR-352 定义的批处理工件的所有三种方法: @@ -122,9 +122,9 @@ Spring 上下文(导入等)的组装与 JSR-352 作业一起工作,就像 ``` -### [](#jsrJobProperties)批处理属性 +### 批处理属性 -#### [](#jsrPropertySupport)属性支持 +#### 属性支持 JSR-352 允许通过在 JSL 中的配置在作业、步骤和批处理工件级别定义属性。在每个级别上,按以下方式配置批处理属性: @@ -137,7 +137,7 @@ JSR-352 允许通过在 JSL 中的配置在作业、步骤和批处理工件级 `Properties`可以在任何批处理工件上进行配置。 -#### [](#jsrBatchPropertyAnnotation)@batchproperty 注释 +#### @batchproperty 注释 `Properties`在批处理工件中通过使用`@BatchProperty`和`@Inject`注释(这两个注释都是规范所要求的)注释类字段来引用。根据 JSR-352 的定义,属性的字段必须是字符串类型的。任何类型转换都要由实现开发人员来执行。 @@ -155,7 +155,7 @@ public class MyItemReader extends AbstractItemReader { 字段“PropertyName1”的值将是“PropertyValue1” -#### [](#jsrPropertySubstitution)属性替换 +#### 属性替换 属性替换是通过运算符和简单条件表达式来提供的。一般用法是`#{operator['key']}`。 @@ -175,7 +175,7 @@ public class MyItemReader extends AbstractItemReader { 赋值的左边是期望值,右边是默认值。在前面的示例中,结果将解析为系统属性文件的值。分隔符 #{jobparamets[’unsolving.prop’]}被假定为不可解析。如果两个表达式都不能解析,将返回一个空字符串。可以使用多个条件,这些条件由“;”分隔。 -### [](#jsrProcessingModels)处理模型 +### 处理模型 JSR-352 提供了与 Spring 批处理相同的两个基本处理模型: @@ -183,7 +183,7 @@ JSR-352 提供了与 Spring 批处理相同的两个基本处理模型: * 基于任务的处理-使用`javax.batch.api.Batchlet`实现。这种处理模型与当前可用的基于`org.springframework.batch.core.step.tasklet.Tasklet`的处理相同。 -#### [](#item-based-processing)基于项目的处理 +#### 基于项目的处理 在此上下文中,基于项的处理是由`ItemReader`读取的项数设置的块大小。要以这种方式配置步骤,请指定`item-count`(默认值为 10),并可选择将`checkpoint-policy`配置为项(这是默认值)。 @@ -201,7 +201,7 @@ JSR-352 提供了与 Spring 批处理相同的两个基本处理模型: 如果选择了基于项的检查点,则支持一个附加属性`time-limit`。这为必须处理指定的项数设置了一个时间限制。如果达到了超时,那么不管`item-count`配置为什么,该块都将完成,到那时已经读取了多少项。 -#### [](#custom-checkpointing)自定义检查点 +#### 自定义检查点 JSR-352 在步骤“检查点”中调用围绕提交间隔的进程。基于项目的检查点是上面提到的一种方法。然而,在许多情况下,这还不够强大。因此,规范允许通过实现`javax.batch.api.chunk.CheckpointAlgorithm`接口来实现自定义检查点算法。该功能在功能上与 Spring Batch 的自定义完成策略相同。要使用`CheckpointAlgorithm`的实现,请使用自定义`checkpoint-policy`配置你的步骤,如下所示,其中`fooCheckpointer`是指`CheckpointAlgorithm`的实现。 @@ -218,7 +218,7 @@ JSR-352 在步骤“检查点”中调用围绕提交间隔的进程。基于项 ... 
``` -### [](#jsrRunningAJob)运行作业 +### 运行作业 执行基于 JSR-352 的作业的入口是通过`javax.batch.operations.JobOperator`。 Spring 批处理提供了它自己实现的这个接口(`org.springframework.batch.core.jsr.launch.JsrJobOperator`)。这个实现是通过`javax.batch.runtime.BatchRuntime`加载的。启动基于 JSR-352 的批处理作业的实现如下: @@ -240,7 +240,7 @@ long jobExecutionId = jobOperator.start("fooJob", new Properties()); 当使用`JobOperator#start`调用`SimpleJobOperator`时, Spring 批处理确定调用是初始运行还是对先前执行的运行的重试。使用基于 JSR-352 的`JobOperator#start(String jobXMLName, Properties jobParameters)`,框架将始终创建一个新的 JobInstance(JSR-352 作业参数是不标识的)。为了重新启动作业,需要调用`JobOperator#restart(long executionId, Properties restartParameters)`。 -### [](#jsrContexts)上下文 +### 上下文 JSR-352 定义了两个上下文对象,用于与批处理工件中的作业或步骤的元数据交互:`javax.batch.runtime.context.JobContext`和`javax.batch.runtime.context.StepContext`。这两个都可以在任何步骤级别的工件(`Batchlet`,`ItemReader`等)中使用,而`JobContext`也可以用于作业级别工件(例如`JobListener`)。 @@ -256,7 +256,7 @@ JobContext jobContext; 在 Spring 批处理中,`JobContext`和`StepContext`分别包装其对应的执行对象(`JobExecution`和`StepExecution`)。通过`StepContext#setPersistentUserData(Serializable data)`存储的数据存储在 Spring 批中`StepExecution#executionContext`。 -### [](#jsrStepFlow)阶跃流 +### 阶跃流 在基于 JSR-352 的作业中,步骤流程的工作方式与 Spring 批处理中的工作方式类似。然而,这里有几个细微的区别: @@ -266,7 +266,7 @@ JobContext jobContext; * 转换元素排序--在标准 Spring 批处理作业中,转换元素从最特定的到最不特定的进行排序,并按照该顺序进行评估。JSR-352 作业按照转换元素在 XML 中指定的顺序对其进行评估。 -### [](#jsrScaling)缩放 JSR-352 批处理作业 +### 缩放 JSR-352 批处理作业 Spring 传统的批处理作业有四种缩放方式(最后两种能够跨多个 JVM 执行): @@ -284,7 +284,7 @@ JSR-352 提供了两种缩放批处理作业的选项。这两个选项都只支 * 分区-概念上与 Spring 批处理相同,但实现方式略有不同。 -#### [](#jsrPartitioning)分区 +#### 分区 从概念上讲,JSR-352 中的分区与 Spring 批处理中的分区相同。元数据被提供给每个工作人员,以标识要处理的输入,工作人员在完成后将结果报告给经理。然而,也有一些重要的不同之处: @@ -307,6 +307,6 @@ JSR-352 提供了两种缩放批处理作业的选项。这两个选项都只支 |`javax.batch.api.partition.PartitionAnalyzer` |端点接收由`PartitionCollector`收集的信息,以及从一个完整的分区获得的结果
状态。| | `javax.batch.api.partition.PartitionReducer` |提供为分区
步骤提供补偿逻辑的能力。| -### [](#jsrTesting)测试 +### 测试 由于所有基于 JSR-352 的作业都是异步执行的,因此很难确定作业何时完成。为了帮助进行测试, Spring Batch 提供了`org.springframework.batch.test.JsrTestUtils`。这个实用程序类提供了启动作业、重新启动作业并等待作业完成的功能。作业完成后,将返回相关的`JobExecution`。 \ No newline at end of file diff --git a/docs/spring-batch/monitoring-and-metrics.md b/docs/spring-batch/monitoring-and-metrics.md index 07063d4..f05a5ec 100644 --- a/docs/spring-batch/monitoring-and-metrics.md +++ b/docs/spring-batch/monitoring-and-metrics.md @@ -1,10 +1,10 @@ # 监测和量度 -## [](#monitoring-and-metrics)监控和度量 +## 监控和度量 自版本 4.2 以来, Spring Batch 提供了对基于[Micrometer](https://micrometer.io/)的批监视和度量的支持。本节描述了哪些度量是开箱即用的,以及如何贡献自定义度量。 -### [](#built-in-metrics)内置度量 +### 内置度量 度量集合不需要任何特定的配置。框架提供的所有指标都注册在[千分尺的全球注册中心](https://micrometer.io/docs/concepts#_global_registry)的`spring.batch`前缀下。下表详细解释了所有指标: @@ -20,7 +20,7 @@ | |`status`标记可以是`SUCCESS`或`FAILURE`。| |---|------------------------------------------------------| -### [](#custom-metrics)自定义度量 +### 自定义度量 如果你想在自定义组件中使用自己的度量,我们建议直接使用 Micrometer API。以下是如何对`Tasklet`进行计时的示例: @@ -55,7 +55,7 @@ public class MyTimedTasklet implements Tasklet { } ``` -### [](#disabling-metrics)禁用度量 +### 禁用度量 度量收集是一个类似于日志记录的问题。禁用日志通常是通过配置日志记录库来完成的,对于度量标准来说也是如此。在 Spring 批处理中没有禁用千分尺的度量的功能,这应该在千分尺的一侧完成。由于 Spring 批处理将度量存储在带有`spring.batch`前缀的 Micrometer 的全局注册中心中,因此可以通过以下代码片段将 Micrometer 配置为忽略/拒绝批处理度量: diff --git a/docs/spring-batch/processor.md b/docs/spring-batch/processor.md index 5ca8a89..0f38737 100644 --- a/docs/spring-batch/processor.md +++ b/docs/spring-batch/processor.md @@ -1,6 +1,6 @@ # 项目处理 -## [](#itemProcessor)项处理 +## 项处理 XMLJavaBoth @@ -96,7 +96,7 @@ public Step step1() { `ItemProcessor`与`ItemReader`或`ItemWriter`之间的区别在于,`ItemProcessor`对于`Step`是可选的。 -### [](#chainingItemProcessors)链接项目处理器 +### 链接项目处理器 在许多场景中,执行单个转换是有用的,但是如果你想将多个`ItemProcessor`实现“链”在一起,该怎么办?这可以使用前面提到的复合模式来完成。为了更新前面的单个转换,例如,将`Foo`转换为`Bar`,将其转换为`Foobar`并写出,如以下示例所示: @@ -201,7 +201,7 @@ public CompositeItemProcessor compositeProcessor() { } ``` -### [](#filteringRecords)过滤记录 +### 过滤记录 项目处理器的一个典型用途是在将记录传递给`ItemWriter`之前过滤掉它们。过滤是一种不同于跳过的动作。跳过表示记录无效,而筛选只表示不应写入记录。 @@ -209,7 +209,7 @@ public CompositeItemProcessor compositeProcessor() { 要过滤记录,可以从`ItemProcessor`返回`null`。该框架检测到结果是`null`,并避免将该项添加到交付给`ItemWriter`的记录列表中。像往常一样,从`ItemProcessor`抛出的异常会导致跳过。 -### [](#validatingInput)验证输入 +### 验证输入 在[项目阅读器和项目编写器](readersAndWriters.html#readersAndWriters)章中,讨论了多种解析输入的方法。如果不是“格式良好”的,每个主要实现都会抛出一个异常。如果缺少数据范围,`FixedLengthTokenizer`将抛出一个异常。类似地,试图访问`RowMapper`或`FieldSetMapper`中不存在或格式与预期不同的索引,会引发异常。所有这些类型的异常都是在`read`返回之前抛出的。但是,它们没有解决返回的项目是否有效的问题。例如,如果其中一个字段是年龄,那么它显然不可能是负的。它可以正确地解析,因为它存在并且是一个数字,但是它不会导致异常。由于已经有过多的验证框架, Spring Batch 不会尝试提供另一种验证框架。相反,它提供了一个名为`Validator`的简单接口,可以由任意数量的框架实现,如以下接口定义所示: @@ -294,6 +294,6 @@ public BeanValidatingItemProcessor beanValidatingItemProcessor() throws } ``` -### [](#faultTolerant)容错 +### 容错 当块被回滚时,在读取过程中缓存的项可能会被重新处理。如果一个步骤被配置为容错(通常通过使用跳过或重试处理),则所使用的任何`ItemProcessor`都应该以幂等的方式实现。通常,这将包括对`ItemProcessor`的输入项不执行任何更改,并且只更新结果中的实例。 \ No newline at end of file diff --git a/docs/spring-batch/readersAndWriters.md b/docs/spring-batch/readersAndWriters.md index 4d52154..12d4c77 100644 --- a/docs/spring-batch/readersAndWriters.md +++ b/docs/spring-batch/readersAndWriters.md @@ -1,12 +1,12 @@ # 项目阅读器和项目编写器 -## [](#readersAndWriters)条目阅读器和条目编写器 +## 条目阅读器和条目编写器 XMLJavaBoth 所有批处理都可以用最简单的形式描述为读取大量数据,执行某种类型的计算或转换,并将结果写出来。 Spring Batch 提供了三个关键接口来帮助执行大容量读写:`ItemReader`、`ItemProcessor`和`ItemWriter`。 -### [](#itemReader)`ItemReader` +### `ItemReader` 
虽然是一个简单的概念,但`ItemReader`是从许多不同类型的输入提供数据的手段。最常见的例子包括: @@ -32,7 +32,7 @@ public interface ItemReader { 预计`ItemReader`接口的实现方式仅是前向的。但是,如果底层资源是事务性的(例如 JMS 队列),那么在回滚场景中,调用`read`可能会在随后的调用中返回相同的逻辑项。还值得注意的是,缺少由`ItemReader`处理的项并不会导致抛出异常。例如,配置了返回 0 结果的查询的数据库`ItemReader`在`read`的第一次调用时返回`null`。 -### [](#itemWriter)`ItemWriter` +### `ItemWriter` `ItemWriter`在功能上类似于`ItemReader`,但具有反向操作。资源仍然需要定位、打开和关闭,但它们的不同之处在于`ItemWriter`写出,而不是读入。在数据库或队列的情况下,这些操作可以是插入、更新或发送。输出的序列化的格式是特定于每个批处理作业的。 @@ -48,7 +48,7 @@ public interface ItemWriter { 与`read`上的`ItemReader`一样,`write`提供了`ItemWriter`的基本契约。它尝试写出传入的项目列表,只要它是打开的。由于通常期望将项目“批处理”到一个块中,然后输出,因此接口接受一个项目列表,而不是一个项目本身。在写出列表之后,可以在从写方法返回之前执行任何必要的刷新。例如,如果对 Hibernate DAO 进行写操作,则可以对每个项进行多个 write 调用。然后,写入器可以在返回之前调用 Hibernate 会话上的`flush`。 -### [](#itemStream)`ItemStream` +### `ItemStream` `ItemReaders`和`ItemWriters`都很好地服务于它们各自的目的,但是它们之间有一个共同的关注点,那就是需要另一个接口。通常,作为批处理作业范围的一部分,读取器和编写器需要被打开、关闭,并且需要一种机制来保持状态。`ItemStream`接口实现了这一目的,如下例所示: @@ -67,7 +67,7 @@ public interface ItemStream { 在`ItemStream`的客户端是`Step`(来自 Spring 批处理核心)的特殊情况下,将为每个分步执行创建一个`ExecutionContext`,以允许用户存储特定执行的状态,期望在再次启动相同的`JobInstance`时返回。对于那些熟悉 Quartz 的人,其语义非常类似于 Quartz`JobDataMap`。 -### [](#delegatePatternAndRegistering)委托模式并与步骤一起注册 +### 委托模式并与步骤一起注册 请注意,`CompositeItemWriter`是委托模式的一个示例,这在 Spring 批处理中很常见。委托本身可能实现回调接口,例如`StepListener`。如果它们确实存在,并且如果它们是作为`Job`中的`Step`的一部分与 Spring 批处理核心一起使用的,那么几乎肯定需要用`Step`手动注册它们。直接连接到`Step`的读取器、编写器或处理器如果实现`ItemStream`或`StepListener`接口,就会自动注册。但是,由于委托不为`Step`所知,因此需要将它们作为侦听器或流注入(或者在适当的情况下将两者都注入)。 @@ -135,11 +135,11 @@ public BarWriter barWriter() { } ``` -### [](#flatFiles)平面文件 +### 平面文件 交换大容量数据的最常见机制之一一直是平面文件。与 XML 不同的是,XML 有一个一致的标准来定义它是如何结构化的(XSD),任何读取平面文件的人都必须提前确切地了解文件是如何结构化的。一般来说,所有的平面文件都分为两种类型:定长和定长。分隔符文件是那些字段被分隔符(如逗号)分隔的文件。固定长度文件的字段是固定长度的。 -#### [](#fieldSet)the`FieldSet` +#### the`FieldSet` 在处理 Spring 批处理中的平面文件时,无论它是用于输入还是输出,最重要的类之一是`FieldSet`。许多体系结构和库包含帮助你从文件中读取的抽象,但它们通常返回`String`或`String`对象的数组。这真的只会让你走到一半。`FieldSet`是 Spring 批处理的抽象,用于从文件资源中绑定字段。它允许开发人员以与处理数据库输入大致相同的方式处理文件输入。a`FieldSet`在概念上类似于 jdbc`ResultSet`。`FieldSet`只需要一个参数:一个`String`令牌数组。还可以选择地配置字段的名称,以便可以按照`ResultSet`之后的模式通过索引或名称访问字段,如以下示例所示: @@ -153,7 +153,7 @@ boolean booleanValue = fs.readBoolean(2); 在`FieldSet`接口上还有许多选项,例如`Date`、long、`BigDecimal`,等等。`FieldSet`的最大优点是它提供了对平面文件输入的一致解析。在处理由格式异常引起的错误或进行简单的数据转换时,它可以是一致的,而不是以潜在的意外方式对每个批处理作业进行不同的解析。 -#### [](#flatFileItemReader)`FlatFileItemReader` +#### `FlatFileItemReader` 平面文件是最多包含二维(表格)数据的任何类型的文件。 Spring 批处理框架中的平面文件的读取是由一个名为`FlatFileItemReader`的类提供的,该类为平面文件的读取和解析提供了基本功能。`FlatFileItemReader`的两个最重要的必需依赖项是`Resource`和`LineMapper`。`LineMapper`接口将在下一节中进行更多的探讨。资源属性表示 Spring 核心`Resource`。说明如何创建这种类型的 bean 的文档可以在[Spring Framework, Chapter 5. Resources](https://docs.spring.io/spring/docs/current/spring-framework-reference/core.html#resources)中找到。因此,除了展示下面的简单示例之外,本指南不涉及创建`Resource`对象的细节: @@ -176,7 +176,7 @@ Resource resource = new FileSystemResource("resources/trades.csv"); |skippedLinesCallback | LineCallbackHandler |传递
文件中要跳过的行的原始内容所使用的接口。如果`linesToSkip`被设置为 2,那么这个接口会被调用两次。|
,读取器会在打开输入资源时抛出异常。否则,它会记录该问题并继续处理。|
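
作为补充,下面给出一个示意性的配置(非官方示例,文件路径为假设,并假定`Player`具有与字段名对应的 setter),演示如何组合使用上表中的`linesToSkip`和`skippedLinesCallback`属性来跳过并记录标题行:

```java
@Bean
public FlatFileItemReader<Player> playerItemReader() {
    return new FlatFileItemReaderBuilder<Player>()
            .name("playerItemReader")
            .resource(new FileSystemResource("resources/players.csv"))
            .linesToSkip(1) // 跳过第一行(标题行)
            .skippedLinesCallback(line -> System.out.println("跳过的行: " + line))
            .delimited()
            .names(new String[] {"ID", "lastName", "firstName", "position", "birthYear", "debutYear"})
            .targetType(Player.class)
            .build();
}
```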
`IncorrectTokenCountException` `DelimitedLineTokenizer`和`FixedLengthLineTokenizer`都可以指定可用于创建`FieldSet`的列名。但是,如果列名的数量与对一行进行标记时发现的列数不匹配,则无法创建`FieldSet`,并抛出一个`IncorrectTokenCountException`,其中包含遇到的令牌数量和预期的数量,如以下示例所示: @@ -576,7 +576,7 @@ catch (IncorrectTokenCountException e) { 因为标记器配置了 4 个列名,但在文件中只找到了 3 个令牌,所以抛出了一个`IncorrectTokenCountException`。 -###### [](#incorrectLineLengthException)`IncorrectLineLengthException` +###### `IncorrectLineLengthException` 以固定长度格式格式化的文件在解析时有额外的要求,因为与分隔格式不同,每个列必须严格遵守其预定义的宽度。如果行的总长度不等于此列的最大值,则抛出一个异常,如以下示例所示: @@ -606,11 +606,11 @@ assertEquals("", tokens.readString(1)); 前面的示例与前面的示例几乎相同,只是调用了`tokenizer.setStrict(false)`。这个设置告诉标记器在标记行时不要强制行长。现在正确地创建并返回了`FieldSet`。但是,对于其余的值,它只包含空标记。 -#### [](#flatFileItemWriter)`FlatFileItemWriter` +#### `FlatFileItemWriter` 写入平面文件也存在从文件读入时必须克服的问题。一个步骤必须能够以事务性的方式编写分隔格式或固定长度格式。 -##### [](#lineAggregator)`LineAggregator` +##### `LineAggregator` 正如`LineTokenizer`接口是获取一个项并将其转换为`String`所必需的一样,文件写入必须有一种方法,可以将多个字段聚合到一个字符串中,以便将其写入文件。在 Spring 批处理中,这是`LineAggregator`,如下面的接口定义所示: @@ -624,7 +624,7 @@ public interface LineAggregator { `LineAggregator`是`LineTokenizer`的逻辑对立面。`LineTokenizer`接受一个`String`并返回一个`FieldSet`,而`LineAggregator`接受一个`item`并返回一个`String`。 -###### [](#PassThroughLineAggregator)`PassThroughLineAggregator` +###### `PassThroughLineAggregator` `LineAggregator`接口的最基本的实现是`PassThroughLineAggregator`,它假定对象已经是一个字符串,或者它的字符串表示可以用于编写,如下面的代码所示: @@ -639,7 +639,7 @@ public class PassThroughLineAggregator implements LineAggregator { 如果需要直接控制创建字符串,那么前面的实现是有用的,但是`FlatFileItemWriter`的优点,例如事务和重新启动支持,是必要的。 -##### [](#SimplifiedFileWritingExample)简化文件编写示例 +##### 简化文件编写示例 既然`LineAggregator`接口及其最基本的实现`PassThroughLineAggregator`已经定义好了,那么编写的基本流程就可以解释了: @@ -683,7 +683,7 @@ public FlatFileItemWriter itemWriter() { } ``` -##### [](#FieldExtractor)`FieldExtractor` +##### `FieldExtractor` 前面的示例对于对文件的写入的最基本使用可能是有用的。然而,`FlatFileItemWriter`的大多数用户都有一个需要写出的域对象,因此必须将其转换为一行。在文件阅读中,需要进行以下操作: @@ -713,11 +713,11 @@ public interface FieldExtractor { `FieldExtractor`接口的实现应该从提供的对象的字段创建一个数组,然后可以在元素之间使用分隔符写出该数组,或者作为固定宽度线的一部分。 -###### [](#PassThroughFieldExtractor)`PassThroughFieldExtractor` +###### `PassThroughFieldExtractor` 在许多情况下,需要写出集合,例如一个数组,`Collection`或`FieldSet`。从这些集合类型中的一种“提取”一个数组是非常简单的。要做到这一点,将集合转换为一个数组。因此,在此场景中应该使用`PassThroughFieldExtractor`。应该注意的是,如果传入的对象不是集合的类型,那么`PassThroughFieldExtractor`将返回一个仅包含要提取的项的数组。 -###### [](#BeanWrapperFieldExtractor)`BeanWrapperFieldExtractor` +###### `BeanWrapperFieldExtractor` 与文件读取部分中描述的`BeanWrapperFieldSetMapper`一样,通常更好的方法是配置如何将域对象转换为对象数组,而不是自己编写转换。`BeanWrapperFieldExtractor`提供了这种功能,如以下示例所示: @@ -739,7 +739,7 @@ assertEquals(born, values[2]); 这个提取器实现只有一个必需的属性:要映射的字段的名称。正如`BeanWrapperFieldSetMapper`需要字段名称来将`FieldSet`上的字段映射到所提供对象上的 setter 一样,`BeanWrapperFieldExtractor`也需要名称来映射到 getter 以创建对象数组。值得注意的是,名称的顺序决定了数组中字段的顺序。 -##### [](#delimitedFileWritingExample)分隔的文件编写示例 +##### 分隔的文件编写示例 最基本的平面文件格式是一种所有字段都用分隔符分隔的格式。这可以使用`DelimitedLineAggregator`来完成。下面的示例写出了一个简单的域对象,该对象表示对客户帐户的信用: @@ -818,7 +818,7 @@ public FlatFileItemWriter itemWriter(Resource outputResource) th } ``` -##### [](#fixedWidthFileWritingExample)固定宽度文件编写示例 +##### 固定宽度文件编写示例 分隔符并不是唯一一种平面文件格式。许多人更喜欢为每个列使用一个设置的宽度来划分字段,这通常称为“固定宽度”。 Spring 批处理在用`FormatterLineAggregator`写文件时支持这一点。 @@ -901,11 +901,11 @@ public FlatFileItemWriter itemWriter(Resource outputResource) th } ``` -##### [](#handlingFileCreation)处理文件创建 +##### 处理文件创建 
`FlatFileItemReader`与文件资源的关系非常简单。当读取器被初始化时,它会打开该文件(如果它存在的话),如果它不存在,则会抛出一个异常。写文件并不是那么简单。乍一看,对于`FlatFileItemWriter`似乎应该存在类似的直接契约:如果文件已经存在,则抛出一个异常,如果不存在,则创建它并开始写入。然而,重新启动`Job`可能会导致问题。在正常的重启场景中,契约是相反的:如果文件存在,则从最后一个已知的良好位置开始向它写入,如果不存在,则抛出一个异常。但是,如果此作业的文件名总是相同,会发生什么情况?在这种情况下,如果文件存在,你可能想要删除它,除非是重新启动。由于这种可能性,`FlatFileItemWriter`包含属性`shouldDeleteIfExists`。将此属性设置为 true 将导致在打开 Writer 时删除同名的现有文件。 -### [](#xmlReadingWriting)XML 项读取器和编写器 +### XML 项读取器和编写器 Spring Batch 提供了用于读取 XML 记录并将它们映射到 Java 对象以及将 Java 对象写为 XML 记录的事务基础设施。 @@ -926,7 +926,7 @@ Spring Batch 提供了用于读取 XML 记录并将它们映射到 Java 对象 通过介绍 OXM 以及如何使用 XML 片段来表示记录,我们现在可以更仔细地研究阅读器和编写器。 -#### [](#StaxEventItemReader)`StaxEventItemReader` +#### `StaxEventItemReader` `StaxEventItemReader`配置为处理来自 XML 输入流的记录提供了一个典型的设置。首先,考虑`StaxEventItemReader`可以处理的以下一组 XML 记录: @@ -1071,7 +1071,7 @@ while (hasNext) { } ``` -#### [](#StaxEventItemWriter)`StaxEventItemWriter` +#### `StaxEventItemWriter` 输出与输入对称地工作。`StaxEventItemWriter`需要一个`Resource`、一个编组器和一个`rootTagName`。将 Java 对象传递给编组器(通常是标准的 Spring OXM 编组器),该编组器通过使用自定义事件编写器将 OXM 工具为每个片段产生的`StartDocument`和`EndDocument`事件进行过滤,从而将其写到`Resource`。 @@ -1185,7 +1185,7 @@ trade.setCustomer("Customer1"); staxItemWriter.write(trade); ``` -### [](#jsonReadingWriting)JSON 条目阅读器和编写器 +### JSON 条目阅读器和编写器 Spring Batch 以以下格式提供对读取和写入 JSON 资源的支持: @@ -1208,7 +1208,7 @@ Spring Batch 以以下格式提供对读取和写入 JSON 资源的支持: 假定 JSON 资源是与单个项对应的 JSON 对象数组。 Spring 批处理不绑定到任何特定的 JSON 库。 -#### [](#JsonItemReader)`JsonItemReader` +#### `JsonItemReader` `JsonItemReader`将 JSON 解析和绑定委托给`org.springframework.batch.item.json.JsonObjectReader`接口的实现。该接口旨在通过使用流 API 以块形式读取 JSON 对象来实现。目前提供了两种实现方式: @@ -1235,7 +1235,7 @@ public JsonItemReader jsonItemReader() { } ``` -#### [](#jsonfileitemwriter)`JsonFileItemWriter` +#### `JsonFileItemWriter` `JsonFileItemWriter`将项的编组委托给`org.springframework.batch.item.json.JsonObjectMarshaller`接口。这个接口的契约是将一个对象带到一个 JSON`String`。目前提供了两种实现方式: @@ -1262,7 +1262,7 @@ public JsonFileItemWriter jsonFileItemWriter() { } ``` -### [](#multiFileInput)多文件输入 +### 多文件输入 在一个`Step`中处理多个文件是一个常见的要求。假设所有文件都具有相同的格式,`MultiResourceItemReader`在 XML 和平面文件处理中都支持这种类型的输入。考虑目录中的以下文件: @@ -1302,7 +1302,7 @@ public MultiResourceItemReader multiResourceReader() { | |通过使用`MultiResourceItemReader#setComparator(Comparator)`对输入资源进行排序,以确保在重新启动场景中的作业运行之间保留资源排序。| |---|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -### [](#database)数据库 +### 数据库 像大多数 Enterprise 应用程序样式一样,数据库是批处理的中心存储机制。然而,由于系统必须使用的数据集的巨大规模,批处理与其他应用程序样式不同。如果 SQL 语句返回 100 万行,那么结果集可能会将所有返回的结果保存在内存中,直到所有行都被读取为止。 Spring Batch 为此问题提供了两种类型的解决方案: @@ -1310,7 +1310,7 @@ public MultiResourceItemReader multiResourceReader() { * [分页`ItemReader`实现] -#### [](#cursorBasedItemReaders)基于光标的`ItemReader`实现 +#### 基于光标的`ItemReader`实现 使用数据库游标通常是大多数批处理开发人员的默认方法,因为它是数据库解决关系数据“流”问题的方法。Java`ResultSet`类本质上是一种用于操作游标的面向对象机制。a`ResultSet`维护当前数据行的游标。在`ResultSet`上调用`next`将光标移动到下一行。 Spring 基于批处理游标的`ItemReader`实现在初始化时打开游标,并在每次调用`read`时将游标向前移动一行,返回可用于处理的映射对象。然后调用`close`方法,以确保释放所有资源。 Spring 核心`JdbcTemplate`通过使用回调模式来完全映射`ResultSet`中的所有行,并在将控制权返回给方法调用方之前关闭,从而绕过了这个问题。然而,在批处理中,这必须等到步骤完成。下图显示了基于游标的`ItemReader`如何工作的通用关系图。请注意,虽然示例使用 SQL(因为 SQL 是广为人知的),但任何技术都可以实现基本方法。 @@ -1320,7 +1320,7 @@ public MultiResourceItemReader multiResourceReader() { 这个例子说明了基本模式。给定一个有三列的“foo”表:`ID`、`NAME`和`BAR`,选择 ID 大于 1 但小于 7 的所有行。这将把游标的开头(第 1 行)放在 ID2 上。该行的结果应该是一个完全映射的`Foo`对象。调用`read()`再次将光标移动到下一行,即 ID 为 3 的`Foo`。在每个`read`之后写出这些读取的结果,从而允许对对象进行垃圾收集(假设没有实例变量维护对它们的引用)。 
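
下面是一个最小的配置示意(非官方示例,输出路径为假设),展示如何通过`FlatFileItemWriterBuilder`设置该属性:

```java
@Bean
public FlatFileItemWriter<CustomerCredit> customerCreditWriter() {
    return new FlatFileItemWriterBuilder<CustomerCredit>()
            .name("customerCreditWriter")
            .resource(new FileSystemResource("target/output/customers.txt"))
            .lineAggregator(new PassThroughLineAggregator<>())
            .shouldDeleteIfExists(true) // 打开 Writer 时删除已存在的同名文件
            .build();
}
```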
-##### [](#JdbcCursorItemReader)`JdbcCursorItemReader` +##### `JdbcCursorItemReader` `JdbcCursorItemReader`是基于光标的技术的 JDBC 实现。它可以直接与`ResultSet`一起工作,并且需要针对从`DataSource`获得的连接运行 SQL 语句。下面的数据库模式用作示例: @@ -1413,7 +1413,7 @@ public JdbcCursorItemReader itemReader() { } ``` -###### [](#JdbcCursorItemReaderProperties)附加属性 +###### 附加属性 因为在 Java 中有很多不同的打开光标的选项,所以`JdbcCursorItemReader`上有很多可以设置的属性,如下表所示: @@ -1427,7 +1427,7 @@ public JdbcCursorItemReader itemReader() { | driverSupportsAbsolute |指示 JDBC 驱动程序是否支持
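作为示意,下面用原生 JDBC 大致勾勒上述"每次调用`read`将光标前移一行"的基本模式(仅用于说明概念;`Foo`及其 setter 为假设的域对象,并省略了异常处理和资源释放):

```java
public class FooCursorReader {

    private final ResultSet rs;

    public FooCursorReader(Connection connection) throws SQLException {
        // 初始化时打开光标
        this.rs = connection.createStatement()
                .executeQuery("SELECT ID, NAME, BAR FROM FOO WHERE ID > 1 AND ID < 7");
    }

    // 每次调用 read() 将光标前移一行,并返回完全映射的对象
    public Foo read() throws SQLException {
        if (rs.next()) {
            Foo foo = new Foo();
            foo.setId(rs.getLong("ID"));
            foo.setName(rs.getString("NAME"));
            foo.setBar(rs.getString("BAR"));
            return foo;
        }
        return null; // 数据耗尽时返回 null,与 ItemReader 的契约一致
    }
}
```
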
在`ResultSet`上设置绝对行号。对于支持`ResultSet.absolute()`的 JDBC 驱动程序,建议将其设置为`true`,因为这可能会提高性能,特别是在处理大型数据集的过程中某个步骤失败时。默认值为`false`。|
是否应由所有其他处理使用,从而共享相同的事务。如果将其设置为`false`,则光标使用自己的连接打开,并且不参与为该步骤其余处理所启动的任何事务。如果将此标志设置为`true`,则必须将数据源包装在`ExtendedConnectionDataSourceProxy`中,以防止连接在每次提交后被关闭和释放。当此选项设置为`true`时,用于打开光标的语句将使用 'READ_ONLY' 和 'HOLD_CURSORS_OVER_COMMIT' 选项创建。这允许光标跨事务的开始以及步骤处理中执行的提交保持打开。要使用此功能,你需要一个支持此功能的数据库,以及一个支持 JDBC 3.0 或更高版本的 JDBC 驱动程序。默认值为`false`。|
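
例如,下面的示意配置(非官方示例)展示了将`useSharedExtendedConnection`设置为`true`时,如何按上表的要求用`ExtendedConnectionDataSourceProxy`包装数据源:

```java
@Bean
public ExtendedConnectionDataSourceProxy batchDataSource(DataSource dataSource) {
    // 防止共享的光标连接在每次提交后被关闭和释放
    return new ExtendedConnectionDataSourceProxy(dataSource);
}

@Bean
public JdbcCursorItemReader<CustomerCredit> itemReader(ExtendedConnectionDataSourceProxy batchDataSource) {
    JdbcCursorItemReader<CustomerCredit> reader = new JdbcCursorItemReader<>();
    reader.setDataSource(batchDataSource);
    reader.setSql("select ID, NAME, CREDIT from CUSTOMER");
    reader.setRowMapper(new CustomerCreditRowMapper());
    reader.setUseSharedExtendedConnection(true); // 光标连接与步骤的其余处理共享同一事务
    return reader;
}
```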
方法,并且设置起来相当简单。 @@ -1860,7 +1860,7 @@ public FooService fooService() { } ``` -### [](#process-indicator)防止状态持久性 +### 防止状态持久性 默认情况下,所有`ItemReader`和`ItemWriter`实现在提交之前将其当前状态存储在`ExecutionContext`中。然而,这可能并不总是理想的行为。例如,许多开发人员选择通过使用过程指示器使他们的数据库阅读器“可重新运行”。在输入数据中添加一个额外的列,以指示是否对其进行了处理。当读取(或写入)特定记录时,处理后的标志从`false`翻转到`true`。然后,SQL 语句可以在`where`子句中包含一个额外的语句,例如`where PROCESSED_IND = false`,从而确保在重新启动的情况下仅返回未处理的记录。在这种情况下,最好不要存储任何状态,例如当前行号,因为它在重新启动时是不相关的。由于这个原因,所有的读者和作者都包括“SaveState”财产。 @@ -1912,11 +1912,11 @@ public JdbcCursorItemReader playerSummarizationSource(DataSource dataSource) { 上面配置的`ItemReader`不会在`ExecutionContext`中为其参与的任何执行创建任何条目。 -### [](#customReadersWriters)创建自定义项目阅读器和项目编写器 +### 创建自定义项目阅读器和项目编写器 到目前为止,本章已经讨论了 Spring 批处理中的读和写的基本契约,以及这样做的一些常见实现。然而,这些都是相当通用的,并且有许多潜在的场景可能不会被开箱即用的实现所覆盖。本节通过使用一个简单的示例,展示了如何创建自定义`ItemReader`和`ItemWriter`实现,并正确地实现它们的契约。`ItemReader`还实现了`ItemStream`,以说明如何使读取器或写入器重新启动。 -#### [](#customReader)自定义`ItemReader`示例 +#### 自定义`ItemReader`示例 为了这个示例的目的,我们创建了一个简单的`ItemReader`实现,该实现从提供的列表中读取数据。我们首先实现`ItemReader`的最基本契约,即`read`方法,如以下代码所示: @@ -1955,7 +1955,7 @@ assertEquals("3", itemReader.read()); assertNull(itemReader.read()); ``` -##### [](#restartableReader)使`ItemReader`可重启 +##### 使`ItemReader`可重启 最后的挑战是使`ItemReader`重新启动。目前,如果处理被中断并重新开始,`ItemReader`必须在开始时开始。这实际上在许多场景中都是有效的,但有时更可取的做法是,在批处理作业停止的地方重新启动它。关键的判别式通常是读者是有状态的还是无状态的。无状态的读者不需要担心重启性,但是有状态的读者必须尝试在重新启动时重建其最后已知的状态。出于这个原因,我们建议你在可能的情况下保持自定义阅读器的无状态,这样你就不必担心重启性了。 @@ -2021,7 +2021,7 @@ assertEquals("2", itemReader.read()); 还值得注意的是,`ExecutionContext`中使用的键不应该是微不足道的。这是因为相同的`ExecutionContext`用于`ItemStreams`中的所有`Step`。在大多数情况下,只需在键前加上类名就足以保证唯一性。然而,在很少的情况下,在相同的步骤中使用两个相同类型的`ItemStream`(如果需要输出两个文件,可能会发生这种情况),则需要一个更唯一的名称。由于这个原因,许多 Spring 批处理`ItemReader`和`ItemWriter`实现都有一个`setName()`属性,该属性允许重写这个键名。 -#### [](#customWriter)自定义`ItemWriter`示例 +#### 自定义`ItemWriter`示例 实现自定义`ItemWriter`在许多方面与上面的`ItemReader`示例相似,但在足够多的方面有所不同,以保证它自己的示例。然而,添加可重启性本质上是相同的,因此在本例中不涉及它。与`ItemReader`示例一样,使用`List`是为了使示例尽可能简单: @@ -2040,17 +2040,17 @@ public class CustomItemWriter implements ItemWriter { } ``` -##### [](#restartableWriter)使`ItemWriter`重新启动 +##### 使`ItemWriter`重新启动 要使`ItemWriter`可重启,我们将遵循与`ItemReader`相同的过程,添加并实现`ItemStream`接口以同步执行上下文。在这个示例中,我们可能必须计算处理的项目的数量,并将其添加为页脚记录。如果需要这样做,我们可以在`ItemWriter`中实现`ItemStream`,这样,如果流被重新打开,计数器将从执行上下文中重新构造。 在许多实际的情况下,自定义`ItemWriters`也会委托给另一个本身是可重启的编写器(例如,当写到文件时),或者它会写到事务资源,因此不需要重启,因为它是无状态的。当你有一个有状态的编写器时,你可能应该确保实现`ItemStream`以及`ItemWriter`。还请记住,Writer 的客户机需要知道`ItemStream`,因此你可能需要在配置中将其注册为流。 -### [](#itemReaderAndWriterImplementations)项读取器和编写器实现 +### 项读取器和编写器实现 在本节中,我们将向你介绍在前几节中尚未讨论过的读者和作者。 -#### [](#decorators)装饰者 +#### 装饰者 在某些情况下,用户需要将专门的行为附加到预先存在的`ItemReader`。 Spring Batch 提供了一些开箱即用的装饰器,它们可以将额外的行为添加到你的`ItemReader`和`ItemWriter`实现中。 @@ -2068,34 +2068,34 @@ Spring 批处理包括以下装饰器: * [`ClassifierCompositeItemProcessor`] -##### [](#synchronizedItemStreamReader)`SynchronizedItemStreamReader` +##### `SynchronizedItemStreamReader` 当使用不是线程安全的`ItemReader`时, Spring Batch 提供`SynchronizedItemStreamReader`decorator,该 decorator 可用于使`ItemReader`线程安全。 Spring 批处理提供了一个`SynchronizedItemStreamReaderBuilder`来构造`SynchronizedItemStreamReader`的实例。 -##### [](#singleItemPeekableItemReader)`SingleItemPeekableItemReader` +##### `SingleItemPeekableItemReader` Spring 批处理包括向`ItemReader`添加 PEEK 方法的装饰器。这种 peek 方法允许用户提前查看一项。对 Peek 的重复调用返回相同的项,这是从`read`方法返回的下一个项。 Spring 批处理提供了一个`SingleItemPeekableItemReaderBuilder`来构造`SingleItemPeekableItemReader`的实例。 | |SingleitemPeekableitemreader 的 Peek 方法不是线程安全的,因为它不可能
在多个线程中同时执行 peek。如果多个线程同时执行了 peek,则只有其中一个线程会在下一次调用 `read` 时获得该项。|
[](#hibernateCursorItemReader)`HibernateCursorItemReader` +##### `HibernateCursorItemReader` `HibernateCursorItemReader`是用于读取在 Hibernate 之上构建的数据库记录的`ItemStreamReader`。它执行 HQL 查询,然后在初始化时,在调用`read()`方法时对结果集进行迭代,依次返回与当前行对应的对象。 Spring 批处理提供了一个`HibernateCursorItemReaderBuilder`来构造`HibernateCursorItemReader`的实例。 -##### [](#hibernatePagingItemReader)`HibernatePagingItemReader` +##### `HibernatePagingItemReader` `HibernatePagingItemReader`是一个`ItemReader`,用于读取建立在 Hibernate 之上的数据库记录,并且一次只读取固定数量的项。 Spring 批处理提供了一个`HibernatePagingItemReaderBuilder`来构造`HibernatePagingItemReader`的实例。 -##### [](#repositoryItemReader)`RepositoryItemReader` +##### `RepositoryItemReader` `RepositoryItemReader`是通过使用`PagingAndSortingRepository`读取记录的`ItemReader`。 Spring 批处理提供了一个`RepositoryItemReaderBuilder`来构造`RepositoryItemReader`的实例。 -#### [](#databaseWriters)数据库编写者 +#### 数据库编写者 Spring Batch 提供以下数据库编写器: @@ -2187,35 +2187,35 @@ Spring Batch 提供以下数据库编写器: * [`GemfireItemWriter`] -##### [](#neo4jItemWriter)`Neo4jItemWriter` +##### `Neo4jItemWriter` `Neo4jItemWriter`是一个`ItemWriter`实现,它将写到 NEO4J 数据库。 Spring 批处理提供了一个`Neo4jItemWriterBuilder`来构造`Neo4jItemWriter`的实例。 -##### [](#mongoItemWriter)`MongoItemWriter` +##### `MongoItemWriter` `MongoItemWriter`是一个`ItemWriter`实现,它使用 Spring data 的`MongoOperations`的实现将数据写到 MongoDB 存储。 Spring 批处理提供了一个`MongoItemWriterBuilder`来构造`MongoItemWriter`的实例。 -##### [](#repositoryItemWriter)`RepositoryItemWriter` +##### `RepositoryItemWriter` `RepositoryItemWriter`是来自 Spring 数据的`ItemWriter`包装器。 Spring 批处理提供了一个`RepositoryItemWriterBuilder`来构造`RepositoryItemWriter`的实例。 -##### [](#hibernateItemWriter)`HibernateItemWriter` +##### `HibernateItemWriter` `HibernateItemWriter`是一个`ItemWriter`,它使用一个 Hibernate 会话来保存或更新不是当前 Hibernate 会话的一部分的实体。 Spring 批处理提供了一个`HibernateItemWriterBuilder`来构造`HibernateItemWriter`的实例。 -##### [](#jdbcBatchItemWriter)`JdbcBatchItemWriter` +##### `JdbcBatchItemWriter` `JdbcBatchItemWriter`是一个`ItemWriter`,它使用`NamedParameterJdbcTemplate`中的批处理特性来为提供的所有项执行一批语句。 Spring 批处理提供了一个`JdbcBatchItemWriterBuilder`来构造`JdbcBatchItemWriter`的实例。 -##### [](#jpaItemWriter)`JpaItemWriter` +##### `JpaItemWriter` `JpaItemWriter`是一个`ItemWriter`,它使用 JPA `EntityManagerFactory`来合并不属于持久性上下文的任何实体。 Spring 批处理提供了一个`JpaItemWriterBuilder`来构造`JpaItemWriter`的实例。 -##### [](#gemfireItemWriter)`GemfireItemWriter` +##### `GemfireItemWriter` `GemfireItemWriter`是一个`ItemWriter`,它使用一个`GemfireTemplate`将项目存储在 Gemfire 中,作为键/值对。 Spring 批处理提供了一个`GemfireItemWriterBuilder`来构造`GemfireItemWriter`的实例。 -#### [](#specializedReaders)专业阅读器 +#### 专业阅读器 Spring Batch 提供以下专门的阅读器: @@ -2225,19 +2225,19 @@ Spring Batch 提供以下专门的阅读器: * [`AvroItemReader`] -##### [](#ldifReader)`LdifReader` +##### `LdifReader` `AvroItemWriter`读取来自`Resource`的 LDIF(LDAP 数据交换格式)记录,对它们进行解析,并为执行的每个`LdapAttribute`返回一个`LdapAttribute`对象。 Spring 批处理提供了一个`LdifReaderBuilder`来构造`LdifReader`的实例。 -##### [](#mappingLdifReader)`MappingLdifReader` +##### `MappingLdifReader` `MappingLdifReader`从`Resource`读取 LDIF(LDAP 数据交换格式)记录,解析它们,然后将每个 LDIF 记录映射到 POJO(普通的旧 Java 对象)。每个读都返回一个 POJO。 Spring 批处理提供了一个`MappingLdifReaderBuilder`来构造`MappingLdifReader`的实例。 -##### [](#avroItemReader)`AvroItemReader` +##### `AvroItemReader` `AvroItemReader`从资源中读取序列化的 AVRO 数据。每个读取返回由 Java 类或 AVRO 模式指定的类型的实例。读取器可以被可选地配置为嵌入 AVRO 模式的输入或不嵌入该模式的输入。 Spring 批处理提供了一个`AvroItemReaderBuilder`来构造`AvroItemReader`的实例。 -#### [](#specializedWriters)专业作家 +#### 专业作家 Spring Batch 提供以下专业的写作人员: @@ -2245,20 +2245,20 @@ Spring Batch 提供以下专业的写作人员: * [`AvroItemWriter`] -##### [](#simpleMailMessageItemWriter)`SimpleMailMessageItemWriter` +##### 
`SimpleMailMessageItemWriter` `SimpleMailMessageItemWriter`是可以发送邮件的`ItemWriter`。它将消息的实际发送委托给`MailSender`的实例。 Spring 批处理提供了一个`SimpleMailMessageItemWriterBuilder`来构造`SimpleMailMessageItemWriter`的实例。 -##### [](#avroItemWriter)`AvroItemWriter` +##### `AvroItemWriter` `AvroItemWrite`根据给定的类型或模式将 Java 对象序列化到一个 WriteableResource。编写器可以被可选地配置为在输出中嵌入或不嵌入 AVRO 模式。 Spring 批处理提供了一个`AvroItemWriterBuilder`来构造`AvroItemWriter`的实例。 -#### [](#specializedProcessors)专用处理器 +#### 专用处理器 Spring Batch 提供以下专门的处理器: * [`ScriptItemProcessor`] -##### [](#scriptItemProcessor)`ScriptItemProcessor` +##### `ScriptItemProcessor` `ScriptItemProcessor`是一个`ItemProcessor`,它将当前项目传递给提供的脚本,并且该脚本的结果将由处理器返回。 Spring 批处理提供了一个`ScriptItemProcessorBuilder`来构造`ScriptItemProcessor`的实例。 \ No newline at end of file diff --git a/docs/spring-batch/repeat.md b/docs/spring-batch/repeat.md index 02bdb64..9dae85d 100644 --- a/docs/spring-batch/repeat.md +++ b/docs/spring-batch/repeat.md @@ -1,10 +1,10 @@ # 重复 -## [](#repeat)重复 +## 重复 XMLJavaBoth -### [](#repeatTemplate)repeatemplate +### repeatemplate 批处理是关于重复的操作,或者作为简单的优化,或者作为工作的一部分。 Spring Batch 具有`RepeatOperations`接口,可以对重复进行策略规划和推广,并提供相当于迭代器框架的内容。`RepeatOperations`接口具有以下定义: @@ -47,13 +47,13 @@ template.iterate(new RepeatCallback() { 在前面的示例中,我们返回`RepeatStatus.CONTINUABLE`,以表明还有更多的工作要做。回调还可以返回`RepeatStatus.FINISHED`,向调用者发出信号,表示没有更多的工作要做。一些迭代可以由回调中所做的工作固有的考虑因素来终止。就回调而言,其他方法实际上是无限循环,并且完成决策被委托给外部策略,如前面示例中所示的情况。 -#### [](#repeatContext)repeatcontext +#### repeatcontext `RepeatCallback`的方法参数是`RepeatContext`。许多回调忽略了上下文。但是,如果有必要,它可以作为一个属性包来存储迭代期间的瞬态数据。在`iterate`方法返回后,上下文不再存在。 如果正在进行嵌套的迭代,则`RepeatContext`具有父上下文。父上下文有时用于存储需要在对`iterate`的调用之间共享的数据。例如,如果你想计算迭代中某个事件发生的次数,并在随后的调用中记住它,那么就是这种情况。 -#### [](#repeatStatus)重复状态 +#### 重复状态 `RepeatStatus`是 Spring 批处理用来指示处理是否已经完成的枚举。它有两个可能的`RepeatStatus`值,如下表所示: @@ -64,7 +64,7 @@ template.iterate(new RepeatCallback() { `RepeatStatus`值也可以通过在`RepeatStatus`中使用`and()`方法与逻辑和操作结合。这样做的效果是在可持续的标志上做一个合乎逻辑的操作。换句话说,如果任一状态是`FINISHED`,则结果是`FINISHED`。 -### [](#completionPolicies)完工政策 +### 完工政策 在`RepeatTemplate`内,`iterate`方法中的循环的终止由`CompletionPolicy`确定,这也是`RepeatContext`的工厂。`RepeatTemplate`负责使用当前策略创建`RepeatContext`,并在迭代的每个阶段将其传递给`RepeatCallback`。回调完成其`doInIteration`后,`RepeatTemplate`必须调用`CompletionPolicy`,以要求它更新其状态(该状态将存储在`RepeatContext`中)。然后,它询问策略迭代是否完成。 @@ -72,7 +72,7 @@ Spring 批处理提供了`CompletionPolicy`的一些简单的通用实现。`Sim 对于更复杂的决策,用户可能需要实现自己的完成策略。例如,一旦联机系统投入使用,一个批处理窗口就会阻止批处理作业的执行,这将需要一个自定义策略。 -### [](#repeatExceptionHandling)异常处理 +### 异常处理 如果在`RepeatCallback`中抛出了异常,则`RepeatTemplate`查询`ExceptionHandler`,该查询可以决定是否重新抛出异常。 @@ -91,7 +91,7 @@ public interface ExceptionHandler { `SimpleLimitExceptionHandler`的一个重要的可选属性是名为`useParent`的布尔标志。默认情况下它是`false`,因此该限制仅在当前的`RepeatContext`中考虑。当设置为`true`时,该限制在嵌套迭代中跨兄弟上下文(例如步骤中的一组块)保持不变。 -### [](#repeatListeners)听众 +### 听众 通常情况下,能够接收跨多个不同迭代的交叉关注点的额外回调是有用的。为此, Spring Batch 提供了`RepeatListener`接口。`RepeatTemplate`允许用户注册`RepeatListener`实现,并且在迭代期间可用的情况下,他们将获得带有`RepeatContext`和`RepeatStatus`的回调。 @@ -111,11 +111,11 @@ public interface RepeatListener { 请注意,当有多个侦听器时,它们在一个列表中,因此有一个顺序。在这种情况下,`open`和`before`的调用顺序相同,而`after`、`onError`和`close`的调用顺序相反。 -### [](#repeatParallelProcessing)并行处理 +### 并行处理 `RepeatOperations`的实现不限于按顺序执行回调。一些实现能够并行地执行它们的回调,这一点非常重要。为此, Spring Batch 提供了`TaskExecutorRepeatTemplate`,它使用 Spring `TaskExecutor`策略来运行`RepeatCallback`。默认值是使用`SynchronousTaskExecutor`,其效果是在相同的线程中执行整个迭代(与正常的`RepeatTemplate`相同)。 -### [](#declarativeIteration)声明式迭代 +### 声明式迭代 有时,你知道有一些业务处理在每次发生时都想要重复。这方面的经典示例是消息管道的优化。如果一批消息经常到达,那么处理它们比为每条消息承担单独事务的成本更有效。 Spring Batch 提供了一个 AOP 
拦截器,该拦截器仅为此目的将方法调用包装在`RepeatOperations`对象中。`RepeatOperationsInterceptor`执行所截获的方法,并根据所提供的`RepeatTemplate`中的`CompletionPolicy`进行重复。
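
下面是一个以编程方式应用该拦截器的最小示意(非官方示例;`MyService`及其`processMessage()`方法为假设的业务类):

```java
RepeatTemplate repeatTemplate = new RepeatTemplate();
repeatTemplate.setCompletionPolicy(new SimpleCompletionPolicy(3));

RepeatOperationsInterceptor interceptor = new RepeatOperationsInterceptor();
interceptor.setRepeatOperations(repeatTemplate);

// 用 Spring AOP 将业务方法的调用包装进 RepeatOperations 中
ProxyFactory factory = new ProxyFactory(new MyService());
factory.addAdvice(interceptor);
MyService proxy = (MyService) factory.getProxy();

// 这一次调用会重复执行 processMessage(),直到满足 CompletionPolicy(此处最多 3 次)
proxy.processMessage();
```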
它现在是一个新库[Spring Retry](https://github.com/spring-projects/spring-retry)的一部分。| |---|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------| @@ -64,13 +64,13 @@ Foo result = template.execute(new RetryCallback() { 在前面的示例中,我们进行一个 Web 服务调用,并将结果返回给用户。如果该调用失败,则重试该调用,直到达到超时为止。 -#### [](#retryContext)`RetryContext` +#### `RetryContext` `RetryCallback`的方法参数是`RetryContext`。许多回调忽略了上下文,但如果有必要,它可以作为一个属性包来存储迭代期间的数据。 如果同一个线程中有一个正在进行的嵌套重试,则`RetryContext`具有父上下文。父上下文有时用于存储需要在对`execute`的调用之间共享的数据。 -#### [](#recoveryCallback)`RecoveryCallback` +#### `RecoveryCallback` 当重试用完时,`RetryOperations`可以将控制权传递给另一个回调,称为`RecoveryCallback`。要使用此功能,客户机将回调一起传递给相同的方法,如以下示例所示: @@ -88,11 +88,11 @@ Foo foo = template.execute(new RetryCallback() { 如果业务逻辑在模板决定中止之前没有成功,那么客户机将有机会通过恢复回调执行一些替代处理。 -#### [](#statelessRetry)无状态重试 +#### 无状态重试 在最简单的情况下,重试只是一个 while 循环。`RetryTemplate`可以一直尝试,直到成功或失败为止。`RetryContext`包含一些状态来决定是重试还是中止,但是这个状态在堆栈上,不需要在全局的任何地方存储它,所以我们将其称为无状态重试。无状态重试和有状态重试之间的区别包含在`RetryPolicy`的实现中(`RetryTemplate`可以同时处理这两个)。在无状态的重试中,重试回调总是在它失败时所在的线程中执行。 -#### [](#statefulRetry)有状态重试 +#### 有状态重试 在故障导致事务资源无效的情况下,有一些特殊的考虑因素。这不适用于简单的远程调用,因为(通常)没有事务性资源,但有时确实适用于数据库更新,尤其是在使用 Hibernate 时。在这种情况下,只有立即重新抛出调用故障的异常才有意义,这样事务就可以回滚,并且我们可以启动一个新的有效事务。 @@ -109,7 +109,7 @@ Foo foo = template.execute(new RetryCallback() { 是否重试的决定实际上被委托给一个常规的`RetryPolicy`,因此通常对限制和超时的关注可以被注入到那里(在本章后面描述)。 -### [](#retryPolicies)重试策略 +### 重试策略 在`RetryTemplate`中,在`execute`方法中重试或失败的决定由`RetryPolicy`决定,这也是`RetryContext`的工厂。`RetryTemplate`负责使用当前策略创建`RetryContext`,并在每次尝试时将其传递给`RetryCallback`。回调失败后,`RetryTemplate`必须调用`RetryPolicy`,要求它更新其状态(存储在`RetryContext`中),然后询问策略是否可以进行另一次尝试。如果无法进行另一次尝试(例如,当达到限制或检测到超时时时),则策略还负责处理耗尽状态。简单的实现方式会抛出`RetryExhaustedException`,这会导致任何封闭事务被回滚。更复杂的实现可能会尝试采取一些恢复操作,在这种情况下,事务可以保持不变。 @@ -143,7 +143,7 @@ template.execute(new RetryCallback() { 用户可能需要实现他们自己的重试策略,以做出更多定制的决策。例如,当存在已知的、特定于解决方案的异常的可重试和不可重试的分类时,自定义重试策略是有意义的。 -### [](#backoffPolicies)退避政策 +### 退避政策 当在短暂的失败之后重试时,在再次尝试之前等待一下通常会有所帮助,因为通常故障是由某些只能通过等待解决的问题引起的。如果`RetryCallback`失败,`RetryTemplate`可以根据`BackoffPolicy`暂停执行。 @@ -162,7 +162,7 @@ public interface BackoffPolicy { a`BackoffPolicy`可以自由地以它选择的任何方式实现退避。 Spring Batch Out of the Box 提供的策略都使用。一个常见的用例是后退,等待时间呈指数增长,以避免两次重试进入锁定步骤,两次都失败(这是从以太网学到的经验教训)。为此, Spring batch 提供了`ExponentialBackoffPolicy`。 -### [](#retryListeners)听众 +### 听众 通常情况下,能够接收跨多个不同重试中的交叉关注点的额外回调是有用的。为此, Spring Batch 提供了`RetryListener`接口。`RetryTemplate`允许用户注册`RetryListeners`,并且在迭代期间可用的情况下,给出带有`RetryContext`和`Throwable`的回调。 @@ -183,7 +183,7 @@ public interface RetryListener { 请注意,当有多个侦听器时,它们在一个列表中,因此有一个顺序。在这种情况下,以相同的顺序调用`open`,而以相反的顺序调用`onError`和`close`。 -### [](#declarativeRetry)声明式重试 +### 声明式重试 有时,你知道有些业务处理在每次发生时都想要重试。这方面的典型例子是远程服务调用。 Spring Batch 提供了 AOP 拦截器,该拦截器仅为此目的在`RetryOperations`实现中包装方法调用。根据提供的`RepeatTemplate`中的`RetryPolicy`,`RetryOperationsInterceptor`执行截获的方法并在失败时重试。 diff --git a/docs/spring-batch/scalability.md b/docs/spring-batch/scalability.md index 7c5a8a9..45b821d 100644 --- a/docs/spring-batch/scalability.md +++ b/docs/spring-batch/scalability.md @@ -1,6 +1,6 @@ # 缩放和并行处理 -## [](#scalability)缩放和并行处理 +## 缩放和并行处理 XMLJavaBoth @@ -24,7 +24,7 @@ Spring 当你准备好开始用一些并行处理来实现一个作业时, Spr 首先,我们回顾一下单流程选项。然后,我们回顾了多进程的选择。 -### [](#multithreadedStep)多线程步骤 +### 多线程步骤 启动并行处理的最简单方法是在步骤配置中添加`TaskExecutor`。 @@ -93,7 +93,7 @@ public Step sampleStep(TaskExecutor taskExecutor) { Spring 批处理提供了`ItemWriter`和`ItemReader`的一些实现方式。通常,他们会在 Javadoc 中说明它们是否是线程安全的,或者你必须做什么来避免在并发环境中出现问题。如果 Javadoc 
中没有信息,则可以检查实现,以查看是否存在任何状态。如果阅读器不是线程安全的,那么你可以使用提供的`SynchronizedItemStreamReader`来装饰它,或者在你自己的同步委托程序中使用它。你可以将调用同步到`read()`,并且只要处理和写入是块中最昂贵的部分,你的步骤仍然可以比在单线程配置中快得多地完成。 -### [](#scalabilityParallelSteps)平行步骤 +### 平行步骤 只要需要并行化的应用程序逻辑可以划分为不同的职责,并分配给各个步骤,那么就可以在单个流程中进行并行化。并行步骤执行很容易配置和使用。 @@ -163,7 +163,7 @@ public TaskExecutor taskExecutor() { 有关更多详细信息,请参见[拆分流](step.html#split-flows)一节。 -### [](#remoteChunking)远程分块 +### 远程分块 在远程分块中,`Step`处理被分割到多个进程中,通过一些中间件相互通信。下图显示了该模式: @@ -179,7 +179,7 @@ Manager 是 Spring 批处理`Step`的实现,其中`ItemWriter`被一个通用 有关更多详细信息,请参见[Spring Batch Integration - Remote Chunking](spring-batch-integration.html#remote-chunking)一节。 -### [](#partitioning)分区 +### 分区 Spring 批处理还提供了用于分区`Step`执行并远程执行它的 SPI。在这种情况下,远程参与者是`Step`实例,这些实例可以很容易地被配置并用于本地处理。下图显示了该模式: @@ -229,7 +229,7 @@ public Step step1Manager() { Spring 批处理为被称为“Step1:Partition0”的分区创建步骤执行,以此类推。为了保持一致性,许多人更喜欢将 Manager 步骤称为“Step1:Manager”。你可以为步骤使用别名(通过指定`name`属性而不是`id`属性)。 -#### [](#partitionHandler)分区处理程序 +#### 分区处理程序 `PartitionHandler`是了解远程或网格环境结构的组件。它能够将`StepExecution`请求发送到远程`Step`实例,并以某种特定于织物的格式包装,例如 DTO。它不需要知道如何分割输入数据或如何聚合多个`Step`执行的结果。一般来说,它可能也不需要了解弹性或故障转移,因为在许多情况下,这些都是织物的功能。在任何情况下, Spring 批处理总是提供独立于织物的重启性。失败的`Job`总是可以重新启动,并且只重新执行失败的`Steps`。 @@ -278,7 +278,7 @@ public PartitionHandler partitionHandler() { `TaskExecutorPartitionHandler`对于 IO 密集型`Step`实例很有用,例如复制大量文件或将文件系统复制到内容管理系统中。它还可以通过提供`Step`实现来用于远程执行,该实现是远程调用的代理(例如使用 Spring remoting)。 -#### [](#partitioner)分割者 +#### 分割者 `Partitioner`有一个更简单的职责:仅为新的步骤执行生成执行上下文作为输入参数(无需担心重新启动)。它只有一个方法,如下面的接口定义所示: @@ -294,7 +294,7 @@ public interface Partitioner { 可以使用一个名为`PartitionNameProvider`的可选接口来提供与分区本身分开的分区名称。如果`Partitioner`实现了这个接口,那么在重新启动时,只会查询名称。如果分区是昂贵的,这可以是一个有用的优化。由`PartitionNameProvider`提供的名称必须与`Partitioner`提供的名称匹配。 -#### [](#bindingInputDataToSteps)将输入数据绑定到步骤 +#### 将输入数据绑定到步骤 由`PartitionHandler`执行的步骤具有相同的配置,并且它们的输入参数在运行时从`ExecutionContext`绑定,这是非常有效的。 Spring 批处理的 StepScope 特性很容易做到这一点(在[后期绑定](step.html#late-binding)一节中更详细地介绍)。例如,如果`Partitioner`使用一个名为`fileName`的属性键创建`ExecutionContext`实例,并针对每个步骤调用指向不同的文件(或目录),则`Partitioner`输出可能类似于下表的内容: diff --git a/docs/spring-batch/schema-appendix.md b/docs/spring-batch/schema-appendix.md index 6beae67..de01a10 100644 --- a/docs/spring-batch/schema-appendix.md +++ b/docs/spring-batch/schema-appendix.md @@ -1,8 +1,8 @@ # 元数据模式 -## [](#metaDataSchema)附录 A:元数据模式 +## 附录 A:元数据模式 -### [](#metaDataSchemaOverview)概述 +### 概述 Spring 批处理元数据表与在 Java 中表示它们的域对象非常匹配。例如,`JobInstance`,`JobExecution`,`JobParameters`,和`StepExecution`分别映射到`BATCH_JOB_INSTANCE`,`BATCH_JOB_EXECUTION`,`BATCH_JOB_EXECUTION_PARAMS`和`BATCH_STEP_EXECUTION`。`ExecutionContext`映射到`BATCH_JOB_EXECUTION_CONTEXT`和`BATCH_STEP_EXECUTION_CONTEXT`。`JobRepository`负责将每个 Java 对象保存并存储到其正确的表中。本附录详细描述了元数据表,以及在创建元数据表时做出的许多设计决策。在查看下面的各种表创建语句时,重要的是要认识到所使用的数据类型是尽可能通用的。 Spring Batch 提供了许多模式作为示例,所有这些模式都具有不同的数据类型,这是由于各个数据库供应商处理数据类型的方式有所不同。下图显示了所有 6 个表及其相互关系的 ERD 模型: @@ -10,11 +10,11 @@ Spring 批处理元数据表与在 Java 中表示它们的域对象非常匹配 图 1。 Spring 批处理元数据 ERD -#### [](#exampleDDLScripts)示例 DDL 脚本 +#### 示例 DDL 脚本 Spring 批处理核心 JAR 文件包含用于为许多数据库平台创建关系表的示例脚本(反过来,这些平台由作业存储库工厂 Bean 或等效的名称空间自动检测)。这些脚本可以按原样使用,也可以根据需要修改附加的索引和约束。文件名的形式为`schema-*.sql`,其中“\*”是目标数据库平台的简称。脚本在包`org.springframework.batch.core`中。 -#### [](#migrationDDLScripts)迁移 DDL 脚本 +#### 迁移 DDL 脚本 Spring Batch 提供了在升级版本时需要执行的迁移 DDL 脚本。这些脚本可以在`org/springframework/batch/core/migration`下的核心 JAR 文件中找到。迁移脚本被组织到与版本号对应的文件夹中,这些版本号被引入: @@ -22,11 +22,11 @@ Spring Batch 提供了在升级版本时需要执行的迁移 DDL 脚本。这 * `4.1`:如果你从`4.1`之前的版本迁移到`4.1`版本,则包含所需的脚本 -#### [](#metaDataVersion)版本 +#### 版本 
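
下面是一个示意(非官方示例),展示如何用`SynchronizedItemStreamReaderBuilder`包装一个非线程安全的委托读取器:

```java
@Bean
public SynchronizedItemStreamReader<CustomerCredit> synchronizedItemStreamReader(ItemStreamReader<CustomerCredit> delegate) {
    // 将对委托读取器的 read() 调用同步化,使其可以安全地用于多线程步骤
    return new SynchronizedItemStreamReaderBuilder<CustomerCredit>()
            .delegate(delegate)
            .build();
}
```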
本附录中讨论的许多数据库表都包含一个版本列。这一列很重要,因为 Spring 批处理在处理数据库更新时采用了乐观锁定策略。这意味着每次"触摸"(更新)记录时,版本列中的值都会加一。当存储库再次尝试保存该记录时,如果版本号已发生更改,就会抛出`OptimisticLockingFailureException`,表示并发访问中出现了错误。这种检查是必要的:即使不同的批处理作业可能运行在不同的机器上,它们也都使用相同的数据库表。
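
作为示意,下面的片段(非官方实现,`jdbcTemplate`、`status`、`stepExecutionId`和`currentVersion`均为假设的变量)用`JdbcTemplate`大致说明这种基于版本列的检查是如何工作的:

```java
int updated = jdbcTemplate.update(
        "UPDATE BATCH_STEP_EXECUTION SET STATUS = ?, VERSION = VERSION + 1 "
                + "WHERE STEP_EXECUTION_ID = ? AND VERSION = ?",
        status, stepExecutionId, currentVersion);

if (updated == 0) {
    // 版本号已被其他进程修改,说明发生了并发更新
    throw new OptimisticLockingFailureException(
            "检测到对 StepExecution 的并发更新: id=" + stepExecutionId);
}
```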
[](#recommendationsForIndexingMetaDataTables)建立元数据表索引的建议 +### 建立元数据表索引的建议 Spring 批处理为几个常见的数据库平台的核心 JAR 文件中的元数据表提供了 DDL 示例。索引声明不包含在该 DDL 中,因为用户可能希望索引的方式有太多的变化,这取决于他们的精确平台、本地约定以及作业如何操作的业务需求。下面的内容提供了一些指示,说明 Spring Batch 提供的 DAO 实现将在`WHERE`子句中使用哪些列,以及它们可能被使用的频率,以便各个项目可以就索引做出自己的决定: diff --git a/docs/spring-batch/spring-batch-integration.md b/docs/spring-batch/spring-batch-integration.md index 1a7d81e..4d48cb6 100644 --- a/docs/spring-batch/spring-batch-integration.md +++ b/docs/spring-batch/spring-batch-integration.md @@ -1,10 +1,10 @@ # Spring 批处理集成 -## [](#springBatchIntegration) Spring 批处理集成 +## Spring 批处理集成 XMLJavaBoth -### [](#spring-batch-integration-introduction) Spring 批处理集成介绍 +### Spring 批处理集成介绍 Spring 批处理的许多用户可能会遇到不在 Spring 批处理范围内的需求,但这些需求可以通过使用 Spring 集成来高效而简洁地实现。相反, Spring 集成用户可能会遇到 Spring 批处理需求,并且需要一种有效地集成这两个框架的方法。在这种情况下,出现了几种模式和用例, Spring 批处理集成解决了这些需求。 @@ -24,7 +24,7 @@ Spring 批处理和 Spring 集成之间的界限并不总是清晰的,但有 * [外部化批处理过程执行](#externalizing-batch-process-execution) -#### [](#namespace-support)名称空间支持 +#### 名称空间支持 Spring 自批处理集成 1.3 以来,添加了专用的 XML 命名空间支持,目的是提供更简单的配置体验。为了激活命名空间,请将以下命名空间声明添加到 Spring XML 应用程序上下文文件中: @@ -66,7 +66,7 @@ Spring 自批处理集成 1.3 以来,添加了专用的 XML 命名空间支持 也允许将版本号附加到引用的 XSD 文件中,但是,由于无版本声明总是使用最新的模式,因此我们通常不建议将版本号附加到 XSD 名称中。添加版本号可能会在更新 Spring 批处理集成依赖项时产生问题,因为它们可能需要 XML 模式的最新版本。 -#### [](#launching-batch-jobs-through-messages)通过消息启动批处理作业 +#### 通过消息启动批处理作业 当通过使用核心 Spring 批处理 API 启动批处理作业时,你基本上有两个选项: @@ -86,7 +86,7 @@ Spring 批处理集成提供了`JobLaunchingMessageHandler`类,你可以使用 图 1。启动批处理作业 -##### [](#transforming-a-file-into-a-joblaunchrequest)将文件转换为 joblaunchrequest +##### 将文件转换为 joblaunchrequest ``` package io.spring.sbi; @@ -124,13 +124,13 @@ public class FileMessageToJobRequest { } ``` -##### [](#the-jobexecution-response)the`JobExecution`响应 +##### the`JobExecution`响应 当执行批处理作业时,将返回一个`JobExecution`实例。此实例可用于确定执行的状态。如果`JobExecution`能够成功创建,则无论实际执行是否成功,它总是被返回。 如何返回`JobExecution`实例的确切行为取决于所提供的`TaskExecutor`。如果使用`synchronous`(单线程)`TaskExecutor`实现,则只返回`JobExecution`响应`after`作业完成。当使用`asynchronous``TaskExecutor`时,将立即返回`JobExecution`实例。然后,用户可以使用`JobExecution`的`id`实例(带有`JobExecution.getJobId()`),并使用`JobExplorer`查询`JobRepository`中的作业更新状态。有关更多信息,请参阅关于[查询存储库](job.html#queryingRepository)的 Spring 批参考文档。 -##### [](#spring-batch-integration-configuration) Spring 批处理集成配置 +##### Spring 批处理集成配置 考虑这样一种情况:需要创建一个文件`inbound-channel-adapter`来监听所提供的目录中的 CSV 文件,将它们交给转换器(`FileMessageToJobRequest`),通过*工作启动网关*启动作业,然后用`logging-channel-adapter`记录`JobExecution`的输出。 @@ -199,7 +199,7 @@ public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) } ``` -##### [](#example-itemreader-configuration)示例 itemreader 配置 +##### 示例 itemreader 配置 现在我们正在轮询文件和启动作业,我们需要配置我们的 Spring 批处理`ItemReader`(例如),以使用在名为“input.file.name”的作业参数所定义的位置找到的文件,如下面的 Bean 配置所示: @@ -233,7 +233,7 @@ public ItemReader sampleReader(@Value("#{jobParameters[input.file.name]}") Strin 在前面的示例中,主要的关注点是注入`#{jobParameters['input.file.name']}`的值作为资源属性值,并将`ItemReader` Bean 设置为具有*步骤范围*。将 Bean 设置为具有步骤作用域利用了后期绑定支持,这允许访问`jobParameters`变量。 -### [](#availableAttributesOfTheJobLaunchingGateway)作业启动网关的可用属性 +### 作业启动网关的可用属性 作业启动网关具有以下属性,你可以设置这些属性来控制作业: @@ -255,7 +255,7 @@ public ItemReader sampleReader(@Value("#{jobParameters[input.file.name]}") Strin * `order`:指定当此端点作为订阅服务器连接到`SubscribableChannel`时的调用顺序。 -### [](#sub-elements)子元素 +### 子元素 当`Gateway`接收来自`PollableChannel`的消息时,你必须为`Poller`提供一个全局默认值`Poller`,或者为`Job Launching Gateway`提供一个子元素。 @@ -284,7 +284,7 @@ public JobLaunchingGateway sampleJobLaunchingGateway() { } ``` -#### 
[](#providing-feedback-with-informational-messages)提供反馈信息 +#### 提供反馈信息 Spring 由于批处理作业可以运行很长时间,因此提供进度信息通常是至关重要的。例如,如果批处理作业的某些部分或所有部分都失败了,利益相关者可能希望得到通知。 Spring 批处理为正在通过以下方式收集的此信息提供支持: @@ -381,7 +381,7 @@ public Job importPaymentsJob() { } ``` -#### [](#asynchronous-processors)异步处理器 +#### 异步处理器 异步处理器帮助你扩展项目的处理。在异步处理器用例中,`AsyncItemProcessor`充当调度器,为新线程上的项执行`ItemProcessor`的逻辑。项目完成后,将`Future`传递给要写入的`AsynchItemWriter`。 @@ -447,7 +447,7 @@ public AsyncItemWriter writer(ItemWriter itemWriter) { 同样,`delegate`属性实际上是对你的`ItemWriter` Bean 的引用。 -#### [](#externalizing-batch-process-execution)外部化批处理过程执行 +#### 外部化批处理过程执行 到目前为止讨论的集成方法建议使用 Spring 集成像外壳一样包装 Spring 批处理的用例。然而, Spring 批处理也可以在内部使用 Spring 集成。 Spring 使用这种方法,批处理用户可以将项目甚至块的处理委托给外部进程。这允许你卸载复杂的处理。 Spring 批处理集成为以下方面提供了专门的支持: @@ -455,7 +455,7 @@ public AsyncItemWriter writer(ItemWriter itemWriter) { * 远程分区 -##### [](#remote-chunking)远程分块 +##### 远程分块 ![远程分块](./images/remote-chunking-sbi.png) @@ -784,7 +784,7 @@ public class RemoteChunkingJobConfiguration { 你可以找到远程分块作业[here](https://github.com/spring-projects/spring-batch/tree/master/spring-batch-samples#remote-chunking-sample)的完整示例。 -##### [](#remote-partitioning)远程分区 +##### 远程分区 ![远程分区](./images/remote-partitioning.png) diff --git a/docs/spring-batch/spring-batch-intro.md b/docs/spring-batch/spring-batch-intro.md index f3d4eac..1a8aecc 100644 --- a/docs/spring-batch/spring-batch-intro.md +++ b/docs/spring-batch/spring-batch-intro.md @@ -1,6 +1,6 @@ # Spring 批量介绍 -## [](#spring-batch-intro) Spring 批介绍 +## Spring 批介绍 Enterprise 领域中的许多应用程序需要大容量处理,以在关键任务环境中执行业务操作。这些业务包括: @@ -14,7 +14,7 @@ Spring 批处理是一种轻量级的、全面的批处理框架,旨在使开 Spring 批处理提供了可重用的功能,这些功能在处理大量记录中是必不可少的,包括日志记录/跟踪、事务管理、作业处理统计、作业重新启动、跳过和资源管理。它还提供了更先进的技术服务和功能,通过优化和分区技术实现了非常大的批量和高性能的批处理作业。 Spring 批处理既可以用于简单的用例(例如将文件读入数据库或运行存储过程),也可以用于复杂的、大容量的用例(例如在数据库之间移动大容量的数据,对其进行转换,等等)。大批量批处理作业可以以高度可伸缩的方式利用框架来处理大量信息。 -### [](#springBatchBackground)背景 +### 背景 虽然开放源码软件项目和相关社区更多地关注基于 Web 和基于微服务的架构框架,但明显缺乏对可重用架构框架的关注,以满足基于 Java 的批处理需求,尽管仍然需要在 EnterpriseIT 环境中处理此类处理。缺乏标准的、可重用的批处理体系结构导致了在客户 EnterpriseIT 功能中开发的许多一次性内部解决方案的激增。 @@ -24,7 +24,7 @@ SpringSource(现为 Pivotal)和埃森哲合作改变了这种状况。埃森 埃森哲和 SpringSource 之间的合作旨在促进软件处理方法、框架和工具的标准化,这些方法、框架和工具可以由 Enterprise 用户在创建批处理应用程序时始终如一地加以利用。希望为其 EnterpriseIT 环境提供标准的、经过验证的解决方案的公司和政府机构可以从 Spring 批处理中受益。 -### [](#springBatchUsageScenarios)使用场景 +### 使用场景 一个典型的批处理程序通常是: @@ -70,7 +70,7 @@ Spring 批处理自动化了这种基本的批处理迭代,提供了将类似 * 提供一个简单的部署模型,其体系结构 JAR 与应用程序完全分开,使用 Maven 构建。 -### [](#springBatchArchitecture) Spring 批处理体系结构 +### Spring 批处理体系结构 Spring Batch 的设计考虑到了可扩展性和多样化的最终用户群体。下图显示了支持最终用户开发人员的可扩展性和易用性的分层架构。 @@ -80,7 +80,7 @@ Spring Batch 的设计考虑到了可扩展性和多样化的最终用户群体 这个分层架构突出了三个主要的高级组件:应用程序、核心和基础架构。该应用程序包含由开发人员使用 Spring 批处理编写的所有批处理作业和自定义代码。批处理核心包含启动和控制批处理作业所必需的核心运行时类。它包括`JobLauncher`、`Job`和`Step`的实现。应用程序和核心都是建立在一个共同的基础架构之上的。这个基础结构包含常见的读取器、编写器和服务(例如`RetryTemplate`),应用程序开发人员(读取器和编写器,例如`ItemReader`和`ItemWriter`)和核心框架本身(Retry,这是它自己的库)都使用它们。 -### [](#batchArchitectureConsiderations)一般批处理原则和准则 +### 一般批处理原则和准则 在构建批处理解决方案时,应考虑以下关键原则、指南和一般考虑因素。 @@ -114,7 +114,7 @@ Spring Batch 的设计考虑到了可扩展性和多样化的最终用户群体 * 在大批量系统中,备份可能是具有挑战性的,特别是如果系统在 24-7 的基础上与在线并发运行。数据库备份通常在联机设计中得到很好的处理,但是文件备份也应该被认为是同样重要的。如果系统依赖于平面文件,那么文件备份过程不仅应该到位并记录在案,还应该定期进行测试。 -### [](#batchProcessingStrategy)批处理策略 +### 批处理策略 为了帮助设计和实现批处理系统,基本的批处理应用程序构建块和模式应该以示例结构图和代码 shell 的形式提供给设计人员和程序员。在开始设计批处理作业时,应该将业务逻辑分解为一系列步骤,这些步骤可以使用以下标准构建块来实现: diff --git a/docs/spring-batch/step.md b/docs/spring-batch/step.md index d401c3d..39d9ac2 100644 --- a/docs/spring-batch/step.md +++ b/docs/spring-batch/step.md @@ -1,6 +1,6 @@ # 配置一个步骤 -## 
[](#configureStep)配置`Step` +## 配置`Step` XMLJavaBoth @@ -10,7 +10,7 @@ XMLJavaBoth 图 1。步骤 -### [](#chunkOrientedProcessing)面向块的处理 +### 面向块的处理 Spring 批处理在其最常见的实现中使用了一种“面向块”的处理风格。面向块的处理指的是一次读取一个数据,并创建在事务边界内写出的“块”。一旦读取的项数等于提交间隔,`ItemWriter`就会写出整个块,然后提交事务。下图显示了这个过程: @@ -61,7 +61,7 @@ itemWriter.write(processedItems); 有关项处理器及其用例的更多详细信息,请参阅[项目处理](processor.html#itemProcessor)部分。 -#### [](#configuringAStep)配置`Step` +#### 配置`Step` 尽管`Step`所需依赖项的列表相对较短,但它是一个非常复杂的类,可能包含许多协作者。 @@ -133,7 +133,7 @@ public Step sampleStep(PlatformTransactionManager transactionManager) { 需要注意的是,`repository`默认为`jobRepository`,`transactionManager`默认为`transactionManager`(都是通过`@EnableBatchProcessing`中的基础设施提供的)。而且,`ItemProcessor`是可选的,因为该项可以直接从阅读器传递给编写器。 -#### [](#InheritingFromParentStep)从父节点继承`Step` +#### 从父节点继承`Step` 如果一组`Steps`共享类似的配置,那么定义一个“父”`Step`可能是有帮助的,具体的`Steps`可以从中继承属性。与 Java 中的类继承类似,“child”`Step`将其元素和属性与父元素和属性结合在一起。子程序还重写父程序的任何`Steps`。 @@ -159,7 +159,7 @@ public Step sampleStep(PlatformTransactionManager transactionManager) { * 在创建工作流时,如本章后面所述,`next`属性应该指代工作流中的步骤,而不是独立的步骤。 -##### [](#abstractStep)摘要`Step` +##### 摘要`Step` 有时,可能需要定义不是完整的`Step`配置的父`Step`。例如,如果`reader`、`writer`和`tasklet`属性在`Step`配置中被保留,则初始化失败。如果必须在没有这些属性的情况下定义父属性,那么应该使用`abstract`属性。`abstract``Step`只是扩展,不是实例化。 @@ -179,7 +179,7 @@ public Step sampleStep(PlatformTransactionManager transactionManager) { ``` -##### [](#mergingListsOnStep)合并列表 +##### 合并列表 `Steps`上的一些可配置元素是列表,例如``元素。如果父元素和子元素`Steps`都声明一个``元素,那么子元素的列表将覆盖父元素的列表。为了允许子元素向父元素定义的列表中添加额外的侦听器,每个 List 元素都具有`merge`属性。如果元素指定`merge="true"`,那么子元素的列表将与父元素的列表合并,而不是覆盖它。 @@ -202,7 +202,7 @@ public Step sampleStep(PlatformTransactionManager transactionManager) { ``` -#### [](#commitInterval)提交间隔 +#### 提交间隔 如前所述,一个步骤读入并写出项,并使用提供的`PlatformTransactionManager`定期提交。如果`commit-interval`为 1,则在写入每个单独的项后提交。在许多情况下,这是不理想的,因为开始和提交事务是昂贵的。理想情况下,最好是在每个事务中处理尽可能多的项,这完全取决于所处理的数据类型以及与该步骤交互的资源。因此,可以配置在提交中处理的项数。 @@ -244,11 +244,11 @@ public Step step1() { 在前面的示例中,在每个事务中处理 10 个项目。在处理的开始,事务就开始了。此外,每次在`read`上调用`ItemReader`时,计数器都会递增。当它达到 10 时,聚合项的列表被传递给`ItemWriter`,事务被提交。 -#### [](#stepRestart)配置用于重新启动的`Step` +#### 配置用于重新启动的`Step` 在“[配置和运行作业](job.html#configureJob)”小节中,讨论了重新启动`Job`。重启对步骤有很多影响,因此,可能需要一些特定的配置。 -##### [](#startLimit)设置启动限制 +##### 设置启动限制 在许多情况下,你可能希望控制`Step`可以启动的次数。例如,可能需要对特定的`Step`进行配置,使其仅运行一次,因为它会使一些必须手动修复的资源失效,然后才能再次运行。这是在步骤级别上可配置的,因为不同的步骤可能有不同的需求。可以只执行一次的`Step`可以作为同一`Job`的一部分存在,也可以作为可以无限运行的`Step`的一部分存在。 @@ -282,7 +282,7 @@ public Step step1() { 前面示例中所示的步骤只能运行一次。试图再次运行它将导致抛出`StartLimitExceededException`。请注意,start-limit 的默认值是`Integer.MAX_VALUE`。 -##### [](#allowStartIfComplete)重新启动已完成的`Step` +##### 重新启动已完成的`Step` 在可重启作业的情况下,无论第一次是否成功,都可能有一个或多个应该始终运行的步骤。例如,验证步骤或`Step`在处理前清理资源。在对重新启动的作业进行正常处理期间,跳过状态为“已完成”的任何步骤,这意味着该步骤已成功完成。将`allow-start-if-complete`设置为“true”会重写此项,以便该步骤始终运行。 @@ -314,7 +314,7 @@ public Step step1() { } ``` -##### [](#stepRestartExample)`Step`重新启动配置示例 +##### `Step`重新启动配置示例 下面的 XML 示例展示了如何将作业配置为具有可以重新启动的步骤: @@ -418,7 +418,7 @@ public Step playerSummarization() { 3. 
由于这是`playerSummarization`的第三次执行,而其启动限制仅为 2,因此该步骤不会启动,作业立即终止。要么必须提高该限制,要么必须将该`Job`作为新的`JobInstance`来执行。
就像`ItemReadListener`一样,可以“监听”项目的处理,如以下接口定义所示: @@ -858,7 +858,7 @@ public interface ItemProcessListener extends StepListener { * `@OnProcessError` -##### [](#itemWriteListener)`ItemWriteListener` +##### `ItemWriteListener` 可以使用`ItemWriteListener`“监听”项目的写入,如以下接口定义所示: @@ -882,7 +882,7 @@ public interface ItemWriteListener extends StepListener { * `@OnWriteError` -##### [](#skipListener)`SkipListener` +##### `SkipListener` `ItemReadListener`、`ItemProcessListener`和`ItemWriteListener`都提供了通知错误的机制,但没有一个通知你记录实际上已被跳过。例如,`onWriteError`即使一个项目被重试并成功,也会被调用。出于这个原因,有一个单独的接口用于跟踪跳过的项目,如以下接口定义所示: @@ -906,7 +906,7 @@ public interface SkipListener extends StepListener { * `@OnSkipInProcess` -###### [](#skipListenersAndTransactions)跳过侦听器和事务 +###### 跳过侦听器和事务 `SkipListener`最常见的用例之一是注销一个跳过的项,这样就可以使用另一个批处理过程甚至人工过程来评估和修复导致跳过的问题。因为在许多情况下原始事务可能会被回滚, Spring Batch 提供了两个保证: @@ -914,7 +914,7 @@ public interface SkipListener extends StepListener { 2. 总是在事务提交之前调用`SkipListener`。这是为了确保侦听器调用的任何事务资源不会因`ItemWriter`中的故障而回滚。 -### [](#taskletStep)`TaskletStep` +### `TaskletStep` [面向块的处理](#chunkOrientedProcessing)并不是在`Step`中进行处理的唯一方法。如果`Step`必须包含一个简单的存储过程调用怎么办?你可以将调用实现为`ItemReader`,并在过程完成后返回 null。然而,这样做有点不自然,因为需要有一个 no-op`ItemWriter`。 Spring Batch 为此场景提供了`TaskletStep`。 @@ -942,7 +942,7 @@ public Step step1() { | |`TaskletStep`如果实现`StepListener`接口,则自动将
任务集注册为`StepListener`。| |---|-----------------------------------------------------------------------------------------------------------------------| -#### [](#taskletAdapter)`TaskletAdapter` +#### `TaskletAdapter` 与`ItemReader`和`ItemWriter`接口的其他适配器一样,`Tasklet`接口包含一个允许自适应到任何预先存在的类的实现:`TaskletAdapter`。这可能有用的一个例子是现有的 DAO,该 DAO 用于更新一组记录上的标志。`TaskletAdapter`可以用来调用这个类,而不必为`Tasklet`接口编写适配器。 @@ -975,7 +975,7 @@ public MethodInvokingTaskletAdapter myTasklet() { } ``` -#### [](#exampleTaskletImplementation)示例`Tasklet`实现 +#### 示例`Tasklet`实现 许多批处理作业包含一些步骤,这些步骤必须在主处理开始之前完成,以便设置各种资源,或者在处理完成之后清理这些资源。如果作业中的文件很多,那么在成功地将某些文件上传到另一个位置后,通常需要在本地删除这些文件。下面的示例(取自[Spring Batch samples project](https://github.com/spring-projects/spring-batch/tree/master/spring-batch-samples))是一个带有这样的职责的`Tasklet`实现: @@ -1063,11 +1063,11 @@ public FileDeletingTasklet fileDeletingTasklet() { } ``` -### [](#controllingStepFlow)控制阶跃流 +### 控制阶跃流 在拥有一份工作的过程中,有了将步骤组合在一起的能力,就需要能够控制工作如何从一个步骤“流动”到另一个步骤。a`Step`失败并不一定意味着`Job`应该失败。此外,可能有不止一种类型的“成功”来决定下一步应该执行哪个`Step`。根据`Steps`组的配置方式,某些步骤甚至可能根本不会被处理。 -#### [](#SequentialFlow)序贯流 +#### 序贯流 最简单的流程场景是所有步骤都按顺序执行的作业,如下图所示: @@ -1109,7 +1109,7 @@ public Job job() { | |对于 Spring 批处理 XML 命名空间,配置中列出的第一步是 *always*`Job`运行的第一步。其他步骤元素的顺序并不是
重要的,但是第一步必须始终首先出现在 XML 中。| |---|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -#### [](#conditionalFlow)条件流 +#### 条件流 在上面的例子中,只有两种可能性: @@ -1170,7 +1170,7 @@ public Job job() { 虽然对`Step`上的转换元素的数量没有限制,但是如果`Step`执行导致元素不覆盖的`ExitStatus`,那么框架将抛出一个异常,而`Job`将失败。该框架自动命令从最特定到最不特定的转换。这意味着,即使在上面的示例中将顺序交换为“stepa”,`ExitStatus`的“failed”仍将转到“stepc”。 -##### [](#batchStatusVsExitStatus)批处理状态与退出状态 +##### 批处理状态与退出状态 在为条件流配置`Job`时,重要的是要理解`BatchStatus`和`ExitStatus`之间的区别。`BatchStatus`是一个枚举,它是`JobExecution`和`StepExecution`的属性,框架使用它来记录`Job`或`Step`的状态。它可以是以下值之一:`COMPLETED`,`STARTING`,`STARTED`,`STOPPING`,`STOPPED`,`FAILED`,`ABANDONED`,或`UNKNOWN`。其中大多数是不言自明的:`COMPLETED`是当一个步骤或作业成功完成时设置的状态,`FAILED`是当它失败时设置的状态,依此类推。 @@ -1252,7 +1252,7 @@ public class SkipCheckingListener extends StepExecutionListenerSupport { 上面的代码是`StepExecutionListener`,该代码首先检查以确保`Step`成功,然后检查`StepExecution`上的跳过计数是否高于 0. 如果这两个条件都满足,则返回一个新的`ExitStatus`,其退出代码为`COMPLETED WITH SKIPS`。 -#### [](#configuringForStop)配置停止 +#### 配置停止 在讨论了[batchstatus 和 exitstatus](#batchStatusVsExitStatus)之后,人们可能想知道如何确定`BatchStatus`和`ExitStatus`的`Job`。虽然这些状态是由执行的代码为`Step`确定的,但`Job`的状态是基于配置确定的。 @@ -1283,7 +1283,7 @@ public Job job() { 虽然这种终止批处理作业的方法对于某些批处理作业(例如简单的连续步骤作业)来说已经足够了,但可能需要自定义的作业停止场景。为此, Spring Batch 提供了三个转换元素来停止`Job`(除了我们前面讨论的[`next`元素](#NextElement))。这些停止元素中的每一个都以特定的`BatchStatus`停止`Job`。重要的是要注意,停止转换元件对`BatchStatus`中的任何`Steps`的`ExitStatus`或`ExitStatus`都没有影响。这些元素只影响`Job`的最终状态。例如,对于作业中的每一步,都可能具有`FAILED`的状态,但是对于作业,则可能具有`COMPLETED`的状态。 -##### [](#endElement)以一步结尾 +##### 以一步结尾 配置步骤结束指示`Job`以`BatchStatus`的`COMPLETED`停止。已经完成了 status`COMPLETED`的`Job`不能重新启动(框架抛出一个`JobInstanceAlreadyCompleteException`)。 @@ -1321,7 +1321,7 @@ public Job job() { } ``` -##### [](#failElement)失败的步骤 +##### 失败的步骤 配置在给定点失败的步骤指示`Job`以`BatchStatus`的`FAILED`停止。与 END 不同的是,`Job`的失败并不会阻止`Job`被重新启动。 @@ -1360,7 +1360,7 @@ public Job job() { } ``` -##### [](#stopElement)在给定的步骤停止作业 +##### 在给定的步骤停止作业 将作业配置为在特定的步骤停止,将指示`Job`使用`BatchStatus`的`STOPPED`停止作业。停止`Job`可以在处理中提供临时中断,以便操作员可以在重新启动`Job`之前采取一些操作。 @@ -1392,7 +1392,7 @@ public Job job() { } ``` -#### [](#programmaticFlowDecisions)程序化流程决策 +#### 程序化流程决策 在某些情况下,可能需要比`ExitStatus`更多的信息来决定下一步执行哪个步骤。在这种情况下,可以使用`JobExecutionDecider`来辅助决策,如以下示例所示: @@ -1447,7 +1447,7 @@ public Job job() { } ``` -#### [](#split-flows)拆分流 +#### 拆分流 到目前为止描述的每个场景都涉及一个`Job`,它以线性方式一次执行一个步骤。 Spring 除了这种典型的样式之外,批处理还允许使用并行的流来配置作业。 @@ -1496,7 +1496,7 @@ public Job job(Flow flow1, Flow flow2) { } ``` -#### [](#external-flows)外部化作业之间的流定义和依赖关系 +#### 外部化作业之间的流定义和依赖关系 作业中的部分流可以作为单独的 Bean 定义外部化,然后重新使用。有两种方法可以做到这一点。第一种方法是简单地将流声明为对别处定义的流的引用。 @@ -1602,7 +1602,7 @@ public DefaultJobParametersExtractor jobParametersExtractor() { 作业参数提取器是一种策略,它确定如何将`Step`的`ExecutionContext`转换为正在运行的`JobParameters`的`JobParameters`。当你希望有一些更细粒度的选项来监视和报告作业和步骤时,`JobStep`非常有用。使用`JobStep`通常也是对这个问题的一个很好的回答:“我如何在工作之间创建依赖关系?”这是一种很好的方法,可以将一个大型系统分解成更小的模块,并控制工作流程。 -### [](#late-binding)`Job`和`Step`属性的后期绑定 +### `Job`和`Step`属性的后期绑定 前面显示的 XML 和平面文件示例都使用 Spring `Resource`抽象来获取文件。这是因为`Resource`有一个`getFile`方法,它返回一个`java.io.File`。XML 和平面文件资源都可以使用标准的 Spring 构造进行配置: @@ -1748,7 +1748,7 @@ public FlatFileItemReader flatFileItemReader(@Value("#{stepExecutionContext['inp | |如果你正在使用 Spring 3.0(或更高版本),则步骤作用域 bean 中的表达式使用
Spring 表达式语言(SpEL)。SpEL 是一种功能强大的通用表达式语言,具有许多有趣的特性。为了保持向后兼容,如果 Spring Batch 检测到所用的是旧版本的 Spring,就会改用一种功能较弱、解析规则也略有不同的原生表达式语言。主要区别在于:上面示例中的 Map 键在 Spring 2.5 中不需要加引号,而在 Spring 3.0 中则必须加引号。|
|---|-----------------------------------------------------------------------------------------------------------------------|

-#### [](#step-scope)步骤作用域
+#### 步骤作用域

前面展示的所有延迟绑定示例,都在 Bean 定义上声明了“step”作用域。

@@ -1796,7 +1796,7 @@ public FlatFileItemReader flatFileItemReader(@Value("#{jobParameters[input.file.
```

-#### [](#job-scope)作业作用域
+#### 作业作用域

Spring Batch 3.0 中引入的`Job`作用域,在配置方式上与`Step`作用域类似,但它作用于`Job`上下文,因此每个运行中的作业只有一个这样的 Bean 实例。此外,还支持使用`#{..}`占位符对来自`JobContext`的引用进行后期绑定。借助这一特性,Bean 可以从作业或作业执行上下文以及作业参数中获取属性。

diff --git a/docs/spring-batch/testing.md b/docs/spring-batch/testing.md
index 9c36c42..f575d69 100644
--- a/docs/spring-batch/testing.md
+++ b/docs/spring-batch/testing.md
@@ -1,12 +1,12 @@
# 单元测试

-## [](#testing)单元测试
+## 单元测试

XMLJavaBoth

与其他风格的应用程序一样,对作为批处理作业一部分编写的任何代码进行单元测试都非常重要。Spring 核心文档已经非常详细地介绍了如何使用 Spring 进行单元测试与集成测试,这里不再赘述。然而,思考如何“端到端”地测试批处理作业同样重要,这正是本章要讨论的内容。spring-batch-test 项目提供了支持这种端到端测试方式的类。

-### [](#creatingUnitTestClass)创建单元测试类
+### 创建单元测试类

为了让单元测试能够运行批处理作业,框架必须加载该作业的应用上下文。可以使用两个注解来触发此行为:

@@ -42,7 +42,7 @@ public class SkipSampleFunctionalTests { ... }
public class SkipSampleFunctionalTests { ... }
```

-### [](#endToEndTesting)批处理作业的端到端测试
+### 批处理作业的端到端测试

“端到端”测试可以定义为:从头到尾完整地测试批处理作业的一次运行。这样的测试可以先设置测试条件,再执行作业,最后验证最终结果。

@@ -119,7 +119,7 @@ public class SkipSampleFunctionalTests {
}
```

-### [](#testingIndividualSteps)测试单个步骤
+### 测试单个步骤

对于复杂的批处理作业,端到端方式的测试用例可能变得难以管理。在这种情况下,改用只测试单个步骤的测试用例可能更有价值。`AbstractJobTests`类包含一个名为`launchStep`的方法,它接受一个步骤名称,并只运行这个特定的`Step`。这种方式允许更有针对性的测试:只为该步骤准备数据,并直接验证其结果。下面的示例展示了如何使用`launchStep`方法按名称启动`Step`:

```
JobExecution jobExecution = jobLauncherTestUtils.launchStep("loadFileStep");
```

-### [](#testing-step-scoped-components)测试步骤作用域的组件
+### 测试步骤作用域的组件

通常,在运行时为步骤配置的组件会使用步骤作用域和后期绑定,从步骤或作业执行中注入上下文。除非有办法把上下文设置得就像正处于一次步骤执行之中,否则将这些组件作为独立单元来测试会很棘手。这正是 Spring Batch 中两个组件的目标:`StepScopeTestExecutionListener`和`StepScopeTestUtils`。

@@ -207,7 +207,7 @@ int count = StepScopeTestUtils.doInStepScope(stepExecution,
});
```

-### [](#validatingOutputFiles)验证输出文件
+### 验证输出文件

当批处理作业写入数据库时,很容易通过查询数据库来验证输出是否符合预期。然而,如果批处理作业写入的是文件,验证输出同样重要。Spring Batch 提供了一个名为`AssertFile`的类,以便于对输出文件进行验证。其`assertFileEquals`方法接受两个`File`对象(或两个`Resource`对象),并逐行断言这两个文件的内容相同。因此,可以先创建一个包含预期输出的文件,再将其与实际结果进行比较,如下例所示:

```
AssertFile.assertFileEquals(new FileSystemResource(EXPECTED_FILE),
new FileSystemResource(OUTPUT_FILE));
```

-### [](#mockingDomainObjects)模拟域对象
+### 模拟域对象

在为 Spring Batch 组件编写单元测试和集成测试时,另一个常见问题是如何模拟域对象。`StepExecutionListener`就是一个很好的例子,如以下代码片段所示:

diff --git a/docs/spring-batch/transaction-appendix.md b/docs/spring-batch/transaction-appendix.md
index 8b76ead..6153585 100644
--- a/docs/spring-batch/transaction-appendix.md
+++ b/docs/spring-batch/transaction-appendix.md
@@ -1,8 +1,8 @@
# 批处理和事务

-## [](#transactions)附录 A:批处理和事务
+## 附录 A:批处理和事务

-### [](#transactionsNoRetry)不需要重试的简单批处理
+### 不需要重试的简单批处理

考虑下面这个不需要重试的简单嵌套批处理示例。它展示了批处理的一个常见场景:对一个输入源进行处理直至耗尽,并在每个处理“块”结束时周期性地提交。

@@ -23,7 +23,7 @@

如果`REPEAT`(3)处的块由于 3.2 处的数据库异常而失败,那么`TX`(2)必须回滚整个块。

-### [](#transactionStatelessRetry)简单无状态重试
+### 简单无状态重试
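在进入示意流程之前,先用一个基于 Spring Retry 的`RetryTemplate`的极简草图勾勒这种模式(Spring Batch 的重试机制即构建在 Spring Retry 之上;其中的`RemoteService`接口与重试次数仅为演示假设):

```
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;

public class StatelessRetryExample {

    // 仅为演示假设的远程服务接口
    public interface RemoteService {
        String call();
    }

    public String callWithRetry(RemoteService remoteService) {
        RetryTemplate template = new RetryTemplate();
        // 最多尝试 3 次;不保存任何跨事务的状态,因此是“无状态”重试
        template.setRetryPolicy(new SimpleRetryPolicy(3));
        // 调用失败时在当前线程上直接重试;若最终仍然失败,
        // 异常会向外抛出,由外层事务决定回滚
        return template.execute(context -> remoteService.call());
    }
}
```

把这样的调用放进一个外层事务中,就得到了下面描述的嵌套结构:整个重试都发生在事务内部,只要重试最终成功,事务就可以提交。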
对于非事务性的操作,例如对 Web 服务或其他远程资源的调用,使用重试也很有用,如下面的示例所示:

@@ -39,7 +39,7 @@

这实际上是重试最有用的应用场景之一,因为与数据库更新相比,远程调用更有可能失败,也更适合重试。只要远程访问(2.1)最终成功,事务`TX`(0)就会提交;如果远程访问(2.1)最终失败,事务`TX`(0)则保证回滚。

-### [](#repeatRetry)典型的重复重试模式
+### 典型的重复重试模式

最典型的批处理模式是在块的内层块上添加重试,如以下示例所示:

@@ -79,7 +79,7 @@

但是请注意,如果`TX`(2)失败而我们*确实*进行了重试,那么根据外层的完成策略,内层`REPEAT`(3)接下来处理的项并不保证就是刚刚失败的那一项。它可能是,但这取决于输入(4.1)的实现。因此,输出(5.1)可能在新项或旧项上再次失败。批处理的客户端不应假定每次`RETRY`(4)尝试处理的项都与上次失败时的项相同。例如,如果`REPEAT`(1)的终止策略是在 10 次尝试后失败,那么它会在连续失败 10 次后失败,但这些失败未必发生在同一项上。这与整体的重试策略是一致的:内层`RETRY`(4)知晓每个项的处理历史,并可以决定是否对某个项再试一次。

-### [](#asyncChunkProcessing)异步块处理
+### 异步块处理

通过将外层批处理配置为使用`AsyncTaskExecutor`,可以并发地执行[典型例子](#repeatRetry)中的内层批处理(块)。外层批处理会在完成之前等待所有块处理完毕。下面的示例展示了异步块处理:

@@ -103,7 +103,7 @@
|   }
```

-### [](#asyncItemProcessing)异步项处理
+### 异步项处理

在[典型例子](#repeatRetry)中,块内的单个项原则上也可以并发处理。在这种情况下,事务边界必须下移到单个项的级别,使每个事务都保持在单个线程上,如以下示例所示:

@@ -129,7 +129,7 @@

这一方案牺牲了简单方案所具有的优化收益,即把所有事务性资源集中批量处理。只有当处理(5)的成本远高于事务管理(3)的成本时,它才是划算的。

-### [](#transactionPropagation)批处理和事务传播之间的交互
+### 批处理和事务传播之间的交互

批处理重试与事务管理之间的耦合,比我们理想中期望的更紧密。特别是,无状态重试不能与不支持嵌套传播的事务管理器一起用于重试数据库操作。

@@ -179,7 +179,7 @@

因此,如果重试块中包含任何数据库访问,`NESTED`模式是最佳选择。

-### [](#specialTransactionOrthogonal)特殊情况:使用正交资源的事务
+### 特殊情况:使用正交资源的事务

对于没有嵌套数据库事务的简单情况,默认传播方式总是可行的。考虑以下示例,其中`SESSION`和`TX`不是全局`XA`资源,因此它们的资源是正交的:

@@ -196,7 +196,7 @@

这里的事务性消息`SESSION`(0)并不参与`PlatformTransactionManager`管理的其他事务,因此当`TX`(3)开始时它不会传播进去。在`RETRY`(2)块之外没有数据库访问。如果`TX`(3)失败并在重试后最终成功,`SESSION`(0)可以提交(与`TX`块相互独立)。这类似于普通的“尽力而为的一阶段提交”场景。最坏的情况是`RETRY`(2)成功而`SESSION`(0)无法提交(例如,因为消息系统不可用),此时可能产生重复消息。

-### [](#statelessRetryCannotRecover)无状态重试无法恢复
+### 无状态重试无法恢复

在上面的典型示例中,无状态重试与有状态重试之间的区别非常重要。实际上,正是事务性约束强制了这种区别,而这一约束也让这种区别存在的原因变得显而易见。

diff --git a/docs/spring-batch/whatsnew.md b/docs/spring-batch/whatsnew.md
index 9885616..50c5cc7 100644
--- a/docs/spring-batch/whatsnew.md
+++ b/docs/spring-batch/whatsnew.md
@@ -1,16 +1,16 @@
# Spring Batch 4.3 中的最新更新

-## [](#whatsNew)Spring Batch 4.3 中的最新更新
+## Spring Batch 4.3 中的最新更新

这个版本带来了许多新特性、性能改进、依赖更新和 API 修改。本节描述其中最重要的变化;有关更改的完整列表,请参阅[发行说明](https://github.com/spring-projects/spring-batch/releases/tag/4.3.0)。

-### [](#newFeatures)新功能
+### 新功能

-#### [](#new-synchronized-itemstreamwriter)新的同步 ItemStreamWriter
+#### 新的同步 ItemStreamWriter

与`SynchronizedItemStreamReader`类似,该版本引入了`SynchronizedItemStreamWriter`。这个特性在多线程步骤中很有用:当并发线程需要同步时,可以避免彼此覆盖对方的写操作。

-#### [](#new-jpaqueryprovider-for-named-queries)用于命名查询的新 JpaQueryProvider
+#### 用于命名查询的新 JpaQueryProvider

这个版本在`JpaNativeQueryProvider`之外引入了一个新的`JpaNamedQueryProvider`,以便在使用`JpaPagingItemReader`时简化 JPA 命名查询的配置:

@@ -22,19 +22,19 @@

```
JpaPagingItemReader reader = new JpaPagingItemReaderBuilder()
.build();
```

-#### [](#new-jpacursoritemreader-implementation)新的 JpaCursorItemReader 实现
+#### 新的 JpaCursorItemReader 实现

JPA 2.2 增加了以游标方式(而不仅仅是分页)获取结果的能力。该版本引入了一个新的 JPA 项读取器,它利用此特性,以类似于`JdbcCursorItemReader`和`HibernateCursorItemReader`的基于游标的方式流式读取结果。

-#### [](#new-jobparametersincrementer-implementation)新的 JobParametersIncrementer 实现
+#### 新的 JobParametersIncrementer 实现

与`RunIdIncrementer`类似,这个版本添加了一个新的`JobParametersIncrementer`,它基于 Spring 框架中的`DataFieldMaxValueIncrementer`。

-#### [](#graalvm-support)GraalVM 支持
+#### GraalVM 支持

这个版本增加了在 GraalVM 上运行 Spring Batch 应用程序的初步支持。该支持仍处于实验阶段,将在未来的版本中继续改进。

-#### [](#java-records-support)Java 记录支持
+#### Java 记录支持

这个版本增加了在面向块的步骤中使用 Java 记录(record)作为项的支持。新增的`RecordFieldSetMapper`支持把平面文件中的数据映射为 Java 记录,如以下示例所示:

@@ -59,23 +59,23 @@ public record Person(int id, String name) { }

上例中,`FlatFileItemReader`使用新的`RecordFieldSetMapper`,将`persons.csv`文件中的数据映射为`Person`类型的记录。
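作为补充,下面给出一段示意性的读取器配置,展示`RecordFieldSetMapper`通常如何与`FlatFileItemReaderBuilder`搭配使用。其中读取器名称`personReader`与字段名仅为演示假设,并非上文被省略的原始示例:

```
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.RecordFieldSetMapper;
import org.springframework.core.io.FileSystemResource;

public class PersonReaderConfiguration {

    public FlatFileItemReader<Person> personReader() {
        return new FlatFileItemReaderBuilder<Person>()
                .name("personReader")
                .resource(new FileSystemResource("persons.csv"))
                .delimited()
                .names(new String[] {"id", "name"})
                // 将每一行的字段按声明顺序映射为 Person 记录的构造参数
                .fieldSetMapper(new RecordFieldSetMapper<>(Person.class))
                .build();
    }
}
```

这样,读取到的每一行都会直接实例化为不可变的`Person`记录,无需再编写手工的字段映射代码。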
-### [](#performanceImprovements)性能改进
+### 性能改进

-#### [](#use-bulk-writes-in-repositoryitemwriter)在 RepositoryItemWriter 中使用批量写操作
+#### 在 RepositoryItemWriter 中使用批量写操作

在 4.2 及更早的版本中,要在`RepositoryItemWriter`中使用`CrudRepository#saveAll`,需要扩展该 Writer 并覆盖`write(List)`方法。

在此版本中,`RepositoryItemWriter`已更新为默认使用`CrudRepository#saveAll`。

-#### [](#use-bulk-writes-in-mongoitemwriter)在 MongoItemWriter 中使用批量写操作
+#### 在 MongoItemWriter 中使用批量写操作

`MongoItemWriter`以前在 for 循环中使用`MongoOperations#save()`将项逐个保存到数据库。在此版本中,该 Writer 已更新为使用`org.springframework.data.mongodb.core.BulkOperations`进行批量写入。

-#### [](#job-startrestart-time-improvement)作业启动/重启时间改进
+#### 作业启动/重启时间改进

`JobRepository#getStepExecutionCount()`以前的实现会把所有作业执行和步骤执行加载到内存中,在框架侧完成计数。在这个版本中,实现改为通过一条 SQL COUNT 查询对数据库发起单次调用,直接统计步骤执行的数量。

-### [](#dependencyUpdates)依赖项更新
+### 依赖项更新

此版本将依赖的 Spring 项目更新为以下版本:

@@ -91,9 +91,9 @@ public record Person(int id, String name) { }

* Micrometer 1.5

-### [](#deprecation)弃用
+### 弃用

-#### [](#apiDeprecation)API 弃用
+#### API 弃用

以下是在此版本中已被弃用的 API 列表:

@@ -123,6 +123,6 @@ public record Person(int id, String name) { }

建议的替代方案可以在各个已弃用 API 的 Javadoc 中找到。

-#### [](#sqlfireDeprecation)SQLFire 支持弃用
+#### SQLFire 支持弃用

自 2014 年 11 月 1 日起,SQLFire 已处于[生命周期终止(EOL)](https://www.vmware.com/latam/products/pivotal-sqlfire.html)状态。这个版本弃用了将 SQLFire 用作作业存储库的支持,并计划在 5.0 版本中将其移除。
\ No newline at end of file
-- GitLab