Skip to content

Add (JOB_INSTANCE_ID) unique constraint to JobExecution table #1475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
spring-projects-issues opened this issue Oct 10, 2013 · 4 comments
Closed
Labels
related-to: ddl-scripts status: declined Features that we don't intend to implement or Bug reports that are invalid or missing enough details

Comments

@spring-projects-issues
Copy link
Collaborator

Erwin Vervaet opened BATCH-2120 and commented

As originally hinted at by BATCH-1852, very quick jobs are poorly identified.

JdbcJobExecutionDao.getLastJobExecution() assumes that the combination of the JOB_INSTANCE_ID and CREATE_TIME uniquely identifies a JobExecution instance:

SELECT JOB_EXECUTION_ID, START_TIME, END_TIME, STATUS, EXIT_CODE, EXIT_MESSAGE, CREATE_TIME, LAST_UPDATED, VERSION
from %PREFIX%JOB_EXECUTION
where JOB_INSTANCE_ID = ? and CREATE_TIME = (SELECT max(CREATE_TIME) from %PREFIX%JOB_EXECUTION where JOB_INSTANCE_ID = ?)

However, this is not enforced by a constraint in the database. This can give problems in certain situations, i.e. with fast running jobs on MySQL which does not have a millisecond data type (see http://forum.spring.io/forum/spring-projects/batch/78846-conflict-when-starting-3rd-times-a-job).

I would suggest to add such a constraint to the SQL scripts. For MySQL this could be:

ALTER TABLE BATCH_JOB_EXECUTION ADD UNIQUE job_instance_id_create_time_unique (JOB_INSTANCE_ID, CREATE_TIME);

No further details from BATCH-2120

@spring-projects-issues
Copy link
Collaborator Author

Michael Minella commented

This isn't quite that simple. The issue really stems from the fact that many databases don't support millisecond precision (MySql being one of those). Adding this constraint would prevent you from launching them any faster than once per second which is would probably not be good enough in some environments.

@spring-projects-issues
Copy link
Collaborator Author

Erwin Vervaet commented

I agree that this doesn't really solve the problem. However, it prevents bad data (i.e. violating Spring Batch's assumption that the combination of job_instance_id and create_time uniquely identifies a JobExecution) from ending up in the database.

Is there a clean way of recovering from a situation like this? As far as I know the only way out is directly manipulating the data in the database, something that I would like to avoid.

In my opinion having the limitation that you can only launch one JobExecution for a particuler JobInstance per minimum time fragment supported by your database, is better than allowing normal framework usage from corrupting the frameworks internal data, forcing manual intervention.

@fmbenhassine
Copy link
Contributor

fmbenhassine commented May 4, 2022

JdbcJobExecutionDao.getLastJobExecution() assumes that the combination of the JOB_INSTANCE_ID and CREATE_TIME uniquely identifies a JobExecution instance:

CREATE_TIME is not used anymore in the query since 00aab3e, so the question about the frequency of job launching is not relevant anymore here.

That said, I think the unique constraint should be added for JOB_INSTANCE_ID, to enforce data consistency at the job repository level.

Related issues: #3966, #3788.

@fmbenhassine fmbenhassine changed the title Add (JOB_INSTANCE_ID, CREATE_TIME) unique constraint to JobExecution table [BATCH-2120] Add (JOB_INSTANCE_ID) unique constraint to JobExecution table May 4, 2022
@fmbenhassine fmbenhassine modified the milestones: 5.0.0, 5.0.0-M6 Aug 31, 2022
@fmbenhassine fmbenhassine modified the milestones: 5.0.0-M6, 5.0.0-M7 Sep 21, 2022
@fmbenhassine fmbenhassine modified the milestones: 5.0.0-M7, 5.0.0-M8 Oct 4, 2022
@fmbenhassine fmbenhassine modified the milestones: 5.0.0-M8, 5.0.0-RC1 Oct 12, 2022
@fmbenhassine fmbenhassine modified the milestones: 5.0.0-RC1, 5.0.0-RC2 Oct 20, 2022
@fmbenhassine fmbenhassine modified the milestones: 5.0.0-RC2, 5.0.0 Nov 8, 2022
@fmbenhassine
Copy link
Contributor

That said, I think the unique constraint should be added for JOB_INSTANCE_ID, to enforce data consistency at the job repository level.

I seem to have overlooked the failure scenario in my previous comment. Adding a unique constraint on JOB_INSTANCE_ID is not correct: the BATCH_JOB_EXECUTION table can contain multiple records (ie job executions) for the same job instance ID (which is by design as a job instance might have multiple executions, for the failure/restart case for instance). Adding a unique constraint on JOB_INSTANCE_ID would prevent that, which is not the goal in the first place.

@fmbenhassine fmbenhassine closed this as not planned Won't fix, can't repro, duplicate, stale Nov 17, 2022
@fmbenhassine fmbenhassine removed this from the 5.0.0 milestone Nov 17, 2022
@fmbenhassine fmbenhassine added status: declined Features that we don't intend to implement or Bug reports that are invalid or missing enough details and removed in: core type: enhancement labels Nov 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
related-to: ddl-scripts status: declined Features that we don't intend to implement or Bug reports that are invalid or missing enough details
Projects
None yet
Development

No branches or pull requests

2 participants