-
Notifications
You must be signed in to change notification settings - Fork 1.5k
merge/persist distinction in SimpleJpaRepository not always correct #2546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
For reference purposes, a later comment by @odrotbohm on #1594 shows that the optimization applied then reverted was inadequate, and introduced a different scenario that could break things. I believe we need to further capture these various scenarios into a batch of test cases. Otherwise, we're talking in circles. This is the only way to guard against the scenario presented above AND the scenario in this related comment. |
I don't think we need to throw out the baby with the bathwater. The arrangement outlined is in clear violation of the notion of an aggregate. In particular, a child entity belonging to two aggregates. This fundamentally subverts the ability of the original aggregate to control its state, as the change to another aggregate would change the state of the original one behind its back. That's where the subtle mismatches in JPA-specific entity state come from. If the current implementation of |
After further evaluation, this ticket doesn't warrant changes to Spring Data JPA. |
Of course it is not. A relational database model is not necessarily a tree, but can be an arbitrary graph where on entity type is both referencis multiple other types and is references by multiple types. Perhaps even my example was too complicated. The same problem as described here will also occur in an N:M relationship if an existing and a new entity on the 'N' side of the relationship are referencing a new entity on the 'M' side. I also tried to explain why the merge method is incorrectly used in the repository implementation. You are using the merge method obviously in the belief that it is supposed to be used for updating an entity, but that is not what the method is there for. The method is there to reattach detached entites. Why are you ignoring this obvious fact? |
The current implementation of
SimpleJpaRepository.save
is as follows:id == null
, the call is delegated toEntityManager.persist
.EntityManager.merge
.When using JPA directly, the things to consider when choosing wether to use the
persist
or themerge
methods in theEntityManager
are unfortunately not that simple. Invoking themerge
method with a managed entity can, for example in connection with cascading, have unpredictable results and is often discouraged. Accidentally, the optimization introduced with #1285 would also have solved our problem, but it was removed later in #1594.The problem is as follows:
I have three entity types,
Parent1
,Parent2
andChild
with bothParent1
andParent2
having a OneToMany relationship toChild
withCascadeType.ALL
. TheChild
can in the DB reference either aParent1
or aParent2
or both, but these references are not modeled in the Java classes. The tables are defined as follows:If I now in one transaction:
p1
of typeParent1
p2
of typeParent2
c
of typeChild
c
to bothp1
andp2
p1
andp2
through their repositories... I would expect to have only one instance of type
Child
persisted and the same instance references both byp1
andp2
. In reality however, the outcome depends on in which order I savep1
andp2
.If I save
p1
first and thenp2
, the following happens:p1
,SimpleJpaRepository
recognizesp1
as an already persistent entity and delegates the call toEntityManager.merge
.p1
is referencing a new entityc
through aCascadeType.ALL
reference, theEntityManager
will also mergec
and this is where the JPA specification might not be particularly obvious. Merging a new or detached entity means that the object is replaced with a managed copy. The replacement is in this case done in the list of children inp1
, sop1
is now referencing a new, merged copy ofc
, let's call itc2
.p2
,SimpleJpaRepository
recognizesp2
as a new entity and delegates the call toEntityManager.persist
.p2
is still referencing the original instance ofc
, which after the merge in the previous step is still not managed, the JPA implementation believes to see a new, unsaved enitity and creates a second managedChild
.After the save operations, I have one
Parent1
, oneParent2
and twoChild
entities in the database, with each of theChild
entities referencing eitherParent1
orParent2
. This result is not expected.If I save
p2
first and thenp1
, the following happens:p2
,SimpleJpaRepository
recognizesp2
as a new entity and delegates the call toEntityManager.persist
.p2
andc
to become managed entities. Note that in this case, the original instances will become managed and are not replaced with managed copies.p1
,SimpleJpaRepository
recognizesp1
as a persistent entity and delegates the call toEntityManager.merge
.p1
is still referencing the samec
instance as referenced fromp2
and this instance is already managed thanks to the previous persist operation. Now, the merge operation does not make a new copy ofc
and sincec
is already managed and up to date, does nothing.After the save operations, I have one
Parent1
, oneParent2
and oneChild
entitiy in the database, with theChild
entity referencing bothParent1
andParent2
. This is the expected result.Since JPA implementations are required to recognize changes in managed entities without any explicit call to
merge
orpersist
, the optimization introduced in #1285 (invokingsave
with a persistent and already managed entity will neither forward it tomerge
norpersist
) incidentially would also have fixed this problem. This optimization was reverted in #1594 because it allegedly broke functionality, then it seems to be disputed if the broken functionality was caused by the optimization at all and also requested that the revert should be reverted.And just to already here address the comments in #1594 (comment) arguing in favour of not reverting the revert:
EntityManager.merge
does not do the optimization itself because the purpose of themerge
method is not to save changes in a managed entity, but to reattach a detached entity to the current transaction. IMHO, the merge method is simply incorrectly used in the Spring Data implementation.parent
being a managed entity andchild
being a new entity, callingem.merge(parent)
would in this case still not makechild
a managed entity as the comment author seems to expect. As I already pointed out earlier, in this case, themerge
method will replace the instance in the list of children inparent
with a managed copy ofchild
, but thechild
instance as we see it in the method scope is still not managed. If the intention is to makechild
a managed entity in this scenario, the Spring Data repository would have to delegateparent
topersist
instead ofmerge
, as invokingpersist
with an already managed entity will ignore the root entity, but still cascade the persist operation to potentially new referenced objects and then attach the originalchild
instance to the EntityManager as a managed (but yet unsaved) entity.Implementations of the classes referenced in the text and a JUnit test showing the difference in behaviour depending on the order of the
save
methods: example.zipRelated issues:
#1285
#1594
The text was updated successfully, but these errors were encountered: