Non-deterministic issue with uninitialized fields when projecting the same Node to two different interfaces in projection hierarchy #2858
Comments
After some further investigation I can tell that it depends on the order of the queries that resolve the relationships. If the friends relationships are fetched first, everything works fine. If the related-to relationship is fetched first, the collection of the friends relationship is empty, because the projection for the related-to relationship does not contain the friends field, so that field is not included in the result. |
I tried to investigate this further. I noticed that for certain queries or graphs the query strategy changes. In the example provided above the data is fetched using multiple queries. In another example the data is fetched using a single query looking similar to this:
Executing that single query actually returns all the data. However, the empty-set issue occurs there as well. So it might not be about the queries but about handling the results returned by the query. Unfortunately I have no idea where to start debugging in the spring-data-neo4j project, nor do I know how to trigger that single-query strategy, so I cannot provide a simple example project. |
Thanks for your input. I am currently already working on this. |
Just to be clear: the issue regarding uninitialized fields also applies to the query in my last comment, although the query actually returns the data if executed in the Neo4j Browser. So there might be two issues:
|
This is not part of the tests, right? |
Yeah, it's not part of the test. I experienced it in my real project but couldn't create a demo, as I don't know how to model my domain/projections to get the single-query behavior. |
You get the single-query behaviour if you don't repeat an entity in your chain. As explained earlier, this is unrelated to the projections. Good news regarding the mapping: I found the problem. It is rooted in the concept, which I thought was a good idea, of not iterating over an already discovered entity again. Like you said, fetching the "family Bob" first with the projection in mind results in a dead end because there are no further relationships defined in the |
Thank you for the clarification. I didn't get it because the project in which I first discovered the issue has this cycle but uses a single query. I guess this is because, from spring-data-neo4j's point of view, the cycle does not contain the same entity, as it is based on an inheritance hierarchy and looks more like this:
Where Person and VerySpecialPerson are effectively the same Node but probably not treated as possibly being the same entity. And great that you were able to find the problem. |
I created another example that is closer to my original project. It uses the single query and fails consistently: https://github.com/NilsWild/spring-issue-demo/tree/f1533f8ff6cee6d9daa8fe262276fa906575bfbe |
I thought about this statement. Wouldn't it make more sense to track the projectionInformation / returnType instead of the relationships that led there? If Bob is already known and should be projected to the same return type, it can return fast; if not, it should be "another" Bob proxy. Though I am not sure if that information is available at that point. |
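To picture the suggestion above, here is a minimal plain-Java sketch of a cache keyed by node id and return type. It is only an illustration of the idea, not SDN code: `ProjectionKeyedCacheSketch`, `Key`, `FriendView`, `FamilyView` and `resolve` are all hypothetical names, and whether the return type is even available at that point in the mapper is exactly the open question raised in the comment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: reuse a mapped object only for the same node AND the same return type. */
final class ProjectionKeyedCacheSketch {

    /** Hypothetical projection markers standing in for two different projection interfaces. */
    interface FriendView {}
    interface FamilyView {}

    /** Cache key: the node id plus the type the node should be projected to. */
    record Key(long nodeId, Class<?> returnType) {}

    private final Map<Key, Object> cache = new HashMap<>();

    /**
     * Returns the cached object for (nodeId, returnType), or maps a new one.
     * The same node requested as a different return type gets its own object.
     */
    <T> T resolve(long nodeId, Class<T> returnType, Supplier<T> mapper) {
        Object value = cache.computeIfAbsent(new Key(nodeId, returnType), key -> mapper.get());
        return returnType.cast(value);
    }
}
```

With this key, asking twice for node 2 as `FriendView` returns the same object, while asking for node 2 as `FamilyView` triggers a separate mapping, which is the "another Bob proxy" from the comment.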
That is what makes it hard to solve this more nicely. The relationships give me the information about connected entities, unrelated to possible projections defined (and used later). I will check out your other example tomorrow. I am curious how this relates to the problem with cascading queries, because the query bug is down in the subsequent query creation for cycles and not something that would also apply to the directed query. On the other hand I am pretty "confident" that there might also be similar logic in this case :) For now, tracking the relationships that resolved the ids works. I am not yet sure if this is a performant solution, because I keep a copy of the ids around in a map that gets modified heavily when there are a lot of related nodes. |
As far as I can tell, the query is fine and returns all the data needed to initialize the entity. However, the entity might be cached and not initialized twice if the same entity is returned twice in the query results. Let me know when you have a solution, so I can test it on my project and provide another example if needed. |
If an entity has already been loaded by any relationship, it gets marked as processed. But this is not a valid state if there are multiple relationships to this entity and it is loaded via different projections for each relationship. In those cases SDN will just stop looking for other relationships. This commit fixes this behaviour by also taking into account the relationship the entity was loaded with.
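The difference described in this commit message can be shown with a small toy model in plain Java. This is a sketch of the idea only, not the actual SDN implementation: the class, the `Step` record, and the relationship names `RELATED_TO` and `FRIENDS_WITH` are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the "already processed" check; not SDN's actual code. */
final class ProcessedTrackingSketch {

    /** One traversal step: the node with id targetId is reached via the given relationship. */
    record Step(String relationship, long targetId) {}

    /** Old behaviour: a node is skipped as soon as it was reached via ANY relationship. */
    static List<Step> visitByNodeOnly(List<Step> steps) {
        Set<Long> processed = new HashSet<>();
        List<Step> visited = new ArrayList<>();
        for (Step step : steps) {
            // add() returns false if the node id is already marked as processed
            if (processed.add(step.targetId())) {
                visited.add(step);
            }
        }
        return visited;
    }

    /** Fixed behaviour: the relationship is part of the "processed" key. */
    static List<Step> visitByNodeAndRelationship(List<Step> steps) {
        Set<Step> processed = new HashSet<>();
        List<Step> visited = new ArrayList<>();
        for (Step step : steps) {
            // records compare by value, so (relationship, targetId) acts as the key
            if (processed.add(step)) {
                visited.add(step);
            }
        }
        return visited;
    }
}
```

For the two steps `("RELATED_TO", 2)` and `("FRIENDS_WITH", 2)`, the node-only check visits just the first step, so whatever the second relationship's projection would have loaded is lost. The combined key visits both, and which step comes first no longer matters.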
I finally managed to fix the initial problematic bit regarding the loading of back-references.
|
Looks good. Solves the issue for the multiple queries. |
Great, thanks for your feedback. |
I don't get a directed query in the commit you have posted. The cyclic query makes sense to me because e.g. a Nevertheless, I get a
Is this the problem you were also observing? (Or the commit hash is the wrong one) |
Just double checked. You're right, it's not a single query. The final query just gave me that impression. I get the following final query for the "save and retrieve" test:
The query includes all relevant attributes and relationships. And yes, you're right, that's the problem I was referring to. Notice that besides the assertion made in line 40, the one in line 41 checking the dependsOn relationship fails as well. I don't know if those are separate problems or the same one, as one is referencing an attribute of the node and the other one a relationship. |
This is a tough one.
Spring Data Neo4j usually maps an entity once per record. Again, given the randomised order of property traversal, this can either be the first case of Would take some time to figure out if there is a potential "middle ground" that solves this. |
I see. However, I think that this is a very harsh functional limitation. For the headers field a workaround would be to just declare it everywhere. For the dependsOn relationship I cannot imagine any workaround, as including the relationship in every projection would lead to infinite recursion, since there's no mechanism like JsonIdentityInfo. So only custom queries could solve this for now, I guess. |
I think the solution should be to fix this limitation. It's just that nobody encountered this before, and tbh I never thought about this scenario with different projections for an entity within one query. |
…pping. Prior to this, an incomplete loaded entity, due to projection, was never touched again to add missing properties loaded via a different relationship and projection definition.
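The idea in this commit message, filling in an incompletely loaded entity when it shows up again under a different projection, can be sketched in plain Java as follows. This is a toy illustration under stated assumptions, not SDN's mapping code: entities are modelled as simple property maps, and the class name, the `map` method and the sample values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: an entity seen again under another projection gets its missing properties added. */
final class IncompleteEntityMergeSketch {

    /** Already mapped entities: node id -> (property name -> value). */
    private final Map<Long, Map<String, Object>> mapped = new HashMap<>();

    /**
     * Maps the node with the properties visible through the current projection.
     * If the node was mapped before, properties it does not have yet are added
     * instead of leaving the earlier, incomplete entity untouched.
     */
    Map<String, Object> map(long nodeId, Map<String, Object> projectedProperties) {
        Map<String, Object> entity = mapped.computeIfAbsent(nodeId, id -> new HashMap<>());
        // keep values that are already set, add only the ones still missing
        projectedProperties.forEach(entity::putIfAbsent);
        return entity;
    }
}
```

If node 2 is first mapped with only `lastname` and later with only `name`, the second call returns the same entity, now containing both properties. Under the behaviour described before the fix, the second call would have had no effect.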
And we have a few commits more on the branch. As usual the snapshot (same name) should be available in ~30 minutes, if nothing breaks 🤞 |
Based on the tests I have for my project, I can confirm that this snapshot fixed the problem. |
Many thanks for the second confirmation. A little bit of refactoring tomorrow and both fixes will get merged for the next round of releases. |
Thanks again for all your feedback. The fix(es) will be in the next 7.2. release. |
spring-data-neo4j:7.21
neo4j 5.16-community
Java 17
Kotlin 1.9.22
Given a Node with two relationships that could point to the same entities, and using projections to get the hierarchy, there seems to be non-deterministic behavior leading to uninitialized fields.
An example: Given a node like this:
I can create a graph with two persons, Alice and Bob. Both are friends with each other, and Alice is related to Bob.
This works perfectly.
I also have a Projection to query a person with their friends, friends of friends and their relatives:
Notice that the "Family" projection is missing the lastname field, whereas the "Friends" and "FriendsOfFriends" projections are missing the lastname field. Furthermore, the "FriendsOfFriends" projection is missing the friends field.
When I now try to retrieve Alice like this:
and retrieve her friends of friends (which would be herself) like this:
There is a chance that the collection is reported as empty:
The same happens if I want to retrieve the name of Alice's friend like this:
There is a chance that the name field of the person entity is reported as not initialized. I suspect this happens because only one of the projections for the respective node is taken into account, so the respective fields are not queried by the repository. The behavior is not deterministic.
I provided the example project here: https://github.com/NilsWild/spring-issue-demo/tree/d20ed4eb9469592fe9ed5625f42791d414d6e01f
Just run the provided test case multiple times and it will fail eventually.