Skip to content

Unable to acquire connection from pool within configured maximum time #2632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
RomanRomanenkov opened this issue Nov 21, 2022 · 15 comments
Closed

Comments

@RomanRomanenkov
Copy link

RomanRomanenkov commented Nov 21, 2022

Hi, I have a Reactive REST API using Spring Data Neo4j (SpringBoot v2.7.5).

I noticed lots of error logs in our service "org.neo4j.driver.exceptions.ClientException: Unable to acquire connection from the pool within configured maximum time of 60000ms". I use default configuration provided by SDN.

I was able to reproduce the error. If I return Mono.error in case of an empty result, all the connections from the pool are in use and the service cannot acquire a new connection, no matter how much time passes.

private fun findUserById(id: UUID): Mono<User> {
        return userRepository.findById(id)
            .switchIfEmpty(Mono.error(NotFoundException("User $id not found")))
    }

With enabled property log-leaked-sessions: true, there are "Neo4j Session object leaked, please ensure that your application fully consumes results in Sessions or explicitly calls close on Sessions before disposing of the objects." logs.

Only after restarting the service, it returns to a healthy state. Why are the connections from the pool not get getting closed?

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Nov 21, 2022
@michael-simons michael-simons self-assigned this Nov 22, 2022
@michael-simons
Copy link
Collaborator

Hello @RomanRomanenkov thanks for using our module. I tried to recreate your scenario, please see attached project

sdn2632.zip

It basically recreates what you have

@RestController
public class MovieController {

	private final MovieRepository movieRepository;

	public MovieController(MovieRepository movieRepository) {
		this.movieRepository = movieRepository;
	}

	@GetMapping("/{id}")
	public Mono<Movie> findById(@PathVariable UUID id) {
		return movieRepository.findById(id)
			.switchIfEmpty(Mono.error(new SomeException()));
	}

	static class SomeException extends RuntimeException {

	}

	@ExceptionHandler
	public ResponseEntity<String> handle(SomeException ex) {
		return ResponseEntity.internalServerError().body("Schade");
	}

}

with minimal config

spring.neo4j.pool.log-leaked-sessions=true

spring.data.neo4j.repositories.type=reactive

spring.neo4j.authentication.username=neo4j
spring.neo4j.authentication.password=secret

You'll need the tx manager:

package com.example.sdn2632;

import org.neo4j.driver.Driver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.neo4j.core.ReactiveDatabaseSelectionProvider;
import org.springframework.data.neo4j.core.ReactiveUserSelectionProvider;
import org.springframework.data.neo4j.core.transaction.ReactiveNeo4jTransactionManager;
import org.springframework.data.neo4j.repository.config.ReactiveNeo4jRepositoryConfigurationExtension;
import org.springframework.transaction.ReactiveTransactionManager;

@Configuration
public class Neo4jConfig {
	@Bean(ReactiveNeo4jRepositoryConfigurationExtension.DEFAULT_TRANSACTION_MANAGER_BEAN_NAME)
	public ReactiveTransactionManager reactiveTransactionManager(
		Driver driver,
		ReactiveDatabaseSelectionProvider databaseSelectionProvider
	) {
		return ReactiveNeo4jTransactionManager.with(driver)
			.withDatabaseSelectionProvider(databaseSelectionProvider)
			.withUserSelectionProvider(ReactiveUserSelectionProvider.getDefaultSelectionProvider())
			.build();
	}
}

With apache bench used like this

ab -n 20000 -c 10 localhost:8080/6aeb4340-5712-4678-ba94-fd001a56b43d

I See the connections via call dbms.listConnections(); spike to 100, which is the drivers default.

All requests succeed, regardless whether I use an id that exists or not.

The connections will be there for an hour, which is also the default.

Configuring

spring.neo4j.pool.max-connection-lifetime=30s

you will see them going away in the database again.

No logs about leaked sessions.

Please share as much details as you have, thanks.

@michael-simons michael-simons added blocked: awaiting feedback and removed status: waiting-for-triage An issue we've not yet triaged labels Nov 23, 2022
@RomanRomanenkov
Copy link
Author

RomanRomanenkov commented Nov 24, 2022

Hello @michael-simons, thank you for your reply. Yes I see that just a simple findById works fine.
I have to add more details of my particular case. I have a more complex structure of Flux methods invocation in my project, so i guess it makes the difference.

My method consumes a list of objects and creates a Flux to iterate over the list, than in flatMap there is another repository call which returns an existing node and then I do a call to get a user which is not found and an exception is thrown.

I easily reproduce it with such an example

@GetMapping("/{id}")
fun doSomethingByUser(@PathVariable userId: UUID): Mono<Void> = userService.doSomething(userId)

fun doSomething(id: UUID): Mono<Void> {
        return Flux.fromIterable(listOf(1, 2, 3, 4, 5, 7, 8, 9, 10))
            .flatMap {
                nodeRepository.findById(nodeIdWhichExists)
                    .then(findUserById(id))
            }
            .collectList().then()
    }

private fun findUserById(id: UUID): Mono<User> {
        return userRepository.findById(id)
            .switchIfEmpty(Mono.error(NotFoundException("User $id not found")))
    }

collectList is important here, since I do some mapping in the project after that

As the result the requests are stuck and I see "org.neo4j.driver.exceptions.ClientException: Unable to acquire connection from the pool within configured maximum time of 60000ms"

I used your apach bench command to reproduce
ab -n 20000 -c 10 localhost:8080/6aeb4340-5712-4678-ba94-fd001a56b43d

@michael-simons
Copy link
Collaborator

Thanks @RomanRomanenkov this is helpful, though I am not happy about the result.

@michael-simons michael-simons added status: needs-investigation An issue that has been triaged but needs further investigation and removed blocked: awaiting feedback labels Nov 24, 2022
@michael-simons
Copy link
Collaborator

So I can see sessions being closed when the inner flow runs into an error, which is great. However, it seems that either the base infrastructure of Spring Data / Spring Framework or our implementation of a reactive transaction manager has issues when it fans out big time like in the above scenario.

Here are two solutions that work for me. Both change the the flow in such a way that you have only one transaction ongoing, first by injecting a transactional operator into the flow, the other one using the declarative approach.
With the original scenario you will have a bunch of concurrent transactions.

package com.example.sdn2632;

import java.util.List;
import java.util.UUID;

import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.transaction.ReactiveTransactionManager;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.transaction.reactive.TransactionalOperator;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@RestController
public class MovieController {

	private final ReactiveTransactionManager transactionManager;

	private final MovieRepository movieRepository;

	private final SomeService someService;

	public MovieController(ReactiveTransactionManager transactionManager, MovieRepository movieRepository, SomeService someService) {
		this.transactionManager = transactionManager;
		this.movieRepository = movieRepository;
		this.someService = someService;
	}

	@GetMapping("/{id}")
	public Mono<Void> findById(@PathVariable UUID id) {
		return Flux
			.fromIterable(List.of(1, 2, 3, 4, 6, 7, 8, 9, 10))
			.flatMap(i -> movieRepository.findById(id)
				.switchIfEmpty(Mono.error(new SomeException())))
			.collectList()
			.as(TransactionalOperator.create(transactionManager)::transactional)
			.then();
	}

	@GetMapping("/v2/{id}")
	public Mono<Void> findByIdV2(@PathVariable UUID id) {
		return someService.doStuff(id);

	}

	// Other solution
	@Service
	public static class SomeService {

		private final MovieRepository movieRepository;

		public SomeService(MovieRepository movieRepository) {
			this.movieRepository = movieRepository;
		}

		@Transactional
		public Mono<Void> doStuff(UUID id) {
			return Flux
				.fromIterable(List.of(1, 2, 3, 4, 6, 7, 8, 9, 10))
				.flatMap(i -> movieRepository.findById(id)
					.switchIfEmpty(Mono.error(new SomeException())))
				.collectList()
				.then();
		}
	}


	static class SomeException extends RuntimeException {
	}

	@ExceptionHandler
	public ResponseEntity<String> handle(SomeException ex) {
		return ResponseEntity.internalServerError().body("Schade");
	}
}

in both solutions I have again proper behaviour with sessions being closed / connections removed from the pool.

@RomanRomanenkov
Copy link
Author

RomanRomanenkov commented Nov 24, 2022

Thank you @michael-simons!
I can confirm that this workaround works fine in my case.
unfortunately, I have to keep in mind now to add @transactional in such complex flows if an exception is thrown

@michael-simons
Copy link
Collaborator

Another workaround is using explicit concatMap instead of flat map and avoiding parallel transactions this way. We are investigating this.

Thanks to awesome VMWare folks we have this down to be reproducible with this

package org.springframework.data.neo4j.integration.x;

import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Config;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.neo4j.config.AbstractNeo4jConfig;
import org.springframework.data.neo4j.config.AbstractReactiveNeo4jConfig;
import org.springframework.data.neo4j.core.DatabaseSelectionProvider;
import org.springframework.data.neo4j.core.ReactiveDatabaseSelectionProvider;
import org.springframework.data.neo4j.core.transaction.Neo4jBookmarkManager;
import org.springframework.data.neo4j.core.transaction.Neo4jTransactionManager;
import org.springframework.data.neo4j.repository.config.EnableNeo4jRepositories;
import org.springframework.data.neo4j.repository.config.EnableReactiveNeo4jRepositories;
import org.springframework.data.neo4j.test.BookmarkCapture;
import org.springframework.test.context.junit.jupiter.SpringExtension;
import org.springframework.test.context.junit.jupiter.SpringJUnitConfig;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.ReactiveTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.test.StepVerifier;

@ExtendWith(SpringExtension.class)
@SpringJUnitConfig(FooTest.Config.class)
public class FooTest {

	@Test
	void f(@Autowired MovieRepository movieRepository, @Autowired Driver driver) {
		UUID id = UUID.randomUUID();
		Flux
				.range(1, 5)
				// .doOnCancel(() -> System.out.println("doOnCancel (outer)"))
				.flatMap(
						i -> movieRepository.findById(id)
								.switchIfEmpty(Mono.error(new RuntimeException()))
								//.doOnCancel(() -> System.out.println("doOnCancel"))
				)
				//.collectList()
				//.as(TransactionalOperator.create(transactionManager)::transactional)
				.then()
				.as(StepVerifier::create)
				.verifyError();
		System.out.println("---- complete");
		try (Session session = driver.session()) {
			System.out.println(session.run("RETURN 1").single().get(0));
		}
	}


	@Configuration
	@EnableTransactionManagement
	@EnableReactiveNeo4jRepositories(considerNestedRepositories = true)
	static class Config extends AbstractReactiveNeo4jConfig {

		@Bean
		public Driver driver() {
			org.neo4j.driver.Config config = org.neo4j.driver.Config.builder()
					.withMaxConnectionPoolSize(2)
					.withConnectionAcquisitionTimeout(20, TimeUnit.SECONDS)
					.withLeakedSessionsLogging()
					.build();
			return
					GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"), config);
		}
	}

}

@michael-simons
Copy link
Collaborator

Reproducer now down to this

	record SessionAndTx(ReactiveSession session, ReactiveTransaction tx) {
	}

	@Test
	void f2(@Autowired Driver driver) {
		Flux
				.range(1, 5)
				.flatMap(
						i -> {
							Mono<SessionAndTx> f = Mono
									.just(driver.session(ReactiveSession.class))
									.flatMap(s -> Mono.fromDirect(s.beginTransaction()).map(tx -> new SessionAndTx(s, tx)));
							return Flux.usingWhen(f,
									h -> Flux.from(h.tx.run("MATCH (n) WHERE false = true RETURN n")).flatMap(ReactiveResult::records),
									h -> Mono.from(h.tx.commit()).then(Mono.from(h.session.close())),
									(h, e) -> Mono.from(h.tx.rollback()).then(Mono.from(h.session.close())),
									h -> Mono.from(h.tx.rollback()).then(Mono.from(h.session.close()))
							).switchIfEmpty(Mono.error(new RuntimeException()));
						}
				)
				.then()
				.as(StepVerifier::create)
				.verifyError();
		System.out.println("---- complete");
		try (Session session = driver.session()) {
			System.out.println(session.run("RETURN 1").single().get(0));
		}
	}

which is essentially SDN reactive tx flow, but without SDN.
Speaking now with the driver team.

@M3lkior
Copy link

M3lkior commented Nov 30, 2022

Hi @michael-simons ;

Thanks for the investigation,

Because we have the same issue on my product, using the reactive driver, what is the best things to do to mitigate this leak issue ? migrate to a concatMap approach ?

@michael-simons
Copy link
Collaborator

Either that, or wrap it in a transitional operator as shown.

@injectives can you please card this?

@AndyHeap-NeoTech please make room for this, this is getting a bit pressuring, @SeBBBe is from NOM btw.

@injectives
Copy link
Contributor

@injectives can you please card this?

Sure, I have actually carded this earlier today.

@AndyHeap-NeoTech
Copy link

We have a fix ready to be implemented. This will be available in the 5.4 Java driver which is planned for release in early January. It will also be back ported to the LTS 4.4 driver as a patch release, which will be available sooner if all things go smoothly.

@michael-simons
Copy link
Collaborator

Thank you, Andy!

@injectives
Copy link
Contributor

Fix for dangling transactions: neo4j/neo4j-java-driver#1341
Please give it a go if possible.

michael-simons added a commit that referenced this issue Dec 8, 2022
Closes #2632 for 6.3 with the fixed driver 4.4.10.
michael-simons added a commit that referenced this issue Dec 8, 2022
Closes #2632 for 6.3 with the fixed driver 4.4.10.
@michael-simons
Copy link
Collaborator

This has been now addressed with the upgrade to driver 4.4.10 in 6.3.x and 6.2.x.
Waiting for a 5.3 patch or the before mentioned 5.4. Thanks.

@michael-simons michael-simons added blocked: upstream dependency type: dependency-upgrade A dependency upgrade and removed status: needs-investigation An issue that has been triaged but needs further investigation type: dependency-upgrade A dependency upgrade labels Dec 14, 2022
michael-simons added a commit that referenced this issue Dec 21, 2022
Fixes #2632 on 7.0.x and main with the fixed driver 5.3.1.
@michael-simons
Copy link
Collaborator

This is fixed with driver 4.4.10 and 5.3.1 now. Thanks to everyone involved investigating and fixing it. Until there are new SDN releases, the following upgrade paths should work:

  • SDN 7.0.x / 7.1.x: Go to Java-Driver 5.3.1
  • SDN 6.3.x: Go to Java-Driver 4.4.10

🎁 🎄

@michael-simons michael-simons mentioned this issue Jan 4, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants