Skip to content

Use application name as default clientId #3001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
notizklotz opened this issue Jan 24, 2024 · 11 comments
Closed

Use application name as default clientId #3001

notizklotz opened this issue Jan 24, 2024 · 11 comments

Comments

@notizklotz
Copy link
Contributor

notizklotz commented Jan 24, 2024

Expected Behavior

If spring.application.name is defined, it should be used as default Kafka clientId unless overriden by more specific configs. I get the behaviour that I think would be a good default by defining this in my application:

spring.kafka.client-id=${spring.application.name}

Current Behavior

The default clientIds are "producer", "adminclient", etc.

Context

We operate fairly large shared Kafka clusters and a couple of important metrics and log output by Kafka only includes the clientId and not the username. Especially for producers this means we see many of our customers implicitly connecting using "producer-1" as clientId as there is no way on Kafka server side to enforce a specific pattern. By using a more specific default this would ease problem analysis on server side. spring.application.name would be an ideal default because this is also used for similar use cases with other technologies.

I could have a look at creating a pull request for this, if this isn't something that has already been rejected in the past.

@sobychacko
Copy link
Contributor

@notizklotz I think we should still preserve producer, consumer, admin client etc. in the client-id name to distinguish them, but we can attach the spring.application.name for shared cluster environments where there are multiple apps connecting to Kafka. A PR would be greatly welcomed!

@artembilan
Copy link
Member

Note: the spring.kafka.client-id is the part of Spring Boot.
But Soby is right: this common property is out of use and those API-specific are there for consideration.

The "producer", "consumer" and "adminclient" comes from bean names auto-configured for those specific Kafka clients.

We may indeed consider to include spring.application.name into those properties.
For example, DefaultKafkaProducerFactory.CloseSafeProducer:

this.clientId = factoryName + "." + id;

And somewhere there we can resolve it like this.applicationContext.getEnvironment().getProperty("spring.application.name");

@artembilan artembilan added this to the 3.2.0-M1 milestone Jan 24, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Jan 29, 2024
@notizklotz
Copy link
Contributor Author

I started implementing a proof of concept and it seems to work out well, especially for the cases which currently have a very generic clientId (Consumer without Consumer Group, Producer, Admin Client).

I chose to put the application name before the type (myapp-consumer-1 instead of consumer-myapp-1) to differentiate it better from the Consumer with Consumer Group defaults, which have the Consumer Group name after the type. And it is more similar to the Kafka Streams defaults.

The following table with examples assumes spring.application.name=myapp

Type Before After Notes
consumer (with CG) consumer-myconsumergroup-1 consumer-myconsumergroup-1 Unchanged. Consumer Group is part of default clientId generated by KafkaConsumer which is enough for identifying the client.
consumer (without CG) consumer-null-1 myapp-consumer-1
producer producer-1 myapp-producer-1
admin adminclient-1 myapp-admin-1
Streams myapp-398ba268-efe9-4a74-a6e0-dcdc79412538-StreamThread-1-producer

myapp-ef4f985f-8ed7-48c4-8abc-bb51a354ab45-StreamThread-1-restore-consumer
myapp-398ba268-efe9-4a74-a6e0-dcdc79412538-StreamThread-1-producer

myapp-ef4f985f-8ed7-48c4-8abc-bb51a354ab45-StreamThread-1-restore-consumer
Unchanged. A Streams application id is always required by Kafka Streams and used in the default client ids. spring.application.name is already set as default Streams application id by Spring Boot.

@artembilan
Copy link
Member

@notizklotz ,

Thank you for investigation!

So, as we have discussed before, we have a plan.
The DefaultKafkaProducerFactory, DefaultKafkaConsumerFactory and KafkaAdmin need to be adjusted to incorporate spring.application.name Environment property, if clientId is not set explicitly in the target application.

Does that makes sense?

@notizklotz
Copy link
Contributor Author

@artembilan Yes, that makes sense :-) The code of my PoC is here and I could work it into a proper PR within the next weeks: main...notizklotz:spring-kafka:GH-3001

@frosiere
Copy link
Contributor

frosiere commented Feb 7, 2024

Very nice proposal. I'm currently facing the same issue.

When deploying in a cloud environment like Kubernetes, the same client id may still refer to multiple instances of a producer, consumer and admin client.

So, to solve this issue, wouldn't it make sense to have a ClientIdResolver with a default implementation referring to what has been proposed above?

The resolver contract would be as follow

public interface ClientIdResolver {
   
     String resolve(Map<String, ?> config, String suffix); 
}
public class DefaultClientIdResolver implements ClientIdResolver {

    private final Environment environment;

    public DefaultClientIdResolver(Environment environment) {
        this.environment = environment;
    }

    @Override
    public String resolve(Map<String, ?> config, String suffix) {
        final var applicationName = environment.getProperty("spring.application.name");
        if (applicationName == null || config.containsKey(CommonClientConfigs.CLIENT_ID_CONFIG)) {
            return (String) config.get(CommonClientConfigs.CLIENT_ID_CONFIG);
        }
        return String.join(".", applicationName, suffix);
    }
}

The resolver would also avoid the resolution of the clientId in 3 different kind of clients...

Users deploying in a cloud would be able to provide another implementation adding the pod id or another relevant information to enable correlation between the clientId and a specific instance of the producer, consumer and admin client.

Hope that the proposal make sense and can help investigations based on the clientId when deploying in a cloud environment.

@artembilan
Copy link
Member

As far as I know you can use env vars placeholders in those Spring Boot configuration properties, therefore no need in extra logic in the code: https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#features.external-config.files.property-placeholders

@frosiere
Copy link
Contributor

frosiere commented Feb 7, 2024

Fair point. The idea was also to factorize the code to avoid duplicating the same logic in 3 different places, allowing to change the resolution in an easy way. Thanks for the quick reply.

@sobychacko
Copy link
Contributor

@notizklotz Please let us know when you can work on a PR. We are postponing this issue to the next milestone (3.2.0-M2).

@sobychacko sobychacko modified the milestones: 3.2.0-M1, 3.2.0-M2 Feb 15, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Feb 19, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Feb 19, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Feb 19, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Feb 19, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Feb 19, 2024
Fixes: #spring-projectsGH-3001

Use Spring Boot's `spring.application.name` property as part of the default clientIds for Consumers, Producers and AdminClients. Helps with identifying problematic clients at server side.

* Only use as a fallback if clientId wasn't specified explicitly
* Do not use for Consumers with a specified groupId because KafkaConsumer will use the groupId as clientId which already is an identifiable default
@notizklotz
Copy link
Contributor Author

@notizklotz Please let us know when you can work on a PR. We are postponing this issue to the next milestone (3.2.0-M2).

@sobychacko I have the PR ready: #3048

notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 1, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 1, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 1, 2024
# Conflicts:
#	spring-kafka-docs/src/main/antora/modules/ROOT/pages/whats-new.adoc
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 11, 2024
Fixes: #spring-projectsGH-3001

Use Spring Boot's `spring.application.name` property as part of the default clientIds for Consumers, Producers and AdminClients. Helps with identifying problematic clients at server side.

* Only use as a fallback if clientId wasn't specified explicitly
* Do not use for Consumers with a specified groupId because KafkaConsumer will use the groupId as clientId which already is an identifiable default
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 11, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 11, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 11, 2024
notizklotz added a commit to notizklotz/spring-kafka that referenced this issue Mar 11, 2024
Fixes: #spring-projectsGH-3001

Use Spring Boot's `spring.application.name` property as part of the default clientIds for Consumers, Producers and AdminClients. Helps with identifying problematic clients at server side.

* Only use as a fallback if clientId wasn't specified explicitly
* Do not use for Consumers with a specified groupId because KafkaConsumer will use the groupId as clientId which already is an identifiable default
sobychacko pushed a commit that referenced this issue Mar 11, 2024
Fixes: #GH-3001

* Use Spring Boot's `spring.application.name` property as part of the default clientIds for 
  Consumers, Producers, and AdminClients. Helps with identifying problematic clients on the server side.
* Only use as a fallback if clientId wasn't specified explicitly
* Do not use for Consumers with a specified groupId because KafkaConsumer will use the groupId 
  as clientId, which already is an identifiable default
@sobychacko
Copy link
Contributor

Closed via ab5f0a1.
See the PR: #3048

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants