
Duplicated logs for loggers with same service name #249


Closed
dnlopes opened this issue Dec 20, 2020 · 7 comments
Labels: bug (Something isn't working)


dnlopes commented Dec 20, 2020

In my code base, every module creates a logger with a well-known service name that is used across the whole application. This name is passed via the POWERTOOLS_SERVICE_NAME environment variable. With this setup I'm seeing duplicated log entries, and the number of duplicates scales with the number of modules in the invocation path that declare a logger. Example: if my flow passes through 5 modules that each declare logger = Logger(), then every log entry appears 5 times.

Expected Behavior

When using logger = Logger() with the same service name, I expected no duplicated log entries.

Current Behavior

When using logger = Logger() with the same service name, log duplication occurs. I believe I have already identified the root cause: when a logger is instantiated, even though the underlying logger object is the same, a new handler is created and attached, so one extra handler is registered on every logger = Logger() invocation.

Steps to Reproduce (for bugs)

The following snippet will reproduce the issue.

from aws_lambda_powertools import Logger


class Class1:
    logger = Logger("class_1")

    @staticmethod
    def f_1():
        Class1.logger.info("log from class_1")


class Class2:
    logger = Logger("class_1")  # intentionally the same service name as Class1

    @staticmethod
    def f_2():
        Class2.logger.info("log from class_2")
        Class1.f_1()


Class2.f_2()

Output:
{"level": "INFO", "location": "f_2:17", "message": "log from class_2", "timestamp": "2020-12-20 21:58:37,473", "service": "class_1", "sampling_rate": 0.0}
{"level": "INFO", "location": "f_2:17", "message": "log from class_2", "timestamp": "2020-12-20 21:58:37,473", "service": "class_1", "sampling_rate": 0.0}
{"level": "INFO", "location": "f_1:9", "message": "log from class_1", "timestamp": "2020-12-20 21:58:37,476", "service": "class_1", "sampling_rate": 0.0} 
{"level": "INFO", "location": "f_1:9", "message": "log from class_1", "timestamp": "2020-12-20 21:58:37,476", "service": "class_1", "sampling_rate": 0.0}

However, if I change the loggers to have different names, no duplication occurs.

I understand I can achieve what I want with the child=True flag, but that leads to other issues, such as deciding which loggers should be children and which should not.

I feel this is a rather common use case, so I don't understand why there are no other reports of this behavior.

What am I missing? Is this by design?

Environment

  • Powertools version used: 1.9.0
  • AWS Lambda function runtime: Python 3.8

Thanks,
David

dnlopes added the bug (Something isn't working) and triage (Pending triage from maintainers) labels on Dec 20, 2020
@heitorlessa (Contributor)

Hi @dnlopes, thank you for raising this issue. This is in fact a bug: using the same logger name, or simply not passing one and relying on POWERTOOLS_SERVICE_NAME, shouldn't emit duplicated logs.

It's coming from here as we're configuring an existing Logger as opposed to returning it: https://github.com/awslabs/aws-lambda-powertools-python/blob/835789e959e6f563e0e6f9611b549a3d2a1b2e59/aws_lambda_powertools/logging/logger.py#L154

As for child and parent: your main Lambda handler is your entry point and therefore owns the parent Logger; all other files and entry points are children.
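
A minimal sketch of that layout (file names and the service name are illustrative):

# handler.py -- the Lambda entry point owns the parent Logger and its handler
from aws_lambda_powertools import Logger

logger = Logger(service="payment")

def handler(event, context):
    logger.info("parent log")

# shared_module.py -- everything imported by the handler marks its Logger as a child
from aws_lambda_powertools import Logger

logger = Logger(service="payment", child=True)  # no handler attached; records propagate to the parent

def do_work():
    logger.info("child log")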

Will ping once a fix is ready, and thank you again.


dnlopes commented Dec 21, 2020

Hey @heitorlessa, thanks for the quick reply.

The problem I mentioned about the child flag is that the decision is not always straightforward. Due to the nature of Python imports, if my Lambda imports a module that already invokes logger = Logger(), then my Lambda is not the one actually creating the first instance of the underlying logger. So it's not trivial to look at a given module and decide upfront whether it should be a child, because that depends on the code path that led there.
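
For example (module names are illustrative), whether a given Logger() line runs first depends entirely on import order:

# helpers.py -- imported by the handler, so this line runs at import time
from aws_lambda_powertools import Logger

logger = Logger()  # picks up POWERTOOLS_SERVICE_NAME; the underlying logger is created here first

# handler.py -- the Lambda entry point
import helpers  # executing this import runs the Logger() call in helpers.py
from aws_lambda_powertools import Logger

logger = Logger()  # same service name, but the handler was not the first to create the logger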

@heitorlessa (Contributor)

Yes, I'm aware it's not trivial.

We rely on Python's hierarchical logging support, so if the child is created first (as it will be in the import example, and that's expected), the hierarchy is fixed once the parent is created. The only issue is if log statements run as part of the import, but that's the same situation with the standard logger.
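
As a rough standard-library sketch of that self-healing behaviour (names are illustrative):

import logging

# the child is created first (e.g. at import time) with no handler of its own
child = logging.getLogger("payment.helpers")

# later, the parent is created and configured
parent = logging.getLogger("payment")
parent.addHandler(logging.StreamHandler())
parent.setLevel(logging.INFO)

# the record propagates up the dotted-name hierarchy and is emitted once by the parent's handler
child.info("emitted via the parent's handler")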

What child does here is (a) ensure the logger follows the hierarchy protocol, and (b) skip configuring handlers on it as part of the DX.

I've got a fix and test coverage for that now; I'm releasing a patch shortly.


dnlopes commented Dec 21, 2020

I didn't know the hierarchy would eventually self-heal; that's nice to hear. Nevertheless, since you are fixing this in the product itself, I think I'd rather wait for it.

When you say you're releasing a patch shortly, do you mean within the next hours, or in the next release (days/weeks)?

@heitorlessa (Contributor)

Patch is ready - could you double-check the PR above, @dnlopes, in case I missed anything obvious?

Once you approve, I'll kick off the release; it should be ready within a few minutes.

heitorlessa self-assigned this on Dec 21, 2020

dnlopes commented Dec 21, 2020

I already approved; it looks OK, although the PR checks are failing (it doesn't seem related to the fix itself).

Thanks for the quick action.

heitorlessa added a commit that referenced this issue Dec 21, 2020
fix: prevent touching preconfigured loggers #249
heitorlessa added the pending-release (Fix or implementation already in dev waiting to be released) label and removed the triage (Pending triage from maintainers) label on Dec 21, 2020
@heitorlessa (Contributor)

This is now available on PyPI as version 1.9.1.

I'm closing this now; thank you again for the bug report and the PR validation :) Enjoy the public holidays.

PS: On the PR, the failure was merely Codecov endpoints being unavailable, as happens from time to time.

heitorlessa removed the pending-release (Fix or implementation already in dev waiting to be released) label on Dec 21, 2020