Skip to content

child=True is not creating a logger? #143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lorengordon opened this issue Aug 27, 2020 · 24 comments
Closed

child=True is not creating a logger? #143

lorengordon opened this issue Aug 27, 2020 · 24 comments
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@lorengordon
Copy link

What were you trying to accomplish?

Re-use a logger in a sub-module, using the child=True feature, per the documentation.

Expected Behavior

With the following setup, there should be two log entries, one with "service": "foo" and one with "service": "foo.bar".

cat foo.py:

$ cat foo.py
import aws_lambda_powertools
import bar

log = aws_lambda_powertools.Logger(service="foo")

log.info('hallelujah')
bar.bar()

cat bar.py:

import aws_lambda_powertools

log = aws_lambda_powertools.Logger(child=True)

def bar():
    log.info('doublestuff')

Current Behavior

$ python foo.py
{"timestamp": "2020-08-27 11:20:16,245", "level": "INFO", "location": "<module>:6", "service": "foo", "sampling_rate": 0.0, "message": "hallelujah"}

Even setting the service explicitly in the submodule to foo.bar, the service remains foo in the output, though at least now their are two log entries.

cat bar.py:

$ cat bar.py
import aws_lambda_powertools

log = aws_lambda_powertools.Logger(service="foo.bar", child=True)

def bar():
    log.info('doublestuff')
$ python foo.py
{"timestamp": "2020-08-27 11:26:05,330", "level": "INFO", "location": "<module>:6", "service": "foo", "sampling_rate": 0.0, "message": "hallelujah"}
{"timestamp": "2020-08-27 11:26:05,330", "level": "INFO", "location": "bar:6", "service": "foo", "sampling_rate": 0.0, "message": "doublestuff"}

Steps to Reproduce (for bugs)

See above.

Environment

  • Powertools version used: 1.4.0
  • Packaging format (Layers, PyPi): PyPi
  • AWS Lambda function runtime: running locally, not in lambda yet
@lorengordon lorengordon added bug Something isn't working triage Pending triage from maintainers labels Aug 27, 2020
@michaelbrewer
Copy link
Contributor

michaelbrewer commented Aug 27, 2020

I would suspect this is at fault:

handlers = self._logger.parent.handlers if self.child else self._logger.handlers

@heitorlessa
Copy link
Contributor

Hey Loren - Thanks for raising this issue.

If you set service="foo" in your child in the first example it should work.

We need to make the docs clearer here, so please allow me to explain what's going on (I'm on my phone so it'll be a longer response than usual)

Service is what defines what the function is responsible for, or part of (e.g payment service).

For Logger, the service is the logging key customers will use to search log operations for one or more functions, for example - "Search for all error, or messages like X, where service is payment".

That said, customers can set the service using explicit arguments, but typically use as an environment variable since it's used across all utilities and to avoid repetition - POWERTOOLS_SERVICE_NAME

With that hopefully explained, let's get to inheritance.

Python Logging hierarchy happens at the dot notation (service, service.child, service.child_2).
For that to work seamlessly, we name all parent Loggers, no child param, with the value defined in the service param.

For child Loggers we introspect the name of your module where Logger(child=True) is called, and name your Logger as "service.filename".

Once both Loggers are registered, and Python Logging recognises the dot notation, it fixes up the hierarchy and event propagation.

Why the first sample didn't work

Python Logger doesn't recognise as a parent + child loggers because their names are different, since service param value is different.

  • Parent logger is named "foo"
  • Child logger is named "service_undefined"

If you were to use the same service name in the Child logger that should work, as that will then be "foo.bar", where bar comes from bar.py

Why the service key isn't foo.bar

That's because of consistency. The service/application is still the same despite having multiple Loggers.

Sorry for the long explanation but hope that clarifies it - Lemme know otherwise :)

Thanks

@lorengordon
Copy link
Author

@heitorlessa Ok, perhaps I'm just not using this library correctly then. All the examples assume service, where python loggers are more typically using the logger name in the record. So I just assumed service === name for the purposes of this library.

I suppose I don't particularly care about the service name for our use case. I do care about specifically where in the module the message was generated. It looks like this module is using the location for that purpose, but that is just bar instead of the logger name (foo.bar).

$ cat bar.py
import aws_lambda_powertools

log = aws_lambda_powertools.Logger(service="foo", child=True)

def bar():
    log.info('doublestuff')
$ python foo.py
{"timestamp": "2020-08-27 12:02:10,669", "level": "INFO", "location": "<module>:6", "service": "foo", "sampling_rate": 0.0, "message": "hallelujah"}
{"timestamp": "2020-08-27 12:02:10,669", "level": "INFO", "location": "bar:10", "service": "foo", "sampling_rate": 0.0, "message": "doublestuff"}

@heitorlessa
Copy link
Contributor

That is correct - location is where funcName:lineNumber will be present, hence bar:lineNo

We didn't include the name parameter for brevity, and for preventing mistakes such as name==name which wouldn't work.

@michaelbrewer
Copy link
Contributor

michaelbrewer commented Aug 27, 2020

@heitorlessa personally I would prefer if "location" included the module it was benign logged from.

This would be similar to other logging libraries.

DEBUG - [method_name] package_name -  message

@heitorlessa heitorlessa added documentation Improvements or additions to documentation area/logger and removed bug Something isn't working triage Pending triage from maintainers labels Aug 27, 2020
@lorengordon
Copy link
Author

Gotcha. Yeah, the function name is definitely not enough info. Is there some way we can set location to something like package.module.function:line?

@heitorlessa
Copy link
Contributor

@michaelbrewer I think it might be easier to have an option to override log record attributes.

When we started, we heard from beta customers the opposite, as the module wasn't always valuable.

@michaelbrewer
Copy link
Contributor

Gotcha. Yeah, the function name is definitely not enough info. Is there some way we can set location to something like package.module.function:line?

Maybe we can allow for a custom pattern for this. As in some cases you don't want the location field to be too long.

@lorengordon
Copy link
Author

lorengordon commented Aug 27, 2020

Maybe we can allow for a custom pattern for this. As in some cases you don't want the location field to be too long.

That would be fine. Coming from python logging, we're used to setting the logformat with our custom preferences.

@michaelbrewer
Copy link
Contributor

@michaelbrewer I think it might be easier to have an option to override log record attributes.

When we started, we heard from beta customers the opposite, as the module wasn't always valuable.

Yeah. I would want to suppress sample_rate, service and some other unused fields. Can overload "location" to include more info like the package/module and method.

@michaelbrewer
Copy link
Contributor

Maybe a separate RFC on a more tailored logger.

@heitorlessa
Copy link
Contributor

That was mostly it @michaelbrewer as sometimes it'd be either too long or not easily parsed/searched.

Let me ask you this before I go to sleep :)

Apart from location, is there any other log record attribute you might to override? E.g timestamp etc

Native ones under LogRecord attributes

https://docs.python.org/3/library/logging.html

@heitorlessa
Copy link
Contributor

A RFC it is, then - if either of you could create I'd be happy to consider it.

I want to avoid a situation where we add too many flags or break existing customers searching mechanisms (saved queries).

Instead, we should provide an option for Power users to override Log records as they see fit without imposing this to less experienced customers, as Python Logging is a complex beast

@michaelbrewer
Copy link
Contributor

@heitorlessa - maybe some presets for location and timestamp. Support for pulling through correlation ids. And ability to suppress / change order of attributes.

But I think the default should please as many people as possible.

Small lambda with a single file should work as well as complex ones with sub modules. Maybe we have a simple flag to switch between the 2.

@lorengordon
Copy link
Author

Overriding the timestamp seems to work fine... My repro case here was simplified, but here is what we're really testing with, derived from our current logging configs...

log = aws_lambda_powertools.Logger(
    service=__name__,
    stream=sys.stderr,
    timestamp="%(asctime)s.%(msecs)03dZ",
    datefmt="%Y-%m-%dT%H:%M:%S",
)

Here's the boilerplate we currently have in all our lambdas, which your library is replacing, and also giving us the benefits of structured logging!

DEFAULT_LOG_LEVEL = logging.INFO
LOG_LEVELS = collections.defaultdict(
    lambda: DEFAULT_LOG_LEVEL,
    {
        "critical": logging.CRITICAL,
        "error": logging.ERROR,
        "warning": logging.WARNING,
        "info": logging.INFO,
        "debug": logging.DEBUG,
    },
)

# Lambda initializes a root logger that needs to be removed in order to set a
# different logging config
root = logging.getLogger()
root.handlers.clear()

logging.basicConfig(
    format="%(asctime)s.%(msecs)03dZ [%(name)s][%(levelname)-5s]: %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
    level=LOG_LEVELS[os.environ.get("LOG_LEVEL", "").lower()],
)
log = logging.getLogger(__name__)

@heitorlessa heitorlessa self-assigned this Aug 27, 2020
@lorengordon
Copy link
Author

Ok, nice, so overriding location already works:

log = aws_lambda_powertools.Logger(
    service="foo",
    stream=sys.stderr,
    location="%(name)s.%(funcName)s:%(lineno)d",
    timestamp="%(asctime)s.%(msecs)03dZ",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
$ python foo.py
{"timestamp": "2020-08-27T12:54:32.516Z", "level": "INFO", "location": "foo.<module>:14", "service": "foo", "sampling_rate": 0.0, "message": "hallelujah"}
{"timestamp": "2020-08-27T12:54:32.517Z", "level": "INFO", "location": "foo.bar.bar:10", "service": "foo", "sampling_rate": 0.0, "message": "doublestuff"}

@heitorlessa
Copy link
Contributor

Yes, you can override "location", "timestamp", "level", and "datefmt" - I just haven't documented them.

If you're okay with it being this way, then we just need:

  1. RFC for suppressing keys, and ordering them (message first, etc)
  2. Document this

I'll check the responses tomorrow

@michaelbrewer
Copy link
Contributor

@lorengordon @heitorlessa - looks like documentation updates and #140 might address this.

# space.py
from aws_lambda_powertools import Logger

logger = Logger(location="%(module)s.%(funcName)s:%(lineno)d", timestamp=None, format_keys=["location", "message"])

def fly():
    logger.info("takeoff")

fly()
$ POWERTOOLS_SERVICE_NAME=falcon python space.py 
{"location": "space.fly:7", "message": "takeoff", "level": "INFO", "service": "falcon", "sampling_rate": 0.0}

@lorengordon
Copy link
Author

Agreed, given the ability to override location formatting, documentation updates would be sufficient. Ability to suppress keys would be a nice-to-have, but I don't have a strong opinion there.

@heitorlessa
Copy link
Contributor

We've just merged @michaelbrewer PR on ability to suppress and order a few keys - Working on a PR for the docs piece now

@heitorlessa
Copy link
Contributor

@lorengordon @michaelbrewer could you please take a look and tell me whether this clarifies these and other questions?

#148

@lorengordon
Copy link
Author

@heitorlessa looks good to me! thanks!

@michaelbrewer
Copy link
Contributor

@heitorlessa looks good

@heitorlessa heitorlessa added pending-release Fix or implementation already in dev waiting to be released and removed pending-release Fix or implementation already in dev waiting to be released labels Aug 31, 2020
@heitorlessa
Copy link
Contributor

This has now available in 1.5.0 release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
Development

No branches or pull requests

3 participants