Skip to content

Feature request: Change ordering of default log formatter #1568

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 tasks done
bilalq opened this issue Jun 29, 2023 · 12 comments · Fixed by #2736
Closed
2 tasks done

Feature request: Change ordering of default log formatter #1568

bilalq opened this issue Jun 29, 2023 · 12 comments · Fixed by #2736
Assignees
Labels
completed This item is complete and has been merged/shipped feature-request This item refers to a feature request for an existing or new utility logger This item relates to the Logger Utility

Comments

@bilalq
Copy link

bilalq commented Jun 29, 2023

Use case

I know that users can provide their own LogFormatter instance, but the I believe the default should still be reasonably well-suited for most people.

The problem today is that the output frontloads a bunch of noise that isn't useful and makes the CloudWatch console log stream view useless without expanding and hard to read/skim even when you do.

At a glance, all you see is whether or not an invocation was a cold start and the ARN. Actually interesting information like log level and message are buried.

Screenshot 2023-06-29 at 6 58 58 AM

Solution/User Experience

Technically, V8 and every other JS engine I'm aware of don't guarantee ordering of JS object literals when serializing/deserializing. In practice, when the object key counts are not high, the order as written does get preserved.

If we look here:

https://github.com/aws-powertools/powertools-lambda-typescript/blob/main/packages/logger/src/formatter/PowertoolLogFormatter.ts#L19-L33

Changing:

    return {
      cold_start: attributes.lambdaContext?.coldStart,
      function_arn: attributes.lambdaContext?.invokedFunctionArn,
      function_memory_size: attributes.lambdaContext?.memoryLimitInMB,
      function_name: attributes.lambdaContext?.functionName,
      function_request_id: attributes.lambdaContext?.awsRequestId,
      level: attributes.logLevel,
      message: attributes.message,
      sampling_rate: attributes.sampleRateValue,
      service: attributes.serviceName,
      timestamp: this.formatTimestamp(attributes.timestamp),
      xray_trace_id: attributes.xRayTraceId,
    };

To something like this:

    return {
      level: attributes.logLevel,
      message: attributes.message,
      function_request_id: attributes.lambdaContext?.awsRequestId,
      xray_trace_id: attributes.xRayTraceId,
      service: attributes.serviceName,
      function_arn: attributes.lambdaContext?.invokedFunctionArn,
      function_memory_size: attributes.lambdaContext?.memoryLimitInMB,
      function_name: attributes.lambdaContext?.functionName,
      sampling_rate: attributes.sampleRateValue,
      timestamp: this.formatTimestamp(attributes.timestamp),
      cold_start: attributes.lambdaContext?.coldStart,
    };

Would likely be a lot more usable. This brings the log level, message, and execution ids to the forefront.

With regards to backwards compatibility: No promised API or contractual behavior changes. Technically, if a user had regex based parsing of log lines, that may break. However, I'd argue that given the JS engine already doesn't guarantee order preservation, such a dependency never had guarantees to begin with and should not be considered breakage of backwards compatibility.

Alternative solutions

No response

Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

@bilalq bilalq added triage This item has not been triaged by a maintainer, please wait feature-request This item refers to a feature request for an existing or new utility labels Jun 29, 2023
@bilalq
Copy link
Author

bilalq commented Jul 2, 2023

I'm happy to submit a PR for this if the team would be okay to merge it. Also open to hearing feedback on the ordering. My rationale for putting log level first is that it makes it easy to visually scan and see which log entries are for errors vs info vs debug. After that, the message is the most important attribute, IMO. The request and trace ids are then the most useful bits for pulling additional context around a given log line.

@dreamorosi dreamorosi added logger This item relates to the Logger Utility discussing The issue needs to be discussed, elaborated, or refined and removed triage This item has not been triaged by a maintainer, please wait labels Jul 3, 2023
@dreamorosi
Copy link
Contributor

Hi @bilalq thank you for taking the time to open an issue.

As you mentioned, we can't guarantee the order of the keys so if my memory serves me correctly, we simply ordered the keys alphabetically when creating the formatter.

I'm okay with changing the ordering as you suggested in your second code example and have level, message, and function_request_id at the front, so if the offer of opening a PR is still valid I'd be happy to review it and help merge it.

As a side note, the Python version of Powertools has level, location, message, and timestamp at the front but also allows customers to override the key position without having to write their own formatter via the log_record_order constructor parameter. I think in the future we should consider having a similar property as well, but this is out of the scope of this issue.

@dreamorosi dreamorosi added confirmed The scope is clear, ready for implementation and removed discussing The issue needs to be discussed, elaborated, or refined labels Jul 3, 2023
@dreamorosi dreamorosi moved this from Ideas to Backlog in Powertools for AWS Lambda (TypeScript) Jul 3, 2023
@bilalq
Copy link
Author

bilalq commented Jul 7, 2023

One correction from the issue description: JSON.stringify serialization does have a well-defined order post ES5.

bilalq added a commit to bilalq/powertools-lambda-typescript that referenced this issue Jul 7, 2023
This resolves aws-powertools#1568. The problem raised in that issue is that the log
formatter outputs log lines with keys written in alphabetical order. In
practice, that means they start off with the cold start flag, the
`function_arn`, `function_name`, `function_memory_size`, and a whole
bunch of other stuff that is not usually of interest. The actual log
level and message are buried and need log lines expanded to be seen.
Even when expanded, they're hard to visually spot at a glance. When
viewing a log group in Cloudwatch or some other UI that truncates log
lines, you effectively see no useful information at all (see screenshot
in aws-powertools#1568 for example).

This change addresses the problem by surfacing log properties in order
of usefulness. This is a subjective decision, and opinions may vary, but
the order submitted in this PR should be preferable for the vast
majority of users. The "at-a-glance" view of logs should now be
actionable and help with log-diving and debugging. The actual
information recorded is unchanged and log insight queries should still
work as before.

The serialization order is well-defined in modern ECMAScript engines:

Serialization From the MDN docs on `JSON.stringify`:

> Only enumerable own properties are visited. This means `Map`, `Set`,
> etc. will become `"{}"`. You can use the replacer parameter to
> serialize them to something more useful. Properties are visited using
> the same algorithm as `Object.keys()`, which has a well-defined order
> and is stable across implementations. For example, `JSON.stringify` on
> the same object will always produce the same string, and
> `JSON.parse(JSON.stringify(obj))` would produce an object with the
> same key ordering as the original (assuming the object is completely
> JSON-serializable).
>
> Source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#description

And the note about what that order is:

> The traversal order, as of modern ECMAScript specification, is
> well-defined and consistent across implementations. Within each
> component of the prototype chain, all non-negative integer keys (those
> that can be array indices) will be traversed first in ascending order
> by value, then other string keys in ascending chronological order of
> property creation.
>
> Source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...in
bilalq added a commit to bilalq/powertools-lambda-typescript that referenced this issue Jul 7, 2023
This resolves aws-powertools#1568. The problem raised in that issue is that the log
formatter outputs log lines with keys written in alphabetical order. In
practice, that means they start off with the cold start flag, the
`function_arn`, `function_name`, `function_memory_size`, and a whole
bunch of other stuff that is not usually of interest. The actual log
level and message are buried and need log lines expanded to be seen.
Even when expanded, they're hard to visually spot at a glance. When
viewing a log group in Cloudwatch or some other UI that truncates log
lines, you effectively see no useful information at all (see screenshot
in aws-powertools#1568 for example).

This change addresses the problem by surfacing log properties in order
of usefulness. This is a subjective decision, and opinions may vary, but
the order submitted in this PR should be preferable for the vast
majority of users. The "at-a-glance" view of logs should now be
actionable and help with log-diving and debugging. The actual
information recorded is unchanged and log insight queries should still
work as before.

The serialization order is well-defined in modern ECMAScript engines:

Serialization From the MDN docs on `JSON.stringify`:

> Only enumerable own properties are visited. This means `Map`, `Set`,
> etc. will become `"{}"`. You can use the replacer parameter to
> serialize them to something more useful. Properties are visited using
> the same algorithm as `Object.keys()`, which has a well-defined order
> and is stable across implementations. For example, `JSON.stringify` on
> the same object will always produce the same string, and
> `JSON.parse(JSON.stringify(obj))` would produce an object with the
> same key ordering as the original (assuming the object is completely
> JSON-serializable).
>
> Source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#description

And the note about what that order is:

> The traversal order, as of modern ECMAScript specification, is
> well-defined and consistent across implementations. Within each
> component of the prototype chain, all non-negative integer keys (those
> that can be array indices) will be traversed first in ascending order
> by value, then other string keys in ascending chronological order of
> property creation.
>
> Source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...in
@matmeylan
Copy link

Hi @bilalq thank you for taking the time to open an issue.

As you mentioned, we can't guarantee the order of the keys so if my memory serves me correctly, we simply ordered the keys alphabetically when creating the formatter.

I'm okay with changing the ordering as you suggested in your second code example and have level, message, and function_request_id at the front, so if the offer of opening a PR is still valid I'd be happy to review it and help merge it.

As a side note, the Python version of Powertools has level, location, message, and timestamp at the front but also allows customers to override the key position without having to write their own formatter via the log_record_order constructor parameter. I think in the future we should consider having a similar property as well, but this is out of the scope of this issue.

Having the same functionality in typescript would be awesome ! I'd be willing to contribute with some guidance.

@dreamorosi
Copy link
Contributor

dreamorosi commented Oct 30, 2023

Hi @mmeylan thank you for offering to contribute to the project.

As a first stop, and if you haven't already done so, I would recommend to take a look at the contributing document for the repo. In this doc you should be able to find the main info about how to contribute & setup your dev environment. Since you have already identified an opportunity for contributing and an issue, you're already more than halfway there.

After that, in case you want to get more details about the repo structure or other conventions, I would recommend reading also the pages under the "Contributing" section in our documentation: https://docs.powertools.aws.dev/lambda/typescript/latest/contributing/setup/

Once you feel ready to contribute, you can fork the repo and create your working branch starting from the feat/v2 branch. This feature will be included as part of the next major release, so you won't be merging back to main but to feat/v2.

In terms of implementing the feature, the main reference and North Star in terms of developer experience we are looking for is the one found in the Python version of Powertools.

To that extent, I imagine that we will have to add a new property to the ConstructorOptions type that is used to define the options that customers can use to instantiate a new Logger item. In doing so, we should try to type this in a way that allows customers to get auto complete at least for the keys of the UnformattedAttributes type.

Then, we'll likely have to handle this new option in the constructor method of the Logger and perhaps pass it down to the Logger.setLogFormatter() method called in the Logger.setOptions() method, which will in turn pass it down to the PowertoolsLogFormatter class.

Once that's done, we should modify the implementation of the PowertoolsLogFormatter.formatAttributes() method so that it arranges the keys of the LogItem based on the customer selection.

In doing so, we should keep in mind a couple of things:

  • we should aim at implementing this in a way that causes the least amount of runtime impact in terms of performance (ideally we should avoid iterating over the keys since the PowertoolsLogFormatter.formatAttributes() function is called every time a log is emitted)
  • we should allow the customer to pass a partial selection (i.e. only specify that the fields should be message and timestamp) and leave the order of the other fields unchanged.
  • we should make sure that whenever a field ordering is specified, it's passed to children of the logger instance (see Logger.createChild()) - this should already be the case but it might be worth it to make sure this is the case.

If any of the above is not clear or you'd like to discuss any of it further please don't hesitate to ask. Before staring the implementation it'd be great if you could lay out a rough plan of how you plan on implementing the feature so that we can try catching potential issues with it early on.

Feel free to continue the discussion under this issue, we'll make sure to reply as soon as we can. For a closer collaboration, or if you prefer to have a more real-time conversation (time zone allowing it), you can also join our Discord server, you'll find me & the other maintainers of this repo, as well as other members of the community in the #typescript channel.

@dreamorosi dreamorosi added the help-wanted We would really appreciate some support from community for this one label Mar 11, 2024
@willpeixoto
Copy link

I will be glad to help with this issue if still need some help :)

@dreamorosi
Copy link
Contributor

Hi @willpeixoto, yes, help is always welcome!

Thank you for offering, I'll assign the issue to you.

If you have any questions now or during development, please let me know.

We're happy to help either here or on Discord if less async communication is needed.

@leandrodamascena
Copy link
Contributor

leandrodamascena commented Apr 9, 2024

I will be glad to help with this issue if still need some help :)

Sensacional hein monstrão @willpeixoto! :D

We are so happy to have your contribution to the Powertools. If you have any questions, let us know and we'll help!

@dreamorosi
Copy link
Contributor

Hi @willpeixoto - just wanted to check in to see if you were still working on this or if you needed any help.

@dreamorosi dreamorosi moved this from Working on it to Backlog in Powertools for AWS Lambda (TypeScript) Jun 9, 2024
@dreamorosi
Copy link
Contributor

Putting back the issue on the backlog, if anyone is interested in picking this up please leave a comment.

@arnabrahman
Copy link
Contributor

@dreamorosi I have made a draft WIP PR with some questions. Please have a look whenever you are available.

@dreamorosi dreamorosi moved this from Backlog to Working on it in Powertools for AWS Lambda (TypeScript) Jul 8, 2024
@dreamorosi dreamorosi linked a pull request Jul 8, 2024 that will close this issue
@dreamorosi dreamorosi moved this from Working on it to Pending review in Powertools for AWS Lambda (TypeScript) Jul 8, 2024
@dreamorosi dreamorosi moved this from Pending review to Working on it in Powertools for AWS Lambda (TypeScript) Jul 8, 2024
@github-project-automation github-project-automation bot moved this from Working on it to Coming soon in Powertools for AWS Lambda (TypeScript) Sep 9, 2024
Copy link
Contributor

github-actions bot commented Sep 9, 2024

⚠️ COMMENT VISIBILITY WARNING ⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@dreamorosi dreamorosi added completed This item is complete and has been merged/shipped and removed confirmed The scope is clear, ready for implementation help-wanted We would really appreciate some support from community for this one labels Sep 16, 2024
@dreamorosi dreamorosi moved this from Coming soon to Shipped in Powertools for AWS Lambda (TypeScript) Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
completed This item is complete and has been merged/shipped feature-request This item refers to a feature request for an existing or new utility logger This item relates to the Logger Utility
Projects
6 participants