Skip to content

fix(logging): Prevent accidental overwriting of reserved keys via structured arguments #1822

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 24, 2025

Conversation

phipag
Copy link
Contributor

@phipag phipag commented Apr 23, 2025

Issue #, if available: #1759

Description of changes:

This PR fixes two problems for both the logback logging module as well as the log4j logging module:

  1. Prevents users from accidentally overwriting reserved context keys such as message, level, Lambda context information, etc.
  2. Fixes a bug in the MapArgument where JSON serialization was missing a comma if the map contained multiple root-level keys.

Note: In addition to unit tests, I ran E2E tests as well. Until E2E tests are restored in GitHub workflows, here is the output from running in my AWS account.

❯ mvn -Pe2e -B verify --file powertools-e2e-tests/pom.xml -Dtest="LoggingE2ET.java"
...
16:32:30.422 [main] INFO  software.amazon.lambda.powertools.testutils.Infrastructure - Uploading asset 3a7b20f10d24fa631c596730d62c85aff6d25420cc2ed78d6a7decf66e08a2e5.jar to cdk-hnb659fds-assets-*******-eu-west-1
16:32:34.738 [main] INFO  software.amazon.lambda.powertools.testutils.Infrastructure - Deploying 'LoggingE2ET-893080a1f7f7' on account *******
16:33:35.840 [main] INFO  software.amazon.lambda.powertools.testutils.Infrastructure - Stack LoggingE2ET-893080a1f7f7 successfully deployed
16:33:38.474 [main] INFO  software.amazon.lambda.powertools.testutils.Infrastructure - Deleting 'LoggingE2ET-893080a1f7f7' on account *******
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.50 s -- in software.amazon.lambda.powertools.LoggingE2ET
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

Checklist

Copy link
Contributor

github-actions bot commented Apr 23, 2025

💾 Artifacts Size Report

Module Version Size (KB)
powertools-common 2.0.0-SNAPSHOT 13.75
powertools-serialization 2.0.0-SNAPSHOT 22.89
powertools-logging 2.0.0-SNAPSHOT 40.02
powertools-logging-log4j 2.0.0-SNAPSHOT 28.06
powertools-logging-logback 2.0.0-SNAPSHOT 21.85
powertools-tracing 2.0.0-SNAPSHOT 19.09
powertools-metrics 2.0.0-SNAPSHOT 18.48
powertools-parameters 2.0.0-SNAPSHOT 17.42
powertools-validation 2.0.0-SNAPSHOT 19.79
powertools-cloudformation 2.0.0-SNAPSHOT 16.76
powertools-idempotency-core 2.0.0-SNAPSHOT 35.92
powertools-idempotency-dynamodb 2.0.0-SNAPSHOT 12.58
powertools-large-messages 2.0.0-SNAPSHOT 17.41
powertools-batch 2.0.0-SNAPSHOT 21.42
powertools-parameters-ssm 2.0.0-SNAPSHOT 10.65
powertools-parameters-secrets 2.0.0-SNAPSHOT 9.86
powertools-parameters-dynamodb 2.0.0-SNAPSHOT 11.90
powertools-parameters-appconfig 2.0.0-SNAPSHOT 11.41

Copy link
Contributor

@dreamorosi dreamorosi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of comments on the documentation, one of which needs attention.

Regarding the implementation, two comments:

  • in TS and I think other runtimes, we are emitting a warning when we ignore a key to let the customer know there's been data loss - see here. Was this omission intentional?
  • I don't know the implementation good enough to answer this question without running the Logger, so I'll ask: if I add a reserved key via the appendKey() method, is it being correctly ignored?

Finally, could you look at the SonarQubeCloud findings and either address them or mark them as false positives?

@dreamorosi dreamorosi linked an issue Apr 23, 2025 that may be closed by this pull request
@phipag
Copy link
Contributor Author

phipag commented Apr 23, 2025

in TS and I think other runtimes, we are emitting a warning when we ignore a key to let the customer know there's been data loss - see here. Was this omission intentional?

Yes, it was intentional because I didn't want to spam the user's logs with warning messages on each log emission where this happens. If you have a strong opinion on adding a warning I can do so. I believe in .NET there is a warning in the documentation only – similar to what I propose here.

I don't know the implementation good enough to answer this question without running the Logger, so I'll ask: if I add a reserved key via the appendKey() method, is it being correctly ignored?

The appendKey() method in Logging v2 was removed and replaced in favor of Mapped Diagnostic Context (MDC) which is a standard in Java logging frameworks. Here is a link to an explanation for SLF4J which we use here. https://www.slf4j.org/manual.html#mdc

You can check the Preview Documentation to see the Logging v2 docs where appendKey() was removed and MDC was added in this section: https://docs.powertools.aws.dev/lambda/java/preview/core/logging/#custom-keys

To answer your question "is it being correctly ignored?": Unfortunately not and this is why I added the warning in the documentation that unlike with StructuredArguments MDC will not be ignored.

The reason for this is a bit technical (see this issue message). Within the Logger utility we are using MDC.put ourselves and since MDC.put comes from the SLF4J standard logging library we have no way to intercept / modify the behavior. We cannot differentiate in the log resolver whether it was set by us or the user of the library. I think we have to live with this behavior but can at least protect users when using StructuredArguments.

This comes down to the tradeoff between using Java standards vs. custom interfaces / implementations. The value proposition of MDC is that it will automatically work for new users onboarding with Powertools for AWS because there is a high chance they are already using MDC for their logs before adopting Powertools for AWS.

@hjgraca
Copy link
Contributor

hjgraca commented Apr 23, 2025

in TS and I think other runtimes, we are emitting a warning when we ignore a key to let the customer know there's been data loss - see here. Was this omission intentional?

Yes, it was intentional because I didn't want to spam the user's logs with warning messages on each log emission where this happens. If you have a strong opinion on adding a warning I can do so. I believe in .NET there is a warning in the documentation only – similar to what I propose here.

Correct, we do not log any warning to user.

@dreamorosi
Copy link
Contributor

Thanks for the explanations.

As a customer, if I am emitting a log with a key and that key is dropped, I would want to know. Putting myself in the shoes of an operator, having a warning that tells me this has happened helps me understand.

The opposite means: 1/ I see a missing key from my logs when I needed it, 2/ I think there's an issue with my code, 3/ I spend time troubleshooting, 4/ maybe, after a long while, I understand this is because of a Powertools specific behavior, 5/ I'm now frustrated because I wasted time.

Since we are saying we're excluding MDC from this logic (more on this in a sec), then we'd not be spamming the customer since the warning would be local to a specific log entry (aka it's emitted once). Besides, warnings are subject to log level, so I could silence them if I really don't want them.

Alternatively, we could also have a "warn once" logic that emits a warning at most once whenever a key filtering action happens the first time. This way we still inform the customer, but keep warnings at minimum. This is still a better option than saying nothing in my opinion.

Regarding MDC, this is again a limitation of knowledge in the language/ecosystem, but does this code (from the docs) add the keys to that one specific log entry, or does it add to all the subsequent ones?

import static software.amazon.lambda.powertools.logging.argument.StructuredArguments.entry;
import static software.amazon.lambda.powertools.logging.argument.StructuredArguments.entries;

public class PaymentFunction implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final Logger LOGGER = LoggerFactory.getLogger(AppLogResponse.class);

    @Logging
    public APIGatewayProxyResponseEvent handleRequest(final APIGatewayProxyRequestEvent input, final Context context) {
        // ...
        LOGGER.info("Collecting payment", entry("orderId", order.getId()));

        // ...
        Map<String, String> customKeys = new HashMap<>();
        customKeys.put("paymentId", payment.getId());
        customKeys.put("amount", payment.getAmount);
        LOGGER.info("Payment successful", entries(customKeys));
    }
}

@phipag
Copy link
Contributor Author

phipag commented Apr 23, 2025

Regarding MDC, this is again a limitation of knowledge in the language/ecosystem, but does this code (from the docs) add the keys to that one specific log entry, or does it add to all the subsequent ones?

This code will add the keys only to those specific log entries (this is what I refer to as StructuredArguments). If you want to add keys to all log entries you can do this – and this is where we can't prevent users from adding reserved keys:

public class CreditCardFunction implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final Logger LOGGER = LoggerFactory.getLogger(CreditCardFunction.class);

    @Logging(clearState = true)
    public APIGatewayProxyResponseEvent handleRequest(final APIGatewayProxyRequestEvent input, final Context context) {
        // This will be attached to each log
        MDC.put("cardNumber", card.getId());
        LOGGER.info("Updating card information");
        // ...
    }
}

Besides, warnings are subject to log level, so I could silence them if I really don't want them.

This is a strong argument, I can send an update emitting a warning log.

Copy link

sonarqubecloud bot commented Apr 24, 2025

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

@phipag
Copy link
Contributor Author

phipag commented Apr 24, 2025

Hey @hjgraca and @dreamorosi,

thank you again for your comments. I sent an update addressing your comments. While working on this I also improved the implementation by moving the reserved key logic out of the respective log4j/logback implementation to the StructuredArguments class which removes a lot of redundant code. This also allowed making each implementation of a structured argument (e.g. MapArgument) package private reducing the public API surface area.

I also deployed a preview of the documentation here to avoid any confusion: https://dealn7fl31ram.cloudfront.net/

Let me summarize the current behavior with an example:

StructuredArguments

    @Logging(clearState = true)
    public String handleRequest(final Object input, final Context context) {
        LOGGER.info("My message.", StructuredArguments.entry("message", "My duplicated message."),
                StructuredArguments.entry("message2", "My second message."),
                StructuredArguments.entry("service", "My duplicated service."),
                StructuredArguments.entry("function_name", "My duplicated function_name"));
        return "OK";
    }

As discussed, this will drop the reserved keys (message, service, function_name) and log exactly one warning for each offending key. This warning can be suppressed by the user when filtering by log-level.

{
  "level": "WARN",
  "message": "Attempted to use reserved key 'message' in structured argument. This key will be ignored.",
  "cold_start": true,
  "function_arn": "arn:aws:lambda:us-east-1:012345678912:function:HelloWorldFunction",
  "function_memory_size": 512,
  "function_name": "HelloWorldFunction",
  "function_request_id": "03f6ec1f-51dd-4a63-87de-8831ca1fd883",
  "function_version": "$LATEST",
  "service": "service_undefined",
  "timestamp": "2025-04-24T10:54:40.120Z"
}
{
  "level": "WARN",
  "message": "Attempted to use reserved key 'service' in structured argument. This key will be ignored.",
  "cold_start": true,
  "function_arn": "arn:aws:lambda:us-east-1:012345678912:function:HelloWorldFunction",
  "function_memory_size": 512,
  "function_name": "HelloWorldFunction",
  "function_request_id": "03f6ec1f-51dd-4a63-87de-8831ca1fd883",
  "function_version": "$LATEST",
  "service": "service_undefined",
  "timestamp": "2025-04-24T10:54:40.121Z"
}
{
  "level": "WARN",
  "message": "Attempted to use reserved key 'function_name' in structured argument. This key will be ignored.",
  "cold_start": true,
  "function_arn": "arn:aws:lambda:us-east-1:012345678912:function:HelloWorldFunction",
  "function_memory_size": 512,
  "function_name": "HelloWorldFunction",
  "function_request_id": "03f6ec1f-51dd-4a63-87de-8831ca1fd883",
  "function_version": "$LATEST",
  "service": "service_undefined",
  "timestamp": "2025-04-24T10:54:40.122Z"
}
{
  "level": "INFO",
  "message": "My message.",
  "cold_start": true,
  "function_arn": "arn:aws:lambda:us-east-1:012345678912:function:HelloWorldFunction",
  "function_memory_size": 512,
  "function_name": "HelloWorldFunction",
  "function_request_id": "03f6ec1f-51dd-4a63-87de-8831ca1fd883",
  "function_version": "$LATEST",
  "service": "service_undefined",
  "timestamp": "2025-04-24T10:54:40.123Z",
  "message2": "My second message."
}

Mapped Diagnostic Context MDC

Here, we have the technical limitation of not being able to prevent the user from doing this and therefore added a warning about this behavior in the documentation.

    @Logging(clearState = true)
    public String handleRequest(final Object input, final Context context) {
        MDC.put("message", "My duplicated message.");
        LOGGER.info("My message.");
        return "OK";
    }

This code will render the message key twice in the JSON output:

{"level":"INFO","message":"My message.","cold_start":true,"function_arn":"arn:aws:lambda:us-east-1:012345678912:function:HelloWorldFunction","function_memory_size":512,"function_name":"HelloWorldFunction","function_request_id":"e7a2a7f2-b9db-49d4-929f-72a89f002334","function_version":"$LATEST","service":"service_undefined","timestamp":"2025-04-24T11:06:32.413Z","message":"My duplicated message."}

which will yield the following parsed JSON (in CloudWatch):

{
  "level": "INFO",
  "message": "My duplicated message.",
  "cold_start": true,
  "function_arn": "arn:aws:lambda:us-east-1:012345678912:function:HelloWorldFunction",
  "function_memory_size": 512,
  "function_name": "HelloWorldFunction",
  "function_request_id": "e7a2a7f2-b9db-49d4-929f-72a89f002334",
  "function_version": "$LATEST",
  "service": "service_undefined",
  "timestamp": "2025-04-24T11:06:32.413Z"
}

@dreamorosi dreamorosi self-requested a review April 24, 2025 14:59
Copy link
Contributor

@dreamorosi dreamorosi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my feedback and adding the warnings.

This MDC thing is unfortunate, and I wish there was a better solution, but for now we'll keep it as it is.

@phipag phipag merged commit 374b38d into v2 Apr 24, 2025
12 checks passed
@phipag phipag deleted the phipag/logging-reserved-keys-v2 branch April 24, 2025 15:46
@github-project-automation github-project-automation bot moved this from Pending review to Coming soon in Powertools for AWS Lambda (Java) Apr 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working logger size/L
Projects
Status: Coming soon
Development

Successfully merging this pull request may close these issues.

Bug: v2 StructuredArguments allow logging of reserved keys
3 participants