Skip to content

RFC: Handling Large messages for SQS and SNS #1259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jeromevdl opened this issue Jun 30, 2023 · 4 comments
Closed

RFC: Handling Large messages for SQS and SNS #1259

jeromevdl opened this issue Jun 30, 2023 · 4 comments
Assignees
Labels
feature-request New feature or request priority:3 Neutral - not a core feature or affects less than 40% of users

Comments

@jeromevdl
Copy link
Contributor

Key information

Summary

We should handle large messages (with s3) for SQS and SNS and provide a better user experience (error handling, batch...) than today with the @SqsLargeMessage annotation.

Motivation

  • Lambda Powertools Java does not handle large message for SNS (doc, code)
  • Current annotation is global to the handler and handles all messages in a batch (code). Error handling is thus global. It would be better to have the annotation at the item level. See Feature request: Better failures handling while using both BatchProcessing and LargeMessageHandling #596.
  • Large message handling is currently mixed up with batch processing. While they can be use jointly, large message handling is not needed in all batch cases. As it requires additional libraries (payloadoffloading-common, s3 sdk), it should be an independent module we add specifically for this purpose.

Proposal

SQS

This feature should be considered in conjunction with #797 but should be in an individual module (eg. powertools-large-messages ?). The aim of #797 is to provide a simple interface so that developers can focus on handling each message of the batch while the library takes care of failures. This module should permit to handle large messages individually.

if #797 is providing the following interface:

public void processMessage(SQSMessage message, Context context) {
}

this new module should provide the annotation @LargeMessage to handle the message from S3 (implementation would be quite similar as today, replacing the body with the content of the s3 object):

@LargeMessage
public void processMessage(SQSMessage message, Context context) {
    // message.getBody() will contain the content of the S3 object
}

We can also keep the option to not delete the object in S3 (@LargeMessage(deletePayload = false)).

SNS

On SNS, despite SNSEvent has a list of records, there will always be only one message (see doc). The developer would have to create a method to handle a single SNSRecord (as first parameter):

@LargeMessage
public void processRecord(SNSRecord record, Context context) {
    // record.getSNS().getMessage() will contain the content of the S3 object
}

Drawbacks

For SNS, it forces the developer to have the following code in their handler:

public class App implements RequestHandler<SNSEvent, Object> {
    @Override
    public Void handleRequest(SNSEvent input, Context context) {
        processRecord(input.getRecords().get(0), context);
        return null;
    }

    @LargeMessage
    public void processRecord(SNSRecord record, Context context) {
        // record.getSNS().getMessage() will contain the content of the S3 object
    }
}

Rationale and alternatives

  • should we handle the annotation at the handler level? Would be easier for SNS but degraded for SQS (handling failures in batch mode)

Unresolved questions

@scottgerring
Copy link
Contributor

Hey @jeromevdl this looks good! Some thoughts below.

Modules

I think the separate module - powertools-large-messages - is critical. It addresses the underlying issue here - that we bundled functionality up together and now find it hard to split back out. We should favor having more small modules loosely coupled. Furthermore I think we should introduce this in v1 of the powertools, and mark powertools-sqs as deprecated with a migration guide once this and #797 are complete.

Serialization

if #797 is providing the following interface:

public void processMessage(SQSMessage message, Context context) {
}

In the other RFC, we also want to allow the user to provide a handler that takes the deserialized payload - e.g. here with the user defined type Basket.

    @Override
    public void processItem(Basket basket, Context context) {
        // do some stuff with the item
        System.out.println(basket.toString());
    }

In this RFC if the user wants to use large message handling, they will have to deserialize the payload themselves. Is there any way we can structure the impl so that we can could annotate a handler like this, but have the weaving reach in and 1/ expand the large message, and then 2/ deserialize the payload to the user type? Intuitively I think this would be at least hard, and involve coupling the new large-message module to the batch module so that we can find the right join point, which seems like a bad idea.

Should we target the lambda or the user function consuming the record itself?

IMHO the latter. This 1/ gives the user more flexibility with use of the feature (you see this already with the way you can use the existing SQS large message annotation on any function) and 2/ works better with #797 in batch mode as you observe. I think in general we should avoid making new features couple against the lambda handler because of this loss of flexibility.

SNS Single Message in a batch Issue

On SNS, despite SNSEvent has a list of records, there will always be only one message (see doc). The developer would have to create a method to handle a single SNSRecord (as first parameter):

We could make the batch handler handle SNS "batches" too, and then you could process a large message batch simply even though it would only ever be a single record. This is however poor from a 1/ complexity and 2/ performance perspective. I don't see another option here yet.

@jeromevdl jeromevdl added the priority:3 Neutral - not a core feature or affects less than 40% of users label Jul 17, 2023
@jeromevdl jeromevdl removed their assignment Jul 18, 2023
@scottgerring
Copy link
Contributor

LGTM. It would be nice if there was some way to use this and still have the inner-message deserialized to a user-defined type, but I don't see an obvious way to do this.

@mriccia
Copy link
Contributor

mriccia commented Jul 19, 2023

Looks good to me as well. Let's ensure we check that the @LargeMessage annotation is placed on the correct Method (needs to check we have SNSMessage or SQSMessage on the param, and not force the Context).

@jeromevdl
Copy link
Contributor Author

Done, will be released in 1.17.0 soon.

@scottgerring scottgerring moved this from Backlog to Coming soon in Powertools for AWS Lambda (Java) Aug 17, 2023
@jeromevdl jeromevdl moved this from Coming soon to Shipped in Powertools for AWS Lambda (Java) Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request New feature or request priority:3 Neutral - not a core feature or affects less than 40% of users
Projects
Status: Shipped
Development

No branches or pull requests

3 participants