Feature request: On batch processing, fill in processor result even if BatchProcessingError is raised #3716

Closed
1 of 2 tasks
nico00 opened this issue Feb 6, 2024 · 7 comments
Labels
feature-request feature request

Comments

@nico00

nico00 commented Feb 6, 2024

Use case

According to the documentation (https://docs.powertools.aws.dev/lambda/python/latest/utilities/batch/#partial-failure-mechanics), BatchProcessingError is raised when every record in the batch fails to be processed. In that case the processor response is left empty, as if all records had been processed successfully. Populating the processor response with the list of failed records would make it easier to reprocess them.

Solution/User Experience

I suggest populating the processor response before raising BatchProcessingError (in class BasePartialBatchProcessor).
This would leave the programmer free to decide how to handle the failure according to their own business case.

Current approach:

    def _clean(self):
        """
        Report messages to be deleted in case of partial failure.
        """

        if not self._has_messages_to_report():
            return

        if self._entire_batch_failed():
            raise BatchProcessingError(
                msg=f"All records failed processing. {len(self.exceptions)} individual errors logged "
                f"separately below.",
                child_exceptions=self.exceptions,
            )

        messages = self._get_messages_to_report()
        self.batch_response = {"batchItemFailures": messages}

Proposed solution:

    def _clean(self):
        """
        Report messages to be deleted in case of partial failure.
        """

        if not self._has_messages_to_report():
            return

        messages = self._get_messages_to_report()
        self.batch_response = {"batchItemFailures": messages}

        if self._entire_batch_failed():
            raise BatchProcessingError(
                msg=f"All records failed processing. {len(self.exceptions)} individual errors logged "
                f"separately below.",
                child_exceptions=self.exceptions,
            )
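
To illustrate what the reordering would give the caller, here is a minimal, self-contained sketch. It does not use the real library: `ToyProcessor`, its `process` method, the stand-in `BatchProcessingError`, and the `messageId` / `itemIdentifier` record shape are all assumptions made for this example. Only the order of operations inside `_clean` mirrors the proposal above.

```python
from typing import Any, Callable, Dict, List


class BatchProcessingError(Exception):
    """Stand-in for the library exception (assumption, not the real class)."""

    def __init__(self, msg: str, child_exceptions: List[Exception]) -> None:
        super().__init__(msg)
        self.child_exceptions = child_exceptions


class ToyProcessor:
    """Hypothetical, simplified processor used only to show the ordering."""

    def __init__(self) -> None:
        self.batch_response: Dict[str, List[Dict[str, str]]] = {"batchItemFailures": []}
        self.exceptions: List[Exception] = []
        self._failed_ids: List[str] = []
        self._total = 0

    def process(self, records: List[Dict[str, Any]], handler: Callable[[Dict[str, Any]], None]) -> None:
        self._total = len(records)
        for record in records:
            try:
                handler(record)
            except Exception as exc:  # collect every per-record failure
                self.exceptions.append(exc)
                self._failed_ids.append(record["messageId"])
        self._clean()

    def _clean(self) -> None:
        if not self._failed_ids:
            return

        # Proposed order: fill in the response first ...
        self.batch_response = {
            "batchItemFailures": [{"itemIdentifier": i} for i in self._failed_ids]
        }

        # ... and only then raise if the entire batch failed.
        if len(self._failed_ids) == self._total:
            raise BatchProcessingError(
                msg=f"All records failed processing. {len(self.exceptions)} individual errors.",
                child_exceptions=self.exceptions,
            )


def always_fail(record: Dict[str, Any]) -> None:
    raise ValueError(f"cannot process {record['messageId']}")


processor = ToyProcessor()
records = [{"messageId": "a"}, {"messageId": "b"}]
failed: List[str] = []
try:
    processor.process(records, always_fail)
except BatchProcessingError:
    # With the proposed order, the failed identifiers are still available here.
    failed = [item["itemIdentifier"] for item in processor.batch_response["batchItemFailures"]]
```

With the current order, the same `except` branch would find the response still empty, which is the situation described in the use case.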

Alternative solutions

No response

Acknowledgment

boring-cyborg bot commented Feb 6, 2024

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link

@nico00 nico00 changed the title from "Feature request: On batch processing, fill in processor result event if BatchProcessingError is raised" to "Feature request: On batch processing, fill in processor result even if BatchProcessingError is raised" Feb 6, 2024
@sthulb
Contributor

sthulb commented Feb 7, 2024

Thanks for raising this @nico00.

I guess it depends on what you're trying to do with them – Lambda will auto redrive to reprocess these messages at a service level.

@nico00
Author

nico00 commented Feb 9, 2024

That's correct, but in that case Lambda is limited to two retries, while a DynamoDB stream allows up to 10,000 retries. On the other hand, I see no downside to filling in the processor result just before raising BatchProcessingError.

@heitorlessa
Contributor

hey @nico00, please allow me to ask some clarifying questions

BatchProcessingError is raised when all records failed to be processed. In such case, processor response appears empty, as all records have been successfully processed.

It's technically a Lambda invocation failure, as recommended by the Lambda team. The Lambda Poller picks up the error and considers the entire batch a failure; there is no empty response in this case.

Are you experiencing an empty response instead of a BatchProcessingError? If so, it'd be a bug/regression on our side.

This would give the programmer the freedom to decide what to do according to various business cases.

Would you be able to expand with one or more examples to help us picture this better?

I'm trying to understand whether you want to intercept a BatchProcessingError - like you can with the context manager today - or something else entirely?

Thanks a lot!

@heitorlessa heitorlessa removed the triage Pending triage from maintainers label Feb 20, 2024
@heitorlessa heitorlessa moved this from Triage to Pending customer in Powertools for AWS Lambda (Python) Feb 20, 2024
@heitorlessa
Contributor

Also, before I forget, thank you for creating a feature request :) We always appreciate hearing from customers and learning what additional use cases can be unblocked (or made easier!) for everyone

@leandrodamascena
Contributor

This feature was added in the v2.41.0 release.

Docs: https://docs.powertools.aws.dev/lambda/python/latest/utilities/batch/#working-with-full-batch-failures

Closing as completed.
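
For readers landing here, a rough sketch of the behaviour that an opt-out of the full-batch exception implies. This is a toy model, not the library's implementation: `process_batch` and the record shape are hypothetical, and the flag name `raise_on_entire_batch_failure` is what I understand the linked docs section to describe, so treat it as an assumption and check the docs for the exact API.

```python
from typing import Any, Callable, Dict, List


class BatchProcessingError(Exception):
    """Stand-in for the library exception (assumption, not the real class)."""


def process_batch(
    records: List[Dict[str, Any]],
    handler: Callable[[Dict[str, Any]], None],
    raise_on_entire_batch_failure: bool = True,  # assumed flag name, see lead-in
) -> Dict[str, List[Dict[str, str]]]:
    """Toy model: run handler on each record and report the failed ones."""
    failures: List[Dict[str, str]] = []
    for record in records:
        try:
            handler(record)
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})

    entire_batch_failed = bool(records) and len(failures) == len(records)
    if entire_batch_failed and raise_on_entire_batch_failure:
        raise BatchProcessingError(
            f"All records failed processing. {len(failures)} individual errors."
        )

    # Partial failure, or full failure with the flag turned off:
    # return the failed identifiers instead of failing the invocation.
    return {"batchItemFailures": failures}


def always_fail(record: Dict[str, Any]) -> None:
    raise ValueError(f"cannot process {record['messageId']}")


records = [{"messageId": "a"}, {"messageId": "b"}]

# Flag off: the caller gets the full list of failed records back.
response = process_batch(records, always_fail, raise_on_entire_batch_failure=False)

# Default: the entire-batch failure still surfaces as an exception.
raised_by_default = False
try:
    process_batch(records, always_fail)
except BatchProcessingError:
    raised_by_default = True
```

Keeping the exception as the default preserves the invocation-failure signal discussed earlier in the thread, while the opt-out covers the reprocessing use case from the original request.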

@github-project-automation github-project-automation bot moved this from Pending customer to Coming soon in Powertools for AWS Lambda (Python) Aug 11, 2024
@anafalcao anafalcao moved this from Coming soon to Shipped in Powertools for AWS Lambda (Python) Jan 23, 2025