Skip to content

Batch Processor Bug - Batch Item Failure Report does not include all message IDs that failed processing #966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kimberlyamandalu opened this issue Jan 20, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@kimberlyamandalu
Copy link

kimberlyamandalu commented Jan 20, 2022

When returning the batchItemFailures dictionary, I think it is supposed to return a list of dictionaries. In my understanding, this is how the batchItemFailures object must look like:

{
    "batchItemFailures": [
        {
            "itemIdentifier": "msgId1"
        },
        {
            "itemIdentifier": "msgId2"
        }
    ]
}

I checked out the documentation here: https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#services-sqs-batchfailurereporting

While testing I noticed that the batch processor cleanup was always returning just one message ID from the batch, even when there are multiple messages that failed.

Looks like there might be a bug in the implementation where the failed message IDs are collected.
File:
https://github.com/awslabs/aws-lambda-powertools-python/blob/develop/aws_lambda_powertools/utilities/batch/base.py

Lines:
https://github.com/awslabs/aws-lambda-powertools-python/blob/27e502298795f5720f546e09011ae7290a50a443/aws_lambda_powertools/utilities/batch/base.py#L408-L422

I think the collection methods should be returning a list.

    def _collect_sqs_failures(self):
        if self.model:
            return [{"itemIdentifier": msg.messageId} for msg in self.fail_messages]
        return [{"itemIdentifier": msg.message_id} for msg in self.fail_messages]


    def _collect_kinesis_failures(self):
        if self.model:
            # Pydantic model uses int but Lambda poller expects str
            return [{"itemIdentifier": msg.kinesis.sequenceNumber} for msg in self.fail_messages]
        return [{"itemIdentifier": msg.kinesis.sequence_number} for msg in self.fail_messages]


    def _collect_dynamodb_failures(self):
        if self.model:
            return [{"itemIdentifier": msg.dynamodb.SequenceNumber} for msg in self.fail_messages]
        return [{"itemIdentifier": msg.dynamodb.sequence_number} for msg in self.fail_messages]

As part of the proposed changes above, the following line also needs to be updated to avoid returning a list of list of dictionaries.
https://github.com/awslabs/aws-lambda-powertools-python/blob/27e502298795f5720f546e09011ae7290a50a443/aws_lambda_powertools/utilities/batch/base.py#L388

        self.batch_response = {"batchItemFailures": messages}

I have mainly been working on SQS Batch Processing for my job so I was able to test the proposed changes locally only for the SQS collection method.

Please let me know if I am mistaken or this is a valid bug.
Thank you!

@boring-cyborg
Copy link

boring-cyborg bot commented Jan 20, 2022

Thanks for opening your first issue here! We'll come back to you as soon as we can.

@heitorlessa heitorlessa added area/batch bug Something isn't working labels Jan 20, 2022
@heitorlessa
Copy link
Contributor

🤦 thank you so so so much @kimberlyamandalu - This is a critical bug that it's hard to believe it passed multiple reviews and our tests were naive on that too.

Changing to a list within that exp comprehension would cause the same bug, because we have a static dict key itemIdentifier - last one would win, Python would discard the rest.

Fix: #967

@heitorlessa heitorlessa added the pending-release Fix or implementation already in dev waiting to be released label Jan 20, 2022
@github-actions
Copy link
Contributor

This is now released under 1.24.1 version!

@github-actions github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Jan 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants