
Feature request: sequential async processing #1829


Closed
1 of 2 tasks
revmischa opened this issue Dec 21, 2023 · 11 comments · Fixed by #3109
Assignees
Labels
batch This item relates to the Batch Processing Utility completed This item is complete and has been merged/shipped feature-request This item refers to a feature request for an existing or new utility

Comments

@revmischa

Use case

Sometimes I have records that I want processed one at a time, but my processor function happens to be async.

Solution/User Experience

It would be nice to request sequential processing of records with async handlers.

Alternative solutions

No response

Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

@revmischa revmischa added triage This item has not been triaged by a maintainer, please wait feature-request This item refers to a feature request for an existing or new utility labels Dec 21, 2023
@dreamorosi dreamorosi added discussing The issue needs to be discussed, elaborated, or refined batch This item relates to the Batch Processing Utility and removed triage This item has not been triaged by a maintainer, please wait labels Dec 21, 2023
@dreamorosi
Contributor

Hi @revmischa, thank you for taking the time to open this issue.

As we discussed in Discord, I think the request is valid and I remember hearing it from other customers in the past few weeks.

I'm adding this to the backlog so that it can be picked up.

If anyone is interested in contributing, please leave a comment so we can discuss an implementation.

@revmischa
Author

I also have a use case where I have an async handler but need to process FIFO events sequentially.

I think it's a safe assumption that almost every handler anyone will ever write will be async. How else would you do I/O? And what is the use of a handler that can't do I/O?

@dreamorosi
Contributor

That's very fair, we are focused on releasing v2 this week and next.

After that we'll be able to resume working on new features for the existing utilities. This is one of the issues I'd like to pick up relatively soon.

@dreamorosi dreamorosi added confirmed The scope is clear, ready for implementation and removed discussing The issue needs to be discussed, elaborated, or refined labels Feb 21, 2024
@dreamorosi dreamorosi added the help-wanted We would really appreciate some support from community for this one label Jun 3, 2024
@arnabrahman
Contributor

arnabrahman commented Jul 2, 2024

I can work on this next. I can see there is a section in the docs about async processing:

  • If your function is async, returning a Promise, use BatchProcessor and processPartialResponse
  • If your function is not async, use BatchProcessorSync and processPartialResponseSync

So, based on the PR description, do we now want to have the option to use BatchProcessorSync and processPartialResponseSync in an async function? It would be helpful if I could have some more context. @dreamorosi, whenever you are free.

@revmischa
Author

Also, just curious: what is the use case for sync processing? You can't really do I/O without async, right? So what use is an SQS processing function that can't do any I/O?

@dreamorosi
Contributor

Hi @arnabrahman - thank you for reviving the conversation on this feature request.

When we initially ported the Batch Processing utility from the Python version of Powertools for AWS Lambda, we mirrored their preferred patterns: we made the synchronous & sequential processor the default, and the asynchronous & parallel one the alternative.

In hindsight, this was a mistake because - as @revmischa points out - in modern Node.js working with async/await and promises is the de facto standard when dealing with I/O.

In the next release, and before the utility was considered generally available, we corrected this and made the BatchProcessor asynchronous by default, with the sync one (BatchProcessorSync) as the secondary option.

Currently the async processing only supports processing the items in the batch in parallel (implementation is here).

For example, today you can do this, which will call the recordHandler on each item in the batch in parallel:

import {
  BatchProcessor,
  EventType,
  processPartialResponse,
} from '@aws-lambda-powertools/batch';
import type { SQSRecord, SQSHandler } from 'aws-lambda';

const processor = new BatchProcessor(EventType.SQS);

const recordHandler = async (record: SQSRecord): Promise<void> => {
  // ... do your async processing
};
 
export const handler: SQSHandler = async (event, context) =>
  processPartialResponse(event, recordHandler, processor, {
    context,
  });

As part of this feature request we should allow customers to also use sequential processing, with a flag similar to this:

import {
  BatchProcessor,
  EventType,
  processPartialResponse,
} from '@aws-lambda-powertools/batch';
import type { SQSRecord, SQSHandler } from 'aws-lambda';

const processor = new BatchProcessor(EventType.SQS);

const recordHandler = async (record: SQSRecord): Promise<void> => {
  // ... do your async processing
};
 
export const handler: SQSHandler = async (event, context) =>
  processPartialResponse(event, recordHandler, processor, {
    context,
    processInParallel: false // new flag, name to be confirmed
  });

I'm not 100% sold on the name of the option being processInParallel, but with it I think we should convey the following:

  • by default, when using processPartialResponse & BatchProcessor items are processed in parallel - this is to maintain backwards-compatibility (we might decide to change the default to sequential in the next major version, but that's a separate discussion)
  • by using this new option I'm opting out of the default behavior and instead choosing to have the utility call my async record handler sequentially following the order of the items as they appear in the batch.

Consequently, regardless of the name we choose for the option, we will have to modify the process() method in the BatchProcessor class to check the value of the new option, and when opted-out, call & await each promise sequentially.

Regarding the BatchProcessorSync and processPartialResponseSync, I don't think there will have to be any changes for this new feature to be added.
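To make that concrete, here is a minimal, self-contained sketch of the branching described above. This is not the library's actual code: processRecords is a hypothetical stand-in for the processor's process() method, and processInParallel is the still-unconfirmed option name from this thread.

```typescript
type RecordHandler<T> = (record: T) => Promise<void>;

// Sketch of a process() method that branches on a hypothetical
// `processInParallel` option (defaulting to true for backwards-compatibility).
async function processRecords<T>(
  records: T[],
  handler: RecordHandler<T>,
  options: { processInParallel?: boolean } = {}
): Promise<PromiseSettledResult<void>[]> {
  const { processInParallel = true } = options;

  if (processInParallel) {
    // Default behavior: start all handlers at once and wait for all to settle
    return Promise.allSettled(records.map((record) => handler(record)));
  }

  // Opt-out behavior: await each handler in batch order before starting the next
  const results: PromiseSettledResult<void>[] = [];
  for (const record of records) {
    try {
      await handler(record);
      results.push({ status: 'fulfilled', value: undefined });
    } catch (reason) {
      results.push({ status: 'rejected', reason });
    }
  }
  return results;
}
```

The key difference is that the sequential branch never starts handler N+1 until handler N has settled, which is what FIFO use cases need.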

@mpiltz

mpiltz commented Aug 27, 2024

I ran into this same issue and solved it by writing my own implementation that extends the BasePartialBatchProcessor.
This was quite easy thanks to the really simple and clean implementations from this project 👍🏻 .
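For readers who want a similar workaround before the feature ships, here is a rough self-contained sketch of that approach. It uses a simplified stand-in base class rather than the library's actual BasePartialBatchProcessor (whose real API differs, with failure tracking and partial-response building); only the overall shape of "subclass and await records in order" is illustrated.

```typescript
// Simplified stand-in for the library's base class (the real
// BasePartialBatchProcessor has a richer API than this).
abstract class SimpleBatchProcessor<T> {
  public successes: T[] = [];
  public failures: { record: T; error: unknown }[] = [];

  public abstract process(
    records: T[],
    handler: (record: T) => Promise<void>
  ): Promise<void>;
}

// Custom processor that awaits each record in batch order instead of in parallel
class SequentialBatchProcessor<T> extends SimpleBatchProcessor<T> {
  public async process(
    records: T[],
    handler: (record: T) => Promise<void>
  ): Promise<void> {
    for (const record of records) {
      try {
        await handler(record);
        this.successes.push(record);
      } catch (error) {
        // A failed record does not stop the rest of the batch
        this.failures.push({ record, error });
      }
    }
  }
}
```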

@am29d
Contributor

am29d commented Aug 28, 2024

Hey @mpiltz , that's good to hear, glad it worked well for you! Are you interested in contributing the changes in a PR? Happy to collaborate and iterate together.

@arnabrahman
Contributor

I have encountered this issue at work and now understand what's needed. I will take this @dreamorosi

Contributor

⚠️ COMMENT VISIBILITY WARNING ⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@github-actions github-actions bot added pending-release This item has been merged and will be released soon and removed help-wanted We would really appreciate some support from community for this one confirmed The scope is clear, ready for implementation labels Sep 28, 2024
Contributor

github-actions bot commented Oct 8, 2024

This is now released under v2.9.0 version!

@github-actions github-actions bot added completed This item is complete and has been merged/shipped and removed pending-release This item has been merged and will be released soon labels Oct 8, 2024
@dreamorosi dreamorosi moved this from Coming soon to Shipped in Powertools for AWS Lambda (TypeScript) Oct 9, 2024