feat(batch): sequential async processing of records for BatchProcessor #3109

Merged

arnabrahman
Contributor

@arnabrahman arnabrahman commented Sep 25, 2024

Summary

By default, the BatchProcessor processes records in parallel using Promise.all(). We want to give users an option to process records sequentially.

Changes

  • Introduce a new batch processing option, processInParallel, which indicates whether records should be processed in parallel or sequentially.
  • I believe this option should only be available for the BatchProcessor class, but I couldn't restrict the optional property to it because BatchProcessor and BatchProcessorSync share a similar class signature. I did omit the property for SqsFifoPartialProcessor, though.
  • Records are processed in parallel or sequentially based on the processInParallel option. If processInParallel is not provided, records are processed in parallel by default. The change is made in the BasePartialProcessor class.
  • For the tests, I restructured the async processing test cases. I ran all the tests for both parallel and sequential processing by introducing describe.each(). My rationale is that every feature for BatchProcessor should behave the same, regardless of whether records are processed in parallel or sequentially. This approach also avoids duplicating test cases for both options.
  • Updated the documentation for processInParallel and made additional changes where necessary, using my judgment.
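
To make the difference between the two modes concrete, here is a minimal, self-contained sketch of the behavior described above. It is not the Powertools implementation: `processRecords`, `RecordHandler`, and `ProcessingOptions` are hypothetical names used only for illustration, and the sketch leaves out the per-record failure tracking that a real batch processor needs.

```typescript
// Hypothetical names throughout; this is an illustration, not the library source.
type RecordHandler<T, R> = (record: T) => Promise<R>;

interface ProcessingOptions {
  // Mirrors the option described in this PR; parallel when omitted.
  processInParallel?: boolean;
}

async function processRecords<T, R>(
  records: T[],
  handler: RecordHandler<T, R>,
  options: ProcessingOptions = {}
): Promise<R[]> {
  const { processInParallel = true } = options;

  if (processInParallel) {
    // Start every handler right away, then wait for all of them to settle.
    return Promise.all(records.map((record) => handler(record)));
  }

  // Wait for each handler to finish before starting the next one.
  const results: R[] = [];
  for (const record of records) {
    results.push(await handler(record));
  }
  return results;
}
```

Both branches return results in input order. What changes is when each handler starts: in the sequential branch, the handler for a record does not begin until the previous record has finished.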

Issue number: #1829


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@arnabrahman arnabrahman requested a review from a team September 25, 2024 05:41
@arnabrahman arnabrahman requested a review from a team as a code owner September 25, 2024 05:41
@boring-cyborg boring-cyborg bot added batch This item relates to the Batch Processing Utility documentation Improvements or additions to documentation tests PRs that add or change tests labels Sep 25, 2024
@pull-request-size pull-request-size bot added the size/XXL PRs with 1K+ LOC, largely documentation related label Sep 25, 2024
@github-actions github-actions bot added the feature PRs that introduce new features or minor changes label Sep 25, 2024
@dreamorosi
Contributor

Hi @arnabrahman, thank you for the PR!

I will start reviewing it tomorrow morning - I was on PTO for the first half of the week.

Regarding the SonarCloud findings, I think we can simplify the structure of the tests and remove the findings by just changing it to something like:

describe('Asynchronously processing', () => {
  // One entry per processing mode; each test below runs once per entry.
  const cases = [
    {
      description: 'in parallel',
      options: { processInParallel: true },
    },
    {
      description: 'sequentially',
      options: { processInParallel: false },
    },
  ];

  it.each(cases)('SQS Records $description', ({ options }) => {
    // Shared test body for both modes goes here, using `options`.
  });
});

or similar
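
The same table-driven idea can be sketched without a test framework. The example below is only an illustration under stated assumptions: `runBatch` is a hypothetical stand-in for the processor under test and `checkAllCases` is a hypothetical helper. Neither is part of the Powertools API or of this PR's test suite.

```typescript
// Hypothetical stand-in for the processor under test; not the Powertools API.
async function runBatch(
  records: number[],
  options: { processInParallel: boolean }
): Promise<number[]> {
  const double = async (n: number): Promise<number> => n * 2;
  if (options.processInParallel) {
    return Promise.all(records.map((record) => double(record)));
  }
  const results: number[] = [];
  for (const record of records) {
    results.push(await double(record));
  }
  return results;
}

const cases = [
  { description: 'in parallel', options: { processInParallel: true } },
  { description: 'sequentially', options: { processInParallel: false } },
];

// Run one shared expectation against every processing mode.
async function checkAllCases(): Promise<string[]> {
  const passed: string[] = [];
  for (const { description, options } of cases) {
    const results = await runBatch([1, 2, 3], options);
    if (JSON.stringify(results) !== JSON.stringify([2, 4, 6])) {
      throw new Error(`Unexpected results when processing ${description}`);
    }
    passed.push(description);
  }
  return passed;
}
```

Keeping the expectation in one place means a behavior change that affects only one mode shows up as a failure for that case, without a duplicated test body.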

@dreamorosi dreamorosi linked an issue Sep 26, 2024 that may be closed by this pull request
2 tasks
@arnabrahman
Contributor Author

@dreamorosi No worries, I will take a look at this tomorrow.

@pull-request-size pull-request-size bot added size/XL PRs between 500-999 LOC, often PRs that grown with feedback and removed size/XXL PRs with 1K+ LOC, largely documentation related labels Sep 27, 2024
Contributor

@dreamorosi dreamorosi left a comment

Amazing work on this PR @arnabrahman, appreciate your help!

Thank you also for updating the docs with the new feature 🎉

@dreamorosi dreamorosi merged commit e31279a into aws-powertools:main Sep 27, 2024
19 checks passed
@arnabrahman arnabrahman deleted the 1829-sequential-async-processing branch September 27, 2024 15:07
Labels
batch This item relates to the Batch Processing Utility documentation Improvements or additions to documentation feature PRs that introduce new features or minor changes size/XL PRs between 500-999 LOC, often PRs that grown with feedback tests PRs that add or change tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: sequential async processing
2 participants