Feature request: DynamoDB idempotency causes hot partitions #3781

Closed
2 tasks done
elliottohara opened this issue Mar 26, 2025 · 14 comments · Fixed by #3783
Labels
completed This item is complete and has been merged/shipped feature-request This item refers to a feature request for an existing or new utility good-first-issue Something that is suitable for those who want to start contributing idempotency This item relates to the Idempotency Utility

Comments

@elliottohara
Contributor

Use case

We have an existing DynamoDB table for our system that uses a composite key (pk, sk). When you set the sortKeyAttribute, all idempotency records for a given Lambda function end up sharing a single partition key, which results in hot partitions.

Solution/User Experience

Ideally we'd like the ability to inject or override the getKey method so we can control how the row is created, or just change the basic implementation to not create hot partitions with busy lambdas.

Alternative solutions

Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

@elliottohara elliottohara added feature-request This item refers to a feature request for an existing or new utility triage This item has not been triaged by a maintainer, please wait labels Mar 26, 2025

boring-cyborg bot commented Mar 26, 2025

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #typescript channel on our Powertools for AWS Lambda Discord: Invite link

@dreamorosi
Contributor

Hi @elliottohara thank you for opening this issue and proposing this idea.

I understand the problem we're trying to avoid, but I'm not yet sure what a solution would look like or what the best way to address it is.

Assuming you were able to provide your own logic, how would you like to implement it/customize it?

Having this example would help us understand if it's something we can improve internally and provide it as an integrated experience, or if it's something specific to your implementation - in which case we'd just make the method easier to extend/override.

Thank you!

@dreamorosi dreamorosi added idempotency This item relates to the Idempotency Utility discussing The issue needs to be discussed, elaborated, or refined need-response This item requires a response from a customer and will considered stale after 2 weeks and removed triage This item has not been triaged by a maintainer, please wait labels Mar 26, 2025
@elliottohara
Contributor Author

> Hi @elliottohara thank you for opening this issue and proposing this idea.
>
> I understand the problem we're trying to avoid, but I'm not yet sure what a solution would look like or what the best way to address it is.
>
> Assuming you were able to provide your own logic, how would you like to implement it/customize it?
>
> Having this example would help us understand if it's something we can improve internally and provide it as an integrated experience, or if it's something specific to your implementation - in which case we'd just make the method easier to extend/override.
>
> Thank you!

If I were doing it, I'd set the partition key to the same value used as the Id on a table without a composite key, and set the sort key to anything really (probably what the sort key currently is).

I struggle to see the value of being able to query all idempotency records for a given Lambda function (which is what the current solution provides). What I'm proposing simply means you can't get that data without a scan, which is fine by me.

@dreamorosi
Contributor

dreamorosi commented Mar 26, 2025

I see, thanks for clarifying.

Does setting a prefix at the function level using this parameter help reduce the pressure on the partition? At the very least with this you can have one partition for each function.

Alternatively, just to understand better, is using a table without sort key an option?

@dreamorosi
Contributor

On second thought - what I suggested with the prefix would work only if you're sharing the same table with multiple Lambda functions.

If you only have one function, then it won't change anything.

@dreamorosi
Contributor

dreamorosi commented Mar 26, 2025

> I struggle to see the value of being able to query all idempotency records for a given Lambda function (which is what the current solution provides). What I'm proposing simply means you can't get that data without a scan, which is fine by me.

This key design is meant to allow querying a specific item with a single GetItem operation - the fact that you can also query all the records for a function is just a side effect.

Just to clarify, here we're specifically talking about composite keys. If you can use a single key in your table, none of this is a problem and the Idempotency utility will generate a unique PK based on your payload hash.

If instead you're using a composite key, which is optional, then this is the best key design we could come up with while having the requirement of being able to query an item in a single operation. Keep in mind that once your table has a composite key DDB only allows you to do a GetItem/PutItem using both.

Now with this in mind, since we derive the sk from your payload and Lambda is stateless (i.e. there's no guarantee that your next request will be served by the same environment), the pk must be static, or we wouldn't know how to query that item back.
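
As a rough illustration of the key design described above, here is a minimal sketch. It is not the utility's actual code: `staticPkKey` and the `fnv1a` hash are hypothetical stand-ins, and the `idempotency#<function>` prefix is only assumed from the format mentioned elsewhere in this thread. It shows that any execution environment can recompute the same key from the payload alone, and that every payload for a function lands on the same pk.

```typescript
// Stand-in for the payload hash (assumption: the real utility uses its own hashing).
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

type CompositeKey = { pk: string; sk: string };

// Hypothetical model of the layout: static partition key per function,
// sort key derived from the payload.
function staticPkKey(functionName: string, payload: unknown): CompositeKey {
  return {
    pk: `idempotency#${functionName}`,
    sk: fnv1a(JSON.stringify(payload)),
  };
}

const first = staticPkKey("orders", { orderId: 1 });
const second = staticPkKey("orders", { orderId: 2 });
const retry = staticPkKey("orders", { orderId: 1 });

// A retry recomputes exactly the same key, so one GetItem is enough.
console.log(retry.pk === first.pk && retry.sk === first.sk);
// Different payloads still share the partition key, which is the hot-partition concern.
console.log(first.pk === second.pk, first.sk !== second.sk);
```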

If we were to introduce scan operations just to check if an item exists, we would need to scan the table for every request, which is bad for both costs and performance.

With that said, if you have any suggestions on how to improve this we'd be happy to do so.

@leandrodamascena
Contributor

Hi @elliottohara, thanks for opening this issue! I would like to bring some thoughts here and probably enrich the discussion.

> I struggle to see the value of being able to query all idempotency records for a given Lambda function (which is what the current solution provides). What I'm proposing simply means you can't get that data without a scan, which is fine by me.

When performing a Scan operation in DynamoDB, we need to handle pagination, since the results may span multiple pages. A scan can retrieve anywhere from a single item to thousands of items, depending on the size of the table, which makes the process unpredictable and resource-intensive. As a result, customers may end up paying significantly more due to the read capacity units consumed by large scans. Lambda functions executing these scans can also take much longer to complete (e.g., seconds instead of milliseconds), which further impacts performance and cost.
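
To make the pagination point concrete, here is a small in-memory simulation. It does not call DynamoDB: `scanPage` and `findByScan` are hypothetical helpers, and the page size of 100 items is an arbitrary assumption. It only shows how the number of pages read grows with the table when an existence check has to walk the data page by page.

```typescript
type Item = { id: string };
type Page = { items: Item[]; lastEvaluatedIndex: number | undefined };

// Hypothetical stand-in for one paginated scan call over an in-memory table.
function scanPage(table: Item[], pageSize: number, start: number): Page {
  const end = Math.min(start + pageSize, table.length);
  return {
    items: table.slice(start, end),
    lastEvaluatedIndex: end < table.length ? end : undefined,
  };
}

// Walks pages until the item is found or the table is exhausted.
function findByScan(
  table: Item[],
  id: string,
  pageSize: number,
): { found: boolean; pagesRead: number } {
  let pagesRead = 0;
  let cursor: number | undefined = 0;
  while (cursor !== undefined) {
    const page: Page = scanPage(table, pageSize, cursor);
    pagesRead++;
    if (page.items.some((item) => item.id === id)) {
      return { found: true, pagesRead };
    }
    cursor = page.lastEvaluatedIndex;
  }
  return { found: false, pagesRead };
}

const table: Item[] = Array.from({ length: 1000 }, (_, i) => ({ id: `item-${i}` }));
const hit = findByScan(table, "item-999", 100);
const miss = findByScan(table, "missing", 100);
console.log(hit, miss);
```

In this simulation a record that does not exist yet, which is the common case for a first invocation, forces a read of every page before the check can answer.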

Current State

The current implementation of Idempotency with composite keys (PK + SK) can cause hot partitions because the partition key is static. This design forces queries to use a fixed PK format (e.g., "idempotency#function-name"), which can lead to hot partitions in very specific cases. While adding dynamic data to the PK could alleviate this problem, retrieving that dynamic part at query time becomes challenging, especially since Lambda execution environments may change between requests.

Some brainstorming

Off the top of my head, I'm thinking of a solution that involves a two-step record creation process when customers opt for composite keys. First, store a record with the PK containing the calculated payload_idempotency_key and a new dynamic_data_to_sk field (not the final name). Then, create a second record using this dynamic_data_to_sk value as the PK, combined with the SK. This approach would distribute data more evenly across partitions, reducing the risk of hot partitions. However, it comes with trade-offs: increased complexity, additional queries and inserts, and potential impact on existing implementations.
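
One possible reading of this two-step idea, sketched against an in-memory map rather than a real table. Everything here is an assumption made for illustration: the `POINTER_SK` value, the `Row` shape, and the `saveTwoStep`/`loadTwoStep` helpers are hypothetical, and the comment above does not specify what the sort key of the first record would be.

```typescript
type Row = { pk: string; sk: string; dynamicData?: string; status?: string };

// In-memory stand-in for a composite-key table, indexed by "pk|sk".
const rows = new Map<string, Row>();

function put(row: Row): void {
  rows.set(`${row.pk}|${row.sk}`, row);
}

function get(pk: string, sk: string): Row | undefined {
  return rows.get(`${pk}|${sk}`);
}

// Assumption: the first (pointer) record uses a fixed, well-known sort key.
const POINTER_SK = "pointer";

function saveTwoStep(payloadKey: string, dynamicValue: string, status: string): void {
  // Step 1: pointer record keyed by the payload-derived idempotency key.
  put({ pk: payloadKey, sk: POINTER_SK, dynamicData: dynamicValue });
  // Step 2: the actual record, partitioned by the dynamic value.
  put({ pk: dynamicValue, sk: payloadKey, status });
}

function loadTwoStep(payloadKey: string): Row | undefined {
  // First read resolves the dynamic part, second read fetches the record.
  const pointer = get(payloadKey, POINTER_SK);
  if (pointer === undefined || pointer.dynamicData === undefined) return undefined;
  return get(pointer.dynamicData, payloadKey);
}

saveTwoStep("abc123", "env-42", "COMPLETED");
console.log(loadTwoStep("abc123"));
```

The sketch makes the trade-off visible: each save becomes two writes and each lookup becomes two reads.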

To be honest, I don't have a final opinion on this issue yet, and @dreamorosi's idea of a PK-only table seems to be the most viable solution for now. But of course, I'd like to hear more from you, and whether you have another idea to solve this.

@elliottohara
Contributor Author

Why not just concatenate partition key and sort key and set that for both of them? That would 100% solve my problem.
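
A minimal sketch of what this concatenation could look like. `concatenatedKey` and the `#` separator are hypothetical choices made for illustration, not the utility's API.

```typescript
type CompositeKey = { pk: string; sk: string };

// Hypothetical helper: the same concatenated value is written to both attributes,
// so the key is still fully derivable from the payload hash alone.
function concatenatedKey(staticPk: string, idempotencyKey: string): CompositeKey {
  const combined = `${staticPk}#${idempotencyKey}`;
  return { pk: combined, sk: combined };
}

const keyA = concatenatedKey("idempotency#orders", "a1b2");
const keyB = concatenatedKey("idempotency#orders", "c3d4");

// Each payload now gets its own partition key value.
console.log(keyA.pk, keyB.pk);
```

Because both attributes are computed from values that are already known at request time, a single GetItem with pk and sk would still be possible under this layout.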

My point about "without a scan" was just that I'd never actually issue a query to get all the idempotency records for a given lambda (specify the partition key without the sort key) -- not that scans are ok. No they aren't ok! :)

@dreamorosi
Contributor

dreamorosi commented Mar 26, 2025

> Why not just concatenate partition key and sort key and set that for both of them?

Honestly I never thought of that - it might just work, but I'd like to run some tests first.

> That would 100% solve my problem.

Indeed, but even though it's technically an improvement, it would void all the idempotency records of any customer who's using the current implementation.

This is to say that we need to think about how to do this without introducing a breaking change.

For the time being I'd rather add this to the list of changes we want to consider for v3, and change this method here to protected, so that you can extend it on your side.

@elliottohara
Contributor Author

Yeah, just making getKey protected is PERFECT for now. In fact, that was almost the feature request I made, but I figured you'd want to know why.

Would REALLY appreciate that!
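
For future readers, a sketch of how a protected getKey could be overridden in a subclass. `PersistenceLayerSketch` is a stand-in written for this example, not the real Powertools persistence class; its constructor arguments and the plain-object return type are assumptions, so check the released class for the actual signature before adapting this.

```typescript
type Key = Record<string, string>;

// Stand-in base class (NOT the real Powertools class).
class PersistenceLayerSketch {
  protected readonly keyAttr: string;
  protected readonly sortKeyAttr: string;
  protected readonly staticPkValue: string;

  constructor(keyAttr: string, sortKeyAttr: string, staticPkValue: string) {
    this.keyAttr = keyAttr;
    this.sortKeyAttr = sortKeyAttr;
    this.staticPkValue = staticPkValue;
  }

  // Protected rather than private, so subclasses can change the key layout.
  protected getKey(idempotencyKey: string): Key {
    return { [this.keyAttr]: this.staticPkValue, [this.sortKeyAttr]: idempotencyKey };
  }

  // Public entry point used here only to observe the generated key.
  public keyFor(idempotencyKey: string): Key {
    return this.getKey(idempotencyKey);
  }
}

// Hypothetical subclass that spreads records across partition key values.
class SpreadPartitionsLayer extends PersistenceLayerSketch {
  protected override getKey(idempotencyKey: string): Key {
    const combined = `${this.staticPkValue}#${idempotencyKey}`;
    return { [this.keyAttr]: combined, [this.sortKeyAttr]: combined };
  }
}

const base = new PersistenceLayerSketch("pk", "sk", "idempotency#orders");
const spread = new SpreadPartitionsLayer("pk", "sk", "idempotency#orders");
console.log(base.keyFor("a1b2"), spread.keyFor("a1b2"));
```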

@dreamorosi
Contributor

Ok, deal!

You're welcome to send a PR if you'd like, otherwise I'll pick the issue up in the next few days - but definitely before the next release.

@dreamorosi dreamorosi added good-first-issue Something that is suitable for those who want to start contributing help-wanted We would really appreciate some support from community for this one confirmed The scope is clear, ready for implementation and removed discussing The issue needs to be discussed, elaborated, or refined labels Mar 26, 2025
@dreamorosi dreamorosi moved this from Ideas to Backlog in Powertools for AWS Lambda (TypeScript) Mar 26, 2025
@github-project-automation github-project-automation bot moved this from Backlog to Coming soon in Powertools for AWS Lambda (TypeScript) Mar 27, 2025

@dreamorosi
Contributor

Thank you for the PR and for the interesting discussion.

I have captured the details about modifying the composite key in the v4 discussion here.

@dreamorosi dreamorosi removed help-wanted We would really appreciate some support from community for this one need-response This item requires a response from a customer and will considered stale after 2 weeks labels Mar 27, 2025
@github-actions github-actions bot added pending-release This item has been merged and will be released soon and removed confirmed The scope is clear, ready for implementation labels Mar 27, 2025
Contributor

github-actions bot commented Apr 8, 2025

This is now released under v2.18.0 version!

@github-actions github-actions bot added completed This item is complete and has been merged/shipped and removed pending-release This item has been merged and will be released soon labels Apr 8, 2025
@dreamorosi dreamorosi moved this from Coming soon to Shipped in Powertools for AWS Lambda (TypeScript) Apr 8, 2025