Feature request: DynamoDb idempotency causes hot partitions #3781
Comments
Thanks for opening your first issue here! We'll come back to you as soon as we can. |
Hi @elliottohara, thank you for opening this issue and proposing this idea. I understand the problem we're trying to avoid, but I'm not yet sure what a fix would look like or what the best way to address it is. Assuming you were able to provide your own logic, how would you like to implement or customize it? Having this example would help us understand whether it's something we can improve internally and provide as an integrated experience, or whether it's specific to your implementation, in which case we'd just make the method easier to extend/override. Thank you!
If I were doing it, I'd just set the partition key to the same thing that the Id is on a table without a composite key, and set the sort key to anything really (but probably what the sort key currently is). I struggle to see the value of being able to query all idempotency records for a given lambda, which is what the current solution provides. What I'm proposing simply means you can't get that data without a scan (which is fine by me).
I see, thanks for clarifying. Does setting a prefix at the function level using this parameter help reduce the pressure on the partition? At the very least with this you can have one partition for each function. Alternatively, just to understand better, is using a table without sort key an option? |
On second thought - what I suggested with the prefix would work only if you're sharing the same table with multiple Lambda functions. If you only have one function, then it won't change anything. |
This key design is meant to allow querying a specific item with a single

Just to clarify, here we're specifically talking about composite keys. If you can use a single key in your table, none of this is a problem and the Idempotency utility will generate a unique PK based on your payload hash. If instead you're using a composite key, which is optional, then this is the best key design we could come up with while having the requirement of being able to query an item in a single operation. Keep in mind that once your table has a composite key DDB only allows you to do a

Now with this in mind, since we derive the

If we were to introduce scan operations just to check if an item exists, we would need to scan the table for every request, which is bad for both costs and performance. With that said, if you have any suggestions on how to improve this we'd be happy to do so.
Hi @elliottohara, thanks for opening this issue! I would like to bring some thoughts here and probably enrich the discussion.
When performing a scan operation in DynamoDB, we need to handle pagination since the data may span multiple pages. A scan can retrieve anywhere from a single item to thousands of items, depending on the size of the table. This variability makes the process unpredictable and resource-intensive. As a result, customers may end up paying significantly more due to the increased read capacity units consumed by large scans. Additionally, Lambda functions executing these scans can take much longer to complete (e.g., going from milliseconds to seconds), which further impacts performance and costs.

Current State

The current implementation of Idempotency with composite keys (PK + SK) can cause hot partitions because the partition key is static. This design forces queries to use a fixed PK format (e.g., "idempotency#function-name"), which can lead to hot partitions in very specific cases. While adding dynamic data to the PK could alleviate this problem, retrieving this dynamic part for queries becomes challenging, especially when Lambda container instances may change.

Some brainstorm

Off the top of my head, I'm thinking of a solution that involves a two-step record creation process when customers opt for composite keys. First, store a record with the PK containing the calculated

To be honest, I can't form a final opinion on this issue yet, and @dreamorosi's idea of a PK-only table seems to be the viable solution for now. But of course, I'd like to hear more from you, especially if you have another idea to solve this.
Why not just concatenate partition key and sort key and set that for both of them? That would 100% solve my problem. My point about "without a scan" was just that I'd never actually issue a query to get all the idempotency records for a given lambda (specify the partition key without the sort key) -- not that scans are ok. No they aren't ok! :) |
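To make the proposal above concrete, here is a minimal sketch comparing the two key shapes. It is illustrative only: the attribute names `pk` and `sk`, the helper names `staticKey` and `spreadKey`, and the sample hashes are assumptions, not the utility's actual code. The static format follows the "idempotency#function-name" example mentioned earlier in the thread.

```typescript
// Hypothetical key shape; the attribute names `pk` and `sk` are assumptions.
type CompositeKey = { pk: string; sk: string };

// Design described in the thread: one static partition key per function,
// with the per-request value stored in the sort key.
const staticKey = (functionName: string, hash: string): CompositeKey => ({
  pk: `idempotency#${functionName}`,
  sk: hash,
});

// Proposed design: concatenate both parts and store the same value in the
// partition key and the sort key, so each record gets its own partition key.
const spreadKey = (functionName: string, hash: string): CompositeKey => {
  const value = `idempotency#${functionName}#${hash}`;
  return { pk: value, sk: value };
};

// Count the distinct partition keys produced for three sample payload hashes.
const hashes = ['a1', 'b2', 'c3'];
const staticPartitions = new Set(hashes.map((h) => staticKey('my-fn', h).pk));
const spreadPartitions = new Set(hashes.map((h) => spreadKey('my-fn', h).pk));

console.log(staticPartitions.size, spreadPartitions.size); // prints: 1 3
```

Both shapes can be rebuilt from the function name and the payload hash alone, so a record can still be fetched by its full key in a single operation. What the second shape gives up is querying every record for one function by partition key.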
Honestly I never thought of that - it might just work, but I'd like to make some tests first.
Indeed, but even though it's technically an improvement, it would void all the idempotency records of any customer who's using the current implementation. This is to say that we need to think about how to do this without introducing a breaking change. For the time being I'd rather add this to the list of changes we want to consider for v3, and change this method here to |
Yeah, just making getKey protected is PERFECT for now. In fact, that was almost the feature request I made, but I figured you'd want to know why. Would REALLY appreciate that! |
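As a rough sketch of what a protected `getKey` enables, the example below uses a stand-in base class rather than the library's real persistence layer. The class names, constructor arguments, `describeKey` helper, and return type are all hypothetical; only the idea of overriding `getKey` in a subclass comes from the discussion.

```typescript
// Stand-in for the persistence layer discussed above. This is NOT the
// library's actual class; names and signatures here are assumptions.
type Key = Record<string, string>;

class SketchPersistenceLayer {
  protected readonly keyAttr: string;
  protected readonly sortKeyAttr: string;
  protected readonly functionName: string;

  constructor(keyAttr: string, sortKeyAttr: string, functionName: string) {
    this.keyAttr = keyAttr;
    this.sortKeyAttr = sortKeyAttr;
    this.functionName = functionName;
  }

  // Protected rather than private, so subclasses can change the key design.
  protected getKey(idempotencyKey: string): Key {
    return {
      [this.keyAttr]: `idempotency#${this.functionName}`,
      [this.sortKeyAttr]: idempotencyKey,
    };
  }

  // Public helper that exists only so the example can show the resulting key.
  public describeKey(idempotencyKey: string): Key {
    return this.getKey(idempotencyKey);
  }
}

// Subclass that writes the same concatenated value to both key attributes.
class SpreadPersistenceLayer extends SketchPersistenceLayer {
  protected getKey(idempotencyKey: string): Key {
    const value = `idempotency#${this.functionName}#${idempotencyKey}`;
    return { [this.keyAttr]: value, [this.sortKeyAttr]: value };
  }
}

const baseLayer = new SketchPersistenceLayer('pk', 'sk', 'my-fn');
const spreadLayer = new SpreadPersistenceLayer('pk', 'sk', 'my-fn');

console.log(baseLayer.describeKey('abc123'));
console.log(spreadLayer.describeKey('abc123'));
```

With this pattern the default key design stays untouched for existing users, while anyone who needs a different layout can opt in through a subclass.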
Ok, deal! You're welcome to send a PR if you'd like, otherwise I'll pick the issue up in the next few days, and definitely before the next release.
This issue is now closed. Please be mindful that future comments are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so. |
Thank you for the PR and for the interesting discussion. I have captured the details about modifying the composite key in the v4 discussion here. |
This is now released in v2.18.0!
Use case
We have an existing DynamoDB table that we're using for our system that has a composite key (pk, sk). When you set the sortKeyAttribute, we end up with all idempotency checks for a given lambda sharing a partition, resulting in hot partitions.
Solution/User Experience
Ideally we'd like the ability to inject or override the getKey method so we can control how the row is created, or just change the basic implementation to not create hot partitions with busy lambdas.
Alternative solutions
Acknowledgment
Future readers
Please react with 👍 and your use case to help us understand customer demand.