Add support for MemoryDB/ElasticCache/Redis as Idempotency backend #1181

whardier · 2021-08-21T02:37:55Z

Updated topic to mention ElasticCache and general Redis as well.

https://aws.amazon.com/about-aws/whats-new/2021/08/amazon-memorydb-redis/

gmcrocetti · 2021-09-02T01:14:47Z

I'm making myself available to tackle this issue.

heitorlessa · 2021-09-04T10:43:24Z

That would be awesome! I think vanilla Redis makes more sense than MemoryDB as it’s still early to know how many customers would want it. FYI - I’m back on Sep 27th and will review the PR by then. Thank you very much, Guilherme ;-)


danielloader · 2021-09-04T10:45:50Z

Isn't it meant to be the same client implementation for both?

heitorlessa · 2021-09-04T10:52:08Z

Yep, though I haven’t had a chance to look into it yet for any caveats.


heitorlessa · 2021-09-04T13:30:09Z

Just read the docs, you’re right Daniel, it’s fully compatible — we could use redis-py-cluster as an optional dependency. Depending on the implementation, it’d be easy to also provide Redis support for Parameters and Feature Flag. https://github.com/Grokzen/redis-py-cluster


whardier · 2021-09-04T19:32:15Z

Redis in general would be fairly rad to have available via an extra requirement. I wasn't entirely sure if a broad implementation in aws-lambda-powertools for redis support would ultimately support both ElasticCache and MemoryDB.

gmcrocetti · 2021-09-09T01:58:58Z

@heitorlessa, things have changed and I won't have the bandwidth to tackle this issue :/. I hope someone else will volunteer. Enjoy your time! Thanks 👍

leandrodamascena · 2022-06-30T14:33:02Z

Hi all!

I was looking at some issues and opportunities to start contributing to Lambda Powertools and came across this issue. This implementation looks very interesting, and adding a new backend to persist Idempotency data seems like a great idea. Even though DynamoDB is a great lightweight option, not everyone wants to create a new table and/or manage a new resource on AWS.

I'll try to summarize some information before I start writing code and tests.

1 - As everyone mentioned, MemoryDB is fully compatible with the Redis protocol and we can confirm that here https://docs.aws.amazon.com/memorydb/latest/devguide/memorydb-guide.pdf.pdf (page 1)

2 - @heitorlessa I'm not sure the https://github.com/Grokzen/redis-py-cluster library is a good choice at this point. It may have been a good choice when this thread started, but its maintainers now suggest migrating to the official RedisLabs library (https://github.com/redis/redis-py)

3 - The official library supports connection and operations on a single node or cluster. https://github.com/redis/redis-py#cluster-mode

4 - Even though the Idempotency utility will write and read only a few keys in Redis, a performance test is warranted to see whether it has an impact anywhere. I'll take care of that and share the results as I write the code.

5 - I created a new class to test how we can implement this feature, and it is already saving and getting keys from Redis. Of course it's missing a lot of things like tests, parameters, code optimization, comments and so on, but I think it can be a starting point and I can begin working on this.

6 - I will try to cover most scenarios in the first version.

I believe this project is frozen until next month, so there will be plenty of time to write code and tests.

Thanks for reading and suggestions are welcome.

heitorlessa · 2022-12-13T10:50:45Z

Quick update: @Vandita2020 is working on this

leandrodamascena · 2023-01-29T19:06:56Z

leandrodamascena · 2023-01-30T13:13:06Z

I've been writing code to connect to Redis Sentinel and it changes parameters and options a lot, so I thought of a new UX to make everything simpler. The way it is programmed now is:

from aws_lambda_powertools.utilities.idempotency import (
    idempotent,
    RedisCachePersistenceLayer,
    IdempotencyConfig,
)

persistence_layer = RedisCachePersistenceLayer(host="192.168.68.112", port="6379", user="xxx", password="xxxx", db_index=0, static_pk_value="test",...)

A more readable way to create a connection would be:

Standalone

from aws_lambda_powertools.utilities.idempotency.redis.connection import RedisStandalone
from aws_lambda_powertools.utilities.idempotency import RedisCachePersistenceLayer

conn_config = RedisStandalone(host.. pass.. user)
persistence_layer = RedisCachePersistenceLayer(connection=conn_config, ...)

Cluster

from aws_lambda_powertools.utilities.idempotency.redis.connection import RedisCluster
from aws_lambda_powertools.utilities.idempotency import RedisCachePersistenceLayer

conn_config = RedisCluster(host... startup_nodes)
persistence_layer = RedisCachePersistenceLayer(connection=conn_config, ...)

Sentinel

from aws_lambda_powertools.utilities.idempotency.redis.connection import RedisSentinel
from aws_lambda_powertools.utilities.idempotency import RedisCachePersistenceLayer

conn_config = RedisSentinel(sentinels..socket…)
persistence_layer = RedisCachePersistenceLayer(connection=conn_config, ....)

Makes sense? I would like to hear feedback on this.

heitorlessa · 2023-12-06T09:10:59Z

We're reviewing the last edge case on concurrency and docs tomorrow - hoping to get this released next week

github-actions · 2024-01-10T18:37:28Z

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

github-actions · 2024-01-19T13:52:50Z

This is now released under 2.32.0 version!

leandrodamascena · 2024-01-30T22:58:23Z

To enable other runtimes and customers to benefit from the research and implementation carried out in this pull request, I have provided the Redis implementation text below.

Many thanks to @roger-zhangg for the partnership and deep dive into this extensive and fantastic work, which undoubtedly raised the bar for this project!

API Design

We kept the same user experience when switching from DynamoDB to Redis. This is very important because one of the core principles of Powertools is to ensure the developer experience is as seamless as possible.

By keeping the persistence abstraction layer interchangeable between DynamoDB and Redis, we enabled a smooth transition that required minimal code changes for developers. The interfaces remained consistent, reducing friction when migrating the data storage technology.

Connection

In the initial version, we provided a dedicated Connection Class to assist our customers in creating Redis connections. This class wrapped the Redis client, both standalone and cluster, enabling customers to provide their Redis connection details (host, port, passwords) for establishing connections. Once the connection was established, this class could be passed to the Idempotency Layer for use. However, this design had a few significant drawbacks. Firstly, it was challenging for this design to support Redis sentinel connections, as sentinel connections are set up differently from standalone and cluster connections. Secondly, the default Redis client had more than 40 parameters. In our connection design, we chose to support only the most commonly used ones, passing all other parameters using **kwargs directly to the wrapped Redis Client. This could result in a less-than-optimal experience for our customers, as they would need to figure out the parameter names without IDE typing hints when passing parameters using **kwargs.

Keeping these considerations in mind, we drafted the second version of the Redis Connection. We made a few changes to the logic in the Idempotency Layer class so that it can now accept an established Redis client. With this design, customers can pass in any Redis client they prefer, as long as it adheres to the schema defined in the protocol class. Additionally, using the original Redis clients allows customers to leverage their prior experience with Redis and easily transfer and adapt their existing code. However, after some discussions, we concluded that this design may not be user-friendly for individuals without prior Redis experience. Therefore, we believe it's still beneficial to provide a helper class for creating connections to assist such users.
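The idea of accepting "any Redis client that adheres to the protocol class" can be sketched with structural typing. This is a minimal illustration only: the names RedisLikeClient, InMemoryClient and make_layer are hypothetical and are not taken from the Powertools code base, and the method set (get, set, delete) is an assumption about what a persistence layer would need.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class RedisLikeClient(Protocol):
    """Hypothetical minimal interface a persistence layer might require."""

    def get(self, name: str) -> Optional[bytes]: ...

    def set(
        self, name: str, value: bytes, ex: Optional[int] = None, nx: bool = False
    ) -> Optional[bool]: ...

    def delete(self, *names: str) -> int: ...


class InMemoryClient:
    """Stand-in client; it satisfies the protocol without inheriting from it."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, name):
        return self._data.get(name)

    def set(self, name, value, ex=None, nx=False):
        if nx and name in self._data:
            return None  # mirrors redis-py, which returns None when NX blocks the write
        self._data[name] = value
        return True

    def delete(self, *names):
        return sum(1 for n in names if self._data.pop(n, None) is not None)


def make_layer(client: RedisLikeClient) -> RedisLikeClient:
    """Accept any object that exposes the required methods."""
    if not isinstance(client, RedisLikeClient):
        raise TypeError("client must provide get, set and delete")
    return client
```

Because the check is structural, a real client, a cluster client, or a test double would all be accepted as long as they expose the same methods.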

In the third and final design, we have opted to implement a helper connection class that assists customers in creating Redis connections with only the most commonly used Redis parameters (host, port, username, password, db_index, url, ssl). Simultaneously, we enable our customers to bring their own Redis connection if they prefer. One common use case, for example, is when customers want to use Redis with their certificates. This added flexibility allows them to establish secure Redis connections using their custom certificates while benefiting from the simplified connection setup provided by our helper class.

Example

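from redis import Redis

from aws_lambda_powertools.utilities.idempotency.persistence.redis import (
    RedisCachePersistenceLayer,
)

# Placeholder paths (not from the original post); point these at your own files.
ssl_certfile = "/path/to/client.crt"
ssl_keyfile = "/path/to/client.key"
ssl_ca_certs = "/path/to/ca.crt"

# Bring-your-own client: TLS is configured directly on the redis-py client.
client = Redis(
    host="host",
    port=6379,
    ssl=True,
    ssl_certfile=ssl_certfile,
    ssl_keyfile=ssl_keyfile,
    ssl_cert_reqs="required",
    ssl_ca_certs=ssl_ca_certs,
)

persistence_layer = RedisCachePersistenceLayer(client=client)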

Orphan Records

Each idempotency record includes attributes such as expire_time and inprogress_expire_time. These records should be automatically deleted when the current time reaches expire_time in the "completed" status or inprogress_expire_time in the "in_progress" status.

However, due to factors like Lambda handler timeouts, exceptions, or potential Redis expiration issues, there may be instances where idempotency records persist in Redis even after the current time has exceeded inprogress_expire_time while the record status is still "in_progress," or the current time has surpassed expire_time while the record status is "completed." We refer to these invalid records as "Orphan Records."

It's important to note that the method we implement to address these orphan records must be executed with caution to avoid potential race conditions. The following two sections elaborate on this.
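The orphan definition above can be written as a small predicate. This is a sketch under stated assumptions: the function name is_orphan is hypothetical, both timestamps are assumed to be Unix seconds, and a missing inprogress_expire_time is treated as "never orphaned while in progress".

```python
import time
from typing import Optional


def is_orphan(
    status: str,
    expire_time: float,
    inprogress_expire_time: Optional[float],
    now: Optional[float] = None,
) -> bool:
    """Return True when a record has outlived the deadline for its status."""
    now = time.time() if now is None else now
    if status == "in_progress":
        # Still marked in progress after the in-progress deadline has passed.
        return inprogress_expire_time is not None and now > inprogress_expire_time
    if status == "completed":
        # Completed, but Redis has not removed it after expire_time.
        return now > expire_time
    return False
```

Passing now explicitly keeps the check deterministic in tests.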

Redis HSET vs SET

In the idempotency workflow, we need to store idempotency records with multiple attributes in Redis. This can be achieved by using HSET with the idempotency key followed by attributes. Alternatively, we can encapsulate the idempotency key and all its attributes into a JSON format and employ SET to store the entire JSON structure.

In the initial design, we employed HSET to set idempotency records. HSET offers several advantages over SET: hash lookups are typically faster, and HSET allows us to set multiple attributes under the same hash key. This aligns well with the requirements of idempotency records, as we need to store the idempotency key, expire_time, inprogress_expire_time, status, and payload under a single key. HSET therefore appeared to be an optimal choice for storing idempotency records in this project.

However, this method has two major drawbacks.

Firstly, SET allows us to set an expiration time when creating the record, but HSET doesn't offer this capability. To set an expiration time, we must make an additional Redis call using EXPIRE for the respective key after using HSET. This reduces performance, since sending two Redis calls significantly increases latency by doubling the Round Trip Time (RTT) to Redis. It also introduces the potential for orphan records. For example, if the Lambda handler times out immediately after creating the idempotency record but before setting the expiration time, the record becomes an orphan because it won't automatically expire in Redis. One way to mitigate this is to use a Redis pipeline to combine HSET and EXPIRE into a single request. However, this workaround also increases the complexity of the code.

Secondly, SET allows us to use the "nx" (not exists) parameter to write only to keys that do not exist. However, HSET does not support this "nx" parameter. "nx" plays a crucial role in preventing race conditions, which is explained in the following section on Redis race conditions.

During our experiments, we used pipelines with HSET to reduce the Round Trip Time (RTT). However, pipelines introduce complexity that can make code writing and testing more challenging:

Batch Logic - Pipelines do not execute commands immediately, requiring the inclusion of batch logic to group related operations and ensure they are executed together. This complexity adds to the application code.
Error Handling - Errors occur asynchronously after pipelined commands are sent, making error handling more challenging compared to synchronous execution.
Testing Complexity - Writing isolated unit tests that assert on command outcomes becomes more challenging, as test code must handle asynchronous results. Additionally, race conditions can occur more easily.
Debugging Challenges - In the event of a pipeline failure, it becomes more challenging to identify which specific command in the batch caused the issues. Extra logging must be implemented.
Transactional Integrity - Pipelines do not offer transactional integrity. If commands require transactions, additional logic is necessary to manually implement this on top of pipelining.
Compatibility - Some versions of Redis Cluster don't support pipelines in transactional mode, such as pipe = r.pipeline(transaction=True).

Due to HSET lacking support for "EX" (expiration time) and "NX" (non-exist), and the challenges associated with pipelines, we have made the decision to transition to using SET in our final design.
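The difference between the two approaches can be illustrated by counting calls. This is only a sketch: CountingFakeRedis is a hypothetical in-memory stand-in (not a real Redis client) whose hset, expire and set methods loosely mirror the redis-py method names, and each call is assumed to cost one round trip.

```python
import json


class CountingFakeRedis:
    """In-memory stand-in that counts how many commands were issued."""

    def __init__(self):
        self.calls = 0
        self.strings = {}
        self.hashes = {}
        self.ttls = {}

    def hset(self, name, mapping):
        self.calls += 1
        self.hashes.setdefault(name, {}).update(mapping)
        return len(mapping)

    def expire(self, name, time):
        self.calls += 1
        self.ttls[name] = time
        return True

    def set(self, name, value, ex=None, nx=False):
        self.calls += 1
        if nx and name in self.strings:
            return None  # NX blocks the write when the key already exists
        self.strings[name] = value
        if ex is not None:
            self.ttls[name] = ex
        return True


def save_with_hset(client, key, record, ttl):
    """Two commands; a crash between them leaves a key without a TTL."""
    client.hset(key, mapping=record)
    client.expire(key, ttl)


def save_with_set(client, key, record, ttl):
    """One command that writes the JSON record, sets the TTL, and enforces NX."""
    return client.set(key, json.dumps(record), ex=ttl, nx=True) is True
```

With this stand-in, save_with_hset issues two commands per record while save_with_set issues one, and only the SET variant refuses to overwrite an existing key.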

Redis Race condition

There are two potential race conditions in the Redis Idempotency workflow. Although the probability of these race conditions occurring is low, if not addressed effectively, certain payloads may bypass the idempotency layer. This could lead to the underlying Lambda handler executing more than once for identical payloads, which is an undesirable scenario for our customers when utilizing our idempotency layer.

The first potential race condition occurs when two Lambda handlers simultaneously perform SET/HSET operations without using 'nx' on a non-existent record. Consider a scenario where, at the same time, two Lambda handlers with the same payload both use GET or HGET to check whether a particular record is empty. Since both assume the record is empty, both decide to write to it using SET/HSET, resulting in the record being updated twice. Eventually, both Lambda handlers with the same payload pass the idempotency check and proceed to the execution stage. In this case, the idempotency layer fails to prevent the handler from running twice on the same payload. See graph below.

This race condition can be resolved by adding nx=True when using SET. Once we apply nx=True to SET, even if two handlers execute SET at the exact same time, only one of them will succeed, and the other handler will recognize that the record exists and return accordingly.
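The effect of nx=True under contention can be simulated with threads. This is a sketch, not the project's code: AtomicStore is a hypothetical stand-in whose mutex imitates Redis executing each command atomically, and the worker count is arbitrary.

```python
import threading


class AtomicStore:
    """Stand-in for Redis: each set call runs atomically, like a single command."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def set(self, name, value, nx=False):
        with self._mutex:
            if nx and name in self._data:
                return None  # key already exists, so the NX write is rejected
            self._data[name] = value
            return True


def race(store, key, workers=8):
    """Start all workers at once and return the ids whose NX write succeeded."""
    winners = []
    barrier = threading.Barrier(workers)

    def handler(worker_id):
        barrier.wait()  # release every worker at the same moment
        if store.set(key, f"worker-{worker_id}", nx=True):
            winners.append(worker_id)

    threads = [threading.Thread(target=handler, args=(i,)) for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return winners
```

However the threads interleave, exactly one worker ends up in the winners list; the rest see the existing record and would return without executing the handler.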

The second potential race condition is similar to the first one but more complicated. This scenario occurs when two Lambda handlers both try to fix the same orphan record they found at the same time. In our idempotency workflow, if a Lambda handler encounters an existing idempotency record and identifies it as a corrupted or orphaned record, it proceeds to overwrite it with a valid record. However, in this case, we are attempting to overwrite a record in Redis, so we cannot use the nx=True flag in SET as we are modifying an existing record. Consequently, this scenario introduces the possibility of a race condition where two SET operations are executed simultaneously by two different Lambda handlers, potentially resulting in the record being updated twice, and both Lambda handlers advancing to the execution stage. This is a situation that the idempotency mechanism should prevent. One solution for this is to use SET on a key with nx to serve as a lock, ensuring that only the handler that successfully acquires the lock proceeds to update the orphan record. See graph below showing each scenario:

[Figure: orphan record overwrite without lock]

[Figure: orphan record overwrite with lock]
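The lock-based fix for orphan records can be sketched as follows. The assumptions are stated up front: FakeRedis is a hypothetical in-memory stand-in that records the ex argument without simulating expiry, the ":lock" suffix and the 10-second lock TTL are illustrative choices, and the lock is simply left to expire rather than being deleted.

```python
import json


class FakeRedis:
    """In-memory stand-in; ex is recorded but expiry is not simulated."""

    def __init__(self):
        self.data = {}
        self.ttls = {}

    def set(self, name, value, ex=None, nx=False):
        if nx and name in self.data:
            return None
        self.data[name] = value
        self.ttls[name] = ex
        return True

    def get(self, name):
        return self.data.get(name)


def overwrite_orphan(client, key, new_record, ttl_seconds, lock_ttl_seconds=10):
    """Overwrite an orphan record only if this caller wins the lock."""
    lock_key = f"{key}:lock"  # hypothetical naming for the lock key
    if not client.set(lock_key, "locked", ex=lock_ttl_seconds, nx=True):
        return False  # another handler is already repairing this record
    # The lock is held, so a plain SET (no nx) can safely replace the orphan.
    client.set(key, json.dumps(new_record), ex=ttl_seconds)
    return True
```

A handler that gets False back treats the record as already in progress and does not run the business logic a second time.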

Tests

We had challenges writing tests for the Redis interface and functionality, primarily because testing the Protocol without establishing a real connection is a little bit hard. Our initial approach was to create integration tests by running Redis locally in Docker containers, but spinning up container environments locally has some downsides, such as:

Slow - Launching containers locally introduced overhead and slowed down test execution.
Fragile - Tests became vulnerable to failures if containers didn't start correctly or if the local environment wasn't properly configured to support containers.
Resource intensive - Containers consume substantial amounts of RAM and CPU on developers' local machines.
Docker Registry Issues - Occasionally, downloading a container from Docker registries encountered failures.

Our solution was to shift towards more functional testing by injecting a fake "Redis client" class instead of using a real Redis connection. This approach offers the following advantages:

Speed - There is no spin-up time, resulting in significantly faster test execution.
Isolation - Tests concentrate on the application code logic without relying on external dependencies.
Lightweight - There's no requirement to install and configure containers on local machines.
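A fake client of the kind described above might look like the sketch below. It is illustrative only: FakeRedisClient and the test function are hypothetical names, only get, set (with ex and nx) and delete are covered, and the injectable clock is an assumption made so that expiry can be tested without sleeping.

```python
import time


class FakeRedisClient:
    """Hypothetical in-memory double exposing get/set/delete like a Redis client."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> (value, absolute expiry timestamp or None)

    def _purge(self, name):
        entry = self._data.get(name)
        if entry is not None and entry[1] is not None and entry[1] <= self._clock():
            del self._data[name]

    def get(self, name):
        self._purge(name)
        entry = self._data.get(name)
        return None if entry is None else entry[0]

    def set(self, name, value, ex=None, nx=False):
        self._purge(name)
        if nx and name in self._data:
            return None
        expires_at = None if ex is None else self._clock() + ex
        self._data[name] = (value, expires_at)
        return True

    def delete(self, *names):
        removed = 0
        for name in names:
            self._purge(name)
            if self._data.pop(name, None) is not None:
                removed += 1
        return removed


def test_record_expires_without_sleeping():
    now = [1_000.0]  # mutable cell so the test can move the clock forward
    client = FakeRedisClient(clock=lambda: now[0])
    assert client.set("idem#1", "in_progress", ex=5, nx=True) is True
    assert client.set("idem#1", "other", nx=True) is None
    now[0] += 6  # jump past the TTL
    assert client.get("idem#1") is None
    assert client.set("idem#1", "retry", nx=True) is True
```

The double is injected wherever a real client would be passed, so the same tests run without Docker or network access.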

Thanks

whardier changed the title ~~Add support for MemoryDB as Idemopotency backend~~ Add support for MemoryDB/ElasticCache/Redis as Idemopotency backend Sep 4, 2021

heitorlessa transferred this issue from aws-powertools/powertools-lambda-python Nov 13, 2021

heitorlessa added Idempotency labels Nov 13, 2021

heitorlessa transferred this issue from aws-powertools/powertools-lambda Apr 28, 2022

heitorlessa added feature-request feature request area/idempotency and removed Idempotency labels Apr 28, 2022

heitorlessa changed the title ~~Add support for MemoryDB/ElasticCache/Redis as Idemopotency backend~~ Add support for MemoryDB/ElasticCache/Redis as Idempotency backend Apr 28, 2022

heitorlessa assigned mploski Aug 1, 2022

heitorlessa unassigned mploski Aug 25, 2022

heitorlessa added idempotency Idempotency utility and removed area/idempotency labels Nov 9, 2022

heitorlessa assigned Vandita2020 Dec 15, 2022

heitorlessa assigned leandrodamascena Feb 6, 2023

heitorlessa mentioned this issue Feb 6, 2023

Docs: Add IAM permissions in Idempotency #1901

Closed

1 task

Vandita2020 mentioned this issue Feb 8, 2023

feat(idempotency): adding redis as idempotency backend #1914

Closed

7 tasks

heitorlessa linked a pull request Feb 9, 2023 that will close this issue

feat(idempotency): adding redis as idempotency backend #1914

Closed

7 tasks

sthulb added this to Powertools for AWS Lambda (Python) Jun 19, 2023

github-project-automation bot moved this to Triage in Powertools for AWS Lambda (Python) Jun 19, 2023

sthulb moved this from Triage to On hold in Powertools for AWS Lambda (Python) Jun 19, 2023

roger-zhangg mentioned this issue Jun 23, 2023

feat(idempotency): adding redis as idempotency backend #2567

Merged

7 tasks

leandrodamascena linked a pull request Aug 15, 2023 that will close this issue

feat(idempotency): adding redis as idempotency backend #2567

Merged

7 tasks

leandrodamascena removed a link to a pull request Aug 15, 2023

feat(idempotency): adding redis as idempotency backend #1914

Closed

7 tasks

leandrodamascena moved this from On hold to Working on it in Powertools for AWS Lambda (Python) Aug 15, 2023

leandrodamascena unassigned Vandita2020 Aug 15, 2023

heitorlessa added this to the Redis support in Idempotency milestone Nov 20, 2023

leandrodamascena closed this as completed in #2567 Jan 10, 2024

github-project-automation bot moved this from Working on it to Coming soon in Powertools for AWS Lambda (Python) Jan 10, 2024

github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label Jan 10, 2024

github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Jan 19, 2024

leandrodamascena moved this from Coming soon to Shipped in Powertools for AWS Lambda (Python) Jan 23, 2024
