Feature request: support for Redis in Idempotency #3183
Comments
Sounds interesting. Since powertools-python has it, TypeScript should have this feature too. I'm interested in contributing @dreamorosi
Hi @arnabrahman, nice to see you again here - thank you for offering to help! I have to admit that I am not very familiar with Redis myself, so please bear with me. Based on my understanding, this is where the Python implementation is. There are a few items I'd like to discuss/mention before we move on to the implementation:
I have assigned the issue to you; please take a look at the reference implementation if you haven't, and let us know if you have any questions. Once we have addressed the points above, I think we can start the implementation. Also, if you think it's useful before or during the implementation - we're happy to jump on a call and discuss this issue at any point, especially @leandrodamascena who has worked on the Python implementation. Thanks again, this is exciting!
Hi @arnabrahman and @dreamorosi! This is super nice, we will add support for Redis in TS. Let me share some ideas/challenges I had while implementing Python. I may be repetitive at some points as Andrea has already shared.

1/ We allow customers to create a new instance of

2/ All the idempotency logic must be handled by Idempotency classes, as it currently happens with

3/ This implementation should support both standalone Redis connections and Redis clusters. In theory, you only need to change a few things during the connection; the underlying commands remain the same. We are planning to add support for Sentinel clients, but we haven't heard any customer demand yet.

4/ Serverless Cache as a Service is a new trend in the Serverless market, so the library/client to be used must implement the

5/ In Python, we are not forcing the Redis version, but it would be interesting to see if we can enforce Redis 7+ for performance reasons. This is not mandatory, just a tip.

6/ In the first implementation (during PoC), we considered using pipelines to handle multiple commands and reduce round-trip time (RTT) to optimize Redis data/network/connection exchange, but we opted out and are using

7/ To implement atomic operations or optimistic locking in Redis, Lua scripts are required. Redis does not natively support optimistic locking without Lua scripts. While Lua scripts provide atomic execution by running all commands within the script as a single operation, they are restricted in some managed services or may be disallowed due to security policies. This limitation can be a blocker for adoption by clients who are not authorized to use Lua scripts. To address concurrency challenges, such as those arising from simultaneous transactions in environments like AWS Lambda, we wrote a lock acquisition mechanism to ensure execution uniqueness and prevent race conditions. This approach avoids the need for Lua scripts and relies on native Redis commands like

As @dreamorosi said: I'm happy to connect if you need any help.
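To illustrate the lock-acquisition idea from point 7/ above: here is a minimal sketch of the general technique using a single atomic SET with the NX and PX options. This is not the actual Python implementation - the client setup and key naming below are assumptions for illustration only.

import { createClient } from '@redis/client';

const redis = createClient({
  socket: {
    tls: true,
    host: process.env.CACHE_ENDPOINT || '',
    port: Number(process.env.CACHE_PORT || '6379'),
  },
});
await redis.connect();

/**
 * Attempt to acquire a short-lived lock for an idempotency key.
 * SET with NX succeeds only if the key does not exist yet, so concurrent
 * invocations race on one atomic command instead of needing a Lua script.
 */
const acquireLock = async (key: string, ttlMs: number): Promise<boolean> => {
  const reply = await redis.set(`lock:${key}`, 'in-progress', {
    NX: true, // fail if another invocation already holds the lock
    PX: ttlMs, // auto-expire so a crashed invocation can't hold it forever
  });
  return reply === 'OK'; // null means the lock is already taken
};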
Thanks a lot @leandrodamascena and @dreamorosi, really appreciate the thoughtful responses and solid starting points. I'll dig into these and share an update once I've made some headway.
Hi @leandrodamascena, @dreamorosi, @arnabrahman, I'm a valkey-glide maintainer and have been part of the ElastiCache team for the past 5 years. I have extensive knowledge of Redis/Valkey best practices and would be happy to help with design/coding, whatever is needed. Valkey-glide was designed to be a robust client for Valkey and Redis while minimizing downtime. The idea was to create a robust core written in Rust with thin wrappers for various programming languages. Currently, we support Python, Java, Node.js, and Go, with .NET, Ruby, and C++ support in development. The API and behavior are consistent across languages, so if you have a working version in Python, it will work in Node.js as well. I would be happy to help with this migration. Perhaps we could schedule a quick call next week to meet (which would be nice) and share knowledge to determine the fastest and most appropriate way to move forward. While I'm not very familiar with this package or Lambda functions, I can share my expertise regarding Valkey/Redis clients and Redis/Valkey databases.
Ok, I had an initial look at the Python implementation and, thanks to the clean nature of the code and well-described comments, I think I understand the high-level flow. I'll go over some of the points that @dreamorosi mentioned:
Let me know what you guys think of this.
AWS ElastiCache supports Valkey 8.0. AWS ElastiCache does not support Sentinel. We will be able to add missing features to valkey-glide. There is also cooperation with GCP, and we work together to make valkey-glide better and better. See the dev pace at the repo. I recommend using cluster mode, but to be honest I don't fully understand your requirements.
Thank you both, especially @arnabrahman for the comparison. I would not worry about Sentinel at this stage since ElastiCache doesn't support it. Regarding the client library selection, based on the above I would automatically exclude

I went ahead and made some very basic tests, and I have a couple of additional considerations that are important for this project regarding the other two libraries.

CommonJS / ES Modules support

The

This is not a huge deal since we do the same with Tracer & X-Ray SDK, and starting from Node.js 24 either of the two should be able to import the other.

Overall usage
While it's true that the GLIDE library is only 7 months old vs the other's 2+ year head start, it's clear that the Redis one appears to be used a few orders of magnitude more than the newer one. Low usage/downloads is not a disqualifying factor by itself, but if we are thinking in terms of DX and we want to allow customers to pass their own client to the persistence layer, then maybe using the

Provenance & Supply chain

When choosing a 3rd-party dependency, we look at two things when it comes to OSS supply chain security & governance:
Neither of the two libraries publishes provenance statements with their releases. In terms of dependencies:
While having a provenance statement would be a big differentiator for us, if we look at the dependency tree alone, the Redis client seems to have a smaller surface area when it comes to modules brought into the

Architecture
As a customer, when it comes to TypeScript/JavaScript functions, having native libraries in the dependency tree means I now have to choose between two options:
As a library author, since we publish and offer public Lambda layers that include all Powertools for AWS utilities and their dependencies, it means we would need to start publishing two sets of the layer, one for each architecture in every region - functionally doubling our deployment targets. Given that the change above would also result in new ARNs for the Lambda layers, we'd need to do this in a major release (no ETA as of today) and introduce additional management overhead for our customers, over a feature that at this point - also considering the low interest on the post above - is marginal at best in the context of Powertools for AWS. All the above is not necessarily a disqualifying factor for using

Performance

I deployed a Valkey Serverless ElastiCache in my account and created two Lambda functions, one using
Click here to see CDK stack

import {
Stack,
type StackProps,
CfnOutput,
RemovalPolicy,
Duration,
} from 'aws-cdk-lib';
import type { Construct } from 'constructs';
import {
Architecture,
Code,
LayerVersion,
Runtime,
Tracing,
} from 'aws-cdk-lib/aws-lambda';
import { NodejsFunction, OutputFormat } from 'aws-cdk-lib/aws-lambda-nodejs';
import { LogGroup, RetentionDays } from 'aws-cdk-lib/aws-logs';
import { aws_elasticache } from '@open-constructs/aws-cdk';
import { Port, SecurityGroup, Vpc } from 'aws-cdk-lib/aws-ec2';
import { HttpApi, HttpMethod } from 'aws-cdk-lib/aws-apigatewayv2';
import { HttpLambdaIntegration } from 'aws-cdk-lib/aws-apigatewayv2-integrations';
export class ValkeyStack extends Stack {
constructor(scope: Construct, id: string, props?: StackProps) {
super(scope, id, props);
// #region Shared
const vpc = new Vpc(this, 'MyVpc', {
maxAzs: 2, // Default is all AZs in the region
});
const fnSecurityGroup = new SecurityGroup(this, 'ValkeyFnSecurityGroup', {
vpc,
allowAllOutbound: true,
description: 'Security group for Valkey function',
});
// #region Valkey Cluster
const serverlessCacheSecurityGroup = new SecurityGroup(
this,
'ServerlessCacheSecurityGroup',
{
vpc,
allowAllOutbound: true,
description: 'Security group for serverless cache',
}
);
serverlessCacheSecurityGroup.addIngressRule(
fnSecurityGroup,
Port.tcp(6379),
'Allow Lambda to connect to serverless cache'
);
const serverlessCache = new aws_elasticache.ServerlessCache(
this,
'ServerlessCache',
{
engine: aws_elasticache.Engine.VALKEY,
majorEngineVersion: aws_elasticache.MajorVersion.VER_8,
serverlessCacheName: 'my-serverless-cache',
vpc,
securityGroups: [serverlessCacheSecurityGroup],
}
);
// #region Glide Valkey version
const valkeyLayer = new LayerVersion(this, 'ValkeyLayer', {
removalPolicy: RemovalPolicy.DESTROY,
compatibleArchitectures: [Architecture.ARM_64],
compatibleRuntimes: [Runtime.NODEJS_22_X],
code: Code.fromAsset('./lib/layers/valkey-glide'),
});
const fnName = 'ValkeyFn';
const logGroup = new LogGroup(this, 'MyLogGroup', {
logGroupName: `/aws/lambda/${fnName}`,
removalPolicy: RemovalPolicy.DESTROY,
retention: RetentionDays.ONE_DAY,
});
const fn = new NodejsFunction(this, 'MyFunction', {
functionName: fnName,
logGroup,
runtime: Runtime.NODEJS_22_X,
architecture: Architecture.ARM_64,
memorySize: 512,
timeout: Duration.seconds(30),
entry: './src/index.ts',
handler: 'handler',
layers: [valkeyLayer],
bundling: {
minify: true,
mainFields: ['module', 'main'],
sourceMap: true,
format: OutputFormat.ESM,
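// exclude the native client from the bundle; the Lambda layer above provides it at runtime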
externalModules: ['@valkey/valkey-glide'],
metafile: true,
},
vpc,
securityGroups: [fnSecurityGroup],
});
fn.addEnvironment('CACHE_ENDPOINT', serverlessCache.endpointAddress);
fn.addEnvironment('CACHE_PORT', serverlessCache.endpointPort.toString());
// #region Redis Client version
const fnName2 = 'RedisFn';
const logGroup2 = new LogGroup(this, 'MyLogGroup2', {
logGroupName: `/aws/lambda/${fnName2}`,
removalPolicy: RemovalPolicy.DESTROY,
retention: RetentionDays.ONE_DAY,
});
const fn2 = new NodejsFunction(this, 'MyFunction2', {
functionName: fnName2,
logGroup: logGroup2,
runtime: Runtime.NODEJS_22_X,
architecture: Architecture.ARM_64,
memorySize: 512,
timeout: Duration.seconds(30),
entry: './src/redis-client.ts',
handler: 'handler',
bundling: {
minify: true,
mainFields: ['module', 'main'],
sourceMap: true,
format: OutputFormat.ESM,
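// the banner shims `require` so CommonJS dependencies keep working when bundled as ESM (see the CJS/ESM note above)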
banner:
"import { createRequire } from 'module';const require = createRequire(import.meta.url);",
metafile: true,
},
vpc,
securityGroups: [fnSecurityGroup],
});
fn2.addEnvironment('CACHE_ENDPOINT', serverlessCache.endpointAddress);
fn2.addEnvironment('CACHE_PORT', serverlessCache.endpointPort.toString());
// #region API Gateway
const api = new HttpApi(this, 'HttpApi');
api.addRoutes({
path: '/valkey',
methods: [HttpMethod.GET],
integration: new HttpLambdaIntegration('ValkeyIntegration', fn),
});
api.addRoutes({
path: '/redis',
methods: [HttpMethod.GET],
integration: new HttpLambdaIntegration('RedisIntegration', fn2),
});
new CfnOutput(this, 'APIEndpoint', {
value: api.apiEndpoint,
});
}
}

Click here to see `@valkey/valkey-glide` function

import { GlideClient } from '@valkey/valkey-glide';
const endpoint = process.env.CACHE_ENDPOINT || '';
const port = process.env.CACHE_PORT || '6379';
const redis = await GlideClient.createClient({
addresses: [
{
host: endpoint,
port: Number(port),
},
],
useTLS: true,
});
export const handler = async () => {
// write
await redis.set('valkey-key', 'value');
console.log('Set key to value');
// read
const value = await redis.get('valkey-key');
console.log('Got value:', value);
return {
statusCode: 200,
body: JSON.stringify('Hello, World!'),
};
};

Click here to see `@redis/client` function

import { createClient } from '@redis/client';
const endpoint = process.env.CACHE_ENDPOINT || '';
const port = process.env.CACHE_PORT || '6379';
const redis = createClient({
username: 'default',
socket: {
tls: true,
host: endpoint,
port: Number(port),
},
});
await redis.connect();
export const handler = async () => {
// write
await redis.set('redis-key', 'value');
console.log('Set key to value');
// read
const value = await redis.get('redis-key');
console.log('Got value:', value);
return {
statusCode: 200,
body: JSON.stringify('Hello, World!'),
};
};

I ran the test by making 2K requests with 5 concurrent connections made using 5 parallel requests - aka 25 workers. The load test was carried out using oha:

oha -n 2000 -c 5 -p 5 --latency-correction --disable-keepalive $API_ENDPOINT/valkey -o valkey.txt --no-tui
oha -n 2000 -c 5 -p 5 --latency-correction --disable-keepalive $API_ENDPOINT/redis -o redis.txt --no-tui

I repeated the tests 3 times and here's a sample of results for both:

`@valkey/valkey-glide`

Summary:
Success rate: 100.00%
Total: 61.9826 secs
Slowest: 0.2301 secs
Fastest: 0.1313 secs
Average: 0.1548 secs
Requests/sec: 32.2671
Total data: 29.30 KiB
Size/request: 15 B
Size/sec: 484 B
Response time histogram:
0.131 [1] |
0.141 [285] |■■■■■■■■■■■■■■
0.151 [620] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.161 [456] |■■■■■■■■■■■■■■■■■■■■■■■
0.171 [391] |■■■■■■■■■■■■■■■■■■■■
0.181 [203] |■■■■■■■■■■
0.191 [37] |■
0.200 [4] |
0.210 [1] |
0.220 [1] |
0.230 [1] |
Response time distribution:
10.00% in 0.1398 secs
25.00% in 0.1441 secs
50.00% in 0.1530 secs
75.00% in 0.1641 secs
90.00% in 0.1727 secs
95.00% in 0.1766 secs
99.00% in 0.1843 secs
99.90% in 0.2198 secs
99.99% in 0.2301 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0925 secs, 0.0785 secs, 0.1533 secs
DNS-lookup: 0.0001 secs, 0.0000 secs, 0.0462 secs
Status code distribution:
[200] 2000 responses

`@redis/client`

Summary:
Success rate: 100.00%
Total: 62.0723 secs
Slowest: 0.2635 secs
Fastest: 0.1305 secs
Average: 0.1550 secs
Requests/sec: 32.2205
Total data: 29.30 KiB
Size/request: 15 B
Size/sec: 483 B
Response time histogram:
0.130 [1] |
0.144 [458] |■■■■■■■■■■■■■■■■■■■■
0.157 [726] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.170 [557] |■■■■■■■■■■■■■■■■■■■■■■■■
0.184 [231] |■■■■■■■■■■
0.197 [16] |
0.210 [4] |
0.224 [2] |
0.237 [0] |
0.250 [0] |
0.264 [5] |
Response time distribution:
10.00% in 0.1395 secs
25.00% in 0.1445 secs
50.00% in 0.1533 secs
75.00% in 0.1635 secs
90.00% in 0.1726 secs
95.00% in 0.1773 secs
99.00% in 0.1848 secs
99.90% in 0.2560 secs
99.99% in 0.2635 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0923 secs, 0.0778 secs, 0.1525 secs
DNS-lookup: 0.0001 secs, 0.0000 secs, 0.0621 secs
Status code distribution:
[200] 2000 responses

Both performed quite similarly across all metrics, with less than 1% of variance in all key metrics:
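Here's a side-by-side view compiled from the numbers in the two runs reported above:

| Metric | `@valkey/valkey-glide` | `@redis/client` |
| --- | --- | --- |
| Total time | 61.9826 s | 62.0723 s |
| Requests/sec | 32.2671 | 32.2205 |
| Average latency | 0.1548 s | 0.1550 s |
| p50 | 0.1530 s | 0.1533 s |
| p99 | 0.1843 s | 0.1848 s |
| Slowest | 0.2301 s | 0.2635 s |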
The latency profiles are nearly identical:
Redis has a slightly higher maximum latency (0.2635s vs 0.2301s for Valkey), but this affects only the very top percentiles (99.9%+).

Conclusion

Based on what I see above, I am inclined to choose
With that said, even if by default in our dev environment and Lambda layer we'll go with

I expect our use case to really just use a handful of methods:

Finally, if you see any mistake or inaccuracy in the arguments above or in the benchmarks, please do point them out - I will be more than happy to amend the recommendation.
@dreamorosi Great analysis. After reading your comment, I now think it makes more sense to use
I apologize for the long comment; it's out of love for our craft, not argumentativeness.

Performance-wise, did you consider the ability to use

What about the security of a client that follows, and will keep following, Valkey, and is open source? For the next version (in a month, give or take), we're introducing batching for clusters as well, meaning you can pipeline multi-shard commands together while the library handles them. But on top of performance, what glide was designed for in the first place, and excels at, is reliability, fault tolerance, and solving real pains, which we've learned through years of working with users. Glide also has an awesome community and strong backing; while the giants push forward, there's also a lot of community initiative that is amazing to see. For the different types of connection, we will release a lazy connection feature; then the behavior will be the same if chosen. For a comparison of fault tolerance, see the number of errors per client; for Valkey 8.1 vs. Redis 7 performance, see iovalkey vs. ioredis, which share the same code but are connected to the same cluster: The benchmark is for the integration of glide, iovalkey, and ioredis with rate-limiter-flexible, and doesn't compare the clients' performance directly. But whatever you decide, you are great, and it's really just because I love our project.
Thank you for your reply, no need to apologize for being passionate - it's nice to see it and I get it. Let me address some of your points:
No, I didn't consider it because our use case reads and writes exactly and at most one value at a time for each request coming to an AWS Lambda function. If I understand the docs for these two methods correctly, they're used to set/get a list of values. If so, our use case won't benefit from them because Lambda's programming model always processes one request at a time, so our Idempotency utility also only needs to set/get one item at a time. See the request flow diagrams in our docs to understand what I mean.
I hear you, and I am aware of the history in this space, but I am not going to make a decision based on FUD. As of today, both

Conversely, knowing that a client follows Valkey, while nice to have, is not necessarily a goal for us. I'm personally very excited about Valkey, and I am glad that AWS is investing in it, but when it comes to Powertools for AWS, we want to make sure our customers can use our Idempotency utility with as many engines as possible.
This is interesting, and so far the only tangible benefit in favor of the

Based on what I see here, the developer needs to provide the client AZ to the

This means that in practice, customers need to hardcode the value and also configure their functions to run in a single subnet, which is the only way to guarantee that they'll run in a given AZ. I'm not sure these are good ideas - but if there's a way to get the current AZ from within a Lambda function, then my entire argument is wrong and this is actually a plus.
I agree that when you want to run an in-memory, high-performance, key-value datastore on AWS, Valkey is probably the top option today; however, that is not what this argument is about. In this discussion we're trying to choose which Node.js client we'll support in Powertools for AWS.
As mentioned above in relation to
Same as
This is nice and it'd be useful for us. Do you have an ETA for when this will be released?
That is great to hear; obviously my tests above evaluate what's available now. Once @arnabrahman's implementation is nearly done, I think it will be easy enough to swap the client and run the benchmark again - this way we can compare the actual implementation and not a toy example like I did above. Overall it's great to see that there's a lot of movement on the Valkey client and also a clear roadmap. The main concerns about the tradeoffs of deployment complexity still stand though. Is your team by chance planning to publish public Lambda layers for the client? This would definitely help the argument, and our team is happy to do a knowledge transfer to help you set things up if needed. Like I said in my previous comment, I still want us to explore the option of supporting both clients and making it very clear in the docs that we support both. However, when it comes to our Powertools for AWS Lambda layer, we're not ready to take on the complexity of supporting architecture-specific dependencies. There are a couple of ideas around it that I'd like to test, but I won't be able to do so before a couple of weeks from now. To be clear, for now @arnabrahman can continue the implementation with whichever of the two clients. Once we have a PR up, I'll spend some time seeing if we can make it generic enough to support both. Then we'll take it from there.
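To make the "support both clients" option more concrete, here is a hypothetical sketch - the names and shape below are invented for illustration, not an agreed design - of how the persistence layer could depend on a minimal client contract that either `@redis/client` or `@valkey/valkey-glide` could satisfy through a thin adapter:

// Hypothetical minimal contract - method names are illustrative only.
interface CacheClient {
  get(key: string): Promise<string | null>;
  set(
    key: string,
    value: string,
    options?: { ttlMs?: number; onlyIfNotExists?: boolean }
  ): Promise<boolean>;
  delete(key: string): Promise<void>;
}

// A persistence layer written against the contract is client-agnostic:
// customers pass whichever client (wrapped in an adapter) they prefer.
class CachePersistenceLayer {
  readonly #client: CacheClient;

  public constructor(options: { client: CacheClient }) {
    this.#client = options.client;
  }

  // ...idempotency record handling built on top of this.#client
}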
@dreamorosi Thanks for the comprehensive answer! I'll react to some points.
That is a wrong assumption: one trigger != one set of commands.
It is actually the opposite: we support any open-source version of Redis from 6.2 forward; it's part of our test matrices. Your Lambda users mainly need support for the ElastiCache versions, or for the stores available in the Linux distros they use for their KV store. Therefore, the chances that Lambda users will want integration with Redis 7.4 are lower than for the versions we support, for example Redis 7 or Valkey 8. On the other hand, Jedis's client-side caching for Redis 7.4 forward only is a clear vendor lock-in move by the client. Client-side caching has nothing to do with newer versions; it's a feature on the client side, not a missing feature in the engine. Following Valkey means benefiting from new features without locking users out of OSS versions of Redis.
Two points - B. You can: not directly through the Lambda API, but it is possible if your Lambda is attached to a VPC. I know users who are doing it, so I'm sure it is possible, and that's a huge cost reduction - EC2 doesn't charge you for the traffic.
I think we answered this, and shared our view on it, elsewhere in the discussion. My main point is that people will follow Valkey in the AWS environment, and you'd better have a client that supports the features it offers, rather than one whose features exist only in closed-source versions of Redis.
Same answer, but with more emphasis: you can accumulate all the actions you need for something and perform them at once, e.g. both setting the user session and getting their recommended audio tracks page.
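The commenter is referring to glide's upcoming batching, but the general idea - queueing several commands and sending them in one round trip - can be sketched today with `@redis/client`'s multi/exec pipelining; the endpoint and keys below are placeholders:

import { createClient } from '@redis/client';

const redis = createClient({
  socket: {
    tls: true,
    host: process.env.CACHE_ENDPOINT || '',
    port: Number(process.env.CACHE_PORT || '6379'),
  },
});
await redis.connect();

// Queue both commands locally, then send them together in a single round trip.
const [setReply, recommendedTracks] = await redis
  .multi()
  .set('session:123', 'active')
  .get('recommended:123')
  .exec();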
A little more than a month. But a workaround already exists until then.
Let's meet and talk about that; if we can work together to create such a thing, there will be an enormous number of happy users. I think that's exactly the kind of thing we can create together, each of us bringing our specialty and knowledge, and creating a healthy collaboration.
Use case
The Idempotency utility currently supports only DynamoDB as a persistence layer.
With AWS announcing Amazon ElastiCache for Valkey, we would like to understand if there's demand for the Idempotency utility in Powertools for AWS Lambda (TypeScript) supporting Redis-compatible persistence layers.
Important
We are opening this issue to gauge demand for this feature. If you're interested, please leave a 👍 under this issue. If you'd like, consider also leaving a comment with your use case. If you are not comfortable sharing details in public, you can also do so by emailing us at [email protected] with your work email.
Solution/User Experience
From a customer perspective, using ElastiCache as a persistence layer should be as transparent as possible, and the DX should look the same as today, except that instead of instantiating a DynamoDBPersistenceLayer, you'd be instantiating an ElastiCachePersistenceLayer (name TBD). Below is a high-level example of how it'd look:
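A minimal sketch of what this could look like - the class name, import path, and options below are placeholders, since the final API is TBD (see the note that follows):

import { makeIdempotent } from '@aws-lambda-powertools/idempotency';
// Hypothetical import path and class name - final naming is TBD.
import { ElastiCachePersistenceLayer } from '@aws-lambda-powertools/idempotency/elasticache';

const persistenceStore = new ElastiCachePersistenceLayer({
  // Hypothetical option: endpoint of the Redis-compatible cache.
  url: process.env.CACHE_URL || '',
});

export const handler = makeIdempotent(
  async (event: unknown) => {
    // ...your business logic here
    return { statusCode: 200 };
  },
  { persistenceStore }
);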
Note
The API shown above is just for illustration purposes and might be different in the final implementation. We however welcome comments and feedback if you have any.
Alternative solutions
The feature is already available in Powertools for AWS Lambda (Python), so we should use that as reference.
Acknowledgment
Future readers
Please react with 👍 and your use case to help us understand customer demand.