Commit 43b72e7

author
Tom McCarthy
committed
docs: first draft of docs for idempotency util
1 parent b4490b9 commit 43b72e7

7 files changed

+372
-0
lines changed
+29
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,29 @@
1+
@startuml
2+
'https://plantuml.com/sequence-diagram
3+
4+
participant Client
5+
participant Lambda
6+
participant "Persistence layer"
7+
8+
9+
group initial request
10+
Client->Lambda:Invoke (event)
11+
Lambda->"Persistence layer":Get or set (id=event.search(payload))
12+
activate "Persistence layer"
13+
note right of "Persistence layer":Locked during this time. Prevents \nmultiple Lambda invocations with the \nsame payload running concurrently.
14+
Lambda-->Lambda:Run Lambda handler (event)
15+
Lambda->"Persistence layer":Update record with Lambda handler result¹
16+
deactivate "Persistence layer"
17+
"Persistence layer"-->"Persistence layer": Update record with result¹
18+
Client x<--Lambda:Response not received by client
19+
end
20+
21+
group retried request
22+
23+
Client->Lambda: Invoke (event)
24+
Lambda->"Persistence layer":Get or set (id=event.search(payload))
25+
Lambda<--"Persistence layer":Already exists in persistence layer. Return result¹
26+
Client<--Lambda:Response sent to client
27+
end
28+
29+
@enduml
@@ -0,0 +1,18 @@
1+
@startuml
2+
'https://plantuml.com/sequence-diagram
3+
4+
participant Client
5+
participant Lambda
6+
participant "Persistence layer"
7+
8+
9+
Client->Lambda:Invoke (event)
10+
Lambda->"Persistence layer":Get or set (id=event.search(payload))
11+
activate "Persistence layer"
12+
note right of "Persistence layer":Locked during this time. Prevents \nmultiple Lambda invocations with the \nsame payload running concurrently.
13+
Lambda-->x Lambda:Run Lambda handler (event). Raises Exception.
14+
Lambda->"Persistence layer":Delete record (id=event.search(payload))
15+
deactivate "Persistence layer"
16+
Client<--Lambda:Return error response
17+
18+
@enduml

docs/index.md

+1
@@ -152,6 +152,7 @@ aws serverlessrepo list-application-versions \
152152
| [Validation](./utilities/validation) | JSON Schema validator for inbound events and responses
153153
| [Event source data classes](./utilities/data_classes) | Data classes describing the schema of common Lambda event triggers
154154
| [Parser](./utilities/parser) | Data parsing and deep validation using Pydantic
155+
| [Idempotency](./utilities/idempotency) | Idempotent Lambda handler
155156

156157
## Environment variables
157158

docs/media/idempotent_sequence.png

72.9 KB
45.6 KB

docs/utilities/idempotency.md

+323
@@ -0,0 +1,323 @@
1+
---
2+
title: Idempotency
3+
description: Utility
4+
---
5+
6+
This utility provides a simple solution to convert your Lambda functions into idempotent operations which are safe to
7+
retry.
8+
9+
## Terminology
10+
11+
The property of idempotency means that an operation does not cause additional side effects if it is called more than
12+
once with the same input parameters. Idempotent operations will return the same result when they are called multiple
13+
times with the same parameters. This makes idempotent operations safe to retry.
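
As a minimal sketch of the difference (the `balances` dictionary, the `request_id` argument and both function names are hypothetical and not part of this utility), compare an operation that repeats its side effect on every call with one that remembers what it has already done:

```python
# Hypothetical in-memory state, used only for illustration.
balances = {"alice": 100}
processed_requests = {}


def charge(user, amount):
    """Not idempotent: every call subtracts the amount again."""
    balances[user] -= amount
    return balances[user]


def charge_idempotent(request_id, user, amount):
    """Idempotent: a repeated request_id returns the stored result."""
    if request_id not in processed_requests:
        balances[user] -= amount
        processed_requests[request_id] = balances[user]
    return processed_requests[request_id]
```

Calling `charge` twice charges twice, whereas calling `charge_idempotent` twice with the same `request_id` charges once and returns the same value both times.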
14+
15+
16+
## Key features
17+
18+
* Prevent Lambda handler code executing more than once on the same event payload during a time window
19+
* Ensure Lambda handler returns the same result when called with the same payload
20+
* Select a subset of the event as the idempotency key using JMESPath expressions
21+
* Set a time window in which records with the same payload should be considered duplicates
22+
23+
## Getting started
24+
25+
### Required resources
26+
27+
Before getting started, you need to create a DynamoDB table to store state used by the idempotency utility. Your Lambda
28+
functions will need read and write access to this table.
29+
30+
> Example using AWS Serverless Application Model (SAM)
31+
32+
=== "template.yml"
33+
    ```yaml
    Resources:
      HelloWorldFunction:
        Type: AWS::Serverless::Function
        Properties:
          Runtime: python3.8
          ...
          Policies:
            - DynamoDBCrudPolicy:
                TableName: !Ref IdempotencyTable
      IdempotencyTable:
        Type: AWS::DynamoDB::Table
        Properties:
          AttributeDefinitions:
            - AttributeName: id
              AttributeType: S
          BillingMode: PAY_PER_REQUEST
          KeySchema:
            - AttributeName: id
              KeyType: HASH
          TableName: "IdempotencyTable"
          TimeToLiveSpecification:
            AttributeName: expiration
            Enabled: true
    ```
58+
59+
!!! note
60+
When using this utility, each function invocation will generally make 2 requests to DynamoDB. If the result
61+
returned by your Lambda is less than 1 KB, you can expect 2 WCUs per invocation. For retried invocations, you will
62+
see 1 WCU and 1 RCU. Review the [DynamoDB pricing documentation](https://aws.amazon.com/dynamodb/pricing/) to
63+
estimate the cost.
64+
65+
66+
### Lambda handler
67+
68+
You can quickly start by initializing the `DynamoDBPersistenceLayer` class outside the Lambda handler, and using it
69+
with the `idempotent` decorator on your Lambda handler. There are 2 required parameters to initialize the persistence
70+
layer:
71+
72+
`table_name`: The name of the DynamoDB table to use.
73+
`event_key_jmespath`: A JMESPath expression which will be used to extract the payload from the event your Lambda handler
74+
is called with. This payload will be used as the key to decide if future invocations are duplicates.
75+
76+
=== "app.py"
77+
78+
```python hl_lines="2 6-9 11"
79+
import json
80+
from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer, idempotent
81+
82+
# Treat everything under the "body" key in
83+
# the event json object as our payload
84+
persistence_layer = DynamoDBPersistenceLayer(
85+
event_key_jmespath="body",
86+
table_name="IdempotencyTable"
87+
)
88+
89+
@idempotent(persistence_store=persistence_layer)
90+
def handler(event, context):
91+
body = json.loads(event['body'])
92+
payment = create_subscription_payment(
93+
user=body['user'],
94+
product=body['product_id']
95+
)
96+
...
97+
return {"message": "success", "statusCode": 200, "payment_id": payment.id}
98+
```
99+
=== "Example event"
100+
101+
    ```json
    {
      "version":"2.0",
      "routeKey":"ANY /createpayment",
      "rawPath":"/createpayment",
      "rawQueryString":"",
      "headers": {
        "Header1": "value1",
        "Header2": "value2"
      },
      "requestContext":{
        "accountId":"123456789012",
        "apiId":"api-id",
        "domainName":"id.execute-api.us-east-1.amazonaws.com",
        "domainPrefix":"id",
        "http":{
          "method":"POST",
          "path":"/createpayment",
          "protocol":"HTTP/1.1",
          "sourceIp":"ip",
          "userAgent":"agent"
        },
        "requestId":"id",
        "routeKey":"ANY /createpayment",
        "stage":"$default",
        "time":"10/Feb/2021:13:40:43 +0000",
        "timeEpoch":1612964443723
      },
      "body":"{\"username\":\"xyz\",\"product_id\":\"123456789\"}",
      "isBase64Encoded":false
    }
    ```
133+
134+
In this example, we have a Lambda handler that creates a payment for a user subscribing to a product. We want to ensure
135+
that we don't accidentally charge our customer by subscribing them more than once. Imagine the function executes
136+
successfully, but the client never receives the response. When we're using the idempotent decorator, we can safely
137+
retry. This sequence diagram shows an example flow of what happens in this case:
138+
139+
![Idempotent sequence](../media/idempotent_sequence.png)
140+
141+
142+
The client was successful in receiving the result after the retry. Since the Lambda handler was only executed once, our
143+
customer hasn't been charged twice.
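
The flow in the diagram can be approximated with a small in-memory sketch. This is not the utility's implementation: the `records` dictionary stands in for the persistence layer, `idempotent_sketch` is a hypothetical decorator, SHA-256 is an assumed choice of hash, and the "in progress" lock shown in the diagram is left out for brevity.

```python
import functools
import hashlib
import json

# Hypothetical stand-in for the persistence layer.
records = {}


def idempotent_sketch(handler):
    """Run the handler once per payload and replay the stored result."""

    @functools.wraps(handler)
    def wrapper(event, context):
        # Derive a key from the payload (here: everything under "body").
        payload = json.dumps(event["body"], sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()

        if key in records:
            # Retried request: return the stored result, skip the handler.
            return records[key]

        result = handler(event, context)
        records[key] = result
        return result

    return wrapper


calls = []  # Tracks how often the handler body actually runs.


@idempotent_sketch
def handler(event, context):
    calls.append(event["body"])
    return {"message": "success", "statusCode": 200}
```

Invoking `handler` twice with the same event returns the same response both times, while `calls` records only one execution of the handler body.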
144+
145+
!!! note
146+
Bear in mind that the entire Lambda handler is treated as a single idempotent operation. If your Lambda handler can
147+
cause multiple side effects, consider splitting it into separate functions.
148+
149+
### Handling exceptions
150+
151+
If your Lambda handler raises an unhandled exception, the record in the persistence layer will be deleted. This means
152+
that if the client retries, your Lambda handler will be free to execute again. If you don't want the record to be
153+
deleted, you need to catch exceptions within the handler and return a successful response.
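
A rough sketch of this behaviour, under the assumption that a record moves through an "in progress" and a "completed" state (the status strings, `run_once` and `flaky_handler` are all hypothetical names used only for illustration):

```python
# Hypothetical stand-in for the persistence layer.
records = {}


def run_once(key, handler, event):
    """Store the handler result, or delete the record if the handler raises."""
    existing = records.get(key)
    if existing is not None and existing["status"] == "COMPLETED":
        return existing["data"]

    records[key] = {"status": "INPROGRESS", "data": None}
    try:
        result = handler(event)
    except Exception:
        # Unhandled exception: remove the record so a retry can run again.
        del records[key]
        raise

    records[key] = {"status": "COMPLETED", "data": result}
    return result


attempts = {"count": 0}


def flaky_handler(event):
    """Fails on the first attempt and succeeds afterwards."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("downstream failure")
    return {"statusCode": 200}
```

The first call to `run_once` with `flaky_handler` raises and leaves no record behind, so the retry executes the handler again; any later call with the same key replays the stored result.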
154+
155+
156+
![Idempotent sequence exception](../media/idempotent_sequence_exception.png)
157+
158+
!!! warning
159+
If any of the calls to the persistence layer unexpectedly fail, `IdempotencyPersistenceLayerError` will be raised.
160+
As this happens outside the scope of your Lambda handler, you are not able to catch it.
161+
162+
### Setting a time window
163+
In most cases, it is not desirable to store the idempotency records forever. Rather, you want to guarantee that the
164+
same payload won't be executed within a period of time. By default, the period is set to 1 hour (3600 seconds). You can
165+
change this window with the `expires_after_seconds` parameter:
166+
167+
```python hl_lines="4"
DynamoDBPersistenceLayer(
    event_key_jmespath="body",
    table_name="IdempotencyTable",
    expires_after_seconds=5*60  # 5 minutes
)
```
175+
This will mark any records older than 5 minutes as expired, and the Lambda handler will be executed as normal if it is
176+
invoked with a matching payload. If you have set the TTL field in DynamoDB like in the SAM example above, the record
177+
will be automatically deleted from the table after a period of time.
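
To illustrate the idea, here is a minimal sketch of an expiry check based on a Unix timestamp stored with each record. The helper names and the record layout are assumptions; only the `expiration` attribute name and the 5 minute window come from the examples above.

```python
import time

EXPIRES_AFTER_SECONDS = 5 * 60  # 5 minutes, as in the example above


def new_record(result, now=None):
    """Build a record that expires EXPIRES_AFTER_SECONDS from now."""
    now = time.time() if now is None else now
    return {"data": result, "expiration": int(now) + EXPIRES_AFTER_SECONDS}


def is_expired(record, now=None):
    """A record counts as expired once its timestamp has been reached."""
    now = time.time() if now is None else now
    return now >= record["expiration"]
```

DynamoDB TTL deletion is not immediate, so a check like `is_expired` compares timestamps rather than relying on the item already being gone from the table.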
178+
179+
180+
### Using local cache
181+
To reduce the number of lookups to the persistence storage layer, you can enable in-memory caching with the
182+
`use_local_cache` parameter, which is disabled by default. This cache is local to each Lambda execution environment.
183+
This means it will be effective in cases where your function's concurrency is low in comparison to the number of
184+
"retry" invocations with the same payload. When enabled, the default is to cache a maximum of 256 records in each Lambda
185+
execution environment. You can change this with the `local_cache_max_items` parameter.
186+
187+
```python hl_lines="4 5"
DynamoDBPersistenceLayer(
    event_key_jmespath="body",
    table_name="IdempotencyTable",
    use_local_cache=True,
    local_cache_max_items=1000
)
```
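
For intuition, a bounded in-memory cache could look like the sketch below. The `LocalCache` class is hypothetical, and the least-recently-used eviction policy is an assumption: the text above only states that the number of cached records is capped.

```python
from collections import OrderedDict


class LocalCache:
    """Bounded cache that evicts the least recently used entry (assumed policy)."""

    def __init__(self, max_items=256):
        self.max_items = max_items
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        # Mark the entry as recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def add(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_items:
            # Drop the entry that was used least recently.
            self._items.popitem(last=False)
```

Because a cache like this lives in the memory of a single execution environment, a retry that lands on a different environment still needs a lookup in the persistence layer.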
195+
196+
197+
## Advanced
198+
199+
### Payload validation
200+
What happens if Lambda is invoked with a payload that it has seen before, but some parameters which are not part of the
201+
payload have changed? By default, Lambda will return the same result as it returned before, which may be misleading.
202+
Payload validation provides a solution to that. You can provide another JMESPath expression to the persistence store
203+
with the `payload_validation_jmespath` parameter to specify which part of the event body should be validated against previous
204+
idempotent invocations.
205+
206+
=== "app.py"
207+
    ```python hl_lines="6"
    from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer, idempotent

    persistence_layer = DynamoDBPersistenceLayer(
        event_key_jmespath="[userDetail, productId]",
        table_name="IdempotencyTable",
        payload_validation_jmespath="amount"
    )

    @idempotent(persistence_store=persistence_layer)
    def handler(event, context):
        # Creating a subscription payment is a side
        # effect of calling this function!
        payment = create_subscription_payment(
            user=event['userDetail']['username'],
            product=event['productId'],
            amount=event['amount']
        )
        ...
        return {"message": "success", "statusCode": 200,
                "payment_id": payment.id, "amount": payment.amount}
    ```
229+
=== "Event"
230+
    ```json
    {
      "userDetail": {
        "username": "User1",
        "user_email": "[email protected]"
      },
      "productId": 1500,
      "charge_type": "subscription",
      "amount": 500
    }
    ```
241+
242+
In this example, the "userDetail" and "productId" keys are used as the payload to generate the idempotency key. If
243+
we try to send the same request but with a different amount, Lambda will raise `IdempotencyValidationError`. Without
244+
payload validation, we would have returned the same result as we did for the initial request. Since we're also
245+
returning an amount in the response, this could be quite confusing for the client. By using payload validation on the
246+
amount field, we prevent this potentially confusing behaviour and instead raise an exception.
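
The check can be pictured as storing a second hash next to the idempotency key. The sketch below is illustrative only: `ValidationMismatch` is a hypothetical stand-in for the real error, `digest` and `check_or_store` are made-up helpers, and hashing with SHA-256 over JSON is an assumption.

```python
import hashlib
import json


class ValidationMismatch(Exception):
    """Hypothetical stand-in for the error raised on a payload mismatch."""


def digest(value):
    """Stable hash of a JSON-serializable value."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode()).hexdigest()


# Hypothetical stand-in for the persistence layer.
records = {}


def check_or_store(event, result):
    # The idempotency key covers userDetail and productId only.
    key = digest([event["userDetail"], event["productId"]])
    # The validation hash covers the amount.
    validation = digest(event["amount"])

    stored = records.get(key)
    if stored is None:
        records[key] = {"validation": validation, "data": result}
        return result
    if stored["validation"] != validation:
        raise ValidationMismatch("same idempotency key, different amount")
    return stored["data"]
```

Sending the example event twice replays the stored result, while sending it again with a different `amount` raises instead of returning a response that no longer matches the request.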
247+
248+
### Changing DynamoDB attribute names
249+
If you want to use an existing DynamoDB table, or wish to change the name of the attributes used to store items in the
250+
table, you can do so when you construct the `DynamoDBPersistenceLayer` instance.
251+
252+
253+
Parameter           | Default value  | Description
------------------- | -------------- | ------------
key_attr            | "id"           | Primary key of the table. Hashed representation of the payload
expiry_attr         | "expiration"   | Unix timestamp of when record expires
status_attr         | "status"       | Stores status of the Lambda execution during and after invocation
data_attr           | "data"         | Stores results of successfully executed Lambda handlers
validation_key_attr | "validation"   | Hashed representation of the parts of the event used for validation
260+
261+
This example demonstrates changing the attribute names to custom values:
262+
263+
=== "app.py"
264+
    ```python hl_lines="4-8"
    persistence_layer = DynamoDBPersistenceLayer(
        event_key_jmespath="[userDetail, productId]",
        table_name="IdempotencyTable",
        key_attr="idempotency_key",
        expiry_attr="expires_at",
        status_attr="current_status",
        data_attr="result_data",
        validation_key_attr="validation_key"
    )
    ```
275+
276+
### Customizing boto configuration
277+
You can provide custom boto configuration or even bring your own boto3 session if required by using the `boto_config`
278+
or `boto3_session` parameters when constructing the persistence store.
279+
280+
=== "Custom session"
281+
    ```python hl_lines="1 4 8"
    import boto3
    from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer, idempotent

    boto3_session = boto3.session.Session()
    persistence_layer = DynamoDBPersistenceLayer(
        event_key_jmespath="body",
        table_name="IdempotencyTable",
        boto3_session=boto3_session
    )

    @idempotent(persistence_store=persistence_layer)
    def handler(event, context):
        ...
    ```
296+
=== "Custom config"
297+
    ```python hl_lines="1 4 8"
    from botocore.config import Config
    from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer, idempotent

    boto_config = Config()
    persistence_layer = DynamoDBPersistenceLayer(
        event_key_jmespath="body",
        table_name="IdempotencyTable",
        boto_config=boto_config
    )

    @idempotent(persistence_store=persistence_layer)
    def handler(event, context):
        ...
    ```
312+
313+
### Bring your own persistent store
314+
315+
The utility provides an abstract base class which can be used to implement your choice of persistent storage layers.
316+
You can inherit from the `BasePersistenceLayer` class and implement the abstract methods `_get_record`, `_put_record`,
317+
`_update_record` and `_delete_record`. Pay attention to the documentation for each - you may need to perform additional
318+
checks inside these methods to ensure the idempotency guarantees remain intact. For example, the `_put_record` method
319+
needs to raise an exception if a non-expired record already exists in the data store with a matching key.
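
As a rough illustration of that requirement, the sketch below shows a dictionary-backed store. It deliberately does not inherit from `BasePersistenceLayer`, because the real method signatures are not shown here: the arguments, the record layout and the `RecordAlreadyExists` error are all assumptions made for this example.

```python
import time


class RecordAlreadyExists(Exception):
    """Hypothetical error; the utility may use a different exception."""


class InMemoryStore:
    """Dictionary-backed store with assumed method signatures."""

    def __init__(self):
        self._records = {}

    def _get_record(self, key):
        return self._records.get(key)

    def _put_record(self, key, record, now=None):
        now = time.time() if now is None else now
        existing = self._records.get(key)
        # Refuse to overwrite a record that has not expired yet.
        if existing is not None and existing["expiration"] > now:
            raise RecordAlreadyExists(key)
        self._records[key] = record

    def _update_record(self, key, record):
        self._records[key] = record

    def _delete_record(self, key):
        self._records.pop(key, None)
```

The check inside `_put_record` is what keeps two concurrent invocations with the same payload from both proceeding. In a real data store it would need to be an atomic conditional write rather than a read followed by a write.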
320+
321+
## Extra resources
322+
If you're interested in a deep dive on how Amazon uses idempotency when building our APIs, check out
323+
[this article](https://aws.amazon.com/builders-library/making-retries-safe-with-idempotent-APIs/).

mkdocs.yml

+1
@@ -15,6 +15,7 @@ nav:
1515
- utilities/validation.md
1616
- utilities/data_classes.md
1717
- utilities/parser.md
18+
- utilities/idempotency.md
1819

1920
theme:
2021
name: material
