Can powertools ensure the idempotence of all kinds of functions #801

Nsupyq · 2021-11-04T17:45:40Z

While reading the idempotency documentation, I wondered whether Powertools can make any function idempotent.

For example, if a function increments a value stored in DynamoDB or another database, I think it cannot be idempotent unless writing the function's return value and incrementing the value happen atomically.

I suggest that the documentation describe in detail which kinds of functions can be made idempotent with Powertools.
Please let me know if my understanding is correct.

boring-cyborg · 2021-11-04T17:45:41Z

Thanks for opening your first issue here! We'll come back to you as soon as we can.

to-mc · 2021-11-05T09:34:28Z

If you wrap a function with the @idempotent_function decorator, the entire function behaves idempotently: if it is called multiple times with the same arguments within the idempotency expiry period, the code in its body is executed only the first time. This is true regardless of the contents of the function, and the utility doesn't alter the decorated function in any way. Subsequent calls receive the same response, which the idempotency utility retrieves from its data store instead of executing the function again.

The idempotency of the specific operations contained within the function (like updating a counter) should not be relevant here. Take the below example:

@idempotent_function(data_keyword_argument="data", config=config, persistence_store=dynamodb)
def dummy(arg_one, arg_two, data: dict, **kwargs):

    # make an "unsafe" update to dynamodb counter
    get_counter_value_from_dynamodb(Key=data)
    increment_counter_value()
    set_counter_to_new_value_in_dynamodb(Key=data)
    #############################################


    return {"data": counter}

In this case, the idempotency utility will not allow any of the code in this function to be executed more than once within the idempotency expiry period. If you call the function a second (and third, fourth, and so on...) time after the first execution has completed, the idempotency utility will deliver the same response as the first execution, without running the function code again. Assuming there's a separate counter for each data value in the example, there should be no possibility that the same counter is updated more than once at the same time.
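A minimal sketch of the behaviour described above, assuming a toy in-memory store. It is not the Powertools implementation: `idempotent_sketch`, `_store` and `counters` are hypothetical names, and expiry is left out.

```python
import functools
import json

# Hypothetical in-memory stand-in for the persistence store (not DynamoDB).
_store = {}

# Stands in for the DynamoDB counter table.
counters = {}


def idempotent_sketch(func):
    """Toy decorator: run func once per distinct payload, then replay the saved result."""

    @functools.wraps(func)
    def wrapper(data):
        # Derive the idempotency key from the payload.
        key = json.dumps(data, sort_keys=True)
        if key in _store:
            return _store[key]  # replay the stored response; the body is not executed
        result = func(data)
        _store[key] = result
        return result

    return wrapper


@idempotent_sketch
def dummy(data):
    # Non-atomic read-modify-write, like the "unsafe" update in the example.
    name = data["foo"]
    counters[name] = counters.get(name, 0) + 1
    return {"data": counters[name]}


first = dummy({"foo": "bar"})
second = dummy({"foo": "bar"})  # same payload: the stored response is returned
other = dummy({"foo": "baz"})   # different payload: the body runs
```

In this sketch the counter for "bar" is incremented once even though `dummy` was called twice with that payload.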

Disclaimer: the example is very much a contrived one; there are better ways to do this with DynamoDB alone that don't require the idempotency utility.
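The reply does not say which DynamoDB-only approach it has in mind. One possibility, offered here as an assumption, is an atomic counter using an `ADD` update expression. The sketch below takes any object that behaves like a boto3 DynamoDB `Table` resource; the key name `id` and the attribute name `counter_value` are hypothetical.

```python
def increment_counter(table, counter_id, amount=1):
    """Atomically add `amount` to a counter item and return the new value.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    The key name "id" and attribute name "counter_value" are hypothetical.
    """
    response = table.update_item(
        Key={"id": counter_id},
        # ADD performs the read-modify-write inside DynamoDB in one request.
        UpdateExpression="ADD counter_value :inc",
        ExpressionAttributeValues={":inc": amount},
        ReturnValues="UPDATED_NEW",
    )
    return response["Attributes"]["counter_value"]
```

Note that this makes the increment atomic, not idempotent: a retried request would still add to the counter a second time.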

Does that answer your question?

Nsupyq · 2021-11-05T13:23:24Z

Thank you for your answer @cakepietoast!
But I still have a question.

> If you call the function a second (and third, fourth, and so on...) time after the first execution has completed, the idempotency util will deliver the same response as the first execution, without running the function code again.

It seems that Powertools only considers retries that happen after a function has completed. What happens if the function fails after set_counter_to_new_value_in_dynamodb(Key=data) and before return {"data": counter}? According to the documentation, in this case Powertools will not write the return value into DynamoDB, so the function can be executed a second time. But the counter was already increased by the first, failed execution, so it will be increased twice on retry. Can Powertools handle this kind of retry properly?

to-mc · 2021-11-10T15:23:17Z

> I am wondering what will happen if the function fails after set_counter_to_new_value_in_dynamodb(Key=data) and before return {"data": counter}. According to the document, in this case, the powertool will not write the return value into DynamoDB and the function can be executed for the second time.

This is correct, though you do have control over this as a user of the library. If you don't want the function to be retried in its entirety, you can catch any exceptions and return a valid response from your Lambda function instead of allowing the exception to bubble up. Example:

@idempotent_function(data_keyword_argument="data", config=config, persistence_store=dynamodb)
def dummy(arg_one, arg_two, data: dict, **kwargs):

    # make an "unsafe" update to dynamodb counter
    get_counter_value_from_dynamodb(Key=data)
    increment_counter_value()
    set_counter_to_new_value_in_dynamodb(Key=data)
    
    try:
        some_other_call_that_raises_an_exception()
    except Exception as err:
        logger.error(err)
        return {"data": None, "error": str(err)}


    return {"data": counter}

This is mentioned in the handling exceptions section of the docs.
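To make the difference concrete, here is a small simulation under stated assumptions: a toy in-memory decorator that saves nothing when the wrapped function raises, which mirrors (but is not) the Powertools behaviour described above. All names, including `flaky_call`, `bubbling` and `catching`, are hypothetical.

```python
import json

_store = {}    # hypothetical in-memory persistence store
counters = {}  # stands in for the DynamoDB counters


def idempotent_sketch(func):
    """Toy decorator: save the result on success, save nothing if func raises."""

    def wrapper(data):
        key = func.__name__ + ":" + json.dumps(data, sort_keys=True)
        if key in _store:
            return _store[key]
        result = func(data)  # an exception here propagates before anything is saved
        _store[key] = result
        return result

    return wrapper


def flaky_call(state):
    """Hypothetical downstream call that fails on its first invocation only."""
    state["calls"] += 1
    if state["calls"] == 1:
        raise RuntimeError("downstream failure")


state_a = {"calls": 0}
state_b = {"calls": 0}


@idempotent_sketch
def bubbling(data):
    counters["a"] = counters.get("a", 0) + 1
    flaky_call(state_a)  # the exception bubbles up, so no result is stored
    return {"data": counters["a"]}


@idempotent_sketch
def catching(data):
    counters["b"] = counters.get("b", 0) + 1
    try:
        flaky_call(state_b)
    except Exception as err:
        # Returning a valid response means it is stored and replayed on retry.
        return {"data": None, "error": str(err)}
    return {"data": counters["b"]}


payload = {"foo": "bar"}

try:
    bubbling(payload)
except RuntimeError:
    pass
bubbling(payload)  # retry: the body runs again and increments a second time

first = catching(payload)   # error response is stored
second = catching(payload)  # replayed without running the body
```

In this sketch `counters["a"]` ends at 2 while `counters["b"]` stays at 1, at the cost of the stored response being the error payload.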

Having said that, it is a good idea to make your idempotent functions as small as you possibly can, keeping any code that doesn't need to be idempotent outside the function. To continue with my (increasingly contrived) example from above:

def lambda_handler(event, context):
    do_some_stuff()
    result = dummy("one", "two", {"foo": "bar", "baz": "qux"})
    some_other_call_that_raises_an_exception()


@idempotent_function(data_keyword_argument="data", config=config, persistence_store=dynamodb)
def dummy(arg_one, arg_two, data: dict, **kwargs):

    # make an "unsafe" update to dynamodb counter
    get_counter_value_from_dynamodb(Key=data["foo"])
    increment_counter_value()
    set_counter_to_new_value_in_dynamodb(Key=data["foo"])
    return {"data": counter}

In this case, the code that can raise an exception, but is unrelated to the code that needs to be idempotent, sits outside the idempotent function. When an exception is raised there, it is outside the context of the idempotent function and does not cause the idempotency record to be deleted. I can see that the exception handling part of the documentation needs updating to reflect this: it was written before we implemented the idempotent_function decorator and doesn't account for it. I'll make these changes in PR #808 to clarify.
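The same point as a runnable sketch, again assuming a toy in-memory decorator rather than the real utility. `unrelated_call` is a hypothetical step that fails on the first attempt only.

```python
import json

_store = {}    # hypothetical in-memory persistence store
counters = {}  # stands in for the DynamoDB counters
attempts = {"n": 0}


def idempotent_sketch(func):
    """Toy decorator: replay the saved result for a payload seen before."""

    def wrapper(data):
        key = json.dumps(data, sort_keys=True)
        if key not in _store:
            _store[key] = func(data)
        return _store[key]

    return wrapper


@idempotent_sketch
def dummy(data):
    name = data["foo"]
    counters[name] = counters.get(name, 0) + 1
    return {"data": counters[name]}


def unrelated_call():
    """Hypothetical step that fails on the first attempt only."""
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("unrelated failure")


def lambda_handler(event, context):
    result = dummy(event)  # completes and stores its result
    unrelated_call()       # raises outside the idempotent function
    return result


event = {"foo": "bar", "baz": "qux"}

try:
    lambda_handler(event, None)
except RuntimeError:
    pass

retry_result = lambda_handler(event, None)  # dummy's stored result is replayed
```

The handler fails once and is retried, but in this sketch the counter is incremented a single time because the result of `dummy` was already stored.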

Nsupyq · 2021-11-11T11:50:03Z

Thank you for your detailed explanation @cakepietoast!
I still have a question. Runtime exceptions are not the only thing that can trigger a failure and a retry. Other events, such as a system crash or a hardware fault, can also make the function fail, and the function cannot catch those.

def dummy(arg_one, arg_two, data: dict, **kwargs):
    get_counter_value_from_dynamodb(Key=data["foo"])
    increment_counter_value()
    set_counter_to_new_value_in_dynamodb(Key=data["foo"])
    return {"data": counter}

For example, if the machine running the function crashes after executing set_counter_to_new_value_in_dynamodb and before the function result is written to DynamoDB, the function will be retried and will increase the counter again.

I am wondering how Powertools addresses this kind of failure.

to-mc · 2021-11-11T15:14:49Z

> Thank you for your detailed explanation @cakepietoast ! I still have a question. I think that the runtime exception is not the only factor that can trigger failure and retry. Some other things, such as system crash and hardware fault, can also cause function failure. Then the function cannot catch these factors.
>
> def dummy(arg_one, arg_two, data: dict, **kwargs):
>     get_counter_value_from_dynamodb(Key=data["foo"])
>     increment_counter_value()
>     set_counter_to_new_value_in_dynamodb(Key=data["foo"])
>     return {"data": counter}
>
> For example, when the machine running the function crashes after executing set_counter_to_new_value_in_dynamodb and before writing the function result into DynamoDB, the function will be retried and increases the counter again.
>
> I am wondering how the powertool addresses this kind of failure.

It is important to remember that Powertools is "just" a library that executes within the scope of your Lambda function. It "wraps" your decorated Python function, injecting its idempotency logic before and after the function is executed. If the underlying hardware fails while your decorated function is executing, no more code can run, including any Powertools idempotency logic.

Specifically in the scenario you describe, when your Python function is executed, the following will happen:

1. An INPROGRESS idempotency record is written to the persistent store to acquire a lock before any of your function code is allowed to begin executing.
2. Your function begins executing, and successfully executes set_counter_to_new_value_in_dynamodb.
3. The underlying hardware crashes before the function successfully returns.
4. No more code is executed, so the idempotency logic cannot delete/update the INPROGRESS record to release the lock.
5. On subsequent executions, it looks like your Python function is still in progress. The idempotency utility fails to acquire the lock and raises an IdempotencyAlreadyInProgressError rather than executing your function code.
6. The idempotency record expires after the period you configured, and the function becomes retryable again.

As a side note: you can replace step 3 above with "the Lambda function times out", as the behaviour there is the same.
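The sequence above can be simulated with a toy lock-and-expiry store. This is a sketch under stated assumptions, not the Powertools implementation: `ToyStore`, `run`, `HardCrash` and `AlreadyInProgress` are hypothetical names (the last one stands in for IdempotencyAlreadyInProgressError), and a fake clock replaces real time.

```python
class AlreadyInProgress(Exception):
    """Stand-in for the IdempotencyAlreadyInProgressError mentioned above."""


class HardCrash(BaseException):
    """Simulates the host dying: no cleanup code gets to run."""


class ToyStore:
    """Hypothetical in-memory persistence layer with per-record expiry."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock
        self.records = {}  # key -> {"status", "result", "expires_at"}

    def acquire(self, key):
        """Return a live COMPLETED record, or take the lock by writing INPROGRESS."""
        record = self.records.get(key)
        now = self.clock()
        if record and record["expires_at"] > now:
            if record["status"] == "INPROGRESS":
                raise AlreadyInProgress(key)
            return record
        self.records[key] = {"status": "INPROGRESS", "result": None,
                             "expires_at": now + self.ttl}
        return None

    def complete(self, key, result):
        self.records[key] = {"status": "COMPLETED", "result": result,
                             "expires_at": self.clock() + self.ttl}


def run(store, key, func):
    # No exception handling on purpose: raising inside func models a crash,
    # after which the INPROGRESS record is left behind.
    record = store.acquire(key)
    if record is not None:
        return record["result"]
    result = func()
    store.complete(key, result)
    return result


now = [0.0]  # fake clock, advanced by hand
store = ToyStore(ttl_seconds=60, clock=lambda: now[0])
counter = {"value": 0}


def increment_then_crash():
    counter["value"] += 1
    raise HardCrash()


def increment():
    counter["value"] += 1
    return {"data": counter["value"]}


try:
    run(store, "k", increment_then_crash)  # lock taken, counter updated, then crash
except HardCrash:
    pass

try:
    run(store, "k", increment)  # lock is still held
    blocked = False
except AlreadyInProgress:
    blocked = True

now[0] = 61.0  # the record has expired
result = run(store, "k", increment)  # the function is retryable again
```

In this sketch the retry is blocked while the record is live, and once it expires the body runs again, so the counter ends at 2. That is the double increment raised in the question: the lock delays the retry but cannot undo a side effect that completed before the crash.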

Nsupyq · 2021-11-11T17:15:16Z

Ok, I see. Thank you very much!

Nsupyq added the documentation (Improvements or additions to documentation) label Nov 4, 2021

to-mc self-assigned this Nov 4, 2021

to-mc added the area/idempotency and need-more-information (Pending information to continue) labels Nov 5, 2021

aws-powertools locked and limited conversation to collaborators Nov 12, 2021

to-mc closed this as completed Nov 12, 2021

This issue was moved to a discussion.