Skip to content

feat(idempotency): Fix KeyError when local_cache is True and an error is raised in the lambda handler #300

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 28, 2021

Conversation

michaelbrewer
Copy link
Contributor

@michaelbrewer michaelbrewer commented Feb 27, 2021

Description of changes:

Fix a KeyError when we an error happens and local_cache is True and the cache is empty

Current fix is either check idempotency_key exists in cache before calling del:

 if idempotency_key in self._cache:
    del self._cache[idempotency_key]

NOTE: pop causes a KeyError when there is a match based on the current LRUDict implementation.

self._cache.pop(idempotency_key, None)
# Raise KeyError when idempotency_key does exist in the _cache

Alternative fix is to change the LRUDict to only expose the get, __getitem__ and __setitem__. And when
setting an idempotency_key move it up for eviction

self._cache[idempotency_key] = None

Checklist

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Fix a KeyError when we an error happens and local_cache is True and the cache is empty
@michaelbrewer
Copy link
Contributor Author

@cakepietoast @heitorlessa You might want to apply this fix before the release.

@michaelbrewer
Copy link
Contributor Author

michaelbrewer commented Feb 27, 2021

@cakepietoast i think there is some kind of bug in the LRU cache

i initially thought this would fix it:

self._cache.pop(idempotency_key, None)

But this also raises a KeyError. So i have opted for this instead.

 if idempotency_key in self._cache:
    del self._cache[idempotency_key]

@heitorlessa
Copy link
Contributor

heitorlessa commented Feb 27, 2021

What's the stack trace you get in tests if you use a safe delete instead?

self._cache.pop(idempotency_key, None) # delete key from cache, or simply return if key doesn't exist in cache

@michaelbrewer
Copy link
Contributor Author

What's the stack trace you get in tests if you use a safe delete instead?

self._cache.pop(idempotency_key, None) # delete key from cache, or simply return if key doesn't exist in cache

@heitorlessa Seems like it is crashing at self.move_to_end(key)

    @pytest.mark.parametrize("persistence_store", [{"use_local_cache": False}, {"use_local_cache": True}], indirect=True)
    def test_idempotent_lambda_expired_during_request(
        persistence_store,
        lambda_apigw_event,
        timestamp_expired,
        lambda_response,
        expected_params_update_item,
        hashed_idempotency_key,
    ):
        """
        Test idempotent decorator when lambda is called with an event it succesfully handled already. Persistence store
        returns inconsistent/rapidly changing result between put_item and get_item calls.
        """
    
        stubber = stub.Stubber(persistence_store.table.meta.client)
    
        ddb_response_get_item = {
            "Item": {
                "id": {"S": hashed_idempotency_key},
                "expiration": {"N": timestamp_expired},
                "data": {"S": '{"message": "test", "statusCode": 200}'},
                "status": {"S": "INPROGRESS"},
            }
        }
        ddb_response_get_item_missing = {}
        expected_params_get_item = {
            "TableName": TABLE_NAME,
            "Key": {"id": hashed_idempotency_key},
            "ConsistentRead": True,
        }
    
        # Simulate record repeatedly changing state between put_item and get_item
        stubber.add_client_error("put_item", "ConditionalCheckFailedException")
        stubber.add_response("get_item", ddb_response_get_item, expected_params_get_item)
    
        stubber.add_client_error("put_item", "ConditionalCheckFailedException")
        stubber.add_response("get_item", ddb_response_get_item_missing)
    
        stubber.add_client_error("put_item", "ConditionalCheckFailedException")
        stubber.add_response("get_item", copy.deepcopy(ddb_response_get_item), copy.deepcopy(expected_params_get_item))
    
        stubber.activate()
    
        @idempotent(persistence_store=persistence_store)
        def lambda_handler(event, context):
            return lambda_response
    
        # max retries exceeded before get_item and put_item agree on item state, so exception gets raised
        with pytest.raises(IdempotencyInconsistentStateError):
>           lambda_handler(lambda_apigw_event, {})

tests/functional/idempotency/test_idempotency.py:407: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
aws_lambda_powertools/middleware_factory/factory.py:133: in wrapper
    response = middleware()
aws_lambda_powertools/utilities/idempotency/idempotency.py:66: in idempotent
    return idempotency_handler.handle()
aws_lambda_powertools/utilities/idempotency/idempotency.py:120: in handle
    self.persistence_store.save_inprogress(event=self.event)
aws_lambda_powertools/utilities/idempotency/persistence/base.py:323: in save_inprogress
    if self._retrieve_from_cache(idempotency_key=data_record.idempotency_key):
aws_lambda_powertools/utilities/idempotency/persistence/base.py:270: in _retrieve_from_cache
    self._delete_from_cache(idempotency_key)
aws_lambda_powertools/utilities/idempotency/persistence/base.py:275: in _delete_from_cache
    self._cache.pop(idempotency_key, None)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = LRUDict([]), key = 'be4755f20c4e6f94f43bc3f5849a26fb'

    def __getitem__(self, key):
        value = super().__getitem__(key)
>       self.move_to_end(key)
E       KeyError: 'be4755f20c4e6f94f43bc3f5849a26fb'

aws_lambda_powertools/shared/cache_dict.py:16: KeyError

@michaelbrewer
Copy link
Contributor Author

@heitorlessa @cakepietoast got to be something strange with the LRUCache ?

@michaelbrewer
Copy link
Contributor Author

michaelbrewer commented Feb 28, 2021

@heitorlessa @cakepietoast I added a "fail" test for the LRUDict

ie this raises a KeyError:

cache = LRUDict()
cache["key"] = "value"
cache.pop("key", None)

@codecov-io
Copy link

codecov-io commented Feb 28, 2021

Codecov Report

Merging #300 (4eaffd9) into develop (fe53a2e) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #300   +/-   ##
========================================
  Coverage    99.71%   99.71%           
========================================
  Files           87       87           
  Lines         3150     3151    +1     
  Branches       149      150    +1     
========================================
+ Hits          3141     3142    +1     
  Misses           5        5           
  Partials         4        4           
Impacted Files Coverage Δ
...wertools/utilities/idempotency/persistence/base.py 99.19% <100.00%> (+<0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe53a2e...4eaffd9. Read the comment docs.

@heitorlessa
Copy link
Contributor

heitorlessa commented Feb 28, 2021 via email

@michaelbrewer
Copy link
Contributor Author

@heitorlessa weird that even the python docs does not do this:

@heitorlessa
Copy link
Contributor

@heitorlessa weird that even the python docs does not do this:

* https://docs.python.org/3/library/collections.html#ordereddict-examples-and-recipes

I was wrong - had a look at the C implementation of dict which OrderedDict inherits and the order when pop is called is: 1. contains, 2. getitem, 3. __delitem

I need to take some time to understand the implementation of OrderedDict before making a blind suggestion like checking if key is present before calling move_to_end method -- As it fails with KeyError within __getitem__, __delitem__ is never called.

@michaelbrewer
Copy link
Contributor Author

@heitorlessa would it be fine to put in this fix for now? Is your concern performance?

If you want pure performance:

https://github.com/amitdev/lru-dict

@heitorlessa
Copy link
Contributor

heitorlessa commented Feb 28, 2021 via email

@michaelbrewer
Copy link
Contributor Author

I think that even the fix might not work — Lemme know otherwise, I could be
wrong again.

I’m not concerned about Perf but correctness. I suspect subclassing
OrderedDict is causing the pop method to call the subclassed data container
and causing these odd issues — in the same way why you shouldn’t subclass
from dict but MutableMapping instead.

It’s weekend now so I’d revisit this early this week

On Sun, 28 Feb 2021 at 10:42, Michael Brewer [email protected]
wrote:

@heitorlessa https://github.com/heitorlessa would it be fine to put in
this fix for now? Is your concern performance?

If you want pure performance:

https://github.com/amitdev/lru-dict


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#300 (comment),
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAZPQBG46UUFN25PG7YMBBLTBIFY7ANCNFSM4YJP7Z3A
.

Maybe setting the value of the key to None would have the same behavior as removing; If removing (del or pop) is always going to be an issue.

Alternative fix by just exposing get, __getitem__ and __setitem__ and when setting to None moves the item to the beginning
@michaelbrewer
Copy link
Contributor Author

@heitorlessa I pushed an alternative fix which just makes LRUDict a regular class proxying calls to OrderDict, and only exposing, get, __getitem__ and __setitem__. Items that are set to None are evicted first.

@michaelbrewer
Copy link
Contributor Author

But if I had to vote, for now just go with the fix that checks key exists before calling del: e0cc7c4

As this is pretty much up you evict items anyway, even in the python docs.

@michaelbrewer michaelbrewer changed the title feat(idempotency): Fix KeyError when local_cache is True feat(idempotency): Fix KeyError when local_cache is True and an error is raised in the lambda handler Feb 28, 2021
@heitorlessa
Copy link
Contributor

But if I had to vote, for now just go with the fix that checks key exists before calling del: e0cc7c4

As this is pretty much up you evict items anyway, even in the python docs.

If that's the only place where it fails then yeah sure - it's internal only. If we end up using elsewhere in the codebase we can refactor it for correctness

Happy to merge with the fix only

@michaelbrewer
Copy link
Contributor Author

@heitorlessa done.

@michaelbrewer
Copy link
Contributor Author

@heitorlessa I would probably do a squash commit on this one :-)

@heitorlessa heitorlessa merged commit d7b4afe into aws-powertools:develop Feb 28, 2021
@heitorlessa
Copy link
Contributor

Thanks a lot @michaelbrewer !!

@michaelbrewer michaelbrewer deleted the fix-idempotency-local-cache branch February 28, 2021 22:39
@heitorlessa heitorlessa added area/utilities bug Something isn't working labels Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants