Skip to content

Feature request: Support of custom serizialization/deserisialization (object -> dict) on idempotency decorator 'idempotent_function' #2886

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 tasks done
aradyaron opened this issue Jul 31, 2023 · 15 comments · Fixed by #2951
Assignees
Labels
feature-request feature request idempotency Idempotency utility

Comments

@aradyaron
Copy link
Contributor

aradyaron commented Jul 31, 2023

Use case

I use Pydantic as a representation of my DTO. As such, I receive and return pydantic objects as part of my internal function calls.
Currently, the idempotency decorator supports returning dictionary only (this was also discussed here #2164 ).

I would like to be able bring my own serialization logic in order to support custom DTO implementations.

Solution/User Experience

User will be able to add a serailizer/deserializer in order to control how it is done using simple function parameters:

def idempotent_function(
    function: Optional[AnyCallableT] = None,
    *,
    data_keyword_argument: str,
    persistence_store: BasePersistenceLayer,
    config: Optional[IdempotencyConfig] = None,
    input_serializer: Optional[Callable[[Any], Dict]] = None,
    output_serializer: Optional[Callable[[Any], Dict]] = None,
    output_deserializer: Optional[Callable[[Dict], Any]] = None,
) -> Any:

For example

  class PaymentOutput(BaseModel):
      customer_id: str
      transaction_id: str

  @idempotent_function(
      data_keyword_argument="payment",
      persistence_store=persistence_layer,
      config=config,
      input_serializer=lambda x: x.dict(),
      output_serializer=lambda x: x.dict(),
      output_deserializer=PaymentOutput.parse_obj,
  )
) -> Any:

As there is currently an existing implementation of input serialization (object -> dict), not supplying any input_serializer will use the existing "best effort" implementation.

Alternative solutions

User will be able to add a serailizer/deserializer in order to control how it is done using the IdempotencyConfig:

class IdempotencyConfig:
    def __init__(
        self,
        event_key_jmespath: str = "",
        payload_validation_jmespath: str = "",
        jmespath_options: Optional[Dict] = None,
        raise_on_no_idempotency_key: bool = False,
        expires_after_seconds: int = 60 * 60,  # 1 hour default
        use_local_cache: bool = False,
        local_cache_max_items: int = 256,
        hash_function: str = "md5",
        lambda_context: Optional[LambdaContext] = None,
        input_serializer: Optional[Callable[[Any], Dict]] = None,
        output_serializer: Optional[Callable[[Any], Dict]] = None,
        output_deserializer: Optional[Callable[[Dict], Any]] = None,
    ):

Acknowledgment

@aradyaron aradyaron added feature-request feature request triage Pending triage from maintainers labels Jul 31, 2023
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 31, 2023

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link

@ran-isenberg
Copy link
Contributor

ran-isenberg commented Jul 31, 2023

@heitorlessa FYI, @aradyaron is a co-worker and a serverless ninja :)
+1 for this feature, it would really improve the utility.

@heitorlessa heitorlessa added idempotency Idempotency utility and removed triage Pending triage from maintainers labels Jul 31, 2023
@heitorlessa
Copy link
Contributor

heitorlessa commented Jul 31, 2023

heey @aradyaron firstly, big welcome and thank you for a nicely detailed feature request too!

If you have bandwidth, we'd love a PR on this so we can prioritize it (Observability Provider and Event Handler OpenAPI are taking most of our bandwidth now).

Two quick UX notes for now (might have more during implementation):

  • It'd be best to have it under IdempotencyConfig first - some customers have several functions annotated and would benefit it. We can add on a per function as an override so we can review the UX on the fly.
  • Start with output_serializer and output_deserializer
    • input_serializer could become error prone depending on the storage being used (we're adding Redis this year) or being switched to. It needs more feedback from customers: whether json_serializer makes more sense (common in other Powertools for AWS features), types of non-JSON data that isn't covered today (use cases), etc.

PS: You're lucky to work with @ran-isenberg ;)

@aradyaron
Copy link
Contributor Author

Hey @heitorlessa, thanks!
I'll open a pr on this.

  1. I'll add this configuration as part of IdempotencyConfig
  2. Regarding input_serializer, as serialization in this case means from object to dictionary, is there any other option to add to it? For now the assumption of both input and output is json/dictionary compatible.

@heitorlessa
Copy link
Contributor

Awesome! I had a piece of paper to think through this design now for the last 10m and I was wrong in recommending the global config -- it's best in the decorator as you initially suggested.

I can't think of an use case where you'd want to use the same PaymentOutput.parse_obj globally, since every annotated function will likely have a different DTO or portions of it -- easy to make a mistake here.


For input_serializer, as of now we automatically serialize Pydantic models, Dataclasses, and Event Source Data Classes to Dictionary - are there any other use cases you're in need now? My worry now is mostly having more dials without hearing concrete use cases we're enabling.

@heitorlessa
Copy link
Contributor

Ah btw, I'm on Discord if you ever get stuck -- Idempotency is the most complex part of our codebase.

Idempotency has two major areas I wish we could refactor to make contributions easier: 1/ Decouple data transformation within IdempotencyHandler, 2/ Create a private testing utilities to make it easier to author functional tests (we went too far with Pytest fixtures here).

Until we manage to make these changes, any of us are more than happy to jump on a call and co-author PRs to lower the entry bar to improve this feature :)

@aradyaron
Copy link
Contributor Author

Ok thanks for the tips @heitorlessa,
I'll take this into considerations in my Pr.

@ran-isenberg
Copy link
Contributor

@heitorlessa @aradyaron FYI, the current input serialiozation calls .json on the input class and for Pydantic V1 it works. But in Pydantic v2 its deperacred (though still works) but in v3 it will need to change to .model_dump.

@heitorlessa
Copy link
Contributor

yup, we'll take care of that on our V3 since we will drop support for Pydantic v1 :) I'm updating the roadmap page this week

@leandrodamascena
Copy link
Contributor

@heitorlessa @aradyaron FYI, the current input serialiozation calls .json on the input class and for Pydantic V1 it works. But in Pydantic v2 its deperacred (though still works) but in v3 it will need to change to .model_dump.

Thanks for bringing up this point, Ran!

This is very important to use functions/validators that are supported in both versions: v1 and v2. Some additional info to help here:

1 - If you want to test make pr with Pydantic v2: 1/ poetry remove cfn-lint 2/ poetry add "pydantic>=2.0.3"

2 - If you want to write specific tests for Pydantic v2 (not sure if that's the case), we have a fixture to skip Pydantic v1.

3 - Compability table:

Pydantic v1 Pydantic v2 V2 deprecation In use? Code change required
@validator @field_validator ✔️ ✔️ ✔️
@root_validator @model_validator ✔️ ✔️ ✔️
.parse_obj() model_validate() ✔️ ✔️
.json() model_dump_json() ✔️ ✔️
.dict() model_dump() ✔️ ✔️
.parse_raw() model_validate_json() ✔️ ✔️

Thanks everyone

@rubenfonseca
Copy link
Contributor

This makes a lot of sense to me! I agree with all the observations, including that it should be "output" only for now. We should also make sure we don't break any existing behaviour when adding the new fields.

Note that the implementation should be 100% agnostic to the backend used for serde: in other words, we shouldn't need to import Pydantic to implement this.

@aradyaron if you want to submit a PR I would love to help you merge it! This would be a great addition.

@ran-isenberg
Copy link
Contributor

This makes a lot of sense to me! I agree with all the observations, including that it should be "output" only for now. We should also make sure we don't break any existing behaviour when adding the new fields.

Note that the implementation should be 100% agnostic to the backend used for serde: in other words, we shouldn't need to import Pydantic to implement this.

@aradyaron if you want to submit a PR I would love to help you merge it! This would be a great addition.

see #2951

@rubenfonseca
Copy link
Contributor

rubenfonseca commented Aug 16, 2023

@ran-isenberg i need more coffee :)

@github-project-automation github-project-automation bot moved this from Working on it to Coming soon in Powertools for AWS Lambda (Python) Sep 5, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Sep 5, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@github-actions github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label Sep 5, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Sep 8, 2023

This is now released under 2.24.0 version!

@github-actions github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Sep 8, 2023
@leandrodamascena leandrodamascena moved this from Coming soon to Shipped in Powertools for AWS Lambda (Python) Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request feature request idempotency Idempotency utility
Projects
Status: Shipped
5 participants