Skip to content

Bug: The pydantic parser breaks when parsing JSON with a model containing a field with the type RawDictOrModel #5303

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
UlfurOrn opened this issue Oct 4, 2024 · 6 comments · Fixed by #5313
Assignees
Labels
bug Something isn't working bug-upstream Something isn't working in an upstream dependency parser Parser (Pydantic) utility

Comments

@UlfurOrn
Copy link

UlfurOrn commented Oct 4, 2024

Expected Behaviour

Parsing an event/field from JSON containing a field with the type RawDictOrModel (like EventBridgeModel) should work.

Current Behaviour

Parsing an event/field from JSON containing a field with the type RawDictOrModel (like EventBridgeModel) results in an error:

self = <pydantic.type_adapter.TypeAdapter object at 0x106895dd0>
data = '{"version":"version","id":"id","source":"source","account":"account","time":"2024-10-04T11:55:04.861457","region":"region","resources":[],"detail_type":null,"detail":{"key":"value"},"replay_name":null}'

    @_frame_depth(1)
    def validate_json(
        self, data: str | bytes, /, *, strict: bool | None = None, context: dict[str, Any] | None = None
    ) -> T:
        """Usage docs: https://docs.pydantic.dev/2.9/concepts/json/#json-parsing
    
        Validate a JSON string or bytes against the model.
    
        Args:
            data: The JSON data to validate against the model.
            strict: Whether to strictly check types.
            context: Additional context to use during validation.
    
        Returns:
            The validated object.
        """
>       return self.validator.validate_json(data, strict=strict, context=context)
E       NotImplementedError: Cannot check issubclass when validating from json, use a JsonOrPython validator instead.

Code snippet

# Minimal reproducible example:
EventBridgeModel(
    version="version",
    id="id",
    source="source",
    account="account",
    time=datetime.now(),
    region="region",
    resources=[],
    detail={"key": "value"},
).model_dump_json()

parse(event, model=EventBridgeModel)


# Actual use-case where this was discovered - parsing EventBridgeModel's using the SqsEnvelope:

event = {
    "Records": [
        {
            "messageId": "19dd0b57-b21e-4ac1-bd88-01bbb068cb78",
            "receiptHandle": "MessageReceiptHandle",
            "body": EventBridgeModel(
                version="version",
                id="id",
                source="source",
                account="account",
                time=datetime.now(),
                region="region",
                resources=[],
                detail={"key": "value"},
            ).model_dump_json(),
            "attributes": {
                "ApproximateReceiveCount": "1",
                "SentTimestamp": "1523232000000",
                "SenderId": "123456789012",
                "ApproximateFirstReceiveTimestamp": "1523232000001",
            },
            "messageAttributes": {},
            "md5OfBody": "{{{md5_of_body}}}",
            "eventSource": "aws:sqs",
            "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:MyQueue",
            "awsRegion": "us-east-1",
        }
    ]
}

parse(event, model=EventBridgeModel, envelope=SqsEnvelope)

Possible Solution

This is being caused due to the RawDictOrModel type being a union of multiple types including type[BaseModel] which is causing the issue.

AnyInheritedModel = Union[Type[BaseModel], BaseModel]
RawDictOrModel = Union[Dict[str, Any], AnyInheritedModel]

e.g. when powertools attempts to pydantic to load the JSON (using model_validate_json). It tries to validate the field against type[BaseModel] which does not seem to be allowed in a JSON parsing context.

Either this part of the type hint needs to be removed or something needs to happen in the validation logic itself.

Steps to Reproduce

See the code snippets. This only happens when the detail field (or any field of the type RawDictOrModel) of the EventBridgeModel actually contains some data.

Powertools for AWS Lambda (Python) version

latest

AWS Lambda function runtime

3.11

Packaging format used

Lambda Layers

Debugging logs

No response

@UlfurOrn UlfurOrn added bug Something isn't working triage Pending triage from maintainers labels Oct 4, 2024
Copy link

boring-cyborg bot commented Oct 4, 2024

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link

@UlfurOrn
Copy link
Author

UlfurOrn commented Oct 4, 2024

If you need any more details please let me know 🙏

@leandrodamascena
Copy link
Contributor

leandrodamascena commented Oct 4, 2024

Hi @UlfurOrn! Thanks for reporting this issue! I see other people complaining about the same thing in Pydantic and with no effective solution. It really seems like validating JSON against a model that can has a type of another model is a limitation in Pydantic.

But talking about the Powertools side, I was able to reproduce the problem here and I can confirm this is creating some problems. Due to this seems to be a limitation in Pydantic I think we can put the validation code in a try/exception block and if this raises a NotImplementedError we try to validate with validate_python instead of validate_json. Something like this:

logger.debug("parsing event against model")
try:
    if isinstance(data, str):
        logger.debug("parsing event as string")
        return adapter.validate_json(data)

    return adapter.validate_python(data)
except NotImplementedError:
    logger.debug("fallback to validate using python")
    data = json.loads(data)
    return adapter.validate_python(data)

I still need to think in a more optimized code for this, but this seems to fix the Pydantic limitation.

Either this part of the type hint needs to be removed or something needs to happen in the validation logic itself.

Removing the type hint part is not a good solution because we can break customers that are relying in this possible type and using classes instead of instances of those classes. We never know how our customers are using this project and it is better to avoid breaking things.

I'd like to hear your thoughts.

@leandrodamascena leandrodamascena added bug-upstream Something isn't working in an upstream dependency parser Parser (Pydantic) utility and removed triage Pending triage from maintainers labels Oct 4, 2024
@leandrodamascena leandrodamascena self-assigned this Oct 4, 2024
@leandrodamascena leandrodamascena moved this from Triage to Pending review in Powertools for AWS Lambda (Python) Oct 4, 2024
@github-project-automation github-project-automation bot moved this from Pending review to Coming soon in Powertools for AWS Lambda (Python) Oct 7, 2024
Copy link
Contributor

github-actions bot commented Oct 7, 2024

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@github-actions github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label Oct 7, 2024
Copy link
Contributor

github-actions bot commented Oct 8, 2024

This is now released under 3.1.0 version!

@github-actions github-actions bot removed the pending-release Fix or implementation already in dev waiting to be released label Oct 8, 2024
@UlfurOrn
Copy link
Author

UlfurOrn commented Oct 11, 2024

Hey 👋 sorry for the late reply... it seems I had disabled notifications 😓

Your solution sounds like the more correct one and works for me!

I also commented on the related issue in Pydantic and since then a fix/change has been merged.

Otherwise thanks for the quick response on this 🙏

@leandrodamascena leandrodamascena moved this from Coming soon to Shipped in Powertools for AWS Lambda (Python) Jan 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working bug-upstream Something isn't working in an upstream dependency parser Parser (Pydantic) utility
Projects
Status: Shipped
2 participants