Skip to content

Bug: S3Model not compatible with ObjectRemoved events #1637

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
barreeeiroo opened this issue Oct 23, 2022 · 20 comments
Closed

Bug: S3Model not compatible with ObjectRemoved events #1637

barreeeiroo opened this issue Oct 23, 2022 · 20 comments
Labels
bug Something isn't working

Comments

@barreeeiroo
Copy link
Contributor

barreeeiroo commented Oct 23, 2022

Expected Behaviour

After deleting an object from S3, if a Lambda catches this event, they should be able to parse it. This event name from S3 is ObjectDeleted:*.

Current Behaviour

A ValidationError is thrown because of the missing size and eTag attributes.

Records -> 0 -> s3 -> object -> size
  field required (type=value_error.missing)
Records -> 0 -> s3 -> object -> eTag
  field required (type=value_error.missing)

The ObjectDeleted event does not include those attributes.

Code snippet

parse(model=S3Model, event=json.loads(s3_message))

Payload Example: #1637 (comment)

Possible Solution

Update S3Object so that both size and eTag are optional:

class S3Object(BaseModel):
    key: str
    size: Optional[NonNegativeFloat]
    eTag: Optional[str]
    sequencer: str
    versionId: Optional[str]

Steps to Reproduce

  1. Create a Lambda function and an S3 bucket
  2. Trigger the Lambda function through S3 events, and enable the ObjectDeleted type
  3. Delete an object from S3
  4. S3Model will raise a ValidationError

AWS Lambda Powertools for Python version

latest

AWS Lambda function runtime

3.8

Packaging format used

PyPi

Debugging logs

No response

@barreeeiroo barreeeiroo added bug Something isn't working triage Pending triage from maintainers labels Oct 23, 2022
@boring-cyborg
Copy link

boring-cyborg bot commented Oct 23, 2022

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #python channel on our AWS Lambda Powertools Discord: Invite link

@barreeeiroo barreeeiroo changed the title Bug: S3Model not compatible with ObjectRemoved eventName Bug: S3Model not compatible with ObjectRemoved events Oct 23, 2022
@heitorlessa
Copy link
Contributor

@barreeeiroo thank you for flagging this - would you be able to send a quick fix PR marking them as Optional, please?

We're launching V2 tomorrow morning and we could include the fix in.

Otherwise we can make a V1 patch release afterwards.

Thank you!!

@heitorlessa heitorlessa added area/parser and removed triage Pending triage from maintainers labels Oct 23, 2022
barreeeiroo added a commit to barreeeiroo/aws-lambda-powertools-python that referenced this issue Oct 23, 2022
Fixes ObjectRemoved API event, as they don't include these attributes.
aws-powertools#1637

Signed-off-by: Diego Barreiro <[email protected]>
@barreeeiroo
Copy link
Contributor Author

@heitorlessa done, the PR should be available now: PR#1638. I couldn't really test the change right now, but if strictly needed I can fully test it tomorrow.

@heitorlessa
Copy link
Contributor

@heitorlessa done, the PR should be available now: PR#1638. I couldn't really test the change right now, but if strictly needed I can fully test it tomorrow.

Thank you!! Do you happen to have a copy of the delete event?

If you do, please add to the issue. We can complete the tests before releasing it tomorrow (~10am Amsterdam time)

@barreeeiroo
Copy link
Contributor Author

Sure, here it is:

{
  "eventVersion": "2.1",
  "eventSource": "aws:s3",
  "awsRegion": "us-west-2",
  "eventTime": "2022-10-23T15:37:51.707Z",
  "eventName": "ObjectRemoved:Delete",
  "userIdentity": {
    "principalId": "[REDACTED]"
  },
  "requestParameters": {
    "sourceIPAddress": "[REDACTED]"
  },
  "responseElements": {
    "x-amz-request-id": "[REDACTED]",
    "x-amz-id-2": "[REDACTED]"
  },
  "s3": {
    "s3SchemaVersion": "1.0",
    "configurationId": "[REDACTED]",
    "bucket": {
      "name": "[REDACTED]",
      "ownerIdentity": {
        "principalId": "[REDACTED]"
      },
      "arn": "[REDACTED]"
    },
    "object": {
      "key": "[REDACTED]",
      "sequencer": "[REDACTED]"
    }
  }
}

Seems like the rest of the payload is correct.

@ran-isenberg
Copy link
Contributor

Thanks @barreeeiroo for finding and solving this!

@heitorlessa maybe we can add a validator that checks that they are indeed optional only if the event name is ObjectRemoved:Delete? For other S3object events, these fields are not optional, right?

@ran-isenberg
Copy link
Contributor

@heitorlessa i think it's also a bug for the S3Object dataclass.

@barreeeiroo
Copy link
Contributor Author

No problem @ran-isenberg! I also thought about adding a conditional validator like this:

class S3Object(BaseModel):
    key: str
    size: Optional[NonNegativeFloat] # This needs to still be defined as Optional
    eTag: Optional[str] # This needs to still be defined as Optional
    sequencer: str
    versionId: Optional[str]

class S3RecordModel(BaseModel):
    eventVersion: str
    eventSource: Literal["aws:s3"]
    awsRegion: str
    eventTime: datetime
    eventName: str
    userIdentity: S3Identity
    requestParameters: S3RequestParameters
    responseElements: S3ResponseElements
    s3: S3Message
    glacierEventData: Optional[S3EventRecordGlacierEventData]

    @root_validator
    def validate_s3_object(cls, values):
        event_name = values.get('eventName')
        s3_object = values.get('s3').get('object')
        if 'ObjectRemoved' not in event_name:
            if not s3_object.get('size') or not s3_object.get('eTag'):
                raise ValueError('S3Object.size and S3Object.eTag are required for non-ObjectRemoved events')
        return values

But wasn't really sure about it as I couldn't find any AWS official documentation stating which attributes are required for each event type. Is there any document which lists all the attributes provided for each event type from S3?

@heitorlessa
Copy link
Contributor

heitorlessa commented Oct 23, 2022 via email

@ran-isenberg
Copy link
Contributor

@heitorlessa it's not just the parser in this case, also regular data classes are broken too.

@ran-isenberg
Copy link
Contributor

No problem @ran-isenberg! I also thought about adding a conditional validator like this:

class S3Object(BaseModel):
    key: str
    size: Optional[NonNegativeFloat] # This needs to still be defined as Optional
    eTag: Optional[str] # This needs to still be defined as Optional
    sequencer: str
    versionId: Optional[str]

class S3RecordModel(BaseModel):
    eventVersion: str
    eventSource: Literal["aws:s3"]
    awsRegion: str
    eventTime: datetime
    eventName: str
    userIdentity: S3Identity
    requestParameters: S3RequestParameters
    responseElements: S3ResponseElements
    s3: S3Message
    glacierEventData: Optional[S3EventRecordGlacierEventData]

    @root_validator
    def validate_s3_object(cls, values):
        event_name = values.get('eventName')
        s3_object = values.get('s3').get('object')
        if 'ObjectRemoved' not in event_name:
            if not s3_object.get('size') or not s3_object.get('eTag'):
                raise ValueError('S3Object.size and S3Object.eTag are required for non-ObjectRemoved events')
        return values

But wasn't really sure about it as I couldn't find any AWS official documentation stating which attributes are required for each event type. Is there any document which lists all the attributes provided for each event type from S3?

I didnt find any official docs, otherwise i'd have the even name as a literal of closed event types list :(

@heitorlessa
Copy link
Contributor

heitorlessa commented Oct 23, 2022 via email

@ran-isenberg
Copy link
Contributor

@barreeeiroo yeah that makes sense! nice one.
I'll open a PR with tests in a few min.

@heitorlessa
Copy link
Contributor

No problem @ran-isenberg! I also thought about adding a conditional validator like this:

class S3Object(BaseModel):

key: str
size: Optional[NonNegativeFloat] # This needs to still be defined as Optional
eTag: Optional[str] # This needs to still be defined as Optional
sequencer: str
versionId: Optional[str]

class S3RecordModel(BaseModel):

eventVersion: str
eventSource: Literal["aws:s3"]
awsRegion: str
eventTime: datetime
eventName: str
userIdentity: S3Identity
requestParameters: S3RequestParameters
responseElements: S3ResponseElements
s3: S3Message
glacierEventData: Optional[S3EventRecordGlacierEventData]
@root_validator
def validate_s3_object(cls, values):
    event_name = values.get('eventName')
    s3_object = values.get('s3').get('object')
    if 'ObjectRemoved' not in event_name:
        if not s3_object.get('size') or not s3_object.get('eTag'):
            raise ValueError('S3Object.size and S3Object.eTag are required for non-ObjectRemoved events')
    return values

But wasn't really sure about it as I couldn't find any AWS official documentation stating which attributes are required for each event type. Is there any document which lists all the attributes provided for each event type from S3?

I didnt find any official docs, otherwise i'd have the even name as a literal of closed event types list :(

+1. It's what I shared earlier - the root cause is not having an event schema. We derived based on events we trigger - our models can only be as good as our tests and "contracts" being respected by their original producers.

So far we've noticed a few variations that end up making one or more properties to be either omitted or have different casing:

  • Event Source default settings

  • Event Source custom settings (FIFO, different action like Deleted here)

  • Event Source connected with another event source (SNS->SQS->Lambda is slightly different than SNS->Lambda)

@heitorlessa
Copy link
Contributor

heitorlessa commented Oct 23, 2022 via email

@ran-isenberg
Copy link
Contributor

ran-isenberg commented Oct 23, 2022

@heitorlessa of course. It's 11PM here, didnt expect you to merge it :)
I didnt know how to add Diego's PR to mine, but you can merge his code and then mine so his commit is added.

The validator does not break older tests and it's very specific to the delete event.

@barreeeiroo
Copy link
Contributor Author

For reference, providing this link to this AWS page: https://docs.aws.amazon.com/filegateway/latest/files3/refresh-cache.html
It shows an example payload in Step 15 for ObjectRemoved where they show the object having only eTag and key attributes.

barreeeiroo added a commit to barreeeiroo/aws-lambda-powertools-python that referenced this issue Oct 23, 2022
Fixes ObjectRemoved API event, as they don't include these attributes.
aws-powertools#1637

Signed-off-by: Diego Barreiro <[email protected]>
@heitorlessa heitorlessa added the pending-release Fix or implementation already in dev waiting to be released label Oct 24, 2022
@heitorlessa
Copy link
Contributor

Releasing v2.0.0a4 pre-release: https://github.com/awslabs/aws-lambda-powertools-python/actions/runs/3310473745

We should make the official launch at 10am CET (Amsterdam time)

@heitorlessa heitorlessa removed the pending-release Fix or implementation already in dev waiting to be released label Oct 24, 2022
@heitorlessa
Copy link
Contributor

@github-actions
Copy link
Contributor

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants