S3 object Content Encoding Metadata not being set on upload #3750

Closed
michaeljohnalbers opened this issue Feb 7, 2023 · 11 comments
Labels
bug This issue is a bug. crt-client p2 This is a standard priority issue

Comments

@michaeljohnalbers

michaeljohnalbers commented Feb 7, 2023

Describe the bug

I'm uploading a GZIP-encoded JSON file to S3 using the S3 CRT client and the TransferManager. When I upload the file, the Content-Encoding metadata is not set on the object in S3. When the file is downloaded in a browser through a pre-signed URL, it is not decompressed, because the HTTP response from the pre-signed URL has an empty Content-Encoding header.

Here's the code to upload the file:

        PutObjectRequest putObjectRequest = PutObjectRequest.builder()
                .bucket(cacheS3Bucket)
                .key(objectKey)
                .contentType(contentType)
                .contentEncoding(CONTENT_ENCODING)
                .serverSideEncryption(ServerSideEncryption.AES256)
                .build();
        UploadFileRequest uploadRequest = UploadFileRequest.builder()
                .putObjectRequest(putObjectRequest)
                .source(tmpFile)
                .build();
        FileUpload upload = s3TransferManager.uploadFile(uploadRequest);

Here, CONTENT_ENCODING is a final String with the value "gzip", contentType is "application/json", and the source is a temporary file that contains the GZIP-encoded data.

Building the client is pretty vanilla:
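For anyone trying to reproduce this, here is a minimal sketch of how a GZIP-encoded temporary file like the one described above might be produced and sanity-checked with the JDK alone. The class and method names (`GzipTempFile`, `writeGzippedJson`, `readGzippedText`, `looksGzipped`) are hypothetical and are not part of the original report; the actual file in the report may have been created differently.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not taken from the original report.
class GzipTempFile {

    /** Writes the given JSON text, GZIP-compressed, to a new temporary file. */
    static Path writeGzippedJson(String json) {
        try {
            Path tmpFile = Files.createTempFile("results", ".dat");
            try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(tmpFile))) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            return tmpFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decompresses a GZIP file back into text (requires Java 9+ for readAllBytes). */
    static String readGzippedText(Path file) {
        try (GZIPInputStream in = new GZIPInputStream(Files.newInputStream(file))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the file starts with the two-byte GZIP magic number 0x1f 0x8b. */
    static boolean looksGzipped(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return bytes.length >= 2
                    && (bytes[0] & 0xff) == 0x1f
                    && (bytes[1] & 0xff) == 0x8b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A file produced this way is only decompressed transparently by a browser if the response carries `Content-Encoding: gzip`, which is why the missing metadata matters here.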

        S3CrtAsyncClientBuilder builder = S3AsyncClient.crtBuilder()
                .credentialsProvider(getCredentialsProvider())
                .region(region);

        if (s3Endpoint.isPresent()) {
            try {
                builder.endpointOverride(new URI(s3Endpoint.get()));
            } catch (URISyntaxException e) {
                throw new IllegalStateException(e);
            }
        }

        return builder.build();

This had been working correctly until we switched from the non-CRT client to the CRT client just yesterday.

Expected Behavior

The content encoding is set correctly based on the value provided via the API.

Current Behavior

Content encoding is set to an empty value.
[Screenshot]

Reproduction Steps

Upload a file with the code provided above.

Possible Solution

No response

Additional Information/Context

CRT version: 0.21.5

AWS Java SDK version used

2.19.31

JDK version used

11

Operating System and version

Ubuntu 22.10 (Docker image)

@michaeljohnalbers michaeljohnalbers added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Feb 7, 2023
@debora-ito
Member

Hi @michaeljohnalbers,

This was previously reported in #3569, but we already released a fix in 2.19.25, which means this should be working in 2.19.31.

Can you double-check that you are really using 2.19.31?

@debora-ito debora-ito added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. and removed needs-triage This issue or PR still needs to be triaged. labels Feb 8, 2023
@debora-ito debora-ito self-assigned this Feb 8, 2023
@michaeljohnalbers
Author

@debora-ito I just verified this. Our latest build is using 2.19.31. I reproduced the problem just now.

@debora-ito
Member

I can't reproduce this using your sample code. I see the content-encoding after the upload:

[Screenshot: Screen Shot 2023-02-07 at 5 58 55 PM]

Can you enable the CRT debug logs?
Instructions can be found here: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/logging-slf4j.html

I'm just looking for the canonical request, to see if it includes the headers.
As an example, here's the canonical request of my local tests:

POST
/file.zip
uploads=
amz-sdk-invocation-id:b1995abc-257e-59b4-eed2-8b7baf7eb79a
amz-sdk-request:attempt=1; max=1
content-encoding:gzip
content-type:application/json
host:<my-bucket>.s3.ap-south-1.amazonaws.com
x-amz-checksum-algorithm:CRC32
x-amz-content-sha256:UNSIGNED-PAYLOAD
x-amz-date:20230208T015744Z
x-amz-security-token:XXX
x-amz-server-side-encryption:AES256

amz-sdk-invocation-id;amz-sdk-request;content-encoding;content-type;host;x-amz-checksum-algorithm;x-amz-content-sha256;x-amz-date;x-amz-security-token;x-amz-server-side-encryption
UNSIGNED-PAYLOAD

@debora-ito debora-ito added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. labels Feb 8, 2023
@michaeljohnalbers
Author

michaeljohnalbers commented Feb 8, 2023

@debora-ito I attached a TransferListener to the upload request. Here's its output, just to verify that the content encoding is present there:

UploadFileRequest(putObjectRequest=PutObjectRequest(Bucket=<bucket>, ContentEncoding=gzip, ContentType=application/json, ChecksumAlgorithm=SHA256, Key=<object-key-path>/result.json.gz, ServerSideEncryption=AES256), source=/app/tmp/results10886660184527967118.dat, configuration=[com.modeanalytics.flamingo.planexecution.ResultsCache$1@1fc1bc21])

And here's the request from the CRT trace data:

PUT
<object-key-path>/result.json.gz
 
amz-sdk-invocation-id:d174f99d-20ff-4d17-8a39-cce3ec72db32
amz-sdk-request:attempt=1; max=1
content-encoding:aws-chunked
content-length:2565989
content-type:application/json
host:<bucket>.s3.us-west-2.amazonaws.com
x-amz-content-sha256:STREAMING-UNSIGNED-PAYLOAD-TRAILER
x-amz-date:20230208T023047Z
x-amz-decoded-content-length:2565906
x-amz-sdk-checksum-algorithm:SHA256
x-amz-security-token:XXX
x-amz-server-side-encryption:AES256
x-amz-trailer:x-amz-checksum-sha256
 
amz-sdk-invocation-id;amz-sdk-request;content-encoding;content-length;content-type;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-sdk-checksum-algorithm;x-amz-security-token;x-amz-server-side-encryption;x-amz-trailer
STREAMING-UNSIGNED-PAYLOAD-TRAILER

Small note: I added code to set the checksum just in case that might cause the correct encoding to get set, based on the issue you linked to above. That's why you see the SHA-256 checksum header.
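The difference between the two traces in this thread is the `content-encoding` line: `gzip` in the working request, `aws-chunked` in the failing one. As a rough sketch, a check like the following could pull that header out of a captured canonical request and test whether a given coding is listed. The class and method names are hypothetical, and the comma-separated handling assumes that `aws-chunked` may be combined with other codings in a single header value (for example `aws-chunked,gzip`), as I understand S3's streaming-upload documentation to describe.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for inspecting a SigV4 canonical request captured from logs.
class CanonicalRequestHeaders {

    /** Returns the value of the named header in the canonical request, if present. */
    static Optional<String> headerValue(String canonicalRequest, String name) {
        // Canonical header lines have the form "lowercase-name:value".
        String prefix = name.toLowerCase(Locale.ROOT) + ":";
        return canonicalRequest.lines()
                .map(String::trim)
                .filter(line -> line.startsWith(prefix))
                .map(line -> line.substring(prefix.length()).trim())
                .findFirst();
    }

    /** Returns true if the comma-separated Content-Encoding value lists the given coding. */
    static boolean hasCoding(String contentEncoding, String coding) {
        for (String part : contentEncoding.split(",")) {
            if (part.trim().equalsIgnoreCase(coding)) {
                return true;
            }
        }
        return false;
    }
}
```

Applied to the failing trace above, `headerValue(trace, "Content-Encoding")` would yield `aws-chunked`, and `hasCoding(..., "gzip")` would be false, which matches the reported symptom.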

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Feb 8, 2023
@michaeljohnalbers
Author

Swapping back to the non-CRT S3 client is a workaround to this problem.

@debora-ito
Member

Thank you for the logs. I can repro now. The issue occurs when the upload is a single-part upload rather than a multipart upload.
We'll work on a fix.
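The two canonical requests earlier in the thread show this distinction: the working one starts with `POST` and the query string `uploads=` (the start of a multipart upload), while the failing one is a plain `PUT` with an empty query string. A minimal sketch of telling the two apart from a captured canonical request, assuming the standard SigV4 layout (method on line 1, path on line 2, query string on line 3); the class and enum names are hypothetical:

```java
// Hypothetical classifier for canonical requests captured from CRT trace logs.
class UploadPathClassifier {

    enum UploadPath { MULTIPART_INITIATION, SINGLE_PART_PUT, OTHER }

    /** Classifies a canonical request by its method (line 1) and query string (line 3). */
    static UploadPath classify(String canonicalRequest) {
        // Split on any line break; -1 keeps empty lines so indexes stay stable.
        String[] lines = canonicalRequest.split("\\R", -1);
        String method = lines.length > 0 ? lines[0].trim() : "";
        String query = lines.length > 2 ? lines[2].trim() : "";

        if (method.equals("POST") && (query.equals("uploads=") || query.equals("uploads"))) {
            return UploadPath.MULTIPART_INITIATION;
        }
        if (method.equals("PUT") && query.isEmpty()) {
            return UploadPath.SINGLE_PART_PUT;
        }
        // For example, a part upload: PUT with partNumber and uploadId in the query.
        return UploadPath.OTHER;
    }
}
```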

@debora-ito debora-ito added the p2 This is a standard priority issue label Mar 7, 2023
@RobertoUa

Any news? It is causing us issues in production.

@michaeljohnalbers
Author

@debora-ito Is there any news on fixing this?

@debora-ito
Member

debora-ito commented May 17, 2023

@michaeljohnalbers @RobertoUa

A fix was released in aws-crt version 0.21.16. I've created a PR to upgrade the version in the Java SDK, but in the meantime you can override the version in your own project if necessary. Thank you for your patience.

@debora-ito
Member

debora-ito commented May 19, 2023

Fix released in Java SDK version 2.20.68. Please try it out and let us know of any issues.
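To check whether a given build already includes the fix, the versions mentioned in this thread can be compared numerically. This is a minimal sketch that assumes plain dotted numeric versions (no `-SNAPSHOT` or other qualifiers); the class and method names are hypothetical. Note that a plain string comparison would get `0.21.5` versus `0.21.16` wrong, which is why each segment is parsed as an integer.

```java
// Hypothetical helper; assumes versions consist only of dot-separated integers.
class SdkVersionCheck {

    /** Compares two dotted numeric versions; missing segments count as 0. */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** True if the Java SDK version is at or above 2.20.68, the release named in this thread. */
    static boolean sdkIncludesFix(String sdkVersion) {
        return compareVersions(sdkVersion, "2.20.68") >= 0;
    }

    /** True if the aws-crt version is at or above 0.21.16, the release named in this thread. */
    static boolean crtIncludesFix(String crtVersion) {
        return compareVersions(crtVersion, "0.21.16") >= 0;
    }
}
```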

aws-sdk-java-automation added a commit that referenced this issue Mar 18, 2025
…b96fd2a34

Pull request: release <- staging/9bf4b3c8-4200-45cf-9f0a-4c4b96fd2a34