Skip to content

S3 getObject combined with AsyncResponseTransformer.toBytes() copies too much data #3193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
chibenwa opened this issue May 16, 2022 · 3 comments
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue

Comments

@chibenwa
Copy link

chibenwa commented May 16, 2022

Describe the bug

Given the following piece of code:

client.getObject(
                    builder -> builder.bucket(bucket),
                    AsyncResponseTransformer.toBytes())

Several data copies do take place:

  • Netty ByteBuf get's copied into a heap ByteBuffer. I understand this copy is done to present a common abstraction level for all transports. Yet maybe ByteBuf::nioBuffers could be relied on to minimize copies. I did not expore this option myself.
  • AsyncResponseTransformer.toBytes() relies internally on an unsized ByteArrayOutputStream (that thus will expend many times) and the result will be copied when calling toByteArray() and another defensive copy is carried other when transforming the future.

Expected Behavior

Minimize memory copies in such a scenario to minimize heap pressure.

Current Behavior

Screenshot from 2022-05-16 10-20-07

Reproduction Steps

client.getObject(
                    builder -> builder.bucket(bucket),
                    AsyncResponseTransformer.toBytes())

I used https://github.com/jvm-profiling-tools/async-profiler for the flame graphs.

Possible Solution

  • Remove needless defensive copies inside ByteArrayAsyncResponseTransformer. The byte array is passed to the caller, who then becomes responsible of it, and nobody else references the old byte array once the publisher completes. This can be an instant win coming at a very low price.
  • Rely on GetResponse::contentLength to size a byte array and copy incoming buffers to it in place. This requires knowledge about response type... Thus this might be hardly doable in a generic fashion.

I tested this successfully:

package org.apache.james.blob.objectstorage.aws;

import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;

import org.reactivestreams.Subscriber;
import org.reactivestreams.Subscription;

import software.amazon.awssdk.core.ResponseBytes;
import software.amazon.awssdk.core.async.AsyncResponseTransformer;
import software.amazon.awssdk.core.async.SdkPublisher;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

/**
* Class copied for {@link software.amazon.awssdk.core.internal.async.ByteArrayAsyncResponseTransformer}
*
* Modified to take advantage of the content length of the get response in order to use a sized array
* upon content copy. This avoids the usage of a ByteArrayOutputStream that yields additional copies
* (resizing upon copy, copy of the resulting byte array).
*
* A defensive copy upon returning the result is also removed (responsibility transfered to the caller, no other usages)
*/
public class MinimalCopyBytesResponseTransformer implements AsyncResponseTransformer<GetObjectResponse, ResponseBytes<GetObjectResponse>> {
   private volatile CompletableFuture<byte[]> cf;
   private volatile GetObjectResponse response;

   public MinimalCopyBytesResponseTransformer() {

   }

   public CompletableFuture<ResponseBytes<GetObjectResponse>> prepare() {
       this.cf = new CompletableFuture();
       // Modifcation: Remove a defensive copy of the buffer upon completion: the caller is now the sole user of the array
       return this.cf.thenApply(arr -> ResponseBytes.fromByteArrayUnsafe(response, arr));
   }

   public void onResponse(GetObjectResponse response) {
       this.response = response;
   }

   public void onStream(SdkPublisher<ByteBuffer> publisher) {
       publisher.subscribe(new BaosSubscriber(this.cf, response.contentLength().intValue()));
   }

   public void exceptionOccurred(Throwable throwable) {
       this.cf.completeExceptionally(throwable);
   }

   static class BaosSubscriber implements Subscriber<ByteBuffer> {
       private final CompletableFuture<byte[]> resultFuture;
       // Modification: use a byte array instead of the ByteArrayInputStream and track position
       private final byte[] buffer;
       private int pos = 0;
       private Subscription subscription;

       BaosSubscriber(CompletableFuture<byte[]> resultFuture, int size) {
           this.resultFuture = resultFuture;
           this.buffer = new byte[size];
       }

       public void onSubscribe(Subscription s) {
           if (this.subscription != null) {
               s.cancel();
           } else {
               this.subscription = s;
               this.subscription.request(9223372036854775807L);
           }
       }

       public void onNext(ByteBuffer byteBuffer) {
           // Modification: copy the response part in place into the result buffer and track position
           int written = byteBuffer.remaining();
           byteBuffer.get(buffer, pos, written);
           pos += written;
           this.subscription.request(1L);
       }

       public void onError(Throwable throwable) {
           this.resultFuture.completeExceptionally(throwable);
       }

       public void onComplete() {
           this.resultFuture.complete(this.buffer);
       }
   }
}

Additional Information/Context

Apache James relies on the S3 driver for email content, thus speed and heap impact is important for us as S3 getObject will be called on each email content read.

AWS Java SDK version used

2.17.189

JDK version used

openjdk version "11.0.15" 2022-04-19

Operating System and version

Ubuntu 20.04.4 LTS

@chibenwa chibenwa added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 16, 2022
chibenwa added a commit to chibenwa/james-project that referenced this issue May 16, 2022
CF aws/aws-sdk-java-v2#3193

`AsyncResponseTransformer.toBytes()` relies internally on
an unsized ByteArrayOutputStream (that thus will expend
many times) and the result will be copied when calling
`toByteArray()` and another defensive copy is carried
other when transforming the future.

We thus implement a response transformer variation:

 - Remove needless defensive copies inside
`ByteArrayAsyncResponseTransformer`. The byte array is
passed to the caller, who then becomes responsible of it,
and nobody else references the old byte array once the
publisher completes. This can be an instant win coming at
a very low price.
 - Rely on `GetResponse::contentLength` to size a byte
array and copy incoming buffers to it in place. This
requires knowledge about response type... Thus this might
be hardly doable in a generic fashion.

On a typical IMAP benchmark S3 getObject toBytes
transformation takes 1.66% of overall memory allocation.
chibenwa added a commit to chibenwa/james-project that referenced this issue May 16, 2022
CF aws/aws-sdk-java-v2#3193

`AsyncResponseTransformer.toBytes()` relies internally on
an unsized ByteArrayOutputStream (that thus will expend
many times) and the result will be copied when calling
`toByteArray()` and another defensive copy is carried
other when transforming the future.

We thus implement a response transformer variation:

 - Remove needless defensive copies inside
`ByteArrayAsyncResponseTransformer`. The byte array is
passed to the caller, who then becomes responsible of it,
and nobody else references the old byte array once the
publisher completes. This can be an instant win coming at
a very low price.
 - Rely on `GetResponse::contentLength` to size a byte
array and copy incoming buffers to it in place. This
requires knowledge about response type... Thus this might
be hardly doable in a generic fashion.

On a typical IMAP benchmark S3 getObject toBytes
transformation takes 1.66% of overall memory allocation.
@debora-ito
Copy link
Member

@chibenwa thank you for reaching out, added to our backlog. Since it's an optimization I'm changing to feature request.

@debora-ito debora-ito added feature-request A feature should be added or improved. and removed bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 16, 2022
chibenwa added a commit to chibenwa/james-project that referenced this issue May 18, 2022
CF aws/aws-sdk-java-v2#3193

`AsyncResponseTransformer.toBytes()` relies internally on
an unsized ByteArrayOutputStream (that thus will expend
many times) and the result will be copied when calling
`toByteArray()` and another defensive copy is carried
other when transforming the future.

We thus implement a response transformer variation:

 - Remove needless defensive copies inside
`ByteArrayAsyncResponseTransformer`. The byte array is
passed to the caller, who then becomes responsible of it,
and nobody else references the old byte array once the
publisher completes. This can be an instant win coming at
a very low price.
 - Rely on `GetResponse::contentLength` to size a byte
array and copy incoming buffers to it in place. This
requires knowledge about response type... Thus this might
be hardly doable in a generic fashion.

On a typical IMAP benchmark S3 getObject toBytes
transformation takes 1.66% of overall memory allocation.
chibenwa added a commit to chibenwa/james-project that referenced this issue May 18, 2022
CF aws/aws-sdk-java-v2#3193

`AsyncResponseTransformer.toBytes()` relies internally on
an unsized ByteArrayOutputStream (that thus will expend
many times) and the result will be copied when calling
`toByteArray()` and another defensive copy is carried
other when transforming the future.

We thus implement a response transformer variation:

 - Remove needless defensive copies inside
`ByteArrayAsyncResponseTransformer`. The byte array is
passed to the caller, who then becomes responsible of it,
and nobody else references the old byte array once the
publisher completes. This can be an instant win coming at
a very low price.
 - Rely on `GetResponse::contentLength` to size a byte
array and copy incoming buffers to it in place. This
requires knowledge about response type... Thus this might
be hardly doable in a generic fashion.

On a typical IMAP benchmark S3 getObject toBytes
transformation takes 1.66% of overall memory allocation.
chibenwa added a commit to apache/james-project that referenced this issue May 19, 2022
CF aws/aws-sdk-java-v2#3193

`AsyncResponseTransformer.toBytes()` relies internally on
an unsized ByteArrayOutputStream (that thus will expend
many times) and the result will be copied when calling
`toByteArray()` and another defensive copy is carried
other when transforming the future.

We thus implement a response transformer variation:

 - Remove needless defensive copies inside
`ByteArrayAsyncResponseTransformer`. The byte array is
passed to the caller, who then becomes responsible of it,
and nobody else references the old byte array once the
publisher completes. This can be an instant win coming at
a very low price.
 - Rely on `GetResponse::contentLength` to size a byte
array and copy incoming buffers to it in place. This
requires knowledge about response type... Thus this might
be hardly doable in a generic fashion.

On a typical IMAP benchmark S3 getObject toBytes
transformation takes 1.66% of overall memory allocation.
@yasminetalby yasminetalby added the p2 This is a standard priority issue label Nov 14, 2022
@Dunemaster
Copy link

The same problem is present for putObject

@rtyley
Copy link
Contributor

rtyley commented Sep 5, 2023

Just spotted this issue- I raised a near duplicate as #4392 (apologies for not searching existing issues, doh), and a PR to fix it in #4355.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue
Projects
None yet
Development

No branches or pull requests

5 participants