Allow to get array from BytesWrapper without copying #1959
Labels
feature-request
A feature should be added or improved.
Comments
This is a reasonable feature request @mar-kolya, thank you for reporting it. If you submit a PR we will take a look.
aws-sdk-java-automation
added a commit
that referenced
this issue
Mar 11, 2022
…688b76afb Pull request: release <- staging/c5a9416c-9296-4379-ae26-248688b76afb
rtyley
added a commit
to rtyley/aws-sdk-async-response-bytes
that referenced
this issue
Aug 26, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response bytes, we still need to wrap those bytes and the response object in a `ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays - roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()` is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020 with aws/aws-sdk-java-v2#1977 in response to aws/aws-sdk-java-v2#1959.

The 'Unsafe' in the name is a warning to users of this method that the underlying byte array is _not_ copied, and so could be susceptible to badly-behaving code manipulating the contents of the byte array after the `ResponseBytes` is handed to the calling code.

If only the trusted SDK code has access to the original byte array before `ResponseBytes` is handed over to the caller, and once the `ResponseBytes` instance is handed over to the caller, the SDK code has no further use for the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()` in the trusted SDK code, and return the resulting `ResponseBytes` to the user, saving a double-allocation of memory, and the CPU time for the copying of bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
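The aliasing risk behind the 'Unsafe' name can be sketched without the SDK on the classpath. The `Wrapped` class below is a hypothetical stand-in for `ResponseBytes`, not the SDK's implementation; its two factory methods only mimic the copying and non-copying behaviour the commit message describes.

```java
import java.util.Arrays;

// Hypothetical stand-in for ResponseBytes; NOT the SDK's actual class.
final class Wrapped {
    private final byte[] bytes;

    private Wrapped(byte[] bytes) {
        this.bytes = bytes;
    }

    // Defensive copy: a second array of the same size is allocated.
    static Wrapped fromByteArray(byte[] source) {
        return new Wrapped(Arrays.copyOf(source, source.length));
    }

    // No copy: the wrapper shares the caller's array.
    static Wrapped fromByteArrayUnsafe(byte[] source) {
        return new Wrapped(source);
    }

    byte byteAt(int index) {
        return bytes[index];
    }
}

class CopyVersusWrap {
    public static void main(String[] args) {
        byte[] gathered = {1, 2, 3};
        Wrapped copied = Wrapped.fromByteArray(gathered);
        Wrapped shared = Wrapped.fromByteArrayUnsafe(gathered);

        // Mutating the original array after wrapping is only visible
        // through the wrapper that shares it.
        gathered[0] = 42;
        System.out.println(copied.byteAt(0)); // 1
        System.out.println(shared.byteAt(0)); // 42
    }
}
```

The non-copying variant is only sound when nothing else keeps a reference to `gathered` after the wrapper is handed out, which is the condition the commit message spells out for the trusted SDK code.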
This was referenced Sep 2, 2023
In performance-sensitive applications it is important to reduce the number of data copies that happen during processing.
Unfortunately, when we currently get data from S3 and then try to use the response byte array, we have to:
Describe the Feature
Provide a method in `BytesWrapper` that allows users to get either the underlying array directly or a non-read-only `ByteBuffer`.
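Why a read-only `ByteBuffer` forces a copy can be shown with the JDK alone. This is a minimal sketch, assuming the buffer the caller receives is read-only, as the request implies; `toArray` is a hypothetical helper, not an SDK method.

```java
import java.nio.ByteBuffer;

class ReadOnlyBufferDemo {
    // Returns the backing array when the buffer exposes it and covers it
    // exactly; otherwise falls back to copying the remaining bytes.
    static byte[] toArray(ByteBuffer buffer) {
        if (buffer.hasArray()
                && buffer.arrayOffset() == 0
                && buffer.position() == 0
                && buffer.remaining() == buffer.array().length) {
            return buffer.array(); // zero-copy path
        }
        byte[] copy = new byte[buffer.remaining()];
        buffer.duplicate().get(copy); // duplicate() leaves the caller's position untouched
        return copy;
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30};
        ByteBuffer writable = ByteBuffer.wrap(data);
        ByteBuffer readOnly = writable.asReadOnlyBuffer();

        System.out.println(toArray(writable) == data); // true: same array, no copy
        System.out.println(toArray(readOnly) == data); // false: a copy was made
    }
}
```

A read-only heap buffer reports `hasArray() == false`, so code that needs a `byte[]` has no option but the copying branch.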
Is your Feature Request related to a problem?
The current S3 client implementation does multiple data copies - this is just one more of them, and it reduces performance.
Proposed Solution
Describe alternatives you've considered
Unfortunately there are not many alternatives: in some cases one needs access to arrays, for example if one tries to use JNI compression implementations.
Additional Context
Your Environment