Skip to content

Added an "unsafe" way to retrieve a byte array from SdkBytes and ResponseBytes without copying the data. #1977

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 12, 2020

Conversation

millems
Copy link
Contributor

@millems millems commented Aug 7, 2020

Reduced the number of payload data copies on some code paths.

Fixes #1959.

@codecov-commenter
Copy link

codecov-commenter commented Aug 7, 2020

Codecov Report

Merging #1977 into master will decrease coverage by 0.00%.
The diff coverage is 58.06%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1977      +/-   ##
============================================
- Coverage     76.63%   76.63%   -0.01%     
  Complexity      225      225              
============================================
  Files          1112     1112              
  Lines         33653    33660       +7     
  Branches       2621     2606      -15     
============================================
+ Hits          25791    25795       +4     
- Misses         6584     6587       +3     
  Partials       1278     1278              
Flag Coverage Δ Complexity Δ
#unittests 76.63% <58.06%> (-0.01%) 225.00 <0.00> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ Complexity Δ
...awssdk/protocols/ion/internal/SdkIonGenerator.java 58.09% <0.00%> (-2.91%) 0.00 <0.00> (ø)
...awssdk/protocols/json/StructuredJsonGenerator.java 18.18% <0.00%> (-0.87%) 0.00 <0.00> (ø)
.../internal/unmarshall/JsonProtocolUnmarshaller.java 95.65% <0.00%> (ø) 0.00 <0.00> (ø)
...rnal/converter/string/SdkBytesStringConverter.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...java/software/amazon/awssdk/core/BytesWrapper.java 78.94% <50.00%> (-1.06%) 0.00 <0.00> (ø)
...amazon/awssdk/protocols/json/SdkJsonGenerator.java 51.75% <60.00%> (+0.37%) 0.00 <0.00> (ø)
...ava/software/amazon/awssdk/core/ResponseBytes.java 31.81% <66.66%> (+6.81%) 0.00 <0.00> (ø)
...on/internal/marshall/SimpleTypeJsonMarshaller.java 89.33% <100.00%> (ø) 0.00 <0.00> (ø)
...l/unmarshall/AwsJsonProtocolErrorUnmarshaller.java 98.27% <100.00%> (ø) 0.00 <0.00> (ø)
.../awssdk/protocols/core/StringToValueConverter.java 100.00% <100.00%> (ø) 0.00 <0.00> (ø)
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3bd1819...55eff4f. Read the comment docs.

@millems millems force-pushed the millem/less-copying branch from c621f10 to bf22a05 Compare August 7, 2020 20:36
@millems
Copy link
Contributor Author

millems commented Aug 7, 2020

Integration tests have passed. Running benchmarks...

@millems
Copy link
Contributor Author

millems commented Aug 7, 2020

Unit tests have passed. Still waiting on benchmarks...

@millems
Copy link
Contributor Author

millems commented Aug 11, 2020

No appreciable difference in benchmarks.

@millems millems force-pushed the millem/less-copying branch from 55eff4f to c6bebf9 Compare August 12, 2020 17:33
@millems millems merged commit 09c7d03 into master Aug 12, 2020
@millems millems deleted the millem/less-copying branch August 12, 2020 17:50
@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities (and Security Hotspot 0 Security Hotspots to review)
Code Smell A 4 Code Smells

62.5% 62.5% Coverage
0.0% 0.0% Duplication

warning The version of Java (1.8.0_252) you have used to run this analysis is deprecated and we will stop accepting it from October 2020. Please update to at least Java 11.
Read more here

aws-sdk-java-automation added a commit that referenced this pull request Mar 31, 2022
…3a81bcb60

Pull request: release <- staging/11492aef-de88-4dde-99d1-db93a81bcb60
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Aug 26, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Aug 26, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Aug 26, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Aug 27, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Sep 2, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Sep 2, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Sep 2, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Sep 2, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Sep 2, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
rtyley added a commit to rtyley/aws-sdk-async-response-bytes that referenced this pull request Sep 2, 2023
Once the `ByteArrayAsyncResponseTransformer` has gathered all the response
bytes, we still need to wrap those bytes and the response object in a
`ResponseBytes` instance - but if we use `ResponseBytes.fromByteArray()`, a
whole new byte array will be allocated, which is bad for two reasons:

* While copying, the JVM heap must briefly hold both the old & new byte arrays -
  roughly speaking, doubling the memory requirements.
* Copying the bytes from one array to another takes a little bit of CPU time
  (obviously this varies: `System.arraycopy()` for 40MB of bytes takes ~2ms
  on my M1 machine).

A faster, more memory efficient alternative to `ResponseBytes.fromByteArray()`
is `ResponseBytes.fromByteArrayUnsafe()`, added to the AWS SDK in August 2020
with aws/aws-sdk-java-v2#1977 in response to
aws/aws-sdk-java-v2#1959. The 'Unsafe' in the name
is a warning to users of this method that the underlying byte array is _not_
copied, and so could be susceptible to badly-behaving code manipulating the
contents of the byte array after the `ResponseBytes` is handed to the calling
code.

If only the trusted SDK code has access to the original byte array before
`ResponseBytes` is handed over to the caller, and once the `ResponseBytes`
instance is handed over to the caller, the SDK code has no further use for
the original byte array, then it is safe to use `ResponseBytes.fromByteArrayUnsafe()`
in the trusted SDK code, and return the resulting `ResponseBytes` to the user,
saving a double-allocation of memory, and the CPU time for the copying of
bytes.

See also:

* aws/aws-sdk-java-v2#1959
* aws/aws-sdk-java-v2#1977
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow to get array from BytesWrapper without copying
3 participants