-
Notifications
You must be signed in to change notification settings - Fork 56
Introduction of core functionality and tests #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High level review of API structure around the JCE emulation only.
|
||
@attr.s(hash=False) | ||
class JceNameLocalDelegatedKey(DelegatedKey): | ||
"""Delegated key that JCE StandardName algorithm values to determine behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't parse this sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be "Delegated key that uses JCE StandardName algorithm values to determine behavior."
fixed
:returns: Encrypted ciphertext | ||
:rtype: bytes | ||
""" | ||
encryptor = encryption.JavaCipher.from_transformation(algorithm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be cached?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transformation won't necessarily be consistent for all calls to this method across the life of the delegated key instance, so it should not be retained permanently. As for caching JavaCipher
instances based on the transformation, I haven't run performance tests, but from past experience I don't think that the JavaCipher
class constructor will affect performance enough to justify the additional complexity that caching would add.
|
||
@property | ||
def allowed_for_raw_materials(self): | ||
"""Only ``JceNameLocalDelegatedKey`` backed by ``JavaSymmetricEncryptionAlgorithm`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the first class you're looking at, this is a lot of gibberish. Why not just say "Returns true if this key can be used with raw cryptographic materials (i.e. random bytes), rather than a structured key (e.g. an EC or RSA key)." or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add reference links to the mentioned classes, but it's more complex than that . RawCryptographicMaterials
is a specific class in this client[1] that encrypts attributes directly with a delegated key. This is a pattern that we want to heavily discourage, but need to support for legacy compatibility, so the default behavior of all delegated keys is to not allow this.
# TODO: should we support these? | ||
# HmacMD5 | ||
# HmacSHA1 | ||
HMAC_SHA224 = ('HmacSHA224', hmac.HMAC, hashes.SHA224) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... does this somehow instantiate the containing class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline: https://docs.python.org/3/library/enum.html
self.java_name = algorithm_name | ||
self.algorithm_type = algorithm_type | ||
self.hash_type = hash_type | ||
self.register() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having globally visible side-effects in a constructor seems a bit surprising and dangerous.
Why not just have a list or dict of supported algorithms in jce_bridge/__init__.py
? This is analogous to how JCE does algorithm registration, and means all supported algorithms are in one place for easy inspection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enum constructors are run on first module import, so this will only run once, and will run before any of these object can be accessed. The registration adds them to the lookup table of the first class up in the class hierarchy that defines the class attribute __rlookup__
. In this case (JavaMac
), this is JavaAuthenticator
.
This is the same as saying self.__rlookup__[self.java_name] = self
, just less to screw up each time it is written and clearer about what it is doing. Having a static list of values would (IMO) just introduce another set of values to maintain and another opportunity for drift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more concerned about the possibility that something, somewhere else will construct an instance of e.g. JavaMac and have it silently be registered systemwide. Why can't we just put it all in one place as:
macAlgos = {
'HmacSHA256': JavaMac(hmac.HMAC, hashes.SHA256),
# ...
}
) | ||
|
||
|
||
class JavaMode(JavaBridge): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So all of these components seem to be registered in the same namespace - what happens if I start asking for strange things like "AES/PKCS5Padding/GCM" or "AES/NoPadding/PKCS5Padding"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No; they are registered to the lowest class in the hierarchy that defines __rlookup__
. So in the case of JavaMode
, all of its members will register with JavaMode
and not affect any other JavaBridge
classes.
JavaMode
only knows about JCE Cipher Algorithm Mode namesJavaPadding
only knows about JCE Cipher Algorithm Padding namesJavaEncryptionAlgorithm
only knows about JCE Cipher Algorithm names
If you want to load an encryption transformation, you want JavaCipher
.
# First try for encryption ciphers | ||
# https://docs.oracle.com/javase/8/docs/api/javax/crypto/Cipher.html | ||
try: | ||
key_transformer = primitives.JavaEncryptionAlgorithm.from_name(self.algorithm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this handle compound algorithms (e.g. "AES/GCM/NoPadding") or just raw algorithms ("AES")?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JceNameLocalDelegatedKey
does not know anything about mode or padding, only algorithm. It only accepts algorithm names. This is intentional, because the mode and padding can potentially change for operations over the life of a delegated key instance.
""" | ||
return self.__key_handler_class is primitives.JavaSymmetricEncryptionAlgorithm | ||
|
||
def _encrypt(self, algorithm, plaintext, additional_associated_data=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of supporting all possible operations on the same class makes me a bit nervous about attacks like this: https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-libraries/
Where is the algorithm string and key coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree, and this is one of the things I would like to change with v2.0 of the spec. FWIW, this specific delegated key class explicitly only enables those APIs that the loaded key is likely to support[1][2] (encrypt/decrypt/wrap/unwrap for encryption keys, sign/verify for mac/signature keys).
The algorithm identifier comes from the material description[3].
The key could be anything or come from anywhere: only the cryptographic materials provider knows.
[1] https://github.com/awslabs/aws-dynamodb-encryption-python/pull/1/files#diff-b899d421fd3daa12d0fb4551745d310dR117
[2] https://github.com/awslabs/aws-dynamodb-encryption-python/pull/1/files#diff-b899d421fd3daa12d0fb4551745d310dR130
[3] https://github.com/awslabs/aws-dynamodb-encryption-python/pull/1/files#diff-04101d472c1b1ffa6f7f73dbb6e16bb7R53
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed authentication code and serialization so far.
data='TABLE>{}<TABLE'.format(table_name).encode('utf-8') | ||
)) | ||
for key in sorted(item.keys()): | ||
action = attribute_actions.action(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know that attribute_actions.action will always have the same full list of keys as item, or do we have to handle the case of a KeyError here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is handled by the AttributeActions
class[1]. AttributeActions
know two things: the default action and any override actions that have been requested for a specific attribute name. The AttributeActions().action(name)
method returns the override action if defined for the requested attribute and otherwise returns the default action.
) | ||
|
||
|
||
def _string_to_sign(item, table_name, attribute_actions): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears to me that the string to sign is in accordance with AwsCryptoToolsSpecs section on item signing
@@ -0,0 +1,251 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable.
@@ -0,0 +1,220 @@ | |||
[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, recommend using exact same full set of test vectors for both Java and Python implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments on material_providers/aws_kms.py only
.. note:: | ||
|
||
This cryptographic materials provider makes one AWS KMS API call each time encryption | ||
or decryption materials are requested. This means that one request will be made for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: remove "for"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way that you fixed this is fine too.
validator=attr.validators.instance_of(botocore.session.Session), | ||
default=attr.Factory(botocore.session.Session) | ||
) | ||
_grant_tokens = attr.ib( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For attributes that are tuples or dicts, does it make sense to do a deeper validation that the elements of the tuple or dict have the right format? It looks like from the attrs documentation it is easy enough to write your own validator functions. ( https://attrs.readthedocs.io/en/stable/examples.html#decorator )
Not blocking on this comment. I consider it going above and beyond. But the more we can do to help users catch errors at class creation time, the more user-friendly we make the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. I forget when they added that functionality, but we're already requiring 17.4.0 (...which, checking, it looks like I forgot to update the requirements.txt, so I'll do that), which I'm fairly confident is newer than the decorators.
) | ||
return hkdf.derive(initial_material) | ||
|
||
def _encryption_key(self, initial_material, key_info): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am reading correctly, _encryption_key and _mac_key methods of this class are identical except for the final argument given to the self._hkdf() method call. I would consider making a single underlying method to avoid the repetition in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. Consolidating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up review on material_providers/aws_kms.py
This file passes my review now. Optional extra changes are mentioned in the comments.
.. note:: | ||
|
||
This cryptographic materials provider makes one AWS KMS API call each time encryption | ||
or decryption materials are requested. This means that one request will be made for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way that you fixed this is fine too.
) | ||
_grant_tokens = attr.ib(default=attr.Factory(tuple)) | ||
|
||
@_grant_tokens.validator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the only place we need to do such validations, this is fine, but it strikes me that checking whether something is a tuple of strings or a string->string dictionary is a common enough kind of validation that I might consider moving the validator functions that do that to their own module. Then you can simply list them as callables with the validator keyword on the attr.ib command, per https://attrs.readthedocs.io/en/stable/examples.html#callables . (This is closer to how the code was written before, but you don't need to use the attr.validators pre-made callables; you can just use your own.) The regional clients validator is very specific (a string->botocore client dict) so I would probably leave that inline with the decorator as you've done so here.
Again, not blocking on this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See, this is why it's good to get new eyes on things. That has significantly changed since the last time I looked at it, and is much simpler to use now. I had actually just finished chopping this logic off into a stand-alone module (in a commit currently stacked on top of the high-level client work I've been doing), but tweaking it to work as a straightforward callable would definitely be worthwhile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked at the materials, material providers, and delegated keys sections of the code. Code is very good and I have no suggestions for improvement but I tried to put comments at the top of each file to indicate which ones I read. My only questions are for my own understanding only, not for any needed code changes.
@@ -0,0 +1,114 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,150 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,202 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,147 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,281 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,147 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,281 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,114 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,150 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
@@ -0,0 +1,202 @@ | |||
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
""" | ||
self._raise_not_implemented('encrypt') | ||
|
||
def decrypt(self, algorithm, ciphertext, additional_associated_data=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both decrypt and encrypt, please also take in the fieldName as a parameter. (It can be ignored by existing implementations.) This is something I deeply regret not having in Java and will need to add to support some current feature requests.
""" | ||
try: | ||
signature_attribute = item.pop(ReservedAttributes.SIGNATURE.value) | ||
except KeyError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no signature is found, then there is no valid signature and it must be rejected just as if a signature were present but incorrect. Otherwise, someone can silently inject plaintext data (without a signature) into a datastore and the application will never notice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention here was to support partially migrated data in a table and/or use with an unencrypted/unsigned table. I thought I remembered seeing the Java client have similar behavior somewhere. Not providing some way to support this behavior seems to me like an extremely sharp edge...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior's existence is also extremely sharp. It is purposefully not exposed in Java due to the risk it poses to users. I think that a migration story may need to be in a different (higher-level) object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah; I see what I was remembering. The Java client skips the item if no attribute flags are set[1]. I must have crossed wires at some point and read that as "if the item wasn't encrypted/signed" rather than "if we are configured to ignore this item".
:rtype: dict | ||
""" | ||
serialized_attribute = serialize_attribute(attribute) | ||
encrypted_attribute = encryption_key.encrypt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember that here we need to also pass in the fieldName.
:rtype: dict | ||
""" | ||
encrypted_attribute = attribute[Tag.BINARY.dynamodb_tag] | ||
decrypted_attribute = decryption_key.decrypt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember that here we need to also pass in the fieldName.
""" | ||
# TODO: I'm pretty sure these are sane defaults, but verify with someone more familiar with JCE. | ||
if cipher_transformation == 'AESWrap': | ||
return cls.from_transformation('AES/GCM/NoPadding') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AESWrap is the actual name and it doesn't correspond with GCM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a weird sharp edge that I'm not really comfortable with. Java defines AESWrap
as a Cipher
transformation, but it doesn't actually make a Cipher
, and I was trying to map AESWrap
to something that would provide similar characteristics without having to define a custom mode that doesn't actually exist in JCE names to try and emulate the behavior of AESWrap
. Maybe AES/ECB/NoPadding
would be a better mapping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's talk in person to straighten this out. I don't understand the issue here.
JAVA_PADDING = { | ||
'NoPadding': SimplePadding('NoPadding', _NoPadding), | ||
'PKCS1Padding': SimplePadding('PKCS1Padding', asymmetric_padding.PKCS1v15), | ||
'PKCS5Padding': BlockSizePadding('PKCS5Padding', symmetric_padding.PKCS7), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment explaining why "PKCS5" when it really is PKCS7
'NoPadding': SimplePadding('NoPadding', _NoPadding), | ||
'PKCS1Padding': SimplePadding('PKCS1Padding', asymmetric_padding.PKCS1v15), | ||
'PKCS5Padding': BlockSizePadding('PKCS5Padding', symmetric_padding.PKCS7), | ||
'OAEPWithSHA-1AndMGF1Padding': OaepPadding( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add another comment here warning about Java's idiosyncrasy with the MGF hash function.
:returns: Decoded tag | ||
:rtype: bytes | ||
""" | ||
(_, tag) = unpack_value('>cc', stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember, we've defined the first byte to be RESERVED and not part of the tag. So decode this as two 1 byte values and ensure the first is zero.
attribute_type, attribute_value = list(attribute.items())[0] | ||
if attribute_type == 'B': | ||
return base64.b64encode(attribute_value.value) | ||
return attribute_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if attribute_type isn't 'S'? I think you can reject it.
…rimitives are requested from JavaCipher
…on/decryption path
…ibute and fail if actions should be taken on decrypt but no signature is found
This is the core functionality required for the client to work, including all necessary configuration structures, serialization/deserialization, and cryptographic components.
The highest level interface currently provided in this codebase is found in
dynamodb_encryption_sdk.encrypted.item
, which contains found functions that will encrypt DynamoDB items and Python dictionaries.All operations are performed on DynamoDB JSON items. The Python dictionary functions simply use the transformers provided in
boto3.dynamodb.types
to convert back and forth.Notes
dynamodb_encryption_sdk
, but it seemed as good a starting point as any.Places wanting specific attention
material_providers
), cryptographic materials (materials
), and delegated keys (delegated_keys
). I'm pretty sure these are aligned with the intended concepts now, but I'd like your eyes on those.internal.crypto.jce_bridge
anddelegated_keys.jce
) should provide close analogues to the JCE interfaces because we have to work with JCE StandardNames and transformations as the core primitives of configuration in the material description. Having someone who is much more familiar with the JCE structure than I would be very helpful to validate that the way the structures are built makes sense.internal.crypto
) and formatting (internal.formatting
) components.