Simplify Serializer [En|De]code*() methods to no longer [de]serialize manually. #1816
Conversation
Rather than pick this apart in detail, I think it's possible to give high-level feedback:
I think there's too much error handling in here for not enough actual value. Yes, it's important to discard invalid inputs, but we're not talking to arbitrary backends; we're talking to a single trusted server.
In the prior version, where we were deserializing by hand, there were many more failure cases to worry about, so making all the decoded types optional made sense.
Now it seems as if, with just a few adjustments, none of these need to be optional, and that would simplify this greatly.
For example, DecodeFieldValue can really only fail itself if there's a bad null value. It seems like we could make it unconditionally return a FieldValue.
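A minimal sketch of that suggestion, using the new signature from the diff below (the body is illustrative, not the PR's actual code):

FieldValue Serializer::DecodeFieldValue(
    Reader* reader, const google_firestore_v1beta1_Value& msg) {
  switch (msg.which_value_type) {
    case google_firestore_v1beta1_Value_null_value_tag:
      if (msg.null_value != google_protobuf_NullValue_NULL_VALUE) {
        reader->Fail("Input proto bytes cannot be parsed (invalid null value)");
      }
      return FieldValue::NullValue();
    // ... other value_type tags ...
    default:
      reader->Fail("Invalid type while decoding FieldValue");
      // Placeholder; callers are expected to consult reader->status().
      return FieldValue::NullValue();
  }
}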
@@ -341,70 +275,56 @@ void Serializer::EncodeFieldValue(Writer* writer,
  }
}

absl::optional<FieldValue> Serializer::DecodeFieldValue(Reader* reader) {
absl::optional<FieldValue> Serializer::DecodeFieldValue(
    Reader* reader, const google_firestore_v1beta1_Value& msg) {
  if (!reader->status().ok()) return absl::nullopt;
Do we still need these checks?
Presumably nanopb was able to deserialize, so something reasonable must have come over. I don't expect that this is ever going to fail in practice, so I don't really care if we decode more than we need to.
switch (msg.which_value_type) {
  case google_firestore_v1beta1_Value_null_value_tag:
    if (msg.null_value != google_protobuf_NullValue_NULL_VALUE) {
      reader->Fail("Input proto bytes cannot be parsed (invalid null value)");
Does this matter? None of the other clients make this check.
(I'm not opposed to this, just wondering if it's really worth it.)
default:
  // Unspecified type.
  reader->Fail("Invalid type while decoding FieldValue");
  return absl::nullopt;
If you returned FieldValue::NullValue here the return type wouldn't need to be absl::optional.
std::string key = Serializer::DecodeString(map_value.fields[i].key);
absl::optional<FieldValue> value =
    Serializer::DecodeFieldValue(reader, map_value.fields[i].value);
if (!reader->status().ok()) return absl::nullopt;
Do we really need an early return here?
I expect actual failures to be vanishingly rare. It would be nice to eliminate this boilerplate if possible.
}

absl::optional<ObjectValue::Map> DecodeMapValue(Reader* reader) {
absl::optional<ObjectValue::Map> DecodeMapValue(
This could just return an empty map on error.
if (!reader->status().ok()) return absl::nullopt;

std::string key = Serializer::DecodeString(fields.key);
absl::optional<FieldValue> value =
If DecodeFieldValue returned a FieldValue rather than an optional this could be simpler.
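Taken together, these suggestions would make DecodeMapValue look roughly like this (a sketch; it assumes ObjectValue::Map supports std::map-style insertion, and the parameter shape is guessed from the diff):

ObjectValue::Map DecodeMapValue(
    Reader* reader, const google_firestore_v1beta1_MapValue& map_value) {
  ObjectValue::Map result;
  for (size_t i = 0; i < map_value.fields_count; i++) {
    std::string key = Serializer::DecodeString(map_value.fields[i].key);
    FieldValue value =
        Serializer::DecodeFieldValue(reader, map_value.fields[i].value);
    // No early return: on failure the reader's status is already set, and
    // the (possibly nonsensical) map is simply discarded by the caller.
    result[key] = value;
  }
  return result;
}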
@wilhuff Early feedback is always appreciated, but I think it may be a little premature here. This PR isn't ready for review yet. In particular, note that the commit calls out that error handling still needs to be reworked (amongst other things). I only created it to get Travis to build it on the various Xcode configurations. But in any case, taking a quick peek at the feedback, I think we're on the same page. In particular:
Now, we use the re-worked nanopb deserializer (that mallocs). This also had the advantage of moving the C++ serialization code closer to the other platforms. Still TODO:
- Reorder the methods within the files to resemble the other platforms.
- Rework error handling.
- Encode*() methods.
(force-pushed from d461324 to f2f4c17)
Ok, I think this is ready for review now. There are a few caveats:
I definitely recommend reviewing one commit at a time.
Generally LGTM
    google_protobuf_Timestamp_init_zero;
reader->ReadNanopbMessage(google_protobuf_Timestamp_fields, &timestamp_proto);

Timestamp Serializer::DecodeTimestamp(
Is the absl::optional to Timestamp change required for the change to use the malloc'ed nanopb version? If not, could you elaborate on the reason for the return type change?
No, it's not necessary; we could keep it as optional<>. This would additionally have the benefit of forcing the caller to check the status (or die otherwise), whereas as currently written, if the caller forgets to check the status, they'll end up using an invalid return value.
The disadvantage of keeping it as optional<> is that it introduces a lot more boilerplate code, since now all the methods have to continually check all of their values to ensure a value is present before continuing.
So why the change now? Previously, we did most of the deserialization ourselves, so we were a little more paranoid. Now, nanopb does more of it for us, so we feel a little safer.
Other alternatives that we could consider:
1. Use optional<> for the public methods, and just T for the private ones. The callers get safety, while minimizing (though not eliminating) the internal boilerplate.
2. Similar to (1), but switch to StatusOr<> rather than optional<>. We've effectively eliminated nanopb/reader and nanopb/writer anyway, so if we merge the last little bits into the serializers, then the public methods can just return StatusOr while the private methods can continue to return T.
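A rough sketch of alternative (2), with hypothetical names (the StatusOr-returning entry point and the Reader constructor are assumptions) to show the shape rather than the actual API:

util::StatusOr<FieldValue> Serializer::DecodeFieldValue(const uint8_t* bytes,
                                                        size_t length) {
  Reader reader{bytes, length};  // hypothetical constructor
  google_firestore_v1beta1_Value msg{};
  reader.ReadNanopbMessage(google_firestore_v1beta1_Value_fields, &msg);
  FieldValue result = DecodeFieldValue(&reader, msg);  // private; returns T
  if (!reader.status().ok()) return reader.status();
  return result;
}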
 * We need to figure out what, if anything, we want to do about this. For now,
 * we'll just ignore this test.
 */
#if 0
TEST_F(SerializerTest, BadBoolValue) {
I think you can just do TEST_F(SerializerTest, DISABLED_BadBoolValue) instead of the condition. But I could be wrong due to our customized setup of the C++ test in Xcode project.
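For reference, this is gtest's standard mechanism rather than anything Firestore-specific: a DISABLED_ prefix compiles the test but skips it at runtime (it can still be forced with --gtest_also_run_disabled_tests):

TEST_F(SerializerTest, DISABLED_BadBoolValue) {
  // ... body unchanged ...
}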
Hm, interesting. I'll leave it as is for now; that's guaranteed to work everywhere. (I suspect the eventual answer might be to just delete the test.)
    &timestamp_proto);
pb_bytes_array_t* Serializer::EncodeString(const std::string& str) {
  auto size = static_cast<pb_size_t>(str.size());
  auto result = reinterpret_cast<pb_bytes_array_t*>(
Nit: static_cast would probably be sufficient here (applies everywhere malloc is used).
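Concretely, since malloc returns void*, static_cast can convert it directly; for example (PB_BYTES_ARRAY_T_ALLOCSIZE is nanopb's sizing macro, assumed here to be what the elided argument uses):

auto result = static_cast<pb_bytes_array_t*>(
    malloc(PB_BYTES_ARRAY_T_ALLOCSIZE(size)));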
Good catch; done.
int i = 0;
for (const auto& kv : object_value_map) {
  result.fields[i] = google_firestore_v1beta1_MapValue_FieldsEntry_init_zero;
Small nit: is this zeroing out necessary?
Not currently.
This message only has two fields, both of which are set immediately below. However, if new fields are added and we don't init_zero them, then they'll be uninitialized. nanopb uses init_zero or similar in its examples, even when exhaustively setting all fields immediately afterwards, presumably for this reason.
If we're intending to zero-initialize these, we should just calloc the array in the first place so we can't forget. See comment above.
No longer relevant. (Where relevant, i.e. for the return value of malloc.)
This is a big improvement! I think there's more we can do.
Feel free to push back or defer work into another PR.
writer->WriteTag({PB_WT_STRING, google_firestore_v1beta1_Document_name_tag});
writer->WriteString(rpc_serializer_.EncodeKey(doc.key()));
result.name =
    rpc_serializer_.EncodeString(rpc_serializer_.EncodeKey(doc.key()));
If EncodeKey returned the nanopb byte array directly, you could eliminate this two-step at every invocation.
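That is, instead of the two-step at each call site, something like the following (the direct-returning EncodeKey signature is hypothetical):

// Current: encode the key to a std::string, then copy it into nanopb bytes.
result.name =
    rpc_serializer_.EncodeString(rpc_serializer_.EncodeKey(doc.key()));
// Suggested: EncodeKey returns the pb_bytes_array_t* directly.
result.name = rpc_serializer_.EncodeKey(doc.key());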
It's occasionally used without the EncodeString() surrounding it (though admittedly mostly in the tests, and even then only for use with libprotobuf, since it has no knowledge of nanopb byte arrays). I suppose we could wrap those usages in DecodeString(), or more likely yet another wrapper to deal with the memory leak.
But it seems simpler to just leave it as is for now. I suspect we'll revisit this when re-evaluating whether we should just use nanopb strings for our internal strings.
OK. I have a mild preference for mirroring the method structure in the other SDKs and pushing testing concerns onto the tests, but I'm happy to defer until later as well.
size_t count = doc.data().object_value().internal_value.size();
result.fields_count = count;
result.fields =
    reinterpret_cast<google_firestore_v1beta1_Document_FieldsEntry*>(malloc(
A parameterized MakeArray function that uses calloc could make this less repetitive:

template <typename T>
T* MakeArray(size_t count) {
  return reinterpret_cast<T*>(calloc(count, sizeof(T)));
}
It cuts down on the actual call site, and eliminates the init-fields-to-zero step in the loop.
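For instance, the call site above would then reduce to something like:

result.fields =
    MakeArray<google_firestore_v1beta1_Document_FieldsEntry>(count);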
Good idea; done.
    const Document& doc) const {
google_firestore_v1beta1_Document LocalSerializer::EncodeDocument(
    const Document& doc) const {
  google_firestore_v1beta1_Document result =
You can write this as

google_firestore_v1beta1_Document result{};

C compilers won't generate aggregate initializers for you, so nanopb ships the _init_zero things. In our case, in C++, they can mostly be avoided.
Here and throughout.
Done.
  }
}
    Reader* reader, const firestore_client_MaybeDocument& proto) const {
  if (!reader->status().ok()) return nullptr;
All of the error cases in here are now in the realm of vanishingly rare, to the degree that they should basically not happen in production.
So long as the caller handles the failure in some useful way, does it really matter if we do more work than strictly necessary?
I think you should eliminate all these early returns.
They protect against hard_asserts; without these early returns, we could end up in a state where our assertions don't hold, which would lead to crashes rather than returned errors. (That said, we can reduce their number; we really only need them before the hard asserts, or before functions that call hard_assert.)
Alternatively, we could eliminate the hard_asserts and replace them with if (!assertion) return nullptr_or_whatever; but I'd rather crash if our assumptions about the system are wrong.
I looked into moving these checks to be right before the hard_assert statements. It mostly works, but because there are functions that hard_assert without access to the nanopb reader/writer objects, it doesn't quite. We could add extra plumbing to make that work, but even then our asserts would be a two-step process (check for error, then assert). So still not great. We could refactor that by adding a SOFT_ASSERT that checks the current status code first, and only then calls HARD_ASSERT?
Options:
- Remove the checks; if a corruption occurs, it'll cause us to crash. (I don't like this; I've been burned by too many flaky drives. I'd prefer to give the developer a chance of noticing and clearing the local cache in an attempt to recover. But this would be a rare occurrence.)
- Keep if (!reader->status().ok()) return T; all over the place.
- Move checks into SOFT_ASSERT, which would only HARD_ASSERT when there's no error. Add plumbing to make the error accessible everywhere. (Mostly already exists.) Upon corruption (or other error), the HARD_ASSERTs would effectively be disabled, and we'd continue on until eventually returning a nonsense value (which is fine because the caller should be checking the status first.)
Thoughts?
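For concreteness, a sketch of the SOFT_ASSERT idea from the last option (a hypothetical macro; it assumes the current status is reachable through a reader pointer):

#define SOFT_ASSERT(reader, condition, message) \
  do {                                          \
    if ((reader)->status().ok()) {              \
      HARD_ASSERT(condition, message);          \
    }                                           \
  } while (0)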
While I consider the goal you're after laudable, I think that in practice this is going to be too hard to get right to make it worth attempting. However, as discussed offline, based on my survey of the HARD_ASSERTs we have, we can make simplifications that allow us to have our cake and eat it too.
The hard asserts you're talking about defending against fall into a few broad categories:
- Assertions about true logic errors that are independent of input, e.g. the one in DecodeMissingDocument. These can remain and don't need any special treatment.
- Assertions that indicate corruption directly, e.g. the one in DecodeResourceName. If our goal is to avoid crashing on bad input, this needs to be changed into a Fail, in which case it no longer needs to be guarded by an early return elsewhere.
- Assertions that indicate corruption indirectly, e.g. the one in DecodeFoundDocument that asserts that the timestamp is non-empty, where the major way that could have happened is a failure during or prior to reading the timestamp. This should also probably be converted to a failure, though since it can be a secondary failure we should double-check that we don't clobber the root cause.
In any case, after we address this there just won't be many assertions in here, so we won't have to keep this extra layer of defense either.
Done. Most asserts have been converted to reader->Fail() (which doesn't overwrite the root cause, so we should be good there.)
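The "doesn't overwrite the root cause" behavior presumably reduces to first-write-wins inside Reader::Fail, along these lines (an assumed sketch; the real signature and error code may differ):

void Reader::Fail(absl::string_view description) {
  // First failure wins: later, secondary failures must not clobber it.
  if (status_.ok()) {
    status_ = util::Status(FirestoreErrorCode::DataLoss,
                           std::string{description});
  }
}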
default:
  reader->Fail(
      "Invalid MaybeDocument message: Neither 'no_document' nor 'document' "
This isn't quite true: we've encountered an unknown child of MaybeDocument, which is slightly different than this message. (Also, we could log the value of which_document_type that got us here.)
Done.
return absl::make_unique<NoDocument>(rpc_serializer_.DecodeKey(name),
                                     *std::move(version));
return absl::make_unique<NoDocument>(
    rpc_serializer_.DecodeKey(rpc_serializer_.DecodeString(proto.name)),
These decodes can conceivably fail and you're doing them after the status check above which will result in an invalid NoDocument.
Somewhat counter-intuitively, I think the fix is to remove the status check. Even if we try to prevent the creation of invalid objects, we should assume we're going to get it wrong: we're going to end up creating bogus instances in the error case anyway because it's hard to verify we're doing this right.
Invalid docs are fine; I'm not trying to avoid that. The check is to prevent hard_assert'ing if there's a corruption. Ideally, though, we'd push the check into DecodeKey(), etc. If you assume DecodeString() fails and could cause a hard assert in DecodeKey(), then the existing check above isn't going to be sufficient. (DecodeString() can't actually fail right now.)
result.target_id = query_data.target_id();
result.snapshot_version = rpc_serializer_.EncodeTimestamp(
    query_data.snapshot_version().timestamp());
result.resume_token = rpc_serializer_.EncodeBytes(query_data.resume_token());
I think there's still a place for nanopb::Writer in our system. If we call pb_release on messages once we're done encoding to bytes, nanopb is going to free every pointer in there. This means that we must copy everything we put in the message just so pb_release doesn't destroy objects we're otherwise holding onto.
However, if we tracked every malloc separately, we could freely mingle model values and malloc'ed objects without fear of nanopb freeing a value we want to retain elsewhere. That tracking has a natural home in the Writer.
Note that tracking shared objects (with the intent of nulling those out before calling pb_release) is harder to do because we encode into objects that allow moves. We could traverse the message to find these (using the field metadata nanopb provides), but this is fragile, both because it depends on nanopb implementation and because it relies on us remembering to track a value we share. Tracking allocations is easier because it's fully under our control and is automatic.
On the reading side we can't take the same approach, since nanopb is doing the mallocing. There we're going to need to pb_release. We can still share values there by taking them from the message and nulling out the source. That way, when pb_release comes along, there will be a null value there and free will do nothing.
Hm, yeah we could track that in the writer. At this point, it isn't doing much else. :)
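One possible shape for that tracking, sketched with assumed names: every buffer allocated while building a message is owned by the Writer and freed in one place, so pb_release never has to run on the encode path.

#include <cstdlib>
#include <vector>

class Writer {
 public:
  // Allocate zeroed storage whose lifetime is tied to this Writer.
  template <typename T>
  T* MakeArray(size_t count) {
    void* p = calloc(count, sizeof(T));
    allocations_.push_back(p);
    return static_cast<T*>(p);
  }

  ~Writer() {
    for (void* p : allocations_) free(p);
  }

 private:
  std::vector<void*> allocations_;
};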
Clang complains about this, though gcc does not.
for (const auto& kv : doc.data().object_value().internal_value) {
  result.fields[i].key = rpc_serializer_.EncodeString(kv.first);
  result.fields[i].value = rpc_serializer_.EncodeFieldValue(kv.second);
  i++;
}

// Encode Document.update_time
These comments made considerably more sense when it took 3-5 lines of non-obvious code for each field, but now that this is a simple assignment, maybe we can skip them? Here and throughout.
In this specific case, if you thought it worthwhile you could combine it with the comments below to draw a contrast. Something like, "Encode update_time but not create_time because we don't use the latter in our on-disk protos." I'm also happy to leave the comment below alone (especially if that's what's on the other platforms).
I've left the below comment, but have removed this one, and all the other now-obvious ones throughout local/remote serializers.
0.3.9.1 -> 0.3.901. This is a result of CocoaPods requiring http://semver.org compliant versions. nanopb's version scheme doesn't fit, so we've used the following conversion formula: nanopb-a.b.c.d => a.b.(c*100+d)
LGTM
if (resource_name.size() <= 4 || resource_name[4] != "documents") {
  reader->Fail(StringFormat("Tried to deserialize invalid key %s",
                            resource_name.CanonicalString()));
  return ResourcePath();
nit: prefer c++11 brace initialization {}
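i.e. the return above would read:

return ResourcePath{};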
Until the updated nanopb pod is available, this can't go into master. I suggest resolving the merge conflicts by merging master, then targeting this at a nanopb-master branch.
Now, we use the re-worked nanopb deserializer (that mallocs). This also
had the advantage of moving the C++ serialization code closer to the
other platforms.
Still TODO: