
Simplify Serializer [En|De]code*() methods to no longer [de]serialize manually. #1816


Merged (20 commits) on Oct 19, 2018

Conversation

rsgowman (Member):

Now, we use the re-worked nanopb deserializer (that mallocs). This also
had the advantage of moving the C++ serialization code closer to the
other platforms.

Still TODO:

  • Reorder the methods within the files to resemble the other platforms.
  • Rework error handling.
  • Encode*() methods.

@rsgowman rsgowman self-assigned this Sep 13, 2018
@wilhuff wilhuff (Contributor) left a comment

Rather than pick this apart in detail, I think it's possible to give high-level feedback:

I think there's too much error handling in here for not enough actual value. Yes, it's important to discard invalid inputs, but we're not talking to arbitrary backends; we're talking to a single trusted server.

In the prior version where we were deserializing by hand there were many more failure cases to worry about and so making all the decoded types optional made sense.

Now it seems as if with just a few adjustments none of these need to be optional, and that would simplify this greatly.

For example, DecodeFieldValue can really only fail on its own if there's a bad null value. It seems like we can make it unconditionally return a FieldValue.
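That suggestion can be sketched with simplified stand-in types. Reader and FieldValue below are hypothetical minimal versions of the SDK's classes, not the real declarations; only the shape of the pattern matters:

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the SDK's Reader and FieldValue types.
struct Reader {
  bool ok() const { return error_.empty(); }
  void Fail(const std::string& msg) {
    if (error_.empty()) error_ = msg;  // keep the first (root-cause) error
  }
  std::string error_;
};

enum class FieldValue { Null, Boolean, Integer };

// Instead of returning absl::optional<FieldValue>, always return a
// FieldValue and record any failure on the reader; callers consult the
// reader's status once instead of unwrapping an optional at every call.
FieldValue DecodeFieldValue(Reader* reader, int which_value_type) {
  switch (which_value_type) {
    case 0: return FieldValue::Null;
    case 1: return FieldValue::Boolean;
    case 2: return FieldValue::Integer;
    default:
      reader->Fail("Invalid type while decoding FieldValue");
      return FieldValue::Null;  // placeholder; the status carries the error
  }
}
```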

@@ -341,70 +275,56 @@ void Serializer::EncodeFieldValue(Writer* writer,
}
}

absl::optional<FieldValue> Serializer::DecodeFieldValue(Reader* reader) {
absl::optional<FieldValue> Serializer::DecodeFieldValue(
Reader* reader, const google_firestore_v1beta1_Value& msg) {
if (!reader->status().ok()) return absl::nullopt;
Contributor:

Do we still need these checks?

Presumably Nanopb was able to deserialize so something reasonable must have come over. I don't expect that this is ever going to fail in practice so I don't really care if we decode more than we need to.

switch (msg.which_value_type) {
case google_firestore_v1beta1_Value_null_value_tag:
if (msg.null_value != google_protobuf_NullValue_NULL_VALUE) {
reader->Fail("Input proto bytes cannot be parsed (invalid null value)");
Contributor:

Does this matter? None of the other clients make this check.

(I'm not opposed to this just wondering if it's really worth it.)

default:
// Unspecified type.
reader->Fail("Invalid type while decoding FieldValue");
return absl::nullopt;
Contributor:

If you returned FieldValue::NullValue here the return type wouldn't need to be absl::optional.

std::string key = Serializer::DecodeString(map_value.fields[i].key);
absl::optional<FieldValue> value =
Serializer::DecodeFieldValue(reader, map_value.fields[i].value);
if (!reader->status().ok()) return absl::nullopt;
Contributor:

Do we really need an early return here?

I expect actual failures to be vanishingly rare. It would be nice to eliminate this boilerplate if possible.

}

absl::optional<ObjectValue::Map> DecodeMapValue(Reader* reader) {
absl::optional<ObjectValue::Map> DecodeMapValue(
Contributor:

This could just return an empty map on error.

if (!reader->status().ok()) return absl::nullopt;

std::string key = Serializer::DecodeString(fields.key);
absl::optional<FieldValue> value =
Contributor:

If DecodeFieldValue returned a FieldValue rather than an optional this could be simpler.

@wilhuff wilhuff mentioned this pull request Sep 14, 2018
@rsgowman (Member Author):

@wilhuff Early feedback is always appreciated, but I think it may be a little premature here. This PR isn't ready for review yet. In particular, note that the commit calls out that error handling still needs to be reworked (amongst other things). I only created it to get travis to build it on the various xcode configurations.

But in any case, taking a quick peek at the feedback, I think we're on the same page. In particular:

  • I agree that most of the optional<> things will likely disappear.
  • Many of the early returns are likely unnecessary. I kept them to be paranoid while developing, but fully expect them to disappear due to optional<> going away. If for some reason I end up keeping optional, then I intend to go back and audit when these early returns are necessary. But I don't think I'm going to have to do that. :)

@rsgowman rsgowman force-pushed the rsgowman/simplify_serializers branch from d461324 to f2f4c17 Compare October 3, 2018 20:02
@rsgowman rsgowman changed the title Simplify Serializer Decode*() methods to no longer deserialize manually. Simplify Serializer [En|De]code*() methods to no longer [de]serialize manually. Oct 3, 2018
@rsgowman rsgowman requested review from var-const and zxu123 October 4, 2018 13:51
@rsgowman rsgowman assigned var-const and zxu123 and unassigned rsgowman Oct 4, 2018
@rsgowman (Member Author) commented Oct 4, 2018:

Ok, I think this is ready for review now. There are a few caveats:

  1. I still need to reorder these methods to match the other ports. (Deferred to a followup.)
  2. f/f/nanopb/* has become rather barren. We could probably merge this back into the serializer itself.
  3. related to above, we provide an identical "free" method in both nanopb/ and remote/serializer. Merging them (via 2 above) would immediately solve this, but if we want to keep them separate, we'll have to do something a little different.
  4. iOS isn't done. We'll need to wait until we resolve the nanopb + malloc thing. In the meantime, we'll need to retarget this to not_master.

I definitely recommend reviewing one commit at a time.

@zxu123 zxu123 (Contributor) left a comment

Generally LGTM

google_protobuf_Timestamp_init_zero;
reader->ReadNanopbMessage(google_protobuf_Timestamp_fields, &timestamp_proto);

Timestamp Serializer::DecodeTimestamp(
Contributor:

Is the absl::optional to Timestamp change required for the change to use the malloc'ed nanopb version? If not, could you elaborate on the reason for the return type change?

Member Author:

No, it's not necessary; we could keep it as optional<>. Keeping it would additionally have the benefit of forcing the caller to check the status (or die otherwise), whereas as currently written, if the caller forgets to check the status, they'll end up using an invalid return value.

The disadvantage of keeping it as optional<>, is that it introduces a lot more boilerplate code, since now all the methods have to continually check all of their values to ensure a value is present before continuing.

So why the change now? Previously, we did most of the deserialization ourselves, so we were a little more paranoid. Now, nanopb does more of it for us, so we feel a little safer.

Other alternatives that we could consider:

  1. Use optional<> for the public methods, and plain T for the private ones. The callers get safety, while minimizing (though not eliminating) the internal boilerplate.
  2. Similar to (1), but switch to statusor<> rather than optional<>. We've effectively eliminated nanopb/reader and nanopb/writer anyways, so if we merge the last little bits into the serializers, then the public methods can just return statusor while the private methods can continue to return T.
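Option (2) above might look roughly like this. StatusOr, Reader, and the Decode* names are illustrative stand-ins, not the SDK's actual declarations:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Minimal stand-in for a StatusOr<T>: holds either a value or an error.
template <typename T>
class StatusOr {
 public:
  explicit StatusOr(T value) : ok_(true), value_(std::move(value)) {}
  static StatusOr Error(std::string msg) {
    StatusOr result;
    result.error_ = std::move(msg);
    return result;
  }
  bool ok() const { return ok_; }
  const T& value() const { return value_; }

 private:
  StatusOr() = default;
  bool ok_ = false;
  T value_{};
  std::string error_;
};

struct Reader {
  std::string error;
  bool ok() const { return error.empty(); }
  void Fail(std::string msg) { if (error.empty()) error = std::move(msg); }
};

// Private helper: returns a plain T and records failures on the reader,
// avoiding optional<> boilerplate at every internal call site.
int DecodeCount(Reader* reader, long raw) {
  if (raw < 0) {
    reader->Fail("count must be non-negative");
    return 0;
  }
  return static_cast<int>(raw);
}

// Public entry point: folds the reader's status into the return type once,
// so external callers cannot forget to check for failure.
StatusOr<int> DecodeCountPublic(long raw) {
  Reader reader;
  int count = DecodeCount(&reader, raw);
  if (!reader.ok()) return StatusOr<int>::Error(reader.error);
  return StatusOr<int>(count);
}
```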

* We need to figure out what, if anything, we want to do about this. For now,
* we'll just ignore this test.
*/
#if 0
TEST_F(SerializerTest, BadBoolValue) {
Contributor:

I think you can just do TEST_F(SerializerTest, DISABLED_BadBoolValue) instead of the condition. But I could be wrong due to our customized setup of the C++ tests in the Xcode project.

Member Author:

Hm, interesting. I'll leave it as is for now; that's guaranteed to work everywhere. (I suspect the eventual answer might be to just delete the test.)

&timestamp_proto);
pb_bytes_array_t* Serializer::EncodeString(const std::string& str) {
auto size = static_cast<pb_size_t>(str.size());
auto result = reinterpret_cast<pb_bytes_array_t*>(
Contributor:

Nit: static_cast would probably be sufficient here (applies everywhere malloc is used).
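For context, a conversion from malloc's void* only needs static_cast; reinterpret_cast is required between unrelated object pointer types, which isn't the case here. A minimal illustration (FieldsEntry is a hypothetical stand-in for a nanopb-generated struct):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a nanopb-generated message struct.
struct FieldsEntry {
  int key;
  int value;
};

FieldsEntry* AllocEntries(std::size_t count) {
  // static_cast is sufficient: void* (malloc's return type) converts to any
  // object pointer type without reinterpret_cast.
  return static_cast<FieldsEntry*>(malloc(count * sizeof(FieldsEntry)));
}
```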

Member Author:

Good catch; done.


int i = 0;
for (const auto& kv : object_value_map) {
result.fields[i] = google_firestore_v1beta1_MapValue_FieldsEntry_init_zero;
Contributor:

Small nit: is this zeroing out necessary?

Member Author:

Not currently.

This message only has two fields, both of which are set immediately below. However, if new fields are added, and we don't init_zero them, then they'll be uninitialized. nanopb uses init_zero or similar in their examples, even when exhaustively setting all fields immediately afterwards, presumably for this reason.

Contributor:

If we're intending to zero-initialize these, we should just calloc the array in the first place so we can't forget. See comment above.

Member Author:

No longer relevant.

Where relevant, i.e. for the return value of malloc
@wilhuff wilhuff (Contributor) left a comment

This is a big improvement! I think there's more we can do.

Feel free to push back or defer work into another PR.

writer->WriteTag({PB_WT_STRING, google_firestore_v1beta1_Document_name_tag});
writer->WriteString(rpc_serializer_.EncodeKey(doc.key()));
result.name =
rpc_serializer_.EncodeString(rpc_serializer_.EncodeKey(doc.key()));
Contributor:

If EncodeKey returned the nanopb byte array directly, you could eliminate this two-step at every invocation.

Member Author:

It's occasionally used without the EncodeString() surrounding it (though admittedly, mostly in the tests, and even then, only for use with libprotobuf, since it has no knowledge of nanopb byte arrays). I suppose we could wrap those usages in DecodeString(), or more likely, yet another wrapper to deal with the memory leak.

But it seems simpler to just leave as is for now. I suspect we'll revisit this again when re-evaluating if we should just use nanopb strings for our internal strings.

Contributor:

OK. I have a mild preference for mirroring the method structure in the other SDKs and pushing testing concerns on the tests, but I'm happy to defer until later as well.

size_t count = doc.data().object_value().internal_value.size();
result.fields_count = count;
result.fields =
reinterpret_cast<google_firestore_v1beta1_Document_FieldsEntry*>(malloc(
Contributor:

A parameterized MakeArray function that uses calloc could make this less repetitive:

template <typename T>
T* MakeArray(size_t count) {
  return reinterpret_cast<T*>(calloc(count, sizeof(T)));
}

It cuts down on the actual call site, and eliminates the init fields to zero step in the loop.
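A runnable sketch of that suggestion, with FieldsEntry standing in for the generated struct: because calloc zero-fills the allocation, the per-element "_init_zero" assignment in the loop becomes unnecessary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

template <typename T>
T* MakeArray(std::size_t count) {
  // calloc zero-initializes the whole block, so every element starts out in
  // the same state the generated _init_zero constant would provide.
  return static_cast<T*>(calloc(count, sizeof(T)));
}

// Hypothetical stand-in for a nanopb-generated FieldsEntry struct.
struct FieldsEntry {
  const char* key;
  int value;
};
```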

Member Author:

Good idea; done.

const Document& doc) const {
google_firestore_v1beta1_Document LocalSerializer::EncodeDocument(
const Document& doc) const {
google_firestore_v1beta1_Document result =
Contributor:

You can write this as

google_firestore_v1beta1_Document result{};

C compilers won't generate aggregate initializers for you, so nanopb ships the _init_zero things. In our case in C++ they can mostly be avoided.

Here and throughout.
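A small illustration of the point, with Document as a hypothetical generated struct: in C++, "T result{};" value-initializes every member to zero, which matches what the generated _init_zero constant provides for C callers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a nanopb-generated message struct.
struct Document {
  const char* name;
  std::uint32_t fields_count;
  void* fields;
};

Document MakeDocument() {
  Document result{};  // all members zeroed; no _init_zero constant needed
  return result;
}
```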

Member Author:

Done.


int i = 0;
for (const auto& kv : object_value_map) {
result.fields[i] = google_firestore_v1beta1_MapValue_FieldsEntry_init_zero;
Contributor:

If we're intending to zero-initialize these, we should just calloc the array in the first place so we can't forget. See comment above.

}
}
Reader* reader, const firestore_client_MaybeDocument& proto) const {
if (!reader->status().ok()) return nullptr;
Contributor:

All of the error cases in here are now in the realm of vanishingly rare, to the degree that they should basically not happen in production.

So long as the caller handles the failure in some useful way, does it really matter if we do more work than strictly necessary?

I think you should eliminate all these early returns.

Member Author:

They protect against hard_asserts; without these early returns, we could end up in a state where our assertions don't hold, which would lead to crashes rather than returned errors. (That said, we can reduce their number; we really only need them before the hard asserts, or before functions that call hard_assert.)

Alternatively, we could eliminate the hard_asserts and replace them with if (!assertion) return nullptr_or_whatever; But I'd rather crash if our assumptions about the system are wrong.

I looked into moving these checks to be right before the hard_assert statements. It mostly works, but because there are functions that hard_assert yet don't have access to the nanopb reader/writer objects, it doesn't quite. We could add extra plumbing to make that work, though. But even then, in this situation, our asserts would be a two-step process (check for error; assert), so still not great. We could refactor that by adding a SOFT_ASSERT that checks the current status code first, and only then calls HARD_ASSERT?

Options:

  1. Remove checks; if a corruption occurs, it'll cause us to crash. (I don't like this; I've been burned by too many flaky drives. I'd prefer to give the developer a chance of noticing and clearing the local cache in an attempt to recover. But this would be a rare occurrence.)
  2. if (!reader->status().ok()) return T; all over the place.
  3. Move checks into SOFT_ASSERT, which would only HARD_ASSERT when there's no error. Add plumbing to make the error accessible everywhere. (Mostly already exists.) Upon corruption (or other error), the HARD_ASSERTS would effectively be disabled, and we'd continue on until eventually returning a nonsense value (which is fine b/c the caller should be checking the status first.)

Thoughts?
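Option (3) could be sketched like this. SOFT_ASSERT and the surrounding types are hypothetical; the sketch records a failure where the real proposal would fall through to HARD_ASSERT when no prior error exists:

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Reader {
  std::string error;
  bool ok() const { return error.empty(); }
  void Fail(std::string msg) { if (error.empty()) error = std::move(msg); }
};

// If the reader has already failed, the assertion is skipped entirely; the
// decode continues producing nonsense values that the caller will discard
// after checking the status. (The real version would HARD_ASSERT instead of
// Fail when there is no prior error.)
#define SOFT_ASSERT(reader, cond, msg)     \
  do {                                     \
    if ((reader)->ok() && !(cond)) {       \
      (reader)->Fail(msg);                 \
    }                                      \
  } while (0)

int DecodeCount(Reader* reader, int raw) {
  SOFT_ASSERT(reader, raw >= 0, "count must be non-negative");
  return reader->ok() ? raw : 0;
}
```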

Contributor:

While I consider the goal you're after laudable, I think that in practice this is going to be too hard to get right to make it worth attempting. However, as discussed offline, based on my survey of the HARD_ASSERTS we have we can make simplifications that allow us to have our cake and eat it too.

The hard asserts you're talking about defending against fall into a few broad categories:

  • Assertions about true logic errors that are independent of input, e.g. the one in DecodeMissingDocument. These can remain and don't need any special treatment.
  • Assertions that indicate corruption directly, e.g. the one in DecodeResourceName. If our goal is to avoid crashing on bad input this needs to be changed into a Fail, in which case this no longer needs to be guarded by an early return elsewhere.
  • Assertions that indicate corruption indirectly, e.g. the one in DecodeFoundDocument that asserts that the timestamp is non-empty; the major way that could have happened is a failure during or prior to reading the timestamp. This should also probably be converted to a failure, though since it can be a secondary failure we should double-check that we don't clobber the root cause.

In any case, after we address this there just won't be many assertions in here so we won't have to keep this extra layer of defense either.

Member Author:

Done. Most asserts have been converted to reader->Fail() (which doesn't overwrite the root cause, so we should be good there.)


default:
reader->Fail(
"Invalid MaybeDocument message: Neither 'no_document' nor 'document' "
Contributor:

This isn't quite true: we've encountered an unknown child of MaybeDocument, which is slightly different than this message. (Also, we could log the value of which_document_type that got us here.)

Member Author:

Done.

return absl::make_unique<NoDocument>(rpc_serializer_.DecodeKey(name),
*std::move(version));
return absl::make_unique<NoDocument>(
rpc_serializer_.DecodeKey(rpc_serializer_.DecodeString(proto.name)),
Contributor:

These decodes can conceivably fail and you're doing them after the status check above which will result in an invalid NoDocument.

Somewhat counter-intuitively, I think the fix is to remove the status check. Even if we try to prevent the creation of invalid objects, we should assume we're going to get it wrong: we're going to end up creating bogus instances in the error case anyway because it's hard to verify we're doing this right.

Member Author:

Invalid docs are fine; I'm not trying to avoid that. The check is to prevent hard_assert'ing if there's a corruption. I think ideally, we'd push the check into DecodeKey(), etc though. If you assume DecodeString() fails, and could cause a hard assert in DecodeKey(), then the existing check above isn't going to be sufficient. (DecodeString() can't actually fail right now.)

result.target_id = query_data.target_id();
result.snapshot_version = rpc_serializer_.EncodeTimestamp(
query_data.snapshot_version().timestamp());
result.resume_token = rpc_serializer_.EncodeBytes(query_data.resume_token());
@wilhuff wilhuff (Contributor) commented Oct 5, 2018:

I think there's still a place for nanopb::Writer in our system. If we call pb_release on messages once we're done encoding to bytes, nanopb is going to free every pointer in there. This means that we must copy everything we put in the message just so pb_release doesn't destroy objects we're otherwise holding onto.

However, if we tracked every malloc separately, we could freely mingle model values and malloc'ed objects without fear of nanopb freeing a value we want to retain elsewhere. That tracking has a natural home in the Writer.

Note that tracking shared objects (with the intent of nulling those out before calling pb_release) is harder to do because we encode into objects that allow moves. We could traverse the message to find these (using the field metadata nanopb provides), but this is fragile both because it depends on nanopb implementation and because it relies on us remembering to track a value we share. Tracking allocations is easier because it's fully under our control and is automatic.

On the reading side we can't take the same approach, since nanopb is doing the mallocing. There we're going to need to pb_release. We can still share values there by taking them from the message and nulling out the source. That way when pb_release comes along there will be a null-value there and free will do nothing.
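The take-and-null idea on the reading side can be modeled without nanopb. Message and ReleaseMessage below are simplified stand-ins for a generated message and pb_release, which frees every owned pointer but skips null ones:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Simplified stand-in for a nanopb message with one malloc'ed field.
struct Message {
  char* name;
};

// Stand-in for pb_release: frees every owned pointer. free(nullptr) is a
// no-op, so fields that were taken and nulled out are left alone.
void ReleaseMessage(Message* msg) {
  free(msg->name);
  msg->name = nullptr;
}

// Take ownership of a decoded value by nulling the source field, so the
// later release cannot free memory we still hold.
char* TakeName(Message* msg) {
  char* taken = msg->name;
  msg->name = nullptr;
  return taken;
}
```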

Member Author:

Hm, yeah we could track that in the writer. At this point, it isn't doing much else. :)

@rsgowman rsgowman assigned wilhuff and unassigned rsgowman Oct 17, 2018
for (const auto& kv : doc.data().object_value().internal_value) {
result.fields[i].key = rpc_serializer_.EncodeString(kv.first);
result.fields[i].value = rpc_serializer_.EncodeFieldValue(kv.second);
i++;
}

// Encode Document.update_time
Contributor:

These comments made considerably more sense when it took 3-5 lines of non-obvious code for each field, but now that this is a simple assignment, maybe we can skip them? Here and throughout.

In this specific case, if you thought it worthwhile you could combine it with the comments below to draw a contrast. Something like, "Encode update_time but not create_time because we don't use the latter in our on-disk protos." I'm also happy to leave the comment below alone (especially if that's what's on the other platforms).

Member Author:

I've left the below comment, but have removed this one, and all the other now-obvious ones throughout local/remote serializers.

@wilhuff wilhuff assigned rsgowman and unassigned wilhuff Oct 17, 2018
0.3.9.1 -> 0.3.901.

This is a result of CocoaPods requiring http://semver.org compliant
versions. Nanopb's version scheme doesn't fit, so we've used the
following conversion formula:

nanopb-a.b.c.d  => a.b.(c*100+d)
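The formula is simple arithmetic; as a sketch (ToSemver is a name invented for illustration):

```cpp
#include <string>

// Maps a four-part nanopb version a.b.c.d onto a semver-compliant
// three-part version a.b.(c*100+d), e.g. 0.3.9.1 -> 0.3.901.
std::string ToSemver(int a, int b, int c, int d) {
  return std::to_string(a) + "." + std::to_string(b) + "." +
         std::to_string(c * 100 + d);
}
```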
@rsgowman rsgowman assigned wilhuff and unassigned rsgowman Oct 19, 2018
@wilhuff wilhuff (Contributor) left a comment

LGTM

if (resource_name.size() <= 4 || resource_name[4] != "documents") {
reader->Fail(StringFormat("Tried to deserialize invalid key %s",
resource_name.CanonicalString()));
return ResourcePath();
Contributor:

nit: prefer c++11 brace initialization {}

@wilhuff (Contributor) commented Oct 19, 2018:

Until the updated nanopb pod is available this can't go into master. I suggest resolving the merge conflicts by merging master then targeting this at a nanopb-master branch.

@rsgowman rsgowman changed the base branch from master to nanopb-master October 19, 2018 20:22
@rsgowman rsgowman merged commit fe7e4a1 into nanopb-master Oct 19, 2018
@paulb777 paulb777 deleted the rsgowman/simplify_serializers branch April 13, 2019 16:01
@firebase firebase locked and limited conversation to collaborators Oct 27, 2019