-
Notifications
You must be signed in to change notification settings - Fork 615
Add explicit FieldValue canonicalization #1178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add explicit FieldValue canonicalization #1178
Conversation
Codecov Report
Continue to review full report at Codecov.
|
} | ||
} | ||
|
||
private static void buildTimestampCanonicalId(StringBuilder builder, Timestamp timestamp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
buildTimestampCanonicalId
reads weirdly and is the third form for names in this file:
- numberEquals
- compareNumber
- buildNumberCanonicalId
It seems like these could all have the same shape.
Ideas:
-
make these verbless, i.e. have
private static void canonicalId(StringBuilder builder, Value value) ...
and these helpers can then just drop the "build" prefix.
-
use a verb like "canonify" and then these become e.g. "canonifyTimestamp".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't bring myself to use canonify
or canonicalize
. I used stringify,
but am open to feedback, even if that means canonify
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"stringify" doesn't really tie into canonical ID though and I think that's probably more important. I don't have any better suggestions though, so feel free to put it back to makeCanonicalId or whatever it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will rename these methods to use "canonify" instead. Apparently other people are using this terminology too: https://docs.cfengine.com/docs/3.12/reference-functions-canonify.html
} | ||
|
||
private static void buildGeoPointCanonicalId(StringBuilder builder, LatLng latLng) { | ||
builder.append(String.format("{lat:%s,lng:%s}", latLng.getLatitude(), latLng.getLongitude())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indistinguishable from a user-supplied object that happens to have these parameters. While JSON-like seems useful we could avoid collisions altogether if we made this look like an object constructor instead.
Something like geo(lat:%s,lng:%s)
or even just geo(%s, %s)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to keep the IDs short. geo(%s, %s)
certainly is short enough and solves the ambiguity.
private static void buildObjectCanonicalId(StringBuilder builder, MapValue mapValue) { | ||
builder.append("{"); | ||
boolean first = true; | ||
for (Map.Entry<String, Value> entry : mapValue.getFieldsMap().entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you intending to make it an invariant that MapValues are always sorted by key? I don't recall seeing this in your other PR adding protovalue.ObjectValue.Builder
.
If not, we probably need to sort the strings here :-(. I think maintaining the invariant is probably better though.
Note that server-supplied MapValues already conform to this invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MapValues are backed by a LinkedHashMap, and keeping this sorted in ObjectValue.Builder would be pretty expensive. Even though it feels like the wrong thing to do here, I don't see a good solution that doesn't involve resorting here. I experimented with sorted views, but in the end we probably don't need to go too crazy as large maps in Query constraints are hopefully uncommon.
// Even though MapValue are likely sorted correctly based on their insertion order (e.g. when | ||
// received from the backend), local modifications can bring elements out of order. We need to | ||
// re-sort the elements to ensure that canonical IDs are independent of insertion order. | ||
SortedMap<String, Value> sortedMap = new TreeMap<>(mapValue.getFieldsMap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Building a binary tree to sort the entries is pretty wasteful in terms of memory allocation. It's also pessimistic because building the tree is always going to cost O(n lg(n)). An alternative would be to collect the keys into an ArrayList and sort that. That has the benefit of allocating less memory and also runs in O(n) if the list is already sorted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through different implementations for optimize for both runtime complexity and readability. While I still think that using large Maps here is kind of an anti-pattern (especially given our lack of truncation support), adding an extra line here to pre-sort the keys (and performing a key-based lookup on the HashMap) might be a worthy tradeoff. Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// received from the backend), local modifications can bring elements out of order. We need to | ||
// re-sort the elements to ensure that canonical IDs are independent of insertion order. | ||
List<String> keys = new ArrayList<>(mapValue.getFieldsMap().keySet()); | ||
Collections.sort(keys); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can make this keys.sort()
if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL: List.sort() is API Level 24.
Adds a new format for canonical IDs of field values that is meant to be verifiable and stable.