-
Notifications
You must be signed in to change notification settings - Fork 930
Implement limbo resolution throttling. #2790
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Binary Size ReportAffected SDKs
Test Logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the volume of comments this actually looks great. A lot of the comments are more nitpicky than substantive.
One high-level thought though: limbo resolution is now somewhat complex (or at least more than before). There's a flow to the way documents go through limbo and that's not really documented anywhere. While it's possible to document this inline in the SyncEngine
, pulling this process out into a separate LimboResolver
class would give you a class comment as a place to tie this all together. If that refactoring doesn't result in any crazy indirection, it seems like it would also be helpful in reducing the number of things for which SyncEngine
is directly responsible.
} | ||
} | ||
|
||
private startEnqueuedLimboResolutions(): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is off:
- we tend to use "start" and "stop" for beginning and ending continuous processes, like
startWatchStream
, after which the watch stream has started until we stop it and - this method doesn't necessarily start any resolutions if the number of pending resolutions is at the maximum.
In the RemoteStore
there's a fillWritePipeline
that has a similar high level behavior. The analogous name would be fillLimboResolutionPipeline
, but while it's an improvement I can't say I think this is an obvious winner, probably because we don't explicitly have a pipeline the way the remote store does. The symmetry is appealing though.
Other ideas:
drainLimboResolutionQueue
is problematic because it suggests waiting for all the queued resolutions.- I've seen "pump" used for this kind of operation.
fillActiveLimboResolutions
? (building on pending vs active, suggested elsewhere)
All things considered I'd probably just go with fillLimboResolutionPipeline
for symmetry with RemoteStore
more than anything but I'm definitely open to alternatives that avoid "start".
Also, please add comments. The comments above fillWritePipeline
are a good example: what the thing does and when it should be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add the requested documentation to the method.
Regarding the naming issue... what about using the verb "trigger" in the method name? The "pipeline" idea makes sense too, but the method in question is not technically "filling" the pipeline and I find that verb to be misleading. The "pump" verb is sounding good to me though. According to https://en.wikipedia.org/wiki/Event_loop the verb "pump" has a meaning in an event loop/message queue context, which has many analogues to the limbo resolution queue. That article also suggests the verb "dispatch".
So here are some other names to consider:
triggerLimboResolutions()
pumpLimboResolutionQueue()
(my preference)dispatchLimboResolutionListens()
Regarding the limbo resolution state names, I'm leaning towards the terms "enqueued" and "active". I prefer "enqueued" over "pending" because, IMO, it is a more concrete term than "pending" and therefore requires less explanation in documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you implemented was pumpLimboResolutionListenQueue
. I offer that you should construct this as "pump" plus the name of the thing. If you go with enqueuedLimboResolutions
, above, then match with pumpEnqueuedLimboResolutions
.
I'm also fine with "trigger" if that reads better to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've renamed the method to pumpEnqueuedLimboResolutions()
.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are now a number of limbo-related structures in here and it would be worthwhile to document them in relation to one another and give an overview of how they're supposed to fit together.
Also, since things start in the limboListenQueue
, I suggest making it structurally first among these. That way, when reading through this list, the structure suggests the flow.
Also, this specific comment doesn't do much to suggest what the queue is actually for. It's worth calling out that these are waiting to start a the query. Maybe something like this?
The keys of documents that are in limbo for which we haven't yet started a limbo resolution query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added/improved the documentation for limboListenQueue
, limboTargetsByKey
, and limboResolutionsByTarget
. I don't have enough context to provide meaningful documentation for limboDocumentRefs
. Since this PR does not touch anything related to limboDocumentRefs
I've left it alone. I also moved limboListenQueue
in the code to be structurally first.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ | |||
private limboListenQueue: DocumentKey[] = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between a "limbo listen" and a "limbo resolution"? Is there any? If not I suggest sticking with one name to reduce the conceptual load here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A limbo "resolution" is the entire process that begins when a document enters the limbo state and ends when the limbo state is resolved (i.e. after getting a response from the server). A "listen" is the thing that the remoteStore does when we call remoteStore.listen(...). I agree, however, that the terminology is confusing and I'll make any changes needed to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still concerned about the "listen" in limboListenQueue
as a name here. Even your comment doesn't mention "listens" so it's not clear if there's an actual conceptual difference between limbo resolution and a limbo listen or if this is just an accidental inconsistency that wasn't caught in review.
Since you've resolved to distinguish active and enqueued limbo resolutions, why can't this be enqueuedLimboResolutions
? That is: instead of focusing on the fact that they're enqueued to start a listener, just know that these are limbo resolutions in the enqueued state without specifying exactly what they're waiting for. This resolves the inconsistency without any effective loss of clarity, because exactly what's next after enqueueing isn't a critical detail in this name.
If you prefer "Queue" in the name, "deferred" might better name for the enqueued resolutions because then you can make this deferredLimboResolutionQueue
. enqueuedLimboResolutionQueue
looks strange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've renamed the variable to enqueuedLimboResolutions
.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ | |||
private limboListenQueue: DocumentKey[] = []; | |||
private limboDocumentRefs = new ReferenceSet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of the existing variable names are now ambiguous because it's not clear if they apply to active limbo resolutions, inactive limbo resolutions, or all. I suggest prefixing all these names to clearly indicate which kind each is.
Other ideas for ways to clarify this:
- active vs pending (though "pending" could also possibly be read to mean the resolution for which we're waiting on a result)
- active vs deferred?
If we document this distinction though I think any pair could be good and an improvement. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the names "enqueued" and "active" for the states of documents that are in limbo. Since a limbo resolution is "active" if and only if it is associated with a target, the limboTargetsByKey
and limboResolutionsByTarget
imply that they only apply to "active" limbo resolutions. Similarly, a limbo resolution is "enqueued" if and only if its document key is in the limboListenQueue
. My sense is that adding "active" and "inactive" to these variable names will be confusing because they have no counterpart for the opposite state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variable names will be confusing because they have no counterpart for the opposite state
That's an interesting perspective but I think the reasoning is kind of subtle: that because only active limbo resolutions can have a target the active-ness is self-explanatory. Would this have been self-evident to you a week ago?
My reasoning was that the prefix helps identify which variables are used in which state, and suggests which elements have to be modified as you transition between states. That is, this would have the effect of grouping these things together more than necessarily differentiating similar structures of the same type that might be used in the different states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I've added "active" and "enqueued" to all relevant variable names throughout this PR.
private currentUser: User | ||
) {} | ||
private currentUser: User, | ||
maxConcurrentLimboResolutions?: number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be made simpler by just defaulting the argument.
private maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I also removed the DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS
constant and just inlined the value of 100. IMO this improves readability at no cost to performance or generality.
added: [docB1, docB2, docB3], | ||
fromCache: true | ||
}) | ||
// After the existence filter mismatch the client re-listens without |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using "after" here makes this seem like the existence filter mismatch is happening elsewhere, but it's really happening here right? I suggest rephrasing this to something that more actively explains why this is going on:
The view now contains the docAs and the docBs (6 documents), but the existence filter indicated only three should match. This causes the client to re-listen without a resume token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I took your suggestion verbatim.
.watchAcks(docA3Query) | ||
.watchCurrents(docA3Query, 'resume-token-1004') | ||
.watchSnapshots(1004) | ||
// Watch has now confirmed that docA3 has been deleted. So the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are good comments, but they're a little wordy. If you could make them more concise it would help the next reader focus on the interesting bits. For example:
- "the listener sends an event about this" is mostly superfluous because we're always sending events.
- Similarly, "since we are in sync with the server and all docs that were in limbo have been resolved" is at least self-duplicating because "in sync" implies no documents in limbo.
you could shorten it like so:
Watch has confirmed the final limbo document has been deleted so expect an event with fromCache=false, indicating we're in sync again.
I realize this is something of a tall order because it's not obvious which behaviors are implicit, but give it a shot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Comments have been overhauled, as per our offline discussions.
@@ -353,7 +358,8 @@ export class SpecBuilder { | |||
enableNetwork: false, | |||
expectedState: { | |||
activeTargets: {}, | |||
limboDocs: [] | |||
limboDocs: [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, limboDocs
is now ambiguous. It's worth carrying the active/pending or similar distinction through to here so that it's consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Note that this causes a backwards incompatibility with the JSON test specs that are generated for Android and iOS. Once this PR is merged, I will submit PRs to those two repositories to use "activeLimboDocs" instead of "limboDocs".
* target. For the limbo resolutions that should be active, a targetId will | ||
* be assigned if it hasn't already been assigned. | ||
*/ | ||
expectLimboDocsEx(limboDocs: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a super fan of the "Ex" convention. You see this a lot in long-lived stable APIs that can't be changed, especially in languages where overloads aren't possible (the example that comes to mind are Win32 C APIs). This is neither of those cases, so I suggest either:
- just make this an overload of
expectLimboDocs
or - make
expectLimboDocs
only specify active limbo documents and add a separateexpectInactiveLimboDocs
.
I'd favor the latter, since it's simpler (in typescript, overloads exist for callers, but the implementation still has to manually disambiguate which variant is being called).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your suggestion of adding a new expectInactiveLimboDocs()
method. I've made this change.
@@ -405,6 +406,7 @@ abstract class TestRunner { | |||
); | |||
|
|||
private expectedLimboDocs: DocumentKey[]; | |||
private expectedInactiveLimboDocs: DocumentKey[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please resolve the ambiguity of the expectedLimboDocs
name too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Also, in the description, please indicate which issue this addresses and give an overview of how it address it. |
@wilhuff I have updated the description of this PR, as requested. Thank you for the thorough review and excellent feedback! Regarding your suggestion of moving the limbo resolution logic out of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left two comments outside of this review (because I just learned about "reviews" in GitHub today!).
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added/improved the documentation for limboListenQueue
, limboTargetsByKey
, and limboResolutionsByTarget
. I don't have enough context to provide meaningful documentation for limboDocumentRefs
. Since this PR does not touch anything related to limboDocumentRefs
I've left it alone. I also moved limboListenQueue
in the code to be structurally first.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ | |||
private limboListenQueue: DocumentKey[] = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A limbo "resolution" is the entire process that begins when a document enters the limbo state and ends when the limbo state is resolved (i.e. after getting a response from the server). A "listen" is the thing that the remoteStore does when we call remoteStore.listen(...). I agree, however, that the terminology is confusing and I'll make any changes needed to be consistent.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ | |||
private limboListenQueue: DocumentKey[] = []; | |||
private limboDocumentRefs = new ReferenceSet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the names "enqueued" and "active" for the states of documents that are in limbo. Since a limbo resolution is "active" if and only if it is associated with a target, the limboTargetsByKey
and limboResolutionsByTarget
imply that they only apply to "active" limbo resolutions. Similarly, a limbo resolution is "enqueued" if and only if its document key is in the limboListenQueue
. My sense is that adding "active" and "inactive" to these variable names will be confusing because they have no counterpart for the opposite state.
private currentUser: User | ||
) {} | ||
private currentUser: User, | ||
maxConcurrentLimboResolutions?: number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I also removed the DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS
constant and just inlined the value of 100. IMO this improves readability at no cost to performance or generality.
// resume token. | ||
.restoreListen(query, 'resume-token-1000') | ||
.watchAcks(query) | ||
// Watch now tells us that the query results on the server are docB1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I took your suggestion verbatim.
added: [docB1, docB2, docB3], | ||
fromCache: true | ||
}) | ||
// After the existence filter mismatch the client re-listens without |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I took your suggestion verbatim.
.watchAcks(docA3Query) | ||
.watchCurrents(docA3Query, 'resume-token-1004') | ||
.watchSnapshots(1004) | ||
// Watch has now confirmed that docA3 has been deleted. So the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Comments have been overhauled, as per our offline discussions.
@@ -353,7 +358,8 @@ export class SpecBuilder { | |||
enableNetwork: false, | |||
expectedState: { | |||
activeTargets: {}, | |||
limboDocs: [] | |||
limboDocs: [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Note that this causes a backwards incompatibility with the JSON test specs that are generated for Android and iOS. Once this PR is merged, I will submit PRs to those two repositories to use "activeLimboDocs" instead of "limboDocs".
* target. For the limbo resolutions that should be active, a targetId will | ||
* be assigned if it hasn't already been assigned. | ||
*/ | ||
expectLimboDocsEx(limboDocs: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your suggestion of adding a new expectInactiveLimboDocs()
method. I've made this change.
@@ -405,6 +406,7 @@ abstract class TestRunner { | |||
); | |||
|
|||
private expectedLimboDocs: DocumentKey[]; | |||
private expectedInactiveLimboDocs: DocumentKey[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -148,9 +148,13 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
q.canonicalId() | |||
); | |||
private queriesByTarget: { [targetId: number]: Query[] } = {}; | |||
/** The keys of documents that are in limbo for which we haven't yet started a limbo resolution query. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
80 columns. Here and below. Unfortunately, the formatter doesn't enforce this for comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. I had assumed that yarn prettier
would take care of this.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ | |||
private limboListenQueue: DocumentKey[] = []; | |||
private limboDocumentRefs = new ReferenceSet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variable names will be confusing because they have no counterpart for the opposite state
That's an interesting perspective but I think the reasoning is kind of subtle: that because only active limbo resolutions can have a target the active-ness is self-explanatory. Would this have been self-evident to you a week ago?
My reasoning was that the prefix helps identify which variables are used in which state, and suggests which elements have to be modified as you transition between states. That is, this would have the effect of grouping these things together more than necessarily differentiating similar structures of the same type that might be used in the different states.
@@ -154,6 +155,9 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |||
private limboResolutionsByTarget: { | |||
[targetId: number]: LimboResolution; | |||
} = {}; | |||
private readonly maxConcurrentLimboResolutions: number = DEFAULT_MAX_CONCURRENT_LIMBO_RESOLUTIONS; | |||
/** The keys of documents whose limbo resolutions are enqueued. */ | |||
private limboListenQueue: DocumentKey[] = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still concerned about the "listen" in limboListenQueue
as a name here. Even your comment doesn't mention "listens" so it's not clear if there's an actual conceptual difference between limbo resolution and a limbo listen or if this is just an accidental inconsistency that wasn't caught in review.
Since you've resolved to distinguish active and enqueued limbo resolutions, why can't this be enqueuedLimboResolutions
? That is: instead of focusing on the fact that they're enqueued to start a listener, just know that these are limbo resolutions in the enqueued state without specifying exactly what they're waiting for. This resolves the inconsistency without any effective loss of clarity, because exactly what's next after enqueueing isn't a critical detail in this name.
If you prefer "Queue" in the name, "deferred" might better name for the enqueued resolutions because then you can make this deferredLimboResolutionQueue
. enqueuedLimboResolutionQueue
looks strange.
} | ||
} | ||
|
||
private startEnqueuedLimboResolutions(): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you implemented was pumpLimboResolutionListenQueue
. I offer that you should construct this as "pump" plus the name of the thing. If you go with enqueuedLimboResolutions
, above, then match with pumpEnqueuedLimboResolutions
.
I'm also fine with "trigger" if that reads better to you.
* Starts listens for documents in limbo that are enqueued for resolution. | ||
* | ||
* When a document goes into limbo it is enqueued for resolution. This method | ||
* repeatedly removes entries from the limbo resolution queue and starts a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your description here uses "limbo resolution queue". I think this supports renaming "limbo listen" to "limbo resolution".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I will update this PR to use consistent terminology.
* (either successfully or unsuccessfully). This ensures that all documents in | ||
* limbo are eventually resolved. | ||
* | ||
* A maximum number of concurrent limbo resolution listens was implemented to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment buries the lede: limiting concurrent limbo resolution is a top-line feature of this method. It also focuses on the change we made instead of focusing on documenting the intended behavior and what motivated that. In particular, it's not clear why "resource exhausted" errors are necessarily a problem. I suggest moving this into the first paragraph and emphasizing what this fixes. Something like this:
Starts listens for documents in limbo that are enqueued for resolution, subject to a maximum number of concurrent resolutions. Without bounding the number of concurrent resolutions, the server can fail with resource exhausted errors which can lead to pathological client behavior as seen in #2683.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I like your comment a lot better. I've removed some of the extra explanation from the comments since they are bound to go out-of-date anyways with future changes to the sync engine.
/** | ||
* Expects a document to be in limbo, but *without* a targetId. | ||
*/ | ||
expectInactiveLimboDocs(...keys: DocumentKey[]): this { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Elsewhere you're calling these "enqueued" limbo docs. Let's be consistent throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Do you think it is okay to leave the expectedLimboDocs()
method with its current name? It does not explicitly call out the fact that it is referring to "active" limbo documents, but this is the typical case used in many places in the spec tests. Moreover, the difference between "enqueued" and "inactive" only matters here in the spec tests for limbo resolution throttling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see this going two ways:
- Just leave it as-is, which is pretty much fine
- Make
expectedLimboDocs
check both active and inactive docs and make a thirdexpectedActiveLimboDocs
that just checks actives.
Personally, I think (2) is veering into overengineering territory so I'm OK with just leaving it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree. I'll leave it as-is, with expectedLimboDocs()
being the name of the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very close to LGTM
this.sharedClientState, | ||
this.user | ||
); | ||
if (this.maxConcurrentLimboResolutions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's more configuration knobs like this one, this kind of separate invocation for each configuration will explode in complexity. It's probably better to have a named constant, export it, and just use that default value here.
Alternatively, maybe set the default here to something obscenely huge to prevent limbo throttling altogether if it's not intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Thanks for pointing this out because I didn't like it either. I've changed maxConcurrentLimboResolutions
to a required parameter, delegating the responsibility to the caller to determine the value. I then updated firestore_client.ts
to specify the value of 100
and used Number.MAX_SAFE_INTEGER
in the spec tests, unless explicitly overridden by a specific test.
/** | ||
* Expects a document to be in limbo, but *without* a targetId. | ||
*/ | ||
expectInactiveLimboDocs(...keys: DocumentKey[]): this { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see this going two ways:
- Just leave it as-is, which is pretty much fine
- Make
expectedLimboDocs
check both active and inactive docs and make a thirdexpectedActiveLimboDocs
that just checks actives.
Personally, I think (2) is veering into overengineering territory so I'm OK with just leaving it.
…gine constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This change to the change log was forgotten in the PR that added throttling: #2790
This change was motivated by the following bug report: #2683. In summary, the issue was that when a large number of documents went into limbo (in this case 15,000 documents) then this would cause 15,000 document listens, which would result in "resource-exhausted" errors. The client would then back off for a bit but would try the listens again, which would again result in "resource-exhausted" errors. And the cycle would continue. Worst of all, the customer was being billed for all of these reads that had no beneficial effects for the clients.
In order to avoid this situation, this PR modifies the limbo resolution logic to "throttle" the limbo resolutions. It does this by allowing at most 100 concurrent limbo resolutions. Limbo resolutions in excess of this limit are queued up and not started until another limbo resolution completes. This fix should avoid the "resource-exhausted" errors and the infinite read loop.