-
Notifications
You must be signed in to change notification settings - Fork 940
[Multi-Tab] Recover from unexpected primary lease loss #984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
c40ff09
e51265b
a7c8497
f9c7872
34d52ae
692b789
3de1af4
c243b26
ee94d57
3da6855
cbd58e4
72aa4a9
7f685f9
b1c7e8d
6d3628d
794e9f7
e203796
5c56479
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -276,26 +276,26 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |
*/ | ||
// PORTING NOTE: Multi-tab only. | ||
private synchronizeViewAndComputeSnapshot( | ||
query: Query, | ||
targetId: TargetId | ||
queryView: QueryView | ||
): Promise<ViewChange> { | ||
return this.localStore.executeQuery(query).then(async docs => { | ||
return this.localStore.executeQuery(queryView.query).then(async docs => { | ||
return this.localStore | ||
.remoteDocumentKeys(targetId) | ||
.remoteDocumentKeys(queryView.targetId) | ||
.then(async remoteKeys => { | ||
const queryView = this.queryViewsByTarget[targetId]; | ||
assert(!!queryView, 'Cannot reconcile missing view'); | ||
const viewChange = queryView.view.synchronizeWithPersistedState( | ||
// We do not update our tracking of limbo documents since | ||
//`resetCurrent` will set `current` to false for the querie's view | ||
// (and won't return any limbo documents). We also explicitly don't | ||
// track limbo documents as secondary. | ||
const viewSnapshot = queryView.view.synchronizeWithPersistedState( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment still doesn't explain why we are resetting 'current' and why we only do it as primary. I (non-confidently) think maybe the reasons are:
In general I'm fuzzy on what the desired behavior is and how we're implementing it. Perhaps the desired behavior is that at the point of a primary change, all clients (secondary and master) mark all views as non-current? But if that's the case, why can't we just unconditionally pass true here for resetCurrent? Sorry if I'm just confused. Feel free to ping me on chat to explain what's going on and we can figure out what comments/whatever might help this to make sense to me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As per our discussion yesterday, I changed this logic to no longer set current to false. That means that we now transition primary state without any user feedback and also that we could theoretically get limbo documents at this point. Note that I am not quite sure how I can trigger limbo documents at this point (the new statement that I added). Any test case that I can think of has the query already marked Non-CURRENT by the previous primary tab (based on the limbo docs from the previous computation). |
||
docs, | ||
remoteKeys, | ||
/* resetLimboDocuments= */ this.isPrimary, | ||
/* resetCurrent= */ this.isPrimary | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm having trouble understanding what these flags do. I wonder if "reset" should be "recompute" or "update" (reset makes it sound like they'll just be cleared, which I don't think is the case)... And if we pass false does that mean current and limboDocuments will be left 100% as-is? I guess ultimately I'm wondering why we can't just unconditionally recompute these. I also wonder if we could sensibly combine these into a single argument. Since current depends on limboDocuments, it probably never makes sense to recompute current without computing limbo documents? Anyway, at minimum it will probably take some comments to appease me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I simplified this a little to unconditionally re-compute the list of limbo documents. In clients that are in-sync this is a no-op, and in clients that are out of sync, this will bring them back into sync. The only bit that I still need to provide is whether to set Notice that I also added a TODO to mark the queries as non-current in SharedClientState. That's probably a three line change but I wanted to keep this PR from exploding even further. |
||
); | ||
await this.updateTrackedLimbos( | ||
queryView.targetId, | ||
viewChange.limboChanges | ||
assert( | ||
!this.isPrimary || viewSnapshot.limboChanges.length === 0, | ||
'Received limboChanges when query is not CURRENT.' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I stared at the comment and this assert for a while before I fully understood what's going on. One thing that would help is if you changed the message to "Primary received ..." There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I removed this assert and am now tracking these documents in the if I am the primary client. I could also pass in a |
||
); | ||
return viewChange; | ||
return viewSnapshot; | ||
}); | ||
}); | ||
} | ||
|
@@ -787,10 +787,7 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |
err: FirestoreError | ||
): Promise<void> { | ||
if (isPrimaryLeaseLostError(err)) { | ||
log.debug( | ||
LOG_TAG, | ||
'Unexpectedly lost primary lease, attempting to recover' | ||
); | ||
log.debug(LOG_TAG, 'Unexpectedly lost primary lease'); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You still need to update the method name and comment... Please mention that when this error is thrown, there will automatically be a lease refresh scheduled so we should very quickly get an applyPrimaryState() call from the persistence layer. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Update the method and comment. Method is now |
||
} else { | ||
throw err; | ||
} | ||
|
@@ -865,47 +862,60 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer { | |
} | ||
|
||
/** | ||
* Reconcile the local query views with those the state from persistence. | ||
* Raises snapshots for any changes that affect the current client. | ||
* Reconcile the query views of the provided query targets with the state from | ||
* persistence. Raises snapshots for any changes that affect the local | ||
* client and returns the updated state of all target's query data. | ||
*/ | ||
// PORTING NOTE: Multi-tab only. | ||
private synchronizeQueryViewsAndRaiseSnapshots( | ||
targets: TargetId[] | ||
): Promise<QueryData[]> { | ||
let p = Promise.resolve(); | ||
const activeQueries: QueryData[] = []; | ||
const newViewSnapshots: ViewSnapshot[] = []; | ||
for (const targetId of targets) { | ||
// TODO(multitab): We should mark queries NON-CURRENT when the primary tab | ||
// re-computes its views. | ||
p = p.then(async () => { | ||
let queryData; | ||
const query = await this.localStore.getQueryForTarget(targetId); | ||
if (this.queryViewsByTarget[targetId] === undefined) { | ||
assert( | ||
this.isPrimary, | ||
'A secondary tab should never have an active query without an active view.' | ||
); | ||
// For queries that never executed on this client, we need to | ||
// allocate the query in LocalStore and initialize a new View. | ||
queryData = await this.localStore.allocateQuery(query); | ||
await this.initializeViewAndComputeSnapshot(queryData, false); | ||
} else { | ||
let queryData: QueryData; | ||
const queryView = this.queryViewsByTarget[targetId]; | ||
if (queryView) { | ||
// For queries that have a local View, we need to update their state | ||
// in LocalStore (as the resume token and the snapshot version | ||
// might have changed) and reconcile their views with the persisted | ||
// state (the list of syncedDocuments may have gotten out of sync). | ||
await this.localStore.releaseQuery(query, true); | ||
queryData = await this.localStore.allocateQuery(query); | ||
await this.localStore.releaseQuery( | ||
queryView.query, | ||
/*keepPersistedQueryData=*/ true | ||
); | ||
queryData = await this.localStore.allocateQuery(queryView.query); | ||
const viewChange = await this.synchronizeViewAndComputeSnapshot( | ||
query, | ||
queryData.targetId | ||
queryView | ||
); | ||
if (viewChange.snapshot) { | ||
this.syncEngineListener!.onWatchChange([viewChange.snapshot]); | ||
newViewSnapshots.push(viewChange.snapshot); | ||
} | ||
} else { | ||
assert( | ||
this.isPrimary, | ||
'A secondary tab should never have an active query without an active view.' | ||
); | ||
// For queries that never executed on this client, we need to | ||
// allocate the query in LocalStore and initialize a new View. | ||
const query = await this.localStore.getQueryForTarget(targetId); | ||
queryData = await this.localStore.allocateQuery(query); | ||
await this.initializeViewAndComputeSnapshot( | ||
queryData, | ||
/*current=*/ false | ||
); | ||
} | ||
activeQueries.push(queryData); | ||
}); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hrm... When we become primary we're going to add queryViews for targets we don't have local listeners for. So when you transition non-primary=>primary=>non-primary, you'll end up with a bunch of extra queryViews. While perhaps not harmful, I'd prefer if this were more deterministic. So should we remove no-longer-needed queryViews here too? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's unfortunately no API yet that allows me to determine whether a view is "local". SharedClientState does have this information. But since this will require some plumbing, I turned this part of the cleanup into a TODO. |
||
} | ||
return p.then(() => activeQueries); | ||
return p.then(() => { | ||
this.syncEngineListener!.onWatchChange(newViewSnapshots); | ||
return activeQueries; | ||
}); | ||
} | ||
|
||
// PORTING NOTE: Multi-tab only | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -585,10 +585,7 @@ export class RemoteStore implements TargetMetadataProvider { | |
|
||
private tryRecoverFromPrimaryLeaseLoss(err: FirestoreError): void { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please rename this if you rename the other one. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
if (isPrimaryLeaseLostError(err)) { | ||
log.debug( | ||
LOG_TAG, | ||
'Unexpectedly lost primary lease, attempting to recover' | ||
); | ||
log.debug(LOG_TAG, 'Unexpectedly lost primary lease'); | ||
} else { | ||
throw err; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing space after
//
andquerie's
typoThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done