Skip to content

[Multi-Tab] Recover from unexpected primary lease loss #984

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jul 25, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
82 changes: 46 additions & 36 deletions packages/firestore/src/core/sync_engine.ts
Original file line number Diff line number Diff line change
Expand Up @@ -276,26 +276,26 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer {
*/
// PORTING NOTE: Multi-tab only.
private synchronizeViewAndComputeSnapshot(
query: Query,
targetId: TargetId
queryView: QueryView
): Promise<ViewChange> {
return this.localStore.executeQuery(query).then(async docs => {
return this.localStore.executeQuery(queryView.query).then(async docs => {
return this.localStore
.remoteDocumentKeys(targetId)
.remoteDocumentKeys(queryView.targetId)
.then(async remoteKeys => {
const queryView = this.queryViewsByTarget[targetId];
assert(!!queryView, 'Cannot reconcile missing view');
const viewChange = queryView.view.synchronizeWithPersistedState(
// We do not update our tracking of limbo documents since
//`resetCurrent` will set `current` to false for the querie's view
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing space after // and querie's typo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// (and won't return any limbo documents). We also explicitly don't
// track limbo documents as secondary.
const viewSnapshot = queryView.view.synchronizeWithPersistedState(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment still doesn't explain why we are resetting 'current' and why we only do it as primary.

I (non-confidently) think maybe the reasons are:

  1. We reset 'current' as primary because we expect synchronizeViewAndComputeSnapshot() to only be called immediately after we become primary and since we haven't had an active network connection, we should treat the queries as non-current (and raise fromCache=true events) until we get the initial response from Watch.
  2. Secondaries don't need to set current to false because it'll happen... somewhere else? (I guess where you added the TODO?)

In general I'm fuzzy on what the desired behavior is and how we're implementing it. Perhaps the desired behavior is that at the point of a primary change, all clients (secondary and master) mark all views as non-current? But if that's the case, why can't we just unconditionally pass true here for resetCurrent?

Sorry if I'm just confused. Feel free to ping me on chat to explain what's going on and we can figure out what comments/whatever might help this to make sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per our discussion yesterday, I changed this logic to no longer set current to false. That means that we now transition primary state without any user feedback and also that we could theoretically get limbo documents at this point.

Note that I am not quite sure how I can trigger limbo documents at this point (the new statement that I added). Any test case that I can think of has the query already marked Non-CURRENT by the previous primary tab (based on the limbo docs from the previous computation).

docs,
remoteKeys,
/* resetLimboDocuments= */ this.isPrimary,
/* resetCurrent= */ this.isPrimary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding what these flags do. I wonder if "reset" should be "recompute" or "update" (reset makes it sound like they'll just be cleared, which I don't think is the case)... And if we pass false does that mean current and limboDocuments will be left 100% as-is? I guess ultimately I'm wondering why we can't just unconditionally recompute these. I also wonder if we could sensibly combine these into a single argument. Since current depends on limboDocuments, it probably never makes sense to recompute current without computing limbo documents?

Anyway, at minimum it will probably take some comments to appease me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified this a little to unconditionally re-compute the list of limbo documents. In clients that are in-sync this is a no-op, and in clients that are out of sync, this will bring them back into sync. The only bit that I still need to provide is whether to set current to false. I added some documentation to clear this up a little.

Notice that I also added a TODO to mark the queries as non-current in SharedClientState. That's probably a three line change but I wanted to keep this PR from exploding even further.

);
await this.updateTrackedLimbos(
queryView.targetId,
viewChange.limboChanges
assert(
!this.isPrimary || viewSnapshot.limboChanges.length === 0,
'Received limboChanges when query is not CURRENT.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stared at the comment and this assert for a while before I fully understood what's going on. One thing that would help is if you changed the message to "Primary received ..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this assert and am now tracking these documents in the if I am the primary client. I could also pass in a updateLimboDocuments flag and not update the limbo documents for secondaries.

);
return viewChange;
return viewSnapshot;
});
});
}
Expand Down Expand Up @@ -787,10 +787,7 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer {
err: FirestoreError
): Promise<void> {
if (isPrimaryLeaseLostError(err)) {
log.debug(
LOG_TAG,
'Unexpectedly lost primary lease, attempting to recover'
);
log.debug(LOG_TAG, 'Unexpectedly lost primary lease');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still need to update the method name and comment... Please mention that when this error is thrown, there will automatically be a lease refresh scheduled so we should very quickly get an applyPrimaryState() call from the persistence layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the method and comment. Method is now ignoreIfPrimaryLeaseLoss.

} else {
throw err;
}
Expand Down Expand Up @@ -865,47 +862,60 @@ export class SyncEngine implements RemoteSyncer, SharedClientStateSyncer {
}

/**
* Reconcile the local query views with those the state from persistence.
* Raises snapshots for any changes that affect the current client.
* Reconcile the query views of the provided query targets with the state from
* persistence. Raises snapshots for any changes that affect the local
* client and returns the updated state of all target's query data.
*/
// PORTING NOTE: Multi-tab only.
private synchronizeQueryViewsAndRaiseSnapshots(
targets: TargetId[]
): Promise<QueryData[]> {
let p = Promise.resolve();
const activeQueries: QueryData[] = [];
const newViewSnapshots: ViewSnapshot[] = [];
for (const targetId of targets) {
// TODO(multitab): We should mark queries NON-CURRENT when the primary tab
// re-computes its views.
p = p.then(async () => {
let queryData;
const query = await this.localStore.getQueryForTarget(targetId);
if (this.queryViewsByTarget[targetId] === undefined) {
assert(
this.isPrimary,
'A secondary tab should never have an active query without an active view.'
);
// For queries that never executed on this client, we need to
// allocate the query in LocalStore and initialize a new View.
queryData = await this.localStore.allocateQuery(query);
await this.initializeViewAndComputeSnapshot(queryData, false);
} else {
let queryData: QueryData;
const queryView = this.queryViewsByTarget[targetId];
if (queryView) {
// For queries that have a local View, we need to update their state
// in LocalStore (as the resume token and the snapshot version
// might have changed) and reconcile their views with the persisted
// state (the list of syncedDocuments may have gotten out of sync).
await this.localStore.releaseQuery(query, true);
queryData = await this.localStore.allocateQuery(query);
await this.localStore.releaseQuery(
queryView.query,
/*keepPersistedQueryData=*/ true
);
queryData = await this.localStore.allocateQuery(queryView.query);
const viewChange = await this.synchronizeViewAndComputeSnapshot(
query,
queryData.targetId
queryView
);
if (viewChange.snapshot) {
this.syncEngineListener!.onWatchChange([viewChange.snapshot]);
newViewSnapshots.push(viewChange.snapshot);
}
} else {
assert(
this.isPrimary,
'A secondary tab should never have an active query without an active view.'
);
// For queries that never executed on this client, we need to
// allocate the query in LocalStore and initialize a new View.
const query = await this.localStore.getQueryForTarget(targetId);
queryData = await this.localStore.allocateQuery(query);
await this.initializeViewAndComputeSnapshot(
queryData,
/*current=*/ false
);
}
activeQueries.push(queryData);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm... When we become primary we're going to add queryViews for targets we don't have local listeners for.
Here, when we lose primary, we'll stop listening to those queries, but we'll keep the queryViews.

So when you transition non-primary=>primary=>non-primary, you'll end up with a bunch of extra queryViews. While perhaps not harmful, I'd prefer if this were more deterministic. So should we remove no-longer-needed queryViews here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's unfortunately no API yet that allows me to determine whether a view is "local". SharedClientState does have this information. But since this will require some plumbing, I turned this part of the cleanup into a TODO.

}
return p.then(() => activeQueries);
return p.then(() => {
this.syncEngineListener!.onWatchChange(newViewSnapshots);
return activeQueries;
});
}

// PORTING NOTE: Multi-tab only
Expand Down
29 changes: 23 additions & 6 deletions packages/firestore/src/core/view.ts
Original file line number Diff line number Diff line change
Expand Up @@ -383,24 +383,41 @@ export class View {
return changes;
}

/**
* Update the in-memory state of the current view with the state read from
* persistence.
*
* We update the query view whenever a client's primary status changes:
* - When a client transitions from primary to secondary, it can miss
* LocalStorage updates and its query views may temporarily not be
* synchronized with the state on disk.
* - For secondary to primary transitions, the client needs to update the list
* of `syncedDocuments` since secondary clients update their query views
* based purely on synthesized RemoteEvents.
*
* @param localDocs - The documents that match the query according to the
* LocalStore.
* @param {DocumentKeySet} remoteKeys - The keys of the documents that match
* the query according to the backend.
* @param {boolean} resetCurrent - Whether we should flip `CURRENT` back to
* false, since the query will be re-listened to.
*
* @return The ViewChange that resulted from this synchronization.
*/
// PORTING NOTE: Multi-tab only.
synchronizeWithPersistedState(
localDocs: MaybeDocumentMap,
remoteKeys: DocumentKeySet,
resetLimboDocuments: boolean,
resetCurrent: boolean
): ViewChange {
if (resetLimboDocuments) {
this.limboDocuments = documentKeySet();
}

if (resetCurrent) {
this.current = false;
}

this._syncedDocuments = remoteKeys;
this.limboDocuments = documentKeySet();
const docChanges = this.computeDocChanges(localDocs);
return this.applyChanges(docChanges, resetLimboDocuments);
return this.applyChanges(docChanges, /*updateLimboDocuments=*/ true);
}

/**
Expand Down
5 changes: 1 addition & 4 deletions packages/firestore/src/remote/remote_store.ts
Original file line number Diff line number Diff line change
Expand Up @@ -585,10 +585,7 @@ export class RemoteStore implements TargetMetadataProvider {

private tryRecoverFromPrimaryLeaseLoss(err: FirestoreError): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this if you rename the other one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (isPrimaryLeaseLostError(err)) {
log.debug(
LOG_TAG,
'Unexpectedly lost primary lease, attempting to recover'
);
log.debug(LOG_TAG, 'Unexpectedly lost primary lease');
} else {
throw err;
}
Expand Down
6 changes: 2 additions & 4 deletions packages/firestore/src/util/obj.ts
Original file line number Diff line number Diff line change
Expand Up @@ -40,11 +40,9 @@ export function size<V>(obj: Dict<V>): number {
}

/** Extracts the numeric indices from a dictionary. */
export function indices<V>(obj: Dict<V>): number[] {
export function indices<V>(obj: { [numberKey: number]: V }): number[] {
return Object.keys(obj).map(key => {
const numberKey = Number(key);
assert(!isNaN(numberKey), `Cannot convert ${key} to an index.`);
return numberKey;
return Number(key);
});
}

Expand Down