
Take WatchStream offline when IndexedDB is unavailable #3010


Merged: 7 commits merged into master on May 7, 2020

Conversation

@schmidt-sebastian (Contributor) commented May 4, 2020

This PR takes the network offline if we fail to talk to IndexedDB when applying a query snapshot or rejection.

Also cleans up some Promise handling in RemoteStore.

Addresses #2755

@google-oss-bot (Contributor) commented May 4, 2020

Binary Size Report

Affected SDKs

  • @firebase/firestore

    | Type    | Base (df44d12) | Head (8c38bfb) | Diff             |
    | ------- | -------------- | -------------- | ---------------- |
    | browser | 247 kB         | 249 kB         | +1.65 kB (+0.7%) |
    | esm2017 | 193 kB         | 194 kB         | +480 B (+0.2%)   |
    | main    | 488 kB         | 490 kB         | +2.15 kB (+0.4%) |
    | module  | 245 kB         | 247 kB         | +1.59 kB (+0.6%) |
  • @firebase/firestore/memory

    | Type    | Base (df44d12) | Head (8c38bfb) | Diff             |
    | ------- | -------------- | -------------- | ---------------- |
    | browser | 188 kB         | 190 kB         | +1.21 kB (+0.6%) |
    | esm2017 | 148 kB         | 148 kB         | +338 B (+0.2%)   |
    | main    | 365 kB         | 366 kB         | +1.66 kB (+0.5%) |
    | module  | 187 kB         | 188 kB         | +1.17 kB (+0.6%) |
  • firebase

    | Type                         | Base (df44d12) | Head (8c38bfb) | Diff             |
    | ---------------------------- | -------------- | -------------- | ---------------- |
    | firebase-firestore.js        | 286 kB         | 288 kB         | +1.59 kB (+0.6%) |
    | firebase-firestore.memory.js | 229 kB         | 230 kB         | +1.17 kB (+0.5%) |
    | firebase.js                  | 820 kB         | 821 kB         | +1.59 kB (+0.2%) |

Test Logs

@schmidt-sebastian schmidt-sebastian requested a review from wilhuff May 4, 2020 20:12
@schmidt-sebastian schmidt-sebastian changed the title from "IndexedDB recovery for the WatchStream" to "Take WatchStream offline when IndexedDB is unavailable" May 5, 2020
```ts
   */
  private async disableNetworkUntilRecovery(e: FirestoreError): Promise<void> {
    if (e.name === 'IndexedDbTransactionError') {
      ++this.indexedDbFailureBarrier;
```
Contributor

This mechanism seems more complicated/fiddly than it needs to be.

If we get multiple watch change events in a row, this is going to bump this indexedDbFailureBarrier multiple times and enqueue multiple retry tasks. When IndexedDB becomes available, each of them is going to succeed, decrement the count, and try to enable the network but only the last call to enableNetworkInternal will succeed.

It seems like instead of keeping a count, we could keep a boolean like indexedDbFailed. Then in this method (a rough sketch follows after this list):

  • If indexedDbFailed is not set, set it and enqueue the retry task.
  • If indexedDbFailed is already set, just log the error but do nothing.
  • In the retry task, if getLastRemoteSnapshotVersion succeeds, flip indexedDbFailed back to false and enable the network.
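
Not code from this PR, just a minimal sketch of the boolean variant described above, written as a class fragment. The names asyncQueue.enqueueRetryable and localStore.getLastRemoteSnapshotVersion, and the exact error-name check, are assumptions about the surrounding RemoteStore rather than lines taken from the diff.

```ts
// Sketch only: a boolean flag instead of a counter. Field and method names on
// `this` (asyncQueue, localStore, enableNetworkInternal, ...) are assumptions.
private indexedDbFailed = false;

private async disableNetworkUntilRecovery(e: FirestoreError): Promise<void> {
  if (e.name !== 'IndexedDbTransactionError') {
    throw e;
  }
  if (this.indexedDbFailed) {
    // A recovery attempt is already scheduled; just surface the error.
    logDebug(LOG_TAG, 'Ignoring repeated IndexedDB error: ' + e);
    return;
  }
  this.indexedDbFailed = true;
  await this.disableNetworkInternal();

  // Probe IndexedDB with a cheap read; once it succeeds, clear the flag and
  // bring the network back up.
  this.asyncQueue.enqueueRetryable(async () => {
    await this.localStore.getLastRemoteSnapshotVersion();
    this.indexedDbFailed = false;
    await this.enableNetworkInternal();
  });
}
```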

Contributor Author

You say "complicated/fiddly", I say "advanced" :)

We should only ever get one of these events, since we go offline right away and discard all further messages from the stream. I realize that this is a good argument for making it a boolean, and I have updated the code accordingly.

```ts
  private async disableNetworkUntilRecovery(e: FirestoreError): Promise<void> {
    if (e.name === 'IndexedDbTransactionError') {
      ++this.indexedDbFailureBarrier;
      await this.disableNetworkInternal();
```
Contributor

disableNetworkInternal does not change the onlineStateTracker, but it seems like while we've intentionally disabled the network we should mark ourselves offline, no?

Contributor Author

That will create some flicker, but it is the right thing to do. Updated.

Contributor Author

I had to add the state transition here since we don't raise offline events when disableNetworkInternal is called from shutdown or for primary state transitions.
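
For context, a sketch of what that added transition might look like; onlineStateTracker.set(OnlineState.Offline) is an assumption about the tracker's API, not a line quoted from the PR.

```ts
if (e.name === 'IndexedDbTransactionError') {
  this.indexedDbFailed = true;
  await this.disableNetworkInternal();
  // Explicitly mark the client offline here: disableNetworkInternal() alone
  // leaves the tracker untouched, because it is also used for shutdown and
  // primary-lease handoff, where no offline event should be raised.
  this.onlineStateTracker.set(OnlineState.Offline);
  // ...then schedule the recovery task as sketched earlier.
}
```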

```ts
      await this.raiseWatchSnapshot(snapshotVersion);
    }
  } catch (e) {
    logDebug(LOG_TAG, 'Failed to raise snapshot: ' + e);
```
Contributor

Is there any identifying information about the snapshot that we can log here?

Contributor Author

I am not sure if it will help, but I added the snapshot version.

Contributor

The snapshot version probably isn't that useful, but the target ID would be. Anything to help associate this message with its context.

If the targetId isn't available, going back to what you had before is probably better.

Contributor Author

I changed this back to the original message. The target IDs are not easily available:

  • DocumentWatchChange has updatedTargetIds and removedTargetIds
  • ExistenceFilterChange has targetId
  • WatchTargetChange has targetIds

If I understand the flow correctly, we should always be in a WatchTargetChange to raise a snapshot, but I don't want to bake this assumption into the code for better logging. I can invest some time here, but the targets are already in the Watch response that we log right before.
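
As a rough illustration of why this is fiddly (a hypothetical helper, not something added in this PR), the target IDs live under differently named fields on each change type; the field names below come from the list above.

```ts
// Hypothetical helper, only to show the shape of the three change types.
function targetIdsForLogging(change: WatchChange): TargetId[] {
  if (change instanceof DocumentWatchChange) {
    return [...change.updatedTargetIds, ...change.removedTargetIds];
  } else if (change instanceof ExistenceFilterChange) {
    return [change.targetId];
  } else {
    // WatchTargetChange
    return change.targetIds;
  }
}
```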

```ts
      try {
        await this.handleTargetError(watchChange);
      } catch (e) {
        logDebug(LOG_TAG, 'Failed to remove target: ' + e);
```
Contributor

Can we log the query associated with the target here?

Contributor Author

Unless we go out of our way, we don't have access to the query here. I logged the target ID instead. We should, however, already have this information from the Watch stream.

FWIW, I didn't add this logging to begin with since this is a debug log. The other logs were error logs in response to user behavior. They are always visible, and often without context.
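
Assuming watchChange here is a WatchTargetChange with a targetIds field (per the list above), the updated catch presumably ends up looking roughly like this; the message wording and the recovery call are guesses, not quotes from the diff.

```ts
try {
  await this.handleTargetError(watchChange);
} catch (e) {
  // Include the affected target IDs so the debug log can be correlated with
  // the Watch response logged just before.
  logDebug(
    LOG_TAG,
    'Failed to remove targets ' + watchChange.targetIds.join(',') + ': ' + e
  );
  // Same recovery path as the snapshot case: go offline until IndexedDB works.
  await this.disableNetworkUntilRecovery(e);
}
```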

@wilhuff wilhuff assigned schmidt-sebastian and unassigned wilhuff May 5, 2020
Contributor

@wilhuff wilhuff left a comment

LGTM

@wilhuff wilhuff assigned schmidt-sebastian and unassigned wilhuff May 7, 2020
@schmidt-sebastian schmidt-sebastian merged commit f9c6dde into master May 7, 2020
@firebase firebase locked and limited conversation to collaborators Jun 7, 2020
@schmidt-sebastian schmidt-sebastian deleted the mrschmidt/watchretry branch November 9, 2020 22:37