Implement global resume token #1052


Merged

merged 10 commits into master from wilhuff/global-resume-token-take-2 Jul 31, 2018
Conversation

wilhuff
Contributor

@wilhuff wilhuff commented Jul 26, 2018

Watch sends us a periodic heartbeat with an updated resume token that applies to all active targets, signaled by an empty list of target IDs.

Our current implementation in the WatchChangeAggregator loops over the target IDs in the change, so in this case we fail to take any action. If a client has been listening to a query that sees no changes for more than thirty minutes, we'll fail to persist any resume token updates during that period.

This change:

  • Implements special handling in the WatchChangeAggregator for the empty target IDs list
  • Adds a filter in the local store to prevent heartbeat resume token updates from being written out more than once every five minutes
  • Adds logic on unlisten to ensure that we don't lose a token

Note that while a query is active, the in-memory state is updated, so any incidental re-listens that happen as a result of stream loss or re-auth will always use an up-to-date token.
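The special handling for the empty target IDs list can be sketched roughly as follows. This is a hypothetical simplification, not the actual `WatchChangeAggregator` code; the class name and types here are illustrative:

```typescript
// Minimal sketch, assuming a simplified aggregator: a watch change whose
// targetIds list is empty applies to every currently active target.
type TargetId = number;

class WatchChangeAggregatorSketch {
  // Per-target state for each active target (contents elided here).
  private targetStates = new Map<TargetId, object>();

  addTarget(targetId: TargetId): void {
    this.targetStates.set(targetId, {});
  }

  // Invokes fn once per targetId named in the change, or once per active
  // target when the change's targetIds list is empty (the heartbeat case).
  forEachTarget(
    changeTargetIds: TargetId[],
    fn: (targetId: TargetId) => void
  ): void {
    if (changeTargetIds.length > 0) {
      changeTargetIds.forEach(fn);
    } else {
      this.targetStates.forEach((_state, targetId) => fn(targetId));
    }
  }
}
```

With this shape, a heartbeat (empty list) fans out to all active targets instead of falling through the loop without effect.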

@wilhuff wilhuff force-pushed the wilhuff/global-resume-token-take-2 branch from 62e18b8 to a4a7983 Compare July 27, 2018 01:09
@wilhuff wilhuff changed the title WIP: minimal global resume token handling Implement global resume token Jul 27, 2018
@wilhuff wilhuff requested a review from jdimond July 27, 2018 01:19
Contributor

@schmidt-sebastian schmidt-sebastian left a comment

This LGTM. It is much simpler than what I had feared.

One idea for a follow-up PR would be to stagger these resume token updates, so that we don't write all resume tokens at once when the five-minute interval is over.

change
)
) {
promises.push(this.queryCache.updateQueryData(txn, queryData));
Contributor

Can you add a comment describing the handling of the snapshot version? It seems like it would be safe to never update the snapshot version, but a comment about this would clear my doubts.

Contributor Author

I've added comments to the method declaration. If it's safe to omit the resume token it's safe to omit the snapshot version. Note that it's the global snapshot version that acts as the high water mark to defend against repeated changes from Watch.

newQueryData: QueryData,
change: TargetChange
): boolean {
// Avoid clearing any existing value
Contributor

This function seems more like a shouldPersistQueryData to me; that name would describe the way it is used and its internal logic better than the current one.

Contributor Author

Done.

if (newQueryData.resumeToken.length === 0) return false;

// Any resume token is interesting if there isn't one already.
if (oldQueryData.resumeToken.length === 0) return true;
Contributor

I just learned something about our style guide:

"Control flow statements spanning multiple lines always use blocks for the containing code.
The exception is that if statements fitting on one line may elide the block."

Good to know :)

Contributor

Thanks for looking this up so I wouldn't have to. :-)

delete this.targetIds[queryData!.targetId];

const targetId = queryData.targetId;
const memoryQueryData = this.targetIds[targetId];
Contributor

s/memoryQueryData/cachedQueryData/ ?

Contributor Author

Done.

* targetIds explicitly listed in the change or the targetIds of all currently
* active targets.
*/
forEachTargetId(
Contributor

Nit: Slight preference to name this just forEachTarget, but please feel free to ignore.

Contributor Author

Done.

if (targetChange.targetIds.length > 0) {
targetChange.targetIds.forEach(fn);
} else {
objUtils.forEachNumber(this.targetStates, (targetId, _) => {
Contributor

This can be simplified to:

objUtils.forEachNumber(this.targetStates, targetId => fn(targetId));

Contributor

This can be simplified to:

      objUtils.forEachNumber(this.targetStates, fn);

😛

Contributor

That loses the `this` context and always creates really hard-to-debug issues in the long run.

Contributor

While I think I know where you're coming from (doing objUtils.forEachNumber(..., this.someFunction) would be dangerous), I don't think it applies here since fn is just a free-standing function, not a member function... plus we're already doing the exact same construct 2 lines up.

That said, I don't care strongly in the least which option we do. I just wanted to nit your nit.

Contributor Author

Done.
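The `this`-context concern raised in this thread is the classic JavaScript pitfall of passing a detached method reference as a callback. A small illustration (not code from this PR; the `Counter` class is invented for the example):

```typescript
// Illustration of the hazard discussed above: a detached method reference
// loses its `this` binding, while an arrow-function wrapper preserves it.
class Counter {
  count = 0;
  increment(): void {
    this.count++;
  }
}

function callTwice(fn: () => void): void {
  fn();
  fn();
}

const counter = new Counter();

// Safe: the arrow function closes over `counter` explicitly.
callTwice(() => counter.increment());

// Dangerous: `callTwice(counter.increment)` would invoke increment with an
// undefined `this` and fail at runtime, which is the bleedthrough risk when
// a *member* function is passed. A free-standing `fn`, as in the code under
// review, has no `this` to lose, which is the counterpoint made below.
```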

* The maximum time to leave a resume token buffered without writing it
* out.
*/
private static readonly MAX_RESUME_TOKEN_BUFFERING_MICROS = 5 * 60 * 1e6;
Contributor

You can leave the name as is, but it doesn't quite match other names we use to deal with similar logic:

CLIENT_METADATA_REFRESH_INTERVAL_MS is used in multi-tab and replaces OWNER_LEASE_REFRESH_INTERVAL_MS. There is also OWNER_LEASE_MAX_AGE_MS.

Contributor

RESUME_TOKEN_MAX_AGE_MICROS? (don't care strongly though)

Also, is there any meaningful context on how 5 minutes was chosen (based on discussion with Jonny, etc.) that we should capture here?

Contributor Author

Done: RESUME_TOKEN_MAX_AGE_MICROS. Also added a bit of verbiage describing the hat from which this number was pulled.
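Putting the pieces of this thread together, the persistence filter might look roughly like this. This is a hypothetical sketch: the interface fields and the way age is measured are illustrative, not the actual local-store implementation:

```typescript
// Hypothetical sketch of the filter discussed above: a new resume token is
// persisted only if it is non-empty and either replaces an empty token or
// has aged past the maximum buffering interval.
const RESUME_TOKEN_MAX_AGE_MICROS = 5 * 60 * 1e6; // five minutes

interface QueryDataSketch {
  resumeToken: string; // simplified; the real type is a byte string
  snapshotVersionMicros: number;
}

function shouldPersistQueryData(
  oldQueryData: QueryDataSketch,
  newQueryData: QueryDataSketch
): boolean {
  // Avoid clearing any existing value: never persist an empty token.
  if (newQueryData.resumeToken.length === 0) return false;

  // Any resume token is interesting if there isn't one already.
  if (oldQueryData.resumeToken.length === 0) return true;

  // Otherwise only persist once the buffered token has aged past the
  // threshold, so heartbeat updates aren't written out too frequently.
  const ageMicros =
    newQueryData.snapshotVersionMicros - oldQueryData.snapshotVersionMicros;
  return ageMicros >= RESUME_TOKEN_MAX_AGE_MICROS;
}
```

Note that, as described in the PR summary, the in-memory state is still updated on every heartbeat; only the write to persistence is throttled.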

@schmidt-sebastian schmidt-sebastian removed their assignment Jul 27, 2018
Contributor

@mikelehen mikelehen left a comment

LGTM with optional nits and a gentle nudge in case you wanted to add more tests. 😛


this.localViewReferences.removeReferencesForId(queryData!.targetId);
delete this.targetIds[queryData!.targetId];

const targetId = queryData.targetId;
Contributor

FWIW, these should probably still be queryData! but we currently do not compile with strictNullChecks. :-( [see b/73018483]

But trying to stay compliant without actually enforcing it is a losing game, so... ¯\_(ツ)_/¯


.watchSnapshots(3000)
.expectEvents(query, { fromCache: false })
);
});
Contributor

It seems pretty straightforward to test the MAX_RESUME_TOKEN_BUFFERING_MICROS logic too if we wanted to (might need to use .restart() to test re-listening without unlistening).

Contributor Author

Okay fine, you caught my lazy bones out. Done.

@mikelehen mikelehen assigned jdimond and wilhuff and unassigned mikelehen and jdimond Jul 27, 2018
@wilhuff wilhuff assigned mikelehen and unassigned jdimond and wilhuff Jul 30, 2018
@wilhuff
Contributor Author

wilhuff commented Jul 30, 2018

PTAL for spec tests.

Contributor

@mikelehen mikelehen left a comment

LGTM. Thanks!

@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@mikelehen mikelehen assigned wilhuff and unassigned mikelehen Jul 30, 2018
@wilhuff wilhuff merged commit 1f25d0d into master Jul 31, 2018
@wilhuff wilhuff deleted the wilhuff/global-resume-token-take-2 branch July 31, 2018 15:45
wilhuff added a commit that referenced this pull request Jul 31, 2018
* Add a CHANGELOG entry for #1052

* Add notes for #1055
@schmidt-sebastian schmidt-sebastian mentioned this pull request Aug 1, 2018
schmidt-sebastian added a commit that referenced this pull request Aug 1, 2018
* Implement global resume token (#1052)

* Add a spec test that shows correct global resume token handling

* Minimum implementation to handle global resume tokens

* Remove unused QueryView.resumeToken

* Avoid persisting the resume token unless required

* Persist the resume token on unlisten

* Add a type parameter to Persistence (#1047)

* Cherry pick sequence number starting point

* Working on typed transactions

* Start plumbing in sequence number

* Back out sequence number changes

* [AUTOMATED]: Prettier Code Styling

* Fix tests

* [AUTOMATED]: Prettier Code Styling

* Fix lint

* [AUTOMATED]: Prettier Code Styling

* Uncomment line

* MemoryPersistenceTransaction -> MemoryTransaction

* [AUTOMATED]: Prettier Code Styling

* Review updates

* Style

* Lint and style

* Review feedback

* [AUTOMATED]: Prettier Code Styling

* Revert some unintentional import churn

* Line 44 should definitely be empty

* Checkpoint before adding helper function for stores

* Use a helper for casting PersistenceTransaction to IndexedDbTransaction

* [AUTOMATED]: Prettier Code Styling

* Remove errant generic type

* Lint

* Fix typo

* Port optimizations to LocalDocumentsView from iOS (#1055)

* add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479));
* use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505));
* avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)).

Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details).

* Add a CHANGELOG entry for #1052 (#1071)

* Add a CHANGELOG entry for #1052

* Add notes for #1055

* Rename idleTimer and fix comments. (#1068)

* Merge (#1073)
schmidt-sebastian added a commit that referenced this pull request Aug 1, 2018
* Merging PersistentStream refactor

* [AUTOMATED]: Prettier Code Styling

* Typo

* Remove canUseNetwork state. (#1076)

* Merging the latest merge into the previous merge (#1077)

schmidt-sebastian added a commit that referenced this pull request Aug 1, 2018
* Catch invalid provider id error (#1064)

* RxFire: Api Change and documentation (#1066)

* api changes and doc updates

* fixes

* Refactor PersistentStream (no behavior changes). (#1041)

This breaks out a number of changes I made as prep for b/80402781 (Continue
retrying streams for 1 minute (idle delay)).

PersistentStream changes:
* Rather than providing a stream event listener to every call of start(),
  the stream listener is now provided once to the constructor and cannot
  be changed.
* Streams can now be restarted indefinitely, even after a call to stop().
  * PersistentStreamState.Stopped was removed and we just return to
    'Initial' after a stop() call.
  * Added `closeCount` member to PersistentStream in order to avoid
    bleedthrough issues with auth and stream events once stop() has
    been called.
  * Calling stop() now triggers the onClose() event listener, which
    simplifies stream cleanup.
* PersistentStreamState.Auth renamed to 'Starting' to better reflect that
  it encompasses both authentication and opening the stream.

RemoteStore changes:
* Creates streams once and just stop() / start()s them as necessary,
  never recreating them completely.
  * Added networkEnabled flag to track whether the network is
    enabled or not, since we no longer null out the streams.
  * Refactored disableNetwork() / enableNetwork() to remove stream
    re-creation.

Misc:
* Comment improvements including a state diagram on PersistentStream.
* Fixed spec test shutdown to schedule via the AsyncQueue to fix
  sequencing order I ran into.
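The `closeCount` mechanism mentioned above can be sketched in miniature. This is a hypothetical illustration of the generation-counting pattern, not the actual PersistentStream code; the class and method names here are invented:

```typescript
// Hypothetical sketch of the closeCount guard described above: events from a
// stream generation that has since been stopped are silently dropped.
class PersistentStreamSketch {
  private closeCount = 0;
  readonly received: string[] = [];

  // Stand-in for the network callback a real stream would register.
  private onMessage: (msg: string) => void = () => {};

  start(): void {
    // Capture the generation this callback belongs to.
    const startCloseCount = this.closeCount;
    this.onMessage = (msg: string) => {
      if (this.closeCount !== startCloseCount) {
        return; // stale event from a stopped generation; ignore it
      }
      this.received.push(msg);
    };
  }

  stop(): void {
    // Bumping the count invalidates callbacks from the current generation,
    // avoiding bleedthrough from auth or stream events after stop().
    this.closeCount++;
  }

  // Simulates an asynchronous event arriving from the network.
  deliver(msg: string): void {
    this.onMessage(msg);
  }
}
```

This is how a stream can be restarted indefinitely: each `start()` opens a new generation, and late events from older generations are simply discarded.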

* Merging Persistent Stream refactor (#1069)
@firebase firebase locked and limited conversation to collaborators Oct 18, 2019
6 participants