Skip to content

Allow remote updates from watch to heal a cache with synthesized deletes in it #1015

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jul 18, 2018

Conversation

wilhuff
Copy link
Contributor

@wilhuff wilhuff commented Jul 17, 2018

No description provided.

.expectEvents(allQuery, { added: [docAv1], fromCache: false })
.watchSends({ removed: [allQuery] }, docDeleted)
.watchCurrents(allQuery, 'resume-token-2000')
.watchSnapshots(2000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be .watchSnapshots(2000, [allQuery], 'resume-token-2000') and then you can get rid of the second .watchCurrents().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

.watchAcks(visibleQuery)
.watchSends({ affects: [visibleQuery] }, docAv2)
.watchSnapshots(5000)
.watchSends({ affects: [visibleQuery] }, docAv3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not convinced watch will send us this. It would have to keep track of any document that it previously told us was in the query and somehow continue to send us changes for it even though it's no longer in the query, wouldn't it? Or maybe once it sends us a CURRENT it's allowed to stop? I don't really know what guarantee we're expecting and how watch would implement it.

In any case, I think this test as modified no longer reflects the comment a few lines above. We should probably get @jdimond's blessing and then update the comment as well if we're changing this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this scenario is possible as written then I can't make this change. I could adapt the change to only allow a bypass of the version check in the case of a delete.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. 😦

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Take 4 ready for review. This passes all the tests we've thrown at it but does so by introducing a notion of authoritative updates--updates sent by watch in a configuration that we can blindly trust.

.expectEvents(allQuery, { fromCache: false })
);
});

specTest('Deleted documents in cache are fixed', ['exclusive'], () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove exclusive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mikelehen mikelehen assigned wilhuff and unassigned mikelehen Jul 17, 2018
@wilhuff wilhuff force-pushed the wilhuff/allow-healing branch from 717a856 to 34a0356 Compare July 17, 2018 23:02
@wilhuff wilhuff force-pushed the wilhuff/allow-healing branch from 34a0356 to 44850e5 Compare July 17, 2018 23:04
// Now when the client listens expect the cached NoDocument to be
// discarded because the global snapshot version exceeds what came before.
.userListens(allQuery, 'resume-token-2000')
.watchAcksFull(allQuery, 3000, docAv1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - wouldn't this send an existence filter with count of 1 and not send this document (since the resume token is newer than the document)?

It looks like a query with a single NoDocument associated has an document count of 1, and we will not reset the query (only verified by glancing through the code).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably true... It would either send us an existence filter, or it could send us a RESET and then the contents.

You could fix it by using a different query than allQuery above to generate the initial state we want (and then we wouldn't include an existence filter here). BUT in general this test is an invalid set of watch events since we're intentionally getting the cache into a bad state. So I'm not sure we care.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this to use a separate setup query to sidestep the existence filter issue.

@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@wilhuff wilhuff added the bug label Jul 18, 2018
@wilhuff wilhuff assigned mikelehen and unassigned wilhuff Jul 18, 2018
Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with optional minor comment suggestion.

// Now when the client listens expect the cached NoDocument to be
// discarded because the global snapshot version exceeds what came before.
.userListens(allQuery, 'resume-token-2000')
.watchAcksFull(allQuery, 3000, docAv1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably true... It would either send us an existence filter, or it could send us a RESET and then the contents.

You could fix it by using a different query than allQuery above to generate the initial state we want (and then we wouldn't include an existence filter here). BUT in general this test is an invalid set of watch events since we're intentionally getting the cache into a bad state. So I'm not sure we care.

@@ -443,6 +443,9 @@ abstract class TestRunner {
}

private doStep(step: SpecStep): Promise<void> {
// tslint:disable-next-line:no-console
console.log(' step: ' + JSON.stringify(step));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is intentional? It'll be pretty spammy during successful runs, but it does seem useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add this every time there's a failure to help me understand where it failed. I didn't mean to let this through. Removed.

objUtils.forEachNumber(
remoteEvent.targetChanges,
(targetId: TargetId, change: TargetChange) => {
// Do not ref/unref unassigned targetIds - it may lead to leaks.
let queryData = this.targetIds[targetId];
if (!queryData) return;

// When a global snapshot contains updates (either add or modify) we
// can completely trust these updates as authoritative.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth adding, " and blindly apply them to our cache (as a defensive measure to promote self-healing in the unfortunate case that our cache is ever somehow corrupted / out-of-sync)." since in an ideal world I think this code would be unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mikelehen mikelehen assigned wilhuff and unassigned mikelehen Jul 18, 2018
@wilhuff wilhuff merged commit 6d66392 into master Jul 18, 2018
@wilhuff wilhuff deleted the wilhuff/allow-healing branch July 18, 2018 19:38
wilhuff added a commit to firebase/firebase-ios-sdk that referenced this pull request Jul 18, 2018
wilhuff added a commit to firebase/firebase-ios-sdk that referenced this pull request Jul 19, 2018
wilhuff added a commit to firebase/firebase-ios-sdk that referenced this pull request Jul 19, 2018
wilhuff added a commit to firebase/firebase-ios-sdk that referenced this pull request Jul 19, 2018
…tes in it (#1557)

* Update spec tests from the js-sdk

* Allow remote updates from watch to heal a cache with synthesized deletes in it

Port of firebase/firebase-js-sdk#1015

Addresses #1548
wilhuff added a commit to firebase/firebase-ios-sdk that referenced this pull request Jul 19, 2018
…tes in it (#1557)

* Update spec tests from the js-sdk

* Allow remote updates from watch to heal a cache with synthesized deletes in it

Port of firebase/firebase-js-sdk#1015

Addresses #1548
schmidt-sebastian added a commit that referenced this pull request Jul 25, 2018
* Add @davideast as a CODEOWNER (#996)

* Embed metadata directly into the RPC call (#979)

* Embed metadata directly into the RPC call

* [AUTOMATED]: Prettier Code Styling

* Use ...args

* [AUTOMATED]: Prettier Code Styling

* Minimize diff

* Add the OAuth assertion back in

* Added missing type for optional database url. (#1001)

* RxFire Realtime Database (#997)

* initial database code

* test setup

* database tests

* auditTrail and database tests

* [AUTOMATED]: Prettier Code Styling

* [AUTOMATED]: License Headers

* Josh's comments. Database docs

* [AUTOMATED]: Prettier Code Styling

* Firestore docs

* auth docs

* declaration fixes

* switch to peerDeps

* [AUTOMATED]: Prettier Code Styling

* test config

* Expose array transforms and array contains queries. (#1004)

Also remove test code that was combining multiple array contains queries since those were disallowed in 04c9c3a.

* Move fieldFilter (free function) to Filter.create() (#988)

This is a refactoring to unify filter creation across platforms.

* Enable firestore sdk to talk to emulator (#1007)

* Enable firestore sdk to talk to emulator

* [AUTOMATED]: Prettier Code Styling

* Revert firestore sdk changes

* [AUTOMATED]: Prettier Code Styling

* Revert credentials.ts

* Cleanup

* [AUTOMATED]: Prettier Code Styling

* Set webSafe=false

* Combine initializeTestApp and initializeFirestoreTestApp

* [AUTOMATED]: Prettier Code Styling

* Cleanup

* [AUTOMATED]: Prettier Code Styling

* Update major version since this is a breaking change that will cause the testing sdk to no longer work with old versions of the RTDB emulator

* Completely remove admin sdk

* Change version back to 0.1.0

* Setting GarbageSource in SyncEngine's constructor (#1010)

* b/72533250: Fix issue with limbo resolutions triggering incorrect manufactured deletes. (#1014)

This fixes an issue occurring when a limbo target receives a documentUpdate,
then a global snapshot, and then a CURRENT. Because there was a global
snapshot before the CURRENT, WatchChangeAggregator has no pending document
updates and calls SyncEngine.targetContainsDocument() to see if we previously got any
document from the backend for the target. See:
https://github.com/firebase/firebase-js-sdk/blob/6905339235ad801291edc696dd75a08e80647f5b/packages/firestore/src/remote/watch_change.ts#L422

Prior to this change, targetContainsDocument() returned false because
it relies on our Views to track the contents of the target, and we don't
have Views for limbo targets. Thus WatchChangeAggregator incorrectly
manufactures a NoDocument document update which deletes data from our
cache.

The fix is to have SyncEngine track the fact that we did indeed get
a document for the limbo resolution and return true from
targetContainsDocument().

* Updating yarn.lock

* Add @firebase/util as a dep of @firebase/testing

* Allow remote updates from watch to heal a cache with synthesized deletes in it (#1015)

* Write a spec test for the busted cache

* Modify spec test to demonstrate deletedDoc issue. (#1017)

* Allow updates for targets where the document is modified

* Fix getRemoteKeysForTarget() method name in comment. (#1020)

While porting I noticed this was slightly wrong. targetContainsDocument() is the method in WatchChangeAggregator. The SyncEngine method I meant to reference is getRemoteKeysForTarget().

* Making sure we don't export 'experimental' (#1023)

* Add a schema migration that drops the query cache (#1019)

* Add a schema migration that drops the query cache

This is a force fix for potential existence filter mismatches caused by
firebase/firebase-ios-sdk#1548

The essential problem is that when resuming a query, the server is
allowed to forget deletes. If the number of incorrectly synthesized
deletes matches the number of server-forgotten deletes then the
existence filter can give a false positive, preventing the cache from
self healing.

Dropping the query cache clears any client-side resume token which
prevents a false positive existence filter mismatch.

Note that the remote document cache and mutation queues are unaffected
so any cached documents that do exist will still work while offline.

* Implement review feedback

* Add a release note for the fix to #1548 (#1024)

* Ensure that we create an empty TargetGlobal row. (#1029)

Ensure the v3 migration unconditionally creates the TargetGlobal row. Remove the no-longer-necessary v2 schema migration.

* Remove unnecessary `any` (#1030)

* Fix an errant any usage

* [AUTOMATED]: Prettier Code Styling

* Publish [email protected]

* Unify local.QueryData with the other platforms (#1027)

This makes it line up with it's own docs, and also the other platforms.

* Fix to #1027 to allow SnapshotVersion == 0 (#1033)

* Add iat to fake access token payload (#1022)

* Add iat to fake access token payload

* [AUTOMATED]: Prettier Code Styling

* Simpler tests

* [AUTOMATED]: Prettier Code Styling

* Do not clobber iat

* catch server error RESET_PASSWORD_EXCEED_LIMIT (#1037)

* Merging Master into Multi-Tab (#1038)
@firebase firebase locked and limited conversation to collaborators Oct 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants