-
Notifications
You must be signed in to change notification settings - Fork 928
Port optimizations to LocalDocumentsView from iOS #1055
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
var-const
commented
Jul 27, 2018
- add a method to find batches affecting a set of keys (port of 1479);
- use the newly-added method to avoid rereading batches when getting documents in LocalDocumentsView (port of 1505);
- avoid rereading batches when searching for documents in a collection (port of 1533).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly good, but I think this port looks close to the beginning state of Android instead of the final state (we have O(mutation queue) implementations instead of O(documentKeys))
return; | ||
} | ||
uniqueBatchIDs = uniqueBatchIDs.add(batchID); | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think this is O(mutation queue size) instead of O(documentKeys size)... but can probably be fixed to be O(documentKeys size) by either calling iterate again for each documentKey, or by using control.skipToKey() after finding / not-finding each documentKey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL.
} | ||
|
||
/** Internal version of `getDocument` that allows reusing batches. */ | ||
private getDocumentInBatches( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not wild about getDocumentInBatches()
. I'd probably rename to getDocumentInternal(). Or you could just combine with getDocument() and have inBatches be an optional param:
getDocument(
transaction: PersistenceTransaction,
key: DocumentKey,
inBatches?: MutationBatch[]) {
let inBatchesPromise = inBatches ? PersistencePromise.resolve(inBatches) :
this.mutationQueue.getAllMutationBatchesAffectingDocumentKey(transaction, key);
return inBatchesPromise.next(inBatches =>
this.remoteDocumentCache.getEntry(transaction, key))
.next(doc => {
for (const batch of inBatches) {
doc = batch.applyToLocalView(key, doc);
}
return doc;
});
}
(written via copy/paste and untested so probably not quite right)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any preference in favor of the optional param? I'm fine with that, but it would be a little less consistent with other platforms, where the other function is private. OTOH, perhaps it's more idiomatic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No preference. You're right that it would ideally be private... but TypeScript doesn't support "overloads" with mixed visibility. So I think getDocumentInternal()
is good.
if (documentKeys.has(ref.key)) { | ||
uniqueBatchIDs = uniqueBatchIDs.add(ref.targetOrBatchId); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is O(mutation queue) instead of O(document keys). Can we make it match iOS / Android?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL.
* convenient. | ||
*/ | ||
// TODO(mcg): This should really return an enumerator | ||
// also for b/32992024, all backing stores should really index by document key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm. b/32992024 is closed and I think this comment is obsolete (documentMutations is our index). Can you remove it here and above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
…nst/port-affecting-batches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with optional nit. Thanks!
const end = new DocReference(documentKey, Number.POSITIVE_INFINITY); | ||
this.batchesByDocumentKey.forEachInRange([start, end], ref => { | ||
if (!documentKey.isEqual(ref.key)) { | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is it possible to hit this? Seems like our range should constrain us to documentKey
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. This function is derived from getAllMutationBatchesAffectingDocumentKey
, which asserts this equality, this led me to believe this case is not impossible. I can remove it altogether or replace with an assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asserts should generally mean the case is impossible. They exist to document our expectations and/or catch false expectations as early as possible to make finding / fixing bugs later. So I think an assert here makes perfect sense.
Fine print: We're probably not 100% rigorous about asserts always being impossible. E.g. there may be cases where we use asserts for cases that /are/ possible but represent rare error conditions that we don't feel compelled to provide proper error handling for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replicated the assertion.
} | ||
|
||
/** Internal version of `getDocument` that allows reusing batches. */ | ||
private getDocumentInBatches( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No preference. You're right that it would ideally be private... but TypeScript doesn't support "overloads" with mixed visibility. So I think getDocumentInternal()
is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some performance tests (by running yarn test:browser
with persistence enabled (see code below). The test consisted of writing 10 batches of 500 mutations each, where each mutation creates a very small document in the same collection (so the resulting collection contains 5K documents). Ran the test on master and in this branch 3 times each. Results (in milliseconds):
- writing batches in master: 37169 38600 38246
- writing batches in branch: 11282 14307 14774
- reading the collection in master: 30784 5189 5858
- reading the collection in branch: 1124 851 844.
One weird thing I noticed is that for any given project, reading the collection for the first time is slower than on subsequent runs of the test (all tests run with --single_run
). The difference is really noticeable on master (31 vs 5-6 seconds) and less pronounced in this branch (1120 ms vs 850 ms). Changing the project after a "fast" read will result in one "slow" read, but the subsequent runs are "fast" again. The reading is done in offline mode, so this is a little confusing.
The ad-hoc code I used for the test:
let colRef = null;
let writeStartTime = 0;
let readStartTime = 0;
const promises = [];
return withTestCollection(persistence, {}, coll => {
colRef = coll;
colRef.onSnapshot(() => {});
writeStartTime = Date.now();
for (let iter = 0; iter !== 10; ++iter) {
const batch = colRef.firestore.batch();
for (let i = 0; i !== 500; ++i) {
batch.set(colRef.doc(), { foo: 'bar' + i });
}
promises.push(batch.commit());
}
return Promise.all(promises)
.then(() => {
const writeEndTime = Date.now();
console.error('Writing batches took ' + (writeEndTime - writeStartTime) + ' ms');
return colRef.firestore.disableNetwork();
})
.then(() => {
readStartTime = Date.now();
return colRef.get({source: 'cache'});
})
.then(qrySnap => {
const readEndTime = Date.now();
console.error(
'Get collection took: ' + (readEndTime - readStartTime) + ' ms'
);
});
});
const end = new DocReference(documentKey, Number.POSITIVE_INFINITY); | ||
this.batchesByDocumentKey.forEachInRange([start, end], ref => { | ||
if (!documentKey.isEqual(ref.key)) { | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. This function is derived from getAllMutationBatchesAffectingDocumentKey
, which asserts this equality, this led me to believe this case is not impossible. I can remove it altogether or replace with an assertion.
Nice! Thanks for the performance data. I can't 100% explain the "first time slow" issue without digging in, but it may be related to the "colRef.onSnapshot(() => {});" line... That's going to spin up a full normal listener against the backend which involves reading documents and query metadata from persistence, sending the result to the backend, and handling the response data. To cancel that out, you could make sure the listener has fully initialized / settled by doing something like:
|
…/firebase-js-sdk into varconst/port-affecting-batches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! Adding the snippet you suggested indeed eliminated the first-time slowdown, now I get consistent (faster) results. So my tests with 10 batches show 3x speedup in writing batches and ~6x speedup reading large collections (presumably even more so for more batches).
const end = new DocReference(documentKey, Number.POSITIVE_INFINITY); | ||
this.batchesByDocumentKey.forEachInRange([start, end], ref => { | ||
if (!documentKey.isEqual(ref.key)) { | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replicated the assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM. Thanks!
@@ -433,7 +434,30 @@ function genericMutationQueueTests(): void { | |||
const matches = await mutationQueue.getAllMutationBatchesAffectingDocumentKey( | |||
key('foo/bar') | |||
); | |||
expect(matches.length).to.deep.equal(expected.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention: I noticed that expectEqualArrays
checks for equal length, so this line seems redundant.
* Implement global resume token (#1052) * Add a spec test that shows correct global resume token handling * Minimum implementation to handle global resume tokens * Remove unused QueryView.resumeToken * Avoid persisting the resume token unless required * Persist the resume token on unlisten * Add a type parameter to Persistence (#1047) * Cherry pick sequence number starting point * Working on typed transactions * Start plumbing in sequence number * Back out sequence number changes * [AUTOMATED]: Prettier Code Styling * Fix tests * [AUTOMATED]: Prettier Code Styling * Fix lint * [AUTOMATED]: Prettier Code Styling * Uncomment line * MemoryPersistenceTransaction -> MemoryTransaction * [AUTOMATED]: Prettier Code Styling * Review updates * Style * Lint and style * Review feedback * [AUTOMATED]: Prettier Code Styling * Revert some unintentional import churn * Line 44 should definitely be empty * Checkpoint before adding helper function for stores * Use a helper for casting PersistenceTransaction to IndexedDbTransaction * [AUTOMATED]: Prettier Code Styling * Remove errant generic type * Lint * Fix typo * Port optimizations to LocalDocumentsView from iOS (#1055) * add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details). * Add a CHANGELOG entry for #1052 (#1071) * Add a CHANGELOG entry for #1052 * Add notes for #1055 * Rename idleTimer and fix comments. (#1068) * Merge (#1073)
* Merging PersistentStream refactor * [AUTOMATED]: Prettier Code Styling * Typo * Remove canUseNetwork state. (#1076) * Merging the latest merge into the previous merge (#1077) * Implement global resume token (#1052) * Add a spec test that shows correct global resume token handling * Minimum implementation to handle global resume tokens * Remove unused QueryView.resumeToken * Avoid persisting the resume token unless required * Persist the resume token on unlisten * Add a type parameter to Persistence (#1047) * Cherry pick sequence number starting point * Working on typed transactions * Start plumbing in sequence number * Back out sequence number changes * [AUTOMATED]: Prettier Code Styling * Fix tests * [AUTOMATED]: Prettier Code Styling * Fix lint * [AUTOMATED]: Prettier Code Styling * Uncomment line * MemoryPersistenceTransaction -> MemoryTransaction * [AUTOMATED]: Prettier Code Styling * Review updates * Style * Lint and style * Review feedback * [AUTOMATED]: Prettier Code Styling * Revert some unintentional import churn * Line 44 should definitely be empty * Checkpoint before adding helper function for stores * Use a helper for casting PersistenceTransaction to IndexedDbTransaction * [AUTOMATED]: Prettier Code Styling * Remove errant generic type * Lint * Fix typo * Port optimizations to LocalDocumentsView from iOS (#1055) * add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details). * Add a CHANGELOG entry for #1052 (#1071) * Add a CHANGELOG entry for #1052 * Add notes for #1055 * Rename idleTimer and fix comments. (#1068) * Merge (#1073)
* Catch invalid provider id error (#1064) * RxFire: Api Change and documentation (#1066) * api changes and doc updates * fixes * Refactor PersistentStream (no behavior changes). (#1041) This breaks out a number of changes I made as prep for b/80402781 (Continue retrying streams for 1 minute (idle delay)). PersistentStream changes: * Rather than providing a stream event listener to every call of start(), the stream listener is now provided once to the constructor and cannot be changed. * Streams can now be restarted indefinitely, even after a call to stop(). * PersistentStreamState.Stopped was removed and we just return to 'Initial' after a stop() call. * Added `closeCount` member to PersistentStream in order to avoid bleedthrough issues with auth and stream events once stop() has been called. * Calling stop() now triggers the onClose() event listener, which simplifies stream cleanup. * PersistentStreamState.Auth renamed to 'Starting' to better reflect that it encompasses both authentication and opening the stream. RemoteStore changes: * Creates streams once and just stop() / start()s them as necessary, never recreating them completely. * Added networkEnabled flag to track whether the network is enabled or not, since we no longer null out the streams. * Refactored disableNetwork() / enableNetwork() to remove stream re-creation. Misc: * Comment improvements including a state diagram on PersistentStream. * Fixed spec test shutdown to schedule via the AsyncQueue to fix sequencing order I ran into. * Merging Persistent Stream refactor (#1069) * Merging PersistentStream refactor * [AUTOMATED]: Prettier Code Styling * Typo * Remove canUseNetwork state. (#1076) * Merging the latest merge into the previous merge (#1077) * Implement global resume token (#1052) * Add a spec test that shows correct global resume token handling * Minimum implementation to handle global resume tokens * Remove unused QueryView.resumeToken * Avoid persisting the resume token unless required * Persist the resume token on unlisten * Add a type parameter to Persistence (#1047) * Cherry pick sequence number starting point * Working on typed transactions * Start plumbing in sequence number * Back out sequence number changes * [AUTOMATED]: Prettier Code Styling * Fix tests * [AUTOMATED]: Prettier Code Styling * Fix lint * [AUTOMATED]: Prettier Code Styling * Uncomment line * MemoryPersistenceTransaction -> MemoryTransaction * [AUTOMATED]: Prettier Code Styling * Review updates * Style * Lint and style * Review feedback * [AUTOMATED]: Prettier Code Styling * Revert some unintentional import churn * Line 44 should definitely be empty * Checkpoint before adding helper function for stores * Use a helper for casting PersistenceTransaction to IndexedDbTransaction * [AUTOMATED]: Prettier Code Styling * Remove errant generic type * Lint * Fix typo * Port optimizations to LocalDocumentsView from iOS (#1055) * add a method to find batches affecting a set of keys (port of [1479](firebase/firebase-ios-sdk#1479)); * use the newly-added method to avoid rereading batches when getting documents in `LocalDocumentsView` (port of [1505](firebase/firebase-ios-sdk#1505)); * avoid rereading batches when searching for documents in a collection (port of [1533](firebase/firebase-ios-sdk#1533)). Speedup was measured by running tests in browser and checking time spent writing 10 batches of 500 mutations each, and then querying the resulting 5K docs collection from cache in offline mode. For this case, the writing speedup is about 3x, and querying speedup is about 6x (see PR for more details). * Add a CHANGELOG entry for #1052 (#1071) * Add a CHANGELOG entry for #1052 * Add notes for #1055 * Rename idleTimer and fix comments. (#1068) * Merge (#1073)