Port performance optimizations to speed up reading large collections from Android #1433
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,14 +16,19 @@ | |
|
||
import { Query } from '../core/query'; | ||
import { | ||
DocumentKeySet, | ||
documentKeySet, | ||
DocumentMap, | ||
documentMap, | ||
DocumentSizeEntry, | ||
DocumentSizeEntries, | ||
nullableMaybeDocumentMap, | ||
NullableMaybeDocumentMap, | ||
MaybeDocumentMap, | ||
maybeDocumentMap | ||
} from '../model/collections'; | ||
import { Document, MaybeDocument, NoDocument } from '../model/document'; | ||
import { SortedMap } from '../util/sorted_map'; | ||
import { DocumentKey } from '../model/document_key'; | ||
|
||
import { SnapshotVersion } from '../core/snapshot_version'; | ||
|
@@ -178,6 +183,72 @@ export class IndexedDbRemoteDocumentCache implements RemoteDocumentCache { | |
}); | ||
} | ||
|
||
getEntries( | ||
transaction: PersistenceTransaction, | ||
documentKeys: DocumentKeySet | ||
): PersistencePromise<NullableMaybeDocumentMap> { | ||
return this.getSizedEntries(transaction, documentKeys).next( | ||
result => result.maybeDocuments | ||
); | ||
} | ||
|
||
/** | ||
* Looks up several entries in the cache. | ||
* | ||
* @param documentKeys The set of keys entries to look up. | ||
* @return A map of MaybeDocuments indexed by key (if a document cannot be | ||
* found, the key will be mapped to null) and a map of sizes indexed by | ||
* key (zero if the key cannot be found). | ||
*/ | ||
getSizedEntries( | ||
transaction: PersistenceTransaction, | ||
documentKeys: DocumentKeySet | ||
): PersistencePromise<DocumentSizeEntries> { | ||
let results = nullableMaybeDocumentMap(); | ||
let sizeMap = new SortedMap<DocumentKey, number>(DocumentKey.comparator); | ||
Review comment: Sorry, now that getEntries() is no longer a wrapper around getSizedEntries(), can we drop sizeMap and have this be new SortedMap<DocumentKey, DocumentSizeEntry|null> ?

Reply: First, I don't feel strongly about this. The reason I set it up that way is so that It's probably not a big deal, so if you think code clarity is more important here, I'll do the change.

Reply: Oh, sorry! I was confused. I thought it was just so the old getEntries() implementation could return the MaybeDocumentMap directly. But I see now that RemoteDocumentChangeBuffer.getEntries() ends up calling getSizedEntries() and using the sizes and also passing the MaybeDocumentMap straight through. So it needs both, and the way it's structured right now makes sense. So never mind. Please keep it the way it is.
||
if (documentKeys.isEmpty()) { | ||
return PersistencePromise.resolve({ maybeDocuments: results, sizeMap }); | ||
} | ||
|
||
const range = IDBKeyRange.bound( | ||
documentKeys.first()!.path.toArray(), | ||
documentKeys.last()!.path.toArray() | ||
); | ||
let key = documentKeys.first(); | ||
Review comment: This is built upon the idea that

Reply: Makes sense to me. I might name it

Reply: Done.
||
|
||
return remoteDocumentsStore(transaction) | ||
.iterate({ range }, (potentialKeyRaw, dbRemoteDoc, control) => { | ||
const potentialKey = DocumentKey.fromSegments(potentialKeyRaw); | ||
while (DocumentKey.comparator(key!, potentialKey) != 1) { | ||
Review comment: I would use

Reply: Done, thanks.

Review comment: +1 to this except I'd use

Reply: Done.
||
if (key!.isEqual(potentialKey)) { | ||
Review comment: Maybe not a big deal, but you could store the comparator result and then use

Reply: I'm concerned it would complicate the loop somewhat.
||
results = results.insert( | ||
key!, | ||
this.serializer.fromDbRemoteDocument(dbRemoteDoc) | ||
); | ||
sizeMap = sizeMap.insert(key!, dbDocumentSize(dbRemoteDoc)); | ||
Review comment: Whether to update the
||
} else { | ||
results = results.insert(key!, null); | ||
sizeMap = sizeMap.insert(key!, 0); | ||
} | ||
|
||
key = documentKeys.firstAfter(key!); | ||
if (!key) { | ||
control.done(); | ||
return; | ||
} | ||
control.skip(key!.path.toArray()); | ||
Review comment: I think this should go outside the while loop? Alternatively, while trying to figure out this code I ended up tweaking it a bit. If you think it's clearer, you can adopt it:
Not a big deal either way though.

Reply: Adopted (with very slight modification). Thanks, I didn't clean up this loop too much because I thought it might change a lot during the review.
||
} | ||
}) | ||
.next(() => { | ||
Review comment: Maybe add a comment. // The rest of the keys must not be in the cache.

Reply: Done (with minor addition).
||
while (key) { | ||
results = results.insert(key, null); | ||
sizeMap = sizeMap.insert(key, 0); | ||
key = documentKeys.firstAfter(key!); | ||
} | ||
return { maybeDocuments: results, sizeMap }; | ||
}); | ||
} | ||
|
||
getDocumentsMatchingQuery( | ||
transaction: PersistenceTransaction, | ||
query: Query | ||
|
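The lookup in `getSizedEntries` above walks two sorted sequences in one pass: the requested keys and the rows the cursor yields. A minimal sketch of that idea follows, using plain arrays and `Map` in place of IndexedDB and `SortedMap`. The names `Row` and `lookupSorted` are hypothetical and not part of the SDK, and the strict `<` comparison is an assumption of this sketch rather than the comparator check used in the PR.

```typescript
// Hypothetical stand-in for a row yielded by the IndexedDB cursor.
type Row = { key: string; size: number };

// Walks sorted `keys` against sorted `rows` in a single pass. Keys without a
// matching row map to null / size 0, mirroring the documented contract.
function lookupSorted(
  keys: string[],
  rows: Row[]
): { found: Map<string, Row | null>; sizes: Map<string, number> } {
  const found = new Map<string, Row | null>();
  const sizes = new Map<string, number>();
  let i = 0; // index of the next requested key

  for (const row of rows) {
    // Every requested key that sorts before this row is absent.
    while (i < keys.length && keys[i] < row.key) {
      found.set(keys[i], null);
      sizes.set(keys[i], 0);
      i++;
    }
    if (i === keys.length) {
      break; // analogous to control.done()
    }
    if (keys[i] === row.key) {
      found.set(keys[i], row);
      sizes.set(keys[i], row.size);
      i++;
    }
    // Otherwise this row was not requested; a real cursor would skip ahead.
  }

  // The remaining keys sort after every row, so they are not in the cache.
  for (; i < keys.length; i++) {
    found.set(keys[i], null);
    sizes.set(keys[i], 0);
  }
  return { found, sizes };
}
```

The trailing loop plays the role of the `.next(() => { while (key) ... })` step in the diff: once the cursor is exhausted, every key not yet visited is recorded as missing.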
@@ -381,6 +452,13 @@ class IndexedDbRemoteDocumentChangeBuffer extends RemoteDocumentChangeBuffer { | |
): PersistencePromise<DocumentSizeEntry | null> { | ||
return this.documentCache.getSizedEntry(transaction, documentKey); | ||
} | ||
|
||
protected getAllFromCache( | ||
transaction: PersistenceTransaction, | ||
documentKeys: DocumentKeySet | ||
): PersistencePromise<DocumentSizeEntries> { | ||
return this.documentCache.getSizedEntries(transaction, documentKeys); | ||
} | ||
} | ||
|
||
export function isDocumentChangeMissingError(err: FirestoreError): boolean { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,11 +17,14 @@ | |
import { Query } from '../core/query'; | ||
import { SnapshotVersion } from '../core/snapshot_version'; | ||
import { | ||
documentKeySet, | ||
DocumentKeySet, | ||
DocumentMap, | ||
documentMap, | ||
MaybeDocumentMap, | ||
maybeDocumentMap | ||
maybeDocumentMap, | ||
NullableMaybeDocumentMap, | ||
nullableMaybeDocumentMap | ||
} from '../model/collections'; | ||
import { Document, MaybeDocument, NoDocument } from '../model/document'; | ||
import { DocumentKey } from '../model/document_key'; | ||
|
@@ -74,6 +77,25 @@ export class LocalDocumentsView { | |
}); | ||
} | ||
|
||
// Returns the view of the given `docs` as they would appear after applying | ||
// all mutations in the given `batches`. | ||
private applyLocalMutationsToDocuments( | ||
transaction: PersistenceTransaction, | ||
docs: NullableMaybeDocumentMap, | ||
batches: MutationBatch[] | ||
): PersistencePromise<NullableMaybeDocumentMap> { | ||
let results = nullableMaybeDocumentMap(); | ||
return new PersistencePromise<NullableMaybeDocumentMap>(resolve => { | ||
Review comment: I don't think you have to do this in a closure. Consider:

Reply: Hmm, I should have mentioned this -- this function doesn't have to be async at all. I wrapped the computation in a promise because I presumed it's time-consuming enough to warrant this. Do you think it should just return

Review comment: Going further, I don't think this function needs to take a transaction or return a PersistencePromise. It can just be synchronous. Note that in some cases we do have functions that are "needlessly" asynchronous and return a PersistencePromise when they could be synchronous. But we typically do this for public functions where we want to reserve the ability to make them asynchronous in the future, and so we want the consuming component to deal with them as an asynchronous function. But since this is a private function, I wouldn't worry about future-proofing.

Review comment: @var-const In general, JavaScript is completely single-threaded (there's no way to block other than by spinning the CPU) and so wrapping a computation in a promise doesn't really help (in particular it won't enable any parallelism). And PersistencePromise is extra weird because it's a specially-designed Promise-like construct that tries to be as synchronous as possible because IndexedDb has weird semantics where in the completion for one operation you must synchronously start the next operation or else your transaction will auto-close. So even using PersistencePromise, this code is actually 100% synchronous. So you can go ahead and just yank PersistencePromise out.

Reply: Right. Thanks, made this function return the result directly.
||
docs.forEach((key, localView) => { | ||
for (const batch of batches) { | ||
localView = batch.applyToLocalView(key, localView); | ||
} | ||
results = results.insert(key, localView); | ||
}); | ||
resolve(results); | ||
}); | ||
} | ||
|
||
/** | ||
* Gets the local view of the documents identified by `keys`. | ||
* | ||
|
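The review of `applyLocalMutationsToDocuments` concludes that the function can return its result directly, with no transaction argument and no `PersistencePromise`. A rough sketch of that synchronous shape is below. `LocalDoc` and `Batch` are hypothetical stand-ins for `MaybeDocument | null` and `MutationBatch`, a plain `Map` replaces `NullableMaybeDocumentMap`, and the final code in the PR may differ.

```typescript
// Hypothetical minimal types standing in for the SDK's document and batch.
type LocalDoc = { key: string; value: number } | null;

interface Batch {
  applyToLocalView(key: string, doc: LocalDoc): LocalDoc;
}

// Applies every batch, in order, to every document and returns a new map.
// The work is pure computation, so the result is returned directly.
function applyLocalMutationsToDocuments(
  docs: Map<string, LocalDoc>,
  batches: Batch[]
): Map<string, LocalDoc> {
  const results = new Map<string, LocalDoc>();
  docs.forEach((doc, key) => {
    let localView = doc;
    for (const batch of batches) {
      localView = batch.applyToLocalView(key, localView);
    }
    results.set(key, localView);
  });
  return results;
}
```

A caller inside a `.next(...)` chain can return this value as-is, which matches how the diff already returns `results` from the final `.next(docs => ...)` step.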
@@ -84,28 +106,40 @@ export class LocalDocumentsView { | |
transaction: PersistenceTransaction, | ||
keys: DocumentKeySet | ||
): PersistencePromise<MaybeDocumentMap> { | ||
return this.remoteDocumentCache | ||
.getEntries(transaction, keys) | ||
.next(docs => this.getLocalViewOfDocuments(transaction, docs)); | ||
} | ||
|
||
/** | ||
* Similar to `getDocuments`, but creates the local view from the given | ||
* `baseDocs` without retrieving documents from the local store. | ||
*/ | ||
getLocalViewOfDocuments( | ||
transaction: PersistenceTransaction, | ||
baseDocs: NullableMaybeDocumentMap | ||
): PersistencePromise<MaybeDocumentMap> { | ||
let allKeys = documentKeySet(); | ||
Review comment: If you don't want to construct a new set just to pass the keys, you could change the signature for

Reply: Done (I had to use

Review comment: Can you try using any is dangerous in that it basically opts out of typechecking. So if you type the values as
TypeScript 3.x actually introduced

Reply: Sorry, missed this comment. Done now.
||
baseDocs.forEach(key => { | ||
allKeys = allKeys.add(key); | ||
}); | ||
|
||
return this.mutationQueue | ||
.getAllMutationBatchesAffectingDocumentKeys(transaction, keys) | ||
.next(batches => { | ||
const promises = [] as Array<PersistencePromise<void>>; | ||
.getAllMutationBatchesAffectingDocumentKeys(transaction, allKeys) | ||
.next(batches => | ||
this.applyLocalMutationsToDocuments(transaction, baseDocs, batches) | ||
) | ||
.next(docs => { | ||
let results = maybeDocumentMap(); | ||
keys.forEach(key => { | ||
promises.push( | ||
this.getDocumentInternal(transaction, key, batches).next( | ||
maybeDoc => { | ||
// TODO(http://b/32275378): Don't conflate missing / deleted. | ||
if (!maybeDoc) { | ||
maybeDoc = new NoDocument( | ||
key, | ||
SnapshotVersion.forDeletedDoc() | ||
); | ||
} | ||
results = results.insert(key, maybeDoc); | ||
} | ||
) | ||
); | ||
docs.forEach((key, maybeDoc) => { | ||
// TODO(http://b/32275378): Don't conflate missing / deleted. | ||
if (!maybeDoc) { | ||
maybeDoc = new NoDocument(key, SnapshotVersion.forDeletedDoc()); | ||
} | ||
results = results.insert(key, maybeDoc); | ||
}); | ||
return PersistencePromise.waitFor(promises).next(() => results); | ||
|
||
return results; | ||
}); | ||
} | ||
|
||
|
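The review thread in this hunk warns that `any` opts out of type checking and points to a type introduced in TypeScript 3.x. The inline code in those comments did not survive, so the sketch below assumes the suggestion was `unknown` (added in TypeScript 3.0). `KeyedForEach` and `collectKeys` are hypothetical names used only to show a function that needs a map's keys and never inspects its values.

```typescript
// Hypothetical map-like shape: only forEach is required, and the value type
// is `unknown`, so any map can be passed regardless of what it stores.
interface KeyedForEach<K> {
  forEach(cb: (value: unknown, key: K) => void): void;
}

// Collects the keys of a map without caring about the value type.
function collectKeys<K>(map: KeyedForEach<K>): K[] {
  const keys: K[] = [];
  map.forEach((_value, key) => {
    // Accessing a property on `_value` here would be a compile error.
    // With `any` the same access would compile and could fail at runtime.
    keys.push(key);
  });
  return keys;
}
```

The design point is that `unknown` accepts every value, like `any`, but forces a type check or assertion before the value can be used.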
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -71,7 +71,12 @@ export class LocalSerializer { | |
/** Encodes a document for storage locally. */ | ||
toDbRemoteDocument(maybeDoc: MaybeDocument): DbRemoteDocument { | ||
if (maybeDoc instanceof Document) { | ||
const doc = this.remoteSerializer.toDocument(maybeDoc); | ||
let doc: api.Document; | ||
Review comment: Up to you, but this bit could be shortened to:
or even:

Reply: Done. (I like the Lua-style second version, but chose the first one because it's more similar to other platforms)
||
if (maybeDoc.proto) { | ||
doc = maybeDoc.proto; | ||
} else { | ||
doc = this.remoteSerializer.toDocument(maybeDoc); | ||
} | ||
const hasCommittedMutations = maybeDoc.hasCommittedMutations; | ||
return new DbRemoteDocument( | ||
/* unknownDocument= */ null, | ||
|
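The two shortened forms suggested for `toDbRemoteDocument` are not preserved in the review comment. One plausible reading, given the "Lua-style" remark, is a conditional expression versus a short-circuit `||` fallback. The sketch below illustrates that pair under this assumption; `Proto`, `CachedDoc`, and `serialize` are hypothetical stand-ins, not SDK names.

```typescript
// Hypothetical shapes: `proto` caches an already-serialized form.
type Proto = { name: string };
type CachedDoc = { name: string; proto?: Proto };

// Stand-in for remoteSerializer.toDocument().
function serialize(doc: CachedDoc): Proto {
  return { name: doc.name };
}

// Conditional-expression form.
function toProtoTernary(doc: CachedDoc): Proto {
  return doc.proto ? doc.proto : serialize(doc);
}

// Short-circuit form, similar to Lua's `a or b` idiom.
function toProtoFallback(doc: CachedDoc): Proto {
  return doc.proto || serialize(doc);
}
```

Both forms reuse the cached value when it is present and serialize only when it is missing, which is the behavior of the `if`/`else` in the diff.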
Review comment: Another approach would be for `getSizedEntries` to return a `SortedMap<DocumentKey, DocumentSizeEntry>`. I decided in favor of returning two maps because it makes it easier to avoid code duplication between `getEntries` and `getSizedEntries`.

Reply: Consider taking as an argument a function that processes each key into the type of the result, for example `fn: (key: DocumentKey, doc: DbRemoteDoc | null) => T`. Then, your return type can be `PersistencePromise<SortedMap<DocumentKey, T>>`. You can avoid code duplication and avoid doing extra work for sizes that way.

Reply: I tried it out, but I'm not sure I prefer it. The problem is that `getEntries` in `RemoteDocumentChangeBuffer` won't be able to return a map of documents directly (due to type difference) and instead would have to build a new map. If extra work for calculating/storing sizes is a concern, it's easy (though ugly) to solve with a flag (or perhaps, more similar to this approach, by having a `fn` that either updates the `sizeMap` or is a no-op).

Reply: Something like that makes sense, but since it deals with DbRemoteDoc, I'd keep it internal and have getEntries() and getSizedEntries() functions that wrap it.

I think what Greg is recommending is basically a mapDbEntries() function, but it might be a little simpler to instead have it be a forEachDbEntry(transaction, documentKeys, callback) function that iterates the matching documents and just calls the callback with each raw dbRemoteDocument. That may mean a little bit of redundant code for getEntries() and getSizedEntries() to build up their respective maps, but it seems simpler to me (and perhaps more generically useful, if we had a case where we don't necessarily want to build up a map).

Reply: Done, please take a look.
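The `forEachDbEntry` idea from this thread can be sketched as one internal iterator with two thin wrappers. The version below is an illustration under simplifying assumptions, not the code that landed in the PR: a `Map` stands in for the IndexedDB store, `RawDoc` is a hypothetical substitute for `DbRemoteDocument`, the string length stands in for `dbDocumentSize`, and the functions are synchronous rather than returning `PersistencePromise`.

```typescript
// Hypothetical raw record, standing in for DbRemoteDocument.
type RawDoc = { body: string };

// Calls `callback` once per requested key with the raw record, or with null
// when the store has no entry for that key.
function forEachDbEntry(
  store: Map<string, RawDoc>,
  keys: string[],
  callback: (key: string, raw: RawDoc | null) => void
): void {
  for (const key of keys) {
    const raw = store.get(key);
    callback(key, raw === undefined ? null : raw);
  }
}

// Builds only the document map; no size bookkeeping is done.
function getEntries(
  store: Map<string, RawDoc>,
  keys: string[]
): Map<string, string | null> {
  const docs = new Map<string, string | null>();
  forEachDbEntry(store, keys, (key, raw) => {
    docs.set(key, raw ? raw.body : null);
  });
  return docs;
}

// Builds the document map and the size map in the same pass.
function getSizedEntries(
  store: Map<string, RawDoc>,
  keys: string[]
): { docs: Map<string, string | null>; sizes: Map<string, number> } {
  const docs = new Map<string, string | null>();
  const sizes = new Map<string, number>();
  forEachDbEntry(store, keys, (key, raw) => {
    docs.set(key, raw ? raw.body : null);
    sizes.set(key, raw ? raw.body.length : 0); // stand-in for dbDocumentSize
  });
  return { docs, sizes };
}
```

The wrappers repeat a few lines, as the reviewer anticipated, but the iterator stays free of any particular result type and `getEntries` does no extra work for sizes.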