Skip to content

Bundle's named query resume. #3395

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 51 commits into from
Aug 6, 2020
Merged

Conversation

wu-hui
Copy link
Contributor

@wu-hui wu-hui commented Jul 12, 2020

No description provided.

wu-hui added 30 commits May 21, 2020 13:49
# Conflicts:
#	packages/firestore/src/util/bundle_reader.ts
#	packages/firestore/test/unit/util/bundle.test.ts
# Conflicts:
#	packages/firestore/src/util/bundle_reader.ts
#	packages/firestore/test/unit/util/bundle.test.ts
Two TODO:
1. load from secondary tab.
2. don't fail async queue.
@wu-hui wu-hui changed the base branch from wuandy/ExpBundlesApi to wuandy/Bundles July 28, 2020 21:28
@@ -1925,7 +1936,8 @@ export class Query<T = firestore.DocumentData> implements firestore.Query<T> {
constructor(
public _query: InternalQuery,
readonly firestore: Firestore,
protected readonly _converter: firestore.FirestoreDataConverter<T> | null
protected readonly _converter: firestore.FirestoreDataConverter<T> | null,
private _readFrom?: SnapshotVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename everywhere.

@@ -175,7 +175,7 @@ export interface SyncEngine extends RemoteSyncer {
* server. All the subsequent view snapshots or errors are sent to the
* subscribed handlers. Returns the initial snapshot.
*/
listen(query: Query): Promise<ViewSnapshot>;
listen(query: Query, readFrom?: SnapshotVersion): Promise<ViewSnapshot>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I misunderstood your code when I looked at it yesterday (was a bit rushed). It looks like the idea is that the readTime comes from EventManager.listem(). That doesn't seem correct.

The code that is interested in a query shouldn't need to know that the query was bundled at one point and it shouldn't even know about the version that it is at. Think about this as similar to resume tokens - LocalStore knows about them, but the API surface does not. Once persisted, the readTime should only need to be passed from LocalStore to RemoteStore.

A named query should be a helper to quickly construct the Query that was bundled. It should not be a requirement for bundling. If a user happens to reconstruct the same query from scratch, they should get the same semantics (e.g. query resumption).

As a first step, it looks like this means that we should remove readTime from the BundleCache and move it to the TargetCache.

return null;
}

// TODO(wuandy): make this work with exp build.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You simply imported the wrong Query:

diff --git a/packages/firestore/exp/src/api/database.ts b/packages/firestore/exp/src/api/database.ts
index 52f80f80..e613e8a1 100644
--- a/packages/firestore/exp/src/api/database.ts
+++ b/packages/firestore/exp/src/api/database.ts
@@ -44,13 +44,14 @@ import { cast } from '../../../lite/src/api/util';
 import { Code, FirestoreError } from '../../../src/util/error';
 import { Deferred } from '../../../src/util/promise';
 import { LruParams } from '../../../src/local/lru_garbage_collector';
-import { CACHE_SIZE_UNLIMITED, Query } from '../../../src/api/database';
+import { CACHE_SIZE_UNLIMITED } from '../../../src/api/database';
 import { DatabaseId, DatabaseInfo } from '../../../src/core/database_info';
 import {
   indexedDbStoragePrefix,
   indexedDbClearPersistence
 } from '../../../src/local/indexeddb_persistence';
 import { LoadBundleTask } from '../../../src/api/bundle';
+import {Query} from "../../../lite";
 
 /**
  * The root reference to the Firestore database and the entry point for the
@@ -322,7 +323,5 @@ export async function namedQuery(
     return null;
   }
 
-  // TODO(wuandy): make this work with exp build.
-  return null;
-  // return new Query(namedQuery.query, firestoreImpl, null, namedQuery.readTime);
+  return new Query(firestoreImpl,namedQuery.query,  null, namedQuery.readTime);
 }

@@ -1925,7 +1936,8 @@ export class Query<T = firestore.DocumentData> implements firestore.Query<T> {
constructor(
public _query: InternalQuery,
readonly firestore: Firestore,
protected readonly _converter: firestore.FirestoreDataConverter<T> | null
protected readonly _converter: firestore.FirestoreDataConverter<T> | null,
private _readTime?: SnapshotVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_readTime should only be part of TargetData and not be used in the API surface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

{
readFrom: this._readTime,
includeMetadataChanges: options.includeMetadataChanges
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

@@ -975,6 +975,11 @@ export function toTarget(

if (targetData.resumeToken.approximateByteSize() > 0) {
result.resumeToken = toBytes(serializer, targetData.resumeToken);
} else if (targetData.snapshotVersion.compareTo(SnapshotVersion.min()) > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we:

  • Always update the read time for each TargetData?
  • Always persist it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i understand correctly, this is the same comment that we should persist named queries in target cache as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite. I would assume that targetData.snapshotVersion.compareTo(SnapshotVersion.min()) > 0 is always true (unless the target just got allocated, in which case the existing snapshot version should also be null). Can you verify that, and if so drop the if-condition here?

if (resumeToken) {
  ...
}

result.readTime = toTimestamp(
      serializer,
      targetData.snapshotVersion.toTimestamp()
    );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, problem is in the proto resumeToken and readTime are in oneof type.

This brings your question though: should we always use readtime to resume. Because this change makes resumeToken always overwritten. I think functionally this works but we should clean up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the problem is that we can't tell which one is newer and we likely want the newer of the two. In stand up we discussed that the resumeToken is preferred, since it contains more information to help the backend resume at the right query position. I would think we could do something like this:

  • If readTime > snapshotVersion, use readTime
  • Otherwise use resumeToken

Does that work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As further discussed, the fix is to drop resumeToken when we update a target.

Still, there are many tests rely on this behaviour now, that i want to remove the check later.

.orderBy('bar', 'desc')
// eslint-disable-next-line @typescript-eslint/no-explicit-any
.limit(1) as any)._query
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be so nice if these tests lived in "api", which would mean that they get tested via firestore-exp and the minified builds. Can we use pre-encoded bundle and load it from JSON:?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can give it a try, but in a later PR?

query: Query,
readFrom: TestSnapshotVersion
): this {
// Note we do not call this.nextStep() here because we reuse userListens().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this needs to be spec tested, and if anything it seems like it should be two steps (one that loads the query, one that listens to it). To me, restoring a Query by name is separate from query execution. It doesn't need to interact with the rest of the system (SyncEngine, RemoteStore).

The test could be simple unit tests for the Query class:

const query1 = query.where('foo', > 1);
storeInBundle('fooQuery',  'querye.where('foo', > 1)}');
const query2 = namedQuery('fooQuery');
expect(query1).to.equal(query2);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I moved it to localstore test though.

addNamedQuery(
name: string,
readTime: api.Timestamp,
bundledQuery: BundledQuery
query: Query,
limitType?: LimitType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query already has the LimitType.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing I agree: the limit type here is from bundles types because bundles never really save an actual limit-to-last query. It saves a limit query with same orderBy and cursors, with a type indicator that it's a limit to last query.

Using the limit type from Query would mean we need to construct a limit to last query that translates to a limit query that is equivalent query saved in bundles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it at all be possible to do the Query translation (flip the order) here? Not sure if feasible, but then we could drop the LimitType and reduce the confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

export interface SpecUserListen {
targetId: TargetId;
query: string | SpecQuery;
fromName?: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should ideally be removed (or moved to a different step).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

): Promise<firestore.Query | null> {
const firestoreImpl = cast(firestore, Firestore);
const client = await firestoreImpl._getFirestoreClient();
const namedQuery = await client.getNamedQuery(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs merging again. The call to getNamedQuery should call a free-standing version that takes LocalStore as an argument. All methods in FirestoreClient have been updated.

Let me know if you need help with this. You can also defer to a follow up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd like to defer to my a follow up PR.

@@ -236,10 +237,21 @@ export interface LocalStore {
* they don't get GC'd. A target must be allocated in the local store before
* the store can be used to manage its view.
*
* @param {SnapshotVersion} readTime The snapshot version the allocated target
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are writing TypeScript, no need for {SnapshotVersion} (and also {PersistenceTransaction}).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

.next(() => targetData);
});
}
_allocateOperation(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Can you come up with "normal" name for this? I can't think of anything either, otherwise I would suggest something. You can drop the underscore since this is not a public class and you might be able to make this method private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed.

);
}
return allocateOperation.then(targetData => {
// If Multi-Tab is enabled, the existing target data may be newer than
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this cannot be done in _allocateOperation? If you move this code, then saveNamedQuery doesn't have to call allocateTarget anymore and you can drop the transaction and the readTime argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, i was not sure if persistence transaction could be nested. Took your advice and it looks much better now!

fromVersion(query.readTime!),
transaction
);
return localStoreImpl.bundleCache.saveNamedQuery(transaction, query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple things:

  • I think these two calls don't need to happen in the same transaction, as long as the allocateTarget happens before saveNamedQuery. That would then undo most of the changes in this file.

  • Instead of changing allocateTarget to take a read time, you could also call allocateTarget followed by updateTargetData, removing all changes.

  • You are leaking the target right now since you never call release. That might be by design, but we should point this out somehow (and ideally make it more explicit in the code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied first two.

Regarding 3, they will get collected aren't they? Also added comments to explain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They will get collected in IndexedDB, but the local target mapping doesn't get cleared. this.targetDataByTarget and this.targetIdByTarget seems to stay around. I wonder if this makes it possible for LRU GC to delete from disk even though it is still considered active in LocalStore? Can you verify what the implications would be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so it looks like being in targetDataByTarget means it is active, unless user happen to unlisten the same query somehow.

This is not a bad thing I think, added comment to explain.

@@ -73,7 +73,7 @@ export interface LimboMap {

export interface ActiveTargetSpec {
queries: SpecQuery[];
resumeToken: string;
resumeToken: string | TestSnapshotVersion;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be split into two distinct fields, since TestSnapshotVersion is not a resumeToken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -253,7 +253,7 @@ export class SpecBuilder {
return this;
}

userListens(query: Query, resumeToken?: string): this {
userListens(query: Query, resumeFrom?: string | TestSnapshotVersion): this {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the test above would read super cleanly if you made this

Suggested change
userListens(query: Query, resumeFrom?: string | TestSnapshotVersion): this {
userListens(query: Query, resumeFrom?: { resumeToken?: string; readTime?: TestSnapshotVersion}}): this {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1029,7 +1029,7 @@ export class SpecBuilder {
private addQueryToActiveTargets(
targetId: number,
query: Query,
resumeToken?: string
resumeToken?: string | TestSnapshotVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this resumeFrom as well, since TestSnapshotVersion is not a resumeToken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separated.

@@ -1360,8 +1367,10 @@ export interface SpecStep {
expectedSnapshotsInSyncEvents?: number;
}

/** [<target-id>, <query-path>] */
export type SpecUserListen = [TargetId, string | SpecQuery];
export interface SpecUserListen {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HOLY SMOKES. THIS IS THE BEST THING I HAVE READ ALL YEAR.

Thanks for cleaning this up.

addNamedQuery(
name: string,
readTime: api.Timestamp,
bundledQuery: BundledQuery
query: Query,
limitType?: LimitType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it at all be possible to do the Query translation (flip the order) here? Not sure if feasible, but then we could drop the LimitType and reduce the confusion.

@wu-hui wu-hui assigned schmidt-sebastian and unassigned wu-hui Aug 5, 2020
@@ -975,6 +975,11 @@ export function toTarget(

if (targetData.resumeToken.approximateByteSize() > 0) {
result.resumeToken = toBytes(serializer, targetData.resumeToken);
} else if (targetData.snapshotVersion.compareTo(SnapshotVersion.min()) > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the problem is that we can't tell which one is newer and we likely want the newer of the two. In stand up we discussed that the resumeToken is preferred, since it contains more information to help the backend resume at the right query position. I would think we could do something like this:

  • If readTime > snapshotVersion, use readTime
  • Otherwise use resumeToken

Does that work?


return spec()
.loadBundle(bundleString1)
.userListens(query1, /* resumeFrom */ '', 400)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

400 still raises questions here. If you use named optional arguments this will be super clean, I promise:

userListens(query1 ) 
userListens(query1, { resumeToken : 'abc' } ) 
userListens(query1, { readTime : 400 } ) 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, it is better!

fromVersion(query.readTime!),
transaction
);
return localStoreImpl.bundleCache.saveNamedQuery(transaction, query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They will get collected in IndexedDB, but the local target mapping doesn't get cleared. this.targetDataByTarget and this.targetIdByTarget seems to stay around. I wonder if this makes it possible for LRU GC to delete from disk even though it is still considered active in LocalStore? Can you verify what the implications would be?

Copy link
Contributor

@schmidt-sebastian schmidt-sebastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much LGTM, minus one style nit in regards to the spec tests and the discussion from standup.

@wu-hui wu-hui assigned schmidt-sebastian and unassigned wu-hui Aug 5, 2020
@wu-hui wu-hui merged commit 5ebf8c3 into wuandy/Bundles Aug 6, 2020
@firebase firebase locked and limited conversation to collaborators Sep 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants