Skip to content

[Multi-Tab] Adding Schema Migration #485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Feb 9, 2018

Conversation

schmidt-sebastian
Copy link
Contributor

@schmidt-sebastian schmidt-sebastian commented Feb 2, 2018

This performs the IndexedDB schema migration required for MultiTab:

  • It adds a new IndexedDB store to store instance metadata.
  • It adds a new IndexedDB store to store a query changelog.
  • It adds a field to DbMutationQueue, which we can do silently without changing the schema.

The functionality still allows for creation of schema version 1, so that we can make testing easier.

It's likely that we will want to coordinate this migration with our Garbage Collection efforts.

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review in detail but looked over this and flagged a few things that jumped out.

Have you looked at the SQLitePersistence runMigrations() code for Android to see if we can follow a similar pattern? In particular:

  1. I like the switch / case (with fall-through) approach it uses.
  2. In general we should try to make the upgrade mechanism as similar as possible across platforms.
  3. It may make sense to copy the INDEXING_SUPPORT_ENABLED flag mechanism we used there. I think we should be careful not to migrate anybody's schema until we're fully ready to launch multi-tab.

Also, Greg is working on schema upgrade infrastructure for iOS right now too, so it may be worth peeking at what he's doing.

newVersion: number
): void {
assert(
oldVersion >= 0 || oldVersion <= 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (combined the asserts into one)

'Unexpected upgrade from version ' + oldVersion
);
assert(
newVersion >= 1 || newVersion <= 2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& ?

);

const createV1 = newVersion >= 1 && oldVersion <= 1;
const dropV1 = oldVersion >= 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be <= ? FWIW, I'm not sure these intermediate variables are helping here. I keep looking back and forth between the if-checks...

export function createOrUpgradeDb(
db: IDBDatabase,
oldVersion: number,
newVersion: number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having newVersion be a parameter is super confusing to me. Are you intending to support downgrades (I think we should try hard to avoid that!)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not at all planning on adding the ability to downgrade. I renamed the arguments to fromVersion and toVersion and added an assert that one must be lower than the other.

@@ -131,7 +87,14 @@ export class DbMutationQueue {
* After sending this token, earlier tokens may not be used anymore so
* only a single stream token is retained.
*/
public lastStreamToken: string
public lastStreamToken: string,
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added newlines here and everywhere else where they were missing. I also tried to make this semi-valid JSDoc along with it (I still don't think it's valid as we would probably have to move the JSDoc comments to the function header).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that prettier stripped the newlines again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry. I should have known this was prettier...

Thanks for your efforts to make things JSDoc friendly, though 1) I don't consider this a goal [we're never going to generate docs from our internal implementation files], and 2) I don't think that needed to go in this PR... In this case it's probably fine, but it could be distracting if somebody was revisiting this PR later (to track down a bug or port to another platform or whatever).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it is not for external documentation, it could show up in code completion when done properly - whereas right not it just shows up as a warning due do the /** JSDoc */ marks.

I agree that this is mostly a non-issue though and happily pressed some buttons to revert this.

* than or equal to this value are considered to have been acknowledged by
* the server.
*/
public highestPendingBatchId: number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only exists so that we can leave acknowledged mutations in the queue for the sake of other tabs being able to read them, right? I think it's worth calling that out here. Also, are we going to port this to other platforms? If not, please add a "// PORTING NOTE:" comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the outdated comment. This is to reduce the number of scans we have to do in the mutation batch. Right now, to get the highest batch ID, we do an full scan at startup. I don't want to do this every time we add a mutation from a secondary tab.

// https://github.com/Microsoft/TypeScript/issues/14322
type KeyPath = any; // tslint:disable-line:no-any

function createCache(db: IDBDatabase): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, "cache" is non-obvious, so I would at least add a comment.

// Create IndexedDB stores for our query / remote document caches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this and split this up to match the names in Android: createQueryCache() and createRemoteDocumentCache()

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple more minor comments after a second look.

@@ -404,6 +367,7 @@ export class DbTargetDocument {
* The targetId identifying a target.
*/
public targetId: TargetId,
public snapshotVersion: number,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment? (especially call out if we are doing any cheating with fake snapshot versions or whatever...)

Also I think we want to do some design review (with Gil at least) before moving forward with this (unless it's living in a non-master branch or something).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked to Gil and Greg about this yesterday. Gil favors a separate ChangeLog, as primary key updates are pretty expensive. I updated the Design Doc with this yesterday and added a new table to store this changelog. The nice thing about this is that now all changes in the schema are additive and we don't need to drop the Query & Remote Document Cache anymore.

@@ -455,6 +419,20 @@ export class DbTargetGlobal {
) {}
}

export type DbInstanceKey = [string, string];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments?

Also, "Instance" feels too generic. Perhaps "ClientInstance" (and comments should explain the multi-tab scenario)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer used.

@schmidt-sebastian
Copy link
Contributor Author

schmidt-sebastian commented Feb 6, 2018

I didn't review in detail but looked over this and flagged a few things that jumped out.

Have you looked at the SQLitePersistence runMigrations() code for Android to see if we can follow a similar pattern? In particular:

I like the switch / case (with fall-through) approach it uses.
In general we should try to make the upgrade mechanism as similar as possible across platforms.
It may make sense to copy the INDEXING_SUPPORT_ENABLED flag mechanism we used there. I think we should be careful not to migrate anybody's schema until we're fully ready to launch multi-tab.
Also, Greg is working on schema upgrade infrastructure for iOS right now too, so it may be worth peeking at what he's doing.

This PR is now ready for review.

I looked at the iOS PR, which originally inspired the creation of a separate file to run the migrations. I like the Android way better to do this in place and merged my change into indexeddb_schema.

Both Android and iOS use the switch-statement with fallthrough. This doesn't allow us to specify a range of versions to update (from 0 to 1 for example) and I would like to use that for testing of the actual migration process.

I did not make this flag controlled as this PR is meant to be merged into the firestore-multi-tab branch. We should decide later on whether this feature will be optional and how we want to deal with optional features in the schema. Even if multi-tab is optional, it might make sense to add it to the schema but not write any data into the multi-tab specific stores.

@schmidt-sebastian schmidt-sebastian changed the title Multitab schemamigration [Multi-Tab] Adding Schema Migration Feb 6, 2018
@schmidt-sebastian schmidt-sebastian changed the base branch from master to firestore-multi-tab February 6, 2018 18:48
@schmidt-sebastian
Copy link
Contributor Author

BTW, I updated this PR to remove the "userId" from the InstanceMetadata store. The rest of the multi-tab work assumes that only one user is logged in at a time, and there is no reason to keep state around for users that are no longer active. I haven't updated the doc yet and wanted have this sanity checked first.

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I like this iteration a lot. Some feedback but nothing major. FWIW- My suggestion to add the flag was so that we could merge this to master before multi-tab is done / shipped. I am hoping we don't need to make multi-tab optional long-term.

In any case if you're intending to work in a branch, then we don't need the flag guard. Just be wary of conflicts...

DbTargetDocument.documentTargetsKeyPath,
{ unique: true }
/**
* Performs database creation and schema migrations up to schema version 2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't bake the current schema version into this comment (it'll just get stale). And I think you should explain why toVersion is a parameter. Something like:

Performs database creation and schema migrations up to SCHEMA_VERSION. Note that in production, toVersion will always be SCHEMA_VERSION but for testing we allow "partial upgrades" to earlier versions (but toVersion must still be >= fromVersion).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the comment as suggested. The schema version specific comment is not just above the assert, which has to be changed as we add new schema version. I thought about adding named constant for each schema version, but I am not sure how much value that adds.

createMutationQueue(db);
createQueryCache(db);
createRemoteDocumentCache(db);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not doing the switch / case pattern, I think you should still organize similarly (in particular put version 1 stuff before version 2), and I think we need to have a simple pattern that makes it obvious how to add code whenever we bump SCHEMA_VERSION... As-is, the toVersion === 2 check will need to be changed when we add version 3. In general, this method should be append-only as we add versions. We can't really change version migration stuff once it's shipped.

Perhaps:

if (fromVersion < 1 && toVersion >= 1) {
  // version 1 stuff...
}

if (fromVersion < 2 && toVersion >= 2) {
  // version 2 stuff...
}
...

Or you could do a for loop:

for (let version = fromVersion+1; version <= toVersion; version++) {
  if (version === 1) {
    // version 1 stuff
  } else if (version === 2) {
    // version 2 stuff
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used code snippet version 1.

@@ -131,7 +87,14 @@ export class DbMutationQueue {
* After sending this token, earlier tokens may not be used anymore so
* only a single stream token is retained.
*/
public lastStreamToken: string
public lastStreamToken: string,
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry. I should have known this was prettier...

Thanks for your efforts to make things JSDoc friendly, though 1) I don't consider this a goal [we're never going to generate docs from our internal implementation files], and 2) I don't think that needed to go in this PR... In this case it's probably fine, but it could be distracting if somebody was revisiting this PR later (to track down a bug or port to another platform or whatever).

* by the server. All MutationBatches in this queue with batchIds less
* than or equal to this value are considered to have been acknowledged by
* the server.
* @param lastAcknowledgedBatchId - An identifier for the highest numbered
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the reason we have these detailed comments here is more because these are fields on the class than because we like to comment all of our parameters. So while @param is probably correct, it may make it less obvious that these are also public fields on the class... So I'm ambivalent about actually using @param here. I don't care enough to ask you to undo it, but I think I'd prefer we not go migrate the whole codebase or anything...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only took seconds to undo, so it's undone.

* batch in the mutation queue. This allows for efficient insert of new
* batches without relying on in-memory state.
*
* PORTING NOTE: iOS and Android clients keep this value in-memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't they track it as nextBatchId? This is a clearer name I think. Could we do the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to nextBatchId (which is the internal name used in the clients), but then ultimately removed it as pointed out above.

createOrUpgradeDb(db, event.oldVersion, targetVersion);
};

return deferred.promise;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use SimpleDb.openOrCreate() ? [I'm not 100% sure]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SimpleDb doesn't expose the version and the list of object stores.

return objectStores;
}

// Sorting these arrays directly should not affect the functionality of the SDK.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is ugly! Why not just do V1_STORES.slice().sort() where you need them sorted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer used.

it('can install schema version 1', () => {
return initDb(1).then(db => {
expect(db.version).to.be.equal(1);
expect(getAllObjectStores(db)).to.deep.equal(V1_STORES);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use to.have.members(...) and not need to ensure they're sorted the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

V1_STORES.sort();
ALL_STORES.sort();

describe('SchemaMigration', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is SchemaMigration? :-) Maybe 'Schema Migration: createOrUpgradeDb()' (since createOrUpgradeDb is the method you're testing here).

toVersion: number
): void {
assert(
fromVersion < toVersion && fromVersion >= 0 && toVersion <= 2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 => SCHEMA_VERSION ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually changed SCHEMA_VERSION to DEFAULT_SCHEMA_VERSION and reset the value back to 1. This way, I can merge this code to master, and Greg can build the GC migration on top of this.

@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this State. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@schmidt-sebastian schmidt-sebastian changed the base branch from firestore-multi-tab to master February 7, 2018 19:34
@schmidt-sebastian
Copy link
Contributor Author

This is ready for review again. I have updated the base branch to 'master`, but used the SCHEMA_VERSION constant to make sure that we don't run this upgrade yet.

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically LGTM, but I'm wary of checking in multi-tab schema work in master if we're doing the feature development in a different branch. It seems like this is at minimum confusing, and could potentially create headaches later (e.g. when you iterate on the schema or if we need to ship a non-multi-tab schema change before the multi-tab ones).

Have you considered just doing the multi-tab work in master? In that case I think this PR would add a global MULTITAB_ENABLED flag for tests to flip and then set SCHEMA_VERSION and ALL_STORES as appropriate based on it, and the rest of our code would similarly trigger off this flag (e.g. create a no-op SharedClientState object instead of the LocalStorage one). FWIW, this is the path we started down for indexing on Android.

I can probably be talked into approving this PR but I'd like to at least have the conversation.

The other option would be to separate the schema-migration part of this PR from the multi-tab part, but I think then we end up with no actual "schema migration" testing. :-)

})
.then(db => {
fn(db);
return db;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just db.close() here and drop a then...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@schmidt-sebastian schmidt-sebastian changed the base branch from master to firestore-multi-tab February 9, 2018 02:22
@schmidt-sebastian
Copy link
Contributor Author

I want to keep the multi-tab work in a separate branch to avoid unnecessary churn on the master branch in case I need to revisit some of my design decisions. The reason I want(ed) to merge this into Master is that Greg is going to write schema migration code pretty soon and it would be nice if he could built it on top of this PR (which might bump multi-tab to version 3).

There are many other ways to guarantee that @gsoltis doesn't do duplicate work, and I'll just ask him to pull in this PR and merge it to Master with multi-tab removed and GC added.

TL/DR: Rebased to firestore-multi-tab and made Schema Version 2 the default in this branch.

Copy link
Contributor

@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit left...

* clients with the schema version 2.
*/
export const ALL_STORES = [...V1_STORES, ...V2_STORES];
export const DEFAULT_STORES = [...V1_STORES, ...V2_STORES];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put back to ALL_STORES?

@schmidt-sebastian schmidt-sebastian merged commit 9c61e81 into firestore-multi-tab Feb 9, 2018
@schmidt-sebastian schmidt-sebastian deleted the multitab-schemamigration branch March 26, 2018 21:33
@firebase firebase locked and limited conversation to collaborators Oct 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants