Uploading multiple import files from the same source without executing the pipeline in between will overwrite data #363

sposerina · 2021-06-13T19:57:21Z

Each time a new import file is uploaded, the previous file is archived. Since import files are only loaded into the database when the pipeline executes, uploading several import files from the same data source between executions causes the earlier files to be archived without ever making it into the database.

Since this is a capability exposed to the end user, a more user-friendly and intuitive behavior is necessary.
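To make the failure mode concrete, here is a toy in-memory model of the behavior described above. It is only a sketch: the `ImportStore` class, its fields, and the `"source_x"` name are hypothetical and do not reflect the project's actual code, which works with files on disk.

```python
from dataclasses import dataclass, field


@dataclass
class ImportStore:
    """Toy model (hypothetical): one pending slot per data source, plus an archive."""
    pending: dict = field(default_factory=dict)   # source -> file contents
    archive: list = field(default_factory=list)   # files moved aside
    database: list = field(default_factory=list)  # contents the pipeline has loaded

    def upload(self, source: str, contents: str) -> None:
        # A new upload archives whatever was pending for this source,
        # whether or not the pipeline ever loaded it.
        if source in self.pending:
            self.archive.append(self.pending[source])
        self.pending[source] = contents

    def execute_pipeline(self) -> None:
        # Only the file currently pending for each source gets loaded.
        for contents in self.pending.values():
            self.database.append(contents)
            self.archive.append(contents)
        self.pending.clear()


store = ImportStore()
store.upload("source_x", "file A")
store.upload("source_x", "file B")  # file A is archived before being loaded
store.execute_pipeline()
```

After the run, `store.database` holds only `"file B"`, while `"file A"` sits in `store.archive` without ever having been loaded.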

c-simpson · 2021-06-14T15:39:10Z

This looks like a good time to move away from storing the import files in the filesystem.
I propose we store the files as blobs in the DB, keep a queue of not-yet-imported files, and also store the hash so that we can skip duplicates.
I also think the current practice of renaming the files to 'upload_type' + import timestamp makes it easy to miss files when you need to upload multiple files of the same type.
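A minimal sketch of what the blob-plus-queue idea could look like, assuming SQLite via Python's standard `sqlite3` module. The table name `import_files`, its columns, and the helpers `open_store`, `enqueue`, and `drain` are all hypothetical, and making the hash unique across all sources (rather than per source) is an assumption made here for brevity.

```python
import hashlib
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) import_files table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS import_files (
            id       INTEGER PRIMARY KEY AUTOINCREMENT,
            source   TEXT NOT NULL,
            sha256   TEXT NOT NULL UNIQUE,       -- used to skip duplicates
            content  BLOB NOT NULL,              -- the uploaded file itself
            imported INTEGER NOT NULL DEFAULT 0  -- 0 = still in the queue
        )
        """
    )
    return conn


def enqueue(conn: sqlite3.Connection, source: str, content: bytes) -> bool:
    """Store an upload; return False if an identical file was already stored."""
    digest = hashlib.sha256(content).hexdigest()
    cur = conn.execute(
        "INSERT OR IGNORE INTO import_files (source, sha256, content) "
        "VALUES (?, ?, ?)",
        (source, digest, content),
    )
    conn.commit()
    return cur.rowcount == 1


def drain(conn: sqlite3.Connection, load) -> int:
    """Pass every not-yet-imported file to load(), oldest first."""
    rows = conn.execute(
        "SELECT id, source, content FROM import_files "
        "WHERE imported = 0 ORDER BY id"
    ).fetchall()
    for file_id, source, content in rows:
        load(source, content)
        conn.execute(
            "UPDATE import_files SET imported = 1 WHERE id = ?", (file_id,)
        )
    conn.commit()
    return len(rows)
```

With this shape, a second upload from the same source adds a row to the queue instead of displacing the first, and the pipeline run would call something like `drain(conn, load_into_db)` to process everything still pending.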

jwtruver · 2021-06-22T23:57:55Z

We should control the uploading portion so the user cannot make any errors.

jwtruver · 2021-10-26T23:07:01Z

Pursue getting rid of the execute button and switching to "automated" execution whenever new files are uploaded.

jwtruver · 2021-11-10T00:22:34Z

@c-simpson will test and confirm

carlos-dominguez · 2022-02-10T21:49:30Z

This would be fixed by #480, though not exactly in the way Cris recommends; instead, my PR does some normalization on upload rather than storing a JSON blob.

sposerina changed the title ~~Uploading multiple import files from the same source without exeuting the pipeline imbetween will ovewrite data~~ Uploading multiple import files from the same source without exeuting the pipeline imbetween will overwrite data Jun 13, 2021

c-simpson self-assigned this Jun 14, 2021

c-simpson added Importing persistent DB labels Jun 14, 2021

c-simpson added this to the Lauren-2 milestone Jun 15, 2021

jwtruver removed this from the Lauren-2 milestone Jun 22, 2021

kfettich closed this as completed Jun 13, 2022
