Each time a new import file is uploaded, the previous file is archived. Since import files are only loaded into the database on pipeline execution, subsequent uploads from the same data source cause data to be archived without ever making it into the database.
Since this capability is exposed to the end user, a more user-friendly and intuitive behavior is needed.
sposerina changed the title (Jun 13, 2021): "Uploading multiple import files from the same source without executing the pipeline in between will ovewrite data" → "Uploading multiple import files from the same source without executing the pipeline in between will overwrite data"
This looks like a good time to move away from storing the import files in the filesystem.
I propose we store the files as blobs in the DB, keep a queue of not-yet-imported files, and also store the hash so that we can skip duplicates.
I also think the current practice of renaming files to 'upload_type' + import timestamp makes it easy to miss files when you have multiple files of the same type to upload.
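A minimal sketch of the proposal above, assuming SQLite for illustration: uploads are stored as blobs alongside their SHA-256 hash (with a uniqueness constraint doing the duplicate detection), and an `imported` flag models the not-yet-imported queue. All table, column, and function names here are hypothetical, not taken from the project.

```python
import hashlib
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: file content lives in the DB, and the
    # UNIQUE hash column makes duplicate uploads a constraint violation.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS import_files (
            id       INTEGER PRIMARY KEY,
            source   TEXT NOT NULL,
            filename TEXT NOT NULL,
            content  BLOB NOT NULL,
            sha256   TEXT NOT NULL UNIQUE,
            imported INTEGER NOT NULL DEFAULT 0  -- 0 = still queued
        )
    """)


def enqueue_upload(conn: sqlite3.Connection, source: str,
                   filename: str, content: bytes) -> bool:
    """Store an upload; return False if identical content was already queued."""
    digest = hashlib.sha256(content).hexdigest()
    try:
        conn.execute(
            "INSERT INTO import_files (source, filename, content, sha256) "
            "VALUES (?, ?, ?, ?)",
            (source, filename, content, digest),
        )
        return True
    except sqlite3.IntegrityError:
        # Same hash already present: skip the duplicate.
        return False


def pending_files(conn: sqlite3.Connection):
    """The not-yet-imported queue, oldest first."""
    return conn.execute(
        "SELECT id, source, filename, content FROM import_files "
        "WHERE imported = 0 ORDER BY id"
    ).fetchall()
```

Because the dedup key is the content hash rather than the filename, re-uploading the same file under a different name is still skipped, while two distinct files from the same source both stay queued until the pipeline runs.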
This would be fixed by #480, though not exactly in the way Cris recommends; instead, my PR does some normalization on upload rather than storing a JSON blob.