-
Notifications
You must be signed in to change notification settings - Fork 16
Delete validation failures #1514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
By specifying the full file path, shutil.move overwrites destination files
Deletes or moves files on validation failure
Run delete-move-files if validation failes
Adding mock to make sure that delete-move-files is correctly ran once if validation fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. Since it would be bad for this to go wrong, @krivard please also take a look.
files_to_delete = os.listdir(export_dir) | ||
if delivery_dir is not None: | ||
for file_name in files_to_delete: | ||
if file_name.endswith(".csv") or file_name.endswith(".CSV"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can combine these cases with .lower()
Changing wording for extra precision Co-authored-by: nmdefries <[email protected]>
combining cases with .lower()
…elphi/covidcast-indicators into delete-validation-failures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking my understanding -- here's the truth table of what this code does (numbering the cases for easier reference):
case | dry_run | validation failure | export dir | delivery dir | v failure dir | action |
---|---|---|---|---|---|---|
1 | true | never | * | * | * | no effect |
2 | false | no | * | * | * | no effect |
3 | false | yes | X | X | * | throw assertion exception |
4 | false | yes | X | None | * | no effect |
5 | false | yes | X | Y != X | None | delete CSV from X |
6 | false | yes | X | Y != X | Z | move CSV from X to Z |
First, do I have this right?
Second, are we sure about case 4? If no delivery dir is set, but the validation failure dir is set, we still want to do nothing? I'm not sure if I have an opinion either way, it just seemed worth confirming as a corner case.
Otherwise looks solid, modulo a wee nit
Add early exit here for readability and to reduce empty-directory litter. something like
Added truth table to README for clarity Made delivery_dir throw an assertion if it is None (should not happen since this scenario would only happen if dry-run is False, in which case both export_dir and delivery_dir have to be specified.
Oops - this slipped between the cracks for me I'm not too sure if this is what you had, I feel like there's probably some way to improve readability so let me know what you all think! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My earlier comments have been addressed, so seems good
@krivard Could you take another look at this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Description
During runner runs, upon validation failure for indicators with validator gating, switch to either deleting the csv files stored within
export-dir
(which should only contain files from the run) or moving them to another cache folder,validation-failure-dir
.Changelog
Itemize code/test/documentation changes and files added/removed.
More info in Slack thread on system-monitoring