Add support to delete model within Predictor and Pipeline class. #647
This looks really good!
Just a few small comments.
Let's get a review from @icywang86rui
src/sagemaker/predictor.py
Outdated
    """
    for model_name in self._model_names:
        self.sagemaker_session.delete_model(model_name)
What's the desired behavior if one or more of the requests fail?
Good catch! I think we should catch the exception and tell the user the deletion is incomplete if one or more delete_model() calls fail.
src/sagemaker/predictor.py
Outdated
    try:
        for model_name in self._model_names:
            self.sagemaker_session.delete_model(model_name)
    except Exception:
        raise Exception('One or more models cannot be deleted, the deletion is incomplete.')
Say we have 3 models and the second one fails: do we still want to try to delete the third one?
src/sagemaker/predictor.py
Outdated
            request_failed = True

    if request_failed:
        raise Exception('One or more models cannot be deleted, please retry.')
Let's print the failed model names here as well.
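For reference, the behavior discussed in this thread (attempt every deletion, then report which ones failed) could look roughly like the sketch below. It is not the code merged in this PR: the free function `delete_models` and the `ModelDeletionError` class are hypothetical names used only for illustration, and `session` is assumed to be any object exposing `delete_model(name)`, as `self.sagemaker_session` does in the excerpts above.

```python
class ModelDeletionError(Exception):
    """Hypothetical exception type; the PR excerpts raise a plain Exception."""


def delete_models(session, model_names):
    """Try to delete every model, then report any that failed.

    `session` is assumed to expose delete_model(name); this helper is an
    illustration, not part of the SDK.
    """
    failed = []
    for name in model_names:
        try:
            session.delete_model(name)
        except Exception:
            # Keep going so that the remaining models are still attempted.
            failed.append(name)

    if failed:
        raise ModelDeletionError(
            "One or more models cannot be deleted, please retry. "
            "Failed models: {}".format(", ".join(failed))
        )
```

Collecting the names first and raising once at the end means a failure on the second of three models does not stop the third from being deleted, and the caller can see exactly which deletions to retry.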
Ship!
* Add list_feature_groups API (#647) * feat: Feature/get record api (#650) Co-authored-by: Eric Zou <[email protected]> * Add delete_record API (#664) * feat: Add DatasetBuilder class (#667) Co-authored-by: Eric Zou <[email protected]> * feat: Add to_csv method in DatasetBuilder (#699) * feat: Add pandas.Dataframe as base case (#708) * feat: Add with_feature_group method in DatasetBuilder (#726) * feat: Handle merge and timestamp filters (#727) * feat: Add to_dataframe method in DatasetBuilder (#729) * Address TODOs (#731) * Unit test for DatasetBuilder (#734) * fix: Fix list_feature_groups max_results (#744) * Add integration tests for create_dataset (#743) * feature: Aggregate commits * fix: as_of, event_range, join, default behavior and duplicates… (#764) * Bug fixed - as_of, event_range, join, default behavior and duplicates and tests Bugs: 1. as_of was not working properly on deleted events 2. Same event_time_range 3. Join was not working when including feature names 4. Default sql was returning only most recent, whereas it should all excluding duplicates 5. Include duplicates was not return all non-deleted data 6. instanceof(dataframe) case was also applied to non-df cases while join 7. Include column was returning unnecessary columns. * Fix on pylint error * Fix on include_duplicated_records for panda data frames * Fix format issue for black * Bug fixed related to line break * Bug fix related to dataframe and inclde_deleted_record and include_duplicated_record * Addressed comments and code refactored * changed to_csv to to_csv_file and added error messages for query limit and recent record limit * Revert a change which was not intended * Resolved the leak of feature group deletion in integration test * Added doc update for dataset builder * Fix the issue in doc Co-authored-by: Yiming Zou <[email protected]> Co-authored-by: Brandon Chatham <[email protected]> Co-authored-by: Eric Zou <[email protected]> Co-authored-by: jiapinw <[email protected]>
Issue #, if available:
#447
Description of changes:
Closing #638 in favor of this PR.
Continuation of #630
Add support to delete model within Predictor and Pipeline class.
Merge Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.