Describe the bug
When creating a model monitor and attaching a schedule using create_monitoring_schedule(), if schedule creation fails due to a validation exception, the schedule is never created, but the ModelMonitor object retains the schedule name and related attributes.
This causes two problems: you can't delete the schedule using delete_monitoring_schedule() because it does not exist, and you can't create a new one because the object considers itself already initialized.
To reproduce
Create a Model Monitor
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat
from sagemaker import get_execution_role
role = get_execution_role()
my_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)
my_monitor.suggest_baseline(
    baseline_dataset='s3://grayjh/player_data/player_data.csv',
    dataset_format=DatasetFormat.csv(header=True),
)
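Create a bad schedule:
It should fail due to a bad cron expression.
Try to recreate a valid monitor schedule.
This fails with an error saying that a schedule already exists: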
It seems that this object was already used to create an Amazon Model Monitoring Schedule. To create another, first delete the existing one using my_monitor.delete_monitoring_schedule().
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-11-63036cc6383a> in <module>()
4 statistics=my_monitor.baseline_statistics(),
5 constraints=my_monitor.suggested_constraints(),
----> 6 schedule_cron_expression=CronExpressionGenerator.hourly(),
7 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model_monitor/model_monitoring.py in create_monitoring_schedule(self, endpoint_input, record_preprocessor_script, post_analytics_processor_script, output_s3_uri, constraints, statistics, monitor_schedule_name, schedule_cron_expression, enable_cloudwatch_metrics)
1213 )
1214 print(message)
-> 1215 raise ValueError(message)
1216
1217 self.monitoring_schedule_name = self._generate_monitoring_schedule_name(
ValueError: It seems that this object was already used to create an Amazon Model Monitoring Schedule. To create another, first delete the existing one using my_monitor.delete_monitoring_schedule().
Try to delete that schedule:
my_monitor.delete_monitoring_schedule()
This also fails:
ResourceNotFound: An error occurred (ResourceNotFound) when calling the DeleteMonitoringSchedule operation: Monitoring Schedule arn:aws:sagemaker:us-east-1:210829804582:monitoring-schedule/my-monitoring-schedule1 not found
The workaround is to manually force the schedule name to be None:
my_monitor.monitoring_schedule_name = None
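Until this is fixed in the SDK, the manual reset can be wrapped in a small helper so that a failed create call never leaves a stale name behind. This is only a sketch: `create_schedule_or_reset` is a hypothetical name, not part of the SageMaker SDK. It assumes only what the traceback shows, namely that the monitor has a `create_monitoring_schedule` method and a `monitoring_schedule_name` attribute that is assigned before the schedule is actually created.

```python
def create_schedule_or_reset(monitor, **schedule_kwargs):
    """Call monitor.create_monitoring_schedule and undo stale state on failure.

    Hypothetical helper, not part of the SageMaker SDK.
    """
    # Remember whether a schedule was already attached before this call, so a
    # genuinely existing schedule is never forgotten by mistake.
    had_schedule = getattr(monitor, "monitoring_schedule_name", None) is not None
    try:
        return monitor.create_monitoring_schedule(**schedule_kwargs)
    except Exception:
        if not had_schedule:
            # The failed call may have stored a name for a schedule that was
            # never created; clear it so the next attempt starts clean.
            monitor.monitoring_schedule_name = None
        raise
```

The original exception is re-raised, so the caller still sees why the schedule could not be created. The name is cleared only when no schedule was attached before the call.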
Expected behavior
I would expect that if create_monitoring_schedule() fails, the object's attributes remain None, so that a new schedule can be created without resetting them manually.
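One way to get this behavior is to assign the schedule name only after the remote call has succeeded. The sketch below illustrates that ordering with a stand-in class. `SketchMonitor` and the `create_remote_schedule` callable are hypothetical and do not reflect the SDK's actual implementation.

```python
class SketchMonitor:
    """Stand-in for a monitor object; not the SDK's real class."""

    def __init__(self):
        self.monitoring_schedule_name = None

    def create_monitoring_schedule(self, schedule_name, create_remote_schedule):
        if self.monitoring_schedule_name is not None:
            raise ValueError("A schedule is already attached; delete it first.")
        # Make the remote call first. If it raises, the assignment below is
        # never reached and the object stays reusable.
        create_remote_schedule(schedule_name)
        self.monitoring_schedule_name = schedule_name
```

With this ordering, a failed call leaves `monitoring_schedule_name` as None, and a later call with valid arguments can proceed without any manual cleanup.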
Screenshots or logs
Will provide an example notebook with logs and repro steps.
System information
A description of your system. Please provide:
SageMaker Python SDK version: 1.65.1
Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
Framework version: N/A
Python version: Python 3 (tested using the default conda_python3 kernel on a SageMaker notebook with an updated sagemaker-python-sdk)
CPU or GPU: CPU
Custom Docker image (Y/N): N
Additional context
jmgray24 changed the title from "ModelMonitor Sclass doesn't cleanout monitor_schedule_name if it fails to create." to "ModelMonitor class doesn't cleanout monitor_schedule_name if create_monitor_schedule() fails." on Jun 24, 2020