probability_threshold_attribute in ModelQualityCheckConfig cannot be a PipelineVariable #4227

Closed
leviBernadine opened this issue Oct 24, 2023 · 4 comments · Fixed by #4353
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug

Comments

@leviBernadine

Describe the bug

probability_threshold_attribute (str or PipelineVariable): Threshold to

As a user of SageMaker Pipelines, I want to use QualityCheckStep to monitor my model for concept drift.
When configuring ModelQualityCheckConfig, I cannot set probability_threshold_attribute to a PipelineVariable (tested with both sagemaker.workflow.properties.Properties and sagemaker.workflow.parameters.Parameter).

To reproduce

from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.quality_check_step import ModelQualityCheckConfig, QualityCheckStep

# Pipeline parameter holding the threshold as a string
parameter_label_used_threshold = ParameterString(
    name="used_threshold",
    default_value="0.5",
)

model_quality_check_config = ModelQualityCheckConfig(
    problem_type="BinaryClassification",
    probability_attribute=...,
    # Passing a PipelineVariable here triggers the error
    probability_threshold_attribute=parameter_label_used_threshold,
    ground_truth_attribute=...,
    baseline_dataset=...,
    dataset_format=...,
    output_s3_uri=...,
)

baseline_monitor_model_step_process = QualityCheckStep(
    name=...,
    skip_check=...,
    register_new_baseline=...,
    quality_check_config=model_quality_check_config,
    check_job_config=...,
    model_package_group_name=...,
    cache_config=...,
    supplied_baseline_statistics=...,
    supplied_baseline_constraints=...,
)

This raises the following error:

File /opt/conda/lib/python3.10/site-packages/sagemaker/workflow/quality_check_step.py:426, in QualityCheckStep._generate_baseline_processor(self, baseline_dataset_input, baseline_output, post_processor_script_input, record_preprocessor_script_input)
    415     probability_attribute = (
    416         str(quality_check_cfg.probability_attribute)
    417         if quality_check_cfg.probability_attribute is not None
    418         else None
    419     )
    420     ground_truth_attribute = (
    421         str(quality_check_cfg.ground_truth_attribute)
    422         if quality_check_cfg.ground_truth_attribute is not None
    423         else None
    424     )
    425     probability_threshold_attr = (
--> 426         str(quality_check_cfg.probability_threshold_attribute)
    427         if quality_check_cfg.probability_threshold_attribute is not None
    428         else None
    429     )
    430     normalized_env = ModelMonitor._generate_env_map(
    431         env=self._model_monitor.env,
    432         dataset_format=quality_check_cfg.dataset_format,
   (...)
    442         probability_threshold_attribute=probability_threshold_attr,
    443     )
    445 return Processor(
    446     role=self._model_monitor.role,
    447     image_uri=self._model_monitor.image_uri,
   (...)
    459     network_config=self._model_monitor.network_config,
    460 )

File /opt/conda/lib/python3.10/site-packages/sagemaker/workflow/entities.py:86, in PipelineVariable.__str__(self)
     84 def __str__(self):
     85     """Override built-in String function for PipelineVariable"""
---> 86     raise TypeError(
     87         "Pipeline variables do not support __str__ operation. "
     88         "Please use `.to_string()` to convert it to string type in execution time"
     89         "or use `.expr` to translate it to Json for display purpose in Python SDK."
     90     )

TypeError: Pipeline variables do not support __str__ operation. Please use `.to_string()` to convert it to string type in execution timeor use `.expr` to translate it to Json for display purpose in Python SDK.
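
The traceback boils down to calling str() on an object whose __str__ deliberately raises. Below is a minimal sketch of that failure mode. FakePipelineVariable, to_env_value and try_convert are hypothetical names invented for illustration (they are not part of the SageMaker SDK), so the snippet runs without the SDK installed.

```python
class FakePipelineVariable:
    """Hypothetical stand-in for sagemaker.workflow.entities.PipelineVariable."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Mirrors the behaviour shown in the traceback: str() is rejected.
        raise TypeError("Pipeline variables do not support __str__ operation.")


def to_env_value(attribute):
    """Same shape as the failing expression in quality_check_step.py."""
    return str(attribute) if attribute is not None else None


def try_convert(attribute):
    """Return (value, error_message) so the failure can be inspected."""
    try:
        return to_env_value(attribute), None
    except TypeError as exc:
        return None, str(exc)


# Plain values convert fine; the stand-in pipeline variable does not.
plain_value, plain_error = try_convert("0.5")
variable_value, variable_error = try_convert(FakePipelineVariable("used_threshold"))
```

A plain string or None passes through the expression without trouble; only the pipeline variable hits the TypeError, which is why the step fails at definition time rather than at execution time.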

Expected behavior
probability_threshold_attribute should accept a PipelineVariable.

System information
SageMaker Python SDK version: 2.194.0
Python version: 3.10.6

@qidewenwhen qidewenwhen added component: pipelines Relates to the SageMaker Pipeline Platform type: bug labels Oct 27, 2023
@qidewenwhen
Member

qidewenwhen commented Oct 28, 2023

Adding a note:
The error was from here:

probability_threshold_attr = (
    str(quality_check_cfg.probability_threshold_attribute)
    if quality_check_cfg.probability_threshold_attribute is not None
    else None
)

This is a bug. We may need to add the following fix, but it needs a proper integ test to prove that probability_threshold_attribute is parametrizable:

# Assumes is_pipeline_variable (sagemaker.workflow), Parameter and ParameterString
# (sagemaker.workflow.parameters) are imported, plus a module-level logger.
threshold_attribute = quality_check_cfg.probability_threshold_attribute
if threshold_attribute is None:
    probability_threshold_attr = None
elif is_pipeline_variable(threshold_attribute):
    # Among Parameter types, only ParameterString can resolve to a string at run time.
    if isinstance(threshold_attribute, Parameter) and not isinstance(threshold_attribute, ParameterString):
        raise ValueError("... probability_threshold_attribute cannot be Parameter other than ParameterString")

    logger.warning("probability_threshold_attribute's runtime value must be string type ...")
    # Pass the variable through unchanged so it is resolved at execution time.
    probability_threshold_attr = threshold_attribute
else:
    probability_threshold_attr = str(threshold_attribute)
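
To see how the proposed branching behaves, here is a self-contained sketch that can be run without the SDK. All classes below are hypothetical stand-ins that only mimic the relevant SDK types, resolve_threshold_attr is an invented helper name, and treating only None as "unset" is an assumption of this sketch. The actual fix may differ.

```python
import logging

logger = logging.getLogger(__name__)


class PipelineVariable:
    """Hypothetical stand-in: str() is rejected, as in the traceback."""

    def __str__(self):
        raise TypeError("Pipeline variables do not support __str__ operation.")


class Parameter(PipelineVariable):
    def __init__(self, name):
        self.name = name


class ParameterString(Parameter):
    pass


class ParameterFloat(Parameter):
    pass


def is_pipeline_variable(value):
    """Stand-in for the SDK helper of the same name."""
    return isinstance(value, PipelineVariable)


def resolve_threshold_attr(value):
    """Return None, the untouched pipeline variable, or a plain string."""
    if value is None:
        return None
    if is_pipeline_variable(value):
        # Reject typed parameters that cannot yield a string at run time.
        if isinstance(value, Parameter) and not isinstance(value, ParameterString):
            raise ValueError(
                "probability_threshold_attribute cannot be a Parameter "
                "other than ParameterString"
            )
        logger.warning("probability_threshold_attribute's runtime value must be a string")
        return value  # left unresolved until execution time
    return str(value)


threshold_param = ParameterString("used_threshold")
resolved = resolve_threshold_attr(threshold_param)
```

The key design point is that str() is only ever applied to plain Python values; a pipeline variable is handed on as-is, so the sketch never trips the TypeError from the original report.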

@qidewenwhen
Member

Tracking the fix in backlog

@leviBernadine
Author

Thank you very much.
Do you have any updates on this issue, please?

@qidewenwhen
Member

Hi @leviBernadine, sorry for the delay. I just published a fix for the issue.
Reaching out to the team to review it.
