Skip to content

feature: Add support for Partial Dependence Plots(PDP) in SageMaker Clarify #2716

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 29, 2021

Conversation

keerthanvasist
Copy link
Member

Description of changes:
Add support for Partial Dependence Plots(PDP) in SageMaker Clarify

Testing done:
Added unit and integration tests.

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backword compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 13c3b7d
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 13c3b7d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 13c3b7d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 8c40262
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 8c40262
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 8c40262
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

1 similar comment
@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 8c40262
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

karan6181
karan6181 previously approved these changes Oct 20, 2021
Copy link
Contributor

@karan6181 karan6181 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

tox.ini Outdated
@@ -140,7 +140,7 @@ setenv =
LANG=C.UTF-8
deps = black
commands =
black -l 100 --check ./
black -l 100 ./
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it by mistake ? The removal of --check ?

jeniyat
jeniyat previously approved these changes Oct 20, 2021
Copy link
Contributor

@jeniyat jeniyat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: b794c09
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 261a3a6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: b794c09
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: b794c09
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 261a3a6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 261a3a6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

analysis_config["methods"] = explainability_config.get_explainability_config()

explainability_methods = {}
if isinstance(explainability_config, List): # pylint: disable=W1116
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be "list", List is from typing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -772,7 +828,8 @@ def run_explainability(
model_config (:class:`~sagemaker.clarify.ModelConfig`): Config of the model and its
endpoint to be created.
explainability_config (:class:`~sagemaker.clarify.ExplainabilityConfig`): Config of the
specific explainability method. Currently, only SHAP is supported.
specific explainability method or a list of ExplainabilityConfig objects. Currently,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you document that it can be a list also?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -258,6 +262,57 @@ def test_shap(clarify_processor, data_config, model_config, shap_config, sagemak
check_analysis_config(data_config, sagemaker_session, "shap")


def test_pdp(clarify_processor, data_config, model_config, pdp_config, sagemaker_session):
with timeout.timeout(minutes=CLARIFY_DEFAULT_TIMEOUT_MINUTES):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this running a real job?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I removed the new integration tests added.

tox.ini Outdated
@@ -46,7 +46,7 @@ ignore =
require-code = True

[doc8]
ignore-path=.tox,src/sagemaker.egg-info
ignore-path=.tox,src/sagemaker.egg-info,venv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this change should be removed. I guess you use this venv folder for Python virtual env, it is your own setup, not part of the standard setup recommended by contribution guide. (As a workaround you can use a different path, like ~/.venv)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, please remove this change.

Copy link
Member Author

@keerthanvasist keerthanvasist Oct 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this change. I chose to leave it there because a venv is excluded for other commands in the same file, and in gitignore as well. It would have been a harmless convenience

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

venv is pretty common, maybe make it a separate commit?

@@ -54,7 +56,11 @@ def __init__(
"ShardedByS3Key".
s3_compression_type (str): Valid options are "None" or "Gzip".
"""
if dataset_type not in ["text/csv", "application/jsonlines", "application/x-parquet"]:
if dataset_type not in [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment: my suggestion is that you undo code formatting like because they are not necessary and distract the reviewers, ideally you should only make necessary changes for the feature. (If the code is formatted automatically by your IDE then reset the IDE settings)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this for now. There are multiple formatting checks (pylint, black etc) which check on push too. It is hard for developer also to figure out which change was cause by IDE and caused by formatters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then try harder not to let your IDE cause trouble.


def get_explainability_config(self):
"""Returns config."""
return {"pdp": copy.deepcopy(self.pdp_config)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a copy of the entire return value. (so that any modification to the return value by the caller won't impact the PDPConfig object.)

copy.deepcopy({"pdp": self.pdp_config})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated this, although I see no difference between the two.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is. In below case do you expect config_dict2 to have the update made for config_dict?

config_dict = pdp_config.get_explainability_config()
config_dict["pdp"]["asdf"] = 1
config_dict2 = pdp_config.get_explainability_config()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. My concern was the caller may update the return value and impact the original class, but in this case there seems no impact. It’s fine as lone as there is no impact.

Comment on lines 873 to 876
if not isinstance(config, ExplainabilityConfig):
raise ValueError(
f"Invalid input: Excepted ExplainabilityConfig, got {type(config)} instead"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the isinstance(config, ExplainabilityConfig) check allows raising a better error/message to the customer, it is not necessary because if explainability_config is invalid then explainability_config.get_explainability_config() should fail anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

explainability_methods[list(explain_config.keys())[0]] = explain_config[
list(explain_config.keys())[0]
]
elif isinstance(explainability_config, ExplainabilityConfig):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be simply a "else:", then the code fail fast here if an invalid explainability_config parameter is passed to the method. Better than sending an empty "methods" list to server side for validation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 878 to 880
explainability_methods[list(explain_config.keys())[0]] = explain_config[
list(explain_config.keys())[0]
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explainability_methods.update(explain_config) should do the job.

And there should be validations after the for loop

  • validate the length of the result it shall not be zero
  • validate the length of the result to make sure that there was no overwrite,
assert len(explainability_methods.keys())>0, "Please provide at least one explaianbility config."
assert len(explainability_methods.keys())==len(explainability_config) "There are duplicate explainability configs"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -574,6 +598,34 @@ def _run_test_shap(
)


@patch("sagemaker.utils.name_from_base", return_value=JOB_NAME)
def test_pdp(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally there should also be test cases for invalid input, for example:

  • explainability_config is None
  • explainability_config is an emtpy list
  • Multiple shap_config (or pdp_config) in explainability_config

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Additional tests are added.

Comment on lines 286 to 289
def test_shap_and_pdp(
clarify_processor, data_config, model_config, shap_config, pdp_config, sagemaker_session
):
with timeout.timeout(minutes=CLARIFY_DEFAULT_TIMEOUT_MINUTES):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I remember pdp is time consuming as well as shap, if we run both in the same job then would the job runtime be close to the CLARIFY_DEFAULT_TIMEOUT_MINUTES? Considering that the time to launch processing instances and create shadow endpoint is not a fixed value, we should leave enough buffer for them to avoid flaky test. My suggestion is that you run this test case using your indivisual account to see the runtime and make a decision (on keeping the timeout value or increase it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing to SM PySDK.

15 Minutes timeout is a very long time for an integ test run. What is the actual runtime expected here?

We should be looking into ways to reduce this runtime and be frugal here. These integ tests currently run for every PR to PySDK and would add that much of a buffer causing our PR's to be queued up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did run these tests, and they pass within the timeout, but I also don't think they are super useful. So, I did away with all the new integ tests I added in this CR. Unit tests cover the changes I have made well. The only change is the what analysis_config is to be generated. Beyond that, clarify container integration tests will cover it.

If this is okay, I am also okay with this. However, we cannot reduce the runtime of these integration tests because bringing up a processing job takes time.

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 4bf250b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 4bf250b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 4bf250b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 1db1ff8
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 1db1ff8
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 1db1ff8
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Comment on lines 872 to 877
assert (
len(explainability_config) > 0
), "Please provide at least one explaianbility config."
for config in explainability_config:
explain_config = config.get_explainability_config()
explainability_methods.update(explain_config)
assert len(explainability_methods.keys()) == len(
explainability_config
), "There are duplicate explainability configs"
else:
Copy link
Member

@mufaddal-rohawala mufaddal-rohawala Oct 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to raise ValueErrors or some other meaningful errors here instead of AssertionError. AssertionError belongs to cases where you test inner behaviour and not make user-sanity checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Parking this PR until a fix goes in. Will update again next week.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. This CR now ready.

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 65bc046
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: aaece32
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 65bc046
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 65bc046
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: aaece32
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: aaece32
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 8d57329
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Contributor

@jeniyat jeniyat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@navinsoni navinsoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as all comments are addressed

@navinsoni navinsoni merged commit aae4c57 into aws:master Oct 29, 2021
EthanShouhanCheng pushed a commit to SissiChenxy/sagemaker-python-sdk that referenced this pull request Jan 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants