Skip to content

feature: Added support for feature group schema change and feature parameters #3206

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 9, 2022

Conversation

mizanfiu
Copy link
Contributor

@mizanfiu mizanfiu commented Jun 30, 2022

Issue #, if available:

Description of changes:
Added support for feature group schema change and feature parameters

Testing done:
Integration test

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backword compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-commenter
Copy link

codecov-commenter commented Jun 30, 2022

Codecov Report

Merging #3206 (752eb24) into master (4879ec3) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #3206      +/-   ##
==========================================
+ Coverage   88.82%   88.84%   +0.01%     
==========================================
  Files         203      203              
  Lines       17913    17938      +25     
==========================================
+ Hits        15912    15937      +25     
  Misses       2001     2001              
Impacted Files Coverage Δ
src/sagemaker/feature_store/feature_group.py 76.52% <100.00%> (+0.68%) ⬆️
src/sagemaker/feature_store/inputs.py 98.33% <100.00%> (+0.18%) ⬆️
src/sagemaker/session.py 71.49% <100.00%> (+0.28%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4879ec3...752eb24. Read the comment docs.

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 2c1170f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 2c1170f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 2c1170f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 2c1170f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 2c1170f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mizanfiu mizanfiu marked this pull request as ready for review June 30, 2022 18:42
@@ -537,6 +538,66 @@ def describe(self, next_token: str = None) -> Dict[str, Any]:
feature_group_name=self.name, next_token=next_token
)

def update(self, feature_additions: Sequence[FeatureDefinition]) -> Dict[str, Any]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feature_definitions is one field in FeatureGroup class. Do we need to append the new features in this field?

Copy link
Contributor Author

@mizanfiu mizanfiu Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a very good idea. I think we should. I'll submit a new version tomorrow.
[Not sure why this is not displayed in File Changed section. Thats why missed this one.]

Copy link
Contributor Author

@mizanfiu mizanfiu Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually after a second thought, I think let's skip this and keep it as it is. Because, that is also true for update_feature_metadata. Plus, if we append what's the value? In any way, this feature definition is used in backend for schema validation, in the studio side, customers are not actively using the definitions for any purpose.

Copy link
Contributor

@imingtsou imingtsou Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FeatureDefinition only has FeatureName and FeatureType. UpdateFeatureMetadata should not update this field. We should append the new features in this field.

Copy link
Contributor Author

@mizanfiu mizanfiu Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should, if this update API is completed synchronously. But in reality, that's not the case. There are two parts in the whole update procedure. First this API call which is synchronous and then, asynchronous workflow. Let's assume this synchronous API succeeds and we append the feature definitions BUT what happens if the asynchronous workflow fails? Then customers will get false impressions that features are actually added in the backend.

Lets keep things simple and the same behavior as of now. If customers want to know they can call describe API.

Later we can publish another SDK method, which will be a combined version of sync + async which will append features in the class as well. But for now, this should be enough. Makes sense?

Returns:
Response dict from service.
"""
return self.sagemaker_session.update_feature_metadata(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we verify the feature is in feature_definitions before calling API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets depend on backend for that, lets not have that check here. I am hoping backend will react immediately and response with the error message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what kind of error is thrown? Can we write an integ test to make sure that's caught in the code correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think thats a good idea to check that exception in SDK, backend should have that check both in unit and integration tests, having the same here would be a redundant.

Comment on lines 583 to 584
if parameter_additions is not None
else [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this if/else?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameter_additions = [parameter_addition.to_dict() for parameter_addition in (parameter_additions or [])]

Unless you have code gaurantines that parameter_additions are not None.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imingtsou parameter_additions can be None and parameter_addition in parameter_additions would not work on None type. Error will be TypeError: 'NoneType' object is not iterable. Please see unit tests for different use cases.
@XuetaoYin's code snippet would work but that is analogous to current implementation. Just a different style or personal preference.

Comment on lines +4125 to +4137
request = {
"FeatureGroupName": feature_group_name,
"FeatureName": feature_name,
}

if description is not None:
request["Description"] = description
if parameter_additions is not None:
request["ParameterAdditions"] = parameter_additions
if parameter_removals is not None:
request["ParameterRemovals"] = parameter_removals

return self.sagemaker_client.update_feature_metadata(**request)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to process the request?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because any field can be None and that needs a check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we call the API with None value? I assume we can do something like other APIs?

self.sagemaker_client.update_feature_metadata(
    FeatureGroupName=featureGroupName, 
    FeatureName=featureName,
    ...
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this for a purpose. I exactly don't remember at this moment why, but I think either I was getting error or if I did not do this, description got removed for none. If you want I can double check. But i dont think its blocking. Can we move forward?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can close this one. Please address the above comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Let me know if you have any.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have this one opened:
feature_definitions is one field in FeatureGroup class. Do we need to append the new features in this field?

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 250fa5b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 250fa5b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 250fa5b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 250fa5b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 250fa5b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Comment on lines 580 to 584
parameter_additions=[
parameter_addition.to_dict() for parameter_addition in parameter_additions
]
if parameter_additions is not None
else [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this outside return in a varible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored the code a bit different way for more readability, please check and let me know.

]
if parameter_additions is not None
else [],
parameter_removals=parameter_removals if parameter_removals is not None else [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored.

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 73a77e6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 752eb24
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 752eb24
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 73a77e6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 752eb24
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 73a77e6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 752eb24
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 73a77e6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 752eb24
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@navinsoni navinsoni changed the title Added support for feature group schema change and feature parameters feature: Added support for feature group schema change and feature parameters Jul 9, 2022
@navinsoni navinsoni merged commit 9369a87 into aws:master Jul 9, 2022
jerrypeng7773 pushed a commit to jerrypeng7773/sagemaker-python-sdk that referenced this pull request Aug 8, 2022
JoseJuan98 pushed a commit to JoseJuan98/sagemaker-python-sdk that referenced this pull request Mar 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants