-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feature: Added support for feature group schema change and feature parameters #3206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
2c1170f
250fa5b
73a77e6
752eb24
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -53,6 +53,7 @@ | |
OfflineStoreConfig, | ||
DataCatalogConfig, | ||
FeatureValue, | ||
FeatureParameter, | ||
) | ||
|
||
logger = logging.getLogger(__name__) | ||
|
@@ -537,6 +538,66 @@ def describe(self, next_token: str = None) -> Dict[str, Any]: | |
feature_group_name=self.name, next_token=next_token | ||
) | ||
|
||
def update(self, feature_additions: Sequence[FeatureDefinition]) -> Dict[str, Any]: | ||
"""Update a FeatureGroup and add new features from the given feature definitions. | ||
|
||
Args: | ||
feature_additions (Sequence[Dict[str, str]): list of feature definitions to be updated. | ||
|
||
Returns: | ||
Response dict from service. | ||
""" | ||
|
||
return self.sagemaker_session.update_feature_group( | ||
feature_group_name=self.name, | ||
feature_additions=[ | ||
feature_addition.to_dict() for feature_addition in feature_additions | ||
], | ||
) | ||
|
||
def update_feature_metadata( | ||
self, | ||
feature_name: str, | ||
description: str = None, | ||
parameter_additions: Sequence[FeatureParameter] = None, | ||
parameter_removals: Sequence[str] = None, | ||
) -> Dict[str, Any]: | ||
"""Update a feature metadata and add/remove metadata. | ||
|
||
Args: | ||
feature_name (str): name of the feature to update. | ||
description (str): description of the feature to update. | ||
parameter_additions (Sequence[Dict[str, str]): list of feature parameter to be added. | ||
parameter_removals (Sequence[str]): list of feature parameter key to be removed. | ||
|
||
Returns: | ||
Response dict from service. | ||
""" | ||
return self.sagemaker_session.update_feature_metadata( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shall we verify the feature is in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Lets depend on backend for that, lets not have that check here. I am hoping backend will react immediately and response with the error message. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what kind of error is thrown? Can we write an integ test to make sure that's caught in the code correctly? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I dont think thats a good idea to check that exception in SDK, backend should have that check both in unit and integration tests, having the same here would be a redundant. |
||
feature_group_name=self.name, | ||
feature_name=feature_name, | ||
description=description, | ||
parameter_additions=[ | ||
parameter_addition.to_dict() for parameter_addition in parameter_additions | ||
] | ||
if parameter_additions is not None | ||
else [], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we need this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. parameter_additions = [parameter_addition.to_dict() for parameter_addition in (parameter_additions or [])] Unless you have code gaurantines that parameter_additions are not None. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @imingtsou parameter_additions can be None and There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we do this outside return in a varible There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I refactored the code a bit different way for more readability, please check and let me know. |
||
parameter_removals=parameter_removals if parameter_removals is not None else [], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Refactored. |
||
) | ||
|
||
def describe_feature_metadata(self, feature_name: str) -> Dict[str, Any]: | ||
"""Describe feature metadata by feature name. | ||
|
||
Args: | ||
feature_name (str): name of the feature. | ||
Returns: | ||
Response dict from service. | ||
""" | ||
|
||
return self.sagemaker_session.describe_feature_metadata( | ||
feature_group_name=self.name, feature_name=feature_name | ||
) | ||
|
||
def load_feature_definitions( | ||
self, | ||
data_frame: DataFrame, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4076,7 +4076,7 @@ def describe_feature_group( | |
"""Describe a FeatureGroup by name in FeatureStore service. | ||
|
||
Args: | ||
feature_group_name (str): name of the FeatureGroup to descibe. | ||
feature_group_name (str): name of the FeatureGroup to describe. | ||
next_token (str): next_token to get next page of features. | ||
Returns: | ||
Response dict from service. | ||
|
@@ -4086,6 +4086,72 @@ def describe_feature_group( | |
update_args(kwargs, NextToken=next_token) | ||
return self.sagemaker_client.describe_feature_group(**kwargs) | ||
|
||
def update_feature_group( | ||
self, feature_group_name: str, feature_additions: Sequence[Dict[str, str]] | ||
) -> Dict[str, Any]: | ||
"""Update a FeatureGroup and add new features from the given feature definitions. | ||
|
||
Args: | ||
feature_group_name (str): name of the FeatureGroup to update. | ||
feature_additions (Sequence[Dict[str, str]): list of feature definitions to be updated. | ||
Returns: | ||
Response dict from service. | ||
""" | ||
|
||
return self.sagemaker_client.update_feature_group( | ||
FeatureGroupName=feature_group_name, FeatureAdditions=feature_additions | ||
) | ||
|
||
def update_feature_metadata( | ||
self, | ||
feature_group_name: str, | ||
feature_name: str, | ||
description: str = None, | ||
parameter_additions: Sequence[Dict[str, str]] = None, | ||
parameter_removals: Sequence[str] = None, | ||
) -> Dict[str, Any]: | ||
"""Update a feature metadata and add/remove metadata. | ||
|
||
Args: | ||
feature_group_name (str): name of the FeatureGroup to update. | ||
feature_name (str): name of the feature to update. | ||
description (str): description of the feature to update. | ||
parameter_additions (Sequence[Dict[str, str]): list of feature parameter to be added. | ||
parameter_removals (Sequence[Dict[str, str]): list of feature parameter to be removed. | ||
Returns: | ||
Response dict from service. | ||
""" | ||
|
||
request = { | ||
"FeatureGroupName": feature_group_name, | ||
"FeatureName": feature_name, | ||
} | ||
|
||
if description is not None: | ||
request["Description"] = description | ||
if parameter_additions is not None: | ||
request["ParameterAdditions"] = parameter_additions | ||
if parameter_removals is not None: | ||
request["ParameterRemovals"] = parameter_removals | ||
|
||
return self.sagemaker_client.update_feature_metadata(**request) | ||
Comment on lines
+4134
to
+4146
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we need to process the request? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because any field can be None and that needs a check. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shall we call the API with None value? I assume we can do something like other APIs?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I did this for a purpose. I exactly don't remember at this moment why, but I think either I was getting error or if I did not do this, description got removed for none. If you want I can double check. But i dont think its blocking. Can we move forward? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can close this one. Please address the above comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. Let me know if you have any. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I still have this one opened: |
||
|
||
def describe_feature_metadata( | ||
self, feature_group_name: str, feature_name: str | ||
) -> Dict[str, Any]: | ||
"""Describe feature metadata by feature name in FeatureStore service. | ||
|
||
Args: | ||
feature_group_name (str): name of the FeatureGroup. | ||
feature_name (str): name of the feature. | ||
Returns: | ||
Response dict from service. | ||
""" | ||
|
||
return self.sagemaker_client.describe_feature_metadata( | ||
FeatureGroupName=feature_group_name, FeatureName=feature_name | ||
) | ||
|
||
def put_record( | ||
self, | ||
feature_group_name: str, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feature_definitions
is one field in FeatureGroup class. Do we need to append the new features in this field?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a very good idea. I think we should. I'll submit a new version tomorrow.
[Not sure why this is not displayed in
File Changed
section. Thats why missed this one.]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually after a second thought, I think let's skip this and keep it as it is. Because, that is also true for
update_feature_metadata
. Plus, if we append what's the value? In any way, this feature definition is used in backend for schema validation, in the studio side, customers are not actively using the definitions for any purpose.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FeatureDefinition only has FeatureName and FeatureType. UpdateFeatureMetadata should not update this field. We should append the new features in this field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should, if this update API is completed synchronously. But in reality, that's not the case. There are two parts in the whole update procedure. First this API call which is synchronous and then, asynchronous workflow. Let's assume this synchronous API succeeds and we append the feature definitions BUT what happens if the asynchronous workflow fails? Then customers will get false impressions that features are actually added in the backend.
Lets keep things simple and the same behavior as of now. If customers want to know they can call describe API.
Later we can publish another SDK method, which will be a combined version of sync + async which will append features in the class as well. But for now, this should be enough. Makes sense?