Commit effed64

updates docstrings for datasetbuilder members
1 parent 5ca7f28 commit effed64

1 file changed

+27
-18
lines changed

src/sagemaker/feature_store/dataset_builder.py

@@ -171,24 +171,33 @@ class DatasetBuilder:
         _event_time_identifier_feature_name (str): A string representing the event time identifier
             feature if base is a DataFrame (default: None).
         _included_feature_names (List[str]): A list of strings representing features to be
-            included in the output (default: None).
-        _kms_key_id (str): An KMS key id. If set, will be used to encrypt the result file
+            included in the output. If not set, all features will be included in the output.
             (default: None).
-        _point_in_time_accurate_join (bool): A boolean representing whether using point in time join
-            or not (default: False).
-        _include_duplicated_records (bool): A boolean representing whether including duplicated
-            records or not (default: False).
-        _include_deleted_records (bool): A boolean representing whether including deleted records or
-            not (default: False).
-        _number_of_recent_records (int): An int that how many records will be returned for each
-            record identifier (default: 1).
-        _number_of_records (int): An int that how many records will be returned (default: None).
-        _write_time_ending_timestamp (datetime.datetime): A datetime that all records' write time in
-            dataset will be before it (default: None).
-        _event_time_starting_timestamp (datetime.datetime): A datetime that all records' event time
-            in dataset will be after it (default: None).
-        _event_time_ending_timestamp (datetime.datetime): A datetime that all records' event time in
-            dataset will be before it (default: None).
+        _kms_key_id (str): A KMS key id. If set, will be used to encrypt the result file
+            (default: None).
+        _point_in_time_accurate_join (bool): A boolean representing if point-in-time join
+            is applied to the resulting dataframe when calling "to_dataframe".
+            When set to True, users can retrieve data using “row-level time travel”
+            according to the event times provided to the DatasetBuilder. This requires that the
+            entity dataframe with event times is submitted as the base in the constructor
+            (default: False).
+        _include_duplicated_records (bool): A boolean representing whether the resulting dataframe
+            when calling "to_dataframe" should include duplicated records (default: False).
+        _include_deleted_records (bool): A boolean representing whether the resulting
+            dataframe when calling "to_dataframe" should include deleted records (default: False).
+        _number_of_recent_records (int): An integer representing how many records will be
+            returned for each record identifier (default: 1).
+        _number_of_records (int): An integer representing the number of records that should be
+            returned in the resulting dataframe when calling "to_dataframe" (default: None).
+        _write_time_ending_timestamp (datetime.datetime): A datetime that represents the latest
+            write time for a record to be included in the resulting dataset. Records with a
+            newer write time will be omitted from the resulting dataset. (default: None).
+        _event_time_starting_timestamp (datetime.datetime): A datetime that represents the earliest
+            event time for a record to be included in the resulting dataset. Records
+            with an older event time will be omitted from the resulting dataset. (default: None).
+        _event_time_ending_timestamp (datetime.datetime): A datetime that represents the latest
+            event time for a record to be included in the resulting dataset. Records
+            with a newer event time will be omitted from the resulting dataset. (default: None).
         _feature_groups_to_be_merged (List[FeatureGroupToBeMerged]): A list of
             FeatureGroupToBeMerged which will be joined to base (default: []).
         _event_time_identifier_feature_type (FeatureTypeEnum): A FeatureTypeEnum representing the
@@ -247,7 +256,7 @@ def with_feature_group(
         return self
 
     def point_in_time_accurate_join(self):
-        """Set join type as point in time accurate join.
+        """Enable point-in-time accurate join.
 
         Returns:
             This DatasetBuilder object.
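
The filtering semantics described in the updated docstrings can be illustrated with a small plain-Python sketch. This is not the library's implementation and does not call the SageMaker SDK; the `Record` class and the `select_records` and `point_in_time_lookup` helpers are hypothetical names introduced only for illustration. It assumes that the timestamp bounds are inclusive (the docstrings only say that newer or older records are omitted) and that "recent" means most recent by event time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Record:
    """Hypothetical stand-in for one feature group row."""
    record_id: str
    event_time: datetime
    write_time: datetime
    value: float


def select_records(
    records: List[Record],
    write_time_ending: Optional[datetime] = None,
    event_time_starting: Optional[datetime] = None,
    event_time_ending: Optional[datetime] = None,
    number_of_recent_records: Optional[int] = 1,
) -> List[Record]:
    """Apply the time filters, then keep the N newest records per identifier."""
    kept = []
    for r in records:
        # Assumption: bounds are inclusive; only strictly newer/older rows drop out.
        if write_time_ending is not None and r.write_time > write_time_ending:
            continue
        if event_time_starting is not None and r.event_time < event_time_starting:
            continue
        if event_time_ending is not None and r.event_time > event_time_ending:
            continue
        kept.append(r)

    # Group by record identifier, newest event time first within each group.
    by_id: Dict[str, List[Record]] = {}
    for r in sorted(kept, key=lambda rec: rec.event_time, reverse=True):
        by_id.setdefault(r.record_id, []).append(r)

    result: List[Record] = []
    for record_id in sorted(by_id):
        group = by_id[record_id]
        if number_of_recent_records is not None:
            group = group[:number_of_recent_records]
        result.extend(group)
    return result


def point_in_time_lookup(
    records: List[Record], record_id: str, as_of: datetime
) -> Optional[Record]:
    """Return the newest record for record_id whose event time is not after as_of."""
    candidates = [
        r for r in records if r.record_id == record_id and r.event_time <= as_of
    ]
    return max(candidates, key=lambda rec: rec.event_time, default=None)


# Made-up sample data for the usage example.
records = [
    Record("a", datetime(2023, 1, 1), datetime(2023, 1, 2), 1.0),
    Record("a", datetime(2023, 2, 1), datetime(2023, 2, 2), 2.0),
    Record("a", datetime(2023, 3, 1), datetime(2023, 3, 2), 3.0),
    Record("b", datetime(2023, 2, 15), datetime(2023, 2, 16), 10.0),
]

latest = select_records(records, event_time_ending=datetime(2023, 2, 20))
print([r.value for r in latest])
row = point_in_time_lookup(records, "a", datetime(2023, 2, 10))
print(row.value if row else None)
```

In this sketch, `point_in_time_lookup` plays the role of the "row-level time travel" mentioned in the docstring: each row of the base entity dataframe would supply its own `as_of` event time, so no feature value recorded after that time leaks into the result.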
