Running multiple Transform jobs with same Transformer uses same output_path for all transform jobs #691

Closed
andremoeller opened this issue Mar 9, 2019 · 3 comments · Fixed by #905

Comments

@andremoeller
Contributor

Please fill out the form below.

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): TensorFlow
  • Framework Version: 1.12.0
  • Python Version: 3
  • CPU or GPU: CPU
  • Python SDK Version: sagemaker==1.16.3
  • Are you using a custom image: No.

Describe the problem

If I run more than one transform job with the same Transformer, the outputs are all written to the same S3 output path, even though the jobs are different. Outputs can therefore collide and be overwritten, or I end up with outputs from multiple jobs under the same S3 output path.

Minimal repro / logs

I'm running tensorflow_batch_transform_mnist.ipynb.

You can reproduce this by doing:
transformer.transform([your batch s3 input path], content_type='text/csv')
print(transformer.output_path)

transformer.transform([some other batch s3 input path], content_type='text/csv')
print(transformer.output_path)

The value of transformer.output_path is the same after both calls, even though the jobs are different.
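Until the default changes, one way to sidestep the collision is to give every job its own output location. The following is only a minimal sketch under stated assumptions: `unique_output_path` is a hypothetical helper (not part of the SDK), and the bucket name and prefix are placeholders.

```python
import uuid


def unique_output_path(bucket, prefix="batch-transform"):
    """Return an S3 URI that ends in a fresh random suffix.

    Hypothetical helper for illustration; it only builds a string and
    does not talk to S3 or SageMaker.
    """
    suffix = uuid.uuid4().hex[:12]  # short random component per call
    return "s3://{}/{}/{}".format(bucket, prefix, suffix)


# Two calls give two different prefixes, so two jobs would not share output.
path_a = unique_output_path("my-example-bucket")
path_b = unique_output_path("my-example-bucket")
print(path_a)
print(path_b)
```

Each generated path could then be supplied as `output_path` when constructing a separate Transformer for each job.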

@nadiaya
Contributor

nadiaya commented Mar 11, 2019

The current behavior is to pass the platform the S3 location provided by the user or, if none was provided, to create a job_id-based folder inside the default S3 bucket (similar to what the SageMaker platform does for training output):
https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/transformer.py#L111-L112

Moving this issue to the platform team to answer why the collision use case is not addressed at the platform level.

@andremoeller
Contributor Author

The issue is:

If I run one job with the default generated name "sagemaker-tensorflow-1", its output gets written to "s3:/...../sagemaker-tensorflow-1/".

If I then run another job with the same transformer and the default generated name "sagemaker-tensorflow-2", its output gets written to "s3:/...../sagemaker-tensorflow-1" instead of "s3:/...../sagemaker-tensorflow-2". In other words, the Python SDK tells the platform to write output from both jobs to the same output location. I don't think that's expected (or good for users), regardless of what the platform does. Wouldn't it make more sense to default to telling the platform to write to a prefix with the job name?
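The behavior described above can be modelled with a toy class. This is a simplified stand-in written for illustration, not the SDK's actual code: the class names, the counter-based job names, and the bucket name are all assumptions. The first class stores the default path once and reuses it; the second recomputes the default for every job unless the user supplied a path.

```python
class CachingTransformer:
    """Toy model (not the SDK class): the default output path is set once."""

    def __init__(self, bucket, output_path=None):
        self.bucket = bucket
        self.output_path = output_path
        self._job_number = 0

    def _next_job_name(self):
        self._job_number += 1
        return "sagemaker-tensorflow-{}".format(self._job_number)

    def transform(self, data):
        job_name = self._next_job_name()
        # Only filled in when still unset, so later jobs reuse the first value.
        if self.output_path is None:
            self.output_path = "s3://{}/{}".format(self.bucket, job_name)
        return job_name, self.output_path


class PerJobTransformer(CachingTransformer):
    """Toy model of the suggested default: one prefix per job name."""

    def __init__(self, bucket, output_path=None):
        super().__init__(bucket, output_path)
        self._user_output_path = output_path  # remember what the user asked for

    def transform(self, data):
        job_name = self._next_job_name()
        # Recompute the default each time unless the user chose a path.
        if self._user_output_path is None:
            self.output_path = "s3://{}/{}".format(self.bucket, job_name)
        return job_name, self.output_path


caching = CachingTransformer("example-bucket")
first = caching.transform("input-a")
second = caching.transform("input-b")

per_job = PerJobTransformer("example-bucket")
third = per_job.transform("input-a")
fourth = per_job.transform("input-b")
```

In the first model both jobs share one path; in the second each job gets a prefix matching its own name, while an explicitly provided path is still honored.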

@zmjjmz

zmjjmz commented Jun 3, 2019

I'd like to add that I think this is a little weird from an API perspective, since the CreateTransformJob call, which the Transformer.transform method presumably corresponds to, takes a TransformOutput as an argument. Is there a specific reason that the output path has to be provided when the Transformer object is created?
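To illustrate the point about the underlying API, here is a rough sketch of a per-job CreateTransformJob request in which TransformOutput is derived from the job name. The helper `build_transform_job_request` is hypothetical, and the job names, model name, bucket names, and instance type are placeholder assumptions; the code only builds the request dictionary and makes no AWS call.

```python
def build_transform_job_request(job_name, model_name, input_s3_uri,
                                output_bucket, instance_type="ml.m4.xlarge",
                                instance_count=1):
    """Build keyword arguments for a CreateTransformJob request.

    Hypothetical helper: the output location is chosen per job,
    under a prefix named after the job.
    """
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": input_s3_uri,
                }
            },
            "ContentType": "text/csv",
        },
        # The output path is part of each request, not of a long-lived object.
        "TransformOutput": {
            "S3OutputPath": "s3://{}/{}".format(output_bucket, job_name),
        },
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


request_1 = build_transform_job_request(
    "example-job-1", "example-model", "s3://example-input/a", "example-output")
request_2 = build_transform_job_request(
    "example-job-2", "example-model", "s3://example-input/b", "example-output")

# With credentials configured, a request could be submitted with:
#   boto3.client("sagemaker").create_transform_job(**request_1)
```

Because the output location travels with each request, two jobs built this way never share a prefix unless their names are the same.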

qidewenwhen added a commit to qidewenwhen/sagemaker-python-sdk that referenced this issue Dec 13, 2022
* feature: Add Experiment helper classes (aws#646)

* feature: Add Experiment helper classes

feature: Add helper class _RunEnvironment

* change: Change sleep retry to backoff retry for get TC

* minor fixes in backoff retry

Co-authored-by: Dewen Qi <[email protected]>

* feature: Add helper classes and methods for Run class (aws#660)

* feature: Add helper classes and methods for Run class

* Add Parent class to address comment

* fix docstyle check

* Add arg docstrings in _helper

Co-authored-by: Dewen Qi <[email protected]>

* feature: Add Experiment Run class (aws#651)

Co-authored-by: Dewen Qi <[email protected]>

* change: Add integ tests for Run (aws#673)

Co-authored-by: Dewen Qi <[email protected]>

* Update run log metric to use MetricsManager (aws#678)

* Update run.log_metric to use _MetricsManager

* fix several metrics issues

* Add doc strings to metrics.py

Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dewen Qi <[email protected]>

Co-authored-by: Dewen Qi <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
qidewenwhen added a commit to qidewenwhen/sagemaker-python-sdk that referenced this issue Dec 14, 2022
qidewenwhen added a commit to qidewenwhen/sagemaker-python-sdk that referenced this issue Dec 14, 2022
qidewenwhen added a commit to qidewenwhen/sagemaker-python-sdk that referenced this issue Dec 14, 2022
qidewenwhen added a commit to qidewenwhen/sagemaker-python-sdk that referenced this issue Dec 14, 2022
navinsoni pushed a commit that referenced this issue Dec 14, 2022
* feature: Add experiment plus Run class (#691)

* feature: Add Experiment helper classes (#646)

* feature: Add Experiment helper classes

feature: Add helper class _RunEnvironment

* change: Change sleep retry to backoff retry for get TC

* minor fixes in backoff retry

Co-authored-by: Dewen Qi <[email protected]>

* feature: Add helper classes and methods for Run class (#660)

* feature: Add helper classes and methods for Run class

* Add Parent class to address comment

* fix docstyle check

* Add arg docstrings in _helper

Co-authored-by: Dewen Qi <[email protected]>

* feature: Add Experiment Run class (#651)

Co-authored-by: Dewen Qi <[email protected]>

* change: Add integ tests for Run (#673)

Co-authored-by: Dewen Qi <[email protected]>

* Update run log metric to use MetricsManager (#678)

* Update run.log_metric to use _MetricsManager

* fix several metrics issues

* Add doc strings to metrics.py

Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dewen Qi <[email protected]>

Co-authored-by: Dewen Qi <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>

* change: Simplify exp plus integ test configuration (#694)

Co-authored-by: Dewen Qi <[email protected]>

* feature: add RunName to expeirment_config (#696)

* change: Update Run init and add Run load and _RunContext (#707)

* change: Update Run init and add Run load

Add exp name and run group name to load and address comments

* Address nit comments

Co-authored-by: Dewen Qi <[email protected]>

* fix: Fix run name uniqueness issue (#730)

Co-authored-by: Dewen Qi <[email protected]>

* change: Update integ tests for Exp Plus M1 changes (#741)

Co-authored-by: Dewen Qi <[email protected]>

* add metrics client to session object (#745)

Co-authored-by: Dewen Qi <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: qidewenwhen <[email protected]>

* change: Add integ test for using Run in Transform Job (#749)

Co-authored-by: Dewen Qi <[email protected]>

* Add async metrics sink (#739)

Co-authored-by: Dewen Qi <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: qidewenwhen <[email protected]>

* use metrics client provided by session (#754)

* fix flaky metrics test (#753)

* change: Change Run.init and Run.load to constructor and module method respectively (#752)

Co-authored-by: Dewen Qi <[email protected]>

* feature: Add latest metric service model (#757)

Co-authored-by: Dewen Qi <[email protected]>
Co-authored-by: qidewenwhen <[email protected]>

* fix: lowercase run name (#767)

* Change: Minimize use of lower case tc name (#769)

* change: Clean up test resources to remove model files (#756)

* change: Clean up test resources to remove model files

* fix: Change experiment enums to upper case

* change: Upgrade boto3 and update test to validate mixed case name

* fix: Update as per latest botocore release and backend change

Co-authored-by: Dewen Qi <[email protected]>

* lowercase trial component name (#776)

* change: Expose sagemaker experiment doc strings

* fix: Fix exp name mixed case in issue

Co-authored-by: Dewen Qi <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Dana Benson <[email protected]>
Co-authored-by: Yifei Zhu <[email protected]>
claytonparnell pushed a commit to claytonparnell/sagemaker-python-sdk that referenced this issue Dec 16, 2022
mufaddal-rohawala pushed a commit to mufaddal-rohawala/sagemaker-python-sdk that referenced this issue Dec 19, 2022
mufaddal-rohawala pushed a commit that referenced this issue Dec 20, 2022
JoseJuan98 pushed a commit to JoseJuan98/sagemaker-python-sdk that referenced this issue Mar 4, 2023
JoseJuan98 pushed a commit to JoseJuan98/sagemaker-python-sdk that referenced this issue Mar 4, 2023
nmadan pushed a commit to nmadan/sagemaker-python-sdk that referenced this issue Apr 18, 2023
4 participants