Skip to content

[Request] accept logs param in .attach() #1405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
athewsey opened this issue Apr 13, 2020 · 3 comments
Closed

[Request] accept logs param in .attach() #1405

athewsey opened this issue Apr 13, 2020 · 3 comments
Assignees
Milestone

Comments

@athewsey
Copy link
Collaborator

Is your feature request related to a problem? Please describe.
Training job logs can be significantly long, and are sometimes (but not always) useful to surface when waiting on a job. _TrainingJob.wait() supports the logs parameter to give users control over this.

EstimatorBase.attach() calls .wait(), but doesn't accept and pass through a logs parameter: Meaning users are stuck with log pollution and potential performance degradation.

Describe the solution you'd like
.attach() should support and pass through kwargs for .wait(), with same default behaviours as the underlying .wait().

Describe alternatives you've considered

Could hard-wire attach() to a different log status e.g. wait(logs="None") on the basis that after a training job has been completed, we don't need to re-present its logs... But this is an assumption that might be incompatible with other workflows.

There's currently only one argument for wait() - the subject of this request... But in general attach() makes multiple SageMaker API calls with potentially many arguments - so attach() arguments should probably be explicitly defined and fed through, rather than relying on **kwargs.

Additional context

Example use case: Lambda function in a Step Functions pipeline to deploy a model from a known good training job name, via the SageMaker SDK, without polluting the Lambda logs.

@athewsey
Copy link
Collaborator Author

Additionally, it looks like simply passing logs=False through to .wait() won't be enough to silence the messages, because _flush_log_streams() is still getting called (and spitting out messages with print() rather than logger).

@laurenyu
Copy link
Contributor

laurenyu commented May 7, 2020

Thanks for the feature request. We're actually thinking about going a step further, and separating out the logs from attach() entirely. Something like:

estimator.attach("job-name")
estimator.logs()

# another scenario
estimator.fit(wait=False)
estimator.logs()

where logs() could take in other parameters to allow for better control over what logs are printed. (A further extension could include a method that prints out only new log lines, mimicking tail -f.)

@ajaykarpur ajaykarpur added this to the v2.0.0 milestone May 7, 2020
@laurenyu laurenyu modified the milestones: v2.0.0, v2.0.0.rc2 May 7, 2020
@laurenyu laurenyu modified the milestones: v2.0.0.rc2, v2.0.0 Jun 4, 2020
@chuyang-deng chuyang-deng self-assigned this Jul 13, 2020
@athewsey
Copy link
Collaborator Author

Confirming the v2 separation seems to have satisfied the original request - thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants