[bug] broken custom image local mode training due to 1.11.1 #421
Comments
(Update: the custom setting is unrelated, but I remember it led to some other issues that are out of scope for this discussion.)
Hello, thank you for bringing this bug to our attention. It is related to a change made in #411, specifically here: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/local/image.py#L112 That value is read from the hyperparameters, which are normally provided when you use a deep learning framework estimator such as TensorFlow. This is a bug, and a fix will be released soon.
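For readers following along, here is a minimal sketch of the failure mode described above: code that assumes a hyperparameter is always present breaks when a custom image supplies none. The key name `hypothetical_framework_key` and both helper functions are illustrative assumptions, not the SDK's actual code.

```python
def read_required(hyperparameters, key):
    """Direct indexing: raises KeyError when the key is absent."""
    return hyperparameters[key]


def read_optional(hyperparameters, key, default=None):
    """Tolerant lookup: returns `default` when the key is absent."""
    return hyperparameters.get(key, default)


# A framework estimator would normally populate the key (hypothetical name).
framework_hps = {"hypothetical_framework_key": "some-value"}
# A custom image passes no framework-specific hyperparameters.
custom_image_hps = {}

if __name__ == "__main__":
    print(read_optional(framework_hps, "hypothetical_framework_key"))
    print(read_optional(custom_image_hps, "hypothetical_framework_key"))
    try:
        read_required(custom_image_hps, "hypothetical_framework_key")
    except KeyError as err:
        print("missing key:", err)
```

A tolerant lookup is one plausible shape for the fix; the actual change may differ.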
Thanks for the attention, and glad that you found it useful. The next release will be totally fine with me. Thanks. I will report back when things change.
The fix has been released! Please update and let me know if the change addresses your problem!
Thanks for the fast response. Still have problems. Not sure if this is related?
@yifeim - It looks like this time it failed at creating an endpoint. Does the same code work with 1.11 for the hosting part? Have you tried creating a model first like this:
I have walked through the code path that failed, and addtional_env_var is supposed to be an empty dict, since you did not pass any additional arguments to the deploy() call. I am trying to reproduce this error now.
I could create a model, but could not deploy it. I got exactly the same errors.
It seems that the code detected that it was an empty dict and replaced it with an empty list. Not sure why a list is the expected format.
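To illustrate the observation above, here is a minimal sketch of how an empty dict can silently turn into an empty list. It assumes the default is applied with Python's `or` idiom; the function names are hypothetical and this is not the SDK's actual code.

```python
def normalize_env_buggy(additional_env_vars=None):
    # An empty dict is falsy, so `or` swaps it for the wrong type.
    return additional_env_vars or []


def normalize_env_fixed(additional_env_vars=None):
    # Falling back to a dict keeps the type consistent for callers.
    return additional_env_vars or {}


def render_env(env_vars):
    """Format a mapping as KEY=value strings; requires dict-like input."""
    return ["{}={}".format(k, v) for k, v in sorted(env_vars.items())]


if __name__ == "__main__":
    print(render_env(normalize_env_fixed({})))
    try:
        render_env(normalize_env_buggy({}))
    except AttributeError as err:
        print("buggy default fails:", err)
```

Any downstream code that calls dict methods on the result, as `render_env` does here, fails once the list default is substituted.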
@yifeim You are right, and we have a bug there. It is supposed to be:
Thanks for pointing it out. I will get a fix out as soon as possible.
The PR to fix this is #429.
The PR is merged, and this change is scheduled to be released tomorrow.
I can verify that the new release solved both problems. Thanks a lot, everyone!
System Information
Describe the problem
The following code worked with sagemaker==1.11.0, but fails with sagemaker==1.11.1. Any guesses as to what might be missing? Thanks!
PS: I downgraded sagemaker to 1.11.0 and confirmed that the example works there.
Minimal repro / logs