NCHW format not supported in c5.xlarge deployment #771
Describe the problem

We trained a model in TensorFlow that consumes images in NCHW format. Training was done on GPUs (I believe NCHW is only supported on GPUs or MKL-capable Intel processors).

When I try to run inference against the model through an ml.c5.xlarge endpoint, I get the following error:

```
E external/org_tensorflow/tensorflow/core/common_runtime/executor.cc:623] Executor failed to create kernel. Invalid argument: Conv2DCustomBackpropInputOp only supports NHWC.
[[{{node Gs/cond/8x8/Conv0_up/conv2d_transpose}} = Conv2DBackpropInput[T=DT_FLOAT, _output_shapes=[[1,512,8,8]], data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Gs/cond/8x8/Conv0_up/conv2d_transpose/output_shape, Gs/cond/8x8/Conv0_up/AddN, Gs/cond/ToRGB_lod6/Conv2D/Switch:1)]]
```

Strangely, when I deploy the same model on the local notebook instance (which is also ml.c5.xlarge), it works just fine!
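For context, NCHW and NHWC name the axis order of an image batch. A quick NumPy illustration (the shapes are arbitrary):

```python
import numpy as np

# NCHW: (batch, channels, height, width) - the layout this model was trained with.
batch_nchw = np.zeros((1, 3, 224, 224), dtype=np.float32)

# NHWC: (batch, height, width, channels) - the layout TensorFlow's stock CPU
# convolution kernels expect.
batch_nhwc = np.transpose(batch_nchw, (0, 2, 3, 1))
print(batch_nhwc.shape)  # (1, 224, 224, 3)
```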
Comments

Hi @gautiese, thanks for using SageMaker! Yes, it looks like you are using a non-MKL build of TensorFlow in SageMaker, and an MKL build in EC2. Which framework_version (or container image URI) are you using when you create the endpoint?
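For reference, the framework_version in question is the one passed when constructing the serving model. A minimal sketch with the 1.x SageMaker Python SDK; the S3 path and IAM role are placeholders:

```python
from sagemaker.tensorflow.serving import Model

# Placeholders for the real model artifact and execution role.
model = Model(
    model_data="s3://my-bucket/my-model/model.tar.gz",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",
    framework_version="1.12",  # selects the serving container image
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")
```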
I am using 1.13 as the framework version. Not using a custom image. Is the SageMaker TensorFlow inference image a non-MKL build?
We haven't released any TensorFlow 1.13 container yet; are you sure that's the right version? The info in the original post says 1.12, so I'm going to assume that's still correct. Right now our TensorFlow containers do not include an MKL build of TensorFlow Serving. We plan to add it but don't have a target release date yet. In the meantime, your best bet would be to change your model so that it accepts NHWC inputs. I'm going to tag this as a feature request (for MKL support) and leave it open.
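One way to make a model accept NHWC inputs, as a rough sketch: parameterize the graph-building code on data_format, train with channels_first, and re-export for serving with channels_last. tf.layers.conv2d stores its kernel as (height, width, in_channels, out_channels) in either layout, so a checkpoint trained one way can generally be restored into a graph built the other way. The toy build_model below is a hypothetical stand-in for the real network:

```python
import tensorflow as tf  # TF 1.x API, matching the 1.12 container

def build_model(images_nhwc, data_format):
    # Toy stand-in for the real network; the point is that data_format
    # is a parameter instead of being hard-coded to NCHW.
    x = images_nhwc
    if data_format == "channels_first":
        x = tf.transpose(x, [0, 3, 1, 2])  # NHWC -> NCHW for GPU training
    return tf.layers.conv2d(x, filters=8, kernel_size=3,
                            data_format=data_format, name="conv0")

# For CPU serving, build and export the graph with channels_last so the
# stock (non-MKL) TensorFlow Serving kernels can execute it.
images = tf.placeholder(tf.float32, [None, 64, 64, 3], name="images")
outputs = build_model(images, data_format="channels_last")
```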
You are right, I am on 1.12.
If I were to create an MKL-DNN SageMaker TensorFlow Serving container myself, how would I go about it? I am being pressed to deploy my model on a CPU instance soon (budgets).
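Roughly, that would mean compiling tensorflow_model_server from the TensorFlow Serving source with MKL enabled (Bazel's --config=mkl option), packaging it into an image modeled on this repo's Dockerfiles, and pushing it to ECR. Once such an image exists, the deployment side looks something like this sketch with the 1.x Python SDK; the ECR URI, S3 path, and role are all placeholders:

```python
import sagemaker
from sagemaker.model import Model

# Hypothetical ECR URI for a container built with an MKL-enabled
# tensorflow_model_server; account, region, repo, and tag are placeholders.
IMAGE = "123456789012.dkr.ecr.us-east-1.amazonaws.com/tfs-mkl:1.12"

model = Model(
    model_data="s3://my-bucket/my-model/model.tar.gz",  # placeholder path
    image=IMAGE,
    role=sagemaker.get_execution_role(),
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")
```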
More recent containers now come with the MKL optimization. Closing as fixed.