-
Notifications
You must be signed in to change notification settings - Fork 1.2k
fix: add missing Tensorflow 2.9 inference image #3251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add missing Tensorflow 2.9 inference image #3251
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
35426df
to
9cafeb0
Compare
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
9cafeb0
to
8fdeda4
Compare
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
@kevinyang8 @navinsoni I think this change is still required, but some of the tests are still failing. I think this is because the training image version is 2.9.1, but the inference version I'm adding is 2.9.0. However, no training image 2.9.0 exists, and no inference image 2.9.1 exists, and the SDK (or the tests) assume that matching images will exist.
Training image 2.9.1 does exist:
But training image 2.9.0 does not. It's not set in the SDK's image uris:
And it doesn't exist in the source ECR repos:
And the CI failure I'm seeing looks like:
So the test is trying to train with TF 2.9.0, and the SDK doesn't know about an image for that, and indeed none exists in the ECR repos. I think the tests are trying to run with 2.9.0 because this PR adds a 2.9.0 inference image. To summarise (in us-west-2, repo
(* = "added to the SDK by this PR") Some questions/thoughts:
|
Aha: sagemaker-python-sdk/tests/conftest.py Lines 380 to 392 in 6f72e3c
That assumes that the the minimum of the two latest versions for training and inference will exist. So, in this case, is finding 2.9.1 for training, 2.9.0 for inference and assuming that 2.9.0 will exist for both, but it does not. So the tests need a version that exists for both. So need do one of
|
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
f7ff2df
to
e0274dc
Compare
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
e0274dc
to
7e59f82
Compare
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Superseded by #3465 |
Issue #, if available: #3250
Description of changes:
Add the missing Tensorflow 2.9 inference image.
Note that I'm considering this to be a "fix", not a "feature" as the training image for 2.9 is in and released, but the inference image is missing, so I think this is a bug.
There might well be some automation or script or something that can do this, but I believe I, as an end user, don't have the ability to list all images in a repository, so this is just based of my finding that I can pull
tensorflow-inference:2.9.0-cpu
(and can't pulltensorflow-inference:2.9.1-cpu
).Testing done:
So 2.9 and 2.9.0 exist in both cpu and gpu formats, in the eu-west-1 registries, at least.
Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.General
Tests
unique_name_from_base
to create resource names in integ tests (if appropriate)By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.