feature: Hugging Face Transformers 4.12 for Pt1.9/TF2.5 #2752

philschmid · 2021-11-04T08:27:38Z

Issue #, if available:

Description of changes:

Testing done:

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

I have read the CONTRIBUTING doc
I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
I used the commit message format described in CONTRIBUTING
I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

I have added tests that prove my fix is effective or that my feature works (if appropriate)
I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
I have checked that my tests are not configured for a specific region or account (if appropriate)
I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

codecov-commenter · 2021-11-04T08:39:30Z

Codecov Report

Merging #2752 (1007498) into master (7268e82) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #2752   +/-   ##
=======================================
  Coverage   88.71%   88.71%           
=======================================
  Files         167      167           
  Lines       14766    14766           
=======================================
  Hits        13099    13099           
  Misses       1667     1667

sagemaker-bot · 2021-11-04T08:44:00Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 13138ea
Result: SUCCEEDED
Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

philschmid · 2021-11-05T07:21:18Z

All releases are done:

[huggingface_tensorflow, huggingface_pytorch] Release Images for Transformers 4.12 deep-learning-containers#1475

Can we run the pipeline and then merge?

sagemaker-bot · 2021-11-06T10:12:44Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: ad15178
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T15:47:59Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 9781aba
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T15:48:00Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 2758c16
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T15:54:22Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-local-mode-tests
Commit ID: 2758c16
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T15:54:23Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-local-mode-tests
Commit ID: 13138ea
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T15:55:26Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-local-mode-tests
Commit ID: 9781aba
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T15:57:15Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 13138ea
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T16:02:22Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-slow-tests
Commit ID: 9781aba
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T16:02:23Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-slow-tests
Commit ID: 13138ea
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-06T16:28:56Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-slow-tests
Commit ID: 2758c16
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T07:23:48Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 8be2914
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T07:36:27Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: faa5c67
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T07:45:19Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: cb2d374
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T08:11:08Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-local-mode-tests
Commit ID: 8be2914
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T08:26:17Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-local-mode-tests
Commit ID: cb2d374
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T08:26:54Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-slow-tests
Commit ID: 8be2914
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-09T08:26:55Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-slow-tests
Commit ID: cb2d374
Result: SUCCEEDED
Build Logs (available for 30 days)

jeniyat

looks good to me.

shreyapandit · 2021-11-10T21:08:25Z

tests/integ/test_huggingface.py

@@ -158,7 +158,7 @@ def test_huggingface_inference(
    huggingface_pytorch_latest_inference_py_version,
 ):
    env = {
-        "HF_MODEL_ID": "sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english",
+        "HF_MODEL_ID": "philschmid/tiny-distilbert-classification",


@philschmid Can we please give this model a generic name?

philschmid · 2021-11-11

I'm not sure what you mean by that. This is not a model that will be created; it is stored on the hf.co/models hub and used to run the tests. I changed it to a model that we control: https://huggingface.co/philschmid/tiny-distilbert-classification

shreyapandit · 2021-11-10T21:16:56Z

tests/data/huggingface/run_tf.py

-        x: train_dataset[x].to_tensor(default_value=0, shape=[None, tokenizer.model_max_length])
-        for x in ["input_ids", "attention_mask"]
-    }
+    train_features = {x: train_dataset[x] for x in ["input_ids", "attention_mask"]}


Can you detail why we are removing the to_tensor call, especially since the shape's tokenizer.model_max_length parameter is something that we gave explicitly in the past? Is this driven by the change in TF version?

Please make this backward compatible, i.e., keep the original test case with the previous code and add a new test case in which the to_tensor call is not required, so that we test both scenarios.

philschmid · 2021-11-11

The call is no longer needed because we changed the internal structure of Datasets, which used to return a RaggedTensor even when the tensors were normal dense tensors.
The tokenizer.model_max_length is already reflected in

    train_dataset = train_dataset.map(
        lambda e: tokenizer(e["text"], truncation=True, padding="max_length"), batched=True
    )

which already pads every example to max_length.

I added a condition to the test that checks the transformers version, and I restored the old code for lower versions.
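For readers following along, the version condition described above might look roughly like the sketch below. It is not the code from this PR: the names `parse_version`, `build_train_features`, and `ragged_until`, as well as the `"4.12.0"` cutoff, are assumptions made for illustration, and the real test may pick the branch differently.

```python
from itertools import takewhile
from typing import Any, Dict, Mapping, Tuple

FEATURE_COLUMNS = ("input_ids", "attention_mask")


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.12.3' into (4, 12, 3); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # e.g. the 'dev0' in '4.12.0.dev0'
        parts.append(int(digits))
    return tuple(parts)


def build_train_features(
    train_dataset: Mapping[str, Any],
    max_length: int,
    transformers_version: str,
    ragged_until: str = "4.12.0",  # hypothetical cutoff, not taken from the PR
) -> Dict[str, Any]:
    """Return the model input columns, densifying them only on older versions."""
    needs_to_tensor = parse_version(transformers_version) < parse_version(ragged_until)
    features = {}
    for name in FEATURE_COLUMNS:
        column = train_dataset[name]
        if needs_to_tensor:
            # Old behaviour: columns arrive as ragged tensors and must be padded.
            column = column.to_tensor(default_value=0, shape=[None, max_length])
        features[name] = column
    return features
```

In a training script, `transformers_version` would typically come from `transformers.__version__` and `max_length` from `tokenizer.model_max_length`.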

navinsoni

Please address Shreya's comments

sagemaker-bot · 2021-11-11T08:32:49Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 856a949
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-11T09:08:02Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: a5c6012
Result: SUCCEEDED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-11T09:46:47Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-slow-tests
Commit ID: a5c6012
Result: FAILED
Build Logs (available for 30 days)

sagemaker-bot · 2021-11-11T09:49:00Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-local-mode-tests
Commit ID: a5c6012
Result: SUCCEEDED
Build Logs (available for 30 days)

philschmid · 2021-11-11T09:50:45Z

Can someone restart the pipeline, please? The error is:

        if http.status_code >= 300:
            error_code = parsed_response.get("Error", {}).get("Code")
            error_class = self.exceptions.from_code(error_code)
>           raise error_class(parsed_response, operation_name)
E           botocore.errorfactory.ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateMonitoringSchedule operation: The account-level service limit 'ml.c5.xlarge for processing job usage' is 50 Instances, with current utilization of 50 Instances and a request delta of 1 Instances. Please contact AWS support to request an increase for this limit.
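A quota error like this one is usually transient, so a test harness could retry instead of needing a manual pipeline restart. The sketch below is not part of this PR or of the SDK: `call_with_retry` and `is_resource_limit_error` are hypothetical helpers, and the only thing assumed about the exception is that it exposes a botocore-style `response["Error"]["Code"]` dictionary.

```python
import time
from typing import Any, Callable


def is_resource_limit_error(error: Exception) -> bool:
    """True if the exception carries the ResourceLimitExceeded error code."""
    response = getattr(error, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ResourceLimitExceeded"


def call_with_retry(
    operation: Callable[[], Any],
    should_retry: Callable[[Exception], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run operation, retrying with exponential backoff while should_retry holds."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:
            # Give up on unrelated errors and on the final attempt.
            if not should_retry(error) or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. With 50 of 50 instances in use, the delays would likely need to be minutes rather than seconds.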

shreyapandit · 2021-11-11T22:01:04Z

Yes, re-running the tests.

navinsoni · 2021-11-12T03:15:41Z

Update: Refactored the code and created a function to generate the dataset feature set.

sagemaker-bot · 2021-11-12T03:29:37Z

AWS CodeBuild CI Report

CodeBuild project: sagemaker-python-sdk-unit-tests
Commit ID: 1007498
Result: SUCCEEDED
Build Logs (available for 30 days)

philschmid · 2021-11-12T07:40:04Z

Thank you for the reviews and support! Is there an ETA for the pypi release?

* added new HuggingFace DLCs Co-authored-by: Navin Soni <[email protected]>

philschmid added 2 commits November 4, 2021 09:24

added new HuggingFace DLCs

8e29874

removed break

13138ea

This was referenced Nov 4, 2021

Feature: Add new Hugging Face DLCs to Image config #2751

Closed

[huggingface_tensorflow, huggingface_pytorch] Release Images for Transformers 4.12 aws/deep-learning-containers#1475

Closed

ahsan-z-khan changed the base branch from master to dev November 4, 2021 18:26

ahsan-z-khan previously approved these changes Nov 4, 2021

philschmid dismissed ahsan-z-khan’s stale review via ad15178 November 6, 2021 10:02

ahsan-z-khan changed the base branch from dev to master November 8, 2021 17:37

saved with correct format

cb2d374

ahsan-z-khan previously approved these changes Nov 9, 2021

ahsan-z-khan added the HuggingFace label Nov 9, 2021

jeniyat reviewed Nov 9, 2021

jeniyat previously approved these changes Nov 9, 2021

mufaddal-rohawala previously approved these changes Nov 10, 2021

shreyapandit reviewed Nov 10, 2021

shreyapandit suggested changes Nov 10, 2021

navinsoni suggested changes Nov 10, 2021

added condition for test to also work with lower transformers version

856a949

philschmid dismissed stale reviews from mufaddal-rohawala, jeniyat, and ahsan-z-khan via 856a949 November 11, 2021 08:25

make black happy

a5c6012

shreyapandit previously approved these changes Nov 11, 2021

refactor feature generation

1007498

navinsoni dismissed shreyapandit’s stale review via 1007498 November 12, 2021 03:12

shreyapandit approved these changes Nov 12, 2021

navinsoni approved these changes Nov 12, 2021

navinsoni merged commit 1dc0564 into aws:master Nov 12, 2021

mufaddal-rohawala pushed a commit that referenced this pull request Nov 23, 2021

feature: Hugging Face Transformers 4.12 for Pt1.9/TF2.5 (#2752)

1b0cf76

* added new HuggingFace DLCs Co-authored-by: Navin Soni <[email protected]>

EthanShouhanCheng pushed a commit to SissiChenxy/sagemaker-python-sdk that referenced this pull request Jan 11, 2022

feature: Hugging Face Transformers 4.12 for Pt1.9/TF2.5 (aws#2752)

75f022e

* added new HuggingFace DLCs Co-authored-by: Navin Soni <[email protected]>
