Skip to content

fix: Remove confusing log line emitted during feature group ingestion #3597

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 1, 2023

Conversation

psnilesh
Copy link
Contributor

@psnilesh psnilesh commented Jan 18, 2023

Description

A feature store worker that's been told to ingest rows 1 to 100 of a data frame would emit a log line given below if some rows failed to ingest,

Failed to ingest row 1 to 100. 

This often confuse some customers into thinking more rows have failed than what actually have, because the same line is logged if all but one row had succeeded .

There is already a log line in line 234 indicating row number and exception of failed rows that customers can use to debug. This PR is for removing aforementioned extra log line.

Testing

Verified the change by installing SDK locally and ingesting a data frame containing some faulty rows.

Merge Checklist

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

This is a only a minor logging change, so below checklist is not applicable. The code path is already covered by existing unit and integration tests.

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@psnilesh psnilesh requested a review from a team as a code owner January 18, 2023 06:01
@psnilesh psnilesh requested review from claytonparnell and removed request for a team January 18, 2023 06:01
@claytonparnell claytonparnell self-assigned this Jan 18, 2023
Copy link
Collaborator

@claytonparnell claytonparnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/bot run all

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 3615c4a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 3615c4a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@codecov-commenter
Copy link

codecov-commenter commented Jan 25, 2023

Codecov Report

Merging #3597 (3615c4a) into master (483665a) will decrease coverage by 0.80%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #3597      +/-   ##
==========================================
- Coverage   89.59%   88.80%   -0.80%     
==========================================
  Files         968      227     -741     
  Lines       89536    22140   -67396     
==========================================
- Hits        80219    19661   -60558     
+ Misses       9317     2479    -6838     
Impacted Files Coverage Δ
src/sagemaker/feature_store/feature_group.py 77.41% <0.00%> (ø)
...hon3.10/site-packages/sagemaker/hyperparameters.py
...e-packages/sagemaker/automl/candidate_estimator.py
.../sagemaker/model_card/evaluation_metric_parsers.py
...r/cli/compatibility/v2/modifiers/renamed_params.py
...-packages/sagemaker/feature_store/feature_store.py
...3.10/site-packages/sagemaker/lineage/_api_types.py
.../py38/lib/python3.8/site-packages/sagemaker/job.py
...b/python3.10/site-packages/sagemaker/local/data.py
....8/site-packages/sagemaker/tensorflow/estimator.py
... and 1186 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Collaborator

@claytonparnell claytonparnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/bot run notebook-tests, slow-tests, pr

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@psnilesh
Copy link
Contributor Author

psnilesh commented Jan 26, 2023

Just had a quick look at the failure logs.

Notebook tests

SUCCEEDED	9
FAILED	0
ERROR	0
SKIPPED	0
CANCELLED	0
TIMED_OUT	3
--------------------------------------------------------------------------------
TIMED_OUT	./amazon-sagemaker-examples/sagemaker-python-sdk/1P_kmeans_lowlevel/kmeans_mnist_lowlevel.ipynb	PT1M0.202S
SUCCEEDED	./amazon-sagemaker-examples/advanced_functionality/tensorflow_iris_byom/tensorflow_BYOM_iris.ipynb	PT3M57.215S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker-python-sdk/tensorflow_moving_from_framework_mode_to_script_mode/tensorflow_moving_from_framework_mode_to_script_mode.ipynb	PT2M45.504S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker-python-sdk/mxnet_onnx_export/mxnet_onnx_export.ipynb	PT5M33.134S
SUCCEEDED	./amazon-sagemaker-examples/advanced_functionality/kmeans_bring_your_own_model/kmeans_bring_your_own_model.ipynb	PT5M36.434S
TIMED_OUT	./amazon-sagemaker-examples/sagemaker-python-sdk/tensorflow_script_mode_pipe_mode/tensorflow_script_mode_pipe_mode.ipynb	PT3M42.734S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker-python-sdk/1P_kmeans_highlevel/kmeans_mnist.ipynb	PT9M25.045S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker-python-sdk/mxnet_gluon_sentiment/mxnet_sentiment_analysis_with_gluon.ipynb	PT12M25.84S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker-python-sdk/scikit_learn_randomforest/Sklearn_on_SageMaker_end2end.ipynb	PT14M19.376S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker-python-sdk/tensorflow_serving_using_elastic_inference_with_your_own_model/tensorflow_serving_pretrained_model_elastic_inference.ipynb	PT6M53.785S
SUCCEEDED	./amazon-sagemaker-examples/sagemaker_processing/spark_distributed_data_processing/sagemaker-spark-processing.ipynb	PT24M14.055S
TIMED_OUT	./amazon-sagemaker-examples/sagemaker-pipelines/tabular/abalone_build_train_deploy/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.ipynb	PT43M22.071S

Slow tests

=========================== short test summary info ============================
FAILED tests/integ/test_inference_pipeline.py::test_inference_pipeline_model_deploy_and_update_endpoint
ERROR tests/integ/test_clarify_model_monitor.py::test_run_explainability_monitor
ERROR tests/integ/test_model_quality_monitor.py::test_run_model_quality_monitor_baseline
ERROR tests/integ/test_model_quality_monitor.py::test_run_model_quality_monitor
ERROR tests/integ/test_clarify_model_monitor.py::test_run_bias_monitor - sage...
======== 1 failed, 11 passed, 1 warning, 4 errors in 2099.62s (0:34:59) ========
ERROR: InvocationError for command /codebuild/output/src122870313/src/github.com/aws/sagemaker-python-sdk/.tox/py310/bin/pytest --cov=sagemaker --cov-append tests/integ -m slow_test -n 16 --durations 0 (exited with code 1)
___________________________________ summary ____________________________________
ERROR:   py310: commands failed

sagemaker-python-sdk-pr

FAILED tests/integ/test_feature_store.py::test_update_feature_group - Runtime...
= 1 failed, 387 passed, 65 skipped, 150 warnings, 96 rerun in 5139.33s (1:25:39) =
ERROR: InvocationError for command /codebuild/output/src480907648/src/github.com/aws/sagemaker-python-sdk/.tox/py310/bin/pytest --cov=sagemaker --cov-append tests/integ -m 'not local_mode and not cron and not slow_test' -n 384 --reruns 3 --reruns-delay 15 --durations 50 --boto-config '{"region_name": "us-west-2"}' (exited with code 1)

I see one failure related to feature store (first two failures look unrelated) but the failing test is exercising a path that is unmodified in this CR. Build logs also show

E           botocore.errorfactory.ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateFeatureGroup operation: The account-level service limit 'Maximum number of feature groups allowed per account' is 200 FeatureGroups, with current utilization of 200 FeatureGroups and a request delta of 1 FeatureGroups. Please contact AWS support to request an increase for this limit.

Perhaps we should increase the limit or delete some old feature groups ?

@claytonparnell
Copy link
Collaborator

@psnilesh Yes first two are flaky tests we are dealing with; I'll look into if we need feature grouplimit increase, will let you know

Copy link
Collaborator

@claytonparnell claytonparnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/bot run slow-tests

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-notebook-tests
  • Commit ID: 3615c4a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@psnilesh
Copy link
Contributor Author

@claytonparnell is this PR waiting on something, like a fix for those transient failures ?

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 3615c4a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 3615c4a
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 3615c4a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala mufaddal-rohawala merged commit d3aa98c into aws:master Feb 1, 2023
JoseJuan98 pushed a commit to JoseJuan98/sagemaker-python-sdk that referenced this pull request Mar 4, 2023
nmadan pushed a commit to nmadan/sagemaker-python-sdk that referenced this pull request Apr 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants