Restore SKLearn FrameworkProcessor via _normalize_args #2633
Conversation
This is to conform to the existing style adopted in the SageMaker Python SDK.
Drop SKLearnProcessorAlt altogether.
…into feat/fw-processor
…nt=1 This fixes a specific bug with SKLearnEstimator, which raises an exception on instance_count > 1.
…into feat/fw-processor
Add comment for the SKLearnProcessor multiple-instances fix and remove the now-unused function parameter.
Merge remote-tracking branch 'verdi/pr-framework-processor-round-02' into feat/fw-processor
Swap the mandatory 's3_prefix' param in FrameworkProcessor for an optional 'code_location' param, in keeping with Framework estimators. Separate the 'code' and 'entrypoint' processing inputs to avoid having one input channel inside another in the container (input/payload). Update existing unit & integration tests to work with the new code.
Swap s3_prefix for code_location and separate code/entrypoint
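As a rough sketch of the convention this borrows from Framework estimators (the function name and S3 layout below are illustrative assumptions, not the SDK's exact logic): when `code_location` is omitted, the code bundle destination falls back to the session's default bucket.

```python
# Illustrative sketch only: derive the S3 destination for the packaged code
# bundle from an optional code_location, mirroring the Framework-estimator
# convention of defaulting to the session bucket when it is not given.
def resolve_code_s3_uri(code_location, default_bucket, job_name):
    base = code_location or "s3://{}".format(default_bucket)
    return "{}/{}/source/sourcedir.tar.gz".format(base.rstrip("/"), job_name)
```

For example, `resolve_code_s3_uri(None, "my-bucket", "job-1")` yields `s3://my-bucket/job-1/source/sourcedir.tar.gz`, while an explicit `code_location` replaces the bucket prefix entirely.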
R1720: Unnecessary "else" after "raise" (no-else-raise)
FrameworkProcessor now accepts a 'command' param like the ScriptProcessor class did previously, and tries to maintain a similar API by forwarding it to the parent class, while patching the parent's container run command generation to override it. Also add initial unit tests for multi-instance and overridden commands.
Restore 'command' for FrameworkProcessors
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Ensure Processor classes are listed in framework API ToCs, and add a brief introduction to usage in the SageMaker Processing guide.
Add a unit test to check that the SKLearn FrameworkProcessor behaves as ProcessingStep expects, to avoid a repeat of the previous regression where SKLearnProcessor broke SageMaker Pipelines examples.
Some lightweight Linux distributions oriented towards containers (e.g. Alpine) might not include bash by default: the POSIX-compliant /bin/sh shell is more portable and a less stringent requirement for custom containers to work with FrameworkProcessor. Also expanded the scope of the tests.
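The portability point above can be sketched with a minimal (hypothetical, not the SDK's actual internals) command builder for the framework bootstrap script:

```python
# Hypothetical sketch: build the container entrypoint command for the
# framework bootstrap script. Using /bin/sh rather than /bin/bash keeps the
# command portable to minimal images (e.g. Alpine) that ship no bash.
def build_entrypoint_command(bootstrap_script):
    # POSIX sh is present on virtually every Linux base image, whereas bash
    # is an optional extra on lightweight container distributions.
    return ["/bin/sh", bootstrap_script]
```

Any container with a POSIX shell can then run the bootstrap, rather than only those with bash installed.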
A note on alternative (1): although this PR removes
Maintain API compatibility with the base Processor class
(Pushed to draft in favour of #2664)
@athewsey
Closing as this has been in draft state for too long! Please reopen if the change is still relevant.
Issue #, if available: #2656

Description of changes:

An alternative proposal to #2564 for restoring `FrameworkProcessor` to `SKLearnProcessor` and fixing how `FrameworkProcessor` works with SageMaker Pipelines (which currently seems broken).

- Use `Processor._normalize_args()` (as used by `ProcessingStep`) as the interface point to handle user and extra code inputs: move `FrameworkProcessor`'s handling of user code and framework code upload into this function, instead of hooking in at `run()` or `get_run_args()`.
- Support `source_dir` functionality even in `ScriptProcessor`.
- Change the `FrameworkProcessor` entrypoint from `/bin/bash` to POSIX-compliant `/bin/sh`, because some lightweight Linux distributions optimized for containers (e.g. Alpine) may not install bash by default: this should be a more flexible option for bringing custom containers with `FrameworkProcessor`.
- Document `source_dir` on the Processing user guide page, and ensure `*Processor` classes get listed in framework API directories.

Additional context & design:
`FrameworkProcessor` brings framework-style features to processing, similar to `Framework`-based training jobs: it sends the user's code bundle as a `code` input to the processing job, adds a shell script as an `entrypoint` input, and overrides the entrypoint of the job. The shell script extracts the bundle and installs requirements.txt, then kicks off the user's actual requested entrypoint/script. This broke some existing usage (`import: command not found`), so specifically `SKLearnProcessor` was rolled back to its old implementation by "fix: revert #2251 changes for sklearn processor" (#2560).

#2564 proposes to restore `SKLearnProcessor`'s framework functionality by changing `FrameworkProcessor` to use a Python entrypoint instead of a shell script. From my investigations, it seems like #2564 proposes to patch a symptom of the issue rather than the root cause: I've not seen any reports that the entrypoint script being shell-based is an issue per se (e.g. no complaints about custom containers not having `/bin/bash` installed). The problem is the inconsistency of trying to use the command for the (shell-script) framework entrypoint against the file for the user's (Python) script. In fact, switching to a Python-based framework entrypoint script would tie the framework bootstrap to Python-capable containers, limiting support for other languages (`R`, `Julia`, etc.).

In this alternative solution, we integrate `FrameworkProcessor` more closely with `ScriptProcessor` by overriding parent private methods (including `_normalize_args`, `_include_code_in_inputs`, etc.) rather than adding extra new methods hooked in to specific events as before. This allows `FrameworkProcessor` to work correctly with SM Pipelines without changing the API between them, with the trade-off that several closely coupled overrides are needed, with some code duplication: every new parameter a child class adds to `run()` (e.g. `git_config`, `dependencies`, etc.) needs to be pushed through a few internal methods where the only difference is the number of args transmitted.

Alternatives I think could include:

- Defining a more explicit interface than `_normalize_args` for external classes needing to generate the actual job arguments; having the existing Processor/ScriptProcessor pass this through to `_normalize_args`, and having FrameworkProcessor add in the additional required hook for preparing the code bundle.
- Promoting `Processor.get_run_args()` to be this API - but I think this would be a breaking change.

Testing done:

Some minor additions to test coverage (including checking that ScriptProcessor works with a folder), but mostly captured under the scope of existing tests - just tweaked arguments for some diversity between providing a script or a full folder + requirements.txt to `SKLearnProcessor`. Verified the fix against the pipelines sample notebook that was known to be broken before.
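The override pattern described above can be sketched with stand-in classes (these are NOT the real sagemaker classes; names, signatures, and the hard-coded S3 URI are illustrative assumptions):

```python
# Stand-in sketch of the design above (NOT the real SDK classes). The key
# point: ProcessingStep calls _normalize_args() directly, bypassing run(),
# so that method is where FrameworkProcessor must inject its packaged code
# input for SageMaker Pipelines to pick up the framework behaviour.
class Processor:
    def _normalize_args(self, code=None, inputs=None):
        # Base behaviour: resolve user arguments into the final job args.
        return {"code": code, "inputs": list(inputs or [])}

    def run(self, code=None, inputs=None):
        # run() funnels through the same hook, so both entry points agree.
        return self._normalize_args(code=code, inputs=inputs)

class FrameworkProcessor(Processor):
    def _normalize_args(self, code=None, inputs=None):
        # Hypothetical packaging step: bundle user code + requirements.txt,
        # upload it, expose it as an extra input channel, and hand the shell
        # bootstrap script to the base class as the job's code/entrypoint.
        extra = [{"channel": "code",
                  "s3_uri": "s3://bucket/prefix/sourcedir.tar.gz"}]
        return super()._normalize_args(code="runproc.sh",
                                       inputs=(inputs or []) + extra)

class ProcessingStep:
    def __init__(self, processor, code):
        # Pipelines-style usage: call _normalize_args directly (not run()).
        self.args = processor._normalize_args(code=code)
```

With this shape, `ProcessingStep(FrameworkProcessor(), "preprocess.py")` sees the injected code channel exactly as a direct `run()` call would, which is the compatibility property the PR's new unit test checks.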
Merge Checklist

Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General
- I have passed the region in to all S3 and STS clients that I've initialized as part of this change.

Tests
- `ProcessingStep`
- I have used `unique_name_from_base` to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.