Accessing argument property in ProcessingStep adds new code ProcessingInput every time #3484
Comments
Hi @wsykala, thanks for the information. This behavior is a result of an idempotency problem involving a few ProcessingStep subclasses. The issue is being tracked here, where there are some more details including workarounds and a link to the PR to fix the issue. Will keep this issue updated with the latest info.
Hello @brockwade633, thanks for the reply. I didn't realize that there was an issue open for that bug already, that's why I created a new one. It's good to see that a fix for that is already in the pipeline.
I also ran into this same issue. Is there a roadmap for resolving it? Thank you.
Hi @shenshaoyong, thank you for your patience. I will update this thread as soon as I have more information to share. In the meantime, there are a couple of workarounds discussed in the other issue mentioned above.
Hi @wsykala and @shenshaoyong, a new sagemaker package version (
Closing this issue, feel free to re-open.
Hey @brockwade633. Sorry for the delay in answering. I confirm that the new version fixed this issue. Thanks for taking care of it so fast.
Describe the bug
Every time you access the `arguments` property in `ProcessingStep`, a new `code` input is added to the `ProcessingInputs`. Additionally, each time the property is accessed, a new `sourcedir.tar.gz` file is uploaded to an S3 bucket.

If you were to remove the custom `inputs` argument from the example below, the `code` input would not be added, although the `sourcedir.tar.gz` file will still get uploaded to S3.

Because input and output names must be unique for a job, this behaviour can result in the error below:
To reproduce
Expected behavior
The number of inputs should be equal to `3`, and the code should be uploaded to S3 only once. This was the case in previous versions of sagemaker (`<=2.115.0`).
Screenshots or logs
Meanwhile, the number of inputs is incremented by one every time the `arguments` property is accessed. Moreover, the code gets uploaded to S3 every time.
System information
A description of your system. Please provide:
ProcessingStep
class.)
Additional context
Possibly, the exact same bug appeared in #3477.
This does not happen in versions lower than `2.116.0`, meaning that this behaviour was introduced with the addition of the `context` object in `PipelineSession`.
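To make the failure mode described in this issue easier to picture, here is a minimal, library-free sketch of a property that is not idempotent, next to one that caches its result. It does not use or reproduce the sagemaker implementation: `ToyInput`, `LeakyStep` and `IdempotentStep` are hypothetical names, the two custom inputs (`data`, `config`) are an assumption chosen so that the expected total is 3, and the `uploads` counter only simulates the repeated upload of `sourcedir.tar.gz`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyInput:
    """Hypothetical stand-in for a processing input; only the name matters."""
    input_name: str


@dataclass
class LeakyStep:
    """Hypothetical step whose `arguments` property has side effects on every read."""
    inputs: list = field(default_factory=list)
    uploads: int = 0

    @property
    def arguments(self) -> dict:
        self.uploads += 1                     # simulated re-upload of the code archive
        self.inputs.append(ToyInput("code"))  # mutates shared state: one more "code" input
        return {"ProcessingInputs": [i.input_name for i in self.inputs]}


@dataclass
class IdempotentStep:
    """Hypothetical step that builds its arguments once and reuses them."""
    inputs: list = field(default_factory=list)
    uploads: int = 0
    _cached: Optional[dict] = field(default=None, repr=False)

    @property
    def arguments(self) -> dict:
        if self._cached is None:
            self.uploads += 1                              # simulated upload happens once
            merged = list(self.inputs) + [ToyInput("code")]  # copy, so self.inputs is untouched
            self._cached = {"ProcessingInputs": [i.input_name for i in merged]}
        return self._cached


def access_three_times(step) -> dict:
    """Read the property three times, as a pipeline definition might."""
    result = {}
    for _ in range(3):
        result = step.arguments
    return result


leaky = LeakyStep(inputs=[ToyInput("data"), ToyInput("config")])
leaky_args = access_three_times(leaky)

fixed = IdempotentStep(inputs=[ToyInput("data"), ToyInput("config")])
fixed_args = access_three_times(fixed)

print(len(leaky_args["ProcessingInputs"]), leaky.uploads)  # 5 3
print(len(fixed_args["ProcessingInputs"]), fixed.uploads)  # 3 1
```

In the leaky variant the duplicate `code` names are what would collide with a uniqueness requirement on input names; the cached variant keeps the input count at 3 no matter how often the property is read.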