
Support multi-part uploads #45

Merged: 3 commits merged into aws:master on Jan 22, 2018

Conversation

jbencook (Author) commented:

For large datasets, the current Session.upload_data method fails. This PR switches the call to Object.upload_file, which can perform multi-part uploads. The unit tests have also been updated.
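For reference, a minimal sketch of that approach using boto3 directly: Object.upload_file streams the file and switches to multi-part uploads above a size threshold, unlike a single put of the whole body. The bucket and key names, the directory walk, and the TransferConfig tuning below are illustrative, not the SDK's actual implementation.

```python
import os

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.resource("s3")

# upload_file handles multi-part uploads automatically; the threshold below
# is boto3's default (8 MB), spelled out explicitly for clarity.
transfer_config = TransferConfig(multipart_threshold=8 * 1024 * 1024)


def upload_data(path, bucket, key_prefix):
    """Upload every file under `path` to s3://<bucket>/<key_prefix>/...

    Illustrative helper only -- the argument names mirror the description
    above, not the SDK's real signature.
    """
    for dirpath, _, filenames in os.walk(path):
        for filename in filenames:
            local_file = os.path.join(dirpath, filename)
            key = "{}/{}".format(key_prefix, os.path.relpath(local_file, start=path))
            s3.Object(bucket, key).upload_file(local_file, Config=transfer_config)
```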

owen-t previously approved these changes on Jan 18, 2018
@@ -20,3 +20,4 @@ examples/tensorflow/distributed_mnist/data
 doc/_build
 **/.DS_Store
 venv/
+*~
Contributor:

Can this make it into the commit message as well?

jbencook (Author):

Yeah, I accidentally added an Emacs file on our fork. You mean you want a new commit in this PR with a message about that change?

Contributor:

It's a good addition, and I'm happy for it to be part of this PR. If we leave it in this PR, can the commit message just reference the addition? Something like:

"Upload files to S3 using multipart uploads.

Emacs temporary files covered in .gitignore"

jbencook (Author):

Yeah, I'm happy to add it. Any idea if amending my commit message and force pushing to our fork will do it?

Contributor:

Yeah, that should be fine.

jbencook (Author):

OK, that worked.

owen-t (Contributor) commented on Jan 18, 2018:

Thanks for your submission!

Ignore Emacs backup files
lukmis merged commit 05d4b0b into aws:master on Jan 22, 2018
ragavvenkatesan pushed a commit that referenced this pull request on Jan 30, 2018:
* add sagemaker cli (#32)

* add sagemaker cli

* remove unnecessary close

* address PR comments

* tidy up imports

* fix imports, flake8 errors

* improve help message for bucket-name

* remove default role name

* fix log-level and py3 tests, add copyright

* update cli example scripts

* Add documentation about BYO Models (#47)

* Add test for BYO estimator using Factorization Machines algorithm as an example. (#50)

* Support multi-part uploads (#45)

* Update TensorFlow examples following API change (#44)

* Add data_type to hyperparameters (#54)

When we describe a training job, the data type of the hyperparameters is
lost because we use a dict[str, str]. This adds a new field to
Hyperparameter so that we can convert the data types at runtime.

Instead of validating with isinstance(), we cast the hp value to the type it
is meant to be. This enforces a "strongly typed" value and also makes the
string responses we deserialize from the API easier to deal with.
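As a rough illustration of the casting approach that commit message describes, a descriptor can cast on assignment. The class and attribute names below are hypothetical, not necessarily the SDK's actual Hyperparameter implementation.

```python
class Hyperparameter(object):
    """Descriptor that casts assigned values to a declared data type.

    Hypothetical sketch: casting (instead of isinstance() validation) means
    string values deserialized from API responses are converted to the
    intended type on assignment, and bad values fail with ValueError/TypeError.
    """

    def __init__(self, name, data_type=str):
        self.name = name
        self.data_type = data_type

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        # Cast rather than validate the incoming value.
        obj.__dict__[self.name] = self.data_type(value)


class Estimator(object):
    epochs = Hyperparameter("epochs", data_type=int)
    learning_rate = Hyperparameter("learning_rate", data_type=float)


est = Estimator()
est.epochs = "10"              # string from a describe-training-job response
assert est.epochs == 10        # cast to int on assignment
est.learning_rate = "0.1"
assert est.learning_rate == 0.1
```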
ragavvenkatesan mentioned this pull request on Jan 30, 2018
laurenyu pushed a commit to laurenyu/sagemaker-python-sdk that referenced this pull request on May 31, 2018
apacker pushed a commit to apacker/sagemaker-python-sdk that referenced this pull request on Nov 15, 2018