-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add AugmentedManifestFile & ShuffleConfig support #528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #528 +/- ##
==========================================
+ Coverage 92.7% 92.71% +<.01%
==========================================
Files 71 71
Lines 5336 5343 +7
==========================================
+ Hits 4947 4954 +7
Misses 389 389
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #528 +/- ##
==========================================
+ Coverage 92.7% 92.83% +0.12%
==========================================
Files 71 71
Lines 5336 5343 +7
==========================================
+ Hits 4947 4960 +13
+ Misses 389 383 -6
Continue to review full report at Codecov.
|
I'm going to add Lauren Yu to this as well |
Using a class for shuffle config is a great idea and keeping with the pattern used in s3 input. I don't think this needs an integration test as it's pretty straightforward API parameter passing. |
+1
…On Thu., 6 Dec. 2018, 10:37 am Ishaaq Chandy ***@***.*** wrote:
***@***.**** commented on this pull request.
------------------------------
In src/sagemaker/session.py
<#528 (comment)>
:
> @@ -1264,6 +1270,24 @@ def __init__(self, s3_data, distribution='FullyReplicated', compression=None,
self.config['RecordWrapperType'] = record_wrapping
if input_mode is not None:
self.config['InputMode'] = input_mode
+ if attribute_names is not None:
+ self.config['DataSource']['S3DataSource']['AttributeNames'] = attribute_names
+ if shuffle_config is not None:
+ self.config['ShuffleConfig'] = {'Seed': shuffle_config.seed}
+
+
+class ShuffleConfig(object):
Well, I had a choice of either taking a dict or a class. A dict would work
though it forces the user to go look at SageMaker docs to understand that
the dict requires a single key called "Seed" with a long value. We also
cannot take just the seed as input since SageMaker Training's ShuffleConfig
could eventually become a richer object.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#528 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/Ad189KdzWDUBv-RkbrPcEY_cgwi7Ic8Uks5u2WPvgaJpZM4ZF3vb>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. please add a changelog entry, and then I'll approve
Description of changes:
Add changes to support for AugmentedManifestFile and ShuffleConfig.
Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.