Limited size of parameters #314
Comments
Hi @PedroCardoso, each hyper-parameter in the map is limited: neither its key nor its value may be longer than 256 characters. Having many hyper-parameters is not a problem in itself, as long as every key and value stays within 256 characters. If a map value is a long list, though, it might exceed the limit. Could you give me a specific example? Then we can either recommend a better practice or increase the limit to a more reasonable number. Thanks |
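To see which entries would trip the limit described above before submitting a job, a small check along these lines may help. It is only a sketch: the 256-character figure is the one quoted in this thread, `oversized_hyperparameters` is a hypothetical helper (not part of the SDK), and it assumes non-string values are serialized with `json.dumps` before being sent, which may not match every SDK version.

```python
import json

# Per-key / per-value limit quoted in this thread (assumption: may change).
MAX_LEN = 256


def oversized_hyperparameters(hyperparameters, max_len=MAX_LEN):
    """Return {key: encoded_value_length} for entries exceeding max_len.

    Hypothetical helper. Assumes non-string values are JSON-encoded
    before submission, so a list counts with its brackets, quotes
    and separators.
    """
    offenders = {}
    for key, value in hyperparameters.items():
        encoded = value if isinstance(value, str) else json.dumps(value)
        if len(str(key)) > max_len or len(encoded) > max_len:
            offenders[key] = len(encoded)
    return offenders


# Example: 40 short labels in one list already exceed the limit,
# while a plain scalar does not.
example = {
    "learning_rate": 0.01,
    "labels": ["label_%03d" % i for i in range(40)],
}
too_long = oversized_hyperparameters(example)
```

Here only `labels` is reported, which matches the case discussed in this issue: many small parameters are fine, one long list is not.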
Hi @yangaws, I believe my particular problem is with sending a list of labels as a parameter. I need those labels to build the Estimator. As an example, think of a parameter that contains a list of 30 or 40 string objects. |
I am not confident that we will increase that limit any time soon. I can file a feature request for it. If we keep receiving such issues, we will definitely prioritize this feature. For now, my suggestion for your list of 30-40 labels is to pass all the labels through a separate channel in some common format like JSON. |
Is the channel information available in the parameters passed to estimator_fn()? |
Hello, I don't think the channels information is exposed to the estimator_fn(), as evident here https://github.com/aws/sagemaker-tensorflow-container/blob/master/src/tf_container/trainer.py#L92 I believe only the train_input_fn and eval_input_fn have access to the channels. A workaround for this is to use the hyperparameters to store the channel metadata. Like... |
Closing due to inactivity. Feel free to reopen if necessary. |
Just hit this issue, using a custom docker container to train a model and I can't specify the features I want to train on. 👎 |
Hitting the same thing too. It's odd that this notebook shows a value larger than 256 in the hyperparameters when that is actually not supported. |
For those hitting this: my solution was to pass the big parameters as a JSON file and have it sent to the job with a manifest file. |
do you have a sample for that? |
I too am interested in learning about this, since I'm currently using the hyperparams file for all my image annotation labels in an object recognition case, and there are too many labels apparently. |
I'm also stuck here. My use case is that I need to set the SAGEMAKER_SPARKML_SCHEMA environment variable when using the https://github.com/aws/sagemaker-sparkml-serving-container (required for CSV input) and I also have ~40 features to pass. I don't think this is an uncommon pattern |
Having this issue under the same context of @pnadolny13 . CSV inference requires passing the schema as an environment variable to the
I'd dare to say that 1024 characters is still a very small limit when dealing with high-dimensional schemas (in my case, I must specify 350+ features along with their data types). Is there any suggested workaround for this? |
I send the long parameters as JSON in an S3 object, passed to the job through a separate "parameters" channel.
|
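Since a sample was requested earlier in the thread, here is a rough sketch of the JSON-in-a-channel approach. It is not an official recipe: the channel name `parameters`, the file name `params.json` and both helper functions are made up for illustration. It relies on SageMaker mounting each input channel under `/opt/ml/input/data/<channel_name>` inside the training container; uploading the file to S3 and adding it to the `inputs` dictionary passed to `fit()` is left out because the bucket and paths are specific to each setup.

```python
import json
import os
import tempfile


def write_params_file(params, directory, filename="params.json"):
    """Write the large parameters to a JSON file to be uploaded to S3.

    Hypothetical helper: the resulting file would be uploaded and its
    S3 location passed as an extra channel (here called "parameters").
    """
    path = os.path.join(directory, filename)
    with open(path, "w") as f:
        json.dump(params, f)
    return path


def load_params(channel_name="parameters", filename="params.json",
                base_dir="/opt/ml/input/data"):
    """Read the parameters back inside the training container.

    Assumes the channel is mounted at <base_dir>/<channel_name>, the
    default layout for SageMaker training containers.
    """
    path = os.path.join(base_dir, channel_name, filename)
    with open(path) as f:
        return json.load(f)


# Local round trip that mimics the container layout in a temp directory.
fake_root = tempfile.mkdtemp()
channel_dir = os.path.join(fake_root, "parameters")
os.makedirs(channel_dir)
labels = ["label_%03d" % i for i in range(40)]
write_params_file({"labels": labels}, channel_dir)
restored = load_params(base_dir=fake_root)
```

The hyperparameters then only need to carry short values, while anything long (label lists, schemas, lists of paths) travels in the file.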
I have this problem in a new context:
The long_path values are really long S3 paths specific to me. This happens from this code:
This is a bit problematic since it's not immediately clear if there's a workaround. If I need to specify a long list of jars, there may be no other way to pass this information. Each jar1, jar2, etc. string will have s3://bucket_name/path_to_jar, so this effectively puts a very small cap on how many JARs there could be. |
System Information
Tensorflow
2.7
1.7.0
Describe the problem
When calling TensorFlow from the SDK, we are limited in the size of the parameters:
256 is small, in particular if you send a list of labels or have many parameters.
Minimal repro / logs