Will TensorFlow return the best model as a result of training? #48
The TensorFlow container saves the most recent exported model. We provide TensorFlow checkpoints, which contain older model checkpoints produced during training.
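For reference, a minimal sketch of listing those retained checkpoints from the training output directory (the path below is illustrative; on SageMaker the exact location depends on how the Estimator's `model_dir` is configured):

```python
import tensorflow as tf

# Illustrative path: wherever the Estimator wrote its checkpoints.
model_dir = "/opt/ml/model/checkpoints"

# The `checkpoint` index file tracks every retained checkpoint.
state = tf.train.get_checkpoint_state(model_dir)
if state is not None:
    for path in state.all_model_checkpoint_paths:
        print(path)  # e.g. .../model.ckpt-1000
```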
@owen-t so right now I have to manually evaluate all checkpoints, find the best one, and create a SageMaker model + endpoint based on it. That seems like something that makes sense to do in the SageMaker library; what do you think?
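For anyone doing this by hand today, a sketch of that loop, assuming you can rebuild the Estimator locally from the same `model_fn` and have an `eval_input_fn` (all names here are illustrative, not part of the SageMaker API):

```python
import tensorflow as tf

estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir=model_dir)
state = tf.train.get_checkpoint_state(model_dir)

# Evaluate every retained checkpoint and keep the one with the lowest loss.
best_path, best_loss = None, float("inf")
for ckpt in state.all_model_checkpoint_paths:
    metrics = estimator.evaluate(input_fn=eval_input_fn, checkpoint_path=ckpt)
    if metrics["loss"] < best_loss:
        best_path, best_loss = ckpt, metrics["loss"]

print("best checkpoint:", best_path, "with loss", best_loss)
```

The winning checkpoint can then be exported as a SavedModel and used to create the SageMaker model and endpoint.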
Agreed - allowing better control over model export in TensorFlow would be useful.
Using training hooks and evaluation hooks, you can do both best-model saving and early stopping; hooks can be attached through the training_hooks and evaluation_hooks arguments of the tf.estimator.EstimatorSpec returned by model_fn.
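The code snippet in the original comment was truncated; as one hedged illustration, the stock Estimator API can do both via `tf.estimator.train_and_evaluate`, with `BestExporter` for best-model saving and `stop_if_no_decrease_hook` for early stopping (the hook requires TF 1.13+, earlier releases had it under `tf.contrib.estimator`; `model_fn`, `serving_input_fn`, and the input functions are assumed to exist):

```python
import tensorflow as tf

estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir="/opt/ml/model")

# Stop training once eval loss has not improved for 1000 steps.
early_stopping = tf.estimator.experimental.stop_if_no_decrease_hook(
    estimator, metric_name="loss", max_steps_without_decrease=1000)

# Re-export a SavedModel only when a new checkpoint beats the best eval loss.
best_exporter = tf.estimator.BestExporter(
    serving_input_receiver_fn=serving_input_fn, exports_to_keep=1)

tf.estimator.train_and_evaluate(
    estimator,
    tf.estimator.TrainSpec(input_fn=train_input_fn, hooks=[early_stopping]),
    tf.estimator.EvalSpec(input_fn=eval_input_fn, exporters=best_exporter))
```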
Do you have more specific examples or links that better demonstrate how to utilize training and evaluation hooks? E.g., how would one construct a tf.train.SessionRunHook that can read the evaluation metrics from inside model_fn over many iterations and compare the loss? I assume the EstimatorSpec is returned by model_fn?
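For what a custom hook can look like, here is a minimal sketch (not from the thread) of a `tf.train.SessionRunHook` that fetches a loss tensor around every run and requests a stop once it stops improving; the tensor name `"loss:0"` and the patience value are illustrative assumptions about your graph:

```python
import tensorflow as tf

class EarlyStoppingHook(tf.train.SessionRunHook):
    """Request a stop when the monitored loss has not improved for `patience` runs."""

    def __init__(self, loss_name="loss:0", patience=5):
        self._loss_name = loss_name  # assumed name of a loss tensor in the graph
        self._patience = patience
        self._best = float("inf")
        self._bad_runs = 0

    def before_run(self, run_context):
        # Ask the session to additionally fetch the loss on this run.
        loss = run_context.session.graph.get_tensor_by_name(self._loss_name)
        return tf.train.SessionRunArgs(loss)

    def after_run(self, run_context, run_values):
        loss = run_values.results
        if loss < self._best:
            self._best, self._bad_runs = loss, 0
        else:
            self._bad_runs += 1
            if self._bad_runs >= self._patience:
                run_context.request_stop()
```

A hook like this would be passed via `training_hooks` in the `EstimatorSpec` returned by `model_fn`.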
I would refer to this feature request: |
Closing due to inactivity and the introduction of script mode with our TensorFlow containers. Script mode allows for greater flexibility in writing the TF training script, which should allow for using the hooks described above. For more information about script mode, see our TF README. |
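For context, launching such a script in script mode looks roughly like this with the SageMaker Python SDK (parameter names follow SDK v2; every value below is a placeholder):

```python
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",  # your own training script; free to register hooks
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="1.15",  # check the TF README for supported versions
    py_version="py3",
)
estimator.fit({"training": "s3://my-bucket/train"})  # placeholder S3 path
```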
I'm trying to understand the model evaluation strategy implemented in the TF container. My goal is to save the most accurate model, not the most recent one. Is that possible via this API somehow?