System Information
MXNet Gluon
Python 3.6
GPU
Using custom model for training and inference
Question:
Can we load model parameters that are trained somewhere else (on our server) and create a training job?
I have trained my model on our server and now I want to create a training job for that model. I have adapted the model to SageMaker and am trying to create a training job by loading the previously saved parameters, but I am getting the following error while loading them:
‘AssertionError: Parameter embedding0_weight is missing in file /opt/ml/input/data/training/encoder_.params’
Any suggestions will be helpful!
Thanks,
Harathi
Yes, you can load model parameters that were trained on your server without problems. You can pass them in as a channel (using File mode), as a file included with your source code, or as data that is downloaded during training.
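For the channel option, a minimal sketch of locating the parameter file inside the training script might look like the following. It assumes the channel is named "training", which matches the `/opt/ml/input/data/training` directory in the error message, and it reads the `SM_CHANNEL_TRAINING` environment variable when the container sets it. The helper name `find_params_file` is hypothetical and not part of any SDK.

```python
import os

# Assumption: the channel is named "training", so its files are mounted at
# the directory that appears in the error message.
DEFAULT_CHANNEL_DIR = "/opt/ml/input/data/training"


def find_params_file(filename, channel_dir=None):
    """Return the full path of `filename` inside the channel directory.

    Hypothetical helper. Raises FileNotFoundError that lists what is
    actually in the directory, which helps spot a misnamed upload.
    """
    if channel_dir is None:
        # SM_CHANNEL_TRAINING may not be set in every container version,
        # so fall back to the default mount point.
        channel_dir = os.environ.get("SM_CHANNEL_TRAINING", DEFAULT_CHANNEL_DIR)

    path = os.path.join(channel_dir, filename)
    if not os.path.isfile(path):
        present = sorted(os.listdir(channel_dir)) if os.path.isdir(channel_dir) else []
        raise FileNotFoundError(
            "{} not found in {}; files present: {}".format(filename, channel_dir, present)
        )
    return path
```

The returned path can then be handed to whatever loading call the model already uses. Note that in this issue the file itself was found, so the path is probably not the cause of the error.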
I would need to understand more about how you are loading this data. My rough interpretation of the error is that the weights are not being saved or loaded properly in the training job.
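One way to narrow this down is to compare the parameter names the network expects with the names stored in the file. The sketch below is plain Python and only works on lists of names; it assumes you can obtain them yourself, for example from `net.collect_params().keys()` and from the keys of the dict returned by `mx.nd.load(path)`. The function name `diagnose_param_names` and the `encoder_` prefix in the example are hypothetical, since the original loading code is not shown.

```python
def diagnose_param_names(expected, saved):
    """Compare expected parameter names with the names stored in a file.

    Returns (missing, unexpected, suggestions), where `suggestions` maps each
    missing name to saved names that differ from it only by a leading prefix.
    """
    expected, saved = set(expected), set(saved)
    missing = sorted(expected - saved)      # the network wants these, the file lacks them
    unexpected = sorted(saved - expected)   # the file has these, the network does not

    suggestions = {}
    for name in missing:
        # A differing name prefix is one plausible cause of this assertion.
        candidates = [s for s in unexpected if s.endswith(name) or name.endswith(s)]
        if candidates:
            suggestions[name] = candidates
    return missing, unexpected, suggestions


# Hypothetical names for illustration only.
missing, unexpected, suggestions = diagnose_param_names(
    expected=["embedding0_weight"],
    saved=["encoder_embedding0_weight"],
)
print(missing)      # ['embedding0_weight']
print(suggestions)  # {'embedding0_weight': ['encoder_embedding0_weight']}
```

If the suggestions show a consistent prefix difference, the network was probably built with a different prefix or name scope than the one used when the parameters were saved.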
I will close this ticket, given that it does not seem to be a Python SDK issue.
Feel free to open an additional issue for other questions.