hyperparameters.json hyperparameter encoding breaks backward compatibility #3487

Closed
maekawataiki opened this issue Nov 25, 2022 · 1 comment

@maekawataiki

What did you find confusing? Please describe.
Since #3344 (included from version 2.110.0), each hyperparameter passed to script mode SageMaker through /opt/ml/input/config/hyperparameters.json is JSON-encoded. This breaks script mode code that reads hyperparameters directly from the file.

Up to 2.109.0:

{
  "hyperparameter-key": "hyperparameter-value"
}

From 2.110.0:

{
  "hyperparameter-key": "\"hyperparameter-value\""
}
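
To make the difference concrete, here is a minimal sketch of what a script that reads the file directly would see in each case. The two JSON documents are the ones shown above; the helper name `naive_read` is hypothetical and only stands in for a script that parses the file once.

```python
import json

# File contents up to 2.109.0 and from 2.110.0 on (copied from the examples above).
OLD_FILE = '{"hyperparameter-key": "hyperparameter-value"}'
NEW_FILE = '{"hyperparameter-key": "\\"hyperparameter-value\\""}'


def naive_read(file_text):
    """Parse the file once, the way a script that expects plain strings would."""
    return json.loads(file_text)


old_value = naive_read(OLD_FILE)["hyperparameter-key"]
new_value = naive_read(NEW_FILE)["hyperparameter-key"]

print(repr(old_value))  # 'hyperparameter-value'
print(repr(new_value))  # '"hyperparameter-value"' (the quotes are part of the string)

# A second json.loads call removes the extra layer of encoding.
print(repr(json.loads(new_value)))  # 'hyperparameter-value'
```

Any comparison such as `value == "hyperparameter-value"` succeeds against the old file and fails against the new one, which is the breakage described here.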

SageMaker Training Toolkit handles this by decoding each hyperparameter and supplying it to the program through either the SM_HPS environment variable or command-line arguments. Custom containers do not get this decoding, so developers have to handle it themselves. The change in version 2.110.0 is therefore a breaking change for custom containers that do not implement the decoding step.

There is no clear documentation on how to handle /opt/ml/input/config/hyperparameters.json properly. The documentation only states that the hyperparameters are in the file. It would be better to document how to read the hyperparameters correctly so that scripts work across SageMaker versions.

Describe how documentation can be improved
Add a note to the documentation explaining how the values in /opt/ml/input/config/hyperparameters.json should be parsed.
For example:

Each hyperparameter value in hyperparameters.json is encoded as a JSON string.

For example, you can use the following code to get the decoded content.

import json
import logging

logger = logging.getLogger(__name__)


def read_hyperparameters():  # type: () -> dict
    """Read the hyperparameters from /opt/ml/input/config/hyperparameters.json.

    For more information about hyperparameters.json:
    https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo-running-container.html#your-algorithms-training-algo-running-container-hyperparameters

    Returns:
         (dict[string, object]): A dictionary containing the hyperparameters.
    """
    with open("/opt/ml/input/config/hyperparameters.json", "r") as f:
        hyperparameters = json.load(f)

    deserialized_hps = {}

    for k, v in hyperparameters.items():
        try:
            v = json.loads(v)
        except (ValueError, TypeError):
            logger.info(
                "Failed to parse hyperparameter %s value %s to Json.\n"
                "Returning the value itself",
                k,
                v,
            )

        deserialized_hps[k] = v

    return deserialized_hps

Note
Hyperparameters are passed as plain strings up to sagemaker 2.109.0. Use the sample code to stay compatible across SageMaker versions.
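
As a quick check of that compatibility claim, the sketch below applies the same decode-with-fallback idea to in-memory dictionaries instead of the real file. The function name `decode_values` and the hyperparameter names are hypothetical, chosen only for illustration.

```python
import json


def decode_values(raw):
    """Decode each value with json.loads, keeping the original string on failure."""
    decoded = {}
    for key, value in raw.items():
        try:
            decoded[key] = json.loads(value)
        except (ValueError, TypeError):
            # Not valid JSON (for example a bare word such as adam): keep as-is.
            decoded[key] = value
    return decoded


# Hypothetical contents of hyperparameters.json after the first json.load.
plain = {"optimizer": "adam", "epochs": "10"}        # plain-string format
encoded = {"optimizer": "\"adam\"", "epochs": "10"}  # JSON-encoded format

print(decode_values(plain))    # {'optimizer': 'adam', 'epochs': 10}
print(decode_values(encoded))  # {'optimizer': 'adam', 'epochs': 10}
```

One caveat of this approach: a plain string that happens to be valid JSON, such as "10", "true", or "null", is decoded to an int, a bool, or None, so under the plain-string format it cannot be told apart from a value that was meant to stay a string.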

@maekawataiki
Author

The current SageMaker documentation already describes the format of hyperparameters.json, so the error-handling approach described above is not mandatory going forward.

/opt/ml/input/config contains information to control how your program runs. hyperparameters.json is a JSON-formatted dictionary of hyperparameter names to values. These values will always be strings, so you may need to convert them. resourceConfig.json is a JSON-formatted file that describes the network layout used for distributed training. Since scikit-learn doesn’t support distributed training, we’ll ignore it here.

https://sagemaker-examples.readthedocs.io/en/latest/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.html
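
Because the quoted documentation says the values are always strings, a training script usually converts them to the types it expects. The following is a minimal sketch of one way to do that, not something taken from the SageMaker documentation: the `SCHEMA` mapping, the `convert` function, and the hyperparameter names are all hypothetical.

```python
import json

# Hypothetical schema: hyperparameter name -> (converter, default value).
SCHEMA = {
    "learning-rate": (float, 0.01),
    "epochs": (int, 1),
    "optimizer": (str, "sgd"),
}


def convert(raw, schema=SCHEMA):
    """Return typed hyperparameters, falling back to defaults for missing keys."""
    typed = {}
    for name, (converter, default) in schema.items():
        if name not in raw:
            typed[name] = default
            continue
        value = raw[name]
        try:
            # Remove one layer of JSON encoding if the value has one.
            value = json.loads(value)
        except (ValueError, TypeError):
            pass  # plain string, use it unchanged
        typed[name] = converter(value)
    return typed


hps = convert({"learning-rate": "0.1", "optimizer": "\"adam\""})
print(hps)  # {'learning-rate': 0.1, 'epochs': 1, 'optimizer': 'adam'}
```

Declaring the expected type per hyperparameter keeps the conversion explicit, so the script does not depend on whether a given value arrived as a plain string or as a JSON-encoded one.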

If you use a BYOL container with a SageMaker version older than 2.110.0, either follow the code described above or explicitly pin the SageMaker version.
