
Estimator.delete_endpoint does not delete the Model or Endpoint Configuration #447

Closed
mvsusp opened this issue Oct 27, 2018 · 2 comments

@mvsusp (Contributor) commented Oct 27, 2018

Please fill out the form below.

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): any framework container
  • Python Version: 2 and 3
  • Python SDK Version: 1.12

Describe the problem

Issue found by @andrewcking in #402

In SageMaker Hosting, creating an endpoint requires creating a model, an endpoint configuration, and an endpoint.

The SageMaker Python SDK abstracts these three resources for you. These are the lines where this process happens:

        # 1. Create the SageMaker Model from the container definition
        container_def = self.prepare_container_def(instance_type)
        self.name = self.name or name_from_image(container_def['Image'])
        self.sagemaker_session.create_model(self.name, self.role, container_def, vpc_config=self.vpc_config)

        # 2. Create the endpoint configuration (from the production variant) and 3. the endpoint itself
        production_variant = sagemaker.production_variant(self.name, instance_type, initial_instance_count)
        self.endpoint_name = endpoint_name or self.name
        self.sagemaker_session.endpoint_from_production_variants(self.endpoint_name, [production_variant], tags)

To create an endpoint with the same name again through the Python SDK, you have to delete the model, the endpoint configuration, and the endpoint. The SDK's delete_endpoint() only deletes the endpoint itself, which is why this issue happens.
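Until the SDK exposes this, one workaround is to delete all three resources directly with boto3. A minimal sketch, assuming the common case where the endpoint has a single production variant (the endpoint name below is a placeholder):

import boto3

sm = boto3.client('sagemaker')
endpoint_name = 'my-endpoint'  # placeholder: use the name you deployed with

# Look up the endpoint configuration and model behind the endpoint before deleting it
config_name = sm.describe_endpoint(EndpointName=endpoint_name)['EndpointConfigName']
config = sm.describe_endpoint_config(EndpointConfigName=config_name)
model_name = config['ProductionVariants'][0]['ModelName']

# Delete all three resources so the names can be reused
sm.delete_endpoint(EndpointName=endpoint_name)
sm.delete_endpoint_config(EndpointConfigName=config_name)
sm.delete_model(ModelName=model_name)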

Minimal repro / logs

Please provide any logs and a bare minimum reproducible test case, as this will be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

  • Exact command to reproduce:
from sagemaker.pytorch import PyTorch

# create estimator
estimator = PyTorch(entry_point='train.py', 
                    role='SageMakerRole', 
                    framework_version='0.4.0', 
                    train_instance_count=1, 
                    train_instance_type='ml.c5.xlarge', 
                    source_dir='source', 
                    hyperparameters={'epochs': 6,})

# fit estimator
estimator.fit('s3://sagemaker-sample-data-us-west-2/spark/mnist/train/')

# deploy estimator (except it doesn't deploy your new estimator if a previously used endpoint name is specified)
estimator.deploy(instance_type='ml.c5.xlarge', initial_instance_count=1)

# delete endpoint
estimator.delete_endpoint()

# deploy again
estimator.deploy(instance_type='ml.c5.xlarge', initial_instance_count=1, endpoint_name=estimator.latest_training_job.name)
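For what it's worth, the leftovers can be observed with boto3: if you capture the resource names before the delete, describe calls on the endpoint configuration and model still succeed afterwards. A rough sketch, assuming predictor.endpoint on the object returned by deploy (SDK 1.x) and a single production variant:

import boto3

sm = boto3.client('sagemaker')

predictor = estimator.deploy(instance_type='ml.c5.xlarge', initial_instance_count=1)
config_name = sm.describe_endpoint(EndpointName=predictor.endpoint)['EndpointConfigName']
model_name = sm.describe_endpoint_config(EndpointConfigName=config_name)['ProductionVariants'][0]['ModelName']

estimator.delete_endpoint()

# Both of these still succeed: only the endpoint itself was removed
sm.describe_endpoint_config(EndpointConfigName=config_name)
sm.describe_model(ModelName=model_name)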
@ChoiByungWook (Contributor) commented:

These features are being worked on in the following PRs:

The ability to delete endpoint configurations has been merged.

@chuyang-deng (Contributor) commented:

The change has been merged, waiting on release.
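Once released, the cleanup should be possible from the SDK session itself. A rough sketch of what usage might look like, assuming the merged change adds delete_endpoint_config and delete_model to sagemaker.session.Session (method and resource names here are assumptions; check the release notes):

sess = estimator.sagemaker_session

estimator.delete_endpoint()
# hypothetical names: pass the endpoint config / model names your deploy actually created
sess.delete_endpoint_config('my-endpoint-config')
sess.delete_model('my-model')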
