Make sourcedir.tar.gz and repacked model.tar.gz structure consistent #3491

Open
plienhar opened this issue Nov 29, 2022 · 0 comments

When deploying a model together with customer code (described by one or more of the Model.__init__ arguments entry_point, source_dir, and dependencies), the SDK (more precisely, the relevant Model.prepare_container_def method) handles the customer code in one of two ways:

  • Either it bundles all the code artifacts in a sourcedir.tar.gz file. The file is staged to S3, then later downloaded and extracted in the container at /opt/ml/model/code. If supplied, the entry_point file, the content of the source_dir directory, and each dependency in dependencies are copied to the root of the tar file. This behavior is implemented by the sagemaker.fw_utils.tar_and_upload_dir function.
  • Or it repacks the model and code artifacts together in a single model.tar.gz file. The file is staged to S3, then later downloaded by the container's host and extracted in the container at /opt/ml/model. Inside model.tar.gz, the code artifacts (the entry_point file and the content of the source_dir directory, if supplied) are placed in a code folder relative to the root of the tar file. If supplied, each dependency in dependencies is placed in a code/lib folder. This behavior is implemented by the sagemaker.utils._create_or_update_code_dir function.

In both cases, the code artifacts end up available in the inference container at /opt/ml/model/code. However, an inconsistency appears when dependencies is used. In that case, the dependencies end up located:

  • In /opt/ml/model/code if the code was bundled in a sourcedir.tar.gz file.
  • In /opt/ml/model/code/lib if the code was repacked with the model artifacts in a model.tar.gz file.
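The difference between the two layouts can be reproduced locally with the standard library alone. The snippet below is only a sketch of the structure described above, not the SDK's implementation: the helper names (build_sourcedir_tar, build_repacked_tar, container_paths) and the file names (inference.py, my_lib.py, model.pth) are made up for illustration.

```python
import os
import tarfile
import tempfile


def build_sourcedir_tar(entry_point, dependencies, out_path):
    """Hypothetical helper: entry point and dependencies all go to the tar root."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(entry_point, arcname=os.path.basename(entry_point))
        for dep in dependencies:
            tar.add(dep, arcname=os.path.basename(dep))


def build_repacked_tar(model_file, entry_point, dependencies, out_path):
    """Hypothetical helper: code goes under code/, dependencies under code/lib/."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(model_file, arcname=os.path.basename(model_file))
        tar.add(entry_point, arcname="code/" + os.path.basename(entry_point))
        for dep in dependencies:
            tar.add(dep, arcname="code/lib/" + os.path.basename(dep))


def container_paths(tar_path, extract_root):
    """List where each archived file would land if extracted at extract_root."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(
            extract_root + "/" + member.name
            for member in tar.getmembers()
            if member.isfile()
        )


with tempfile.TemporaryDirectory() as tmp:

    def touch(name):
        # Create a small placeholder file standing in for a real artifact.
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write("# placeholder\n")
        return path

    entry = touch("inference.py")
    dep = touch("my_lib.py")
    model = touch("model.pth")

    sourcedir_tar = os.path.join(tmp, "sourcedir.tar.gz")
    repacked_tar = os.path.join(tmp, "model.tar.gz")
    build_sourcedir_tar(entry, [dep], sourcedir_tar)
    build_repacked_tar(model, entry, [dep], repacked_tar)

    # sourcedir.tar.gz is extracted at /opt/ml/model/code,
    # the repacked model.tar.gz at /opt/ml/model.
    sourcedir_paths = container_paths(sourcedir_tar, "/opt/ml/model/code")
    repacked_paths = container_paths(repacked_tar, "/opt/ml/model")

print(sourcedir_paths)
print(repacked_paths)
```

Comparing the two printed lists shows the same dependency file landing in two different directories depending on the packaging path.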

The SageMaker inference toolkits automatically add /opt/ml/model and /opt/ml/model/code to sys.path, but not /opt/ml/model/code/lib. Therefore, dependencies located in the latter directory cannot be imported through the Python import system unless the user manually adds this location to sys.path. This ultimately boils down to the inconsistency in the file structure, which is annoying because the choice between a sourcedir.tar.gz and a repacked model.tar.gz is opaque to the user (and highly framework-dependent).
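For reference, the manual workaround currently looks roughly like the following when placed at the top of the entry_point script. This is a minimal sketch: the function name add_lib_dir_to_sys_path is hypothetical, and the path is the one described above for the repacked model.tar.gz case.

```python
import os
import sys

# Location of dependencies when the code was repacked into model.tar.gz,
# as described in this issue.
LIB_DIR = "/opt/ml/model/code/lib"


def add_lib_dir_to_sys_path(lib_dir=LIB_DIR):
    """Prepend lib_dir to sys.path if it exists and is not already listed.

    Returns True if sys.path was modified, False otherwise.
    """
    if os.path.isdir(lib_dir) and lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
        return True
    return False


# Must run before importing anything that lives in code/lib.
add_lib_dir_to_sys_path()
```

Because the function is a no-op when the directory is missing, the same entry point keeps working when the code is shipped as a sourcedir.tar.gz instead.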

Notice: We do not consider the Multi-Model Enabled (MME) mode here.

IMHO, the solution with minimal impact would be not to create a code/lib directory in the repacked model.tar.gz and simply copy the dependencies to the code directory instead. Dependencies from a repacked model.tar.gz would then be directly available under /opt/ml/model/code, which is already automatically added to sys.path by the inference toolkits. This solution would simply align the structure of the repacked model.tar.gz file with that of sourcedir.tar.gz. Since the latter is already in use, this fix should not raise backward-compatibility issues.
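To make the proposal concrete, here is a rough sketch of the change in staging logic. It is not the actual code of sagemaker.utils._create_or_update_code_dir: the function stage_dependencies and its use_lib_subdir flag are hypothetical and only contrast the current behavior (code/lib) with the proposed one (code).

```python
import os
import shutil


def stage_dependencies(code_dir, dependencies, use_lib_subdir):
    """Copy each dependency into code_dir/lib (current) or code_dir (proposed).

    Hypothetical illustration only. Returns the directory that received
    the dependencies.
    """
    target = os.path.join(code_dir, "lib") if use_lib_subdir else code_dir
    os.makedirs(target, exist_ok=True)
    for dep in dependencies:
        # normpath strips a trailing slash so basename is never empty.
        dest = os.path.join(target, os.path.basename(os.path.normpath(dep)))
        if os.path.isdir(dep):
            shutil.copytree(dep, dest)
        else:
            shutil.copy2(dep, dest)
    return target
```

With use_lib_subdir=False, a dependency named my_lib.py would sit next to the entry point in the code folder, which matches the sourcedir.tar.gz layout.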

This topic directly relates to the following issues:

  • Issue 1065 - Failed to import code copied into the /opt/ml/model/code/lib directory
  • Issue 1832 - Extra lib directory when adding dependencies for PyTorchModel