requirements.txt file is processed in tensorflow script mode even if not specified #839

filthysocks · 2019-06-07T14:14:12Z

System Information

Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): TensorFlow
Framework Version: 1.12.0
Python Version: 3.7
CPU or GPU: CPU
Python SDK Version: 3.7
Are you using a custom image: no

Describe the problem

I'm not 100% sure if it's a bug or just confusing documentation. However, when I train a TensorFlow model and specify a source_dir that contains a requirements.txt at its root, that requirements.txt is processed even though I didn't specify one. If I do specify a requirements.txt, an exception is thrown telling me I cannot specify a requirements.txt in script mode.

Minimal repro / logs

log:

INFO:sagemaker:Creating training-job with name: sagemaker-tensorflow-scriptmode-2019-06-07-14-08-57-606
2019-06-07 14:08:58 Starting - Starting the training job...
2019-06-07 14:09:00 Starting - Launching requested ML instances......
2019-06-07 14:10:05 Starting - Preparing the instances for training......
2019-06-07 14:11:27 Downloading - Downloading input data
2019-06-07 14:11:27 Training - Training image download completed. Training in progress...
2019-06-07 14:11:30,871 sagemaker-containers INFO     Imported framework sagemaker_tensorflow_container.training
2019-06-07 14:11:30,877 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2019-06-07 14:11:31,178 sagemaker-containers INFO     Installing module with the following command:
/usr/bin/python -m pip install -U . -r requirements.txt

Exact command to reproduce:

from sagemaker.tensorflow import TensorFlow

tf_estimator = TensorFlow(
    entry_point='train.py',
    source_dir='src',
    role=role,
    train_instance_type='ml.m5.large',
    train_instance_count=1,
    framework_version='1.12.0',
    py_version='py3'
)
tf_estimator.fit(inputs=inputs)

Package structure:

* src/
** train.py
** requirements.txt

laurenyu · 2019-06-07T16:48:59Z

Hi @filthysocks, apologies for the confusion! The requirements_file argument applies only to the legacy TensorFlow images. The newer images, which are what we are adopting going forward, instead detect a file named "requirements.txt" in the source directory and install it (which is what you have discovered).
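
As a rough illustration of that detection, here is a minimal sketch for checking locally whether a source directory would trigger the install before a job is launched. It is not the container's actual code, and `find_requirements` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def find_requirements(source_dir: str) -> Optional[Path]:
    """Return the path to requirements.txt at the root of source_dir, or None.

    Only the root of the directory is checked; nested requirements
    files are ignored in this sketch.
    """
    candidate = Path(source_dir) / "requirements.txt"
    return candidate if candidate.is_file() else None
```

With the layout from the original post, `find_requirements('src')` would return the path to `src/requirements.txt`, which is the case where the pip install line shows up in the training log.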

jerrygb · 2020-01-23T16:32:04Z

@laurenyu Is this still the case?

I receive an error when using script mode.

AttributeError: training_steps, evaluation_steps, requirements_file, checkpoint_path are deprecated in script mode. Please do not set requirements_file.

Can you confirm what legacy images are and how they relate to script mode?

laurenyu · 2020-01-23T22:37:25Z

@jerrygb the requirements_file argument in the TensorFlow constructor does still only apply to legacy mode. The legacy images are ones that required your training script to implement a specific list of methods, whereas script mode allows you to use almost any TensorFlow script with little modification for running on SageMaker. You can read more in our documentation.
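
To make the distinction concrete, here is a hypothetical sketch of the kind of guard that could produce the error quoted above. It is an illustration only, not the SDK's source: the function names are made up, and the argument list is simply copied from the error message.

```python
from typing import Any, Dict, List

# Argument names taken from the error message quoted in this thread.
LEGACY_ONLY_ARGS = (
    "training_steps",
    "evaluation_steps",
    "requirements_file",
    "checkpoint_path",
)


def legacy_args_in_use(kwargs: Dict[str, Any]) -> List[str]:
    """Return the legacy-only arguments that were given a non-None value."""
    return [name for name in LEGACY_ONLY_ARGS if kwargs.get(name) is not None]


def check_script_mode_args(script_mode: bool, **kwargs: Any) -> None:
    """Raise AttributeError if legacy-only arguments are set in script mode."""
    offending = legacy_args_in_use(kwargs)
    if script_mode and offending:
        raise AttributeError(
            "{} are deprecated in script mode. Please do not set {}.".format(
                ", ".join(LEGACY_ONLY_ARGS), ", ".join(offending)
            )
        )
```

In this sketch, passing `requirements_file` together with `script_mode=True` raises, while the same argument with `script_mode=False` (legacy mode) is accepted.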

Specifically about this issue, the way to use requirements.txt with the current TF images is still to put the requirements.txt file in the same directory as the training script and pass that directory to the estimator as source_dir, i.e.

estimator = TensorFlow(
    entry_point='train.py',
    source_dir='path/to/source',
    ...
)

jerrygb · 2020-02-13T02:21:36Z

Thanks @laurenyu

Nimrods · 2020-03-03T07:35:00Z

@laurenyu Can you explain again how to make the TensorFlow estimator install packages from a requirements file before running my script?
Currently, I install the packages in my entry_point script...

laurenyu · 2020-03-03T23:03:24Z

@Nimrods at this time, the recommended method (assuming TF > 1.13) is to use a shell script for your entry point and have it call pip install -r requirements.txt and then execute your training script. You can read more about using a shell script as an entry point in this example notebook.
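
If you would rather stay in Python than write a shell script, the same two steps (install, then train) can be sketched as a wrapper entry point. This is a swapped-in alternative under stated assumptions, not the approach from the example notebook: the function names, the `train.py` script name, and the file layout are all hypothetical.

```python
import subprocess
import sys
from typing import List


def pip_install_command(requirements: str = "requirements.txt") -> List[str]:
    """Build the pip command, using the interpreter that runs this wrapper."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def training_command(script: str, args: List[str]) -> List[str]:
    """Build the command that runs the real training script with its arguments."""
    return [sys.executable, script] + list(args)


def main(argv: List[str]) -> None:
    # Install dependencies first; check=True stops the job if pip fails.
    subprocess.run(pip_install_command(), check=True)
    # Then hand over to the actual training script (hypothetical name),
    # forwarding whatever arguments the wrapper itself received.
    subprocess.run(training_command("train.py", argv), check=True)
```

To use it as an entry point, the wrapper file would end with a call to `main(sys.argv[1:])`, so that the arguments passed to the wrapper are forwarded to the training script.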

There are also some other ways to use a requirements file:

* If you're using an older version of TF with script mode, as in the original post in this thread, then simply having a file named "requirements.txt" in the same folder as your entry script (see my comment above) will work.
* You can include a setup.py file with your entry script and have that file list the required dependencies.
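
For the setup.py option, one way to avoid maintaining the dependency list twice is to read it from requirements.txt. The helper below is a minimal sketch of that idea; `parse_requirements` is a hypothetical name, and it only handles plain requirement lines.

```python
from typing import List


def parse_requirements(text: str) -> List[str]:
    """Return requirement specifiers from requirements.txt content.

    Blank lines, comments, and pip options (lines starting with '-',
    such as '-r other.txt') are skipped; this sketch does not resolve them.
    """
    requirements = []
    for raw_line in text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw_line.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements
```

In a setup.py placed next to the entry script, the result could be passed to `setuptools.setup` through its `install_requires` keyword. Listing the dependencies directly in `install_requires` works just as well if you don't need the separate file.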

Given popular demand, we are working on returning to automatically detecting if there's a file named "requirements.txt" and pip installing it for a future release. (This is also the behavior we have settled on with the other frameworks.)

Nimrods · 2020-03-05T05:37:54Z

@laurenyu Thank you!
I've tried the option of having the "requirements.txt" in the same folder as the entry-point script, but it didn't work.
I'll try the shell script option (with a newer TF version)

laurenyu · 2020-05-20T19:09:27Z

docs have been updated in #1512.

laurenyu added the type: documentation label Jun 7, 2019

xkumiyu mentioned this issue Jul 7, 2019: Not detect requirements.txt in TensorFlow script mode #911 (Closed)

ajaykarpur mentioned this issue May 19, 2020: doc: clarify support for requirements.txt in Tensorflow docs #1512 (Merged)

laurenyu closed this as completed May 20, 2020

laurenyu mentioned this issue May 26, 2020: doc: fix TF requirements.txt documentation #1526 (Merged)
