<!--
# Copyright (c) 2021-2022, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-->

# Running E2E Benchmarks

### Set up Triton Inference Server

##### Pull Triton Inference Server Docker Image
Pull the Docker image from NGC (https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver) that is suitable for your environment.

Example:

```
docker pull nvcr.io/nvidia/tritonserver:22.02-py3
```

##### Start Triton Inference Server container
Note that the image tag in the `docker run` command must match the tag you pulled above.
```
cd ${MORPHEUS_ROOT}/models

docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model sid-minibert-onnx --load-model abp-nvsmi-xgb --load-model phishing-bert-onnx
```

##### Verify Model Deployments
Once the Triton server finishes starting up, it will display the status of all loaded models. A successful deployment will show the following:

```
+--------------------+---------+--------+
| Model              | Version | Status |
+--------------------+---------+--------+
| abp-nvsmi-xgb      | 1       | READY  |
| phishing-bert-onnx | 1       | READY  |
| sid-minibert-onnx  | 1       | READY  |
+--------------------+---------+--------+
```
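
Readiness can also be confirmed programmatically: Triton's HTTP endpoint (port 8000 in the `docker run` command) exposes KServe-style readiness routes. The sketch below is illustrative rather than part of the Morpheus test suite, and it assumes the default localhost port mapping shown above:

```python
# Sketch: query Triton's HTTP readiness endpoints (assumes the server is
# reachable on localhost:8000, as mapped in the docker run command above).
import http.client

def ready_path(model=None):
    """Build the readiness route: server-wide, or for one model."""
    return f"/v2/models/{model}/ready" if model else "/v2/health/ready"

def is_ready(host="localhost", port=8000, model=None, timeout=2.0):
    """Return True if the server (or a specific model) reports ready."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", ready_path(model))
        return conn.getresponse().status == 200
    except OSError:
        return False  # server not reachable (yet)

if __name__ == "__main__":
    for name in ("sid-minibert-onnx", "abp-nvsmi-xgb", "phishing-bert-onnx"):
        print(name, "ready" if is_ready(model=name) else "not ready")
```

A `curl localhost:8000/v2/health/ready` from the shell performs the same server-wide check.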

### Run E2E Benchmarks

Benchmarks are run using `pytest-benchmark`. Benchmarks for an individual workflow can be run using the following:

```
cd tests/benchmarks

pytest -s --benchmark-enable --benchmark-autosave test_bench_e2e_pipelines.py::<test-workflow>
```
The `-s` option allows pipeline output to be displayed so you can verify there are no errors while running your benchmarks.

`<test-workflow>` is the name of the test to benchmark: `test_sid_nlp_e2e`, `test_abp_fil_e2e`, `test_phishing_nlp_e2e`, or `test_cloudtrail_ae_e2e`.

For example, to run E2E benchmarks on the SID NLP workflow:
```
pytest -s --benchmark-enable --benchmark-autosave test_bench_e2e_pipelines.py::test_sid_nlp_e2e
```

To run E2E benchmarks on all workflows:
```
pytest -s --benchmark-enable --benchmark-autosave test_bench_e2e_pipelines.py
```

The console output should look like this:
```
------------------------------------------------------------------------------------------- benchmark: 4 tests ------------------------------------------------------------------------------------------
Name (time in ms)                Min                   Max                  Mean             StdDev                Median                IQR            Outliers      OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_phishing_nlp_e2e       834.5413 (1.0)        892.8774 (1.0)        858.9724 (1.0)      22.5832 (1.0)        854.7082 (1.0)       31.7465 (1.0)          2;0  1.1642 (1.0)           5           1
test_sid_nlp_e2e          2,055.0733 (2.46)     2,118.1255 (2.37)     2,095.8951 (2.44)     26.2586 (1.16)     2,105.8771 (2.46)     38.5301 (1.21)         1;0  0.4771 (0.41)          5           1
test_abp_fil_e2e          5,016.7639 (6.01)     5,292.9841 (5.93)     5,179.0901 (6.03)    121.5466 (5.38)     5,195.2253 (6.08)    215.2213 (6.78)         1;0  0.1931 (0.17)          5           1
test_cloudtrail_ae_e2e    6,929.7436 (8.30)     7,157.0487 (8.02)     6,995.1969 (8.14)     92.8935 (4.11)     6,971.9611 (8.16)     87.2056 (2.75)         1;1  0.1430 (0.12)          5           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```
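
The Min/Max/Mean/StdDev/Median columns come from repeated timed rounds of each pipeline. As a rough illustration only (a stand-in workload, not the Morpheus pipelines or `pytest-benchmark` internals), this is the kind of measurement loop behind those columns:

```python
# Illustrative sketch of per-round timing behind the report columns
# (stand-in workload, not the Morpheus pipelines themselves).
import statistics
import time

def benchmark(fn, rounds=5):
    """Time fn over several rounds and summarize, like the report columns."""
    times = []
    for _ in range(rounds):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1000.0)  # milliseconds
    return {
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        "stddev": statistics.stdev(times),
        "median": statistics.median(times),
        "ops": 1000.0 / statistics.mean(times),  # rounds per second
        "rounds": rounds,
    }

if __name__ == "__main__":
    stats = benchmark(lambda: sum(i * i for i in range(100_000)))
    print({k: round(v, 4) for k, v in stats.items()})
```

The `(1.0)`, `(2.46)`, etc. annotations in the real report are each value's ratio to the fastest test in that column.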

A comprehensive report for each test run is saved to a JSON file in `./tests/benchmarks/.benchmarks`. This includes throughput (lines/sec, bytes/sec), GPU info, and the Morpheus configs for each test workflow.
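
The autosaved reports are plain JSON with a `benchmarks` list, each entry carrying a `stats` mapping; exact keys can vary between `pytest-benchmark` versions, so treat the layout below as an assumption. A minimal sketch that extracts the mean time per test (the sample dict mimics the layout, with values taken from the table above, rather than reading a real report file):

```python
# Sketch: summarize a pytest-benchmark autosave report. The sample below
# mimics the JSON layout ("benchmarks" -> "stats"); with a real run you
# would json.load() a file from ./tests/benchmarks/.benchmarks instead.
import json

def mean_times(report):
    """Map each benchmark name to its mean time in seconds."""
    return {b["name"]: b["stats"]["mean"] for b in report["benchmarks"]}

sample = json.loads("""
{
  "benchmarks": [
    {"name": "test_phishing_nlp_e2e", "stats": {"mean": 0.8590}},
    {"name": "test_sid_nlp_e2e",      "stats": {"mean": 2.0959}}
  ]
}
""")

if __name__ == "__main__":
    for name, mean in mean_times(sample).items():
        print(f"{name}: {mean:.4f} s")
```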