<!--
# Copyright (c) 2021-2022, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-->

# Running E2E Benchmarks

### Set up Triton Inference Server

##### Pull Triton Inference Server Docker Image
Pull the Docker image from NGC (https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver) that is suitable for your environment.

Example:

```
docker pull nvcr.io/nvidia/tritonserver:22.02-py3
```

##### Start Triton Inference Server container
Note that the image tag in the `docker run` command must match the tag you pulled above.
```
cd ${MORPHEUS_ROOT}/models

docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model sid-minibert-onnx --load-model abp-nvsmi-xgb --load-model phishing-bert-onnx
```

##### Verify Model Deployments
Once the Triton server finishes starting up, it will display the status of all loaded models. A successful deployment will show the following:

```
+--------------------+---------+--------+
| Model              | Version | Status |
+--------------------+---------+--------+
| abp-nvsmi-xgb      | 1       | READY  |
| phishing-bert-onnx | 1       | READY  |
| sid-minibert-onnx  | 1       | READY  |
+--------------------+---------+--------+
```
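
Readiness can also be confirmed programmatically: Triton's HTTP endpoint (port 8000 in the `docker run` command) exposes KServe-style readiness routes. The sketch below is illustrative rather than part of the Morpheus test suite, and it assumes the default localhost port mapping shown above:

```python
# Sketch: query Triton's HTTP readiness endpoints (assumes the server is
# reachable on localhost:8000, as mapped in the docker run command above).
import http.client

def ready_path(model=None):
    """Build the readiness route: server-wide, or for one model."""
    return f"/v2/models/{model}/ready" if model else "/v2/health/ready"

def is_ready(host="localhost", port=8000, model=None, timeout=2.0):
    """Return True if the server (or a specific model) reports ready."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", ready_path(model))
        return conn.getresponse().status == 200
    except OSError:
        return False  # server not reachable (yet)

if __name__ == "__main__":
    for name in ("sid-minibert-onnx", "abp-nvsmi-xgb", "phishing-bert-onnx"):
        print(name, "ready" if is_ready(model=name) else "not ready")
```

A `curl localhost:8000/v2/health/ready` from the shell performs the same server-wide check.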

### Run E2E Benchmarks

Benchmarks are run using `pytest-benchmark`. Benchmarks for an individual workflow can be run using the following:

```
cd tests/benchmarks

pytest -s --benchmark-enable --benchmark-autosave test_bench_e2e_pipelines.py::<test-workflow>
```
The `-s` option allows pipeline output to be displayed so you can verify there are no errors while running your benchmarks.

`<test-workflow>` is the name of the test to benchmark: `test_sid_nlp_e2e`, `test_abp_fil_e2e`, `test_phishing_nlp_e2e`, or `test_cloudtrail_ae_e2e`.

For example, to run E2E benchmarks on the SID NLP workflow:
```
pytest -s --benchmark-enable --benchmark-autosave test_bench_e2e_pipelines.py::test_sid_nlp_e2e
```

To run E2E benchmarks on all workflows:
```
pytest -s --benchmark-enable --benchmark-autosave test_bench_e2e_pipelines.py
```

The console output should look like this:
```
------------------------------------------------------------------------------------------- benchmark: 4 tests ------------------------------------------------------------------------------------------
Name (time in ms)                Min                   Max                  Mean             StdDev                Median                IQR            Outliers      OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_phishing_nlp_e2e       834.5413 (1.0)        892.8774 (1.0)        858.9724 (1.0)      22.5832 (1.0)        854.7082 (1.0)       31.7465 (1.0)          2;0  1.1642 (1.0)           5           1
test_sid_nlp_e2e          2,055.0733 (2.46)     2,118.1255 (2.37)     2,095.8951 (2.44)     26.2586 (1.16)     2,105.8771 (2.46)     38.5301 (1.21)         1;0  0.4771 (0.41)          5           1
test_abp_fil_e2e          5,016.7639 (6.01)     5,292.9841 (5.93)     5,179.0901 (6.03)    121.5466 (5.38)     5,195.2253 (6.08)    215.2213 (6.78)         1;0  0.1931 (0.17)          5           1
test_cloudtrail_ae_e2e    6,929.7436 (8.30)     7,157.0487 (8.02)     6,995.1969 (8.14)     92.8935 (4.11)     6,971.9611 (8.16)     87.2056 (2.75)         1;1  0.1430 (0.12)          5           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```
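
The Min/Max/Mean/StdDev/Median columns come from repeated timed rounds of each pipeline. As a rough illustration only (a stand-in workload, not the Morpheus pipelines or `pytest-benchmark` internals), this is the kind of measurement loop behind those columns:

```python
# Illustrative sketch of per-round timing behind the report columns
# (stand-in workload, not the Morpheus pipelines themselves).
import statistics
import time

def benchmark(fn, rounds=5):
    """Time fn over several rounds and summarize, like the report columns."""
    times = []
    for _ in range(rounds):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1000.0)  # milliseconds
    return {
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        "stddev": statistics.stdev(times),
        "median": statistics.median(times),
        "ops": 1000.0 / statistics.mean(times),  # rounds per second
        "rounds": rounds,
    }

if __name__ == "__main__":
    stats = benchmark(lambda: sum(i * i for i in range(100_000)))
    print({k: round(v, 4) for k, v in stats.items()})
```

The `(1.0)`, `(2.46)`, etc. annotations in the real report are each value's ratio to the fastest test in that column.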

A comprehensive report for each test run is saved to a JSON file in `./tests/benchmarks/.benchmarks`. This includes throughput (lines/sec, bytes/sec), GPU info, and the Morpheus configs for each test workflow.
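
The autosaved reports are plain JSON with a `benchmarks` list, each entry carrying a `stats` mapping; exact keys can vary between `pytest-benchmark` versions, so treat the layout below as an assumption. A minimal sketch that extracts the mean time per test (the sample dict mimics the layout, with values taken from the table above, rather than reading a real report file):

```python
# Sketch: summarize a pytest-benchmark autosave report. The sample below
# mimics the JSON layout ("benchmarks" -> "stats"); with a real run you
# would json.load() a file from ./tests/benchmarks/.benchmarks instead.
import json

def mean_times(report):
    """Map each benchmark name to its mean time in seconds."""
    return {b["name"]: b["stats"]["mean"] for b in report["benchmarks"]}

sample = json.loads("""
{
  "benchmarks": [
    {"name": "test_phishing_nlp_e2e", "stats": {"mean": 0.8590}},
    {"name": "test_sid_nlp_e2e",      "stats": {"mean": 2.0959}}
  ]
}
""")

if __name__ == "__main__":
    for name, mean in mean_times(sample).items():
        print(f"{name}: {mean:.4f} s")
```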