The LPG benchmark tool works by sending traffic to the specified target IP and port, and collecting the results. Follow the steps below to run a single benchmark. You can deploy multiple LPG instances if you want to run benchmarks in parallel against different targets.

1. Get the target IP. The examples below show how to get the IP of a gateway or a LoadBalancer k8s service.

    ```bash
    # Get gateway IP
    GW_IP=$(kubectl get gateway/<gateway-name> -o jsonpath='{.status.addresses[0].value}')
    echo $GW_IP

    # Get LoadBalancer k8s service IP
    SVC_IP=$(kubectl get service/<service-name> -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    echo $SVC_IP
    ```

1. Then update the `<target-ip>` in `./config/manifests/benchmark/benchmark.yaml` to your target IP (the `$GW_IP` or `$SVC_IP` value from the previous step). Feel free to adjust other parameters such as `request_rates` as well. For a complete list of LPG configurations, please refer to the [LPG user guide](https://github.com/AI-Hypercomputer/inference-benchmark?tab=readme-ov-file#configuring-the-benchmark).
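
    If you'd rather script this substitution, here is a minimal sketch. It assumes you run it from the repo root and are benchmarking the gateway; swap in `$SVC_IP` for `$GW_IP` when targeting the k8s service directly.

    ```bash
    # Replace the <target-ip> placeholder with the gateway IP
    # (GNU sed syntax; on macOS use: sed -i '')
    sed -i "s/<target-ip>/${GW_IP}/g" ./config/manifests/benchmark/benchmark.yaml
    ```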

1. Start the benchmark tool: `kubectl apply -f ./config/manifests/benchmark/benchmark.yaml`

1. Wait for benchmark to finish and download the results. Use the `benchmark_id` environment variable to specify what this benchmark is for. For instance, `inference-extension` or `k8s-svc`. When the LPG tool finishes benchmarking, it will print a log line `LPG_FINISHED`. The script below will watch for that log line and then start downloading results.
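
    As a minimal sketch of this step, assuming the `./download-benchmark-results.bash` script mentioned in the Tips section is invoked from its own directory:

    ```bash
    # Label this run; results land under output/<run_id>/<benchmark_id>/results/json
    benchmark_id='k8s-svc' ./download-benchmark-results.bash
    ```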

1. After the script finishes, you should see benchmark results under the `./tools/benchmark/output/default-run/my-benchmark/results/json` folder, where `my-benchmark` is the `benchmark_id` you set. Here is a [sample json file](./sample.json).
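
    To quickly confirm what was downloaded (assuming the default run id `default-run` and a `benchmark_id` of `my-benchmark`):

    ```bash
    ls ./tools/benchmark/output/default-run/my-benchmark/results/json
    ```
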
### Tips

* When using a `benchmark_id` other than `k8s-svc` or `inference-extension`, the labels in `./tools/benchmark/benchmark.ipynb` must be updated accordingly to analyze the results.
* You can specify the `run_id="runX"` environment variable when running the `./download-benchmark-results.bash` script. This is useful when you run benchmarks multiple times to get more statistically meaningful results and group them accordingly (see the sketch after this list).
* Update the `request_rates` to best suit your benchmark environment.
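
A sketch combining both variables on a repeat run (the values here are illustrative):

```bash
# Second repetition of the k8s service benchmark; results are grouped
# under ./tools/benchmark/output/run2/k8s-svc/results/json
run_id='run2' benchmark_id='k8s-svc' ./download-benchmark-results.bash
```
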
### Advanced Benchmark Configurations

Please refer to the [LPG user guide](https://github.com/AI-Hypercomputer/inference-benchmark?tab=readme-ov-file#configuring-the-benchmark) for a detailed list of configuration knobs.
## Analyze the results

This guide shows how to run the Jupyter notebook using VS Code.

1. Create a Python virtual environment.
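
    A minimal sketch (it assumes `python3` with the built-in `venv` module is on your PATH; the exact packages the notebook needs may differ):

    ```bash
    python3 -m venv .venv                  # create the virtual environment
    source .venv/bin/activate              # activate it in the current shell
    pip install jupyter pandas matplotlib  # assumed notebook dependencies
    ```
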
1. Open the notebook `./tools/benchmark/benchmark.ipynb`, and run each cell. In the last cell, update the benchmark ids with `inference-extension` and `k8s-svc`. At the end you should see a bar chart like below, where **"ie"** represents inference extension. This chart is generated using this benchmarking tool with 6 vLLM (v1) model servers (H100 80 GB), [llama2-7b](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/tree/main) and the [ShareGPT dataset](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json).