Vegeta is a versatile HTTP load testing tool that drives HTTP services at a constant request rate.
This guide shows how to use Vegeta to load test a KServe endpoint. KServe endpoints serve machine learning models, so load testing them helps identify bottlenecks and confirm that performance and reliability hold up under heavy load.
You can use this approach to create a variety of tests covering different scenarios, such as quick tests, tests with different batch sizes, or “soak tests” that run for hours to check stability.
Project website: https://github.com/tsenart/vegeta
Here’s an example YAML for testing a KServe endpoint:
vegeta-inference-service-test.yaml
apiVersion: batch/v1
kind: Job
metadata:
  generateName: load-test
spec:
  backoffLimit: 6
  parallelism: 1
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
    spec:
      restartPolicy: OnFailure
      containers:
        - args:
            - vegeta -cpus=4 attack -duration=2m -rate=100/1s -targets=/var/vegeta/cfg
              | vegeta report -type=text
          command:
            - sh
            - -c
          image: peterevans/vegeta:latest
          imagePullPolicy: Always
          name: vegeta
          volumeMounts:
            - mountPath: /var/vegeta
              name: vegeta-cfg
      volumes:
        - configMap:
            defaultMode: 420
            name: vegeta-cfg
          name: vegeta-cfg
---
apiVersion: v1
data:
  cfg: |
    POST http://churn-gbm-onnx.my-namespace.svc.cluster.local/v2/models/churn-gbm-onnx/infer
    @/var/vegeta/payload
  payload: |
    {"inputs": [{"name": "float_input", "datatype": "FP32", "shape": [10, 13], "data": [[14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015], [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015]]}]}
kind: ConfigMap
metadata:
  annotations:
  name: vegeta-cfg
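As a quick sanity check on the volume this Job generates, the constant rate and duration can be multiplied out. The sketch below is only illustrative: the helper names are hypothetical, and the parser handles only simple single-unit values such as 2m or 1s, not every duration string Vegeta accepts.

```python
def parse_duration_seconds(text):
    """Convert a simple duration such as '2m', '30s' or '1h' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return float(text[:-1]) * units[text[-1]]


def total_requests(rate, duration):
    """Requests sent at a constant rate such as '100/1s' over a duration such as '2m'."""
    count, per = rate.split("/")
    per_second = float(count) / parse_duration_seconds(per)
    return int(per_second * parse_duration_seconds(duration))


requests = total_requests("100/1s", "2m")
rows = requests * 10  # each request in the example payload carries a batch of 10 rows
print(requests, rows)  # prints "12000 120000"
```

With the settings in the example, the endpoint receives 12,000 requests and scores 120,000 rows over the two-minute run.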
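To customize this test for your endpoint, edit the target URL in cfg, the request body in payload, and the -duration and -rate arguments of the vegeta attack command.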
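For tests at other batch sizes, the payload can be generated rather than edited by hand. The following is a minimal sketch, assuming the model takes a single input named float_input with 13 FP32 features per row, as in the example payload. The names build_payload and EXAMPLE_ROW are hypothetical and introduced here only for illustration.

```python
import json

# Feature row copied from the example payload (13 features).
EXAMPLE_ROW = [14200.0, 14200.0, 14200.0, 195175.0, 1, 1, 195175, 95, 386.0, 2022, 11, 48, 2015]


def build_payload(row, batch_size, input_name="float_input", datatype="FP32"):
    """Build an inference request body that repeats `row` batch_size times."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "inputs": [
            {
                "name": input_name,
                "datatype": datatype,
                "shape": [batch_size, len(row)],
                "data": [list(row) for _ in range(batch_size)],
            }
        ]
    }


if __name__ == "__main__":
    # Print a single-line JSON body suitable for the payload key of the ConfigMap.
    print(json.dumps(build_payload(EXAMPLE_ROW, batch_size=32)))
```

Repeating one row keeps the request size realistic but does not exercise varied inputs. Substitute representative rows from your own data if the model's latency depends on the feature values.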
To run this test:
$ kubectl create -f vegeta-inference-service-test.yaml
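If you want to turn a run into a pass/fail check, one option is to change the report step to vegeta report -type=json and compare the result against thresholds. The sketch below assumes the JSON report exposes a latencies object with a 99th field in nanoseconds and a success ratio between 0 and 1; verify these field names against the output of your Vegeta version. The function name, thresholds, and sample values are hypothetical.

```python
def check_report(report, max_p99_ms=500.0, min_success=0.99):
    """Return a list of human-readable threshold violations (empty if none)."""
    problems = []
    # Assumption: latencies are reported in nanoseconds under the "latencies" key.
    p99_ms = report["latencies"]["99th"] / 1e6
    if p99_ms > max_p99_ms:
        problems.append(f"p99 latency {p99_ms:.1f} ms exceeds {max_p99_ms:.1f} ms")
    # Assumption: "success" is the fraction of successful responses (0.0 to 1.0).
    if report["success"] < min_success:
        problems.append(f"success ratio {report['success']:.3f} is below {min_success:.3f}")
    return problems


# Made-up values for illustration only, not measured results.
sample_report = {"latencies": {"99th": 750_000_000}, "success": 0.995}
for problem in check_report(sample_report):
    print(problem)
```

In practice the report dictionary would come from parsing the Job's output with json.loads instead of the hard-coded sample.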