Continuous Benchmark Tool

What is the Continuous Benchmark Tool?

The Continuous Benchmark Tool lets you benchmark a Vald cluster continuously, 24/7.

The assumed use cases are:

  • Verification under a workload close to the production environment
  • Verification before rolling out to a service when upgrading the Vald version

Architecture

The Continuous Benchmark Tool has the following two components:

  • Benchmark Operator: Manages benchmark jobs
  • Benchmark Job: Executes CRUD requests against the target Vald cluster

Benchmark components and their features

Benchmark Operator

  • Manages benchmark jobs according to the applied manifest.
  • Apply methods:
    • Scenario method: one manifest with multiple benchmark jobs
    • Job method: one manifest with one benchmark job

Benchmark Job

  • Executes CRUD requests against the target Vald cluster based on the defined config.
  • The execution steps are:
    1. Load the dataset (valid only for the HDF5 format)
    2. Execute requests with the loaded dataset

Benchmark CRD

The benchmark workload is set by applying a Kubernetes custom resource (defined by a CRD), either ValdBenchmarkScenarioResource or ValdBenchmarkJobResource. The Benchmark Operator manages benchmark jobs according to the applied manifest.

ValdBenchmarkJob

ValdBenchmarkJob is used for executing a single benchmark job.

The Benchmark Operator also applies it to the Kubernetes cluster based on a ValdBenchmarkScenarioResource.

main properties

| Name | mandatory | Description | type | sample |
| --- | --- | --- | --- | --- |
| target | * | target Vald cluster | object | ref: target |
| dataset | * | dataset information | object | ref: dataset |
| job_type | * | execute job type | string enum: [insert, update, upsert, remove, search, getobject, exists] | search |
| repetition | | the number of job repetitions (default: 1) | integer | 1 |
| replica | | the number of concurrent job executions (default: 1) | integer | 2 |
| rps | | designed requests per second to the target cluster (default: 1000) | integer | 1000 |
| concurrency_limit | | goroutine count limit for rps adjustment (default: 200) | integer | 20 |
| ttl_seconds_after_finished | | time until the Pod is deleted after the job ends (default: 600) | integer | 120 |
| insert_config | | request config for insert job | object | ref: config |
| update_config | | request config for update job | object | ref: config |
| upsert_config | | request config for upsert job | object | ref: config |
| search_config | | request config for search job | object | ref: config |
| remove_config | | request config for remove job | object | ref: config |
| object_config | | request config for object job | object | ref: config |
| client_config | | gRPC client config for running the benchmark job. Tune it if you cannot get the expected performance with the default config. | object | ref: defaults.grpc |
| server_config | | server config for the benchmark job pod. Tune it if you cannot get the expected performance with the default config. | object | ref: defaults.server_config |
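
To show how these properties fit together, here is a minimal sketch of a ValdBenchmarkJob manifest for a search job. The apiVersion, the metadata name, and the placement of the properties under spec are assumptions that this page does not state; verify them against the CRD installed in your cluster. The property values are taken from the sample columns of the tables in this section.

```yaml
# Sketch only: apiVersion, metadata.name, and the spec wrapper are assumptions.
apiVersion: vald.vdaas.org/v1
kind: ValdBenchmarkJob
metadata:
  name: sample-search-job   # hypothetical name
spec:
  target:
    host: localhost         # sample value; point this at your Vald gateway
    port: 8081
  dataset:
    name: fashion-mnist
    group: test
    indexes: 1000
    range:
      start: 1
      end: 1000
  job_type: search
  repetition: 1
  replica: 1
  rps: 1000                 # the documented default
  concurrency_limit: 200    # the documented default
  ttl_seconds_after_finished: 600
  search_config:
    num: 10                 # the only mandatory search_config property
    timeout: 3s
```

Only target, dataset, and job_type are marked mandatory; the other top-level properties are written out here just to make the documented defaults visible.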

target

  • target Vald cluster information
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| host | * | target cluster’s host | string | localhost |
| port | * | target cluster’s port | integer | 8081 |

dataset

  • dataset which is used for executing job operation
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| name | * | dataset name | string enum: [fashion-mnist, original] | fashion-mnist |
| group | * | group name | string enum: [train, test, neighbors] | train |
| indexes | * | number of indexes | integer | 1000000 |
| range | * | range of indexes to be used (if there are many indexes, the range will be corrected on the job side) | object | - |
| range.start | * | start of range | integer | 1 |
| range.end | * | end of range | integer | 1000000 |
| url | | the dataset URL. It must be set when name is original | string | |
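
As a sketch of the original case, the fragment below shows a dataset block that points to a custom dataset. The URL is a placeholder rather than a real dataset, and the index counts are illustrative values, not recommendations.

```yaml
# Fragment of a job spec. The url value is hypothetical.
dataset:
  name: original
  group: train
  indexes: 100000
  range:
    start: 1
    end: 100000
  url: https://example.com/datasets/my-vectors.hdf5   # placeholder URL
```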

insert_config

  • rpc config for insert request
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| skip_strict_exist_check | | Check whether the same vector is already inserted or not. The ID should be unique if the value is true. | bool | false |
| timestamp | | The timestamp of the vector inserted. If it is N/A, the current time will be used. | string | 1707272658 |

update_config

  • rpc config for update request
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| skip_strict_exist_check | | Check whether the same vector is already inserted or not. The ID should be unique if the value is true. | bool | false |
| timestamp | | The timestamp of the vector inserted. If it is N/A, the current time will be used. | string | 1707272658 |
| disable_balanced_update | | A flag to disable balanced update (split remove -> insert operation) during update operation. | bool | false |
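
For instance, an update_config block inside a job spec might look like the following sketch. The values are the sample values from the table, and the timestamp is quoted on the assumption that the CRD validates it as a string, as the type column states.

```yaml
# Fragment of a job spec with job_type: update.
update_config:
  skip_strict_exist_check: false
  timestamp: "1707272658"          # quoted so YAML parses it as a string
  disable_balanced_update: false
```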

upsert_config

  • rpc config for upsert request
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| skip_strict_exist_check | | Check whether the same vector is already inserted or not. The ID should be unique if the value is true. | bool | false |
| timestamp | | The timestamp of the vector inserted. If it is N/A, the current time will be used. | string | 1707272658 |
| disable_balanced_update | | A flag to disable balanced update (split remove -> insert operation) during update operation. | bool | false |

search_config

  • rpc config for search request
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| radius | | The search radius (default: -1) | number | -1 |
| epsilon | | The search coefficient (default: 0.05) | number | 0.05 |
| num | * | The maximum number of results to be returned. | integer | 10 |
| min_num | | The minimum number of results to be returned. | integer | 5 |
| timeout | | Search timeout in nanoseconds (default: 10s) | string | 3s |
| enable_linear_search | | A flag to enable linear search operation for estimating search recall. If it is true, search operation with linear operation will execute. | bool | false |
| aggregation_algorithm | | The search aggregation algorithm option (default: Unknown) | string enum: [Unknown, ConcurrentQueue, SortSlice, SortPoolSlice, PairingHeap] | |
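
As an illustration, the fragment below is a sketch of a search configuration that also runs linear search for recall estimation and selects an aggregation algorithm. All keys and enum values come from the table above; the particular combination is only an example, not a tuned recommendation.

```yaml
# Fragment of a job spec with job_type: search.
# Key name follows the search_config entry in the ValdBenchmarkJob main properties.
search_config:
  num: 10                        # mandatory: maximum number of results
  min_num: 5
  radius: -1                     # the documented default
  epsilon: 0.05                  # the documented default
  timeout: 3s
  enable_linear_search: true     # also run linear search to estimate recall
  aggregation_algorithm: ConcurrentQueue
```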

remove_config

  • rpc config for remove request
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| skip_strict_exist_check | | Check whether the same vector is already inserted or not. The ID should be unique if the value is true. | bool | false |
| timestamp | | The timestamp of the vector inserted. If it is N/A, the current time will be used. | string | 1707272658 |

object_config

  • rpc config for get object request
  • type: object
| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| filter_config.targets | | filter target host and port for bypassing filter component. | []object | |

ValdBenchmarkScenario

ValdBenchmarkScenario is used for executing one or more benchmark jobs.

The Benchmark Operator decomposes the manifest and creates the benchmark resources one by one. The target and dataset properties are the global config for the scenario; each job can override them with its own config.

main properties

| property | mandatory | description | type | sample |
| --- | --- | --- | --- | --- |
| target | * | target Vald cluster information. It is overridden when a job has its own config. | object | ref: target |
| dataset | * | dataset information. It is overridden when a job has its own config. | object | ref: dataset |
| jobs | * | benchmark job config. The jobs are executed in the order they are written. | object | ref: benchmark job |
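
The following is a minimal sketch of a scenario that inserts vectors and then searches them. The apiVersion, the metadata name, and the spec wrapper are assumptions not stated on this page. jobs is written as a YAML list on the assumption that it holds one entry per job, executed top to bottom. The second job overrides the global dataset with its own.

```yaml
# Sketch only: apiVersion, metadata.name, the spec wrapper, and the list form of jobs are assumptions.
apiVersion: vald.vdaas.org/v1
kind: ValdBenchmarkScenario
metadata:
  name: sample-scenario       # hypothetical name
spec:
  # Global config, used by every job that does not define its own.
  target:
    host: localhost
    port: 8081
  dataset:
    name: fashion-mnist
    group: train
    indexes: 100000
    range:
      start: 1
      end: 100000
  jobs:
    # 1st job: insert, using the global target and dataset.
    - job_type: insert
      rps: 500
      insert_config:
        skip_strict_exist_check: false
    # 2nd job: search, overriding the global dataset.
    - job_type: search
      dataset:
        name: fashion-mnist
        group: test
        indexes: 1000
        range:
          start: 1
          end: 1000
      search_config:
        num: 10
        timeout: 3s
```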

Deploy Benchmark Operator

The continuous benchmark operator can be deployed with Helm, in the same way as a Vald cluster.

A ValdBenchmarkOperatorRelease resource is used to configure the vald-benchmark-operator deployment.

Applying it is not mandatory, so edit and apply it only as necessary.

Sample ValdBenchmarkOperatorRelease YAML
# @schema {"name": "name", "type": "string"}
# name -- name of the deployment
name: vald-benchmark-operator
# @schema {"name": "time_zone", "type": "string"}
# time_zone -- time_zone
time_zone: ""
# @schema {"name": "image", "type": "object"}
image:
  # @schema {"name": "image.repository", "type": "string"}
  # image.repository -- image repository
  repository: vdaas/vald-benchmark-operator
  # @schema {"name": "image.tag", "type": "string"}
  # image.tag -- image tag
  tag: v1.7.5
  # @schema {"name": "image.pullPolicy", "type": "string", "enum": ["Always", "Never", "IfNotPresent"]}
  # image.pullPolicy -- image pull policy
  pullPolicy: Always
# @schema {"name": "job_image", "type": "object"}
job_image:
  # @schema {"name": "job_image.repository", "type": "string"}
  # job_image.repository -- job image repository
  repository: vdaas/vald-benchmark-job
  # @schema {"name": "job_image.tag", "type": "string"}
  # job_image.tag -- image tag for job docker image
  tag: v1.7.5
  # @schema {"name": "job_image.pullPolicy", "type": "string", "enum": ["Always", "Never", "IfNotPresent"]}
  # job_image.pullPolicy -- job image pull policy
  pullPolicy: Always
# @schema {"name": "resources", "type": "object"}
# resources -- kubernetes resources of pod
resources:
  # @schema {"name": "resources.limits", "type": "object"}
  limits:
    cpu: 300m
    memory: 300Mi
  # @schema {"name": "resources.requests", "type": "object"}
  requests:
    cpu: 200m
    memory: 200Mi
# @schema {"name": "logging", "type": "object"}
logging:
  # @schema {"name": "logging.logger", "type": "string", "enum": ["glg", "zap"]}
  # logging.logger -- logger name.
  logger: glg
  # @schema {"name": "logging.level", "type": "string", "enum": ["debug", "info", "warn", "error", "fatal"]}
  # logging.level -- logging level.
  level: debug
  # @schema {"name": "logging.format", "type": "string", "enum": ["raw", "json"]}
  # logging.format -- logging format.
  format: raw

For more details of the configuration of vald-benchmark-operator-release, please refer to here

  1. Add the Vald repo to the Helm repos

    helm repo add vald https://vdaas.vald.org
    
  2. Deploy vald-benchmark-operator-release

    helm install vald-benchmark-operator-release vald/vald-benchmark-operator
    
  3. Apply vbor.yaml (optional)

    kubectl apply -f vbor.yaml
    

Running Continuous Benchmarks

After deploying the Benchmark Operator, you can run continuous benchmarks by applying a ValdBenchmarkScenarioRelease or ValdBenchmarkJobRelease.

Configure the benchmark you designed and apply it with the kubectl command.

The sample manifests are here.

See also