[ovs-dev] [PATCH ovn v3 3/3] tests: Add check-perf target
Mark Gray
mark.d.gray at redhat.com
Wed Jun 30 12:49:21 UTC 2021
On 24/06/2021 16:34, Dumitru Ceara wrote:
> On 6/18/21 10:52 AM, Mark Gray wrote:
>> Add a suite of micro-benchmarks to aid a developer in understanding the
>> performance impact of any changes that they are making. They can be used to
>> help to understand the relative performance between two test runs on the same
>> test machine, but are not intended to give the absolute performance of OVN.
>>
>> To invoke the performance testsuite, run:
>>
>> $ make check-perf
>>
>> This will run all available performance tests.
>>
>> Additional metrics (e.g. memory, coverage, perf counters) may be added
>> in the future. Additional tests (e.g. additional topologies, ovn-controller
>> tests) may be added in the future.
>>
>> Signed-off-by: Mark Gray <mark.d.gray at redhat.com>
>> ---
>
> Thanks for this, I think this is a very good idea!
Thanks (and thanks for the review!), I think it is a good starting point
but it will only be useful if it gets used! Personally, I can see myself
using it. However, if it doesn't get used, it is not very invasive and
could easily be left to rot or removed.
>
>>
>> Notes:
>> v2: create results directory to fix build error
>> v3: forgot to commit, create results directory to fix build error
>>
>> Documentation/topics/testing.rst | 49 ++++++++
>> tests/.gitignore | 3 +
>> tests/automake.mk | 27 ++++
>> tests/perf-northd.at | 207 +++++++++++++++++++++++++++++++
>> tests/perf-testsuite.at | 26 ++++
>> 5 files changed, 312 insertions(+)
>> create mode 100644 tests/perf-northd.at
>> create mode 100644 tests/perf-testsuite.at
>>
>> diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
>> index be9e7c57331c..ccd3278437b1 100644
>> --- a/Documentation/topics/testing.rst
>> +++ b/Documentation/topics/testing.rst
>> @@ -256,3 +256,52 @@ the following::
>> All the features documented under `Unit Tests`_ are available for the
>> datapath testsuites, except that the datapath testsuites do not
>> support running tests in parallel.
>> +
>> +Performance testing
>> +~~~~~~~~~~~~~~~~~~~
>> +
>> +OVN includes a suite of micro-benchmarks to aid a developer in understanding the
>> +performance impact of any changes that they are making. They can be used to
>> +help to understand the relative performance between two test runs on the same
>> +test machine, but are not intended to give the absolute performance of OVN.
>> +
>> +To invoke the performance testsuite, run::
>> +
>> + $ make check-perf
>> +
>> +This will run all available performance tests. Some of these tests may be
>> +long-running as they need to build complex logical network topologies. In order
>> +to speed up subsequent test runs, some objects (e.g. the Northbound DB) may be
>> +cached. In order to force the tests to rebuild all these objects, run::
>> +
>> + $ make check-perf TESTSUITEFLAGS="--rebuild"
>> +
>> +A typical workflow for a developer trying to improve the performance of OVN
>> +would be the following:
>> +
>> +0. Optional: Modify/add a performance test to build the topology that you are
>> + benchmarking, if required.
>> +1. Run ``make check-perf TESTSUITEFLAGS="--rebuild"`` to generate cached
>> + databases.
>> +
>> +.. note::
>> + This step may take some time depending on the number of tests that are being
>> + rebuilt, the complexity of the tests and the performance of the test machine.
>> + If you are only using one test, you can specify the test to run by adding the
>> + test number to the ``make`` command.
>> + (e.g. ``make check-perf TESTSUITEFLAGS="--rebuild <test number>"``)
>> +
>> +2. Run ``make check-perf`` to measure the performance metric that you are
>> +   benchmarking against. If you are only using one test, you can specify the
>> +   test to run by adding the test number to the ``make`` command.
>> +   (e.g. ``make check-perf TESTSUITEFLAGS="<test number>"``)
>
> It's not very clear where the user would find the test results:
>
> tests/perf-testsuite.dir/results
It is mentioned further below but I will move it up to here to make it
clearer.
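For what it's worth, the results file is plain text; the PERF_RECORD_*
macros later in the patch append one entry per test. A standalone sketch of
the format they produce (the description line and metric values here are
made up, not real measurements):

```shell
# Standalone sketch of the entry format the PERF_RECORD_START /
# PERF_RECORD_RESULT macros append to the results file.
# The test description and metric values below are hypothetical.
results=$(mktemp)
echo >> "$results"
echo "1: ovn-northd basic scale test" >> "$results"
echo " ---" >> "$results"
echo "  Maximum (NB): 1234.56" >> "$results"
echo "  Average (NB): 789.01" >> "$results"
cat "$results"
```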
>
>> +3. Modify OVN code to implement the change that you believe will improve the
>> + performance.
>> +4. Go to Step 2. to continue making improvements.
>> +
>> +If, as a developer, you modify a performance test in a way that may change one
>> +of these cached objects, be sure to rebuild the test.
>> +
>> +The results of each test run are displayed on the screen at the end of the test
>> +run but are also saved in the file ``tests/perf-testsuite.dir/results``. The
>> +cached objects are stored under the relevant folder in
>> +``tests/perf-testsuite.dir/cached``.
>> diff --git a/tests/.gitignore b/tests/.gitignore
>> index 8479f9bb0f8f..65cb1c6e4fad 100644
>> --- a/tests/.gitignore
>> +++ b/tests/.gitignore
>> @@ -22,6 +22,9 @@
>> /system-offloads-testsuite
>> /system-offloads-testsuite.dir/
>> /system-offloads-testsuite.log
>> +/perf-testsuite
>> +/perf-testsuite.dir/
>> +/perf-testsuite.log
>> /test-aes128
>> /test-atomic
>> /test-bundle
>> diff --git a/tests/automake.mk b/tests/automake.mk
>> index 742e5cff28cc..77ed70cccdb9 100644
>> --- a/tests/automake.mk
>> +++ b/tests/automake.mk
>> @@ -4,9 +4,11 @@ EXTRA_DIST += \
>> $(SYSTEM_TESTSUITE_AT) \
>> $(SYSTEM_KMOD_TESTSUITE_AT) \
>> $(SYSTEM_USERSPACE_TESTSUITE_AT) \
>> + $(PERF_TESTSUITE_AT) \
>> $(TESTSUITE) \
>> $(SYSTEM_KMOD_TESTSUITE) \
>> $(SYSTEM_USERSPACE_TESTSUITE) \
>> + $(PERF_TESTSUITE) \
>> tests/atlocal.in \
>> $(srcdir)/package.m4 \
>> $(srcdir)/tests/testsuite \
>> @@ -52,6 +54,10 @@ SYSTEM_TESTSUITE_AT = \
>> tests/system-ovn.at \
>> tests/system-ovn-kmod.at
>>
>> +PERF_TESTSUITE_AT = \
>> + tests/perf-testsuite.at \
>> + tests/perf-northd.at
>> +
>> check_SCRIPTS += tests/atlocal
>>
>> TESTSUITE = $(srcdir)/tests/testsuite
>> @@ -59,6 +65,9 @@ TESTSUITE_PATCH = $(srcdir)/tests/testsuite.patch
>> TESTSUITE_DIR = $(abs_top_builddir)/tests/testsuite.dir
>> SYSTEM_KMOD_TESTSUITE = $(srcdir)/tests/system-kmod-testsuite
>> SYSTEM_USERSPACE_TESTSUITE = $(srcdir)/tests/system-userspace-testsuite
>> +PERF_TESTSUITE = $(srcdir)/tests/perf-testsuite
>> +PERF_TESTSUITE_DIR = $(abs_top_builddir)/tests/perf-testsuite.dir
>> +PERF_TESTSUITE_RESULTS = $(PERF_TESTSUITE_DIR)/results
>> DISTCLEANFILES += tests/atconfig tests/atlocal
>>
>> AUTOTEST_PATH = $(ovs_builddir)/utilities:$(ovs_builddir)/vswitchd:$(ovs_builddir)/ovsdb:$(ovs_builddir)/vtep:tests:$(PTHREAD_WIN32_DIR_DLL):$(SSL_DIR):controller-vtep:northd:utilities:controller:ic
>> @@ -171,6 +180,20 @@ check-system-userspace: all
>>
>> clean-local:
>> test ! -f '$(TESTSUITE)' || $(SHELL) '$(TESTSUITE)' -C tests --clean
>> +
>> +check-perf: all
>> + @mkdir -p $(PERF_TESTSUITE_DIR)
>> + @echo > $(PERF_TESTSUITE_RESULTS)
>> + set $(SHELL) '$(PERF_TESTSUITE)' -C tests AUTOTEST_PATH='$(AUTOTEST_PATH)'; \
>> + $(SUDO) "$$@" $(TESTSUITEFLAGS) -j1 || (test X'$(RECHECK)' = Xyes && $(SUDO) "$$@" --recheck)
>
> We don't really need sudo.
>
Oops, I must have copied this stanza from the system tests :)
>> + @echo
>> + @echo '## -------------------- ##'
>> + @echo '## Performance Results ##'
>> + @echo '## -------------------- ##'
>> + @cat $(PERF_TESTSUITE_RESULTS)
>> + @echo
>> + @echo "Results can be found in $(PERF_TESTSUITE_RESULTS)"
>> +
>>
>> AUTOTEST = $(AUTOM4TE) --language=autotest
>>
>> @@ -193,6 +216,10 @@ $(SYSTEM_USERSPACE_TESTSUITE): package.m4 $(SYSTEM_TESTSUITE_AT) $(SYSTEM_USERSP
>> $(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at
>> $(AM_V_at)mv $@.tmp $@
>>
>> +$(PERF_TESTSUITE): package.m4 $(PERF_TESTSUITE_AT) $(COMMON_MACROS_AT)
>> + $(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at
>> + $(AM_V_at)mv $@.tmp $@
>> +
>> # The `:;' works around a Bash 3.2 bug when the output is not writeable.
>> $(srcdir)/package.m4: $(top_srcdir)/configure.ac
>> $(AM_V_GEN):;{ \
>> diff --git a/tests/perf-northd.at b/tests/perf-northd.at
>> new file mode 100644
>> index 000000000000..eb0c391c4650
>> --- /dev/null
>> +++ b/tests/perf-northd.at
>> @@ -0,0 +1,207 @@
>> +AT_BANNER([ovn-northd performance tests])
>> +
>> +# CACHE_NBDB()
>> +#
>> +# Save the current northbound database for future test runs.
>> +#
>> +m4_define([CACHE_NBDB],[
>> + mkdir -p ${at_suite_dir}/cached/${at_group}
>> + cp -f ${ovs_base}/ovn-nb/ovn-nb.db ${at_suite_dir}/cached/${at_group}/
>> +])
>> +
>> +# BUILD_NBDB([COMMANDS])
>> +#
>> +# Configure the northbound database using COMMANDS.
>> +#
>> +# BUILD_NBDB() will check if there is a cached northbound database present
>> +# for this test group. If present, it will use the cached version. If the
>> +# testsuite was run using the '--rebuild' flag, it will force a rebuild of the
>> +# northbound database.
>> +#
>> +m4_define([BUILD_NBDB],[
>> + if [[ ! -f ${at_suite_dir}/cached/${at_group}/ovn-nb.db ]] || [[ $at_arg_rebuild != false ]]; then
>> + echo "Rebuild NBDB"
>> + $1
>> + CACHE_NBDB()
>> + else
>> + echo "Using cached NBDB"
>> +    rm -f ${at_suite_dir}/cached/${at_group}/.ovn-nb.db.~lock~
>> + ovs-appctl -t ovn-nb/ovsdb-server ovsdb-server/remove-db OVN_Northbound
>> + ovs-appctl -t ovn-nb/ovsdb-server ovsdb-server/add-db ${at_suite_dir}/cached/${at_group}/ovn-nb.db
>> + fi
>> + ovn-nbctl --wait=sb sync
>> +])
>> +
>> +# PERF_RECORD_START()
>> +#
>> +# Append standard banner to performance results.
>> +#
>> +m4_define([PERF_RECORD_START],[
>> + echo >> ${at_suite_dir}/results
>> + echo "$at_desc_line" >> ${at_suite_dir}/results
>> + echo " ---" >> ${at_suite_dir}/results
>> +])
>> +
>> +# PERF_RECORD_RESULT([KEY], [VALUE])
>> +#
>> +# Append KEY and VALUE to performance results.
>> +#
>> +m4_define([PERF_RECORD_RESULT],[
>> + echo " $1: $2" >> ${at_suite_dir}/results
>> +])
>> +
>> +m4_define([PARSE_STOPWATCH], [
>> + grep -A 6 $1 | grep $2 | sed 's/[[^0-9.]]*//g'
>> +])
>> +
>> +# PERF_RECORD_STOPWATCH([NAME], [METRIC], [KEY])
>> +#
>> +# Append the value of the OVN stopwatch metric METRIC from stopwatch NAME
>> +# to performance results under KEY.
>> +#
>> +m4_define([PERF_RECORD_STOPWATCH], [
>> + PERF_RECORD_RESULT($3, [`ovn-appctl -t northd/NORTHD_TYPE stopwatch/show | PARSE_STOPWATCH($1, $2)`])
>> +])
>> +
>> +# PERF_RECORD_STOP()
>> +#
>> +# Append a number of metrics to performance results
>> +#
>> +m4_define([PERF_RECORD_STOP], [
>> + PERF_RECORD_STOPWATCH(ovnnb_db_run, ["Maximum"], [Maximum (NB)])
>> + PERF_RECORD_STOPWATCH(ovnnb_db_run, ["Short term average"], [Average (NB)])
>> +])
>> +
>> +# OVN_NBCTL([NBCTL_COMMAND])
>> +#
>> +# Add NBCTL_COMMAND to list of commands to be run by RUN_OVN_NBCTL().
>> +#
>> +m4_define([OVN_NBCTL], [
>> + command="${command} -- $1"
>> +])
>> +
>> +# RUN_OVN_NBCTL()
>> +#
>> +# Execute list of commands built by the OVN_NBCTL() macro.
>> +#
>> +m4_define([RUN_OVN_NBCTL], [
>> + check ovn-nbctl ${command}
>> + unset command
>> +])
>> +
>> +OVS_START_SHELL_HELPERS
>> +generate_subnet () {
>> + local a=$(printf %d $(expr $1 / 256 + 10))
>> + local b=$(printf %d $(expr $1 % 256))
>> + echo $a.$b.0.0/16
>> +}
>> +generate_ip () {
>> + local a=$(printf %d $(expr $1 / 256 + 10))
>> + local b=$(printf %d $(expr $1 % 256))
>> + local c=$(printf %d $(expr $2 / 256))
>> + local d=$(printf %d $(expr $2 % 256))
>> + echo $a.$b.$c.$d
>> +}
>> +generate_router_ip () {
>> + local a=$(printf %d $(expr $1 / 256 + 10))
>> + local b=$(printf %d $(expr $1 % 256))
>> + echo $a.$b.255.254
>> +}
>> +generate_mac () {
>> + local a=$(printf %02x $(expr $1 / 256))
>> + local b=$(printf %02x $(expr $1 % 256))
>> + local c=$(printf %02x $(expr $2 / 256))
>> + local d=$(printf %02x $(expr $2 % 256))
>> + echo f0:00:$a:$b:$c:$d
>> +}
>> +OVS_END_SHELL_HELPERS
>
> The file is called perf-northd.at so if we ever decide to add some tests
> to measure end-to-end performance including ovn-controller I assume
> we'll need a new file, e.g., perf-controller.at. Shouldn't we already
> move all of the generic macros above to a shared file; maybe ovn-macros.at?
>
This is exactly what I was thinking, but I didn't want to do it yet
because a) I am not sure how much it will be used and b) I am unsure what
will be common yet.
As such, I would prefer to hold off, but I am happy to do it now if you
think there is benefit.
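As a side note, the generate_* address helpers quoted above are
self-contained, so they can be copied out of the harness and exercised
directly for a quick sanity check (a sketch reusing the patch's own
definitions verbatim):

```shell
# The generate_subnet/generate_mac helpers from the patch, copied
# verbatim so their output can be checked outside the testsuite.
generate_subnet () {
    local a=$(printf %d $(expr $1 / 256 + 10))
    local b=$(printf %d $(expr $1 % 256))
    echo $a.$b.0.0/16
}
generate_mac () {
    local a=$(printf %02x $(expr $1 / 256))
    local b=$(printf %02x $(expr $1 % 256))
    local c=$(printf %02x $(expr $2 / 256))
    local d=$(printf %02x $(expr $2 % 256))
    echo f0:00:$a:$b:$c:$d
}
generate_subnet 1     # hypervisor 1 gets the 10.1.0.0/16 subnet
generate_mac 1 300    # prints f0:00:00:01:01:2c
```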
>> +
>> +# OVN_BASIC_SCALE_CONFIG(HYPERVISORS, PORTS)
>> +#
>> +# Configures OVN with HYPERVISORS x logical switches. Each logical
>> +# switch has PORTS x logical ports and is connected to one
>> +# Logical Router. The logical router hosts an snat entry for the subnet hosted
>> +# on each logical switch. This is illustrated below.
>> +#
>> +# .
>> +# .
>> +# .
>> +# .
>> +# .
>> +# . Logical Switch ───┐
>> +# │ │
>> +# Logical Port ───────────┤ │
>> +# │ │
>> +# Logical Port ───────────┼── Logical Switch ───┤
>> +# │ │
>> +# Logical Port ───────────┤ ├──── Logical Router
>> +# │ │ (snat entries)
>> +# . Logical Switch ───┤
>> +# . │
>> +# . . │
>> +# . │
>> +# .
>> +#
>> +m4_define([OVN_BASIC_SCALE_CONFIG], [
>> + # In order to speed up building the NBDB, we run `ovn-nbctl` in daemon mode.
>> + # We also coalesce `ovn-nbctl` commands using the OVN_NBCTL() and
>> + # RUN_OVN_NBCTL() macros
>> + #
>> + on_exit 'kill $(cat ovn-nbctl.pid)'
>> + export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
>> +
>> + for hv in $(seq 1 $1); do
>> + echo "Adding hypervisor ${hv}"
>> + total_hv=$1
>> + logical_switch=lsw${hv}
>> + logical_router=lrw${hv}
>> + logical_router_port=lr${hv}lrp0
>> + logical_switch_router_port=lr${hv}lsp0
>> + logical_router_ip=$(generate_router_ip ${hv})
>> + logical_router_nat_ip=$(generate_ip $((hv+total_hv)) 1)
>> + logical_router_mac=$(generate_mac ${hv} 1)
>> + logical_switch_subnet=$(generate_subnet ${hv})
>> +
>> + OVN_NBCTL(ls-add $logical_switch)
>> + OVN_NBCTL(set Logical_Switch $logical_switch other_config:subnet=${logical_switch_subnet})
>> + OVN_NBCTL(set Logical_Switch $logical_switch other_config:exclude_ips=$logical_router_ip)
>> +
>> + OVN_NBCTL(lr-add $logical_router)
>> + OVN_NBCTL(lrp-add $logical_router $logical_router_port $logical_router_mac ${logical_router_ip}/16)
>> + OVN_NBCTL(lr-nat-add $logical_router snat $logical_router_nat_ip $logical_switch_subnet)
>> + OVN_NBCTL(lsp-add $logical_switch $logical_switch_router_port -- set Logical_Switch_Port $logical_switch_router_port type=router options:router-port=$logical_router_port addresses=dynamic)
>> +
>> + for port in $(seq 1 $2); do
>> + logical_switch_port=lsw${hv}lsp${port}
>> + OVN_NBCTL(lsp-add $logical_switch $logical_switch_port)
>> + OVN_NBCTL(lsp-set-addresses $logical_switch_port dynamic)
>> + done
>> +
>> + RUN_OVN_NBCTL()
>> +
>> + done
>> +])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor])
>> +PERF_RECORD_START()
>> +
>> +ovn_start
>
> This starts ovn-northd with debug log level for jsonrpc. We're
> benchmarking performance so we probably need to use a less verbose log
> level, maybe info.
Based on your later reply to this thread, I won't change this.
>
> Regards,
> Dumitru
>