[ovs-dev] Flaky ALB tests.

Kevin Traynor ktraynor at redhat.com
Fri Nov 19 12:34:35 UTC 2021


On 19/11/2021 12:25, Ilya Maximets wrote:
> Hi, Kevin.
> 

Hi Ilya,

> After the recent ALB commits, some of these tests are failing more
> or less frequently in our CI and on my local setup too.  Could you,
> please, take a look?
> 

Sure, I'll take a look. Thanks for letting me know, I hadn't seen it.

Kevin.

> Below is one of the recent examples.  Another test that fails with
> similar symptoms periodically is 'ALB - min num PMD/RxQ'.
> It looks like it doesn't actually perform the dry run.  At least it
> doesn't log that.
> 
> Bets regards, Ilya Maximets.
> 
> 
> #                             -*- compilation -*-
> 1026. alb.at:98: testing ALB - PMD/RxQ assignment type ...
> ../../../tests/alb.at:99: ovsdb-tool create conf.db $abs_top_srcdir/vswitchd/vswitch.ovsschema
> ../../../tests/alb.at:99: ovsdb-server --detach --no-chdir --pidfile --log-file --remote=punix:$OVS_RUNDIR/db.sock
> stderr:
> 2021-11-19T02:22:41Z|00001|vlog|INFO|opened log file /home/runner/work/ovs/ovs/openvswitch-2.16.90/_build/sub/tests/testsuite.dir/1026/ovsdb-server.log
> ../../../tests/alb.at:99: sed < stderr '
> /vlog|INFO|opened log file/d
> /ovsdb_server|INFO|ovsdb-server (Open vSwitch)/d'
> ../../../tests/alb.at:99: ovs-vsctl --no-wait init
> ../../../tests/alb.at:99: ovs-vswitchd --enable-dummy --disable-system --disable-system-route --dummy-numa="0,0" --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl
> stderr:
> 2021-11-19T02:22:41Z|00001|vlog|INFO|opened log file /home/runner/work/ovs/ovs/openvswitch-2.16.90/_build/sub/tests/testsuite.dir/1026/ovs-vswitchd.log
> 2021-11-19T02:22:41Z|00002|ovs_numa|INFO|Discovered 2 CPU cores on NUMA node 0
> 2021-11-19T02:22:41Z|00003|ovs_numa|INFO|Discovered 1 NUMA nodes and 2 CPU cores
> 2021-11-19T02:22:41Z|00004|reconnect|INFO|unix:/home/runner/work/ovs/ovs/openvswitch-2.16.90/_build/sub/tests/testsuite.dir/1026/db.sock: connecting...
> 2021-11-19T02:22:41Z|00005|netlink_socket|INFO|netlink: could not enable listening to all nsid (Operation not permitted)
> 2021-11-19T02:22:41Z|00006|reconnect|INFO|unix:/home/runner/work/ovs/ovs/openvswitch-2.16.90/_build/sub/tests/testsuite.dir/1026/db.sock: connected
> ../../../tests/alb.at:99: sed < stderr '
> /ovs_numa|INFO|Discovered /d
> /vlog|INFO|opened log file/d
> /vswitchd|INFO|ovs-vswitchd (Open vSwitch)/d
> /reconnect|INFO|/d
> /dpif_netlink|INFO|Generic Netlink family .ovs_datapath. does not exist/d
> /ofproto|INFO|using datapath ID/d
> /netdev_linux|INFO|.*device has unknown hardware address family/d
> /ofproto|INFO|datapath ID changed to fedcba9876543210/d
> /dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable/d
> /netlink_socket|INFO|netlink: could not enable listening to all nsid/d
> /probe tc:/d
> /tc: Using policy/d'
> ../../../tests/alb.at:99: add_of_br 0 add-port br0 p0 \
>                      -- set Interface p0 type=dummy-pmd options:n_rxq=3 \
>                      -- set Open_vSwitch . other_config:pmd-cpu-mask=3 \
>                      -- set open_vswitch . other_config:pmd-auto-lb="true" \
>                      -- set open_vswitch . other_config:pmd-auto-lb-load-threshold=0
> alb.at:105: waiting until grep "PMD auto load balance is enabled" ovs-vswitchd.log...
> 2021-11-19T02:22:41.880Z|00039|dpif_netdev|INFO|PMD auto load balance is enabled, interval 1 mins, pmd load threshold 0%, improvement threshold 25%.
> alb.at:105: wait succeeded immediately
> ../../../tests/alb.at:109: ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=roundrobin
> alb.at:110: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'roundrobin'"...
> 2021-11-19T02:22:41.942Z|00051|dpif_netdev|INFO|Rxq to PMD assignment mode changed to: 'roundrobin'.
> alb.at:110: wait succeeded immediately
> ../../../tests/alb.at:113: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="false"
> alb.at:114: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance is disabled"...
> 2021-11-19T02:22:42.026Z|00056|dpif_netdev|INFO|PMD auto load balance is disabled.
> alb.at:114: wait succeeded immediately
> ../../../tests/alb.at:116: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> alb.at:117: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance is enabled"...
> 2021-11-19T02:22:42.060Z|00057|dpif_netdev|INFO|PMD auto load balance is enabled, interval 1 mins, pmd load threshold 0%, improvement threshold 25%.
> alb.at:117: wait succeeded immediately
> ../../../tests/alb.at:121: ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=cycles
> alb.at:122: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'cycles'"...
> 2021-11-19T02:22:42.122Z|00058|dpif_netdev|INFO|Rxq to PMD assignment mode changed to: 'cycles'.
> alb.at:122: wait succeeded immediately
> ../../../tests/alb.at:125: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="false"
> alb.at:126: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance is disabled"...
> 2021-11-19T02:22:42.175Z|00063|dpif_netdev|INFO|PMD auto load balance is disabled.
> alb.at:126: wait succeeded immediately
> ../../../tests/alb.at:128: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> alb.at:129: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance is enabled"...
> 2021-11-19T02:22:42.230Z|00064|dpif_netdev|INFO|PMD auto load balance is enabled, interval 1 mins, pmd load threshold 0%, improvement threshold 25%.
> alb.at:129: wait succeeded immediately
> ../../../tests/alb.at:132: ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group
> alb.at:133: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'group'"...
> 2021-11-19T02:22:42.279Z|00065|dpif_netdev|INFO|Rxq to PMD assignment mode changed to: 'group'.
> alb.at:133: wait succeeded immediately
> ../../../tests/alb.at:136: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="false"
> alb.at:137: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance is disabled"...
> 2021-11-19T02:22:42.305Z|00070|dpif_netdev|INFO|PMD auto load balance is disabled.
> alb.at:137: wait succeeded immediately
> ../../../tests/alb.at:139: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> alb.at:140: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance is enabled"...
> 2021-11-19T02:22:42.330Z|00071|dpif_netdev|INFO|PMD auto load balance is enabled, interval 1 mins, pmd load threshold 0%, improvement threshold 25%.
> alb.at:140: wait succeeded immediately
> ../../../tests/alb.at:143: ovs-appctl vlog/set dpif_netdev:dbg
> ../../../tests/alb.at:146: ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group
> warped
> alb.at:149: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance performing dry run."...
> 2021-11-19T02:22:42.434Z|00078|dpif_netdev|DBG|PMD auto load balance performing dry run.
> alb.at:149: wait succeeded quickly
> ../../../tests/alb.at:152: ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=cycles
> alb.at:153: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'cycles'"...
> 2021-11-19T02:22:42.570Z|00085|dpif_netdev|INFO|Rxq to PMD assignment mode changed to: 'cycles'.
> alb.at:153: wait succeeded immediately
> warped
> alb.at:155: waiting until tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance performing dry run."...
> alb.at:155: wait failed after 30 seconds
> ../../../tests/ovs-macros.at:242: hard failure
> 1026. alb.at:98: 1026. ALB - PMD/RxQ assignment type (alb.at:98): FAILED (ovs-macros.at:242)
> 
> 
> 
> Last bits of the ovs-vswitchd.log:
> 
> 2021-11-19T02:22:42.434Z|00078|dpif_netdev|DBG|PMD auto load balance performing dry run.
> 2021-11-19T02:22:42.434Z|00079|dpif_netdev|DBG|Core  0 on numa node 0 assigned port 'p0' rx queue 0 (measured processing cycles 0).
> 2021-11-19T02:22:42.434Z|00080|dpif_netdev|DBG|Core  1 on numa node 0 assigned port 'p0' rx queue 1 (measured processing cycles 0).
> 2021-11-19T02:22:42.434Z|00081|dpif_netdev|DBG|Core  0 on numa node 0 assigned port 'p0' rx queue 2 (measured processing cycles 0).
> 2021-11-19T02:22:42.434Z|00082|dpif_netdev|DBG|Current variance 0 Estimated variance 0.
> 2021-11-19T02:22:42.434Z|00083|dpif_netdev|DBG|Variance improvement 0%.
> 2021-11-19T02:22:42.434Z|00084|dpif_netdev|DBG|PMD load variance improvement threshold 25% is not met.
> 2021-11-19T02:22:42.570Z|00085|dpif_netdev|INFO|Rxq to PMD assignment mode changed to: 'cycles'.
> 2021-11-19T02:22:42.570Z|00086|dpif_netdev|INFO|Performing pmd to rx queue assignment using cycles algorithm.
> 2021-11-19T02:22:42.570Z|00087|dpif_netdev|INFO|Core  0 on numa node 0 assigned port 'p0' rx queue 0 (measured processing cycles 0).
> 2021-11-19T02:22:42.570Z|00088|dpif_netdev|INFO|Core  1 on numa node 0 assigned port 'p0' rx queue 1 (measured processing cycles 0).
> 2021-11-19T02:22:42.570Z|00089|dpif_netdev|INFO|Core  1 on numa node 0 assigned port 'p0' rx queue 2 (measured processing cycles 0).
> 2021-11-19T02:22:42.581Z|00001|dpif_netdev(pmd-c00/id:6)|DBG|Core 0 processing port 'p0' with queue-id 0
> 2021-11-19T02:22:42.584Z|00001|dpif_netdev(pmd-c01/id:7)|DBG|Core 1 processing port 'p0' with queue-id 2
> 2021-11-19T02:22:42.584Z|00002|dpif_netdev(pmd-c01/id:7)|DBG|Core 1 processing port 'p0' with queue-id 1
> 2021-11-19T02:22:42.598Z|00090|unixctl|DBG|received request time/warp["60000","10000"], id=0
> 2021-11-19T02:22:42.651Z|00091|unixctl|DBG|replying with success, id=0: "warped"
> 2021-11-19T02:23:13.232Z|00092|unixctl|DBG|received request exit["--cleanup"], id=0
> 2021-11-19T02:23:13.232Z|00093|unixctl|DBG|replying with success, id=0: ""
> 2021-11-19T02:23:13.233Z|00094|bridge|INFO|bridge br0: deleted interface br0 on port 65534
> 2021-11-19T02:23:13.236Z|00095|bridge|INFO|bridge br0: deleted interface p0 on port 1
> 2021-11-19T02:23:13.244Z|00003|dpif_netdev(pmd-c01/id:7)|DBG|Core 1 processing port 'p0' with queue-id 2
> 2021-11-19T02:23:13.244Z|00004|dpif_netdev(pmd-c01/id:7)|DBG|Core 1 processing port 'p0' with queue-id 1
> 2021-11-19T02:23:13.253Z|00002|dpif_netdev(pmd-c00/id:6)|DBG|Core 0 processing port 'p0' with queue-id 0
> 2021-11-19T02:23:13.256Z|00096|dpif_netdev|INFO|Performing pmd to rx queue assignment using cycles algorithm.
> 2021-11-19T02:23:13.256Z|00097|dpif_netdev|INFO|Core  0 on numa node 0 assigned port 'p0' rx queue 0 (measured processing cycles 0).
> 2021-11-19T02:23:13.256Z|00098|dpif_netdev|INFO|Core  1 on numa node 0 assigned port 'p0' rx queue 1 (measured processing cycles 0).
> 2021-11-19T02:23:13.256Z|00099|dpif_netdev|INFO|Core  1 on numa node 0 assigned port 'p0' rx queue 2 (measured processing cycles 0).
> 2021-11-19T02:23:13.257Z|00100|dpif_netdev|INFO|PMD thread on numa_id: 0, core id:  1 destroyed.
> 2021-11-19T02:23:13.258Z|00101|dpif_netdev|INFO|PMD thread on numa_id: 0, core id:  0 destroyed.
> 



More information about the dev mailing list