[ovs-dev] ovs-vswitchd 2.4.1 scale >10K add/delete flows 100% cpu

Ravi Kerur rkerur at gmail.com
Thu Jul 5 20:44:33 UTC 2018


Hi,

While adding and deleting flows at scale (>10K flows), I am seeing
ovs-vswitchd CPU usage spike to 100% and stay there, with no sign of
returning to normal. This is plain OVS 2.4.1, with no DPDK involved. I am
trying to get 'perf' working, which might help isolate the problem. In the
meantime, I would like to understand the following:

(1) Recommended system configuration, i.e. core allocation, memory,
hugepages, ...
(2) Published scale numbers for 2.4.1
(3) Known performance issues with 2.4.1
(4) Debug information to collect

ovs-ofctl --version
ovs-ofctl (Open vSwitch) 2.4.1
Compiled May 25 2016 13:31:48
OpenFlow versions 0x1:0x4

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
56446 root      20   0 45.894g 0.031t  12084 S 814.7 12.6 788147:01 qemu-system-x86
28509 root      10 -10 2215740 310668   7304 R 100.2  0.1  10502:22 ovs-vswitchd
55579 root      20   0 45.688g 0.029t  12508 S  15.8 11.8 511176:21 qemu-system-x86

I enabled debug logging for poll_loop:
ovs-appctl vlog/set poll_loop:DBG

and the logs show:

2018-07-04T14:27:20.307Z|00474|connmgr|INFO|vn-vn9014<->unix: 17000 flow_mods in the last 56 s (16612 adds, 388 deletes)
2018-07-04T14:27:24.213Z|00475|connmgr|INFO|vn-vn9046<->unix: 17188 flow_mods in the last 58 s (16610 adds, 578 deletes)
2018-07-04T14:28:20.306Z|00476|connmgr|INFO|vn-vn9014<->unix: 2786 flow_mods in the last 59 s (2786 adds)
2018-07-04T14:28:24.214Z|00477|connmgr|INFO|vn-vn9046<->unix: 2790 flow_mods in the last 59 s (2790 adds)
2018-07-04T14:28:30.002Z|00478|poll_loop|INFO|Dropped 48 log messages in last 151 seconds (most recently, 149 seconds ago) due to excessive rate
2018-07-04T14:28:30.002Z|00479|poll_loop|INFO|wakeup due to [POLLIN] on fd 446 (FIFO pipe:[2524722865]) at ofproto/ofproto-dpif.c:1574 (58% CPU usage)
2018-07-04T14:28:30.002Z|00480|poll_loop|INFO|wakeup due to [POLLIN] on fd 453 (/var/run/openvswitch/vn-vn9014.mgmt<->) at lib/stream-fd.c:155 (58% CPU usage)
2018-07-04T14:28:30.002Z|00481|poll_loop|INFO|wakeup due to [POLLIN] on fd 453 (/var/run/openvswitch/vn-vn9014.mgmt<->) at lib/stream-fd.c:155 (58% CPU usage)
2018-07-04T14:28:30.002Z|00482|poll_loop|INFO|wakeup due to 0-ms timeout at ofproto/ofproto-dpif.c:1571 (58% CPU usage)
2018-07-04T14:28:30.002Z|00483|poll_loop|INFO|wakeup due to [POLLIN] on fd 446 (FIFO pipe:[2524722865]) at lib/ovs-rcu.c:206 (58% CPU usage)
2018-07-04T14:28:30.003Z|00484|poll_loop|INFO|wakeup due to 0-ms timeout at lib/ovs-rcu.c:206 (58% CPU usage)
2018-07-04T14:28:30.003Z|00485|poll_loop|INFO|wakeup due to [POLLIN] on fd 446 (FIFO pipe:[2524722865]) at lib/ovs-rcu.c:206 (58% CPU usage)
...

Eventually CPU usage reaches 100% and stays there.
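
To summarize logs like the ones above, a rough Python sketch follows. It is
not part of OVS: the names unwrap() and summarize() and both regular
expressions are my own assumptions, based only on the message formats visible
in this excerpt. It computes the flow_mod rate per connmgr record and counts
poll_loop wakeups by source location, and it tolerates records that a mail
client has wrapped across lines or decorated with * emphasis.

```python
import re
from collections import Counter

# Assumed formats, taken only from the log excerpt in this mail:
#   "... 17000 flow_mods in the last 56 s (...)"
#   "... wakeup due to <reason> at <file>.c:<line> (NN% CPU usage)"
FLOW_MODS = re.compile(r"(\d+) flow_mods in the last (\d+) s")
WAKEUP = re.compile(r"wakeup due to .*? at ([\w./-]+\.c:\d+)")
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T")


def unwrap(text):
    """Join wrapped continuation lines back onto their timestamped record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(text):
    """Return (flow_mods per second for each connmgr record,
    Counter of poll_loop wakeups keyed by source location)."""
    rates = []
    wakeups = Counter()
    for rec in unwrap(text):
        rec = rec.replace("*", "")  # drop emphasis markers, if any
        m = FLOW_MODS.search(rec)
        if m and int(m.group(2)) > 0:
            rates.append(int(m.group(1)) / int(m.group(2)))
        m = WAKEUP.search(rec)
        if m:
            wakeups[m.group(1)] += 1
    return rates, wakeups
```

Passing the contents of the ovs-vswitchd log to summarize() and printing
wakeups.most_common() shows which call sites wake the main loop most often.
For the first connmgr record above, the rate works out to 17000 / 56, or
roughly 304 flow_mods per second.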

Thanks.

