[ovs-discuss] ovs performance on 'worst case scenario' with ovs-vswitchd up to 100%

Jesse Gross jesse at nicira.com
Wed Mar 20 01:40:01 UTC 2013


On Mon, Mar 18, 2013 at 8:50 PM, George Shuklin
<george.shuklin at gmail.com> wrote:
> Good day.
>
> I did some research on the performance of a few software bridges, and
> here are some thoughts.
>
> Hyper-V and ESXi contain software bridges that work fine at speeds up to
> 10G regardless of traffic type.
> Native Linux brtools has no issues with 'bad flows', but is stuck at
> 50k pps per CPU core.
> OVS shows excellent performance on low-flow, high-volume traffic like
> 'iperf' or a few TCP flows. I saw at least 9.1 Gbit/s, and it was limited
> by Xen's netback, not by OVS.
>
> But when the traffic changes its shape from 'a few huge flows' to 'many
> small flows', the situation changes drastically.
>
> The following line (mangled slightly against script kiddies):
>
> hping3 -i 500 -s 2 -p ++1 --udp target
>
> causes about 80% CPU utilization by ovs-vswitchd on the host receiving
> the traffic.
>
> Reducing the interval between packets to lower values causes a complete
> denial of service at an overall traffic volume of just about 15-20 Mb/s.
>
> Checked with OVS 1.0, 1.4.3, and the latest 1.9: same behavior (and 1.0
> adds some memory leaks).
>
> I did some research and found the reason for this behavior: when a new
> flow comes in, the kernel forwards it to ovs-vswitchd, which slowly
> analyzes it and sends data back to the kernel as 'kernel flows'.
>
> I thought it should be a problem only for the 'normal' rule, but the
> following set of rules actually causes the same effect:
>
> ovs-ofctl add-flow xenbr0 'arp action=normal priority=30'
> ovs-ofctl add-flow xenbr0 'action=drop priority=20'
>
> As you can see, these rules say 'drop all traffic except ARP'. And even
> with these rules the 'hping' line causes huge CPU load on ovs-vswitchd.
>
> I checked it with ovs-dpctl dump-flows xenbr0 and got the following lines:
>
> in_port(1),eth(src=88:e0:f3:b6:47:f0,dst=56:de:7d:66:ad:28),eth_type(0x0800),ipv4(src=offender,dst=victim,proto=17,tos=0,ttl=63,frag=no),udp(src=54882,dst=54882),
> packets:0, bytes:0, used:never, actions:drop
> in_port(1),eth(src=88:e0:f3:b6:47:f0,dst=56:de:7d:66:ad:28),eth_type(0x0800),ipv4(src=offender,dst=victim,proto=17,tos=0,ttl=63,frag=no),udp(src=48424,dst=48424),
> packets:0, bytes:0, used:never, actions:drop
>
> ...
> thousands of them.
> I clearly see the kernel asking ovs-vswitchd about every packet (because
> -p ++1 and -s 2 cause the source and destination ports to change on every
> UDP packet).
> And ovs-vswitchd replies to every 'sample' packet with exact rules that
> match on a very deep set of headers, down to L4 (source and destination
> ports).
>
> Why can't ovs-vswitchd feed the kernel a very simple 'drop all' 'kernel
> flow' after some smart and exact rules for ARP?
>
> Why is UDP parsed so deeply?
>
> Is there any way to feed the kernel module fast generic rules without
> passing every new flow to userspace?

Directly moving classification logic into the kernel as you describe
is either not very general purpose (i.e. having special logic for
"drop everything except ARP") or not very performant/good design (it
ends up moving much more logic into the fast path).  There is ongoing
work to improve performance in this area but it is a non-trivial
problem.
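
To make the behavior described in the report concrete, here is a toy Python
model of an exact-match flow cache. It is only a sketch and not OVS code: the
function names, the packet dictionary layout, and the "coarse" key are
hypothetical and exist purely for illustration. It counts how many packets
miss the cache (and so would need a trip to userspace) when every packet
carries a new pair of UDP ports, first with a key built from all parsed
headers and then with a key built only from the fields the two OpenFlow rules
above actually look at.

```python
# Toy model only: NOT Open vSwitch code. All names here are hypothetical.

def exact_key(pkt):
    """Cache key built from every parsed header field, including L4 ports."""
    return (pkt["in_port"], pkt["eth_type"], pkt["ip_src"], pkt["ip_dst"],
            pkt["proto"], pkt["sport"], pkt["dport"])


def coarse_key(pkt):
    """Hypothetical key using only the fields needed to tell ARP from non-ARP."""
    return (pkt["in_port"], pkt["eth_type"])


def count_upcalls(packets, key_fn):
    """Return how many packets miss the cache and need a userspace decision."""
    cache = {}
    upcalls = 0
    for pkt in packets:
        key = key_fn(pkt)
        if key not in cache:
            upcalls += 1           # cache miss: ask "userspace"
            cache[key] = "drop"    # install the decision for this key
    return upcalls


def changing_port_packets(n, base_sport=2, base_dport=1):
    """Generate n UDP packets whose source and destination ports both change."""
    for i in range(n):
        yield {
            "in_port": 1,
            "eth_type": 0x0800,
            "ip_src": "offender",
            "ip_dst": "victim",
            "proto": 17,
            "sport": (base_sport + i) % 65536,
            "dport": (base_dport + i) % 65536,
        }


if __name__ == "__main__":
    pkts = list(changing_port_packets(10000))
    print("exact-match upcalls:", count_upcalls(pkts, exact_key))
    print("coarse-key upcalls: ", count_upcalls(pkts, coarse_key))
```

In this model the exact key misses on every packet, while the coarse key
misses once. The hard part, which the sketch ignores, is working out in
general which fields can safely be left out of the key for an arbitrary
flow table.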


