[ovs-discuss] ovn-controller consuming lots of CPU

Russell Bryant russell at ovn.org
Wed Dec 20 13:39:07 UTC 2017


You're right that my patch was a bit of a short-term hack.  It
addressed a particular case where we can easily discard flows that we
know can never match (because of the logical port included in the
match).

Use of conjunctive matches is a better, more general solution to the
flow explosion when using address sets, but that hasn't been worked on
yet.
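
To give a rough sense of why conjunctive matches would help (the table
number, priority, conjunction id, and addresses below are made up for
illustration, not what ovn-controller emits today): instead of
installing the N*M cross product of source addresses and destination
ports, vswitchd could be given N + M + 1 flows, something like:

  # one flow per address in the set, all pointing at dimension 1 of conjunction 7
  priority=100,ip,nw_src=10.0.0.1 actions=conjunction(7,1/2)
  priority=100,ip,nw_src=10.0.0.2 actions=conjunction(7,1/2)
  # one flow per destination port, dimension 2
  priority=100,tcp,tp_dst=80 actions=conjunction(7,2/2)
  priority=100,tcp,tp_dst=81 actions=conjunction(7,2/2)
  # a single flow that fires only when both dimensions matched
  priority=100,conj_id=7 actions=resubmit(,45)

A packet has to match one flow from each dimension before the conj_id
flow applies, so the flow count grows additively rather than
multiplicatively.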


On Wed, Dec 13, 2017 at 5:00 PM, Kevin Lin <kevin at kelda.io> wrote:
> Thanks for the replies!
>
> We’re using v2.8.1.
>
> I don’t completely understand Russell’s patch, but I don’t think our ACLs
> were taking advantage of it. Do the ACLs need to be “tagged” with port
> information in order for the patch to be useful?
>
> Before, our ACLs were in terms of L3 and above. I brought down the number of
> flows from 252981 to 16224 by modifying the ACLs to include the logical
> port. For example:
>
> from-lport: (inport == "1" || inport == "2" || inport == "3") && (ip4.dst == $addressSet)
> to-lport: (ip4.src == $addressSet) && (outport == "1" || outport == "2" || outport == "3")
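>
> In ovn-nbctl terms these correspond to something like the following (the
> switch name "sw0", the priority, and the "allow" verdict here are
> placeholders for our real values):
>
>   ovn-nbctl acl-add sw0 from-lport 1000 '(inport == "1" || inport == "2" || inport == "3") && ip4.dst == $addressSet' allow
>   ovn-nbctl acl-add sw0 to-lport 1000 'ip4.src == $addressSet && (outport == "1" || outport == "2" || outport == "3")' allow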
>
>
> Is that the right way to take advantage of Russell’s patch? This explodes
> the address set on one side of the connection, but does decrease the number
> of flows installed on each vswitchd.
>
> Even with the decrease in flows in vswitchd, I’m still seeing the log
> messages. Do 16224 flows per vswitchd instance and 559 flows in ovn-sb
> sound reasonable?
>
> Thanks,
> —Kevin
>
> On Dec 13, 2017, at 7:55 AM, Mark Michelson <mmichels at redhat.com> wrote:
>
> On 12/12/2017 03:26 PM, Kevin Lin wrote:
>
> Hi again,
> We’re trying to scale up our OVN deployment and we’re seeing some worrying
> log messages.
> The topology is 32 containers connected to another 32 containers on 10
> different ports. This is running on 17 machines (one machine runs ovn-northd
> and ovsdb-server, the other 16 run ovn-controller, ovs-vswitchd, and
> ovsdb-server). We’re using an address set for the source group, but not the
> destination group. We’re also creating a different ACL for each port. So the
> ACLs look like:
> One address set for { container1, container2, … container32 }
> addressSet -> container1 on port 80
> addressSet -> container1 on port 81
> …
> addressSet -> container1 on port 90
> addressSet -> container2 on port 80
> …
> addressSet -> container32 on port 90
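> (For context, an address set like that can be created with the generic
> ovn-nbctl create command, roughly:
>   ovn-nbctl create Address_Set name=addressSet addresses='"10.0.0.1", "10.0.0.2"'
> with the full list of container addresses in place of the two placeholder
> IPs shown here.)
>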
> The ovn-controller log:
> 2017-12-12T20:14:49Z|11878|timeval|WARN|Unreasonably long 1843ms poll
> interval (1840ms user, 0ms system)
> 2017-12-12T20:14:49Z|11879|timeval|WARN|disk: 0 reads, 16 writes
> 2017-12-12T20:14:49Z|11880|timeval|WARN|context switches: 0 voluntary, 21
> involuntary
> 2017-12-12T20:14:49Z|11881|poll_loop|DBG|wakeup due to [POLLIN] on fd 9
> (172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (36% CPU
> usage)
> 2017-12-12T20:14:49Z|11882|poll_loop|DBG|wakeup due to [POLLIN] on fd 12
> (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:157 (36% CPU usage)
> 2017-12-12T20:14:49Z|11883|jsonrpc|DBG|tcp:172.31.2.181:6640: received
> reply, result=[], id="echo"
> 2017-12-12T20:14:49Z|11884|netlink_socket|DBG|nl_sock_transact_multiple__
> (Success): nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO],
> seq=b11, pid=2268452876
> 2017-12-12T20:14:49Z|11885|netlink_socket|DBG|nl_sock_recv__ (Success):
> nl(len:136, type=36(family-defined), flags=0, seq=b11, pid=2268452876
> 2017-12-12T20:14:49Z|11886|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt:
> received: OFPT_ECHO_REQUEST (OF1.3) (xid=0x0): 0 bytes of payload
> 2017-12-12T20:14:49Z|11887|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt:
> sent (Success): OFPT_ECHO_REPLY (OF1.3) (xid=0x0): 0 bytes of payload
> 2017-12-12T20:14:51Z|11888|timeval|WARN|Unreasonably long 1851ms poll
> interval (1844ms user, 8ms system)
> 2017-12-12T20:14:51Z|11889|timeval|WARN|context switches: 0 voluntary, 11
> involuntary
> 2017-12-12T20:14:52Z|11890|poll_loop|DBG|wakeup due to [POLLIN] on fd 9
> (172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (73% CPU
> usage)
> 2017-12-12T20:14:52Z|11891|jsonrpc|DBG|tcp:172.31.2.181:6640: received
> request, method="echo", params=[], id="echo"
> 2017-12-12T20:14:52Z|11892|jsonrpc|DBG|tcp:172.31.2.181:6640: send reply,
> result=[], id="echo"
> 2017-12-12T20:14:52Z|11893|netlink_socket|DBG|nl_sock_transact_multiple__
> (Success): nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO],
> seq=b12, pid=2268452876
> 2017-12-12T20:14:52Z|11894|netlink_socket|DBG|nl_sock_recv__ (Success):
> nl(len:136, type=36(family-defined), flags=0, seq=b12, pid=2268452876
> 2017-12-12T20:14:52Z|11895|netdev_linux|DBG|Dropped 18 log messages in last
> 56 seconds (most recently, 3 seconds ago) due to excessive rate
> 2017-12-12T20:14:52Z|11896|netdev_linux|DBG|unknown qdisc "mq"
> 2017-12-12T20:14:54Z|11897|hmap|DBG|Dropped 15511 log messages in last 6
> seconds (most recently, 0 seconds ago) due to excessive rate
> 2017-12-12T20:14:54Z|11898|hmap|DBG|ovn/lib/expr.c:2644: 6 nodes in bucket
> (128 nodes, 64 buckets)
> 2017-12-12T20:14:54Z|11899|timeval|WARN|Unreasonably long 1831ms poll
> interval (1828ms user, 4ms system)
> 2017-12-12T20:14:54Z|11900|timeval|WARN|context switches: 0 voluntary, 12
> involuntary
> The log messages show up continuously, even when the network isn’t being
> used.
> I poked around with Ethan Jackson and he noted that the hmap counters seem
> unusually high:
> root at ip-172-31-11-193:/# ovs-appctl -t ovn-controller coverage/show
> Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=d6ee5804:
> hmap_pathological        2323.6/sec  2662.467/sec     2514.0069/sec   total: 9407536
> hmap_expand              3596.8/sec  4121.283/sec     3890.8833/sec   total: 14604479
> txn_unchanged              0.8/sec     0.917/sec        0.8658/sec   total: 5659
> txn_incomplete             0.0/sec     0.000/sec        0.0008/sec   total: 33
> txn_success                0.0/sec     0.000/sec        0.0006/sec   total: 24
> poll_create_node           2.4/sec     2.750/sec        2.5986/sec   total: 18218
> poll_zero_timeout          0.0/sec     0.000/sec        0.0100/sec   total: 71
> rconn_queued               0.0/sec     0.050/sec        0.0531/sec   total: 252570
> rconn_sent                 0.0/sec     0.050/sec        0.0531/sec   total: 252570
> seq_change                 1.2/sec     1.383/sec        1.2992/sec   total: 8500
> pstream_open               0.0/sec     0.000/sec        0.0000/sec   total: 1
> stream_open                0.0/sec     0.000/sec        0.0000/sec   total: 6
> unixctl_received           0.0/sec     0.000/sec        0.0019/sec   total: 7
> unixctl_replied            0.0/sec     0.000/sec        0.0019/sec   total: 7
> util_xalloc              2731550.2/sec 3129900.483/sec   569276.5414/sec   total: 11057381035
> vconn_open                 0.0/sec     0.000/sec        0.0000/sec   total: 4
> vconn_received             0.0/sec     0.050/sec        0.0444/sec   total: 201
> vconn_sent                 0.0/sec     0.000/sec        0.0144/sec   total: 253535
> netdev_get_ifindex         0.4/sec     0.467/sec        0.4328/sec   total: 2822
> netlink_received           0.4/sec     0.467/sec        0.4328/sec   total: 2822
> netlink_sent               0.4/sec     0.467/sec        0.4328/sec   total: 2822
> cmap_expand                0.0/sec     0.000/sec        0.0000/sec   total: 2
> 47 events never hit
> I’ve also attached the output of ovs-bugtool run from the machine running
> ovn-northd, and one of the machines running ovn-controller and ovs-vswitchd.
> Thanks,
> —Kevin
>
>
> Hi Kevin,
>
> In short, I agree with Ethan's assessment that the hmap numbers appear
> large. The ACLs, combined with ovn-controller's algorithms, are causing
> that: the ACLs result in ~250000 OpenFlow flows being generated, and
> ovn-controller re-creates that flow table from scratch each time it runs
> its main loop.
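>
> If you want to double-check how many flows actually land in vswitchd on a
> given chassis, something like this should do it (br-int is the usual
> integration bridge; you may need -O OpenFlow13 depending on how the
> bridge's protocols are configured):
>
>   ovs-ofctl dump-aggregate br-int        # prints a flow_count=... summary
>   ovs-ofctl dump-flows br-int | wc -l    # same count plus one header line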
>
> These problems are as yet unsolved in ovn-controller. The two high-level
> thoughts on the matter are:
> 1) See if we can reduce the number of generated flows when ACLs are used.
> 2) See if we can perform incremental changes on the flow table rather than
> re-creating it from scratch every time.
>
> There are two commits that may be of interest to you.
>
> * As Miguel mentioned in his response, Russell Bryant submitted a change [1]
> within the past couple of months that helps a bit with (1). The idea is that
> if we can detect that a flow will never be needed on a specific machine, we
> don't generate it. It's not going to cover everything, but it could
> potentially help you. That change is present in 2.7.3 and 2.8.1.
>
> * A change [2] by Han Zhou fixed an issue where we were generating the flow
> table and then just throwing it away since vswitchd was not ready for us to
> submit a new flow table to it. This change is present in 2.7.3 and 2.8.0.
>
> These two commits aren't silver bullets, but if you're running a version of
> OVS that doesn't already include them, you may see some improvement from an
> upgrade.
>
> Mark
>
> [1]
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338847.html
> [2] https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331286.html
>
>
>
> _______________________________________________
> discuss mailing list
> discuss at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>



-- 
Russell Bryant

