[ovs-discuss] OVN at scale in production

Seena Fallah seenafallah at gmail.com
Sat Oct 9 19:01:49 UTC 2021


Also, I get many log messages like this in OVN:

2021-10-09T18:54:45.263Z|01151|jsonrpc|WARN|Dropped 6 log messages in last
8 seconds (most recently, 3 seconds ago) due to excessive rate
2021-10-09T18:54:45.263Z|01152|jsonrpc|WARN|tcp:10.0.0.1:44454: receive
error: Connection reset by peer
2021-10-09T18:54:45.263Z|01153|reconnect|WARN|tcp:10.0.0.1:44454: connection
dropped (Connection reset by peer)
2021-10-09T18:54:46.798Z|01154|reconnect|WARN|tcp:10.0.0.2:50224:
connection dropped (Connection reset by peer)
2021-10-09T18:54:49.127Z|01155|reconnect|WARN|tcp:10.0.0.3:48514:
connection dropped (Connection reset by peer)
2021-10-09T18:54:51.241Z|01156|reconnect|WARN|tcp:10.0.0.3:48544:
connection dropped (Connection reset by peer)
2021-10-09T18:54:53.005Z|01157|reconnect|WARN|tcp:10.0.0.3:48846:
connection dropped (Connection reset by peer)
2021-10-09T18:54:53.246Z|01158|reconnect|WARN|tcp:10.0.0.3:48796:
connection dropped (Connection reset by peer)

What does "excessive rate" mean here? How many req/s would count as an
excessive rate?
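
As a rough aid for reading excerpts like the one above, the following Python sketch tallies "connection dropped" warnings per peer address. It assumes the `timestamp|seq|module|level|message` layout seen in these lines and that the email line-wrapping has been undone so each record sits on one line. The name `count_drops` and the sample list are illustrative only and are not part of OVN or OVS.

```python
import re
from collections import Counter

# Matches e.g. "...|reconnect|WARN|tcp:10.0.0.3:48514: connection dropped (...)"
# Assumption: one log record per line (wrapped lines already rejoined).
DROP_RE = re.compile(r"\|reconnect\|WARN\|tcp:([^:]+):\d+: connection dropped")


def count_drops(lines):
    """Return a Counter mapping peer address -> number of dropped connections."""
    counts = Counter()
    for line in lines:
        match = DROP_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three records copied from the excerpt, rejoined onto single lines.
sample = [
    "2021-10-09T18:54:46.798Z|01154|reconnect|WARN|tcp:10.0.0.2:50224: connection dropped (Connection reset by peer)",
    "2021-10-09T18:54:49.127Z|01155|reconnect|WARN|tcp:10.0.0.3:48514: connection dropped (Connection reset by peer)",
    "2021-10-09T18:54:51.241Z|01156|reconnect|WARN|tcp:10.0.0.3:48544: connection dropped (Connection reset by peer)",
]

# Print peers from most to fewest drops.
for peer, n in count_drops(sample).most_common():
    print(peer, n)
```

Grouping by peer like this may help show whether the resets come from one client or are spread across all of them.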

On Thu, Oct 7, 2021 at 12:46 AM Seena Fallah <seenafallah at gmail.com> wrote:

> It seems most of the leader failures are for NB, and the command you
> suggested is for SB.
>
> Do you have any benchmarks of how many ACLs OVN can normally handle?
> I see many failures after 100k ACLs.
>
> On Thu, Oct 7, 2021 at 12:14 AM Numan Siddique <numans at ovn.org> wrote:
>
>> On Wed, Oct 6, 2021 at 2:49 PM Seena Fallah <seenafallah at gmail.com>
>> wrote:
>> >
>> > I'm using these versions in a CentOS container:
>> > ovsdb-server (Open vSwitch) 2.15.2
>> > ovn-nbctl 21.06.0
>> > Open vSwitch Library 2.15.90
>> > DB Schema 5.32.0
>> >
>> > Today I saw the election time out as well, so I should increase the ovsdb
>> > election timeout too. I looked at the commits but didn't find any change
>> > related to my problem.
>> > If I use OVN 21.09 with ovsdb 2.16, is there still any need to increase the
>> > election timeout and disable the inactivity probe?
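
For the election timeout mentioned above, here is a minimal sketch of how a raise could be planned. It assumes the behaviour documented for ovsdb-server's `cluster/change-election-timer` command, namely that a single change may at most double the current value, and it assumes a starting value of 1000 ms. The 10000 ms target, the helper name `election_timer_steps`, and the control-socket path are illustrative assumptions and will differ per installation.

```python
def election_timer_steps(current_ms, target_ms):
    """List the intermediate values needed to raise the timer to target_ms,
    never more than doubling the previous value in a single step."""
    if current_ms <= 0 or target_ms < current_ms:
        raise ValueError("expected 0 < current_ms <= target_ms")
    steps = []
    while current_ms < target_ms:
        current_ms = min(current_ms * 2, target_ms)
        steps.append(current_ms)
    return steps


# Assumed socket path and example values; adjust for the actual deployment.
CTL = "/var/run/ovn/ovnnb_db.ctl"
for ms in election_timer_steps(1000, 10000):
    # Only prints the commands; nothing is executed here.
    print(f"ovs-appctl -t {CTL} cluster/change-election-timer OVN_Northbound {ms}")
```

The sketch only prints the commands so they can be reviewed before being run by hand.
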
>>
>> Not sure on that.  It's worth a try if you have a test environment.
>>
>> > Also, is there any limitation on the number of ACLs that OVN can handle?
>>
>> I don't think there is any limitation on the number of ACLs.  In
>> general as the size of the SB DB increases, we have seen issues.
>>
>> Can you run the below command on each of your nodes where
>> ovn-controller runs and see if that helps ?
>>
>> ---
>> ovs-vsctl set open . external_ids:ovn-monitor-all=true
>> ---
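
Since the command has to be run on every node where ovn-controller runs, a small sketch like the following could generate one ssh invocation per node for review. The host names and the use of plain ssh are assumptions made for illustration; only the `ovs-vsctl` command itself comes from the message above, and nothing is executed.

```python
import shlex

# The command quoted above, split into argv form.
MONITOR_ALL_CMD = ["ovs-vsctl", "set", "open", ".",
                   "external_ids:ovn-monitor-all=true"]


def per_node_commands(nodes):
    """Return one ssh command line per node, without executing anything."""
    remote = shlex.join(MONITOR_ALL_CMD)
    return [shlex.join(["ssh", node, remote]) for node in nodes]


# Hypothetical host names, for illustration only.
for line in per_node_commands(["compute-1", "compute-2"]):
    print(line)
```
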
>>
>> Thanks
>> Numan
>>
>>
>> >
>> > Thanks.
>> >
>> > On Wed, Oct 6, 2021 at 9:43 PM Numan Siddique <numans at ovn.org> wrote:
>> >>
>> >> On Wed, Oct 6, 2021 at 12:15 PM Seena Fallah <seenafallah at gmail.com>
>> >> wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > I use OVN as the OpenStack Neutron plugin in production. After a few
>> >> > days I saw issues with ovsdb losing its leader. It seems it was because
>> >> > of the failing inactivity probe and because I had 17k ACLs. After I
>> >> > disabled the inactivity probe it worked fine, but when I ran a scale
>> >> > test (about 40k ACLs) it lost the leader again.
>> >> > I saw many docs about OVN-at-scale issues raised by both Red Hat and
>> >> > eBay, and it seems the solution is to rewrite OVN with DDlog. I tried
>> >> > northd-ddlog but nothing changed.
>> >> >
>> >> > My question is: should I wait longer for OVN to become stable at high
>> >> > scale, or is there any tuning I am missing in my deployment?
>> >> > Also, will ovn-nb/sb be rewritten with DDlog, and would that help with
>> >> > the issues at high scale? If yes, is there a timeline?
>> >>
>> >> What is the ovsdb-server version you're using ?  There are many
>> >> improvements in the ovsdb-server in 2.16.
>> >> Maybe that would help in your deployment.  And also there were many
>> >> improvements which went into OVN 21.09
>> >> if you want to test it out.
>> >>
>> >> Thanks
>> >> Numan
>> >>
>> >> >
>> >> > Thanks.
>> >> > _______________________________________________
>> >> > discuss mailing list
>> >> > discuss at openvswitch.org
>> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>> >
>>
>