[ovs-dev] [ovs-discuss] OVN scale

Tue Jul 28 07:05:41 UTC 2020

On Mon, Jul 27, 2020 at 10:16 AM Tony Liu <tonyliu0592 at hotmail.com> wrote:

> Hi Han,
>
> Just some updates here.
>
> I tried with 4K networks on single router. Configuration was done without
> any issues. I checked both
> nb-db and sb-db, they all look good. It's just that router configuration
> is huge (in Neutron DB, nb-db
> and flow table in sb-db), because it contains all 4K ports. Also, the
> pipeline of router datapath in sb-db
> is quite big.
>
> I see ovn-northd master and sb-db leader are busy, taking 90+% CPU. There
> are only 3 compute nodes
> and 2 gateway nodes. Does the monitor setting "ovn-monitor-all" matter
> in such a case? Any idea what
> they are busy with, without any configuration updates from OpenStack? The
> nb-db is not busy though.
>

Did you create logical switch ports in your test? Did you do port-binding
on compute nodes? If yes, then "ovn-monitor-all" would matter, since all
networks are connected to the same router. With "ovn-monitor-all" = true,
it would avoid the huge monitor condition change messages.

Normally, if there is no NB-DB change, all components should be idle.

> Probably because sb-db is busy, ovn-controller can't connect to it
> consistently. It keeps being
> disconnected and reconnecting. Restarting ovn-controller seems to help. I am
> able to launch a few VMs
> on different networks and they are connected via the router.
>

If you are seeing ovn-controller disconnected from sb-db due to probe
timeout, you can disable/adjust the probe interval. See this slide:
https://www.slideshare.net/hanzhou1978/large-scale-overlay-networks-with-ovn-problems-and-solutions/16
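
As a rough illustration of adjusting the probe interval, the sketch below
builds the ovs-vsctl command quoted further down in this thread for a given
interval. The helper name `probe_interval_cmd` is made up for this example.
The rules that 0 disables the probe and that nonzero values are forced to at
least 1000 ms are as I understand them from ovn-controller(8), so please check
the man page for your version. The helper only builds the argument list and
does not run anything.

```python
def probe_interval_cmd(interval_ms):
    """Build (but do not run) the ovs-vsctl command that sets the
    ovn-controller -> SB DB inactivity probe interval on one chassis.

    Assumption: 0 disables the probe, and nonzero values below 1000 ms
    would be raised to 1000 ms by ovn-controller, so they are rejected
    here to avoid surprises.
    """
    if interval_ms != 0 and interval_ms < 1000:
        raise ValueError("use 0 to disable, or a value >= 1000 ms")
    return ["ovs-vsctl", "set", "open", ".",
            f"external_ids:ovn-remote-probe-interval={interval_ms}"]


print(" ".join(probe_interval_cmd(0)))       # disable probing
print(" ".join(probe_interval_cmd(180000)))  # probe every 3 minutes
```

The setting is per chassis, so the resulting command would have to be run on
every hypervisor and gateway node.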

> Now, I have a problem with external access. The router is set as gateway to a
> provider/underlay network
> on an interface on the gateway node. The router is allocated an underlay
> address from that provider
> network. My understanding is that the br-ex on the gateway node holding the
> active router will broadcast
> ARP to announce that router underlay address in case of failover. Also, it
> will respond to ARP requests for
> that router underlay address. But when I run tcpdump on that underlay
> interface on gateway node,
> I see ARP request coming in, but no ARP response going out. I checked the
> flow table in sb-db, it seems
> ok. I also checked flow on br-ex by "ovs-ofctl dump-flows br-ex", I don't
> see anything about ARP there.
> How should I look into it?
>

"br-ex" is not managed by OVN, so you won't see any flows there. Did you
use OpenStack commands to set up the gateway? Did you see the port-binding of
the gateway port in SB DB?

> Again, the case is to support 4K networks with external access (security
> group is disabled).
> The options are 4K routers (one for each network), 50 routers (one per 80
> networks), or 1 router (for all 4K networks).
> All networks are isolated by ACL on the logical router. Which option
> should work better?
> Any comment is appreciated.
>
>

If the 4K networks don't need to communicate with each other, then what
would scale the best (in theory) is: 4K routers (one for each network) with
ovn-monitor-all=false. This way, each HV only needs to process a small
proportion of the data. (The monitor condition change messages should also
be small, because each HV only monitors the networks that have related VMs on
the HV.)
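
To make the comparison concrete, here is a toy model (not how ovn-controller
computes its local datapaths) that counts the logical switches and routers
connected to the networks that have VMs on one HV. The function name, the
assumption that network i attaches to router i // networks_per_router, and
the example set of local networks are all made up for illustration. A shared
external/provider switch is deliberately ignored.

```python
def reachable_datapaths(local_networks, networks_per_router, total_networks=4000):
    """Count logical datapaths (switches + routers) connected to the
    networks that have VMs on one hypervisor.

    Toy model: network i attaches to router i // networks_per_router, and
    a shared external/provider switch is ignored.
    """
    routers = {n // networks_per_router for n in local_networks}
    switches = 0
    for r in routers:
        first = r * networks_per_router
        last = min(first + networks_per_router, total_networks)
        switches += last - first  # every switch behind a touched router
    return len(routers) + switches


local = {0, 5, 9}  # hypothetical: this HV hosts VMs on three networks
for per_router in (1, 80, 4000):
    print(per_router, reachable_datapaths(local, per_router))
```

For these inputs the counts are 6, 81 and 4001 datapaths for the 4K-router,
50-router and single-router layouts respectively, which is the difference in
per-HV data that the recommendation above relies on.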

Thanks,
Han

> Thanks!
>
> Tony
>
>
> ------------------------------
> *From:* discuss <ovs-discuss-bounces at openvswitch.org> on behalf of Tony
> Liu <tonyliu0592 at hotmail.com>
> *Sent:* July 21, 2020 09:09 PM
> *To:* Daniel Alvarez <dalvarez at redhat.com>
> *Cc:* ovs-discuss at openvswitch.org <ovs-discuss at openvswitch.org>
> *Subject:* Re: [ovs-discuss] OVN scale
>
> [root at ovn-db-2 ~]# ovn-nbctl list nb_global
> _uuid               : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
> connections         : []
> external_ids        : {"neutron:liveness_check_at"="2020-07-22
> 04:03:17.726917+00:00"}
> hv_cfg              : 312
> ipsec               : false
> name                : ""
> nb_cfg              : 2636
> options             : {mac_prefix="ca:e8:07",
> svc_monitor_mac="4e:d0:3a:80:d4:b7"}
> sb_cfg              : 2005
> ssl                 : []
>
> [root at ovn-db-2 ~]# ovn-sbctl list sb_global
> _uuid               : 3720bc1d-b0da-47ce-85ca-96fa8d398489
> connections         : []
> external_ids        : {}
> ipsec               : false
> nb_cfg              : 312
> options             : {mac_prefix="ca:e8:07",
> svc_monitor_mac="4e:d0:3a:80:d4:b7"}
> ssl                 : []
>
> The NBDB and SBDB are definitely out of sync. Is there any way to force
> ovn-northd to sync them?
>
> Thanks!
>
> Tony
>
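
A minimal sketch of how the sequence numbers in the output above can be
compared. As I read ovn-nb(5), nb_cfg is bumped by clients, sb_cfg is the
value ovn-northd has applied to the SB DB, and hv_cfg is the smallest value
reported by any chassis. The helper names `parse_record` and `cfg_lag` are
invented for this example, and the input is just the three numeric columns
copied from the quoted nb_global record.

```python
import re

# Values copied from the `ovn-nbctl list nb_global` output quoted above
# (only the sequence-number columns are kept).
NB_GLOBAL = """\
hv_cfg              : 312
nb_cfg              : 2636
sb_cfg              : 2005
"""


def parse_record(text):
    """Parse 'column : value' lines as printed by ovn-nbctl/ovn-sbctl list."""
    record = {}
    for line in text.splitlines():
        m = re.match(r"^([A-Za-z_]\w*)\s+:\s*(.*)$", line)
        if m:
            record[m.group(1)] = m.group(2).strip()
    return record


def cfg_lag(nb_global):
    """Return how far ovn-northd and the slowest chassis trail nb_cfg."""
    nb_cfg = int(nb_global["nb_cfg"])
    return {
        "northd_behind": nb_cfg - int(nb_global["sb_cfg"]),
        "slowest_chassis_behind": nb_cfg - int(nb_global["hv_cfg"]),
    }


print(cfg_lag(parse_record(NB_GLOBAL)))
```

With the numbers above, ovn-northd trails by 631 and the slowest chassis by
2324. Running "ovn-nbctl --wait=sb sync" should block until ovn-northd has
caught up with the NB DB, which is one way to see whether it ever does.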
> ------------------------------
> *From:* Tony Liu <tonyliu0592 at hotmail.com>
> *Sent:* July 21, 2020 08:39 PM
> *To:* Daniel Alvarez <dalvarez at redhat.com>
> *Cc:* Cory Hawkless <Cory at hawkless.id.au>; ovs-discuss at openvswitch.org <
> ovs-discuss at openvswitch.org>; Dumitru Ceara <dceara at redhat.com>
> *Subject:* Re: [ovs-discuss] OVN scale
>
> When creating a network (and subnet) on OpenStack, a GW port and service
> port (for DHCP and metadata)
> are also created. They are created in Neutron and ovn-nb-db by the ML2 driver.
> Then ovn-northd will translate
> such updates from NBDB to SBDB. My question here is that, with 20.03, is
> this translation incremental?
>
> After creating 4000 networks successfully on OpenStack, I see 4000 logical
> switches and 8000 LS ports
> in NBDB. But in SBDB, there are only 1567 port-bindings. The break
> happened when translating the 1568th
> port. If ovn-northd recompiles the whole DB for every update, this problem
> can be explained. The DB is
> too big for ovn-northd to compile in time, so all the following updates are
> lost. Does it make sense?
>
> I recall DB updates are coordinated by some "version": when some changes
> happen in NBDB, the version
> bumps up, ovn-northd updates SBDB and bumps up its version as well, so they
> match. So, if the NBDB version
> bumps up more than once while ovn-northd is updating SBDB, is that still
> going to work? If yes, then it's
> just a matter of time: no matter how fast updates happen in NBDB,
> ovn-northd will catch up with them
> eventually. Am I right about that?
>
> Any comment is welcome.
>
>
> Thanks!
>
> Tony
>
>
> ------------------------------
> *From:* Tony Liu <tonyliu0592 at hotmail.com>
> *Sent:* July 21, 2020 10:22 AM
> *To:* Daniel Alvarez <dalvarez at redhat.com>
> *Cc:* Cory Hawkless <Cory at hawkless.id.au>; ovs-discuss at openvswitch.org <
> ovs-discuss at openvswitch.org>; Dumitru Ceara <dceara at redhat.com>
> *Subject:* Re: [ovs-discuss] OVN scale
>
> Hi Daniel, all
>
> 4000 networks and 50 routers, 200 networks on each router, they are all
> created.
> CPU usage of Neutron server, ovn-nb-db, ovn-northd, ovn-sb-db,
> ovn-controller and ovs-vswitchd is OK,
> not consistently 100%, but still some spikes to it.
>
> Now, when creating a VM, I got a "waiting for vif-plugged-in timeout". This
> brings up another question:
> it used to be neutron-agent notifying Neutron server of port status changes;
> with OVN, who does it?
> How should I look into this?
>
> Please see my other comments Inline...
>
>
> Thanks!
>
> Tony
> ------------------------------
> *From:* Daniel Alvarez <dalvarez at redhat.com>
> *Sent:* July 21, 2020 12:06 AM
> *To:* Tony Liu <tonyliu0592 at hotmail.com>
> *Cc:* Cory Hawkless <Cory at hawkless.id.au>; ovs-discuss at openvswitch.org <
> ovs-discuss at openvswitch.org>; Dumitru Ceara <dceara at redhat.com>
> *Subject:* Re: [ovs-discuss] OVN scale
>
> Hi Tony, all
>
>
>
> On 21 Jul 2020, at 07:53, Tony Liu <tonyliu0592 at hotmail.com> wrote:
>
> Hi Cory,
>
> With 4000 networks all connecting to one router with external GW, all
> networks and router
> are created and connected. I launched a few VMs on some networks, they are
> connected and
> all have external connectivity. When running ping on VM, there is a slow
> ping (a few seconds)
> out of 10+ normal pings (< 1ms). When checking CPU usage, I see Neutron
> server, OVN DB,
> OVN controller and ovs-vswitchd all take almost 100% CPU. It's been like
> that for hours already.
> Since they are all created and some of them work fine (didn't validate all
> networks), not sure
> what those services are busy with. Checked log, the ovn-controller keeps
> switching between
> ovn-sb-db, because of heartbeat timeout.
>
>
> How are you deploying OpenStack and in particular the OVN dbs? Is it RAFT
> cluster?
>
> > Kolla Ansible. I see cluster-local-address and remote address (to the
> > first node)
> > are specified for all 3 nodes. I assume clustering is enabled.
> > Is there a different type of cluster?
>
> What’s your current value for ovn-remote-probe-interval? If it’s too low,
> this can be triggering reconnections all the time and creating a snowball
> effect.
>
> > external_ids        : {ovn-encap-ip="10.6.30.22", ovn-encap-type=geneve,
> > ovn-remote="tcp:10.6.20.84:6642,tcp:10.6.20.85:6642,tcp:10.6.20.86:6642",
> > ovn-remote-probe-interval="60000", system-id="compute-3"}
>
> You can bump the probe interval timeout like this:
>
> ovs-vsctl set open . external_ids:ovn-remote-probe-interval=<TIME IN MS>
>
>
> I'd like to know if that's expected, or something I can tune to fix the
> problem. If that's expected,
> I can't think of anything other than building multiple clusters to support
> that kind of scale.
>
> I am running test with 4000 networks with 50 routers, 80 networks on each
> router. Wondering
> if that's going to help.
>
>
> Reducing the number of routers should help. Also there are some
> improvements in 20.06 release when it comes to the number of logical flows
> by a series of patches from Han. I will post the links later, sorry.
>
> Also there is a big improvement around large Port Groups as they are now
> split by datapath, dramatically reducing the calculations in
> ovn-controller, especially in scenarios with a large number of networks like
> yours.
> However you seem to have no security groups and hence no Port Groups in
> the NB database. Is this correct?
>
> > Yes. For now, I want to avoid scale impact from SG, so I disable it.
>
> Is there any chance you can re run the initial scenario but with 20.06?
>
> > Is there a container for 20.06? Or where can I get the packages for 20.06?
> > I should be able to upgrade 20.03 to 20.06 by upgrading packages.
>
>
> The goal is to have thousands of networks connecting to external. I'd like to
> know what's the
> expected scale supported by current OVN.
>
>
> +Dumitru as we know that there is a limit of 3000 on the number of
> resubmissions. So having 3K routers connected to the public logical switch
> may hit this limitation. Please @Dumitru correct me if I’m wrong.
>
> Any comment is welcome.
>
>
> Thanks!
>
> Tony
>
> ------------------------------
> *From:* Cory Hawkless <Cory at hawkless.id.au>
> *Sent:* July 20, 2020 10:04 PM
> *To:* Tony Liu <tonyliu0592 at hotmail.com>; ovs-discuss at openvswitch.org <
> ovs-discuss at openvswitch.org>
> *Subject:* RE: OVN scale
>
>
> I would expect to see 100% cpu utilisation on anything involved in the
> process of creating 4000 networks and routers but the question is for how
> long do you see high utilisation? Does it last for seconds, minutes, hours?
>
> Do the resources actually get created after some period of time or is the
> process failing?
>
>
>
> *From:* discuss [mailto:ovs-discuss-bounces at openvswitch.org] *On Behalf
> Of *Tony Liu
> *Sent:* Tuesday, 21 July 2020 1:53 PM
> *To:* ovs-discuss at openvswitch.org
> *Subject:* [ovs-discuss] OVN scale
>
>
>
> Hi folks,
>
> This is my first email here. Please let me know if there is any rule
> or convention I need to follow. Don't want to break it.
>
> I started with OpenStack Ussuri and OVN 20.03.0 recently and am currently
> running some scaling tests. Searched around for scaling info and noticed
> some improvements already presented, which is pretty cool.
> Is "incremental" processing by DDlog implemented yet?
>
> With a 3-node OVN DB cluster and 3 compute nodes (with OVN controller),
> I created 4000 networks from OpenStack, 4000 logical routers with
> external GW, and added one network to each LR. Port security is disabled on
> all networks. Then I see ovn-northd, ovn-controller and ovs-vswitchd all
> take almost 100% CPU. Is this expected?
>
> I revised the solution and am running a test with 4000 networks, 20 LRs and
> 200 networks on each LR. Will see if this makes any difference.
>
> Is there any scaling and performance report with the latest OVN release
> as my reference?
>
> Thanks!
>
> Tony
>
>
>
>