[ovs-discuss] [OVN] ovn-controller Incremental Processing scale testing

Numan Siddique nusiddiq at redhat.com
Tue Jul 9 07:12:49 UTC 2019


On Tue, Jul 9, 2019 at 12:25 PM Daniel Alvarez Sanchez <dalvarez at redhat.com>
wrote:

> Thanks Numan for running these tests outside OpenStack!
>
> On Tue, Jul 9, 2019 at 7:50 AM Numan Siddique <nusiddiq at redhat.com> wrote:
> >
> >
> >
> > On Tue, Jul 9, 2019 at 11:05 AM Han Zhou <zhouhan at gmail.com> wrote:
> >>
> >>
> >>
> >> On Fri, Jun 21, 2019 at 12:31 AM Han Zhou <zhouhan at gmail.com> wrote:
> >> >
> >> >
> >> >
> >> > On Thu, Jun 20, 2019 at 11:42 PM Numan Siddique <nusiddiq at redhat.com>
> wrote:
> >> > >
> >> > >
> >> > >
> >> > > On Fri, Jun 21, 2019, 11:47 AM Han Zhou <zhouhan at gmail.com> wrote:
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Tue, Jun 11, 2019 at 9:16 AM Daniel Alvarez Sanchez <
> dalvarez at redhat.com> wrote:
> >> > >> >
> >> > >> > Thanks a lot Han for the answer!
> >> > >> >
> >> > >> > On Tue, Jun 11, 2019 at 5:57 PM Han Zhou <zhouhan at gmail.com>
> wrote:
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > > On Tue, Jun 11, 2019 at 5:12 AM Dumitru Ceara <
> dceara at redhat.com> wrote:
> >> > >> > > >
> >> > >> > > > On Tue, Jun 11, 2019 at 10:40 AM Daniel Alvarez Sanchez
> >> > >> > > > <dalvarez at redhat.com> wrote:
> >> > >> > > > >
> >> > >> > > > > Hi Han, all,
> >> > >> > > > >
> >> > >> > > > > Lucas, Numan and I have been doing some 'scale' testing of
> OpenStack
> >> > >> > > > > using OVN and wanted to present some results and issues
> that we've
> >> > >> > > > > found with the Incremental Processing feature in
> ovn-controller. Below
> >> > >> > > > > is the scenario that we executed:
> >> > >> > > > >
> >> > >> > > > > * 7 baremetal nodes setup: 3 controllers (running
> >> > >> > > > > ovn-northd/ovsdb-servers in A/P with pacemaker) + 4
> compute nodes. OVS
> >> > >> > > > > 2.10.
> >> > >> > > > > * The test consists of:
> >> > >> > > > >   - Create openstack network (OVN LS), subnet and router
> >> > >> > > > >   - Attach subnet to the router and set gw to the external
> network
> >> > >> > > > >   - Create an OpenStack port and apply a Security Group
> (ACLs to allow
> >> > >> > > > > UDP, SSH and ICMP).
> >> > >> > > > >   - Bind the port to one of the 4 compute nodes (randomly) by
> attaching it to a network namespace (roughly as sketched after this list).
> >> > >> > > > >   - Wait for the port to be ACTIVE in Neutron ('up ==
> True' in NB)
> >> > >> > > > >   - Wait until the test can ping the port
> >> > >> > > > > * Running browbeat/rally with 16 simultaneous processes to
> execute the
> >> > >> > > > > test above 150 times.
> >> > >> > > > > * When all the 150 'fake VMs' are created, browbeat will
> delete all
> >> > >> > > > > the OpenStack/OVN resources.
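> >> > >> > > > > (For reference, the "bind the port" step above is the usual
> fake-VM trick of moving an OVS internal port into a namespace; a minimal
> sketch, where the port UUID, MAC and IP are placeholders for the Neutron
> port's actual values:
>
>   ip netns add vm1
>   ovs-vsctl add-port br-int vm1 -- set Interface vm1 type=internal \
>       external_ids:iface-id=<neutron-port-uuid>
>   ip link set vm1 netns vm1
>   ip netns exec vm1 ip link set vm1 address <port-mac>
>   ip netns exec vm1 ip addr add <port-ip>/24 dev vm1
>   ip netns exec vm1 ip link set vm1 up
> )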
> >> > >> > > > >
> >> > >> > > > > We first tried with OVS/OVN 2.10 and pulled some results
> which showed
> >> > >> > > > > 100% success but ovn-controller is quite loaded (as
> expected) in all
> >> > >> > > > > the nodes especially during the deletion phase:
> >> > >> > > > >
> >> > >> > > > > - Compute node: https://imgur.com/a/tzxfrIR
> >> > >> > > > > - Controller node (ovn-northd and ovsdb-servers):
> https://imgur.com/a/8ffKKYF
> >> > >> > > > >
> >> > >> > > > > After conducting the tests above, we replaced
> ovn-controller in all 7
> >> > >> > > > > nodes with the one from the current master branch (actually
> from last
> >> > >> > > > > week). We also replaced ovn-northd and ovsdb-servers but
> the
> >> > >> > > > > ovs-vswitchd has been left untouched (still on 2.10). The
> expected
> >> > >> > > > > results were to get less ovn-controller CPU usage and also
> better
> >> > >> > > > > times due to the Incremental Processing feature introduced
> recently.
> >> > >> > > > > However, the results don't look very good:
> >> > >> > > > >
> >> > >> > > > > - Compute node: https://imgur.com/a/wuq87F1
> >> > >> > > > > - Controller node (ovn-northd and ovsdb-servers):
> https://imgur.com/a/99kiyDp
> >> > >> > > > >
> >> > >> > > > > One thing that we can tell from the ovs-vswitchd CPU
> consumption is
> >> > >> > > > > that it's much less in the Incremental Processing (IP)
> case which
> >> > >> > > > > apparently doesn't make much sense. This led us to think
> that perhaps
> >> > >> > > > > ovn-controller was not installing the necessary flows in
> the switch
> >> > >> > > > > and we confirmed this hypothesis by looking into the
> dataplane
> >> > >> > > > > results. Out of the 150 VMs, 10% of them were unreachable
> via ping
> >> > >> > > > > when using ovn-controller from master.
> >> > >> > > > >
> >> > >> > > > > @Han, others, do you have any ideas as to what could be
> happening
> >> > >> > > > > here? We'll be able to use this setup for a few more days
> so let me
> >> > >> > > > > know if you want us to pull some other data/traces, ...
> >> > >> > > > >
> >> > >> > > > > Some other interesting things:
> >> > >> > > > > On each of the compute nodes, (with an almost evenly
> distributed
> >> > >> > > > > number of logical ports bound to them), the max amount of
> logical
> >> > >> > > > > flows in br-int is ~90K (by the end of the test, right
> before deleting
> >> > >> > > > > the resources).
> >> > >> > > > >
> >> > >> > > > > It looks like with the IP version, ovn-controller leaks
> some memory:
> >> > >> > > > > https://imgur.com/a/trQrhWd
> >> > >> > > > > While with OVS 2.10, it remains pretty flat during the
> test:
> >> > >> > > > > https://imgur.com/a/KCkIT4O
> >> > >> > > >
> >> > >> > > > Hi Daniel, Han,
> >> > >> > > >
> >> > >> > > > I just sent a small patch for the ovn-controller memory leak:
> >> > >> > > > https://patchwork.ozlabs.org/patch/1113758/
> >> > >> > > >
> >> > >> > > > At least on my setup this is what valgrind was pointing at.
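> >> > >> > > > (For reference, a sketch of that kind of check; the exact
> invocation and log path are illustrative, not necessarily what was used:
>
>   valgrind --leak-check=full --log-file=/tmp/ovn-controller.valgrind \
>       ovn-controller
>
> then look at the "definitely lost" records in the log while the test runs.)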
> >> > >> > > >
> >> > >> > > > Cheers,
> >> > >> > > > Dumitru
> >> > >> > > >
> >> > >> > > > >
> >> > >> > > > > Looking forward to hearing back :)
> >> > >> > > > > Daniel
> >> > >> > > > >
> >> > >> > > > > PS. Sorry for my previous email, I sent it by mistake
> without the subject
> >> > >> > >
> >> > >> > > Thanks Daniel for the testing and reporting, and thanks
> Dumitru for fixing the memory leak.
> >> > >> > >
> >> > >> > > Currently ovn-controller incremental processing only handles
> the following SB changes incrementally:
> >> > >> > > - logical_flow
> >> > >> > > - port_binding (for regular VIF binding NOT on current chassis)
> >> > >> > > - mc_group
> >> > >> > > - address_set
> >> > >> > > - port_group
> >> > >> > > - mac_binding
> >> > >> > >
> >> > >> > > So, in the test scenario you described, since each iteration
> creates a network (SB datapath changes) and router ports (port_binding
> changes for non-VIF ports), incremental processing would not help much,
> because most steps in your test trigger a recompute. It would help if
> you created more fake VMs in each iteration, e.g. 10 VMs or more on
> each LS. Secondly, when a VIF port binding happens on the current chassis,
> ovn-controller will still do a full recompute, and because you have only 4
> compute nodes, 1/4 of the bindings will still cause a recompute even when
> binding a regular VIF port. With more compute nodes you would see
> incremental processing become more effective.
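> >> > >> > > For example, something along these lines per iteration (an
> illustrative sketch with made-up names; it assumes sw0 already exists and
> has a subnet configured so that "dynamic" addressing works; in the OpenStack
> test it would be the equivalent Neutron port creations on the same network):
>
>   for i in $(seq 1 10); do
>       ovn-nbctl lsp-add sw0 sw0-vif$i -- \
>           lsp-set-addresses sw0-vif$i dynamic
>   done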
> >> > >> >
> >> > >> > Got it, it makes sense (although then, in the worst case, it should
> >> > >> > be at least as good as what we had before and not worse; but it could
> >> > >> > also be because we're mixing versions here: 2.10 vs master).
> >> > >> > >
> >> > >> > > However, what really worries me is the 10% of VMs being unreachable.
> I have one point of confusion about the test steps. The last step you
> described was: "Wait until the test can ping the port." So if a VM is not
> pingable, the test won't continue?
> >> > >> >
> >> > >> > Sorry, I should've explained it better. We wait up to 2 minutes for
> the port to respond to pings; if it's not reachable, we continue with
> the next port (16 rally processes are running simultaneously, so the
> rest of the processes may be doing stuff at the same time).
> >> > >> >
> >> > >> > >
> >> > >> > > To debug the problem, the first thing is to identify which flows
> are missing for the VMs that are unreachable. Could you run ovs-appctl
> ofproto/trace for the ICMP flow of any VM with a ping failure? Then
> please enable debug logging for ovn-controller with ovs-appctl -t
> ovn-controller vlog/set file:dbg. There may be too many logs, so please
> enable it only for as short a time as it takes to reproduce a ping
> failure. If the last step "wait until the test can ping the port" is
> there, then it should be able to detect the first occurrence if the VM
> is not reachable within e.g. 30 sec.
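> >> > >> > > For example, something along these lines (the in_port, MAC and IP
> values are placeholders to fill in for a failing VM):
>
>   ovs-appctl ofproto/trace br-int \
>       'in_port=<ofport>,icmp,dl_src=<vm-mac>,dl_dst=<peer-mac>,nw_src=<vm-ip>,nw_dst=<peer-ip>'
>
>   # enable debug logging only while reproducing, then restore the default:
>   ovs-appctl -t ovn-controller vlog/set file:dbg
>   ovs-appctl -t ovn-controller vlog/set file:info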
> >> > >> >
> >> > >> > We'll need to hack a bit here but let's see :)
> >> > >> > >
> >> > >> > > In ovn-scale-test we didn't have data plane tests, but this
> problem was not seen in our live environment either, at a far larger
> scale. The major differences between your test and our environment are:
> >> > >> > > - We are running an older version, so there might be some
> rebase/refactor problem that caused this. To rule this out, I'd suggest
> trying a branch I created for 2.10 (
> https://github.com/hzhou8/ovs/tree/ip12_rebase_on_2.10), which matches
> the 2.10 base you used in your baseline test. It would also eliminate any
> compatibility problem, if there is one, between the OVN master branch and
> the OVS 2.10 you mentioned is used in the test.
> >> > >> > > - We don't use Security Groups (I guess the ~90k OVS flows you
> mentioned were mainly introduced by the Security Group, if all ports
> were put in the same group). Incremental processing is expected to be
> correct for security groups, and to handle them incrementally thanks to
> the address_set and port_group incremental processing. However, since the
> testing relied only on the regression tests, I am not 100% sure the test
> coverage was sufficient. So could you try disabling Security Groups to rule
> out the problem?
> >> > >> >
> >> > >> > Ok will try to repeat the tests without the SGs.
> >> > >> > >
> >> > >> > > Thanks,
> >> > >> > > Han
> >> > >> >
> >> > >> > Thanks once again!
> >> > >> > Daniel
> >> > >>
> >> > >> Hi Daniel,
> >> > >>
> >> > >> Any updates? Do you still see the ~10% of VMs unreachable?
> >> > >>
> >> > >>
> >> > >> Thanks,
> >> > >> Han
> >> > >
> >> > >
> >> > > Hi Han,
> >> > >
> >> > > As such there is no datapath impact. After increasing the ping wait
> timeout value from 120 seconds to 180 seconds, it's 100% now.
> >> > >
> >> > > But the time taken to program the flows is much higher compared
> to OVN master without the IP patches.
> >> > > Here is some data - http://paste.openstack.org/show/753224/ . I
> am still investigating and will update my findings in some time.
> >> > >
> >> > > Please see the times for the action - vm.wait_for_ping
> >> > >
> >> >
> >> > Thanks Numan for the investigation and update. Glad to hear there is
> no correctness issue, but sorry for the slowness in your test scenario. I
> expect that the operations in your test trigger recomputes, so the worst
> case should be performance similar to running without I-P. It is weird that
> it turned out so much slower in your test. There can be some extra overhead
> when it tries to do incremental processing and then falls back to a full
> recompute, but that shouldn't cause that big a difference. It might be that
> for some reason the main loop iteration is triggered more times than
> necessary. I'd suggest comparing the coverage counter "lflow_run" between
> the tests, and also checking a perf report to see if the hotspot is
> somewhere else. (Sorry that I can't provide full-time help now since I am
> still on vacation, but I will try to be useful if things are blocked.)
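> >> > For example, something like this on a compute node (a sketch; it assumes
> ovn-controller's control socket is reachable via ovs-appctl -t ovn-controller
> and that perf is available on the host):
>
>   # how many times the logical flow table was fully recomputed
>   ovs-appctl -t ovn-controller coverage/show | grep lflow_run
>
>   # sample ovn-controller for a minute and inspect the hotspots
>   perf record -g -p $(pidof ovn-controller) -- sleep 60
>   perf report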
> >>
> >> Hi Numan/Daniel, do you have any new findings on why I-P got worse
> results in your test? The extremely long latency (2 - 3 min) shown in your
> report reminds me of a similar problem I reported before:
> https://mail.openvswitch.org/pipermail/ovs-dev/2018-April/346321.html
> >>
> >> The root cause of that problem was still not clear. In that report, the
> extremely long latency (7 min) was observed without I-P and it didn't
> happen with I-P. If it is the same problem, then I suspect it is not
> related to I-P vs. non-I-P, but to some problem with ovsdb monitor
> condition changes. To confirm whether it is the same problem, could you:
> >> 1. pause the test when the scale is big enough (e.g. when the test is
> almost completed), then
> >> 2. enable ovn-controller debug logging (e.g. as sketched below), and then
> >> 3. run one more iteration of the test, and see if the time is spent
> waiting for the SB DB update notification.
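> >> A minimal sketch of step 2, assuming the standard OVS vlog module names
> (jsonrpc, ovsdb_idl, poll_loop), which should show when SB update
> notifications arrive and what the main loop wakes up for:
>
>   ovs-appctl -t ovn-controller vlog/set jsonrpc:file:dbg
>   ovs-appctl -t ovn-controller vlog/set ovsdb_idl:file:dbg
>   ovs-appctl -t ovn-controller vlog/set poll_loop:file:dbg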
> >>
> >> Please ignore my speculation above if you already found the root cause
> and it would be great if you could share it :)
> >
> >
> > Thanks for sharing this Han.
> >
> > I do not have any new findings. Yesterday I ran ovn-scale-test comparing
> OVN with IP vs without IP (using the master branch).
> > The test creates a new logical switch, adds it to a router, adds a few
> ACLs, creates 2 logical ports and pings between them.
> > I am using a physical deployment which creates actual namespaces instead
> of sandboxes.
> >
> > The results don't show any huge difference between the two.
> 2300 vs 2900 seconds total time, or 44 vs 56 seconds for the 95%ile?
> It is not negligible IMHO. It's a >25% penalty with IP. Maybe I
> missed something in the results?
>
>
Initially I ran with ovn-nbctl running commands as one batch (i.e. combining
commands with "--"). The results were very similar, like this one:

*******

With non IP - ovn-nbctl NO daemon mode

+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
|                                              Response Times (sec)                                            |
+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
| action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
| ovn_network.create_routers            | 0.288 | 0.429  | 5.454  | 5.538  | 20.531 | 1.523  | 100.0%  | 1000  |
| ovn.create_lswitch                    | 0.046 | 0.139  | 0.202  | 5.084  | 10.259 | 0.441  | 100.0%  | 1000  |
| ovn_network.connect_network_to_router | 0.164 | 0.411  | 5.307  | 5.491  | 15.636 | 1.128  | 100.0%  | 1000  |
| ovn.create_lport                      | 0.11  | 0.272  | 0.478  | 5.284  | 15.496 | 0.835  | 100.0%  | 1000  |
| ovn_network.bind_port                 | 1.302 | 2.367  | 2.834  | 3.24   | 12.409 | 2.527  | 100.0%  | 1000  |
| ovn_network.wait_port_up              | 0.0   | 0.001  | 0.001  | 0.001  | 0.002  | 0.001  | 100.0%  | 1000  |
| ovn_network.ping_ports                | 0.04  | 10.24  | 10.397 | 10.449 | 10.82  | 6.767  | 100.0%  | 1000  |
| total                                 | 2.219 | 13.903 | 23.068 | 24.538 | 49.437 | 13.222 | 100.0%  | 1000  |
+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+


With IP - ovn-nbctl NO daemon mode

concurrency - 10

+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
|                                              Response Times (sec)                                            |
+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
| action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
| ovn_network.create_routers            | 0.274 | 0.402  | 0.493  | 0.51   | 0.584  | 0.408  | 100.0%  | 1000  |
| ovn.create_lswitch                    | 0.064 | 0.137  | 0.213  | 0.244  | 0.33   | 0.146  | 100.0%  | 1000  |
| ovn_network.connect_network_to_router | 0.203 | 0.395  | 0.677  | 0.766  | 0.912  | 0.427  | 100.0%  | 1000  |
| ovn.create_lport                      | 0.13  | 0.261  | 0.437  | 0.497  | 0.604  | 0.283  | 100.0%  | 1000  |
| ovn_network.bind_port                 | 1.307 | 2.374  | 2.816  | 2.904  | 3.401  | 2.325  | 100.0%  | 1000  |
| ovn_network.wait_port_up              | 0.0   | 0.001  | 0.001  | 0.001  | 0.002  | 0.001  | 100.0%  | 1000  |
| ovn_network.ping_ports                | 0.028 | 10.237 | 10.422 | 10.474 | 11.281 | 6.453  | 100.0%  | 1000  |
| total                                 | 2.251 | 13.631 | 14.822 | 15.008 | 15.901 | 10.044 | 100.0%  | 1000  |
+---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+

*****************

The results I shared in the previous email were with ACLs added and
ovn-nbctl batch mode disabled.
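(By batch mode I mean combining several operations into a single ovn-nbctl
invocation with "--"; a minimal sketch with made-up names:

    ovn-nbctl ls-add sw0 -- \
        lsp-add sw0 sw0-port1 -- \
        lsp-set-addresses sw0-port1 "00:00:00:00:00:01 10.0.0.2"

as opposed to running each of the three commands as a separate ovn-nbctl call.)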

I agree with you. Let me do a few more runs to be sure that the results are
consistent.

Thanks
Numan


> I will test with OVN 2.9 vs 2.11 master along with what you have
> suggested above and see if there are any problems related to ovsdb monitor
> condition change.
> >
> > Thanks
> > Numan
> >
> > Below are the results
> >
> >
> > With IP master - nbctl daemon mode - No batch mode
> > concurrency - 10
> >
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > |                                              Response Times (sec)                                            |
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > | action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > | ovn_network.create_routers            | 0.269 | 0.661  | 10.426 | 15.422 | 37.259 | 3.721  | 100.0%  | 1000  |
> > | ovn.create_lswitch                    | 0.313 | 0.45   | 12.107 | 15.373 | 30.405 | 4.185  | 100.0%  | 1000  |
> > | ovn_network.connect_network_to_router | 0.163 | 0.255  | 10.121 | 10.64  | 20.475 | 2.655  | 100.0%  | 1000  |
> > | ovn.create_lport                      | 0.351 | 0.514  | 12.255 | 15.511 | 34.74  | 4.621  | 100.0%  | 1000  |
> > | ovn_network.bind_port                 | 1.362 | 2.447  | 7.34   | 7.651  | 17.651 | 3.146  | 100.0%  | 1000  |
> > | ovn_network.wait_port_up              | 0.086 | 2.734  | 5.272  | 7.827  | 22.717 | 2.957  | 100.0%  | 1000  |
> > | ovn_network.ping_ports                | 0.038 | 10.196 | 20.285 | 20.39  | 40.74  | 7.52   | 100.0%  | 1000  |
> > | total                                 | 2.862 | 27.267 | 49.956 | 56.39  | 90.884 | 28.808 | 100.0%  | 1000  |
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > Load duration: 2950.4133141
> > Full duration: 2951.58845997 seconds
> >
> > ***********
> > With non IP - nbctl daemon mode - ACLs - No batch mode
> > concurrency - 10
> >
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > |                                              Response Times (sec)                                            |
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > | action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > | ovn_network.create_routers            | 0.267 | 0.421  | 10.395 | 10.735 | 25.501 | 3.09   | 100.0%  | 1000  |
> > | ovn.create_lswitch                    | 0.314 | 0.408  | 10.331 | 10.483 | 25.357 | 3.049  | 100.0%  | 1000  |
> > | ovn_network.connect_network_to_router | 0.153 | 0.249  | 6.552  | 10.268 | 20.545 | 2.236  | 100.0%  | 1000  |
> > | ovn.create_lport                      | 0.344 | 0.49   | 10.566 | 15.428 | 25.542 | 3.906  | 100.0%  | 1000  |
> > | ovn_network.bind_port                 | 1.372 | 2.409  | 7.437  | 7.665  | 17.518 | 3.192  | 100.0%  | 1000  |
> > | ovn_network.wait_port_up              | 0.086 | 1.323  | 5.157  | 7.769  | 20.166 | 2.291  | 100.0%  | 1000  |
> > | ovn_network.ping_ports                | 0.034 | 2.077  | 10.347 | 10.427 | 20.307 | 5.123  | 100.0%  | 1000  |
> > | total                                 | 3.109 | 21.26  | 39.245 | 44.495 | 70.197 | 22.889 | 100.0%  | 1000  |
> > +---------------------------------------+-------+--------+--------+--------+--------+--------+---------+-------+
> > Load duration: 2328.11378407
> > Full duration: 2334.43504095 seconds
> >
> >
> >>
> >>
> >> Thanks,
> >> Han
>