[ovs-dev] Update on DDlog port of ovn-northd

Leonid Ryzhyk ryzhyk at gmail.com
Fri Jul 19 16:47:25 UTC 2019


Hi Mark,

Many thanks for reporting this experiment!  The latency plot is very
insightful.

The DDlog implementation is far from being optimized yet, so having
multiple scale tests to use as performance targets is great. As one example
of low-hanging fruit, here are the top two lines from DDlog's CPU usage
profile for your workload:

```
     436s339688us (   154514calls)     Reduce:
OVN_Southbound.Out_Multicast_Group(.datapath=datapath, .name=name,
.tunnel_key=tunnel_key, .ports=ports),
OVN_Southbound.UUIDMap_Datapath_Binding(.uuid_name=datapath,
.id=__id_datapath), var __one = FlatMap(if std.set_is_empty(ports)
{std.set_singleton("")} else
{ports}),OVN_Southbound.UUIDMap_Port_Binding(.uuid_name=__one,
.id=__one_swizzled), var __id_ports =
Aggregate((__id_datapath,name,tunnel_key),
ovsdb.group2set_remove_sentinel(__one_swizzled)) 3187
     391s335837us (   242500calls)     Join:
OVN_Southbound.Out_Multicast_Group(.datapath=datapath, .name=name,
.tunnel_key=tunnel_key, .ports=ports),
OVN_Southbound.UUIDMap_Datapath_Binding(.uuid_name=datapath,
.id=__id_datapath), var __one = FlatMap(if std.set_is_empty(ports)
{std.set_singleton("")} else {ports}),
OVN_Southbound.UUIDMap_Port_Binding(.uuid_name=__one, .id=__one_swizzled)
3180
```

It looks like DDlog spent a quarter of its runtime executing one rule that
was generated automatically by the DDlog/OVSDB binding generator.  The rule
matches port IDs generated by DDlog rules against the UUIDs assigned to them
by OVSDB. It is only needed because OVSDB does not currently let the client
assign UUIDs directly. To make things worse, aggregation operations are
still not fully incremental in DDlog, which means that multicast groups are
re-computed from scratch on every transaction, triggering this expensive
rule.
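
To make the shape of that rule concrete, here is a rough Python sketch of
what it computes. This is not the generated DDlog code: the function name
and the dictionaries standing in for the UUIDMap_* relations are
hypothetical, and I am assuming that the empty-string sentinel maps to
itself so that groups with no ports survive the join.

```python
from collections import defaultdict

SENTINEL = ""  # placeholder port name used when a group has no ports


def swizzle_multicast_groups(groups, datapath_uuids, port_uuids):
    """Replace DDlog-side names with OVSDB-assigned UUIDs and regroup ports.

    groups: iterable of (datapath_name, name, tunnel_key, set_of_port_names)
    datapath_uuids, port_uuids: dict mapping name -> UUID (hypothetical
    stand-ins for the UUIDMap_Datapath_Binding / UUIDMap_Port_Binding
    relations).
    """
    # Assumption: the sentinel is mapped to itself so empty groups are kept.
    port_uuids = {**port_uuids, SENTINEL: SENTINEL}
    result = defaultdict(set)
    for datapath, name, tunnel_key, ports in groups:
        if datapath not in datapath_uuids:
            continue  # join with the datapath UUID map fails
        key = (datapath_uuids[datapath], name, tunnel_key)
        for port in (ports or {SENTINEL}):  # FlatMap step
            if port in port_uuids:  # join with the port UUID map
                result[key].add(port_uuids[port])
    # Aggregate step: collect ports per group and drop the sentinel.
    return {key: uuids - {SENTINEL} for key, uuids in result.items()}
```

Because the aggregation is not incremental, the equivalent of this whole
loop runs again on every transaction, even when only one port changed.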

We observed similar performance issues with Han's scale test.  Again,
having such scale tests is very valuable, as they help to identify and fix
bottlenecks.

Once we've fixed issues like this one, we will hopefully see the DDlog plot
grow much more slowly, at which point its runtime will be dominated by the
constant latency that we see in the first iterations on your graph.  DDlog
is optimized for processing batches of changes at the cost of introducing
higher latency for tiny updates. So, even though DDlog improves incremental
performance, batching multiple transactions should still pay off big time.
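
As a back-of-the-envelope illustration of why batching helps, here is a
small Python sketch. It assumes a simple cost model (a fixed overhead per
batch plus a cost per change); the model and the numbers in the usage note
are made up for illustration and are not measurements of DDlog.

```python
def total_latency_ms(num_changes, batch_size, fixed_cost_ms, per_change_ms):
    """Total processing time when changes are grouped into batches.

    Assumed model: every batch pays fixed_cost_ms once, and every change
    pays per_change_ms regardless of how it is batched.
    """
    num_batches = -(-num_changes // batch_size)  # ceiling division
    return num_batches * fixed_cost_ms + num_changes * per_change_ms
```

With a hypothetical 100 ms fixed cost and 10 ms per change, 1000 changes
cost 110000 ms when sent one at a time and 11000 ms in batches of 100: the
per-change work is the same, but the fixed cost is paid 10 times instead
of 1000.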

Leonid

On Thu, Jul 18, 2019 at 1:51 PM Mark Michelson <mmichels at redhat.com> wrote:

> I've now performed tests where I measured the wall clock time of each
> iteration through the test. I've created a plot of the C vs. the DDlog
> ovn-northd here: https://i.imgur.com/bIsbeMN.png
>
> This confirms for me that the DDlog implementation does a better job of
> staying consistent in its execution time even as the network grows
> larger. You can see that the C implementation gets noticeably slower as
> the network grows. The DDlog implementation also gets slower but the
> slope of the trend is much less severe compared to C.
>
> Through my initial tests, I'm happy with what DDlog is doing. It appears
> to be handling large-scale networks much better than the C
> implementation. The incremental processing allows for the network size
> not to affect the performance of ovn-northd as much.
>
> On 7/18/19 1:18 PM, Mark Michelson wrote:
> > Hi, just wanted to send a follow-up and a test result.
> >
> > I got everything up and running (yay!). So I set up a server and ran a
> > sandbox test using ovn with traditional C-based ovn-northd and then ran
> > the same test using ovn with DDlog-based ovn-northd.
> >
> > The test script can be found here:
> > https://gist.github.com/putnopvut/3df7156b6d44b81d75a598262c5e959b
> >
> > The test sets up a network with 159 logical switches all connected to a
> > single logical router. Each logical switch has 92 logical switch ports.
> > For every 2 logical switch ports we add, we create a new port group, and
> > ACLs are created for each port group. The important part of this test is
> > that the final call to ovn-nbctl in each iteration of the loop uses
> > --wait=sb. In other words, each new logical switch port we add has to
> > wait for ovn-northd to process it completely before we can start adding
> > the next logical switch port.
> >
> > With traditional C-based ovn-northd:
> > [root at wsfd-netdev67 tutorial]# time ./scale-daemon-new-pg.sh
> >
> > real    128m59.280s
> > user    1m27.601s
> > sys    2m0.986s
> >
> > With DDlog-based ovn-northd:
> > [root at wsfd-netdev67 tutorial]# time ./scale-daemon-new-pg.sh
> >
> > real    33m40.586s
> > user    1m26.976s
> > sys    1m59.784s
> >
> > As you can see, the test takes about a quarter of the time to complete
> > when running with DDlog. So far, it seems like a good improvement.
> >
> > As a point of reference, if I remove the --wait=sb from the test, then
> > the result is:
> >
> > [root at wsfd-netdev67 tutorial]# time ./scale-daemon-new-pg.sh
> >
> > real    3m59.696s
> > user    1m18.937s
> > sys    1m44.669s
> >
> > This test was run with DDlog-based ovn-northd, but the result is nearly
> > identical with C-based ovn-northd since the test no longer is bound by
> > ovn-northd's performance.
> >
> > I'm a bit surprised that with DDlog, adding --wait=sb to each iteration
> > causes such a large difference in the overall test time (~4 minutes up
> > to ~34 minutes).
> >
> > My next step is going to be to plot the time of each individual
> > iteration during the test. I'll share those results when I have them
> > ready. I don't want to speculate on anything until I have that data to
> > share.
> >
> > Thanks,
> > Mark Michelson
> >
> > On 7/15/19 12:34 PM, Leonid Ryzhyk wrote:
> >> Hi Mark,
> >>
> >> Thanks for the feedback. We will add the `--version` flag.
> >>
> >> Yes, `stack install` merely copies the two DDlog executables to
> >> `~/.local/bin/` (or a custom path specified using `--local-bin-path
> >> <custom_path>`).
> >>
> >> Also, just want to mention that we maintain up-to-date binary releases
> >> of DDlog at https://github.com/vmware/differential-datalog/releases,
> >> so most users should not need to install the Haskell tool stack and
> >> compile DDlog. When using binary releases, all you have to do is add
> >> the `ddlog/bin` directory to `$PATH`.
> >>
> >> Leonid
> >>
> >> On Mon, Jul 15, 2019 at 6:30 AM Mark Michelson
> >> <mmichels at redhat.com> wrote:
> >>
> >>     I did install DDlog originally back in early December last year. So
> >>     there are probably remnants of that still present. However, I did
> >>     perform a `stack install` using an updated pull of master on
> >>     Friday. So I guess the `stack install` didn't get rid of the old
> >>     installation?
> >>
> >>     Also, there's no `ddlog --version` or anything similar to see what
> >>     version of DDlog is installed. That could be a nice feature to
> >>     have in the near future.
> >>
> >>     On 7/12/19 12:40 PM, Leonid Ryzhyk wrote:
> >>      > Thanks for trying it out!  Sounds like you have a very old
> >>      > version of ddlog. If you install from source, please make sure
> >>      > that you run `stack install` and that there is no other version
> >>      > of ddlog in your path other than the one created by
> >>      > `stack install`.
> >>      >
> >>      > Leonid
> >>      >
> >>      > On Fri, Jul 12, 2019, 9:05 AM Mark Michelson
> >>      > <mmichels at redhat.com> wrote:
> >>      >
> >>      >     On 7/12/19 3:29 AM, Leonid Ryzhyk wrote:
> >>      >      > Dear OVN developers,
> >>      >      >
> >>      >      > This is a brief update on the state of the DDlog port of
> >>      >      > ovn-northd.
> >>      >      >
> >>      >      > We completed the initial implementation of ovn-northd in
> >>      >      > DDlog a few months ago.  Justin kindly helped to integrate
> >>      >      > it with OVN, so that it can be used as a drop-in
> >>      >      > replacement for the C version (and passes all the tests in
> >>      >      > the OVN test suite).  The DDlog implementation does not
> >>      >      > have any of the new features/improvements added in April
> >>      >      > 2019 or later.
> >>      >      >
> >>      >      > ## Repository
> >>      >      >
> >>      >      > The code is in the `ddlog-dev` branch of the `ovn-org/ovn`
> >>      >      > repository:
> >>      >      > https://github.com/ovn-org/ovn/tree/ddlog-dev
> >>      >
> >>      >     Hi Leonid,
> >>      >
> >>      >     I ran into an issue when attempting to build ovn-northd. I
> >>      >     successfully installed DDlog, but then I encountered this
> >>      >     issue when building OVN:
> >>      >
> >>      >     ddlog -i ovn/northd/ovn_northd.dl -L /home/putnopvut/differential-datalog/lib
> >>      >     ddlog: Failed to parse input file: "ovn/northd/ovn_northd.dl" (line 94, column 5):
> >>      >     unexpected "&"
> >>      >     expecting "not", variable name, relation name, "var", expression or "."
> >>      >
> >>      >     The line in question looks like this:
> >>      >
> >>      >           &SwitchPort(.lsp = lsp, .sw = &sw),
> >>      >
> >>      >     Any idea what's gone wrong here?
> >>      >
> >>      >     Thanks,
> >>      >
> >>      >     Mark Michelson
> >>      >
> >>      >
> >>      >
> >>      >      >
> >>      >      > ## Documentation
> >>      >      >
> >>      >      > Building and using ovn-northd-ddlog:
> >>      >      > https://github.com/ovn-org/ovn/blob/ddlog-dev/ovn/northd/docs/design.md
> >>      >      >
> >>      >      > Debugging ovn-northd-ddlog:
> >>      >      > https://github.com/ovn-org/ovn/blob/ddlog-dev/ovn/northd/docs/debugging.md
> >>      >      >
> >>      >      > ## Preliminary performance results
> >>      >      >
> >>      >      > Han Zhou kindly tested ovn-northd-ddlog with his OVN scale
> >>      >      > test and even found a nasty performance bug in the process
> >>      >      > (thanks, Han!).  He reports that DDlog speeds up the test
> >>      >      > by almost a factor of 10:
> >>      >      >
> >>      >      > - ddlog version: 7:39min
> >>      >      > - C version: 67:47min
> >>      >      >
> >>      >      > This is great, and in fact profiling shows that there is
> >>      >      > still plenty of room for improvement.  He also reports a
> >>      >      > 10+ times increase in memory footprint:
> >>      >      >
> >>      >      > - ddlog version: 1944696KB
> >>      >      > - C version: 147984KB
> >>      >      >
> >>      >      > Again, we are working on a number of optimizations, which
> >>      >      > should reduce this overhead, although it will never be as
> >>      >      > low as C, since DDlog fundamentally needs to cache more
> >>      >      > state to enable fast incremental computation.
> >>      >      >
> >>      >      > Han also used DDlog's record&replay feature to capture all
> >>      >      > northd transactions performed by the scale test in a format
> >>      >      > that can be replayed against the standalone DDlog
> >>      >      > executable without having to reproduce Han's OpenStack
> >>      >      > setup.  The replay file is here:
> >>      >      > http://ryzhyk.net/replay.tgz
> >>      >      >
> >>      >      > Instructions for replaying this script:
> >>      >      > https://github.com/ovn-org/ovn/blob/ddlog-dev/ovn/northd/docs/debugging.md#record-and-replay-ddlog-execution
> >>      >      >
> >>      >      > The script will run for a few minutes and finally print
> >>      >      > some profiling information, including the breakdown of
> >>      >      > DDlog's CPU and memory usage.
> >>      >      >
> >>      >      > ## Next steps
> >>      >      >
> >>      >      > We seek help from the OVN community in maintaining
> >>      >      > ovn-northd-ddlog.  The first step is to start porting new
> >>      >      > OVN features introduced in the last few months to DDlog.
> >>      >      >
> >>      >      > Leonid
> >>      >
> >>      >      > _______________________________________________
> >>      >      > dev mailing list
> >>      >      > dev at openvswitch.org <mailto:dev at openvswitch.org>
> >>     <mailto:dev at openvswitch.org <mailto:dev at openvswitch.org>>
> >>      >      > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>      >      >
> >>      >
> >>
> >
>
>

