[ovs-discuss] RFC: incremental computation for OVN with DDlog

Ben Pfaff blp at ovn.org
Wed Nov 7 16:10:16 UTC 2018

On Wed, Nov 07, 2018 at 08:57:00AM -0500, Mark Michelson wrote:
> Thanks for the e-mail, Ben. I'm 100% behind this effort. The performance
> benefits and the potential drop in CPU usage of OVN components is absolutely
> worth it. I have some questions inline below with regards to specific points
> you've brought up.
> On 11/02/2018 01:44 PM, Ben Pfaff wrote:
> >I was asked in an OVN meeting to send out an email talking about what
> >we're working on to make ovn-northd and ovn-controller faster.  Here's
> >my summary.
> >
> >OVN is essentially a stack of compilers.  At the top, the CMS dumps
> >some configuration into the northbound database (NBDB).  Then:
> >
> >     1. ovn-northd centrally translates the high-level NBDB description
> >        into logical flows in the southbound database (SBDB).
> >
> >     2. ovn-controller, on each HV, translates the SBDB logical flows
> >        into "physical" (OpenFlow) flows for the local hypervisor and
> >        passes them to ovs-vswitchd.
> >
> >     3. ovs-vswitchd translates OpenFlow flows into datapath flows on
> >        demand as traffic appears.
> >
> >Currently, OVN implements steps 1 and 2 with code that translates all
> >input to output in one go.  When any of the input changes, it
> >re-translates all of it.  This is fine for small deployments, but it
> >scales poorly beyond about 1000 hypervisors, at which point each
> >translation step begins to take multiple seconds.  Larger deployments
> >call for incremental computation, in which a small change in the input
> >requires only a small amount of computation to yield a small change in
> >the output.
> >
> >It is difficult to implement incremental computation in C.  For
> >ovn-controller, two attempts have been made already.  The first attempt,
> >in 2016, increased code complexity without similar benefit
> >(https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/078272.html).
> >A recent approach by Han Zhou shows a much bigger improvement, but it
> >also increases complexity greatly and definitely makes maintenance more
> >difficult.
> >
> >Justin and I are proposing a new approach, based on an incremental
> >computation engine called Differential Datalog, or DDlog for short
> >(https://github.com/ryzhyk/differential-datalog).  DDlog is open source
> >software developed at the VMware Research Group in Palo Alto by Leonid
> >Ryzhyk, Mihai Budiu, and others.  It uses an underlying engine developed
> >by Frank McSherry at Microsoft Research, called Differential Dataflow
> >(https://github.com/frankmcsherry/differential-dataflow).  Here's a talk
> >that Leonid gave on DDlog earlier this year:
> >https://ovsorbit.org/#e58
> >
> >DDlog appears suitable for steps 1 and 2, that is, for both ovn-northd
> >and ovn-controller.  Justin and I are starting with ovn-northd, because
> >it is a simpler case, and once we've arrived at some minimum amount of
> >success, Han is going to apply what we've learned to ovn-controller as
> >well.  Leonid and Mihai have been working very closely with us (we have
> >literally been writing DDlog code in conference rooms in 90-minute
> >sessions with everyone clustered around laptops) and none of it could
> >happen without them.
> >
> >Here's the process we'll need to follow to get DDlog to work with
> >ovn-northd:
> >
> >* DDlog needs to be able to talk to OVSDB for input (reading data from
> >   the northbound database) and output (writing data to the southbound
> >   database).  Therefore, we need to write OVSDB adapters for DDlog.
> >   Leonid has already done an important part of this work.  There is
> >   more work to do plumbing the adapter into ovn-northd's database
> >   connections.
> Is this work in one of the repos you previously linked? If not, is there
> somewhere we can find the WIP?

The part that is implemented is a DDlog API that accepts and produces
the JSON format that OVSDB understands.  The API for this is in
rust/template/ddlog.h in the northd branch at
https://github.com/ryzhyk/differential-datalog.  You can search for
"json" in that file to see what's there.

The missing piece is OVS client library code to pass the JSON to and
from the actual database server.
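
To make "the JSON format that OVSDB understands" concrete, here is a
minimal Python sketch of the kind of message the client library would
have to carry to the database server: an OVSDB "transact" request as
defined in RFC 7047.  The helper name make_transact_request and the
simplified, incomplete row are assumptions for illustration only; this
is not the DDlog API, and I have not checked that DDlog emits exactly
this shape.

```python
import json


def make_transact_request(db, operations, request_id):
    """Build an OVSDB "transact" JSON-RPC request (RFC 7047).

    params is the database name followed by the list of operations.
    """
    return {
        "method": "transact",
        "params": [db] + list(operations),
        "id": request_id,
    }


# Illustrative row only: a real logical flow needs more columns than this.
insert_op = {
    "op": "insert",
    "table": "Logical_Flow",
    "row": {
        "pipeline": "ingress",
        "table_id": 0,
        "priority": 100,
        "match": "1",
        "actions": "next;",
    },
}

request = make_transact_request("OVN_Southbound", [insert_op], 1)
wire = json.dumps(request)  # text that would be sent to the database server
```

The reverse direction (reading the northbound database) would use the
same JSON-RPC framing, with update notifications rather than a
transaction.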

> >* We need to translate the C flow generation code in ovn-northd into
> >   DDlog's domain-specific language.  There are some tricky parts to
> >   this but we expect the bulk of it to be straightforward and probably
> >   easier to read in DDlog than in C.  We've started with the tricky
> >   parts, which you can find at
> >   https://github.com/ryzhyk/differential-datalog/blob/northd/test/ovn/ovn_northd.dl
> >   Please don't take the code there as illustrative of what one would
> >   typically see for flow generation, because as I said, these are the
> >   hard parts.
> Thanks for the code examples. Seeing sample DDlog is very nice, even if it's
> not necessarily illustrative of what the final product will be.
> For those of us doing work right now to add new features to OVN, how should
> we approach the conversion to DDlog? As an example, I have some multicast
> work in progress that will add some new northbound data. It also introduces
> ovn-northd changes to generate logical flows and southbound data.
> My assumption is that I should focus 100% on the C implementation for now.
> When should I consider adding the analogous DDlog changes?

The DDlog implementation has a lot of catching up to do.  I think that
other OVN efforts should focus on the C implementation for now.  Porting
recent C changes to DDlog later should not add much to the DDlog work.

> Is there some sort of plan for how to keep DDlog up to date in the face of
> new C development? For instance, would we implement a policy that states
> that C changes will not be accepted without equivalent DDlog changes? For
> this initial conversion, would we declare a C feature freeze date that
> states that no new ovn-northd C changes may be added after that date in
> order to give a stable feature set for DDlog conversion?

We need to have some kind of policy once the DDlog implementation is
merged into mainline OVS.  I don't think it's necessary to freeze the C
implementation yet; maybe not at all, since it doesn't really move that
quickly either.

> >* The OVN build system will need some changes:
> >
> >   - The DDlog compiler, which translates .dl files into Rust, is
> >     written in Haskell, so Haskell becomes an OVN build requirement
> >     but not a runtime requirement.  (If that's a problem, then we can
> >     arrange to distribute the Rust output as well as the .dl input,
> >     for situations where Haskell is not available.)
> >
> >   - OVN will require a Rust compiler at build time.  Whatever
> >     libraries Rust needs become runtime requirements.
> >
> >   - ovn-northd (and eventually ovn-controller) will link against the
> >     Rust object files and call into DDlog through its external API.
> >
> >* Initially, we plan to make DDlog optional.  If Haskell and Rust are
> >   available at configure and build time, ovn-northd will build in
> >   support for DDlog.  At ovn-northd runtime, command-line options will
> >   control which implementation is used; we hope to make it possible to
> >   run both in parallel to check for differences in behavior.  After
> >   the DDlog implementation is proven in practice, we hope to delete
> >   the C implementation entirely.
> It's a bit of a loaded question, but by what measure do we consider the
> DDlog implementation to be proven in practice?
> Until the C implementation is erased, should we expect to develop features
> for both C and DDlog? Or should we expect to target new features only for
> DDlog?

I expect that "proven in practice" is subjective.  One baseline would be
that the DDlog implementation passes all the existing OVN unit tests.
Beyond that, I think that we would have to convince real users to try
the DDlog implementation and report problems.  (If we get the "run both
implementations in parallel" working well, then we could alternatively
ask them to try that and pass along any errors that it reports.)
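
As a rough idea of what the "run both implementations in parallel"
check could report, here is a small Python sketch that compares the
flows produced by two implementations as sets.  The function name
diff_flows, the tuple layout, and the sample flows are hypothetical;
nothing like this exists in ovn-northd today.

```python
def diff_flows(c_flows, ddlog_flows):
    """Return the flows that only one of the two implementations produced.

    Flows are compared as sets, so ordering and duplicates are ignored.
    """
    c_set, d_set = set(c_flows), set(ddlog_flows)
    return {
        "only_c": sorted(c_set - d_set),
        "only_ddlog": sorted(d_set - c_set),
    }


# Hypothetical flows: (pipeline, table_id, priority, match, actions).
c_out = [
    ("ingress", 0, 100, "ip4", "next;"),
    ("egress", 1, 50, "1", "output;"),
]
d_out = [
    ("egress", 1, 50, "1", "output;"),
    ("ingress", 0, 100, "ip6", "next;"),
]

report = diff_flows(c_out, d_out)
for side, flows in report.items():
    for flow in flows:
        print(side, flow)
```

An empty report on both sides would mean the two implementations agree
for that input, which is the kind of evidence users could pass along.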

My guess is that developing features for both implementations won't be
much of a burden, beyond the burden of remembering to do it, because it
should be easy to write DDlog for common features, not really harder
(maybe easier) than writing the C.
