[ovs-discuss] RFC: incremental computation for OVN with DDlog

Russell Bryant russell at ovn.org
Wed Nov 14 01:25:23 UTC 2018


On Wed, Nov 7, 2018 at 11:10 AM Ben Pfaff <blp at ovn.org> wrote:
>
> On Wed, Nov 07, 2018 at 08:57:00AM -0500, Mark Michelson wrote:
> > Thanks for the e-mail, Ben. I'm 100% behind this effort. The performance
> > benefits and the potential drop in CPU usage of OVN components is absolutely
> > worth it. I have some questions inline below with regards to specific points
> > you've brought up.
> >
> > On 11/02/2018 01:44 PM, Ben Pfaff wrote:
> > >I was asked in an OVN meeting to send out an email talking about what
> > >we're working on to make ovn-northd and ovn-controller faster.  Here's
> > >my summary.
> > >
> > >OVN is essentially a stack of compilers.  At the top, the CMS dumps
> > >some configuration into the northbound database (NBDB).  Then:
> > >
> > >     1. ovn-northd centrally translates the high-level NBDB description
> > >        into logical flows in the southbound database (SBDB).
> > >
> > >     2. ovn-controller, on each HV, translates the SBDB logical flows
> > >        into "physical" (OpenFlow) flows for the local hypervisor and
> > >        passes them to ovs-vswitchd.
> > >
> > >     3. ovs-vswitchd translates OpenFlow flows into datapath flows on
> > >        demand as traffic appears.
> > >
> > >Currently, OVN implements steps 1 and 2 with code that translates all
> > >input to output in one go.  When any of the input changes, it
> > >re-translates all of it.  This is fine for small deployments, but it
> > >scales poorly beyond about 1000 hypervisors, at which point each
> > >translation step begins to take multiple seconds.  Larger deployments
> > >call for incremental computation, in which a small change in the input
> > >requires only a small amount of computation to yield a small change in
> > >the output.
> > >
> > >It is difficult to implement incremental computation in C.  For
> > >ovn-controller, two attempts have been made already.  The first attempt,
> > >in 2016, increased code complexity without similar benefit
> > >(https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/078272.html).
> > >A recent approach by Han Zhou shows a much bigger improvement, but it
> > >also increases complexity greatly and definitely makes maintenance more
> > >difficult.
> > >
> > >Justin and I are proposing a new approach, based on an incremental
> > >computation engine called Differential Datalog, or DDlog for short
> > >(https://github.com/ryzhyk/differential-datalog).  DDlog is open source
> > >software developed at the VMware Research Group in Palo Alto by Leonid
> > >Ryzhyk, Mihai Budiu, and others.  It uses an underlying engine developed
> > >by Frank McSherry at Microsoft Research, called Differential Dataflow
> > >(https://github.com/frankmcsherry/differential-dataflow).  Here's a talk
> > >that Leonid gave on DDlog earlier this year:
> > >https://ovsorbit.org/#e58
> > >
> > >DDlog appears suitable for steps 1 and 2, that is, for both ovn-northd
> > >and ovn-controller.  Justin and I are starting with ovn-northd, because
> > >it is a simpler case, and once we've arrived at some minimum amount of
> > >success, Han is going to apply what we've learned to ovn-controller as
> > >well.  Leonid and Mihai have been working very closely with us (we have
> > >literally been writing DDlog code in conference rooms in 90 minute
> > >sessions with everyone clustered around laptops) and none of it could
> > >happen without them.
> > >
> > >Here's the process we'll need to follow to get DDlog to work with
> > >ovn-northd:
> > >
> > >* DDlog needs to be able to talk to OVSDB for input (reading data from
> > >   the northbound database) and output (writing data to the southbound
> > >   database).  Therefore, we need to write OVSDB adapters for DDlog.
> > >   Leonid has already done an important part of this work.  There is
> > >   more work to do plumbing the adapter into ovn-northd's database
> > >   connections.
> >
> > Is this work in one of the repos you previously linked? If not, is there
> > somewhere we can find the WIP?
>
> The part that is implemented is a DDlog API that accepts and produces
> the JSON format that OVSDB understands.  The API for this is in
> rust/template/ddlog.h in the northd branch at
> https://github.com/ryzhyk/differential-datalog.  You can search for
> "json" in that file to see what's there.
>
> The missing piece is OVS client library code to pass the JSON to and
> from the actual database server.
>
> > >* We need to translate the C flow generation code in ovn-northd into
> > >   DDlog's domain specific language.  There are some tricky parts to
> > >   this but we expect the bulk of it to be straightforward and probably
> > >   easier to read in DDlog than in C.  We've started with the tricky
> > >   parts, which you can find at
> > >   https://github.com/ryzhyk/differential-datalog/blob/northd/test/ovn/ovn_northd.dl
> > >   Please don't take the code there as illustrative of what one would
> > >   typically see for flow generation, because as I said, these are the
> > >   hard parts.
> >
> > Thanks for the code examples. Seeing sample DDlog is very nice, even if it's
> > not necessarily illustrative of what the final product will be.
> >
> > For those of us doing work right now to add new features to OVN, how should
> > we approach the conversion to DDlog? As an example, I have some multicast
> > work in progress that will add some new northbound data. It also introduces
> > ovn-northd changes to generate logical flows and southbound data.
> >
> > My assumption is that I should focus 100% on the C implementation for now.
> > When should I consider adding the analogous DDlog changes?
>
> The DDlog implementation has a lot of catching up to do.  I think that
> other OVN efforts should focus on the C implementation.  Applying recent
> changes to DDlog should not add much to the DDlog work.
>
> > Is there some sort of plan for how to keep DDlog up to date in the face of
> > new C development? For instance, would we implement a policy that states
> > that C changes will not be accepted without equivalent DDlog changes? For
> > this initial conversion, would we declare a C feature freeze date that
> > states that no new ovn-northd C changes may be added after that date in
> > order to give a stable feature set for DDlog conversion?
>
> We need to have some kind of policy once the DDlog implementation is
> merged into mainline OVS.  I don't think it's necessary to freeze the C
> implementation yet; maybe not at all, since it doesn't really move that
> quickly either.
>
> > >* The OVN build system will need some changes:
> > >
> > >   - The DDlog compiler, which translates .dl files into Rust, is
> > >     written in Haskell, so Haskell becomes an OVN build requirement
> > >     but not a runtime requirement.  (If that's a problem, then we can
> > >     arrange to distribute the Rust output as well as the .dl input,
> > >     for situations where Haskell is not available.)
> > >
> > >   - OVN will require a Rust compiler at build time.  Whatever
> > >     libraries Rust needs become runtime requirements.
> > >
> > >   - ovn-northd (and eventually ovn-controller) will link against the
> > >     Rust object files and call into DDlog through its external API.
> > >
> > >* Initially, we plan to make DDlog optional.  If Haskell and Rust are
> > >   available at configure and build time, ovn-northd will build in
> > >   support for DDlog.  At ovn-northd runtime, command-line options will
> > >   control which implementation is used; we hope to make it possible to
> > >   run both in parallel to check for differences in behavior.  After
> > >   the DDlog implementation is proven in practice, we hope to delete
> > >   the C implementation entirely.
> >
> > It's a bit of a loaded question, but by what measure do we consider the
> > DDlog implementation to be proven in practice?
> >
> > Until the C implementation is erased, should we expect to develop features
> > for both C and DDlog? Or should we expect to target new features only for
> > DDlog?
>
> I expect that "proven in practice" is subjective.  One baseline would be
> that the DDlog implementation passes all the existing OVN unit tests.
> Beyond that, I think that we would have to convince real users to try
> the DDlog implementation and report problems.  (If we get the "run both
> implementations in parallel" working well, then we could alternatively
> ask them to try that and pass along any errors that it reports.)
>
> My guess is that developing features for both implementations won't be
> much of a burden, beyond the burden of remembering to do it, because it
> should be easy to write DDlog for common features, not really harder
> (maybe easier) than writing the C.

I think this is implied based on the description of how ovn-northd
would work, but do you expect to make a completely seamless drop-in
replacement (aside from build-time and run-time dependencies)?  All
parameters would be identical, with no new configuration, and zero
changes required to integration projects like ovn-kubernetes or the
OpenStack OVN integration?

In terms of "proven in practice", OVN is at a stage where it's being
used in production, so ideally we set a very high bar for a switchover
like this.  It sounds like you're planning for that by enabling
implementations to work in parallel instead of forcing a hard cutover
early.  I would hope for something like multiple releases of a new
implementation in experimental state, allowing plenty of time for
testing in realistic, larger scale environments, and relying on
reports of significant successes before a cutover.
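
To make the "run both in parallel and check for differences" idea
concrete, here is a minimal sketch of the comparison step.  It is an
illustration only, not code from ovn-northd or DDlog: the LogicalFlow
tuple is a simplified stand-in for a southbound Logical_Flow row, and
diff_flows() is a hypothetical helper name.  It assumes each
implementation can hand back the full set of flows it would write.

```python
from typing import Iterable, List, NamedTuple, Tuple


class LogicalFlow(NamedTuple):
    """Simplified stand-in for one southbound Logical_Flow row."""
    logical_datapath: str
    pipeline: str
    table_id: int
    priority: int
    match: str
    actions: str


def diff_flows(
    c_flows: Iterable[LogicalFlow],
    ddlog_flows: Iterable[LogicalFlow],
) -> Tuple[List[LogicalFlow], List[LogicalFlow]]:
    """Return (flows only the C side produced, flows only DDlog produced).

    Order of generation is ignored; only the set of flows is compared.
    """
    c_set, d_set = set(c_flows), set(ddlog_flows)
    return sorted(c_set - d_set), sorted(d_set - c_set)


if __name__ == "__main__":
    # Hypothetical sample data, invented for this example.
    shared = LogicalFlow("sw0", "ingress", 0, 50, "1", "next;")
    c_only = LogicalFlow("sw0", "ingress", 0, 100, "vlan.present", "drop;")
    d_only = LogicalFlow("sw0", "egress", 0, 50, "1", "output;")

    only_c, only_ddlog = diff_flows([shared, c_only], [d_only, shared])
    for flow in only_c:
        print("only in C output:    ", flow)
    for flow in only_ddlog:
        print("only in DDlog output:", flow)
```

An empty pair of lists means the two implementations agree for that
input.  Anything else is the kind of report users could pass along
while the new implementation is still experimental.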

-- 
Russell Bryant

