[ovs-dev] upstreaming datapath

Ben Pfaff blp at nicira.com
Mon Oct 19 18:15:33 UTC 2009


Stephen Hemminger <shemminger at vyatta.com> writes:

> I would like to cleanup and submit the current datapath to the kernel
> drivers staging tree. This would get wider review from kernel developers.
>
> My long term goal is to have a unified interface (netlink) for bridging that
> works for both existing bridge and openvswitch.
>
> BUT it would also mean removing all the compatibility code for older kernels.

Hi Stephen.  I know you as a kernel developer from reading
linux-kernel for 10 years or more, but I didn't realize before
that you worked at Vyatta.

Removing compatibility code should be easy, because we've been
conscientiously following a model where we always update our
kernel code to the latest kernel version and insert compatibility
code into compatibility headers that make old kernels look like
newer ones.  There are a few places where this didn't quite work,
but by and large just "rm -r compat-2.6" should be sufficient to
delete compatibility code.

Oh, and we'd delete the brcompat module entirely, too, of course.
Its initial purpose is obsolete, so we don't need it any longer
anyhow.

The main place where the openvswitch module is actively
incompatible with anything in the upstream kernel is the bridge
hook (br_handle_frame_hook).  My thought there is that this hook
should become per-net_device, so that the existing Linux bridge
and openvswitch can coexist in a single system (which is useful,
and yet OVS can't support it right now).  Does that make sense to
you?
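
To illustrate what I mean by a per-device hook, here is a toy
userspace model in Python. It is only a sketch, not kernel code: the
NetDevice class, the rx_hook field, and the handler functions are all
made-up names, and a real design would have to settle registration,
locking, and so on.

```python
class NetDevice:
    """Toy stand-in for a network device (hypothetical, not struct net_device)."""

    def __init__(self, name):
        self.name = name
        self.rx_hook = None  # per-device receive hook; None means "no bridge attached"


def bridge_handle_frame(dev, frame):
    # Stand-in for the existing Linux bridge claiming the frame.
    return ("bridge", dev.name, frame)


def ovs_handle_frame(dev, frame):
    # Stand-in for openvswitch claiming the frame.
    return ("ovs", dev.name, frame)


def receive(dev, frame):
    # With one global hook, only one handler could be installed system-wide.
    # With a hook stored on each device, each device picks its own handler.
    if dev.rx_hook is not None:
        return dev.rx_hook(dev, frame)
    return ("stack", dev.name, frame)  # no hook: normal protocol processing


eth0, eth1, eth2 = NetDevice("eth0"), NetDevice("eth1"), NetDevice("eth2")
eth0.rx_hook = bridge_handle_frame  # eth0 is a port on a Linux bridge
eth1.rx_hook = ovs_handle_frame     # eth1 is a port on an openvswitch datapath

results = [receive(dev, b"frame") for dev in (eth0, eth1, eth2)]
```

In this model eth0 and eth1 are handled by different bridges on the
same system, and eth2 is untouched by either.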

We'd also need to decide on a sysfs interface for openvswitch.
Currently the code emulates the existing bridge's sysfs
interface, because we needed compatibility, but clearly it's not
completely suitable and we should design something better.

What kind of unified interface do you have in mind?  I can
imagine using the same netlink calls for, say, adding and
removing bridges and ports.  But both the existing bridge and the
openvswitch also have functionality that the other does not.  It
would not make sense to try to shoehorn both into exactly the
same interface.

Initially (I think that this was so long ago that it is not in
our current Git tree), Open vSwitch used Netlink entirely for
communication with userspace (whereas now it uses character
devices).  But this proved not to work well for transactional
operations that are not idempotent, because responses to Netlink
messages can get lost.  For example, Open vSwitch has a datapath
operation to delete a flow and return its statistics.  When this
was implemented as a Netlink request and response, it was
possible for the response to get lost (because a kernel memory
allocation failed).  But re-sending the request would not work,
because the first command had deleted the flow.  And breaking it
into two separate commands (get flow stats, delete flow)
introduces a race where statistics on packets that arrive between
the commands are lost.  This is the main reason that we are not
using Netlink now.  I think there were other reasons, too, but
that is the one that comes to mind first.
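
To make the failure mode concrete, here is a toy model in Python. It
is a sketch only: ToyDatapath, lossy(), and the flow names are
invented for illustration and have nothing to do with the real
datapath code or with the Netlink API.

```python
class ToyDatapath:
    """Toy stand-in for a datapath flow table (hypothetical, not OVS code)."""

    def __init__(self):
        self.flows = {}  # flow key -> packet count

    def packet(self, key):
        # Count a packet against a flow, if that flow is installed.
        if key in self.flows:
            self.flows[key] += 1

    def get_stats(self, key):
        return self.flows.get(key)

    def delete(self, key):
        self.flows.pop(key, None)

    def delete_and_get_stats(self, key):
        # Combined operation: remove the flow and return its final count.
        return self.flows.pop(key, None)


def lossy(reply, lost):
    """Model a transport that can drop the reply after the request has run."""
    return None if lost else reply


# Case 1: combined operation, reply lost, request re-sent.
dp = ToyDatapath()
dp.flows["flow-a"] = 0
for _ in range(5):
    dp.packet("flow-a")
first = lossy(dp.delete_and_get_stats("flow-a"), lost=True)   # reply dropped
retry = lossy(dp.delete_and_get_stats("flow-a"), lost=False)  # flow already gone
# Neither attempt delivers the count of 5 packets to userspace.

# Case 2: two separate commands, with a packet arriving in between.
dp2 = ToyDatapath()
dp2.flows["flow-b"] = 0
for _ in range(5):
    dp2.packet("flow-b")
seen = dp2.get_stats("flow-b")  # userspace reads 5
dp2.packet("flow-b")            # a sixth packet hits the flow
dp2.delete("flow-b")            # the sixth packet is never reported
```

Case 1 is the lost-response problem and case 2 is the race from
splitting the operation. The combined operation avoids the race but
only works over a transport that cannot drop the reply.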

But the biggest reason that we have not already submitted OVS for
inclusion is this one: currently the interface is not flexible
and not extensible.  In particular, beyond the L2 Ethernet
header, it can only match IPv4 packets.  I have some thoughts on
how to make it more flexible and extensible, but I have not had
time to work any of it out in detail or to start writing code for
it.

Your advice is appreciated.

Thanks,

Ben.



