[ovs-dev] [PATCH v1] ovn: Cache mac bindings from broadcasted dynamic ARP responses

Ben Pfaff blp at ovn.org
Thu Jan 5 05:32:05 UTC 2017


On Wed, Dec 28, 2016 at 01:35:28PM +0530, Babu Shanmugam wrote:
> This patch attempts to avoid the usage of MAC_Binding table.
> 
> Dynamic ARP response originates from the logical ports with "unknown"
> address. When the ARP resolution is requested via a logical
> router datapath, ARP response will be delivered to the logical router
> port of the switch. Since the logical router is distributed, the
> response will never reach the other chassis from where the ARP resolution
> is requested.
> 
> This is done using a OVN logical action bcast2lr() which adds an action
> to send the packet to a specific multicast group for the logical router
> datapath which will ultimately be broadcasted to all the other chassis.

Thanks for the revised patch, and especially for the expanded
explanation.

I think that I understand now how this is supposed to work.  I think the
idea is that ARP responses from an LSP with "unknown" MAC will be sent
from the LSP's LS to the LR on all the other chassis, as well as being
processed locally.  I think that this basic idea is sound.

I still do not understand understand the detailed design.

First, the code goes to some trouble to ensure that this special
processing for ARP responses from an LSP with "unknown" MAC only happens
on the chassis where the LSP resides.  It passes the name of this
chassis as an argument to the bcast2lr action, which violates the OVN
principle that the logical flow table's contents are independent of the
physical layout of the system.  It also requires the logical flow table
to be updated whenever the physical layout of the system changes, which
is unprecedented as far as I can tell.  Such a change would, therefore,
need to be for an important reason, but I don't know what it's good for,
because I don't know why hypervisor hv0 would process a packet through
an LS ingress pipeline for LSP lsp0, if lsp0 didn't reside on hv0.

Second, I'm suspicious of the idea of a special-purpose action dedicated
to this purpose.  When we can, it's better to make use of general
primitives than to invent new special-purpose ones.  In this case, the
new action combines a few goals that can be broken down into primitives
that might individually be useful in other contexts:

    1. Do nothing, if the packet originated on any chassis but one
       designated one.  As I said, I don't yet see why this is needed,
       but if it is then it seems like this could be made generic.  For
       example, the originating chassis could be part of the match
       condition, or we could introduce a new action that executes
       actions provided as arguments only on chassis specified in a
       parameter.

    2. Send a packet to a different logical datapath.  Logical patch
       ports can do this already.

    3. Replicate a packet to multiple hypervisors, with the same logical
       output port on all of them.  The patch does this with a logical
       multicast group, but it has to special case it.  I'm not sure
       that's a good way to do it.  It's nice to have uniform
       abstractions, which multicast groups have been until now.  An
       alternative that comes to mind is to introduce a new kind of
       logical port that replicates packets to every chassis that hosts
       the logical datapath.  One reason that this seems better is
       because logical ports are already nonuniform and adding another
       type simply doesn't make it much worse.  Another is that,
       eventually, I guess we'll want to have more configurability here
       and logical ports already have options for configuration.

Details
=======

Here are some detailed comments.  Some of these only make sense if we
stick with the overall design, which, as I said above, I'd prefer to
change.

This patch adds a new logical action, so it should add new tests under
"ovn -- action parsing" in ovn.at.

parse_BCAST2LR() silently does nothing if a parameter is missing.  Why
doesn't it report an error?

parse_BCAST2LR() leaks memory if it finds a syntax error or if a
parameter is missing.

parse_BCAST2LR() doesn't really need to build a structure and then copy
it, it can just initialize it in-place.

I don't understand why ovnact_bcast2lr's "char *"s are const.  They
don't seem any more "const" than other ovnacts' members.

encode_BCAST2LR() should use init_stack(), like encode_setup_args() and
encode_restore_args(), instead of open-coding it.

ovn-northd.8.xml needs an update to explain the new flows.

Please do not introduce pinctrl_put_mac_binding_for_each().  Callback
interfaces are awkward and should only be used when really necessary.
Instead, I'd move all the related code from lflow.c into a new
pinctrl_*() function that can be called directly from ovn-controller.c
somewhere around the call to lflow_run().

The code in pinctrl makes only limited sense now.  Until now, it was
organized as a kind of a buffer or queue that was flushed to the
database when it could be.  Now, it's just a cache, but it's still
organized the same way.  This has some nasty side effects.  I don't see
anything that ever flushes an entry, for example.  Combined with a fixed
limit of 1000 entries, this means that after 1000 MACs have been
learned, no more can ever be learned until ovn-controller restarts.

Thanks,

Ben.


More information about the dev mailing list