[ovs-dev] [patch_v4] ovn: Add additional comments regarding arp responders.

Tue Oct 11 02:21:34 UTC 2016

On Mon, Oct 10, 2016 at 12:27 AM, Mickey Spiegel <mickeys.dev at gmail.com>
wrote:

> This is getting close. Some rewording suggestions below.
>
> On Thu, Oct 6, 2016 at 10:34 AM, Darrell Ball <dlu998 at gmail.com> wrote:
>
>> There has been enough confusion regarding logical switch datapath
>> arp responders in ovn to warrant some additional comments;
>> hence add a general description regarding why they exist and
>> document the special cases.
>>
>> Signed-off-by: Darrell Ball <dlu998 at gmail.com>
>> ---
>>
>> Note this patch is meant to be merged with the code change for vtep
>> inport handling here.
>> https://patchwork.ozlabs.org/patch/675796/
>>
>> v3->v4: Capitalization fixes.
>>         Reinstate comment regarding L2 learning confusion.
>>
>> v2->v3: Reword and further elaborate.
>>
>> v1->v2: Dropped RFC code change for logical switch router
>>         type ports
>>
>>  ovn/northd/ovn-northd.8.xml | 52 +++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 46 insertions(+), 6 deletions(-)
>>
>> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>> index 77eb3d1..5ac351d 100644
>> --- a/ovn/northd/ovn-northd.8.xml
>> +++ b/ovn/northd/ovn-northd.8.xml
>> @@ -415,20 +415,60 @@
>>      <h3>Ingress Table 9: ARP/ND responder</h3>
>>
>>      <p>
>> -      This table implements ARP/ND responder for known IPs.  It contains these
>> -      logical flows:
>> +      This table implements ARP/ND responder for known IPs.
>
>
> I agree with Han Zhou that mentioning logical switch explicitly helps
> clarify things.
>

All three of us are fine with this.

>
> +      The advantage
>> +      of the ARP responder flow is to limit ARP broadcasts by locally
>> +      responding to ARP requests without the need to send to other
>> +      hypervisors.  One common case is when the inport is a logical
>> +      port associated with a VIF and the broadcast is responded to on the
>> +      local hypervisor rather than broadcast across the whole network and
>> +      responded to by the destination VM.  This behavior is proxy ARP.
>>
>
> Agree up to this point. Wondering if there should be multiple paragraphs,
> with the text above being the first paragraph.
>

I was thinking the same; done.

>
>
>> +      ARP requests received by multiple hypervisors, as in the case of
>> +      <code>localnet</code> and <code>vtep</code> logical inports need
>> +      to skip these logical switch ARP responders;  the reason being
>> +      that northd downloads the same mac binding rules to all hypervisors
>> +      and all hypervisors will receive the ARP request from the external
>> +      network and respond.  This will confuse L2 learning on the source
>> +      of the ARP requests.  These skip rules are mentioned under
>> +      priority-100 flows.
>
>
> I am OK with Han Zhou's suggestion to move this description to the
> priority-100 flows themselves. If you do that, then the l2 gateway text
> below should also move to the priority-100 flows.
>

priority-100 flows are skip flows; we are not skipping l2gateway
inport types, so I think we can leave the l2gateway references where they
are.
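
To make the skip/no-skip distinction concrete, here is a rough Python
sketch of how I think of the table. It is not logical flow syntax and not
the northd implementation; the names (ArpRequest, arp_responder_table,
SKIP_INPORT_TYPES) are made up for illustration, and the final fallthrough
to the next table is my assumption about the default behavior.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Inport types covered by the priority-100 skip flows in this patch.
SKIP_INPORT_TYPES = {"localnet", "vtep"}


@dataclass(frozen=True)
class ArpRequest:
    inport_type: str  # "" is the default (VIF) logical switch port type
    target_ip: str


def arp_responder_table(
    req: ArpRequest, bindings: Dict[str, str]
) -> Tuple[str, Optional[str]]:
    """Return ("reply", mac) or ("next", None) for one ARP request.

    bindings maps a known IP address to its Ethernet address.
    """
    # Priority 100: skip the responder, advance to the next table.
    if req.inport_type in SKIP_INPORT_TYPES:
        return ("next", None)
    # Priority 50: answer locally for a known IP address.
    mac = bindings.get(req.target_ip)
    if mac is not None:
        return ("reply", mac)
    # Assumed fallthrough: nothing matched, advance to the next table.
    return ("next", None)
```

Note that an l2gateway inport is not in the skip set, so it falls through
to the priority-50 lookup like any VIF inport would.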

>
>
>> +      ARP requests arrive from VMs with a logical
>> +      switch inport type of type empty, which is the default.  For this
>> +      case, the logical switch proxy ARP rules can be for other VMs or
>> +      a logical router port.
>
>
> Suggest to replace the above two lines with something generic like:
>
>       Logical switch proxy ARP rules may be programmed both for IP
>       addresses on logical switch VIF ports (of type empty, which is the
>       default, representing connectivity to VMs or containers), and for IP
>       addresses on logical switch router ports.
>

I see you added "IP addresses"; I would have hoped that was obvious from
context, but I can add it. We have slightly different wording, and Han
does not even like the paragraph.
Let us compromise on:

Logical switch proxy ARP rules may be programmed both for binding IP
addresses on other logical switch VIF ports (which are of the default
logical switch port type, representing connectivity to VMs or containers),
and for binding IP addresses on logical switch router type ports,
representing their logical router port peers.
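
As an illustration of the wording above, a small Python sketch of where
the bindings come from. SwitchPort and build_arp_bindings are hypothetical
names, and the model (each port carries a list of MAC/IP pairs) is a
simplification I am assuming here, not the northbound schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SwitchPort:
    name: str
    port_type: str = ""  # "" is the default (VIF) type; "router" peers an LRP
    # Configured (mac, ip) pairs; empty if no IP address is configured.
    addresses: List[Tuple[str, str]] = field(default_factory=list)


def build_arp_bindings(ports: List[SwitchPort]) -> Dict[str, str]:
    """Collect IP -> MAC bindings from every port with configured addresses."""
    bindings: Dict[str, str] = {}
    for port in ports:
        for mac, ip in port.addresses:
            bindings[ip] = mac
    return bindings
```

A router type port with no IP address configured simply contributes no
binding, so there is no proxy ARP rule for its logical router port peer.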

> Note that it is common
>       for ARP requests to be received on one type of port (e.g. of type
>       empty, from a VM) for an IP address that resides on a different
>       type of port (e.g. of type router).
>

I am going to skip this part; I think it is hard to understand.
By the way, logical switch router type ARP responders are on my
hit list. I have checked with some NSX folks and they agree with me that
there is no real optimization here, for the reasons I mentioned before.
I intend to follow up separately regarding this.

>
> +      In order to support proxy ARP for logical
>> +      router ports, an IP address must be configured on the logical
>> +      switch router type port, with the same value as the peer of the
>> +      logical router port.  The configured MAC addresses must match as
>> +      well.
>
>
> Agree with the text above.
>
> +      If this logical switch router type port does not have an
>> +      IP address configured, ARP requests will hit another ARP responder
>> +      on the logical router datapath itself, which is most commonly a
>> +      distributed logical router.  The advantage of using the logical
>> +      switch proxy ARP rule for logical router ports is that this rule
>> +      is hit before the logical switch L2 broadcast rule.  This means
>> +      the ARP request is not broadcast on this logical switch.
>
>
> Han Zhou suggested:
>
> If this logical switch router type port does not have an IP address
> configured, although the request will still be responded by the ARP
> responder on the logical router datapath, the ARP request will be broadcast
> on the logical switch.
>
> My proposal, building on top of the original text and Han's proposal:
>
> If this logical switch router type port does not have an IP address
> configured, the ARP request will be broadcast on the logical switch.
> One of the copies of the ARP request will go through the logical
> switch router type port to the logical router datapath, where the
> logical router ARP responder will generate a reply.
>

I am fine with this. It is wordier, but even first-year art students
will understand it.
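
For what it is worth, the two outcomes can be sketched in Python as
follows. This is a toy model under my own assumptions (a single logical
router, a flat dict of ports); handle_arp_request and its parameters are
hypothetical names, not anything in the OVN code.

```python
from typing import Dict, List, Optional, Tuple


def handle_arp_request(
    target_ip: str,
    inport: str,
    switch_ports: Dict[str, str],      # port name -> port type
    switch_bindings: Dict[str, str],   # logical switch proxy ARP: IP -> MAC
    router_bindings: Dict[str, str],   # logical router responder: IP -> MAC
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Return (responder, mac, ports the request was flooded to)."""
    # Logical switch proxy ARP rule is hit first: no broadcast at all.
    mac = switch_bindings.get(target_ip)
    if mac is not None:
        return ("logical_switch", mac, [])

    # Otherwise the request is broadcast to every other port on the switch.
    flooded = [p for p in switch_ports if p != inport]
    responder: Optional[str] = None
    reply_mac: Optional[str] = None
    for port in flooded:
        # The copy sent through a router type port reaches the logical
        # router datapath, whose ARP responder generates the reply.
        if switch_ports[port] == "router" and target_ip in router_bindings:
            responder = "logical_router"
            reply_mac = router_bindings[target_ip]
    return (responder, reply_mac, flooded)
```

Either way the requester gets a reply; the difference is only whether
the request was flooded on the logical switch first.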

>
> +      Logical
>> +      switch ARP responder proxy ARP rules can also be hit when
>> +      receiving ARP requests externally on a L2 gateway port.  In this
>> +      case, the hypervisor acting as an L2 gateway, responds to the ARP
>> +      request on behalf of a VM.  Note that ARP requests received from
>> +      <code>localnet</code> or <code>vtep</code> logical inports can
>> +      either go directly to VMs, in which case the VM responds or can
>> +      hit an ARP responder for a logical router port if the packet is
>> +      used to resolve a logical router port next hop address.
>>
>
> If the priority-100 text near the top moves to the priority-100 bullet
> below, then the above text should also move to the priority-100 bullet.
>

I can move the skip flow part (below) back to priority-100; I will leave
the l2gateway blurb where it is. I think Han will need to live with the
sentence below, since we are both leaning in favor of it.

+     Note that ARP requests received from
+      <code>localnet</code> or <code>vtep</code> logical inports can
+      either go directly to VMs, in which case the VM responds or can
+      hit an ARP responder for a logical router port if the packet is
+      used to resolve a logical router port next hop address.

>
>> +      It contains these logical flows:
>>      </p>
>>
>>      <ul>
>>        <li>
>> -        Priority-100 flows to skip ARP responder if inport is of type
>> -        <code>localnet</code>, and advances directly to the next table.
>> +        Priority-100 flows to skip the ARP responder if inport is
>> +        of type <code>localnet</code> or <code>vtep</code> and
>> +        advances directly to the next table.
>
>
> If text is moved from the top section above, it would go here.
>

Good idea.

>
> +        The inport being of type
>> +        <code>router</code> has no known use case for these ARP
>> +        responders.  However, no skip flows are installed for these
>> +        packets, as there would be some additional flow cost for this
>> +        and the value appears limited.
>>
>
> This would be easier to follow with more explanation.  How about:
>
>         ARP requests received on an inport of type <code>router</code>
>         are not expected to hit any logical switch ARP responder flows,
>         since the logical router ARP resolve stage already includes all
>         corresponding IP address to MAC address flows.  However, no
>         skip flows are installed for these packets, as there would be
>         some additional flow cost for this and the value appears limited.
>

I will omit this sentence fragment:

 "since the logical router ARP resolve stage already includes all
        corresponding IP address to MAC address flows".

because the reason is related to the OVN pipeline (as it exists today);
it is not an intentional flow optimization.

Let me go with the resulting merged version:

ARP requests received on an inport of type <code>router</code>
are not expected to hit any logical switch ARP responder flows.
However, no skip flows are installed for these packets, as there would be
some additional flow cost for this and the value appears limited.

>
> Mickey
>
>        </li>
>>
>>        <li>
>>          <p>
>>            Priority-50 flows that match ARP requests to each known IP address
>> -          <var>A</var> of every logical router port, and respond with ARP
>> +          <var>A</var> of every logical switch port, and respond with ARP
>>            replies directly with corresponding Ethernet address <var>E</var>:
>>          </p>
>>
>> @@ -455,7 +495,7 @@ output;
>>          <p>
>>            Priority-50 flows that match IPv6 ND neighbor solicitations to
>>            each known IP address <var>A</var> (and <var>A</var>'s
>> -          solicited node address) of every logical router port, and
>> +          solicited node address) of every logical switch port, and
>>            respond with neighbor advertisements directly with
>>            corresponding Ethernet address <var>E</var>:
>>          </p>
>> --
>> 1.9.1