[ovs-discuss] [OVN] flow explosion in lr_in_arp_resolve table

Girish Moodalbail gmoodalbail at gmail.com
Sun May 10 00:01:20 UTC 2020


Hello Han, Tim

Please see in-line:



> Hello Han,
>>>
>>> I did consider the distributed gateway port (DGP). However, there are two
>>> issues with it:
>>>
>>> 1. In order to support K8s NodePort services we need to create a
>>> North-South LB, and an L3 gateway is a perfect solution for that. AFAIK,
>>> DGP doesn't support it.
>>>
>>
> In fact DGP supports LB (at least from code
> https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L9318),
> but the ovn-nb manpage may need an update.
>

I see


>
>
>> 2. Datapath performance would be bad with DGP. We want packets meant
>>> for the host or the Internet to exit from the hypervisor on which the pod
>>> exists. The L3 gateway router provides us with this functionality. With DGP,
>>> and with OVN supporting only one instance of it, packets unnecessarily get
>>> forwarded over a tunnel to the DGP chassis for SNATing and then get
>>> forwarded back over a tunnel to the host just to exit locally.
>>>
>>
> This is related to the changes needed for DGP (the first point I mentioned
> in my previous email). In the diagram I drew, there would be 1000 DGPs, each
> residing on a chassis, just to make sure north-south traffic can be forwarded
> on the local chassis without going through a central node, just like how it
> works today in ovn-k8s. However, this may not be a small change, because
> today the NAT and LB processing on such LRs (LRs with DGP) is all based on
> the assumption that there is only one DGP. For example, the NB schema would
> also need to be changed so that the NAT/LB rules for a router can specify a
> DGP to determine the central processing location for those rules.
>

Correct


>
> So, to summarize, if we can make multi-DGP work, it would be the best
> solution for the ovn-k8s scenario. If we can't (either because of a design
> problem, or because it is too big an effort for the gains), maybe
> configurably avoiding the static neighbour flows is a good way to go. Both
> options require changes in OVN.
>

Han, optimizing the neighbor cache from the current O(n^2) to something
scalable would be ideal for the short term. I am hoping that those changes to
OVN will not be as complicated as the multi-DGP work and the other changes to
OVN proposed on this email thread.
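
To put rough numbers on the O(n^2) concern, here is a minimal sketch. It
assumes a simplified model that is not taken from the OVN source: one static
neighbour flow per (router port, peer router port) pair on a shared join
switch, with one gateway router per node plus one distributed router. The
function names are hypothetical, and the real count in lr_in_arp_resolve
will differ with address families and OVN version.

```python
def arp_resolve_flows_shared_join(num_nodes: int) -> int:
    """Flows when all routers hang off ONE shared join switch.

    Assumed model: each router port on the switch gets one static
    neighbour flow for every other router port on the same switch.
    """
    router_ports = num_nodes + 1  # one gateway router per node + 1 distributed router
    return router_ports * (router_ports - 1)


def arp_resolve_flows_split_join(num_nodes: int) -> int:
    """Flows when each node gets its own small join switch.

    Each small switch has exactly two router ports (distributed router
    side and gateway router side), so two flows per switch.
    """
    return num_nodes * 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        shared = arp_resolve_flows_shared_join(n)
        split = arp_resolve_flows_split_join(n)
        print(f"nodes={n} shared_join={shared} split_join={split}")
```

Under these assumptions the shared join switch grows quadratically with the
number of nodes, while the split topology grows linearly.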



> Without changes in OVN, a further optimization based on your current
> workaround is what Tim has suggested: replace the large number of small
> join LSes (and the LRPs and patch ports on both sides) with the same
> number of directly connected LRPs.
>

Han and Tim,

OVN only supports direct peering, without a logical switch, between two
distributed routers; it doesn't support connecting a distributed router and
an L3 gateway router directly as peers. I remember very clearly this being
mentioned in the ovn-architecture man page.

---------8<--------------8<---------------------

       The distributed router and the
       gateway router are connected by another logical switch, sometimes
       referred to as a "join" logical switch. (OVN logical routers may be
       connected to one another directly, without an intervening switch, but
       the OVN implementation only supports gateway logical routers that are
       connected to logical switches. Using a join logical switch also reduces
       the number of IP addresses needed on the distributed router.)

---------8<--------------8<---------------------

Before splitting the OVN join logical switch into several small logical
switches, I did try directly connecting the distributed LR to each of the
node-specific LRs using a point-to-point link, but it didn't work. Since this
was corroborated by the man page, I didn't debug the topology and moved on to
splitting the `join` logical switch.
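
For reference, a point-to-point link of this kind might be expressed with
`ovn-nbctl lrp-add` and its `peer=` argument. The sketch below only builds the
command lines; the router names, port names, MACs and subnets are hypothetical
placeholders, and these are not the exact commands that were run. As the man
page excerpt above says, this peering is not expected to work when one side
is a gateway router.

```python
from typing import List


def peer_lrp_commands(dist_router: str, gw_router: str,
                      dist_mac: str, dist_net: str,
                      gw_mac: str, gw_net: str) -> List[List[str]]:
    """Build ovn-nbctl commands that peer two logical router ports directly.

    Port names are derived from the router names (a hypothetical naming
    scheme); each port names the other one as its peer.
    """
    dist_port = f"{dist_router}-to-{gw_router}"
    gw_port = f"{gw_router}-to-{dist_router}"
    return [
        ["ovn-nbctl", "lrp-add", dist_router, dist_port,
         dist_mac, dist_net, f"peer={gw_port}"],
        ["ovn-nbctl", "lrp-add", gw_router, gw_port,
         gw_mac, gw_net, f"peer={dist_port}"],
    ]


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    for cmd in peer_lrp_commands("cluster-router", "gr-node1",
                                 "0a:00:00:00:00:01", "100.64.0.1/30",
                                 "0a:00:00:00:00:02", "100.64.0.2/30"):
        print(" ".join(cmd))
```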

Regards,
~Girish

