[ovs-dev] [PATCH ovn v2] OVN: Fix learning of neighbors from ARP/ND packets.

Dumitru Ceara dceara at redhat.com
Mon Aug 5 08:05:49 UTC 2019

On Mon, Aug 5, 2019 at 1:41 AM Han Zhou <zhouhan at gmail.com> wrote:
> On Mon, Jul 29, 2019 at 2:30 AM Dumitru Ceara <dceara at redhat.com> wrote:
> >
> > Add a restriction on the target protocol addresses to match the
> > configured subnets. All other ARP/ND packets are invalid in this
> > context.
> >
> > One exception is for ARP replies that are received for route next-hops
> > that are only reachable via a port but can't be directly resolved
> > through route lookups. Such support was introduced by commit:
> >
> > 6b785fd8fe29 ("ovn-util: Allow /32 IP addresses for router ports.")
> >
> > Reported-at: https://bugzilla.redhat.com/1729846
> > Reported-by: Haidong Li <haili at redhat.com>
> > CC: Han Zhou <zhouhan at gmail.com>
> > CC: Guru Shetty <guru at ovn.org>
> > Fixes: b068454082f5 ("ovn-northd: Support learning neighbor from ARP request.")
> > Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> Hi Dumitru,

Hi Han,

Thanks for reviewing this.

> Sorry for my slow response, and thanks a lot for revising the patch for a bigger scope of validations. However, the exception of /32 address makes me thinking more about this patch. If ARP replies is allowed from any nexthop for a LR port with /32, at least ARP request for GARP purpose should also be allowed. But before asking for a v3, I'd hold on to rethink the purpose of this patch.

Right, we should allow GARP requests too. If we decide to go ahead
with this patch I'll add a function in v3 to handle all types of ARPs
and call it both for unreachable static route next-hops and attached

> The nexthop specific flows are now from static routes. What if OVN support dynamical routing protocols in the future? Of course we can then take those dynamic nexthops into allowed peers. But then I am thinking what is the real benefit of all these restrictions? Why can't we just have simpler logic to handle all these situations without validation? I think the major benefit of the validation is to avoid handling the noise ARP/NDs when multiple subnets shares same L2, but most cases it is really not a big deal, right? For the CPU problem caused by ARP flooding as mentioned by the bug report, it is a real problem, but this patch seems not really helpful for that, because the attacker can just trigger the same CPU problem with *valid* packets. So I am not sure if the benefit of the change is worth the complexity it introduced. Please share your thought and correct me if I missed something.
> Thanks,
> Han

I assume the simpler logic to handle all these situations without
validation is to add rate limiting for ARP packets that get punted to
the controller. I agree that this should be implemented too.

But I think rate limiting all ARP packets regardless of IP addresses
is not enough. In the following scenario and if we would have a way to
rate limit ARP packets:
- Subnet configured on the OVN
- "Invalid" ARP packets are injected at high rate in the network for
- Host sends GARP.
- Rate limiting of ARP packets towards controller at 100pps

With the current code, ARP packets for will be punted to
controller at a rate of at most 100 per second. But the chances that
the valid GARP is dropped is really high.

If we instead use the following approach:
a. Punt only useful ARPs to controller.
b. Rate limit ARPs that are sent to controller.

Then ARP packets outside 42.42.42./24 are never punted to the
controller and don't consume any rate limiting tokens. For the second
case, when an attacker would flood with valid ARP packets we would
have the rate limit in place to protect the controller CPU.

My commit addresses point "a" above as I think point "b" should be
done in a generic way for all protocol packets that need to reach the

For dynamic routing protocols on the other hand I think we should not
install routes with next-hops that are unreachable through other
direct or indirect routes.


More information about the dev mailing list