[ovs-discuss] [OVN] Too many resubmits for packets coming from "external" network

Daniel Alvarez Sanchez dalvarez at redhat.com
Tue Sep 29 11:23:49 UTC 2020


On Tue, Sep 29, 2020 at 1:14 PM Dumitru Ceara <dceara at redhat.com> wrote:

> On 9/29/20 1:07 PM, Krzysztof Klimonda wrote:
> > On Tue, Sep 29, 2020, at 12:40, Dumitru Ceara wrote:
> >> On 9/29/20 12:14 PM, Daniel Alvarez Sanchez wrote:
> >>>
> >>>
> >>> On Tue, Sep 29, 2020 at 11:14 AM Krzysztof Klimonda
> >>> <kklimonda at syntaxhighlighted.com> wrote:
> >>>
> >>>     On Tue, Sep 29, 2020, at 10:40, Dumitru Ceara wrote:
> >>>     > On 9/29/20 12:42 AM, Krzysztof Klimonda wrote:
> >>>     > > Hi Dumitru,
> >>>     > >
> >>>     > > This cluster is IPv4-only for now - there are no IPv6 networks
> >>>     > > defined at all - overlay or underlay.
> >>>     > >
> >>>     > > However, once I increase the number of routers to ~250, similar
> >>>     > > behavior can be observed when I send ARP packets for non-existing
> >>>     > > IPv4 addresses. The following warnings flood ovs-vswitchd.log
> >>>     > > for every address not known to OVN when I run
> >>>     > > `fping -g 192.168.0.0/16`:
> >>>     > >
> >>>     > > ---8<---8<---8<---
> >>>     > >
> >>>     > > 2020-09-28T22:26:40.967Z|21996|ofproto_dpif_xlate(handler6)|WARN|over 4096 resubmit actions on bridge br-int while processing arp,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:75:38:be,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.0.1,arp_tpa=192.168.0.35,arp_op=1,arp_sha=fa:16:3e:75:38:be,arp_tha=00:00:00:00:00:00
> >>>     > > ---8<---8<---8<---
> >>>     > >
> >>>     > > This is an even larger concern for me, as some of our clusters
> >>>     > > would be exposed to the internet, where we can't easily prevent
> >>>     > > scanning of an entire IP range.
> >>>     > >
> >>>     > > Perhaps this is something that should be handled differently for
> >>>     > > traffic coming from the external network? Is there any reason why
> >>>     > > OVN is not dropping ARP requests and IPv6 ND for IP addresses it
> >>>     > > knows nothing about? Or maybe OVN should drop most BUM traffic on
> >>>     > > the external network in general? I think all this network is used
> >>>     > > for is SNAT and/or SNAT+DNAT for overlay networks.
> >>>     > >
> >>>     >
> >>>     > Ok, so I guess we need a combination of the existing broadcast
> >>>     > domain limiting options:
> >>>     >
> >>>     > 1. send ARP/NS packets only to router ports that own the target
> >>>     >    IP address.
> >>>     > 2. flood IPv6 ND RS packets only to router ports with IPv6
> >>>     >    addresses configured and ipv6_ra_configs.address_mode set.
> >>>     > 3. according to the logical switch multicast configuration, either
> >>>     >    flood unknown IP multicast or forward it only to hosts that
> >>>     >    registered for the IP multicast group.
> >>>     > 4. drop all other BUM traffic.
> >>>     >
> >>>     > From the above, 1 and 3 are already implemented. 2 is what I
> >>>     > suggested earlier. 4 would probably turn out to be a configuration
> >>>     > option that needs to be explicitly enabled on the logical switch
> >>>     > connected to the external network.
> >>>     >
> >>>     > Would this work for you?
> >>>
> >>>     I believe it would work for me, although it may be a good idea to
> >>>     consult with neutron developers and see if they have any input on
> >>>     that.
> >>>
> >>>
> >>> I think that's a good plan. Implementing 4) via a configuration option
> >>> sounds smart. From an OpenStack point of view, I think that as all the
> >>> ports are known, we can just have it on by default.
> >>> We need to make sure it works for 'edge' cases like virtual ports, load
> >>> balancers and subports (ports with a parent port and a tag) but the
> >>> idea sounds great to me.
> >>>
> >>> Thanks folks for the discussion!
> >>
> >> Thinking more about it, it's probably not OK to drop all other BUM
> >> traffic. Instead we should just flood it on all logical ports of a
> >> logical switch _except_ router ports.
> >>
> >> Otherwise we'll be breaking E-W traffic between VIFs connected to the
> >> same logical switch. E.g., VM1 and VM2 connected to the same LS and VM1
> >> sending ARP request for VM2's IP.
> >
>

This probably won't affect OpenStack, although I'm not saying it's not a
valid concern.
OpenStack always configures addresses on each logical switch port, so
they will be known to OVN.
Please correct me if I'm wrong.
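
To make the proposal concrete, here is a rough sketch of the forwarding
decision being discussed: points 1 and 2 from Dumitru's list, plus his
revised fallback of flooding other BUM traffic to all ports except router
ports. It is only a toy model and not how ovn-northd implements it. The
Port class, the bum_targets() function and the port names are made up for
illustration, IP multicast (point 3) is left out, and sending RS only to
router ports is an assumption taken from point 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Port:
    """Hypothetical stand-in for a logical switch port."""
    name: str
    is_router_port: bool = False
    ips: List[str] = field(default_factory=list)
    # Stands in for Logical_Router_Port.ipv6_ra_configs.address_mode.
    ra_address_mode: Optional[str] = None


def bum_targets(ports, kind, target_ip=None):
    """Return the names of the ports that should receive a BUM packet."""
    router_ports = [p for p in ports if p.is_router_port]

    # Point 1: ARP/NS goes only to the router port(s) owning the target IP.
    if kind in ("arp", "nd_ns"):
        owners = [p.name for p in router_ports if target_ip in p.ips]
        if owners:
            return owners

    # Point 2: RS goes only to router ports that have an IPv6 address
    # and address_mode set (assumption: VIFs are not included here).
    if kind == "nd_rs":
        return [p.name for p in router_ports
                if p.ra_address_mode and any(":" in ip for ip in p.ips)]

    # Revised point 4: flood everything else to all non-router ports,
    # so VM-to-VM ARP on the same logical switch keeps working.
    return [p.name for p in ports if not p.is_router_port]


ports = [
    Port("lrp-r1", True, ["192.168.0.10"]),
    Port("lrp-r2", True, ["192.168.0.11", "2001:db8::1"], "slaac"),
    Port("vm1"),
    Port("vm2"),
]
print(bum_targets(ports, "arp", "192.168.0.10"))
print(bum_targets(ports, "arp", "192.168.0.35"))
print(bum_targets(ports, "nd_rs"))
```

With something like this, an ARP request for an address nobody owns never
reaches the ~250 router ports, which is where the resubmits pile up.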


> > Does it also matter for the LS that is used by openstack for external
> > networks? We don't usually connect VMs directly to that network, instead
> > using FIPs for some VMs and SNATing traffic from other VMs on the router.
> > Or is it unrelated to how the VM is connected to the network, and it would
> > break, for example, FIP<->FIP traffic?
>
> FIP<->FIP traffic wouldn't be affected because those IPs are owned by
> OVN logical routers so they would be taken care of by point 1 above.
>
> >
> >>
> >>>
> >>>
> >>>     >
> >>>     > Thanks,
> >>>     > Dumitru
> >>>     >
> >>>     > > --
> >>>     > > Krzysztof Klimonda
> >>>     > > kklimonda at syntaxhighlighted.com
> >>>     > >
> >>>     > > On Mon, Sep 28, 2020, at 21:14, Dumitru Ceara wrote:
> >>>     > >> On 9/28/20 5:33 PM, Krzysztof Klimonda wrote:
> >>>     > >>> Hi,
> >>>     > >>>
> >>>     > >> Hi Krzysztof,
> >>>     > >>
> >>>     > >>> We're still doing some scale tests of OpenStack ussuri with the
> >>>     > >>> ml2/ovn driver. We've deployed 140 virtualized compute nodes, and
> >>>     > >>> started creating routers that share a single external network
> >>>     > >>> between them. Additionally, each router is connected to a private
> >>>     > >>> network.
> >>>     > >>> Previously[1] we hit a problem of too many logical flows being
> >>>     > >>> generated per router connected to the same "external" network -
> >>>     > >>> this put too much stress on ovn-controller and ovs-vswitchd on
> >>>     > >>> compute nodes, and we've applied a patch[2] to limit the number of
> >>>     > >>> logical flows created per router.
> >>>     > >>> After we dealt with that we've done more testing and created
> >>>     > >>> 200 routers connected to a single external network. After that
> >>>     > >>> we've noticed the following logs in ovs-vswitchd.log:
> >>>     > >>>
> >>>     > >>> ---8<---8<---8<---
> >>>     > >>>
> >>>     > >>> 2020-09-28T11:10:18.938Z|18401|ofproto_dpif_xlate(handler9)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=1,vlan_tci=0x0000,dl_src=fa:16:3e:9b:77:c3,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::f816:3eff:fe9b:77c3,ipv6_dst=ff02::2,ipv6_label=0x2564e,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0
> >>>     > >>> ---8<---8<---8<---
> >>>     > >>>
> >>>     > >>> That starts happening after I create ~178 routers connected to
> >>>     > >>> the same external network.
> >>>     > >>>
> >>>     > >>> IPv6 RS ICMP packets are coming from the external network -
> >>>     > >>> that's due to the fact that all virtual compute nodes have an IPv6
> >>>     > >>> address on their interface used for the external network and are
> >>>     > >>> trying to discover a gateway. That's by accident, and we can remove
> >>>     > >>> the IPv6 address from that interface, however I'm worried that it
> >>>     > >>> would just hide some bigger issue with flows generated by OVN.
> >>>     > >>>
> >>>     > >> Is this an IPv4 cluster; are there IPv6 addresses configured on the
> >>>     > >> logical router ports connected to the external network?
> >>>     > >>
> >>>     > >> If there are IPv6 addresses, do the logical router ports connected
> >>>     > >> to the external network have
> >>>     > >> Logical_Router_Port.ipv6_ra_configs.address_mode set?
> >>>     > >>
> >>>     > >> If not, we could try to enhance the broadcast domain limiting code
> >>>     > >> in OVN [3] to also limit sending router solicitations only to
> >>>     > >> router ports with address_mode configured.
> >>>     > >>
> >>>     > >> Regards,
> >>>     > >> Dumitru
> >>>     > >>
> >>>     > >> [3] https://github.com/ovn-org/ovn/blob/20a20439219493f27eb222617f045ba54c95ebfc/northd/ovn-northd.c#L6424
> >>>     > >>
> >>>     > >>> software stack:
> >>>     > >>>
> >>>     > >>> ovn: 20.06.2
> >>>     > >>> ovs: 2.13.1
> >>>     > >>> neutron: 16.1.0
> >>>     > >>>
> >>>     > >>> [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017370.html
> >>>     > >>> [2] https://review.opendev.org/#/c/752678/
> >>>     > >>>
> >>>     > >>
> >>>     >
> >>>     >
> >>>
> >>
> >>
> >
>
>