[ovs-discuss] OVS will not send ARP packets as packet-in to OpenFlow controller

Ryan Izard rizard at g.clemson.edu
Wed Apr 6 22:09:25 UTC 2016


So, we've found a document
<https://github.com/openvswitch/ovs/blob/master/DESIGN.md#user-content-in-band-control>
on in-band control for OVS. The hidden flows we see installed are exactly
what the in-band control document states will be installed, including ARP
flows to/from the LOCAL port's MAC address with output=NORMAL action.
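
As a rough aid for spotting these flows, the sketch below filters a flow dump for ARP entries whose action is NORMAL. The helper name `hidden_arp_normal` is made up for illustration, and it assumes only that each dumped flow sits on one line containing the words "arp" and "NORMAL"; the exact output format of `ovs-appctl bridge/dump-flows` may differ between OVS versions.

```shell
# Sketch: keep only flow-dump lines that mention ARP and a NORMAL action.
# Reads the dump on stdin; hidden_arp_normal is a hypothetical helper name.
hidden_arp_normal() {
    grep -i 'arp' | grep 'NORMAL'
}
```

Possible usage on the host running the bridge: `sudo ovs-appctl bridge/dump-flows br0 | hidden_arp_normal`.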

One of these ARP flows is matching our ARP requests directed into br0
(LOCAL) and forwarding them as a learning switch (NORMAL). This looks like
it's the issue. Now to figure out how this happened everywhere and how to
disable it.
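
One possible way to disable it, going by the Open vSwitch documentation rather than anything tested in this thread: in-band control can be turned off per bridge through the `other-config:disable-in-band` key. A minimal sketch follows. The bridge name `br0` is the one used in this thread, while the `disable_in_band` helper and the `OVS_VSCTL` override variable are assumptions added only so the command can be dry-run.

```shell
# Sketch: turn off in-band control on one bridge.
# OVS_VSCTL is a hypothetical override (e.g. OVS_VSCTL=echo for a dry run).
disable_in_band() {
    # $1 is the bridge name, e.g. br0
    ${OVS_VSCTL:-ovs-vsctl} set bridge "$1" other-config:disable-in-band=true
}
```

Calling `disable_in_band br0` as root should stop OVS from installing the hidden in-band flows, which can be checked by re-running `ovs-appctl bridge/dump-flows br0`. This is only appropriate when the controller is reachable without going through the bridge itself.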

Ryan Izard
PhD Candidate, Research/Teaching Assistant
ECE Department, Clemson University
rizard at g.clemson.edu
---------------------------------------------------
Big Switch Networks
ryan.izard at bigswitch.com

On Wed, Apr 6, 2016 at 5:59 PM, Ryan Izard <rizard at g.clemson.edu> wrote:

> On Wed, Apr 6, 2016 at 5:33 PM, Nicholas Bastin <nick.bastin at gmail.com>
> wrote:
>
>> On Wed, Apr 6, 2016 at 5:16 PM, Ryan Izard <rizard at g.clemson.edu> wrote:
>>
>>> I have a very simple topology as follows:
>>>
>>> network----[Dell S4810]-24---link---1-[host w/OVS br0]-LOCAL
>>>
>>> The host with OVS has IP 192.168.1.3/24 with a route into the br0 (i.e.
>>> LOCAL) interface.
>>>
>>
>> I don't really understand what this means.  What ports are on br0 and
>> what interfaces have IP addresses?
>>
>
> br0 has port 1 (eth1) and LOCAL (br0)
>
>>
>>
>>> We try to ping another host on the network from host 192.168.1.3, but
>>> the ping confuses our controller's MAC learning algorithm due to OVS
>>> mishandling ARP packets. Here are some observations:
>>>
>>
>> Where are you issuing the ping from, the command line of the host with
>> OVS?  What do your local routing and arp tables look like?
>>
>
> On the host itself running the OVS bridge, we have a route for
> 192.168.1.0/24 into br0. We are running ping 192.168.1.4 from the host.
>
>>
>>
>>> -- using OVS 2.3.1, which has been running stably since release until
>>> recently (no known changes)
>>>
>>
>> Do ovs-vsctl commands hang?  I doubt it in your case, but we've had some
>> lockups on vswitchd that forced us to upgrade all the VTS hardware to 2.5.0.
>>
>
> Nope. Nothing hangs.
>
>>
>>
>>> -- there is only 1 flow installed. It is a single, zero-priority,
>>> fully-wildcarded table-miss flow w/output=controller
>>>
>>
>> Well, not really.. :-)  Try:
>>
>> sudo ovs-appctl bridge/dump-flows br0
>>
>
> Good idea :-) Did not realize you could dive that deep into the
> forwarding tables. There are some ARP flows with NORMAL output actions.
> These definitely look suspicious, especially the one matching our host as
> src MAC, ethertype=ARP, and opcode=request...
>
>>
>> There's some special handling for ARP for in-band control that is set in
>> very-high-priority hidden flows in a late pipeline table.  Make sure you're
>> not hitting those flows.
>>
>
> These hidden ARP flows all have very high (18000+) priorities. Why
> would they be here if we are operating in secure mode? More puzzling is
> that this problem appeared, seemingly overnight, on probably 50 OVS
> bridges across all our disjoint network topologies and disjoint control
> planes.
>
>>
>>
>>> -- the Dell switch gets all the ARPs and sends them as packet-ins to our
>>> controller, so they are being forwarded by the OVS somehow
>>>
>>
>> I still don't quite understand your topology graph, but sourcing packets
>> from a host connected to an OVS bridge that it is itself hosting can get
>> problematic without some namespacing.
>>
>
> Will look into this. Should the ideal setup be a veth pair -- one end
> attached to the bridge and the other to a different netns?
>
>>
> I hope this is a little better. Topology is:
>
> [LAN with other hosts, one is 192.168.1.4]
>         |
>         |
> [Dell-S4810--port24]----[eth1(1)--br0(LOCAL)]
>
> IP 192.168.1.3/24 is assigned to br0. ARP packets sent to br0 by the host
> running the OVS br0 bridge arrive on LOCAL. From there, we'd expect a
> packet-in, which obviously now is being stopped by the hidden matching ARP
> flow. Instead, OVS is forwarding ARP for us to port 1, which goes out eth1
> to our next hop switch.
>
>>
>>
>
>>
>>> -- tried installing an explicit
>>> priority=1,in_port=LOCAL,dl_type=0x806,actions=output:CONTROLLER flow; this
>>> does not match the ARP packets. They are still forwarded through OVS
>>> -- there are no other routes on the host that could match the packets
>>> and circumvent OVS
>>>
>>> My suspicion is that OVS is forwarding all ARP packets "under the
>>> table" and only sending L3+ and unknown ethertypes (LLDP perhaps?) to the
>>> controller.
>>>
>>
>> All I can guess right now is that you're hitting the in-band ARP matches,
>> although I'm not sure why you've never had this problem before.  More
>> information about your topology and bridge configuration might reveal
>> something more useful.
>>
>
> Yes, we are hitting the in-band ARP matches, but again, as I mentioned
> above, we've been running these OVS bridges (of different versions) for a
> very long time now, using LOCAL as a way for our hosts running OVS to
> attach to the data plane. Almost every OVS bridge we have running (on our
> own machines, in CloudLab, in GENI) has gotten into this state at seemingly
> the same time. They're all part of different networks, under different
> controllers, and in different locations around the country.
>
>>
>> --
>> Nick
>>
>
>