[ovs-discuss] Strange behaviour with VLANs and Bridges and ARP.

Schlacta, Christopher aarcane at aarcane.org
Thu Aug 3 05:32:48 UTC 2017


This is a bit hard to explain, but I hope you'll follow.  I have
two hosts, density and densetsu.  They each host VMs and Ceph nodes
using libvirt and Ceph, and they're connected to a smart switch through
Open vSwitch.  There are several VLANs: 5, 6, 10, 20, and 30.
Most "normal" traffic goes over VLAN 10, which is just the LAN VLAN.
Here's what ovs-vsctl show looks like on each host:


density:
aarcane at density:~$ sudo ovs-vsctl show
YubiKey for `aarcane':
f2ae0266-6cae-44f0-8ca5-9d6f66562ff4
    Bridge "br0"
        Port lan
            tag: 10
            Interface lan
                type: internal
        Port "vnet0"
            trunks: [5, 6, 10, 20, 30]
            Interface "vnet0"
        Port "eth1"
            Interface "eth1"
        Port "vnet1"
            trunks: [5, 6, 10, 20, 30]
            Interface "vnet1"
        Port "br0"
            Interface "br0"
                type: internal
    ovs_version: "2.7.0"


densetsu:
aarcane at densetsu:~$ sudo ovs-vsctl show
2d6843ee-2bb6-48b8-a979-ba7f64bf5ebc
    Bridge "br0"
        Port "eth0"
            Interface "eth0"
        Port lan
            tag: 10
            Interface lan
                type: internal
        Port "br0"
            Interface "br0"
                type: internal
    ovs_version: "2.7.0"
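
For anyone who wants to reproduce the layout above, the following is a
rough sketch of ovs-vsctl commands that should produce an equivalent
bridge.  It is an assumption, not a record of how these bridges were
actually built.  The `run` helper and the `UPLINK` and `APPLY` variables
are hypothetical names introduced for illustration, and the vnet ports
(which libvirt creates) are not reproduced.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set APPLY=1 (as root, on a test host) to actually run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Physical uplink: eth1 on density, eth0 on densetsu.
UPLINK="${UPLINK:-eth0}"

# Create the bridge if it does not already exist.
run ovs-vsctl --may-exist add-br br0

# Uplink with neither tag nor trunks set, so it carries every VLAN.
run ovs-vsctl --may-exist add-port br0 "$UPLINK"

# Internal access port on VLAN 10 for the host's own LAN address.
run ovs-vsctl --may-exist add-port br0 lan tag=10 -- set Interface lan type=internal
```

Running it as-is only prints the three commands, which makes it easy to
compare against an existing setup before changing anything.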


The problem is that when I try to ping the opposite machine (densetsu to
density or vice versa), the ARP packets are sent with the appropriate
information through br0 and out the eth device, all tagged with VLAN
10, but the *inbound* ARP packet is never delivered to the lan interface.
I see it on br0 and on the eth interface, but not on the destination
host's lan interface.  Furthermore, the destination never seems to see
it or respond to it, so it's impossible for the hosts to initiate
contact without adding entries to the ethers file.
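
The observation and the workaround described above can be sketched
roughly as follows.  This only prints the commands to run; the helper
names `arp_capture_cmd` and `static_neigh_cmd` are hypothetical, and the
IP and MAC addresses are placeholders, not the real ones from these hosts.

```shell
#!/bin/sh
# Print the tcpdump command for watching ARP on one interface.
# -e shows the Ethernet header (including any 802.1Q tag), -n disables
# name resolution; the filter matches both untagged and VLAN-tagged ARP.
arp_capture_cmd() {
    printf "tcpdump -e -n -i %s 'arp or (vlan and arp)'\n" "$1"
}

# Print the command that pins a permanent neighbour entry, as an
# alternative to maintaining an ethers file.
# Arguments: peer IP, peer MAC, local interface.
static_neigh_cmd() {
    printf 'ip neigh replace %s lladdr %s dev %s nud permanent\n' "$1" "$2" "$3"
}

# One capture per hop, to see where the inbound ARP stops.
# eth1 is the uplink on density; use eth0 on densetsu.
for ifc in eth1 br0 lan; do
    arp_capture_cmd "$ifc"
done

# Placeholder addresses (documentation range / locally administered MAC).
static_neigh_cmd 192.0.2.20 02:00:00:00:00:20 lan
```

Each printed command needs to be run as root, one capture per terminal,
while pinging from the other host.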

That workaround is all well and good, but other systems on the
network also cannot connect to these hosts for administration and
management, which is very problematic.

I've not made any changes to the OpenFlow configuration.  The hosts were
able to communicate with each other for a long time; this change appeared
with a recent kernel upgrade.  I'm not sure how to fix it, or whether
it's a bug.  Again, note:  all IP traffic seems to work fine.  ICMP, TCP,
etc. work without issue.  Only ARP seems not to be passed inward when it
should be.
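
A possible way to check the "no OpenFlow changes" claim and to ask OVS
what it would do with an inbound ARP on VLAN 10 is sketched below.  The
`run` helper, `OFPORT`, and `SRC_MAC` are hypothetical names, and the
port number and MAC address are placeholders to be replaced with real
values.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set APPLY=1 (as root) to actually run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

BR=br0
OFPORT="${OFPORT:-1}"                    # placeholder OpenFlow port number of the uplink
SRC_MAC="${SRC_MAC:-02:00:00:00:00:20}"  # placeholder MAC of the remote host

# With an untouched bridge, this should list only the default NORMAL flow.
run ovs-ofctl dump-flows "$BR"

# Look up the uplink's real OpenFlow port number (eth1 on density).
run ovs-vsctl get Interface eth1 ofport

# Trace a broadcast ARP arriving on the uplink, tagged with VLAN 10.
run ovs-appctl ofproto/trace "$BR" \
    "in_port=$OFPORT,dl_vlan=10,arp,dl_src=$SRC_MAC,dl_dst=ff:ff:ff:ff:ff:ff"

# Show the MAC learning table, including the VLAN each MAC was seen on.
run ovs-appctl fdb/show "$BR"
```

If the trace reports the lan port among the outputs while the packet
still never shows up there, that would point away from the flow table
and toward the kernel datapath, which fits the timing of the kernel
upgrade.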

