[ovs-discuss] Asymmetric bridging problem?

Jonathan Proulx jon at csail.mit.edu
Mon Jun 26 16:12:19 UTC 2017


Hi All,


I've been using OVS in my OpenStack deployment for years and it's
mostly "just worked", so I've somewhat embarrassingly managed to remain
ignorant of its inner workings...until now.

I have a situation with a client VM that has two interfaces on the
same L2 network. (I don't have direct internal access to the VM, but I
do have a fairly responsive, competent admin to talk to who does.)

Under load its primary network sometimes fails. This is eth0 inside the
system, which is where the default route is configured. I've not checked
its secondary network.

This was mostly done as a P2V migration, and the only practical value at
this point seems to be keeping the interface counters distinct.  I
suggested "well don't do that, just put both IPs on one interface",
but they don't want to, and similar configs with more interfaces on the
same L2 seem to be working for them on other VMs.

What I have seen is that, when it is failing, outbound traffic emerges
from the vm-eth0 path but OVS sends incoming traffic to the vm-eth1
path. The returning traffic then gets dropped by iptables on the tap
device before the VM can see it.

This only happens under high traffic load (which for this service is
many small flows), and sometimes flips back to the "good" path after
multiple attempts (these were connection attempts from the VM to
external webservers).

Why would OVS switch which port incoming traffic goes to and how might
I stop it? OR what else can I do to determine this?


Details of the internal path:

physical host:
Ubuntu 14.04
OpenStack Mitaka (ML2/OVS plugin)
ovs-vsctl (Open vSwitch) 2.5.0


          --------------------
          |                  |
          |        VM        |
          | eth0        eth1 |
          ---|-----------|----
iptables--- tap0        tap1 ---iptables
        |                      |
lin-br0 |                      | lin-br1
        --- veth0a    veth1a----
              |         |
        ______|_________|_______
        |     |         |      |
ovsbr0  |   veth0b    veth1b   |
        |                      |
        |        patch0a       |
        ____________|___________
                    |
                    |           
            --------|-------
            |    patch0b   |
    ovsbr1  |              |
            |     phy      |
            _______|________
                   |
                   |
                internet


In the typical OpenStack way there's a Linux bridge device for each VM
interface so that iptables can be applied to their tap devices. This
connects into the OVS system using veth pairs.

The veth devices in the ovsbr0 are where I see the split happen.
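
For reference, here is a toy model of how a MAC-learning bridge picks
an output port. It assumes ovsbr0 forwards by plain MAC learning (the
NORMAL action), which is an assumption, since the flow tables are not
shown here. It is not OVS code, and GW_MAC is a made-up address used
only for illustration:

```python
# Toy model of a MAC-learning switch; NOT OVS code, just the idea.
class LearningSwitch:
    def __init__(self):
        self.fdb = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the output port (or 'flood')."""
        self.fdb[src_mac] = in_port
        return self.fdb.get(dst_mac, "flood")


ETH0_MAC = "fa:16:3e:4c:46:31"  # eth0 HWaddr from the ifconfig output
GW_MAC = "00:00:5e:00:01:01"    # hypothetical gateway MAC (assumption)

sw = LearningSwitch()

# Egress on the eth0 path, then a reply coming back in through patch0a.
sw.receive("veth0b", ETH0_MAC, GW_MAC)
good = sw.receive("patch0a", GW_MAC, ETH0_MAC)

# If the same source MAC is ever seen on the eth1 path, the entry moves
# and later replies follow it.
sw.receive("veth1b", ETH0_MAC, GW_MAC)
bad = sw.receive("patch0a", GW_MAC, ETH0_MAC)

print(good, bad)
```

In this model the output port depends only on the destination MAC and
on where that MAC was last learned. So one thing worth checking is
which destination MAC the misdirected frames carry, and where the
bridge currently has it learned (ovs-appctl fdb/show ovsbr0 lists the
learned entries).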

iptables on tap1 drops the misdirected incoming packets so they never
make it to the VM.

view from inside VM (IP addrs changed to protect the guilty):


vm:# route -n
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.52.1     0.0.0.0         UG    0      0        0 eth0
192.168.52.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.52.0     0.0.0.0         255.255.252.0   U     0      0        0 eth0
192.168.52.1     0.0.0.0         255.255.255.255 UH    0      0        0 eth0

vm:# ifconfig
eth0      Link encap:Ethernet  HWaddr fa:16:3e:4c:46:31
          inet addr:192.168.52.112  Bcast:192.168.55.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5258275421 errors:0 dropped:297481 overruns:0 frame:0
          TX packets:49893672145 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:551639416655 (513.7 GiB)  TX bytes:67947667619144 (61.7 TiB)
eth1      Link encap:Ethernet  HWaddr fa:16:3e:7a:8b:f9
          inet addr:192.168.52.114  Bcast:192.168.52.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:75344280732 errors:0 dropped:297478 overruns:0 frame:0
          TX packets:27625783576 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:72808509598845 (66.2 TiB)  TX bytes:5986645540469 (5.4 TiB)
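
The routing table above can be read with a small sketch. It assumes
plain longest-prefix-match route selection and ignores policy routing,
source-address selection, and ARP sysctls such as arp_ignore and
arp_announce. The test destinations are made up for illustration:

```python
import ipaddress

# Routes copied from the `route -n` output above:
# (destination network, gateway or None, interface)
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.52.1", "eth0"),
    (ipaddress.ip_network("192.168.52.0/24"), None, "eth1"),
    (ipaddress.ip_network("192.168.52.0/22"), None, "eth0"),
    (ipaddress.ip_network("192.168.52.1/32"), None, "eth0"),
]


def pick_route(dst):
    """Return (iface, gateway) of the most specific route containing dst."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in ROUTES if addr in r[0]]
    net, gw, iface = max(matches, key=lambda r: r[0].prefixlen)
    return iface, gw


# Hypothetical destinations: a /24 neighbour, a /22-only neighbour,
# the gateway itself, and an off-subnet host.
for dst in ["192.168.52.50", "192.168.53.10", "192.168.52.1", "8.8.8.8"]:
    print(dst, pick_route(dst))
```

Under that assumption, destinations inside 192.168.52.0/24 leave via
eth1 (the /24 is more specific than eth0's /22), while the gateway and
everything off-subnet use eth0.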


Any clues what to look for, what to read, or where to go from here
most welcome...

Thanks,
-Jon

