[ovs-dev] [BUG] broad-/multicast & SLB bonding -> FAIL

Markus Schuster ml at markus.schuster.name
Tue Jan 22 18:39:01 UTC 2013


Hi everybody,

We're using Open vSwitch because it is the default software switch/bridge
in XCP 1.6 (Xen Cloud Platform). As far as I know, it's version 1.4.2.
I've already posted this problem to the XCP mailing list but got no reply,
so I'm trying again here :)

The following is a quote of my mail to the XCP mailing list:
We have a pool of XCP 1.6 (final) hosts using Open vSwitch. Two NICs form
an active/active SLB bond for network connectivity.
Recently we migrated a two-node Tomcat cluster to this environment, and
those two VMs had a very hard time being reachable from the outside. After
investigating the problem a bit further, we learned that Tomcat uses
multicast for cluster communication, and that's where the problem started.
Open vSwitch sends out the multicast frames on ALL physical interfaces
belonging to the SLB bond. That causes a lot of confusion on the physical
switches that the XCP hosts are connected to (VM MAC addresses jumping
between ports multiple times a second).
After investigating even further, I noticed that the very same problem
happens not only for multicast frames but also for normal broadcast
frames (ARP, broadcast ping, ...) - luckily Linux servers don't send that
much broadcast traffic :)
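
To illustrate the MAC flapping described above, here is a minimal Python
sketch of a learning switch. It is a toy model, not Open vSwitch or real
switch firmware. The names LearningSwitch and send_broadcasts, the port
names and the VM MAC address are all made up for illustration, and it
assumes both bond NICs end up in the same layer-2 learning domain.

```python
from collections import defaultdict


class LearningSwitch:
    """Toy model of a physical switch's MAC learning table."""

    def __init__(self):
        self.mac_table = {}            # source MAC -> port it was last seen on
        self.moves = defaultdict(int)  # source MAC -> number of port changes

    def receive(self, port, src_mac):
        """Learn src_mac on port and count a move if the port changed."""
        previous = self.mac_table.get(src_mac)
        if previous is not None and previous != port:
            self.moves[src_mac] += 1
        self.mac_table[src_mac] = port


def send_broadcasts(switch, src_mac, uplink_ports, count):
    """Send `count` broadcasts; each one leaves via every listed uplink."""
    for _ in range(count):
        for port in uplink_ports:
            switch.receive(port, src_mac)


VM_MAC = "00:16:3e:00:00:01"  # hypothetical VM MAC address

# Observed behaviour: every broadcast leaves on both bond slaves.
flooding = LearningSwitch()
send_broadcasts(flooding, VM_MAC, ["port1", "port2"], count=5)

# Expected behaviour: the VM's broadcasts always leave on a single slave.
pinned = LearningSwitch()
send_broadcasts(pinned, VM_MAC, ["port1"], count=5)

print("moves when flooding both slaves:", flooding.moves[VM_MAC])
print("moves when pinned to one slave: ", pinned.moves[VM_MAC])
```

In this model the learned port changes on nearly every frame when both
slaves carry the broadcast, and never changes when a single slave is used.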

We spent a few hours digging in the Open vSwitch source code, and it looks
like there's some special handling for broad-/multicast frames - flooding
them out of all ports except the one they came in on (classic bridge
behavior) - but there seems to be no special handling for the SLB case,
where I'd expect to see those packets only on the active slave for the
MAC/VLAN combination of the sending VM.
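
The behavior I'd expect can be sketched roughly as follows. This is a
minimal Python illustration, not the actual Open vSwitch code path: the
functions pick_slave and flood_outputs, the port layout and the MAC address
are hypothetical, and CRC32 is only a stand-in for whatever hash the bond
really uses on the source MAC/VLAN.

```python
import zlib


def pick_slave(slaves, src_mac, vlan):
    """Pick one slave for a source MAC/VLAN pair (CRC32 is a stand-in hash)."""
    key = f"{src_mac.lower()}/{vlan}".encode()
    return slaves[zlib.crc32(key) % len(slaves)]


def flood_outputs(ports, in_port, src_mac, vlan):
    """Return the interfaces a flooded frame should leave on.

    `ports` maps a bridge port name to its list of interfaces; a bond is
    simply a port with more than one interface (its slaves).
    """
    outputs = []
    for name, slaves in ports.items():
        if name == in_port:
            continue  # classic bridge rule: never send back out the ingress port
        if len(slaves) > 1:
            # Bond: use exactly one slave, chosen by source MAC and VLAN.
            outputs.append(pick_slave(slaves, src_mac, vlan))
        else:
            outputs.append(slaves[0])
    return outputs


# Hypothetical bridge: two VM interfaces and one two-NIC SLB bond.
ports = {
    "vif1.0": ["vif1.0"],
    "vif2.0": ["vif2.0"],
    "bond0": ["eth0", "eth1"],
}

out = flood_outputs(ports, in_port="vif1.0", src_mac="00:16:3e:00:00:01", vlan=0)
print(out)
```

With this selection a broadcast from a VM reaches the other VM interface
and exactly one of eth0/eth1, and the same VM always maps to the same slave
as long as the set of slaves does not change.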

Hope someone can help. 

Best regards,
Markus



