[ovs-discuss] Can I add a manual bonded interface as a ovs port ?

netsurfed zhuohaofan at 126.com
Thu Jun 21 02:10:10 UTC 2018


Dear Ben:
I didn't realize that the screenshot could not be displayed in the email, so I am writing this again; please ignore my previous message. If anything in my explanation is unclear, please let me know. Thank you very much.


When I added a "Linux balance-alb bond" to an OVS bridge, I ran into a problem: some machines can ping the bridge's IP, but some cannot.
My setup looks something like this:
 ---------------------   ---------------------   --------   ---------------------
| machine-1, ip: ip-1 | | machine-2, ip: ip-2 | |        | | machine-n, ip: ip-n |
| arp:                | | arp:                | | ...... | | arp:                |
|   ip-x  mac-x1      | |   ip-x  mac-x1      | |        | |   ip-x  mac-x2      |
 ---------------------   ---------------------   --------   ---------------------


 --------------------------------------------------------------------------
|  ----------------------------------------------------------------------  |
| |  ------------------------------------------------------------------  | |
| | |  ---------------------      ---------------------                | | |
| | | | eth1, mac: mac-x1   |    | eth2, mac: mac-x2   |                | | |
| | |  ---------------------      ---------------------                | | |
| | | ovs port1: bond0, mac: mac-x1                                     | | |
| |  ------------------------------------------------------------------  | |
| | ovs bridge: ovsbr0, ip: ip-x, mac: mac-x1                             | |
|  ----------------------------------------------------------------------  |
| machine-x                                                                 |
 --------------------------------------------------------------------------


1. Configure "bond0" on "machine-x"; the mode is balance-alb.
$ ip link
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether ac:1f:6b:12:4c:1e brd ff:ff:ff:ff:ff:ff
5: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether ac:1f:6b:12:4c:1f brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP mode DEFAULT qlen 1000
    link/ether ac:1f:6b:12:4c:1e brd ff:ff:ff:ff:ff:ff
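(For reference, here is a minimal iproute2 sketch of how such a bond can be created. The exact commands and interface names are my assumption for illustration, not a transcript of what was actually run on machine-x.)

```shell
# Hypothetical sketch: build a balance-alb bond from eth1/eth2 with
# iproute2 (interface names assumed; slaves must be down when enslaved).
ip link add bond0 type bond mode balance-alb
ip link set eth1 down
ip link set eth1 master bond0
ip link set eth2 down
ip link set eth2 master bond0
ip link set bond0 up
```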


2. Add an OVS bridge "ovsbr0" on "machine-x", and add "bond0" (the Linux balance-alb bond) as a port.
$ ovs-vsctl add-br ovsbr0
$ ovs-vsctl add-port ovsbr0 bond0
$ ip addr flush dev bond0
$ ip addr add ip-x/24 dev ovsbr0
$ ip link set ovsbr0 up


3. Many machines ping the bridge's IP at the same time.
Some machines can ping the bridge's IP, but some cannot.


I tried to analyze the cause of the problem. I found that any machine whose ARP entry for ip-x shows HWaddress "mac-x1" can ping "machine-x", while those whose entry shows "mac-x2" cannot.
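The ARP entries on the peers can be compared with a quick check like the following (a suggested diagnostic; 13.10.12.26 is machine-x's address as seen in the captures later in this mail):

```shell
# On each peer, look up the ARP entry for machine-x (13.10.12.26).
# In my tests, peers whose entry shows ac:1f:6b:12:4c:1e (mac-x1) can
# ping, while peers that learned ac:1f:6b:12:4c:1f (mac-x2) cannot.
ip neigh show 13.10.12.26
# or, with the older tool:
arp -n 13.10.12.26
```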
The description of balance-alb in the bonding documentation (https://www.kernel.org/doc/Documentation/networking/bonding.txt) is as follows:
balance-alb or 6

		Adaptive load balancing: includes balance-tlb plus
		receive load balancing (rlb) for IPV4 traffic, and
		does not require any special switch support.  The
		receive load balancing is achieved by ARP negotiation.
		The bonding driver intercepts the ARP Replies sent by
		the local system on their way out and overwrites the
		source hardware address with the unique hardware
		address of one of the slaves in the bond such that
		different peers use different hardware addresses for
		the server.

		Receive traffic from connections created by the server
		is also balanced.  When the local system sends an ARP
		Request the bonding driver copies and saves the peer's
		IP information from the ARP packet.  When the ARP
		Reply arrives from the peer, its hardware address is
		retrieved and the bonding driver initiates an ARP
		reply to this peer assigning it to one of the slaves
		in the bond.  A problematic outcome of using ARP
		negotiation for balancing is that each time that an
		ARP request is broadcast it uses the hardware address
		of the bond.  Hence, peers learn the hardware address
		of the bond and the balancing of receive traffic
		collapses to the current slave.  This is handled by
		sending updates (ARP Replies) to all the peers with
		their individually assigned hardware address such that
		the traffic is redistributed.  Receive traffic is also
		redistributed when a new slave is added to the bond
		and when an inactive slave is re-activated.  The
		receive load is distributed sequentially (round robin)
		among the group of highest speed slaves in the bond.

		When a link is reconnected or a new slave joins the
		bond the receive traffic is redistributed among all
		active slaves in the bond by initiating ARP Replies
		with the selected MAC address to each of the
		clients. The updelay parameter (detailed below) must
		be set to a value equal or greater than the switch's
		forwarding delay so that the ARP Replies sent to the
		peers will not be blocked by the switch.

		Prerequisites:

		1. Ethtool support in the base drivers for retrieving
		the speed of each slave.

		2. Base driver support for setting the hardware
		address of a device while it is open.  This is
		required so that there will always be one slave in the
		team using the bond hardware address (the
		curr_active_slave) while having a unique hardware
		address for each slave in the bond.  If the
		curr_active_slave fails its hardware address is
		swapped with the new curr_active_slave that was
		chosen.
Different peers use different hardware addresses for the server. So I wonder: is the destination MAC address (mac-x2) of packets sent from "machine-n" missing from the CAM table of the ovs bridge, so that the ovs bridge discards those packets?
If that is the case, does it mean that the ovs bridge does not support the receive load balancing behavior of this bonding mode? And is there any way to work around this problem?
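One way to test this guess (my suggestion, not something I have confirmed) is to watch the OVS MAC learning table while a failing peer pings:

```shell
# Dump the MAC learning (FDB) table of the OVS bridge. If mac-x2
# (ac:1f:6b:12:4c:1f) never appears on the LOCAL port, frames addressed
# to it will not be delivered to the bridge's internal interface.
ovs-appctl fdb/show ovsbr0

# Trace how OVS would forward a frame addressed to mac-x2; the MACs are
# taken from the captures below, and in_port=1 is assumed to be bond0's
# OpenFlow port number on this bridge.
ovs-appctl ofproto/trace ovsbr0 \
    in_port=1,dl_src=0c:c4:7a:c1:64:3a,dl_dst=ac:1f:6b:12:4c:1f
```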


I tested a Linux bridge, which does not have this problem. I found that the destination MAC address of the packet received on eth2 was mac-x2, but after the packet was passed up to bond0, the destination MAC had been changed to mac-x1. Is this why the Linux bridge does not have this ping problem? If so, what can I do to make the ovs bridge work the same way?
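For comparison, the Linux bridge's forwarding database can be inspected in the same way (again a suggested check, not output I captured):

```shell
# Show which MACs the Linux bridge knows and on which ports. I would
# expect both slave MACs to be treated as local here, since bond0
# answers for both mac-x1 and mac-x2.
bridge fdb show br br0
# older tool:
brctl showmacs br0
```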


I used the "tcpdump" tool to capture packets on "eth1", "eth2", "bond0", and the Linux bridge "br0", as below:
tcpdump -i br0 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843719 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3357, length 64
11:49:32.843744 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3357, length 64
11:49:33.843731 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3358, length 64
11:49:33.843754 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3358, length 64
11:49:34.843745 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3359, length 64
11:49:34.843768 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3359, length 64
11:49:35.843841 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3360, length 64
11:49:35.843869 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3360, length 64


tcpdump -i bond0 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843713 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3357, length 64
11:49:32.843747 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3357, length 64
11:49:33.843724 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3358, length 64
11:49:33.843757 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3358, length 64
11:49:34.843738 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3359, length 64
11:49:34.843771 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3359, length 64
11:49:35.843834 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3360, length 64
11:49:35.843873 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3360, length 64


tcpdump -i eth1 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843752 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3357, length 64
11:49:33.843762 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3358, length 64
11:49:34.843776 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3359, length 64
11:49:35.843878 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3360, length 64


tcpdump -i eth2 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843703 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3357, length 64
11:49:33.843717 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3358, length 64
11:49:34.843730 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3359, length 64
11:49:35.843828 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3360, length 64



At 2018-04-29 03:01:53, "Ben Pfaff" <blp at ovn.org> wrote:
>On Fri, Apr 27, 2018 at 11:44:46AM +0800, netsurfed wrote:
>> There is a "balance-alb" bond0 in my linux host, like this:
>> 
>> 
>> Can I add this as a port to ovs bridge? like this:
>> ovs-vsctl add-br ovsbr0
>> ovs-vsctl add-port ovsbr0 bond0
>> 
>> 
>> I know ovs can create bond using "ovs-vsctl add-bond BRIDGE PORT IFACE...". 
>> However, that requires removing the original bond first. I don't want to do that.
>
>Usually it works fine to add a Linux bond to an OVS bridge.