[ovs-dev] [ovs-discuss] problem with LACP Failover

Ethan Jackson ethan at nicira.com
Thu Jul 26 21:21:12 UTC 2012


Please keep replies on the list.

Could you please try changing the bond-mode to balance-tcp?

ovs-vsctl set port bond0 bond_mode=balance-tcp

I'm not sure if that will help you, but it might.  balance-slb bonds
aren't meant to be used with LACP.  In newer versions of Open vSwitch
we've tried to discourage their use more aggressively.
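
Just to sketch the full sequence (assuming the bond is still named bond0;
the lacp=active line is a no-op if LACP is already enabled on the port,
which your lacp/show output below suggests it is):

# ovs-vsctl set port bond0 lacp=active
# ovs-vsctl set port bond0 bond_mode=balance-tcp
# ovs-appctl bond/show bond0

After that, bond/show should report "bond_mode: balance-tcp" instead of
balance-slb.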

Ethan

On Thu, Jul 26, 2012 at 2:16 PM, Pacôme MASSOL <pacome at pmassol.net> wrote:
> On 26/07/2012 20:13, Ethan Jackson wrote:
>
>> Do I understand correctly that you're creating an LACP bond within a
>> hypervisor, going from a guest to dom0?  That's a bit strange...
>
>
> It is for teaching purposes in an e-learning context: implementing LACP
> without needing 2 PCs with 2 NICs plus a managed switch.
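
For reference, the Open vSwitch side of such a lab could be built along
these lines (just a sketch: I'm guessing at the bridge name br0 and
reusing the sw0p0/sw0p1 TAP names from your output below):

# ovs-vsctl add-bond br0 bond0 sw0p0 sw0p1 lacp=active
# ovs-vsctl set port bond0 bond_mode=balance-tcp
# ovs-appctl lacp/show bond0

The guest would then run its own LACP speaker over its two virtual NICs,
for example the Linux bonding driver in 802.3ad mode.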
>
>
>>> - play with ovs-appctl bond/set-active-slave and change active slave
>>
>> set-active-slave shouldn't affect LACP bonds.  If you want to trigger
>> a failover you will need to take down the link.
>>
>> What does ovs-appctl bond/show and ovs-appctl lacp/show say?  Those
>> commands should help you figure out if things are configured properly.
>
>
> These commands always show the expected state: that a link has been
> disabled, that the active slave has changed, etc.
>
>
> # ovs-appctl bond/show bond0
> bond_mode: balance-slb
> bond-hash-algorithm: balance-slb
> bond-hash-basis: 0
> updelay: 0 ms
> downdelay: 0 ms
> next rebalance: 5517 ms
> lacp_negotiated: true
>
> slave sw0p1: enabled
>     active slave
>     may_enable: true
>
> slave sw0p0: enabled
>     may_enable: true
>     hash 153: 1 kB load
>
>
>
> # ovs-appctl lacp/show
> ---- bond0 ----
>     status: active negotiated
>     sys_id: 7a:12:64:1a:15:45
>     sys_priority: 65534
>     aggregation key: 5
>     lacp_time: fast
>
> slave: sw0p0: current attached
>     port_id: 6
>     port_priority: 65535
>
>     actor sys_id: 7a:12:64:1a:15:45
>     actor sys_priority: 65534
>     actor port_id: 6
>     actor port_priority: 65535
>     actor key: 5
>     actor state: activity timeout aggregation synchronized collecting
> distributing
>
>     partner sys_id: 08:00:27:9b:24:12
>     partner sys_priority: 65535
>     partner port_id: 2
>     partner port_priority: 255
>     partner key: 17
>     partner state: activity aggregation synchronized collecting distributing
>
> slave: sw0p1: current attached
>     port_id: 5
>     port_priority: 65535
>
>     actor sys_id: 7a:12:64:1a:15:45
>     actor sys_priority: 65534
>     actor port_id: 5
>     actor port_priority: 65535
>     actor key: 5
>     actor state: activity timeout aggregation synchronized collecting
> distributing
>
>     partner sys_id: 08:00:27:9b:24:12
>     partner sys_priority: 65535
>     partner port_id: 1
>     partner port_priority: 255
>     partner key: 17
>     partner state: activity aggregation synchronized collecting distributing
>
>
>
>
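
If you want to force a failover from the Open vSwitch side for testing,
something along these lines should do it (a sketch reusing the slave
names from your output; the virtual "disconnect" and TAP-down methods you
mention should have the same effect):

# ip link set sw0p1 down
# ovs-appctl bond/show bond0
# ovs-appctl lacp/show bond0
# ip link set sw0p1 up

bond/show should then list the downed slave as disabled and switch the
active slave, and lacp/show should reflect the change on that slave too.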
>
>>
>> You say connectivity is lost; where is the traffic getting dropped?
>
>
> Even with tcpdump + Wireshark, it is hard for me to say. Curiously, for
> example, the client has to send 4 ARP requests to get one reply from the
> bonded VM, and the bonded VM receives only one of those requests...
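
That ARP pattern is a useful clue.  One way to narrow down where the
drops happen is to capture ARP on both slaves at the same time and
compare it with the bridge's MAC table (again a sketch; I'm guessing br0
for the bridge name).  In one terminal:

# tcpdump -eni sw0p0 arp

in a second terminal:

# tcpdump -eni sw0p1 arp

and then check what the bridge has learned:

# ovs-appctl fdb/show br0

balance-slb handles broadcast traffic only on the active slave, so
comparing which slave the ARP frames actually traverse with what
fdb/show reports should show where they are being lost; that interaction
is one of the reasons balance-slb isn't meant to be combined with LACP.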
>
>
>
>>
>> Ethan
>>
>>> - or virtually "disconnect" the wire in VirtualBox
>>> - or put a TAP interface down
>>> Failover is detected by Open vSwitch and by the VM concerned, but
>>> connectivity is lost. I also tried with the Linux bonding module, using
>>> different bond_mode values, without success.
>>>
>>>
>>> I have surely missed something. Please, can you help me, or point me to
>>> a link, tutorial, or tool I can use to diagnose this?
>>>
>>>
>>> Thanks in advance.
>>>
>>> Best regards.
>>>
>>> PM


