[ovs-discuss] Inband Behavior in XenServer Host

Thu Dec 22 17:59:40 UTC 2011

On 12/1/2011 2:38 PM, David Erickson wrote:
> On 12/01/2011 01:11 PM, David Erickson wrote:
>> On 12/1/2011 12:05 PM, Ben Pfaff wrote:
>>> On Thu, Dec 01, 2011 at 12:02:46PM -0800, David Erickson wrote:
>>>> On 12/1/2011 12:01 PM, Ben Pfaff wrote:
>>>>> On Wed, Nov 30, 2011 at 06:18:15PM -0800, David Erickson wrote:
>>>>>> I think I've run into another semi-related issue with inband. I
>>>>>> recently changed the fail mode of my OVS instances from standalone
>>>>>> to secure because I wanted them to continue using the rules that had
>>>>>> been set in the event the controller died or needed restarting.
>>>>>> This had the unfortunate side effect that if the controller is down
>>>>>> long enough, the box can lose its DHCP lease and be unable to
>>>>>> get a new one, because the switch tries to send the DHCP requests
>>>>>> to the controller but can't connect to the controller
>>>>>> without an address, e.g.:
>>>>>>
>>>>>> Nov 30 21:04:52 localhost dhclient: DHCPDISCOVER on xenbr0 to
>>>>>> 255.255.255.255 port 67 interval 4
>>>>>> Nov 30 21:04:56 localhost dhclient: No DHCPOFFERS received.
>>>>>> Nov 30 21:04:56 localhost dhclient: No working leases in persistent
>>>>>> database - sleeping.
>>>>>> Nov 30 21:04:58 localhost ovs-vswitchd:
>>>>>> 789878|stream_tcp|ERR|tcp:192.168.1.11:6633: connect: Network is
>>>>>> unreachable
>>>>>>
>>>>>> What is the behavior of the inband rules when the switch is in
>>>>>> secure mode and has lost connection to the controller?
>>>>> No different from any other time.
>>>>>
>>>>>> It seems to me like they are being ignored, or at least the DHCP
>>>>>> rule doesn't seem to be working.
>>>>> Please investigate further. I would start by finding out whether the
>>>>> DHCP requests are making it out on the wire, then if they are, whether
>>>>> DHCP replies are visible coming back across the wire.
>>>> Ya the requests aren't making it onto the wire.
>>> Please capture the kernel flow that matches the request with
>>> "ovs-dpctl dump-flows", then feed that flow back into "ofproto/trace"
>>> to see what OVS is actually doing with it.
>>
>> So in the base case, where dump-flows returns no flows at all, the host
>> has lost the DHCP lease it previously had, and the controller has become
>> unreachable, so the switch moved to fail-secure mode:
>>
>> ovs-appctl ofproto/trace xenbr0 0 65534
>> ffffffffffff00c09f9ffed8080045100148000000001011a99600000000ffffffff004400430134819801010600c4bea10a000000000000000000000000000000000000000000c09f9ffed80000000000000000000000000000000000000000 
>>
>> Packet: 00:c0:9f:9f:fe:d8 > Broadcast, ethertype IPv4 (0x0800), length
>> 96: truncated-ip - 246 bytes missing! 0.0.0.0.bootpc >
>> 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:c0:9f:9f:fe:d8,
>> length: 300
>> Flow: tunnel0:in_port0000:tci(0) mac00:c0:9f:9f:fe:d8->ff:ff:ff:ff:ff:ff
>> type0800 proto17 tos16 ip0.0.0.0->255.255.255.255 port68->67
>> Rule: table=0 cookie=0
>> priority=180000,udp,in_port=0,dl_src=00:c0:9f:9f:fe:d8,tp_src=68,tp_dst=67 
>>
>> OpenFlow actions=NORMAL
>>
>> Final flow: unchanged
>> Datapath actions: drop
>>
>> Any ideas?
>>
>
> Here are some short instructions on how to reproduce:
> -Connect OVS to a controller with fail mode set to standalone
> -Set the DHCP server to hand out a very short lease to the XenServer
> host machine (say 180 seconds)
> -Release/renew the lease on the host so it picks up the new lease time
> -Set the OVS switch fail mode to secure
> -Add an iptables rule on the controller machine to block the XS host's IP
> -OVS should lose connection to controller
> -Ensure dump-flows shows no flows, particularly none that would handle 
> the DHCP request/response
> -DHCP lease should expire
> -Badness should ensue

Bumping this to see whether you have been able to reproduce it
internally.
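
For anyone checking the trace input above, here is a minimal Python
sketch that decodes the leading headers of the hex packet passed to
ofproto/trace. It only covers the first 46 bytes (Ethernet, IPv4, UDP
and the first four BOOTP bytes). The function name decode_headers and
the dictionary keys are my own illustrative choices, not anything from
OVS.

```python
import struct

# First 46 bytes of the packet from the ofproto/trace command above.
# The zero-filled remainder of the BOOTP body is left out here.
PACKET_HEX = (
    "ffffffffffff" "00c09f9ffed8" "0800"        # Ethernet: dst, src, ethertype
    "4510" "0148" "0000" "0000" "1011" "a996"   # IPv4: ver/ihl+tos, len, id, frag, ttl+proto, csum
    "00000000" "ffffffff"                       # IPv4: source, destination
    "0044" "0043" "0134" "8198"                 # UDP: sport, dport, length, csum
    "01010600"                                  # BOOTP: op, htype, hlen, hops
)


def mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def ipv4(raw):
    """Format 4 raw bytes as a dotted-quad IPv4 address."""
    return ".".join(str(b) for b in raw)


def decode_headers(hex_str):
    """Decode Ethernet, IPv4 and UDP headers plus the BOOTP op byte."""
    raw = bytes.fromhex(hex_str)
    dl_dst, dl_src, dl_type = struct.unpack("!6s6sH", raw[:14])
    (ver_ihl, tos, total_len, _ident, _frag, ttl, proto, _csum,
     nw_src, nw_dst) = struct.unpack("!BBHHHBBH4s4s", raw[14:34])
    ihl_bytes = (ver_ihl & 0x0F) * 4          # IPv4 header length in bytes
    udp_off = 14 + ihl_bytes
    tp_src, tp_dst, udp_len, _ucsum = struct.unpack(
        "!HHHH", raw[udp_off:udp_off + 8])
    return {
        "dl_src": mac(dl_src),
        "dl_dst": mac(dl_dst),
        "dl_type": dl_type,
        "tos": tos,
        "ttl": ttl,
        "proto": proto,
        "total_len": total_len,
        "nw_src": ipv4(nw_src),
        "nw_dst": ipv4(nw_dst),
        "tp_src": tp_src,
        "tp_dst": tp_dst,
        "udp_len": udp_len,
        "bootp_op": raw[udp_off + 8],         # 1 = BOOTREQUEST
    }


if __name__ == "__main__":
    for key, value in decode_headers(PACKET_HEX).items():
        print(f"{key}: {value}")
```

The decoded fields agree with the Flow line in the trace (proto17,
tos16, port68->67), so the packet itself looks like an ordinary DHCP
request from the host.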

Thanks,
David