[ovs-dev] ovn ping from VM to external gateway IP failed.

Mickey Spiegel mickeys.dev at gmail.com
Sun Jan 1 20:37:45 UTC 2017


On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique <nusiddiq at redhat.com> wrote:

>
>
> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel <mickeys.dev at gmail.com>
> wrote:
>
>>
>> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel <mickeys.dev at gmail.com>
>> wrote:
>>
>>>
>>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel <mickeys.dev at gmail.com>
>>> wrote:
>>>
>>>>
>>>> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique <nusiddiq at redhat.com>
>>>> wrote:
>>>>
>>>>> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun <dongj at dtdream.com> wrote:
>>>>>
>>>>
>>>> <snip>
>>>>
>>>>
>>>>> Hi Dong Jun, I am also facing the same issue on my setup.
>>>>> These are the findings of my investigation so far.
>>>>>
>>>>> Looks like this issue is seen after the commit
>>>>> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
>>>>> which removes the usage of patch ports and uses the clone action
>>>>> instead.
>>>>>
>>>>> I reverted to the commit just before it and SNAT/DNAT is working as
>>>>> expected.
>>>>>
>>>>> In my case, the gateway router is hosted on node 1 and I am trying to
>>>>> reach a VM (192.168.0.5) hosted on node 2 using the external IP
>>>>> (10.2.7.105) associated with it. I could see that node 1 is sending
>>>>> the packet to node 2 through the geneve tunnel, but it is dropped by
>>>>> the node 2 flows.
>>>>>
>>>>> Below is the tcpdump of the packet
>>>>>
>>>>> **************************
>>>>> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13240, seq 1, length 64
>>>>> ***************************
>>>>>
>>>>> Below is the tcpdump of the packet with the ovn-controller (without the
>>>>> above commit) in the working case
>>>>>
>>>>> **************************
>>>>> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13308, seq 1, length 64
>>>>> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 > nusiddiq.blr.redhat.com: ICMP echo reply, id 13308, seq 1, length 64
>>>>> **************************
>>>>>
>>>>> The options data is 00030005.
>>>>>
>>>>> From the capture, I could see that the packet from node 1 is missing the
>>>>> geneve option fields that carry the inport and outport keys.
>>>>>
>>>>
>>>> I am facing the same issue running my distributed NAT patch set.
>>>> Between UNSNAT recirc and output to tunnel, a megaflow is installed that
>>>> is missing the geneve option fields.
>>>>
>>>> I verified that the table=32 openflow rule has the geneve option fields.
>>>> ofproto/trace shows geneve in the "Datapath actions" at the end, so no
>>>> problem with whatever ofproto/trace is using.
>>>>
>>>
>>> Throwing some logs in, I see that flow->metadata.present.map is 0 rather
>>> than 1 coming into tun_metadata_to_geneve_nlattr() in lib/tun-metadata.c
>>> when the problem occurs. That is why the geneve option fields are
>>> missing.
>>>
>>> I have not yet figured out why flow->metadata.present.map is 0. It should
>>> be modified when tun_metadata_write() is called due to actions setting
>>> tunnel metadata values. I have not checked that yet.
>>>
>>
>> I just posted a fix. I did not try it with the gateway router or with
>> OpenStack, but with this bug fix all distributed NAT manual test cases
>> are now passing.
>>
>>
> Thanks for the fix. I just tested it. It's working when I am trying to
> reach the VM using its floating IP, but not when trying to ping
> www.google.com from the VM (SNAT use case).
>

With distributed NAT, most of my debugging and tests were using SNAT. The
bug fix that I posted fixed the problem that was causing ICMP echo replies
to be dropped. The openflow path for distributed SNAT is similar to that
for SNAT on gateway routers, but there are still some differences, notably
one router instead of two routers and no "join" switch. Also I did not try
it with DNS.

Are you able to debug further, to see whether a missing geneve options
field is still the culprit?
It is possible that removal of patch ports within br-int uncovered other
issues.
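
For reference when looking at captures, here is a minimal sketch of how the
options data value quoted earlier in this thread (00030005) would decode. It
assumes the layout described in ovn-architecture(7) for OVN's Geneve option:
32 bits of data made up of 1 reserved bit, a 15-bit logical ingress port
key, and a 16-bit logical egress port key. The function name is only for
illustration.

```python
def decode_ovn_geneve_option(data_hex):
    """Split 32 bits of OVN Geneve option data into (inport, outport) keys.

    Assumed layout: 1 reserved bit, 15-bit logical ingress port key,
    16-bit logical egress port key.
    """
    value = int(data_hex, 16)
    if value >> 32:
        raise ValueError("expected at most 32 bits of option data")
    inport = (value >> 16) & 0x7FFF   # drop the reserved top bit
    outport = value & 0xFFFF
    return inport, outport


# The value from the working capture quoted in this thread.
print(decode_ovn_geneve_option("00030005"))  # (3, 5)
```

If that layout holds, a packet arriving without the option gives the
receiving node no inport or outport key to match on, which would be
consistent with the drop on node 2.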

I primarily used ovs-dpctl dump-flows to see installed megaflows, ovs-appctl
ofproto/trace (with recirc_id), and ovs-ofctl dump-flows for initial
debugging. In particular I could see that the installed megaflows were
lacking the geneve options field in the actions.
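
To make that last check less error-prone, something like the following
could be run over saved ovs-dpctl dump-flows output. It is only a sketch:
it assumes that tunnel output shows up in the actions as "set(tunnel(...))"
and that the options, when present, are printed as "geneve(...)" inside
it. The two sample lines are hand-written to illustrate that shape and are
not output from a real system.

```python
def flows_missing_geneve_options(dump_text):
    """Return megaflow lines with a tunnel set action lacking geneve options."""
    missing = []
    for line in dump_text.splitlines():
        _, sep, actions = line.partition("actions:")
        if not sep:
            continue  # not a flow line
        # One segment per set(tunnel(...)) action; each should carry geneve(...).
        segments = actions.split("set(tunnel(")[1:]
        if any("geneve(" not in seg for seg in segments):
            missing.append(line.strip())
    return missing


# Hand-written, illustrative lines (assumed format, not real logs).
SAMPLE = """\
recirc_id(0x5),in_port(3),eth_type(0x0800),ipv4(frag=no), packets:1, bytes:98, used:0.5s, actions:set(tunnel(tun_id=0x1,dst=182.16.0.15,ttl=64,flags(df|csum|key))),2
recirc_id(0),in_port(4),eth_type(0x0800),ipv4(frag=no), packets:1, bytes:98, used:0.5s, actions:set(tunnel(tun_id=0x1,dst=182.16.0.15,ttl=64,geneve({class=0x102,type=0x80,len=4,0x30005}),flags(df|csum|key))),2
"""

for flow in flows_missing_geneve_options(SAMPLE):
    print(flow)
```

Any flow it prints is a candidate to feed back into ofproto/trace with the
matching recirc_id.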

Mickey


> Numan
>
>
>> Mickey
>>
>>
>>> Mickey
>>>
>>>
>>>> Mickey
>>>>
>>>>
>>>>
>>>>>
>>>>> Thanks
>>>>> Numan
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>

