[ovs-dev] ovn ping from VM to external gateway IP failed.

Numan Siddique nusiddiq at redhat.com
Mon Jan 2 11:46:55 UTC 2017


On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel <mickeys.dev at gmail.com>
wrote:

>
> On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique <nusiddiq at redhat.com>
> wrote:
>
>>
>>
>> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel <mickeys.dev at gmail.com>
>> wrote:
>>
>>>
>>> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel <mickeys.dev at gmail.com>
>>> wrote:
>>>
>>>>
>>>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel <mickeys.dev at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique <nusiddiq at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun <dongj at dtdream.com> wrote:
>>>>>>
>>>>>
>>>>> <snip>
>>>>>
>>>>>
>>>>>> Hi Dong Jun, I am also facing the same issue on my setup.
>>>>>> These are the findings of my investigation so far.
>>>>>>
>>>>>> Looks like this issue is seen after the commit
>>>>>> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
>>>>>> which removes the usage of patch ports and uses the clone action
>>>>>> instead.
>>>>>>
>>>>>> I reverted to the commit just before it and SNAT/DNAT is working as
>>>>>> expected.
>>>>>>
>>>>>> In my case, the gateway router is hosted on node 1 and I am trying
>>>>>> to reach a VM (192.168.0.5) hosted on node 2 using the external IP
>>>>>> (10.2.7.105) associated with it. I could see that node 1 is sending
>>>>>> the packet to node 2 through the geneve tunnel, but it is dropped by
>>>>>> the flows on node 2.
>>>>>>
>>>>>> Below is the tcpdump of the packet
>>>>>>
>>>>>> **************************
>>>>>> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13240, seq 1, length 64
>>>>>> ***************************
>>>>>>
>>>>>> Below is the tcpdump of the packet in the working case, with an
>>>>>> ovn-controller built without the above commit
>>>>>>
>>>>>> **************************
>>>>>> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13308, seq 1, length 64
>>>>>> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 > nusiddiq.blr.redhat.com: ICMP echo reply, id 13308, seq 1, length 64
>>>>>> **************************
>>>>>>
>>>>>> The options data is 00030005.
>>>>>>
>>>>>> From the capture, I could see that the packet from node 1 is missing
>>>>>> the geneve option fields, which carry the inport and outport keys.
>>>>>>
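
As a rough illustration of how the option data quoted above (00030005) can be
read: the sketch below assumes the layout described in ovn-architecture(7),
as I understand it, where the 32-bit value holds 1 reserved bit, a 15-bit
logical ingress port key and a 16-bit logical egress port key. The function
name is hypothetical, not part of OVS or OVN.

```python
import struct
from typing import Tuple


def decode_ovn_geneve_option_data(hex_data: str) -> Tuple[int, int]:
    """Split 4 bytes of option data into (ingress_key, egress_key).

    Assumed layout, MSB to LSB: 1 reserved bit, 15-bit logical ingress
    port key, 16-bit logical egress port key.
    """
    raw = bytes.fromhex(hex_data)
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes of option data")
    (value,) = struct.unpack("!I", raw)   # network byte order
    ingress_key = (value >> 16) & 0x7FFF  # drop the reserved top bit
    egress_key = value & 0xFFFF
    return ingress_key, egress_key


print(decode_ovn_geneve_option_data("00030005"))  # (3, 5)
```

Under that assumption, the working capture carries inport key 3 and outport
key 5, which is the information the failing packet lacks.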
>>>>>
>>>>> I am facing the same issue running my distributed NAT patch set.
>>>>> Between UNSNAT recirc and output to tunnel, a megaflow is installed
>>>>> that is missing the geneve option fields.
>>>>>
>>>>> I verified that the table=32 openflow rule has the geneve option
>>>>> fields. ofproto/trace shows geneve in the "Datapath actions" at the
>>>>> end, so no problem with whatever ofproto/trace is using.
>>>>>
>>>>
>>>> Throwing some logs in, I see that flow->metadata.present.map is 0
>>>> rather than 1 coming into tun_metadata_to_geneve_nlattr() in
>>>> lib/tun-metadata.c, when the problem occurs. That is why the geneve
>>>> option fields are missing.
>>>>
>>>> I have not yet figured out why flow->metadata.present.map is 0. It
>>>> should be modified when tun_metadata_write() is called due to actions
>>>> setting tunnel metadata values. I have not checked that yet.
>>>>
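
To make the symptom described above concrete, here is a toy model (not the
OVS code; the class and function names are hypothetical) of a presence
bitmap gating which tunnel options get serialized. It only illustrates why
a map of 0 yields a packet with no options even if option data was stored.

```python
class ToyTunnelMetadata:
    """Toy stand-in for tunnel metadata with a presence bitmap."""

    def __init__(self):
        self.present_map = 0  # bit i set => option slot i is populated
        self.opts = {}        # slot index -> option bytes

    def write(self, index, data):
        # Storing an option must also mark it present.
        self.opts[index] = data
        self.present_map |= 1 << index


def options_to_emit(md):
    """Return only the options whose presence bit is set."""
    return [md.opts[i] for i in range(64) if md.present_map & (1 << i)]


md = ToyTunnelMetadata()
md.write(0, bytes.fromhex("00030005"))
print(options_to_emit(md))  # one option is emitted

md.present_map = 0          # simulate the map being 0 at serialization
print(options_to_emit(md))  # nothing is emitted
```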
>>>
>>> I just posted a fix. I did not try it with the gateway router or with
>>> OpenStack, but with this bug fix all distributed NAT manual test cases
>>> are now passing.
>>>
>>>
>> Thanks for the fix. I just tested it. It's working when I am trying to
>> reach the VM using its floating IP, but not when trying to ping
>> www.google.com from the VM (SNAT use case).
>>
>
> With distributed NAT, most of my debugging and tests were using SNAT. The
> bug fix that I posted fixed the problem that was causing ICMP echo replies
> to be dropped. The openflow path for distributed SNAT is similar to that
> for SNAT on gateway routers, but there are still some differences, notably
> one router instead of two routers and no "join" switch. Also I did not try
> it with DNS.
>
> Are you able to debug further, to see whether a missing geneve options
> field is still the culprit?
> It is possible that removal of patch ports within br-int uncovered other
> issues.
>


With some testing I could see that, on the node where the gateway is hosted,
the reply packet reaches the gateway router pipeline -> the otls switch
pipeline (via clone) -> the router pipeline -> the peer port of the switch.
The packet gets dropped at table 22:

 table=22, n_packets=275, n_bytes=26686,
priority=65535,ct_state=+inv+trk,metadata=0x1 actions=drop

Not sure why it is happening. I will try to debug further.
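
For anyone scanning a flow dump for this kind of rule, a minimal sketch that
splits a dump-flows style line into fields and flags flows that drop packets
conntrack has marked invalid (+inv). The helper names are made up for
illustration, and the parser only handles simple key=value fields like the
ones in the flow above.

```python
import re


def parse_flow(line):
    """Split a dump-flows style line into match/stat fields and actions."""
    head, _, actions = line.strip().partition(" actions=")
    fields = {}
    for token in re.split(r"[,\s]+", head):
        if not token:
            continue
        key, sep, value = token.partition("=")
        fields[key] = value if sep else True
    fields["actions"] = actions
    return fields


def drops_invalid_conntrack(flow):
    """True if the flow matches ct_state +inv and its only action is drop."""
    return flow.get("actions") == "drop" and "+inv" in str(flow.get("ct_state", ""))


flow_line = (
    " table=22, n_packets=275, n_bytes=26686, "
    "priority=65535,ct_state=+inv+trk,metadata=0x1 actions=drop"
)
flow = parse_flow(flow_line)
print(flow["table"], flow["n_packets"], drops_invalid_conntrack(flow))
```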

Numan



> I primarily used ovs-dpctl dump-flows to see installed megaflows, ovs-appctl
> ofproto/trace (with recirc_id), and ovs-ofctl dump-flows for initial
> debugging. In particular I could see that the installed megaflows were
> lacking the geneve options field in the actions.
>
> Mickey
>
>
>> Numan
>>
>>
>>> Mickey
>>>
>>>
>>>> Mickey
>>>>
>>>>
>>>>> Mickey
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Numan
>>>>>>
>>>>>>
>>>>>> > _______________________________________________
>>>>>> > dev mailing list
>>>>>> > dev at openvswitch.org
>>>>>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

