[ovs-discuss] [ovs-dev] [ovn] no tunnel from GW to compute

Tony Liu tonyliu0592 at hotmail.com
Wed Oct 28 20:36:28 UTC 2020


Thanks Numan for the hint! I thought it was about OF flows and DP flows
and didn't realize the problem could be upstream. I checked the NB DB and
found an LRP left over from a previous deletion. Someone was testing a
deployment with a network and VMs, destroying and recreating it
back-to-back. It's likely an issue in the Neutron OVN ML2 driver, which
is responsible for programming the NB DB. Somehow the driver didn't get
a chance to finish the deletion before the new creation started. I
manually deleted the stale LRP, and everything started working fine.
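
A minimal sketch of how such a leftover LRP could be spotted, assuming
the Neutron OVN driver names each LRP "lrp-<neutron port id>" (an
assumption, check it against your release). The two input lists are
hypothetical; in practice they could come from something like
"ovn-nbctl --bare --columns=name list Logical_Router_Port" and from the
router port IDs known to Neutron.

```python
# Sketch only: flag NB DB LRPs that have no matching Neutron port.
# Assumption: LRP names follow the pattern "lrp-<neutron port id>".
LRP_PREFIX = "lrp-"


def find_stale_lrps(nb_lrp_names, neutron_port_ids):
    """Return the NB LRP names with no matching Neutron port, sorted."""
    known = set(neutron_port_ids)
    stale = []
    for name in nb_lrp_names:
        if not name.startswith(LRP_PREFIX):
            # Not created by the Neutron driver under our naming assumption.
            continue
        if name[len(LRP_PREFIX):] not in known:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the deployment in this thread.
    nb_names = ["lrp-aaa", "lrp-bbb", "custom-port"]
    neutron_ids = ["aaa"]
    print(find_stale_lrps(nb_names, neutron_ids))
```

Anything the function reports is only a candidate; it should be checked
by hand before being removed (for example with "ovn-nbctl lrp-del").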

Tony
> -----Original Message-----
> From: Numan Siddique <numans at ovn.org>
> Sent: Wednesday, October 28, 2020 1:56 AM
> To: Tony Liu <tonyliu0592 at hotmail.com>
> Cc: ovs-discuss <ovs-discuss at openvswitch.org>; ovs-dev <ovs-
> dev at openvswitch.org>
> Subject: Re: [ovs-dev] [ovn] no tunnel from GW to compute
> 
> On Wed, Oct 28, 2020 at 11:29 AM Tony Liu <tonyliu0592 at hotmail.com>
> wrote:
> >
> > I checked the OF flows for working and non-working FIPs and can't find
> > any difference. The DP flow is installed by vswitchd based on the OF
> > flows when processing the first packet. I enabled debug logging: there
> > are no logs for the working FIP and a few logs for the non-working FIP.
> > Does that mean something is wrong with the non-working FIP?
> 
> Could you share your OVN NB DB ? or ovn-nbctl commands to create the
> resources so that I can try it out locally ?
> 
> Thanks
> Numan
> 
> > ==================================================
> > 2020-10-28T05:15:22.058Z|01203|dpif(handler8)|DBG|system@ovs-system: miss upcall:
> > recirc_id(0),dp_hash(0),skb_priority(0),in_port(4),skb_mark(0),
> > ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),
> > eth(src=e8:1c:ba:9f:b7:c6,dst=fa:16:3e:45:da:61),eth_type(0x0800),
> > ipv4(src=172.16.160.1,dst=10.59.53.8,proto=1,tos=0,ttl=125,frag=no),
> > icmp(type=8,code=0)
> > icmp,vlan_tci=0x0000,dl_src=e8:1c:ba:9f:b7:c6,dl_dst=fa:16:3e:45:da:61,
> > nw_src=172.16.160.1,nw_dst=10.59.53.8,nw_tos=0,nw_ecn=0,nw_ttl=125,
> > icmp_type=8,icmp_code=0 icmp_csum:6013
> > ......
> > 2020-10-28T05:15:22.059Z|00808|vconn|DBG|unix#1: sent (Success):
> > NXT_PACKET_IN2 (OF1.3) (xid=0x0): table_id=24 cookie=0x9173f925
> > total_len=74
> > ct_state=new|trk|dnat,ct_zone=507,ct_nw_src=172.16.160.1,
> > ct_nw_dst=10.59.53.8,ct_nw_proto=1,ct_tp_src=8,ct_tp_dst=0,ip,
> > reg0=0xa0a000a,reg1=0xa0a0001,reg9=0x8,reg10=0x1,reg11=0x1fb,
> > reg12=0x1fa,reg14=0x1,reg15=0x7,metadata=0x121,in_port=1
> > (via action) data_len=74 (unbuffered)
> >
> > userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.f
> > f.ff.00.00.ff.ff.00.18.00.00.23.20.00.06.00.20.00.40.00.00.00.01.de.10
> > .00.00.20.04.ff.ff.00.18.00.00.23.20.00.06.00.20.00.60.00.00.00.01.de.
> > 10.00.00.22.04.00.19.00.10.80.00.2a.02.00.01.00.00.00.00.00.00.ff.ff.0
> > 0.10.00.00.23.20.00.0e.ff.f8.20.00.00.00
> > icmp,vlan_tci=0x0000,dl_src=fa:16:3e:93:f4:1e,dl_dst=00:00:00:00:00:00,
> > nw_src=172.16.160.1,nw_dst=10.10.0.10,nw_tos=0,nw_ecn=0,nw_ttl=124,
> > icmp_type=8,icmp_code=0 icmp_csum:6013
> >
> > ......
> > 2020-10-28T05:15:27.060Z|01102|dpif(handler10)|DBG|system@ovs-system: action upcall:
> > recirc_id(0x3ad25),dp_hash(0),skb_priority(0),in_port(4),skb_mark(0),
> > ct_state(0xa1),ct_zone(0x1fb),ct_mark(0),ct_label(0),
> > ct_tuple4(src=172.16.160.1,dst=10.59.53.8,proto=1,tp_src=8,tp_dst=0),
> > eth(src=fa:16:3e:93:f4:1e,dst=00:00:00:00:00:00),eth_type(0x0800),
> > ipv4(src=172.16.160.1,dst=10.10.0.10,proto=1,tos=0,ttl=124,frag=no),
> > icmp(type=8,code=0)
> > icmp,vlan_tci=0x0000,dl_src=fa:16:3e:93:f4:1e,dl_dst=00:00:00:00:00:00,
> > nw_src=172.16.160.1,nw_dst=10.10.0.10,nw_tos=0,nw_ecn=0,nw_ttl=124,
> > icmp_type=8,icmp_code=0 icmp_csum:6012
> > ......
> > 2020-10-28T05:15:27.061Z|00834|vconn|DBG|unix#1: sent (Success):
> > NXT_PACKET_IN2 (OF1.3) (xid=0x0): table_id=24 cookie=0x9173f925
> > total_len=74
> > ct_state=new|trk|dnat,ct_zone=507,ct_nw_src=172.16.160.1,
> > ct_nw_dst=10.59.53.8,ct_nw_proto=1,ct_tp_src=8,ct_tp_dst=0,ip,
> > reg0=0xa0a000a,reg1=0xa0a0001,reg9=0x8,reg10=0x1,reg11=0x1fb,
> > reg12=0x1fa,reg14=0x1,reg15=0x7,metadata=0x121,in_port=1
> > (via action) data_len=74 (unbuffered)
> >
> > userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.f
> > f.ff.00.00.ff.ff.00.18.00.00.23.20.00.06.00.20.00.40.00.00.00.01.de.10
> > .00.00.20.04.ff.ff.00.18.00.00.23.20.00.06.00.20.00.60.00.00.00.01.de.
> > 10.00.00.22.04.00.19.00.10.80.00.2a.02.00.01.00.00.00.00.00.00.ff.ff.0
> > 0.10.00.00.23.20.00.0e.ff.f8.20.00.00.00
> > icmp,vlan_tci=0x0000,dl_src=fa:16:3e:93:f4:1e,dl_dst=00:00:00:00:00:00,
> > nw_src=172.16.160.1,nw_dst=10.10.0.10,nw_tos=0,nw_ecn=0,nw_ttl=124,
> > icmp_type=8,icmp_code=0 icmp_csum:6012
> > ......
> > 2020-10-28T05:15:32.059Z|01359|dpif(handler8)|DBG|system@ovs-system: action upcall:
> > recirc_id(0x3ad25),dp_hash(0),skb_priority(0),in_port(4),skb_mark(0),
> > ct_state(0xa1),ct_zone(0x1fb),ct_mark(0),ct_label(0),
> > ct_tuple4(src=172.16.160.1,dst=10.59.53.8,proto=1,tp_src=8,tp_dst=0),
> > eth(src=fa:16:3e:93:f4:1e,dst=00:00:00:00:00:00),eth_type(0x0800),
> > ipv4(src=172.16.160.1,dst=10.10.0.10,proto=1,tos=0,ttl=124,frag=no),
> > icmp(type=8,code=0)
> > icmp,vlan_tci=0x0000,dl_src=fa:16:3e:93:f4:1e,dl_dst=00:00:00:00:00:00,
> > nw_src=172.16.160.1,nw_dst=10.10.0.10,nw_tos=0,nw_ecn=0,nw_ttl=124,
> > icmp_type=8,icmp_code=0 icmp_csum:6011
> > ......
> > 2020-10-28T05:15:32.060Z|00856|vconn|DBG|unix#1: sent (Success):
> > NXT_PACKET_IN2 (OF1.3) (xid=0x0): table_id=24 cookie=0x9173f925
> > total_len=74
> > ct_state=new|trk|dnat,ct_zone=507,ct_nw_src=172.16.160.1,
> > ct_nw_dst=10.59.53.8,ct_nw_proto=1,ct_tp_src=8,ct_tp_dst=0,ip,
> > reg0=0xa0a000a,reg1=0xa0a0001,reg9=0x8,reg10=0x1,reg11=0x1fb,
> > reg12=0x1fa,reg14=0x1,reg15=0x7,metadata=0x121,in_port=1
> > (via action) data_len=74 (unbuffered)
> >
> > userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.f
> > f.ff.00.00.ff.ff.00.18.00.00.23.20.00.06.00.20.00.40.00.00.00.01.de.10
> > .00.00.20.04.ff.ff.00.18.00.00.23.20.00.06.00.20.00.60.00.00.00.01.de.
> > 10.00.00.22.04.00.19.00.10.80.00.2a.02.00.01.00.00.00.00.00.00.ff.ff.0
> > 0.10.00.00.23.20.00.0e.ff.f8.20.00.00.00
> > icmp,vlan_tci=0x0000,dl_src=fa:16:3e:93:f4:1e,dl_dst=00:00:00:00:00:00,
> > nw_src=172.16.160.1,nw_dst=10.10.0.10,nw_tos=0,nw_ecn=0,nw_ttl=124,
> > icmp_type=8,icmp_code=0 icmp_csum:6011
> > ==================================================
> >
> > Thanks!
> > Tony
> > > -----Original Message-----
> > > From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Tony Liu
> > > Sent: Tuesday, October 27, 2020 2:23 PM
> > > To: ovs-discuss <ovs-discuss at openvswitch.org>; ovs-dev <ovs-
> > > dev at openvswitch.org>
> > > Subject: Re: [ovs-dev] [ovn] no tunnel from GW to compute
> > >
> > > > Saw the same problem again. After recreating the network, attaching
> > > > it to the router and launching a VM, the problem is gone. Probably
> > > > some glitch happened during the early deployment. Any hints on how
> > > > to look into it?
> > >
> > > Thanks!
> > > Tony
> > > > -----Original Message-----
> > > > From: discuss <ovs-discuss-bounces at openvswitch.org> On Behalf Of
> > > > Tony Liu
> > > > Sent: Friday, October 16, 2020 6:48 PM
> > > > To: ovs-discuss <ovs-discuss at openvswitch.org>
> > > > Subject: [ovs-discuss] [ovn] no tunnel from GW to compute
> > > >
> > > > Hi,
> > > >
> > > > > I am seeing an interesting issue today.
> > > > > When pinging a FIP from outside, the request arrives at the GW, but
> > > > > there is no tunnel from the GW to the compute node.
> > > > > When pinging from a VM to outside, egress works fine: the request
> > > > > goes through the tunnel from compute to GW, then out to the
> > > > > external network.
> > > > > The reply arrives at the GW, but there is no tunnel from the GW
> > > > > back to compute.
> > > >
> > > > I checked DP flows on GW and compared working vs. non-working.
> > > >
> > > > > non-working, no tunnel
> > > > > ========================
> > > > > recirc_id(0),in_port(3),ct_state(-new-est-rel-rpl-inv-trk),
> > > > > ct_label(0/0x1),eth(src=e8:1c:ba:9f:b7:c6,dst=fa:16:3e:67:5c:d9),
> > > > > eth_type(0x0800),
> > > > > ipv4(src=128.0.0.0/192.0.0.0,dst=10.59.53.18,proto=1,ttl=63,frag=no),
> > > > > icmp(type=8/0xf8), packets:8, bytes:784, used:0.992s,
> > > > > actions:ct_clear,ct(zone=20,nat),recirc(0x6e1)
> > > > >
> > > > > recirc_id(0x6e1),in_port(3),ct_state(+new-est-rel-rpl-inv+trk),
> > > > > ct_label(0/0x1),eth(),eth_type(0x0800),ipv4(dst=10.59.53.18,frag=no),
> > > > > packets:29, bytes:2842, used:0.992s,
> > > > > actions:ct(commit,zone=21,nat(dst=192.168.1.8)),recirc(0x6e2)
> > > > >
> > > > > recirc_id(0x6e2),in_port(3),ct_state(+new-est-rel-rpl-inv+trk),
> > > > > ct_label(0/0x1),eth(src=e8:1c:ba:9f:b7:c6,dst=fa:16:3e:67:5c:d9),
> > > > > eth_type(0x0800),ipv4(dst=192.168.1.8,proto=1,ttl=63,frag=no),
> > > > > icmp(type=8/0xf8), packets:8, bytes:784, used:0.992s,
> > > > > actions:ct_clear
> > > > > ========================
> > > >
> > > > > working, with tunnel
> > > > > ========================
> > > > > recirc_id(0),in_port(3),ct_state(-new-est-rel-rpl-inv-trk),
> > > > > ct_label(0/0x1),eth(src=e8:1c:ba:9f:b7:c6,dst=fa:16:3e:67:5c:d9),
> > > > > eth_type(0x0800),
> > > > > ipv4(src=128.0.0.0/192.0.0.0,dst=10.59.53.14,proto=1,ttl=63,frag=no),
> > > > > icmp(type=8/0xf8), packets:2, bytes:196, used:3.427s,
> > > > > actions:ct_clear,ct(zone=20,nat),recirc(0x716)
> > > > >
> > > > > recirc_id(0x716),in_port(3),ct_state(+new-est-rel-rpl-inv+trk),
> > > > > ct_label(0/0x1),eth(),eth_type(0x0800),ipv4(dst=10.59.53.14,frag=no),
> > > > > packets:2, bytes:196, used:3.428s,
> > > > > actions:ct(commit,zone=21,nat(dst=192.168.1.5)),recirc(0x717)
> > > > >
> > > > > recirc_id(0x717),in_port(3),ct_state(+new-est-rel-rpl-inv+trk),
> > > > > ct_label(0/0x1),eth(src=e8:1c:ba:9f:b7:c6,dst=fa:16:3e:67:5c:d9),
> > > > > eth_type(0x0800),
> > > > > ipv4(dst=192.168.1.5,proto=1,tos=0/0x3,ttl=63,frag=no),
> > > > > icmp(type=8/0xf8), packets:0, bytes:0, used:never,
> > > > > actions:ct_clear,
> > > > > set(tunnel(tun_id=0x139,dst=10.6.30.63,ttl=64,tp_dst=6081,
> > > > > geneve({class=0x102,type=0x80,len=4,0x2000a}),flags(df|csum|key))),
> > > > > set(eth(src=fa:16:3e:aa:2a:5d,dst=fa:16:3e:a6:79:6f)),
> > > > > set(ipv4(ttl=62)),1
> > > > > ========================
> > > >
> > > > > The difference is in the third flow (0x6e2 vs. 0x717).
> > > > > In the non-working case, "set(tunnel..." is missing.
> > > > > Note that the working VM and the non-working VM are on the same
> > > > > compute node. I want to trace the root cause. Any hints or comments
> > > > > on where and how I should look into it?
> > > >
> > > >
> > > > Thanks!
> > > > Tony
> > > >
> > > > _______________________________________________
> > > > discuss mailing list
> > > > discuss at openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> > > _______________________________________________
> > > dev mailing list
> > > dev at openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
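
For reference, the manual comparison of the third DP flow from the first
message in this thread (0x6e2 vs. 0x717) can be sketched in a few lines
of Python. This is only an illustration: the helper names are made up,
and the parsing assumes the "..., actions:a,b(c),d" layout seen in the
flow dumps quoted above, not a complete grammar for datapath flows. The
match part of the sample flows is shortened; the actions are copied from
the thread.

```python
def split_top_level(text):
    """Split on commas that are not nested inside parentheses or braces."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def action_name(item):
    """Shorten one action, e.g. 'set(tunnel(...))' becomes 'set(tunnel)'."""
    head, _, rest = item.partition("(")
    if head == "set" and rest:
        return "set(" + rest.split("(", 1)[0] + ")"
    return head


def action_names(flow):
    """Return the shortened top-level actions of one DP flow line."""
    _, _, actions = flow.partition("actions:")
    return [action_name(item) for item in split_top_level(actions)]


def missing_actions(working, broken):
    """Actions present in the working flow but absent from the broken one."""
    have = set(action_names(broken))
    return [name for name in action_names(working) if name not in have]


if __name__ == "__main__":
    working = (
        "recirc_id(0x717),in_port(3), packets:0, bytes:0, used:never, "
        "actions:ct_clear,set(tunnel(tun_id=0x139,dst=10.6.30.63,ttl=64,"
        "tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x2000a}),"
        "flags(df|csum|key))),"
        "set(eth(src=fa:16:3e:aa:2a:5d,dst=fa:16:3e:a6:79:6f)),"
        "set(ipv4(ttl=62)),1"
    )
    broken = (
        "recirc_id(0x6e2),in_port(3), packets:8, bytes:784, used:0.992s, "
        "actions:ct_clear"
    )
    print(missing_actions(working, broken))
```

A flow that ends in a bare ct_clear with no output action drops the
packet, which matches the "no tunnel" symptom, but it does not say why
the tunnel action is absent; in this thread the cause turned out to be
the stale LRP in the NB DB.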

