[ovs-dev] OVS will hit an assert if encap(nsh) is done in bucket of group

Jan Scheurich jan.scheurich at ericsson.com
Sun Mar 25 01:08:36 UTC 2018


Hi Yi,



Part of the seemingly strange behavior of the encap(nsh) action in a group is caused by the (often forgotten) fact that group buckets do not contain action *lists* but action *sets*. I have no idea why it was defined like this when groups were first introduced in OpenFlow 1.1. In my view it was a bad decision that causes a lot of limitations for using groups. But that's the way it is.



In action sets there can only be one action of each kind (except for set_field, where there can be one action per target field). If multiple actions of the same kind are specified, only the last one is taken and the earlier ones are ignored.
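
As a rough illustration of these "last one wins" semantics, here is a minimal C sketch. It is not OVS code: the toy_* enum, structs and function are hypothetical names invented for this example, and the set_field exception (one slot per target field) is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action kinds for illustration; not the OVS ofpact enum. */
enum toy_kind { TOY_ENCAP, TOY_PUSH_VLAN, TOY_OUTPUT, TOY_N_KINDS };

struct toy_action {
    enum toy_kind kind;
    const char *arg;            /* E.g. "nsh" or "ethernet" for TOY_ENCAP. */
};

/* An action set holds at most one action per kind. */
struct toy_action_set {
    const struct toy_action *slot[TOY_N_KINDS];
};

/* Writes the 'n' actions of 'list' into 'set'.  A later action of the same
 * kind silently replaces the earlier one. */
static void
toy_write_action_set(struct toy_action_set *set,
                     const struct toy_action *list, size_t n)
{
    for (size_t i = 0; i < TOY_N_KINDS; i++) {
        set->slot[i] = NULL;
    }
    for (size_t i = 0; i < n; i++) {
        assert(list[i].kind < TOY_N_KINDS);
        set->slot[list[i].kind] = &list[i];
    }
}
```

Writing a list that contains two encap actions into such a set leaves only the second one in the TOY_ENCAP slot.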



Furthermore, the order of execution of the actions in the action set is not given by the order in which they are listed but defined by the OpenFlow standard (see chapter 5.6 of OF spec 1.5.1). Of course the generic encap() and decap() actions are not standardized yet, so the OF spec doesn't specify where to put them in the sequence. We had to implement something that follows the spirit of the specification, knowing that whatever we chose may fit some but won't fit many other legitimate use cases.



OVS's order is defined in ofpacts_execute_action_set() in ofp-actions.c:

OFPACT_STRIP_VLAN

OFPACT_POP_MPLS

OFPACT_DECAP

OFPACT_ENCAP

OFPACT_PUSH_MPLS

OFPACT_PUSH_VLAN

OFPACT_DEC_TTL

OFPACT_DEC_MPLS_TTL

OFPACT_DEC_NSH_TTL

All OFP_ACT SET_FIELD and OFP_ACT_MOVE (target)

OFPACT_SET_QUEUE
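
The effect of this fixed ordering can be pictured with a small C sketch. The enum only borrows the action names from the list above as a stand-in and is not the real OVS definition; the function name is hypothetical. Placing output after set_queue follows the OpenFlow rule that output is the last action executed from an action set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in kinds, declared in execution order; not the real OVS enum. */
enum toy_kind {
    TOY_STRIP_VLAN, TOY_POP_MPLS, TOY_DECAP, TOY_ENCAP, TOY_PUSH_MPLS,
    TOY_PUSH_VLAN, TOY_DEC_TTL, TOY_DEC_MPLS_TTL, TOY_DEC_NSH_TTL,
    TOY_SET_FIELD, TOY_SET_QUEUE, TOY_OUTPUT, TOY_N_KINDS
};

/* Fills 'out' with the kinds found in 'listed', in execution order rather
 * than listing order.  'out' needs room for TOY_N_KINDS entries.
 * Returns the number of entries written. */
static size_t
toy_execution_order(const enum toy_kind *listed, size_t n,
                    enum toy_kind *out)
{
    bool present[TOY_N_KINDS] = { false };
    size_t n_out = 0;

    for (size_t i = 0; i < n; i++) {
        assert(listed[i] < TOY_N_KINDS);
        present[listed[i]] = true;
    }
    for (int k = 0; k < TOY_N_KINDS; k++) {
        if (present[k]) {
            out[n_out++] = (enum toy_kind) k;
        }
    }
    return n_out;
}
```

Whatever order the bucket lists its actions in, the result of this sketch always starts with the encap/decap style actions and ends with output.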



Now, your specific group bucket use case:



   encap(nsh),set_field:<val>->nsh_xxx,output:vxlan_gpe_port



should be a lucky fit and execute as expected, whereas the analogous use case



   encap(nsh),set_field:<val>->nsh_xxx,encap(ethernet), output:ethernet_port



fails with the error



   Dropping packet as encap(ethernet) is not supported for packet type ethernet.



because the second encap action, encap(ethernet), replaces the encap(nsh) in the action set and is executed first, on the original received Ethernet packet. Boom!



So, why does your valid use case cause an assertion failure? It's a consequence of two faults:



  1.  In the conversion of the group bucket's action list to the bucket action set in ofpacts_execute_action_set() the action list is filtered with ofpact_is_set_or_move_action() to select the set_field actions. This function incorrectly flagged OFPACT_ENCAP, OFPACT_DECAP and OFPACT_DEC_NSH_TTL as set_field actions. That's why the encap(nsh) action is wrongly copied twice to the action set.

  2.  The translation of the second encap(nsh) action in the action set doesn't change the packet_type as it is already (1,0x894f). Hence, the commit_packet_type_change() triggered at output to the vxlan_gpe port fails to generate a second encap_nsh datapath action. The logic here is obviously not complete enough to cover the NSH in NSH use case that we intended to support and must be enhanced.
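
Fault 1 can be pictured with a toy C model. It assumes, purely for illustration, that the execution list is built by copying the last action of each ordered kind and then copying every action accepted by a "set or move" predicate. All toy_* names are hypothetical and this is not the OVS source. When the predicate also accepts encap, the encap action ends up in the list twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_ENCAP, TOY_SET_FIELD, TOY_OUTPUT };

typedef bool toy_filter(enum toy_kind);

/* Buggy predicate: wrongly treats encap as a set_field-like action. */
static bool
toy_is_set_or_move_buggy(enum toy_kind k)
{
    return k == TOY_SET_FIELD || k == TOY_ENCAP;
}

/* Fixed predicate: only set_field qualifies. */
static bool
toy_is_set_or_move_fixed(enum toy_kind k)
{
    return k == TOY_SET_FIELD;
}

/* Appends the last action of 'kind' found in 'in' to 'out', if any. */
static size_t
toy_copy_last(const enum toy_kind *in, size_t n, enum toy_kind kind,
              enum toy_kind *out, size_t n_out)
{
    for (size_t i = n; i > 0; i--) {
        if (in[i - 1] == kind) {
            out[n_out++] = in[i - 1];
            break;
        }
    }
    return n_out;
}

/* Appends every action of 'in' accepted by 'filter' to 'out'. */
static size_t
toy_copy_all(const enum toy_kind *in, size_t n, toy_filter *filter,
             enum toy_kind *out, size_t n_out)
{
    for (size_t i = 0; i < n; i++) {
        if (filter(in[i])) {
            out[n_out++] = in[i];
        }
    }
    return n_out;
}

/* Builds the execution list: encap, then set_field-like actions, then
 * output.  'out' needs room for n + 2 entries.  Returns the list length. */
static size_t
toy_build_list(const enum toy_kind *in, size_t n, toy_filter *filter,
               enum toy_kind *out)
{
    size_t n_out = 0;

    assert(filter != NULL);
    n_out = toy_copy_last(in, n, TOY_ENCAP, out, n_out);
    n_out = toy_copy_all(in, n, filter, out, n_out);
    n_out = toy_copy_last(in, n, TOY_OUTPUT, out, n_out);
    return n_out;
}
```

For a bucket of encap, set_field, output, the buggy predicate yields a four-entry list with encap in the first two positions, while the fixed predicate yields the expected three entries.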


The commit of the changes to the NSH header in commit_set_nsh_action() then triggers the assertion failure because the translation of the second encap(nsh) action overwrote the original nsh_np (0x3 for Ethernet in NSH) in the flow with 0x4 (for NSH in NSH). Since modifying nsh_np with set_field is not allowed, this mismatch is what triggers the assertion.
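
The chain of events can be replayed with a heavily simplified C model. It assumes that translating encap(nsh) first commits pending changes and then rewrites the flow, and that a commit emits a datapath encap only when the packet_type differs from the already committed one. All toy_* names are hypothetical and the structs are not the OVS ones; only the two nsh_np values come from the text above. Where the real code asserts, the model returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_packet_type { TOY_PT_ETHERNET, TOY_PT_NSH };
enum { TOY_NP_NONE = 0, TOY_NP_ETHERNET = 0x3, TOY_NP_NSH = 0x4 };

struct toy_flow {
    enum toy_packet_type packet_type;
    int nsh_np;
};

struct toy_xlate {
    struct toy_flow flow;       /* Flow as modified by translated actions. */
    struct toy_flow base_flow;  /* Flow as already committed. */
    int n_encap_nsh;            /* Datapath encap actions emitted so far. */
};

/* Commits pending changes.  Returns false where the real code would hit
 * the assertion: nsh_np differs although packet_type did not change. */
static bool
toy_commit(struct toy_xlate *x)
{
    assert(x != NULL);
    if (x->flow.packet_type != x->base_flow.packet_type) {
        x->n_encap_nsh++;
        x->base_flow = x->flow;
    }
    return x->flow.nsh_np == x->base_flow.nsh_np;
}

/* Translates encap(nsh): commit what is pending, then rewrite the flow. */
static bool
toy_encap_nsh(struct toy_xlate *x)
{
    bool ok = toy_commit(x);

    x->flow.nsh_np = (x->flow.packet_type == TOY_PT_NSH
                      ? TOY_NP_NSH : TOY_NP_ETHERNET);
    x->flow.packet_type = TOY_PT_NSH;
    return ok;
}
```

In this model, one toy_encap_nsh() followed by toy_commit() at output succeeds with a single datapath encap. Two calls to toy_encap_nsh() leave nsh_np at 0x4 in the flow and 0x3 in the base flow with an unchanged packet_type, so the final toy_commit() reports the mismatch and no second encap is emitted.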


I believe this assertion to be correct. It did detect the combination of the above two faults.



The solution to 1 is trivial. I'll post a bug fix straight away. That should suffice for your problem.

The solution to 2 requires a bit more thinking. I will send a fix when I have found it.



BR, Jan



> -----Original Message-----

> From: Yang, Yi [mailto:yi.y.yang at intel.com]

> Sent: Friday, 23 March, 2018 08:55

> To: Jan Scheurich <jan.scheurich at ericsson.com>

> Cc: dev at openvswitch.org; Zoltán Balogh <zoltan.balogh at ericsson.com>

> Subject: Re: OVS will hit an assert if encap(nsh) is done in bucket of group

>

> On Fri, Mar 23, 2018 at 07:51:45AM +0000, Jan Scheurich wrote:

> > Hi Yi,

> >

> > Could you please provide the OF pipeline (flows and groups) and an ofproto/trace command that triggers that fault?

> >

> > Thanks, Jan

>

> Hi, Jan

>

> my br-int has the below ports:

>

>  1(dpdk0): addr:08:00:27:c6:9f:ff

>      config:     0

>      state:      LIVE

>      current:    1GB-FD AUTO_NEG

>      speed: 1000 Mbps now, 0 Mbps max

>  2(vxlangpe1): addr:16:04:0c:e5:f1:2c

>      config:     0

>      state:      LIVE

>      speed: 0 Mbps now, 0 Mbps max

>  3(vxlan1): addr:da:1e:fb:2b:c8:63

>      config:     0

>      state:      LIVE

>      speed: 0 Mbps now, 0 Mbps max

>  4(veth-br): addr:92:3d:e0:ab:c2:85

>      config:     0

>      state:      LIVE

>      current:    10GB-FD COPPER

>      speed: 10000 Mbps now, 0 Mbps max

>  LOCAL(br-int): addr:08:00:27:c6:9f:ff

>      config:     0

>      state:      LIVE

>      current:    10MB-FD COPPER

>      speed: 10 Mbps now, 0 Mbps max

>

> Here are group and flow

>

> ${OFCTL} -Oopenflow13 add-group br-int group_id=111,type=all,bucket="load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9->NXM_NX_TUN_ID[0..31],output:3",bucket="encap(nsh(md_type=1)),set_field:0x80->nsh_flags,set_field:0x3456->nsh_spi,set_field:253->nsh_si,set_field:0x11111111->nsh_c1,set_field:0x22222222->nsh_c2,set_field:0x33333333->nsh_c3,set_field:0x44444444->nsh_c4,load:0xc0a83249->NXM_NX_TUN_IPV4_DST[],load:0x9->NXM_NX_TUN_ID[0..31],output:2"

> ${OFCTL} -Oopenflow13 add-flow br-int in_port=4,icmp,actions=group:111

>

> Then I run ping with this command: "sudo ip netns exec app ping -c 1 192.168.2.2". It hits the flow "in_port=4,icmp,actions=group:111", which then results in this assert.

>

> >

> > > -----Original Message-----

> > > From: Yang, Yi [mailto:yi.y.yang at intel.com]

> > > Sent: Friday, 23 March, 2018 04:53

> > > To: dev at openvswitch.org<mailto:dev at openvswitch.org>

> > > Cc: Jan Scheurich <jan.scheurich at ericsson.com<mailto:jan.scheurich at ericsson.com>>; Zoltán Balogh <zoltan.balogh at ericsson.com<mailto:zoltan.balogh at ericsson.com>>

> > > Subject: OVS will hit an assert if encap(nsh) is done in bucket of group

> > >

> > > Hi, guys

> > >

> > > An NSH user found that OVS will hit the assert below in function

> > > commit_set_nsh_action in file lib/odp-util.c if encap(nsh) is done in

> > > bucket of group

> > >

> > >     ovs_assert(flow->nsh.mdtype == base_flow->nsh.mdtype &&

> > >                flow->nsh.np == base_flow->nsh.np);

> > >

> > > But it isn't an issue if encap(nsh) in actions=.

> > >

> > > I debugged this issue but can't find the root cause, basically

> > > xlate_generic_encap_action is called twice for a packet in different

> > > code path in group bucket use case. one is upcall process, another one

> > > is the normal process in two code paths. The gdb call stack dumps are as follows.

> > >

> > > The second call is obviously based on the result of the first call which

> > > has been committed by xlate_commit_actions in first call

> > > xlate_generic_encap_action, so flow->nsh.np will be set to NSH_P_NSH,

> > > this is wrong, but it can work normally if I comment out the above

> > > assert. I really don't know why xlate_generic_encap_action is called

> > > twice in group bucket use case, so look forward to your insights,

> > > I would greatly appreciate your feedback.

> > >

> > > (gdb) bt

> > > #0  xlate_generic_encap_action (encap=0x7ffd9458f460,

> > > ctx=0x7ffd94590f30)

> > >     at ofproto/ofproto-dpif-xlate.c:5913

> > > #1  do_xlate_actions (ofpacts=<optimized out>, ofpacts_len=<optimized

> > > out>,

> > >     ctx=ctx at entry=0x7ffd94590f30,

> > > is_last_action=is_last_action at entry=false)

> > >     at ofproto/ofproto-dpif-xlate.c:6499

> > > #2  0x000000000074add0 in xlate_group_bucket

> > > (ctx=ctx at entry=0x7ffd94590f30,

> > >     is_last_action=<optimized out>, bucket=0x2214f90, bucket=0x2214f90)

> > >     at ofproto/ofproto-dpif-xlate.c:4090

> > > #3  0x0000000000749ee4 in xlate_all_group (is_last_action=false,

> > >     group=0x2214bc0, ctx=0x7ffd94590f30) at

> > > ofproto/ofproto-dpif-xlate.c:4150

> > > #4  xlate_group_action__ (is_last_action=<optimized out>,

> > > group=0x2214bc0,

> > >     ctx=0x7ffd94590f30) at ofproto/ofproto-dpif-xlate.c:4304

> > > #5  xlate_group_action (is_last_action=<optimized out>,

> > >     group_id=<optimized out>, ctx=0x7ffd94590f30)

> > >     at ofproto/ofproto-dpif-xlate.c:4335

> > > #6  do_xlate_actions (ofpacts=ofpacts at entry=0x2212558,

> > >     ofpacts_len=ofpacts_len at entry=8, ctx=ctx at entry=0x7ffd94590f30,

> > >     is_last_action=is_last_action at entry=true)

> > >     at ofproto/ofproto-dpif-xlate.c:6177

> > > #7  0x0000000000750041 in xlate_actions (xin=xin at entry=0x7ffd945919d0,

> > >     xout=xout at entry=0x7ffd94591dd0) at ofproto/ofproto-dpif-xlate.c:7090

> > > #8  0x0000000000741f00 in upcall_xlate (wc=0x7ffd94592ff8,


> > >     odp_actions=0x7ffd945927d0, upcall=0x7ffd94591d70, udpif=0x1805440)

> > >     at ofproto/ofproto-dpif-upcall.c:1162

> > > #9  process_upcall (udpif=udpif at entry=0x1805440,

> > >     upcall=upcall at entry=0x7ffd94591d70,

> > >     odp_actions=odp_actions at entry=0x7ffd945927d0,

> > > wc=wc at entry=0x7ffd94592ff8)

> > >     at ofproto/ofproto-dpif-upcall.c:1361

> > > #10 0x000000000074244b in upcall_cb (packet=<optimized out>,

> > >     flow=0x7ffd94592d60, ufid=<optimized out>, pmd_id=<optimized out>,

> > >     type=<optimized out>, userdata=<optimized out>,

> > > actions=0x7ffd945927d0,

> > >     wc=0x7ffd94592ff8, put_actions=0x7ffd94592810, aux=0x1805440)

> > >     at ofproto/ofproto-dpif-upcall.c:1263

> > > #11 0x000000000076b2d6 in dp_netdev_upcall

> > > (packet_=packet_ at entry=0x2211680,

> > >     flow=flow at entry=0x7ffd94592d60, wc=wc at entry=0x7ffd94592ff8,

> > >     ufid=ufid at entry=0x7ffd945927b0, type=type at entry=DPIF_UC_MISS,

> > >     userdata=userdata at entry=0x0, actions=actions at entry=0x7ffd945927d0,

> > >     put_actions=put_actions at entry=0x7ffd94592810, pmd=<optimized out>,

> > >     pmd=<optimized out>) at lib/dpif-netdev.c:4868

> > > #12 0x00000000007725fd in handle_packet_upcall

> > > (put_actions=0x7ffd94592810,

> > >     actions=0x7ffd945927d0, key=0x7ffd94593c40, packet=0x2211680,

> > >     pmd=0x18a3e90) at lib/dpif-netdev.c:5079

> > > #13 fast_path_processing (pmd=pmd at entry=0x18a3e90,

> > >     packets_=packets_ at entry=0x7ffd94593fa0,

> > > keys=keys at entry=0x7ffd94593c40,


> > >     batches=batches at entry=0x7ffd94593ae0,

> > >     n_batches=n_batches at entry=0x7ffd94593f28, in_port=<optimized out>)

> > >     at lib/dpif-netdev.c:5187

> > > #14 0x0000000000772ea8 in dp_netdev_input__ (pmd=pmd at entry=0x18a3e90,

> > >     packets=packets at entry=0x7ffd94593fa0,

> > >     md_is_valid=md_is_valid at entry=false, port_no=port_no at entry=5)

> > >     at lib/dpif-netdev.c:5259

> > > #15 0x000000000077310d in dp_netdev_input (port_no=5,

> > > packets=0x7ffd94593fa0,

> > >     pmd=0x18a3e90) at lib/dpif-netdev.c:5287

> > > #16 dp_netdev_process_rxq_port (pmd=pmd at entry=0x18a3e90, rxq=0x1d4f2a0,

> > >     port_no=5) at lib/dpif-netdev.c:3286

> > > #17 0x0000000000773b02 in dpif_netdev_run (dpif=<optimized out>)

> > >     at lib/dpif-netdev.c:3940

> > > #18 0x0000000000733a18 in type_run (type=<optimized out>)

> > >     at ofproto/ofproto-dpif.c:342

> > > #19 0x000000000071f9cf in ofproto_type_run (datapath_type=<optimized

> > > out>,

> > >     datapath_type at entry=0x1d50ab0 "netdev") at ofproto/ofproto.c:1707

> > > #20 0x000000000070f955 in bridge_run__ () at vswitchd/bridge.c:2931

> > > #21 0x00000000007153c8 in bridge_run () at vswitchd/bridge.c:2995

> > > #22 0x0000000000415485 in main (argc=5, argv=0x7ffd94594568)

> > >     at vswitchd/ovs-vswitchd.c:120

> > > (gdb) bt

> > > #0  xlate_generic_encap_action (encap=0x7ffd9458f478,

> > > ctx=0x7ffd94590f30)

> > >     at ofproto/ofproto-dpif-xlate.c:5913

> > > #1  do_xlate_actions (ofpacts=<optimized out>, ofpacts_len=<optimized

> > > out>,

> > >     ctx=ctx at entry=0x7ffd94590f30,

> > > is_last_action=is_last_action at entry=false)

> > >     at ofproto/ofproto-dpif-xlate.c:6499

> > > #2  0x000000000074add0 in xlate_group_bucket

> > > (ctx=ctx at entry=0x7ffd94590f30,

> > >     is_last_action=<optimized out>, bucket=0x2214f90, bucket=0x2214f90)

> > >     at ofproto/ofproto-dpif-xlate.c:4090

> > > #3  0x0000000000749ee4 in xlate_all_group (is_last_action=false,

> > >     group=0x2214bc0, ctx=0x7ffd94590f30) at

> > > ofproto/ofproto-dpif-xlate.c:4150

> > > #4  xlate_group_action__ (is_last_action=<optimized out>,

> > > group=0x2214bc0,

> > >     ctx=0x7ffd94590f30) at ofproto/ofproto-dpif-xlate.c:4304

> > > #5  xlate_group_action (is_last_action=<optimized out>,

> > >     group_id=<optimized out>, ctx=0x7ffd94590f30)

> > >     at ofproto/ofproto-dpif-xlate.c:4335

> > > #6  do_xlate_actions (ofpacts=ofpacts at entry=0x2212558,

> > >     ofpacts_len=ofpacts_len at entry=8, ctx=ctx at entry=0x7ffd94590f30,

> > >     is_last_action=is_last_action at entry=true)

> > >     at ofproto/ofproto-dpif-xlate.c:6177

> > > #7  0x0000000000750041 in xlate_actions (xin=xin at entry=0x7ffd945919d0,

> > >     xout=xout at entry=0x7ffd94591dd0) at ofproto/ofproto-dpif-xlate.c:7090

> > > #8  0x0000000000741f00 in upcall_xlate (wc=0x7ffd94592ff8,


> > >     odp_actions=0x7ffd945927d0, upcall=0x7ffd94591d70, udpif=0x1805440)

> > >     at ofproto/ofproto-dpif-upcall.c:1162

> > > #9  process_upcall (udpif=udpif at entry=0x1805440,

> > >     upcall=upcall at entry=0x7ffd94591d70,

> > >     odp_actions=odp_actions at entry=0x7ffd945927d0,

> > > wc=wc at entry=0x7ffd94592ff8)

> > >     at ofproto/ofproto-dpif-upcall.c:1361

> > > #10 0x000000000074244b in upcall_cb (packet=<optimized out>,

> > >     flow=0x7ffd94592d60, ufid=<optimized out>, pmd_id=<optimized out>,

> > >     type=<optimized out>, userdata=<optimized out>,

> > > actions=0x7ffd945927d0,

> > >     wc=0x7ffd94592ff8, put_actions=0x7ffd94592810, aux=0x1805440)

> > >     at ofproto/ofproto-dpif-upcall.c:1263

> > > #11 0x000000000076b2d6 in dp_netdev_upcall

> > > (packet_=packet_ at entry=0x2211680,

> > >     flow=flow at entry=0x7ffd94592d60, wc=wc at entry=0x7ffd94592ff8,

> > >     ufid=ufid at entry=0x7ffd945927b0, type=type at entry=DPIF_UC_MISS,

> > >     userdata=userdata at entry=0x0, actions=actions at entry=0x7ffd945927d0,

> > >     put_actions=put_actions at entry=0x7ffd94592810, pmd=<optimized out>,

> > >     pmd=<optimized out>) at lib/dpif-netdev.c:4868

> > > #12 0x00000000007725fd in handle_packet_upcall

> > > (put_actions=0x7ffd94592810,

> > >     actions=0x7ffd945927d0, key=0x7ffd94593c40, packet=0x2211680,

> > >     pmd=0x18a3e90) at lib/dpif-netdev.c:5079

> > > #13 fast_path_processing (pmd=pmd at entry=0x18a3e90,

> > >     packets_=packets_ at entry=0x7ffd94593fa0,

> > > keys=keys at entry=0x7ffd94593c40,


> > >     batches=batches at entry=0x7ffd94593ae0,

> > >     n_batches=n_batches at entry=0x7ffd94593f28, in_port=<optimized out>)

> > >     at lib/dpif-netdev.c:5187

> > > #14 0x0000000000772ea8 in dp_netdev_input__ (pmd=pmd at entry=0x18a3e90,

> > >     packets=packets at entry=0x7ffd94593fa0,

> > >     md_is_valid=md_is_valid at entry=false, port_no=port_no at entry=5)

> > >     at lib/dpif-netdev.c:5259

> > > #15 0x000000000077310d in dp_netdev_input (port_no=5,

> > > packets=0x7ffd94593fa0,

> > >     pmd=0x18a3e90) at lib/dpif-netdev.c:5287

> > > #16 dp_netdev_process_rxq_port (pmd=pmd at entry=0x18a3e90, rxq=0x1d4f2a0,

> > >     port_no=5) at lib/dpif-netdev.c:3286

> > > #17 0x0000000000773b02 in dpif_netdev_run (dpif=<optimized out>)

> > >     at lib/dpif-netdev.c:3940

> > > #18 0x0000000000733a18 in type_run (type=<optimized out>)

> > >     at ofproto/ofproto-dpif.c:342

> > > #19 0x000000000071f9cf in ofproto_type_run (datapath_type=<optimized

> > > out>,

> > >     datapath_type at entry=0x1d50ab0 "netdev") at ofproto/ofproto.c:1707

> > > #20 0x000000000070f955 in bridge_run__ () at vswitchd/bridge.c:2931

> > > #21 0x00000000007153c8 in bridge_run () at vswitchd/bridge.c:2995

> > > #22 0x0000000000415485 in main (argc=5, argv=0x7ffd94594568)

> > >     at vswitchd/ovs-vswitchd.c:120

> > > (gdb)

> > >

> > > (gdb) bt

> > > #0  xlate_generic_encap_action (encap=0x7f7df3ff81c0,

> > > ctx=0x7f7df3ff9c90)

> > >     at ofproto/ofproto-dpif-xlate.c:5913

> > > #1  do_xlate_actions (ofpacts=<optimized out>, ofpacts_len=<optimized

> > > out>,

> > >     ctx=ctx at entry=0x7f7df3ff9c90,

> > > is_last_action=is_last_action at entry=false)

> > >     at ofproto/ofproto-dpif-xlate.c:6499

> > > #2  0x000000000074add0 in xlate_group_bucket

> > > (ctx=ctx at entry=0x7f7df3ff9c90,

> > >     is_last_action=<optimized out>, bucket=0x2214f90, bucket=0x2214f90)

> > >     at ofproto/ofproto-dpif-xlate.c:4090

> > > #3  0x0000000000749ee4 in xlate_all_group (is_last_action=false,

> > >     group=0x2214bc0, ctx=0x7f7df3ff9c90) at

> > > ofproto/ofproto-dpif-xlate.c:4150

> > > #4  xlate_group_action__ (is_last_action=<optimized out>,

> > > group=0x2214bc0,

> > >     ctx=0x7f7df3ff9c90) at ofproto/ofproto-dpif-xlate.c:4304

> > > #5  xlate_group_action (is_last_action=<optimized out>,

> > >     group_id=<optimized out>, ctx=0x7f7df3ff9c90)

> > >     at ofproto/ofproto-dpif-xlate.c:4335

> > > #6  do_xlate_actions (ofpacts=ofpacts at entry=0x2212558,

> > >     ofpacts_len=ofpacts_len at entry=8, ctx=ctx at entry=0x7f7df3ff9c90,

> > >     is_last_action=is_last_action at entry=true)

> > >     at ofproto/ofproto-dpif-xlate.c:6177

> > > #7  0x0000000000750041 in xlate_actions (xin=xin at entry=0x7f7df3ffa130,

> > >     xout=xout at entry=0x7f7df3ffaa50) at ofproto/ofproto-dpif-xlate.c:7090

> > > #8  0x000000000073fccd in xlate_key (key=<optimized out>,


> > >     len=<optimized out>, push=push at entry=0x7f7df3ffa4d0,

> > >     ctx=ctx at entry=0x7f7df3ffaa30, udpif=<optimized out>)

> > >     at ofproto/ofproto-dpif-upcall.c:2055

> > > #9  0x000000000074031a in xlate_ukey (ukey=0x2214300, ukey=0x2214300,

> > >     ctx=0x7f7df3ffaa30, tcp_flags=<optimized out>, udpif=0x1805440)

> > >     at ofproto/ofproto-dpif-upcall.c:2070

> > > #10 revalidate_ukey__ (udpif=udpif at entry=0x1805440,

> > >     ukey=ukey at entry=0x2214300, tcp_flags=<optimized out>,

> > >     odp_actions=0x7f7df3ffae60, recircs=recircs at entry=0x7f7df3ffae50,

> > >     xcache=<optimized out>) at ofproto/ofproto-dpif-upcall.c:2116

> > > #11 0x00000000007405a6 in revalidate_ukey (udpif=udpif at entry=0x1805440,

> > >     ukey=ukey at entry=0x2214300, stats=stats at entry=0x7f7df3ffbb98,

> > >     odp_actions=odp_actions at entry=0x7f7df3ffae60,

> > >     reval_seq=reval_seq at entry=527957,

> > > recircs=recircs at entry=0x7f7df3ffae50)

> > >     at ofproto/ofproto-dpif-upcall.c:2218

> > > #12 0x0000000000743523 in revalidate (revalidator=0x18078b0)

> > >     at ofproto/ofproto-dpif-upcall.c:2522

> > > #13 0x000000000074362b in udpif_revalidator (arg=0x18078b0)

> > >     at ofproto/ofproto-dpif-upcall.c:910

> > > #14 0x00000000007eb344 in ovsthread_wrapper (aux_=<optimized out>)

> > >     at lib/ovs-thread.c:348

> > > #15 0x00007f7e06697184 in start_thread (arg=0x7f7df3fff700)


> > >     at pthread_create.c:312

> > > #16 0x00007f7e05cab03d in clone ()

> > >     at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

> > > (gdb)

