[ovs-discuss] MPLS double POP bug?

Pravin Shelar pshelar at nicira.com
Mon Nov 24 23:37:09 UTC 2014


On Mon, Nov 24, 2014 at 9:49 AM, Stefano Salsano
<stefano.salsano at uniroma2.it> wrote:
> Il 21/11/2014 11:04, Stefano Salsano ha scritto:
>>
>> Il 12/11/2014 01:19, Ben Pfaff ha scritto:
>>>
>>> On Tue, Nov 11, 2014 at 11:25:34PM +0100, Stefano Salsano wrote:
>>>>
>>>> we need to POP two MPLS labels in an egress node and then send
>>>> out the packet as a plain IP packet, but apparently this is not
>>>> working in Open vSwitch v2.3.90 operating in kernel mode
>>
>>
>
> [...]
>>
>>
>> we are now trying to add the correct check in compose_mpls_pop_action:
>>
>> } else if (n >= ctx->xbridge->max_mpls_depth) {
>>     COVERAGE_INC(xlate_actions_mpls_overflow);
>>     ctx->xout->slow |= SLOW_ACTION;
>> }
>>
>> and we will let you know the results
>
>
> Hi all,
>
> we added the if condition that forwards the packet to user space (slow path)
> in case of POP with multiple MPLS labels (see patch attached)
>
> the result is that the packet is correctly forwarded, but there is a problem
> in updating the stats
>
> unfortunately we do not have a clear picture of the architecture, but we see
> that both the "handler" thread and the "revalidator" thread process the flow
>
> the "revalidator" thread returns this error message: "resubmit actions
> recursed over 64 times"
>
> it seems that both the handler and the revalidator threads receive as input
> this flow description:
>
> mpls,in_port=4,vlan_tci=0x0000,dl_src=5a:93:39:89:21:cf,dl_dst=22:4e:63:a1:65:37,
> mpls_label=524292,mpls_tc=0,mpls_ttl=64,mpls_bos=0,mpls_lse1=0,mpls_lse2=0
>
> which we believe is wrong, as it shows two inner MPLS labels set to 0
>
> the correct flow in our opinion should be like this (that we have taken from
> a debug print in the handler thread)
> mpls,in_port=1,vlan_tci=0x0000,dl_src=5a:93:39:89:21:cf,dl_dst=22:4e:63:a1:65:37,
> mpls_label=524292,mpls_tc=0,mpls_ttl=64,mpls_bos=0,mpls_lse1=1073742144
>
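To make the two flow descriptions above easier to compare, here is a minimal sketch that splits a 32-bit MPLS label stack entry into its fields, assuming the standard layout (20-bit label, 3-bit TC, 1-bit bottom-of-stack, 8-bit TTL). The function name decode_mpls_lse is illustrative and is not part of Open vSwitch.

```python
def decode_mpls_lse(lse):
    """Split a 32-bit MPLS label stack entry into its four fields."""
    return {
        "label": (lse >> 12) & 0xFFFFF,  # bits 31..12: 20-bit label
        "tc": (lse >> 9) & 0x7,          # bits 11..9: traffic class
        "bos": (lse >> 8) & 0x1,         # bit 8: bottom-of-stack flag
        "ttl": lse & 0xFF,               # bits 7..0: time to live
    }


# mpls_lse1 value taken from the handler-thread debug print
inner = decode_mpls_lse(1073742144)
print(inner)
```

Under that assumption, mpls_lse1=1073742144 (0x40000140) decodes to label 262144 with bos=1 and ttl=64, which is the inner label pushed by peo1, whereas an entry of 0 carries no label information at all.
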
> it seems that the odp_flow_key_to_flow__ function
> (https://github.com/openvswitch/ovs/blob/master/lib/odp-util.c#L3590) is not
> able to retrieve the correct flow both in the handler and in the revalidator
> thread (result = ODP_FIT_TOO_LITTLE)
>
> in the handler thread the function flow_extract is called afterwards
> (https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-upcall.c#L634),
> setting the correct flow information and allowing the forwarding of the
> packet, while in the revalidator thread this does not happen and the
> revalidator is stuck with the wrong flow information
>
> do you have any hints on how we can fix the problem?
>

Multiple MPLS push and pop actions should be handled without a userspace
action. Multiple MPLS pushes can be a single action; the kernel datapath
supports this operation. Multiple pops can be handled by recirculating
after every MPLS pop action.
If you are going to investigate this further, look at
ofpact_needs_recirculation_after_mpls() and fix it to return true for
multiple MPLS pops.
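
To illustrate what "recirculating after every MPLS pop" means for the action list in your log, here is a toy sketch. It is not the Open vSwitch translation code; split_into_passes and the bare "recirc" token are hypothetical (the real datapath action carries a recirculation ID).

```python
def split_into_passes(actions):
    """Group datapath actions so that no pass contains more than one pop_mpls."""
    passes = [[]]
    for action in actions:
        passes[-1].append(action)
        if action.startswith("pop_mpls"):
            # Close the current pass right after each pop.
            passes.append([])
    if not passes[-1]:
        # Drop the empty trailing pass left when the list ends with a pop.
        passes.pop()
    # Every pass except the last hands the packet over via recirculation.
    return [p + ["recirc"] for p in passes[:-1]] + passes[-1:]


for number, actions in enumerate(split_into_passes(
        ["pop_mpls(eth_type=0x8847)", "pop_mpls(eth_type=0x800)", "5"])):
    print(number, ",".join(actions))
```

With the failing action list pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),5 this yields three passes, each holding at most one pop, so the datapath never has to pop two labels in one pass.
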

> thank you
> Stefano & Pier Luigi
>
>
>
>
>>
>>
>> *****************************************************************
>> Detailed Report of the problem sent on 18/11/2014
>>
>> 1) recirculation does not work properly when there are
>> two or more mpls_pop actions. Let us compare what happens in our system
>> when using the kernel datapath (not working) or the userspace netdev
>> datapath (working)
>>
>> we are using this reference topology
>> h1---peo1----peo2---h2
>> h1 source node, peo1 ingress node, peo2 egress node, h2 dest node
>>
>> in peo1, we report the rules shown with ovs-ofctl (obviously they are
>> the same with kernel and netdev datapath)
>>
>> peo1:
>> -> cookie=0x0, duration=185.672s, table=0, n_packets=178, n_bytes=17444,
>> ip,in_port=2
>> actions=push_mpls:0x8847,set_field:262144->mpls_label,push_mpls:0x8847,set_field:524292->mpls_label,goto_table:1
>>
>> -> cookie=0x0, duration=185.617s, table=1, n_packets=178, n_bytes=17444,
>> mpls,in_port=2,mpls_label=524292 actions=output:1
>>
>> recirculation causes the installation of flows that are not visible with
>> the ovs-ofctl command, so we use ovs-appctl
>> (sudo ovs-appctl dpif/dump-flows peo1)
>>
>> the result is different with kernel and netdev datapath:
>>
>> kernel datapath:
>> -> recirc_id(0),in_port(2),eth_type(0x0800),
>> ipv4(tos=0/0xfc,ttl=64,frag=no), packets:14, bytes:1372, used:0.692s,
>> actions:userspace(pid=4294959673,slow_path(action))
>>
>> netdev datapath:
>> -> recirc_id(0),in_port(2),eth_type(0x0800),
>> ipv4(tos=0/0xfc,ttl=64,frag=no), packets:17, bytes:1666, used:0.664s,
>> actions:push_mpls(label=262144,tc=0,ttl=64,bos=1,eth_type=0x8847),push_mpls(label=524292,tc=0,ttl=64,bos=0,eth_type=0x8847),1
>>
>>
>> we note that with the kernel datapath, there is a fallback to userspace
>> processing that is triggered when multiple PUSH operations are requested
>>
>> in peo2, we report the rules shown with ovs-ofctl (obviously they are
>> the same with kernel and netdev datapath)
>>
>> peo2:
>> -> cookie=0x0, duration=621.359s, table=0, n_packets=284, n_bytes=30104,
>> mpls,in_port=1 actions=goto_table:1
>> ->  cookie=0x0, duration=621.299s, table=1, n_packets=284,
>> n_bytes=30104, mpls,in_port=1,mpls_label=524292
>> actions=pop_mpls:0x8847,resubmit(,1)
>> -> cookie=0x0, duration=621.184s, table=1, n_packets=284, n_bytes=30104,
>> mpls,in_port=1,mpls_label=262144,mpls_bos=1
>> actions=pop_mpls:0x0800,output:2
>>
>> recirculation causes the installation of flows that are not visible with
>> the ovs-ofctl command, so we use ovs-appctl
>> (sudo ovs-appctl dpif/dump-flows peo2)
>>
>> the result is different with kernel and netdev datapath:
>>
>> kernel datapath:
>> NO RULES ARE SHOWN !!!!
>>
>> netdev datapath:
>> -> recirc_id(0),in_port(4),eth_type(0x8847),
>> mpls(lse0=0x80004040/0xfffff100, lse1=0x40000140/0xffffffff),
>> packets:5, bytes:530, used:0.720s,
>> actions:pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),5
>>
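The value/mask pairs in the netdev dump above can be read field by field. The following is a minimal sketch, assuming the standard 32-bit label stack entry layout (20-bit label, 3-bit TC, 1-bit bottom-of-stack, 8-bit TTL); split_lse and lse_matches are illustrative helpers, not Open vSwitch functions.

```python
def split_lse(word):
    """Return (label, tc, bos, ttl) for a 32-bit LSE value or mask."""
    return (word >> 12) & 0xFFFFF, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF


def lse_matches(packet_lse, value, mask):
    """True if packet_lse agrees with value on every bit set in mask."""
    return (packet_lse & mask) == (value & mask)


# lse0 mask from the netdev datapath flow: 0xfffff100
label_mask, tc_mask, bos_mask, ttl_mask = split_lse(0xFFFFF100)
print(hex(label_mask), tc_mask, bos_mask, ttl_mask)

# A packet with the same outer label but a different TTL still matches.
print(lse_matches(0x8000403F, 0x80004040, 0xFFFFF100))
```

Read this way, the mask 0xfffff100 matches exactly on the label and the bottom-of-stack bit and wildcards TC and TTL, which is the same match the kernel datapath log prints as mpls(label=524292/0xfffff,tc=0/0,ttl=64/0x0,bos=0/1).
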
>> peo2 is the egress node where we execute the double pop, it seems that
>> something is preventing the installation of the "recirculation" flows
>> when using the kernel datapath
>>
>> We have also checked the ovs-vswitchd log and when we activate the
>> kernel datapath, we see these warning messages:
>>
>> -> |WARN|system@ovs-system: failed to put[create] (Invalid argument)
>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(4),skb_mark(0/0),eth(src=e6:da:61:3a:f5:81/00:00:00:00:00:00,dst=0a:a5:58:7d:1c:12/00:00:00:00:00:00),eth_type(0x8847),mpls(label=524292/0xfffff,tc=0/0,ttl=64/0x0,bos=0/1),
>> actions:pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),5
>> -> |WARN|system@ovs-system: execute pop_mpls(eth_type=0x8847),
>> pop_mpls(eth_type=0x800),5 failed
>> (Invalid argument) on packet mpls,in_port=0,vlan_tci=0x0000,
>> dl_src=e6:da:61:3a:f5:81,dl_dst=0a:a5:58:7d:1c:12,mpls_label=524292,
>> mpls_tc=0,mpls_ttl=64,mpls_bos=0,mpls_lse1=1073742144
>>
>> These messages also explain the absence of the "recirculation flows".
>> Even though our mininet emulation scenario has a single ovs-vswitchd
>> instance, we can attribute these messages to peo2 based on the "output port"
>>
>> We also noted at the beginning of the vswitchd-log these messages:
>>
>> When using the Kernel datapath:
>> 2014-11-17T19:09:23.698Z|00029|ofproto_dpif|INFO|system@ovs-system:
>> MPLS label stack length probed as 1
>>
>> When using the userspace netdev datapath:
>> 2014-11-17T19:17:37.697Z|00069|ofproto_dpif|INFO|netdev@ovs-netdev:
>> MPLS label stack length probed as 3
>>
>> We have checked the OVS code: this message is printed by the
>> function static size_t check_max_mpls_depth(struct dpif_backer
>> *backer), which tries to probe the stack length...
>>
>> we would like to understand what is behind this limitation... why in our
>> system the stack length for the kernel datapath is limited to 1 ?
>> does the stack length depend on some features provided by the
>> system/kernel? is there some kernel version without this limitation?
>>
>> assuming that we have this limitation, our understanding is that this
>> check does not prevent pushing three MPLS labels, because the packet is
>> sent to the userspace ovs-vswitchd as a fallback (and the packet is
>> correctly created and forwarded)... is this correct?
>>
>> if this is the case, it seems that the same fallback is not activated in
>> case of multiple POP operations (>=2 POPs): the system tries to execute
>> the multiple POPs in the kernel, but it generates the error messages
>> listed above and no packet is forwarded
>>
>>
>> End Detailed Report of the problem
>> *****************************************************************
>>
>>
>>
>>
>>
>>
>>
>>
>>>>
>>>> We have replicated the problem in a simple mininet setup (script
>>>> attached) with four nodes connected as follows:
>>>>
>>>> h1---peo1----peo2---h2
>>>> h1 source node, peo1 ingress node, peo2 egress node, h2 dest node
>>>>
>>>> - h1 pings h2 (automatic static arp table are used, so there is no ARP)
>>>> - peo1 pushes two mpls labels in the packet
>>>> - peo2 is supposed to pop the two labels and forward the packet as IP
>>>>
>>>> we have a rule to match the outer label, POP it and resubmit as MPLS
>>>> packet:
>>>> actions=pop_mpls:0x8847,resubmit(,1)
>>>>
>>>> then we have a rule to match the inner label,  POP it and output on
>>>> a port as an IP packet:
>>>> actions=pop_mpls:0x0800,output:11
>>>>
>>>> both rules match (and the first pop works), but then the packet
>>>> does not exit from the port 11
>>>>
>>>> we have also tried without success to recirculate the packet
>>>> implicitly after the first pop, replacing the first action with:
>>>> actions=pop_mpls:0x8847
>>>>
>>>> Can anyone help us to identify the problem ? Are we doing something
>>>> wrong or is it a bug?
>>>>
>>>> The mininet script to setup the topology and create the rules in
>>>> order to reproduce the problem is attached.
>>>>
>>>> Some details are reported hereafter for the scenario with explicit
>>>> resubmit.
>>>>
>>>> thank you in advance for your help...
>>>>
>>>> ciao
>>>> Stefano
>>>>
>>>> - peo1 flow table
>>>> OFPST_FLOW reply (OF1.3) (xid=0x2):
>>>> duration=275.553s, table=0, n_packets=58, n_bytes=5684,ip,in_port=2
>>>> actions=push_mpls:0x8847,set_field:262144->mpls_label,push_mpls:0x8847,set_field:524292->mpls_label,goto_table:1
>>>>
>>>> duration=275.493s,table=1,n_packets=58,n_bytes=5684,mpls,in_port=2,mpls_label=524292
>>>> actions=output:1
>>>>
>>>> - peo2 flow table
>>>> OFPST_FLOW reply (OF1.3) (xid=0x2):
>>>> duration=317.678s,table=0,n_packets=58,n_bytes=6148,mpls,in_port=1
>>>> actions=goto_table:1
>>>>
>>>> duration=317.618s,table=1,n_packets=58,n_bytes=6148,mpls,in_port=1,mpls_label=524292
>>>> actions=pop_mpls:0x8847,resubmit(,1)
>>>>
>>>> duration=317.559s,table=1,n_packets=58,n_bytes=6148,mpls,in_port=1,mpls_label=262144,mpls_bos=1
>>>> actions=pop_mpls:0x0800,output:2
>>>>
>>>> NB If the interfaces created by Mininet in the root namespace happen
>>>> to be down, stop the Network Manager and then restart the experiment
>>>> --
>>>> *******************************************************************
>>>> Stefano Salsano
>>>> Professore Associato
>>>> Dipartimento Ingegneria Elettronica
>>>> Universita' di Roma Tor Vergata
>>>> Via del Politecnico, 1 - 00133 Roma - ITALY
>>>>
>>>> http://netgroup.uniroma2.it/Stefano_Salsano/
>>>>
>>>> E-mail  : stefano.salsano at uniroma2.it
>>>> Cell.   : +39 320 4307310
>>>> Office  : (Tel.) +39 06 72597770  (Fax.) +39 06 72597435
>>>> *******************************************************************
>>>>
>>>
>>>> #!/usr/bin/python
>>>>
>>>> import subprocess
>>>>
>>>> from mininet.net import Mininet
>>>> from mininet.node import Host, OVSKernelSwitch, Node
>>>> from mininet.cli import CLI
>>>> from mininet.log import lg, info
>>>>
>>>> def setup():
>>>>
>>>>     lg.setLogLevel('info')
>>>>
>>>>     net = Mininet(switch=OVSKernelSwitch, build=False,
>>>>                   autoStaticArp=True)
>>>>
>>>>     host1 = net.addHost("h1")
>>>>     host2 = net.addHost("h2")
>>>>
>>>>     peo1 = net.addSwitch("peo1")
>>>>     peo2 = net.addSwitch("peo2")
>>>>
>>>>     net.addLink(peo1, peo2) # 1 | 1
>>>>     net.addLink(host1, peo1)# N/A | 2
>>>>     net.addLink(host2, peo2)# N/A | 2
>>>>
>>>>     net.start()
>>>>
>>>>     #PEO1 Configuration
>>>>     root = Node( 'root', inNamespace=False )
>>>>     root.cmd('ovs-vsctl --no-wait set bridge %s protocols=OpenFlow13'
>>>>              % (peo1.name))
>>>>     root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s '
>>>>              '"table=0,hard_timeout=0,priority=32768,in_port=2,eth_type=0x800,'
>>>>              'actions=push_mpls:0x8847,set_field:262144->mpls_label,'
>>>>              'push_mpls:0x8847,set_field:524292->mpls_label,goto_table:1"'
>>>>              % (peo1.name))
>>>>     root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s '
>>>>              '"table=1,hard_timeout=0,priority=32768,in_port=2,eth_type=0x8847,'
>>>>              'mpls_label=524292,action=output:1"'
>>>>              % (peo1.name))
>>>>
>>>>     #PEO2 Configuration
>>>>     root.cmd('ovs-vsctl --no-wait set bridge %s protocols=OpenFlow13'
>>>>              % (peo2.name))
>>>>     root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s '
>>>>              '"table=0,hard_timeout=0,priority=32768,in_port=1,eth_type=0x8847,'
>>>>              'actions=goto_table=1"'
>>>>              % (peo2.name))
>>>>     root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s '
>>>>              '"table=1,hard_timeout=0,priority=32768,in_port=1,eth_type=0x8847,'
>>>>              'mpls_label=524292,action=pop_mpls:0x8847,resubmit(,1)"'
>>>>              % (peo2.name))
>>>>     root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s '
>>>>              '"table=1,hard_timeout=0,priority=32768,in_port=1,eth_type=0x8847,'
>>>>              'mpls_label=262144,mpls_bos=1,action=pop_mpls:0x800,output:2"'
>>>>              % (peo2.name))
>>>>
>>>>     CLI(net)
>>>>     net.stop()
>>>>     subprocess.call(["sudo", "mn", "-c"], stdout=None, stderr=None)
>>>>
>>>>
>>>> if __name__ == '__main__':
>>>>     setup()
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>


