[ovs-discuss] Issue with connection tracking for packets modified in pipeline

Joe Stringer joe at ovn.org
Thu Jun 22 20:01:18 UTC 2017


On 22 June 2017 at 04:16, Numan Siddique <nusiddiq at redhat.com> wrote:
>
>
> On Thu, Jun 22, 2017 at 5:45 AM, Joe Stringer <joe at ovn.org> wrote:
>>
>> On 21 June 2017 at 04:19, Numan Siddique <nusiddiq at redhat.com> wrote:
>> >
>> >
>> > On Tue, Jun 20, 2017 at 3:11 AM, Joe Stringer <joe at ovn.org> wrote:
>> >>
>> >> On 19 June 2017 at 00:37, Numan Siddique <nusiddiq at redhat.com> wrote:
>> >> >
>> >> >
>> >> > On Fri, Jun 16, 2017 at 11:22 PM, Joe Stringer <joe at ovn.org> wrote:
>> >> >>
>> >> >> On 15 June 2017 at 22:20, Numan Siddique <nusiddiq at redhat.com>
>> >> >> wrote:
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jun 15, 2017 at 5:06 PM, Aswin S <aswinsuryan at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >>
>> >> >> >> Adding some more info here. Thanks, Numan, for pointing to this.
>> >> >> >>
>> >> >> >> The issue I am facing looks similar to the one described in
>> >> >> >> [1] and [2], but it seems the issue is not yet fixed. Is there
>> >> >> >> a plan to fix this soon? In OpenDaylight, security groups are
>> >> >> >> implemented using ovs-conntrack, so the flow-based router ping
>> >> >> >> responder and floating IP translations hit this issue.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> [1]https://mail.openvswitch.org/pipermail/ovs-dev/2017-March/329542.html
>> >> >> >> [2]https://patchwork.ozlabs.org/patch/739796/
>> >> >> >>
>> >> >> >
>> >> >> > The same issue is also seen in OVN, as pointed out by Aswin.
>> >> >> >
>> >> >> > Joe - If you remember, we had a chat about this same issue
>> >> >> > during the OpenStack Boston summit.
>> >> >>
>> >> >> Hi Numan, yeah I recall we had this discussion. I didn't have much
>> >> >> clarity on where we're at with this.  Looking at patchwork, I
>> >> >> provided some feedback on the RFC. The most straightforward
>> >> >> approach seems to be adding a nf_ct_set(skb, NULL, 0); call for
>> >> >> each of the 5tuple "set" actions in the datapath.
>> >> >
>> >> >
>> >> > Thanks. I will try it out and let you know how it goes.
>> >> > I remember I was supposed to provide more clarity after our
>> >> > discussion. My apologies, it slipped my mind.
>> >>
>> >> No worries, let me know how you go.
>> >
>> >
>> > I tried this and it didn't work. In fact the function set_ipv4 (in
>> > datapath/actions.c) is not even called.
>> >
>> > Below is the flow which responds to ICMP request packet
>> >
>> > cookie=0x64913aa, duration=566.801s, table=17, n_packets=3, n_bytes=294,
>> > idle_age=144,
>> > priority=90,icmp,metadata=0x3,nw_dst=192.168.0.1,icmp_type=8,icmp_code=0
>> >
>> > actions=push:NXM_OF_IP_SRC[],push:NXM_OF_IP_DST[],pop:NXM_OF_IP_SRC[],pop:NXM_OF_IP_DST[],load:0xff->NXM_NX_IP_TTL[],load:0->NXM_OF_ICMP_TYPE[],load:0x1->NXM_NX_REG10[0],resubmit(,18)
>> >
>> > Thanks
>> > Numan
>>
>> Hi Numan,
>>
>> How are you going about making these changes and testing them? Could
>> you double-check that the correct module was loaded when you ran the
>> test? Given that the IP src and dst are being modified from the flow
>> you described above, I think that the set_ipv4 function should be
>> called for such flows.
>>
>> Some sanity checks:
>> # modinfo openvswitch
>> # find /lib/modules -name 'openvswitch.ko*' | xargs ls -l
>>
>> Might want to double-check that your depmod.d settings are set
>> correctly so it loads the new module instead of the one that comes
>> with your kernel.
>> # man depmod.d
>>
>> Of course, the above doesn't necessarily apply if you're making
>> changes directly in your kernel tree and loading the module from there
>> (for example, using insmod, or make modules_install into the original
>> module path).
>>
>
> Hi Joe,
>
> I verified that the loaded openvswitch module is indeed the one modified
> by me.  I also put some printks in functions like "ovs_packet_cmd_execute"
> to verify.
>
> I created my testing scenario as per the commands here [1]. There are 2
> logical ports with IPs 192.168.0.2 and 192.168.0.3 associated to 2
> namespaces ns1 and ns2. The logical switch is also connected to a logical
> router.
>
> I pinged from 192.168.0.2 to 192.168.0.3 continuously and monitored the
> kernel flows with the command -
>
> $ watch -n1 -d "sudo ovs-dpctl dump-flows system@ovs-system"
> recirc_id(0),in_port(3),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=50:54:00:00:00:01),eth_type(0x0800),ipv4(dst=192.168.0.2/255.255.255.254,frag=no),
> packets:28, bytes:2744, used:0.323s, actions:2
> recirc_id(0),in_port(2),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=50:54:00:00:00:02),eth_type(0x0800),ipv4(dst=192.168.0.2/255.255.255.254,frag=no),
> packets:28, bytes:2744, used:0.323s, actions:3
>
>
> I pinged from 192.168.0.2 to 192.168.0.1 (without any ACLs, so the ping
> would be successful), I observed that the action is always userspace and I
> could see that the function "odp_execute_masked_set_action" in
> lib/odp-execute.c is called in vswitchd.
>
> $ watch -n1 -d "sudo ovs-dpctl dump-flows system@ovs-system"
> recirc_id(0),in_port(2),eth(src=50:54:00:00:00:01,dst=00:00:00:00:ff:01),eth_type(0x0806),arp(sip=192.168.0.2,tip=192.168.0.1,op=1/0xff,sha=50:54:00:00:00:01,tha=00:00:00:00:00:00),
> packets:0, bytes:0, used:never,
> actions:userspace(pid=4294958020,slow_path(action))
> recirc_id(0),in_port(2),eth(src=50:54:00:00:00:01,dst=00:00:00:00:ff:01),eth_type(0x0800),ipv4(src=192.168.0.2,dst=192.168.0.1,proto=1,ttl=64,frag=no),icmp(type=8,code=0),
> packets:9, bytes:882, used:0.937s,
> actions:userspace(pid=4294958021,slow_path(action))
>
> In this case, the ICMP reply is constructed by the OVS flow, and a "clone"
> action is involved for the packet to go to and from the logical switch to
> the logical router pipeline.
>
> To avoid the clone action, I added some code in ovn-northd to send the ICMP
> reply if ip4.dst = 192.168.0.1, which translated to the OF flow below
>
> table=19, n_packets=619, n_bytes=60662, idle_age=1,
> priority=90,icmp,metadata=0x1,nw_dst=192.168.0.1,icmp_type=8,icmp_code=0
> actions=move:NXM_OF_IP_SRC[]->NXM_OF_IP_DST[],mod_nw_src:192.168.0.1,push:NXM_OF_ETH_SRC[],push:NXM_OF_ETH_DST[],pop:NXM_OF_ETH_SRC[],pop:NXM_OF_ETH_DST[],load:0xff->NXM_NX_IP_TTL[],load:0->NXM_OF_ICMP_TYPE[],load:0x1->NXM_NX_REG10[0],resubmit(,20)
>
> And in both the cases I see that there is an upcall for each packet and
> odp_execute_masked_set_action is called.

OK, I think that my suggestion for that patch (patchwork 739796) was
actually addressing a subtly different issue.

With regard to this issue, as far as I understand from the original
report, a connection with tuple A is committed to the connection
tracker. A is then statelessly modified to tuple B, and a lookup with B
is performed. Typically, if the packet headers carry tuple A or tuple A'
(i.e., the reversed tuple), then a lookup with either of them finds the
same connection. A lookup with tuple B, however, can only match on B or
B'; no state was kept about the translation from A->B, so there is no
way for the connection tracker to associate tuple B back to tuple A.
The lookup using B and B' cannot find a connection because none was ever
committed with those tuples, so the packet would be treated as new.
However, since B is a SYN-ACK packet, the Linux connection tracker
considers it invalid rather than new. For this to work, the tuple B',
i.e. the original SYN, should be committed first.
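
To make the lookup behaviour concrete, here is a rough Python sketch of a
toy connection table keyed by 5-tuple. It is only an illustration of the
reasoning above, not how the Linux or OVS connection tracker is actually
implemented; the class and function names, the ports, and the addresses
10.0.0.5 and 172.24.4.10 are made up for the example.

```python
from typing import NamedTuple, Optional


class Tuple5(NamedTuple):
    """A simplified 5-tuple (hypothetical model, not a kernel structure)."""
    src: str
    dst: str
    proto: str
    sport: int
    dport: int

    def reversed(self) -> "Tuple5":
        # The reply-direction tuple (A' for a tuple A).
        return Tuple5(self.dst, self.src, self.proto, self.dport, self.sport)


class ToyConntrack:
    """Toy table: each committed connection is indexed by its tuple and
    by the reversed tuple, and by nothing else."""

    def __init__(self) -> None:
        self._table = {}

    def commit(self, t: Tuple5) -> dict:
        conn = {"orig": t}
        self._table[t] = conn
        self._table[t.reversed()] = conn
        return conn

    def lookup(self, t: Tuple5) -> Optional[dict]:
        return self._table.get(t)


ct = ToyConntrack()

# Tuple A is committed (addresses and ports are illustrative only).
a = Tuple5("192.168.0.2", "10.0.0.5", "tcp", 40000, 80)
conn = ct.commit(a)

# A packet of this connection is statelessly rewritten, giving tuple B.
# Nothing about the A->B translation is recorded in the table.
a_reply = a.reversed()
b = a_reply._replace(src="172.24.4.10")

print("lookup(A)  finds connection:", ct.lookup(a) is conn)
print("lookup(A') finds connection:", ct.lookup(a_reply) is conn)
print("lookup(B)  finds connection:", ct.lookup(b) is not None)
print("lookup(B') finds connection:", ct.lookup(b.reversed()) is not None)
```

In this sketch the lookups with A and A' return the committed entry,
while the lookups with B and B' return nothing, since no entry was ever
committed under those tuples.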

