[ovs-discuss] Issue with connection tracking for packets modified in pipeline

Numan Siddique nusiddiq at redhat.com
Wed Jun 28 01:36:43 UTC 2017


On Jun 23, 2017 2:25 PM, "Joe Stringer" <joe at ovn.org> wrote:

On 22 June 2017 at 16:08, Numan Siddique <nusiddiq at redhat.com> wrote:
>
>
> On Jun 23, 2017 1:31 AM, "Joe Stringer" <joe at ovn.org> wrote:
>
> On 22 June 2017 at 04:16, Numan Siddique <nusiddiq at redhat.com> wrote:
>>
>>
>> On Thu, Jun 22, 2017 at 5:45 AM, Joe Stringer <joe at ovn.org> wrote:
>>>
>>> On 21 June 2017 at 04:19, Numan Siddique <nusiddiq at redhat.com> wrote:
>>> >
>>> >
>>> > On Tue, Jun 20, 2017 at 3:11 AM, Joe Stringer <joe at ovn.org> wrote:
>>> >>
>>> >> On 19 June 2017 at 00:37, Numan Siddique <nusiddiq at redhat.com> wrote:
>>> >> >
>>> >> >
>>> >> > On Fri, Jun 16, 2017 at 11:22 PM, Joe Stringer <joe at ovn.org> wrote:
>>> >> >>
>>> >> >> On 15 June 2017 at 22:20, Numan Siddique <nusiddiq at redhat.com>
>>> >> >> wrote:
>>> >> >> >
>>> >> >> >
>>> >> >> > On Thu, Jun 15, 2017 at 5:06 PM, Aswin S <aswinsuryan at gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Adding some more info here. Thanks, Numan, for pointing to this.
>>> >> >> >>
>>> >> >> >> The issue I am facing looks similar to the one described in [1]
>>> >> >> >> and [2]. But it seems the issue is not yet fixed. Is there a plan
>>> >> >> >> to fix this soon?
>>> >> >> >> In OpenDaylight, security groups are implemented using
>>> >> >> >> ovs-conntrack, so the flow-based router ping responder and
>>> >> >> >> floating IP translations hit this issue.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> [1]https://mail.openvswitch.org/pipermail/ovs-dev/2017-March/329542.html
>>> >> >> >> [2]https://patchwork.ozlabs.org/patch/739796/
>>> >> >> >>
>>> >> >> >
>>> >> >> > The same issue is also seen in OVN, as pointed out by Aswin.
>>> >> >> >
>>> >> >> > Joe - If you remember, we had a chat about this same issue during
>>> >> >> > the OpenStack Boston summit.
>>> >> >>
>>> >> >> Hi Numan, yeah I recall we had this discussion. I didn't have much
>>> >> >> clarity on where we're at with this. Looking at patchwork, I
>>> >> >> provided some feedback on the RFC. The most straightforward
>>> >> >> approach seems to be adding a nf_ct_set(skb, NULL, 0); call for
>>> >> >> each of the 5tuple "set" actions in the datapath.
>>> >> >
>>> >> >
>>> >> > Thanks. I will try it out and let you know how it went.
>>> >> > I remember I was supposed to provide more clarity after our
>>> >> > discussion. My apologies, it slipped my mind.
>>> >>
>>> >> No worries, let me know how you go.
>>> >
>>> >
>>> > I tried this and it didn't work. In fact the function set_ipv4 (in
>>> > datapath/actions.c) is not even called.
>>> >
>>> > Below is the flow which responds to the ICMP request packet:
>>> >
>>> > cookie=0x64913aa, duration=566.801s, table=17, n_packets=3, n_bytes=294, idle_age=144,
>>> > priority=90,icmp,metadata=0x3,nw_dst=192.168.0.1,icmp_type=8,icmp_code=0
>>> > actions=push:NXM_OF_IP_SRC[],push:NXM_OF_IP_DST[],pop:NXM_OF_IP_SRC[],pop:NXM_OF_IP_DST[],load:0xff->NXM_NX_IP_TTL[],load:0->NXM_OF_ICMP_TYPE[],load:0x1->NXM_NX_REG10[0],resubmit(,18)
>>> >
>>> > Thanks
>>> > Numan
>>>
>>> Hi Numan,
>>>
>>> How are you going about making these changes and testing them? Could
>>> you double-check that the correct module was loaded when you ran the
>>> test? Given that the IP src and dst are being modified from the flow
>>> you described above, I think that the set_ipv4 function should be
>>> called for such flows.
>>>
>>> Some sanity checks:
>>> # modinfo openvswitch
>>> # find /lib/modules -name openvswitch.ko* | xargs ls -l
>>>
>>> Might want to double-check that your depmod.d settings are set
>>> correctly so it loads the new module instead of the one that comes
>>> with your kernel.
>>> # man depmod.d
>>>
>>> Of course, the above doesn't necessarily apply if you're making
>>> changes directly in your kernel tree and loading the module from there
>>> (for example, using insmod, or make modules_install into the original
>>> module path).
>>>
>>
>> Hi Joe,
>>
>> I verified that the loaded openvswitch module is indeed the one I
>> modified. I also put some printks in functions like
>> "ovs_packet_cmd_execute" to verify.
>>
>> I created my testing scenario as per the commands here [1]. There are 2
>> logical ports with IPs 192.168.0.2 and 192.168.0.3 associated to 2
>> namespaces ns1 and ns2. The logical switch is also connected to a logical
>> router.
>>
>> I pinged from 192.168.0.2 to 192.168.0.3 continuously and monitored the
>> kernel flows with the command -
>>
>> $ watch -n1 -d "sudo ovs-dpctl dump-flows system@ovs-system"
>>
>> recirc_id(0),in_port(3),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=50:54:00:00:00:01),eth_type(0x0800),ipv4(dst=192.168.0.2/255.255.255.254,frag=no),
>> packets:28, bytes:2744, used:0.323s, actions:2
>>
>> recirc_id(0),in_port(2),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=50:54:00:00:00:02),eth_type(0x0800),ipv4(dst=192.168.0.2/255.255.255.254,frag=no),
>> packets:28, bytes:2744, used:0.323s, actions:3
>>
>>
>> When I pinged from 192.168.0.2 to 192.168.0.1 (without any ACLs, so the
>> ping would be successful), I observed that the action is always userspace,
>> and I could see that the function "odp_execute_masked_set_action" in
>> lib/odp-execute.c is called in vswitchd.
>>
>> $ watch -n1 -d "sudo ovs-dpctl dump-flows system@ovs-system"
>>
>> recirc_id(0),in_port(2),eth(src=50:54:00:00:00:01,dst=00:00:00:00:ff:01),eth_type(0x0806),arp(sip=192.168.0.2,tip=192.168.0.1,op=1/0xff,sha=50:54:00:00:00:01,tha=00:00:00:00:00:00),
>> packets:0, bytes:0, used:never,
>> actions:userspace(pid=4294958020,slow_path(action))
>>
>> recirc_id(0),in_port(2),eth(src=50:54:00:00:00:01,dst=00:00:00:00:ff:01),eth_type(0x0800),ipv4(src=192.168.0.2,dst=192.168.0.1,proto=1,ttl=64,frag=no),icmp(type=8,code=0),
>> packets:9, bytes:882, used:0.937s,
>> actions:userspace(pid=4294958021,slow_path(action))
>>
>> In this case, the ICMP reply is framed by the OVS flow, and there is a
>> "clone" action involved for the packet to go back and forth between the
>> logical switch and logical router pipelines.
>>
>> To avoid the clone action, I added some code in ovn-northd to generate the
>> ICMP reply if ip4.dst = 192.168.0.1, which translated to the below OF flow:
>>
>> table=19, n_packets=619, n_bytes=60662, idle_age=1,
>> priority=90,icmp,metadata=0x1,nw_dst=192.168.0.1,icmp_type=8,icmp_code=0
>>
>> actions=move:NXM_OF_IP_SRC[]->NXM_OF_IP_DST[],mod_nw_src:192.168.0.1,push:NXM_OF_ETH_SRC[],push:NXM_OF_ETH_DST[],pop:NXM_OF_ETH_SRC[],pop:NXM_OF_ETH_DST[],load:0xff->NXM_NX_IP_TTL[],load:0->NXM_OF_ICMP_TYPE[],load:0x1->NXM_NX_REG10[0],resubmit(,20)
>>
>> In both cases I see that there is an upcall for each packet and
>> odp_execute_masked_set_action is called.
>
> OK, I think that my suggestion for that patch (patchwork 739796) was
> actually addressing a subtly different issue.
>
> With regards to this issue, as far as I understand back to the
> original report, connection with tuple A is committed to the
> connection tracker. A is then statelessly modified to tuple B, then a
> lookup with B is performed. Typically if you have tuple A or tuple A'
> (ie, the reversed tuple) in the packet headers then looking up with
> either of these headers will find the same connection. If you then
> perform a lookup with tuple B, then it can only look up using B or B';
> no state was kept about the translation from A->B, so there's no way
> for the connection tracker to associate tuple B back to tuple A.
> Lookup using B and B' cannot find a connection because it was never
> committed like that. Therefore it would be new. However, since B is a
> SYN-ACK packet, the Linux connection tracker considers that it is
> invalid rather than new. For it to work, the tuple B', ie the original
> SYN, should be committed first.
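
The A/B tuple argument above can be illustrated with a toy model. This is only a sketch, not the Linux or OVS connection tracker: the class names, addresses and ports are made up for illustration, and zones, protocol state and the SYN-ACK "invalid" verdict are not modelled. It only shows that a commit of A is reachable via A and A', while a statelessly rewritten B (or B') finds nothing.

```python
# Toy connection table keyed by 5-tuples. Hypothetical names throughout;
# this is NOT how nf_conntrack or OVS is implemented.
from typing import Dict, NamedTuple, Optional


class Tuple5(NamedTuple):
    src: str
    dst: str
    proto: str
    sport: int
    dport: int

    def reversed(self) -> "Tuple5":
        # A' for a given A: swap addresses and ports.
        return Tuple5(self.dst, self.src, self.proto, self.dport, self.sport)


class ToyConntrack:
    def __init__(self) -> None:
        self._table: Dict[Tuple5, dict] = {}

    def commit(self, t: Tuple5) -> dict:
        # One connection object, reachable from both directions.
        conn = {"orig": t}
        self._table[t] = conn
        self._table[t.reversed()] = conn
        return conn

    def lookup(self, t: Tuple5) -> Optional[dict]:
        return self._table.get(t)


ct = ToyConntrack()
a = Tuple5("10.0.0.2", "192.168.0.3", "tcp", 40000, 80)  # assumed values
conn = ct.commit(a)

# Stateless rewrite A -> B: nothing records the translation.
b = a._replace(dst="172.16.0.9")

found_a = ct.lookup(a)
found_a_rev = ct.lookup(a.reversed())
found_b = ct.lookup(b)
found_b_rev = ct.lookup(b.reversed())
```

Here found_a and found_a_rev both return the committed connection, while found_b and found_b_rev are None, which is the "no way to associate B back to A" case described above.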
>
>
> Thanks for the explanation. The issue we are seeing is for ICMP packets,
> and looking into the connection tracking entries I see the packet is in
> UNREPLIED state. When the ICMP reply is framed by the OVS flows, the tuple
> would still remain the same, right? Only ip4.src is swapped with ip4.dst
> and the ICMP type is changed.

Right, so for ICMP I think the problem is different. Yes, by the looks of
it only src/dst are swapped and the type changed, which should produce a
tuple that can look up and find the original connection. Given that
the execution is happening in userspace, that would be one path to
follow: exactly what is executed upon the packet in terms of datapath
actions after the kernel runs userspace(...,slow_path(action))? Where
is the conntrack call from that path, and how does it try to get the
ct_state from the kernel?
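
As a rough illustration of why the rewritten ICMP packet should still match, here is a small Python sketch. It assumes an ICMP conntrack tuple of (src, dst, type, code, id) and that the reply direction of an echo request (type 8) is an echo reply (type 0) with addresses swapped and the same id; the names and the id value are hypothetical, and only the echo pair is modelled.

```python
# Toy model of ICMP echo tuples. Hypothetical names; a sketch only.
from typing import NamedTuple

ICMP_ECHO = 8
ICMP_ECHOREPLY = 0
# Only the echo request/reply pair is modelled here.
INVERSE_TYPE = {ICMP_ECHO: ICMP_ECHOREPLY, ICMP_ECHOREPLY: ICMP_ECHO}


class IcmpTuple(NamedTuple):
    src: str
    dst: str
    icmp_type: int
    icmp_code: int
    icmp_id: int


def reply_tuple(t: IcmpTuple) -> IcmpTuple:
    # Reply direction: swap addresses, invert the type, keep code and id.
    return IcmpTuple(t.dst, t.src, INVERSE_TYPE[t.icmp_type],
                     t.icmp_code, t.icmp_id)


# Echo request as in the datapath flow above; the id is an assumed value.
request = IcmpTuple("192.168.0.2", "192.168.0.1", ICMP_ECHO, 0, 0x1234)

# What the OpenFlow actions do: swap nw_src/nw_dst and load 0 into the type.
rewritten = request._replace(src=request.dst, dst=request.src,
                             icmp_type=ICMP_ECHOREPLY)

matches = rewritten == reply_tuple(request)
```

In this model matches is True, so the tuple itself does not explain the UNREPLIED state; that points back at how the userspace execution path reaches conntrack.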

I wonder if the ICMP issue is related to the patch here:
http://patchwork.ozlabs.org/patch/775756/


Thanks. I will test some more and get back on this.

Numan