[ovs-dev] [PATCH v2] ipf: fix only nat the first fragment in the reass process

Aaron Conole aconole at redhat.com
Mon Aug 23 15:11:16 UTC 2021


Ilya Maximets <i.maximets at ovn.org> writes:

> On 8/12/21 6:17 PM, Aaron Conole wrote:
>> wenxu at ucloud.cn writes:
>> 
>>> From: wenxu <wenxu at ucloud.cn>
>>>
>>> The ipf collects the original fragment packets and reassembles
>>> them into a new packet to run the conntrack logic. After the
>>> conntrack processing finishes, it copies the ct meta info to each
>>> original packet and modifies the l4 header in the first fragment.
>>> It should modify the ip src/dst info for all the fragments.
>>>
>>> Signed-off-by: wenxu <wenxu at ucloud.cn>
>>> Co-authored-by: luke.li <luke.li at ucloud.cn>
>>> Signed-off-by: luke.li <luke.li at ucloud.cn>
>>> ---
>> 
>> Acked-by: Aaron Conole <aconole at redhat.com>
>> 
>> Thanks for the fix.  I see it can work for any l3 protocol.
>> 
>> Based on the comments you supplied, I wrote the following test case.  It
>> can either be folded in by you (or Ilya on apply), or I can submit as a
>> separate patch (in case you are worried about having my sign-off /
>> coauthor on this patch).
>> 
>> When testing 'make check-system-userspace' before this patch, I see a
>> failure and get the following tcpdump logged:
>> 
>>   12:15:31 aconole at RHTPC1VM0NT {master} ~/git/ovs$ sudo tcpdump -r tests/system-userspace-testsuite.dir/078/p1.pcap
>>   reading from file tests/system-userspace-testsuite.dir/078/p1.pcap, link-type EN10MB (Ethernet), snapshot length 262144
>>   dropped privs to tcpdump
>>   12:07:21.364925 ARP, Request who-has 10.2.1.2 tell 10.2.1.1, length 28
>>   12:07:21.364928 ARP, Reply 10.2.1.2 is-at e6:45:4a:80:7c:61 (oui Unknown), length 28
>>   12:07:21.365095 IP 10.1.1.1 > 10.1.1.2: ICMP echo request, id 40165, seq 1, length 1480
>>   12:07:21.365099 IP 10.2.1.1 > 10.1.1.2: icmp
>>   12:07:21.365101 IP 10.2.1.1 > 10.1.1.2: icmp
>>   12:07:21.365102 IP 10.2.1.1 > 10.1.1.2: icmp
>> 
>> The first fragment is correct (source 10.1.1.1), but the subsequent
>> fragments are broken: they still carry the original source 10.2.1.1.
>> 
>> This test worked both for userspace and kernel datapath on my local
>> system.
>
> Hmm.  This test fails for me for both kernel and userspace:

Okay, I'll try it again on my system.  For reference, I was on Fedora 34
(F34), kernel 5.12.12-300.fc34.x86_64.

> tcpdump -r tests/system-userspace-testsuite.dir/078/p0.pcap
> 15:17:12.832383 ARP, Request who-has 10.2.1.2 tell 10.2.1.1, length 28
> 15:17:12.834317 ARP, Reply 10.2.1.2 is-at 46:0c:83:aa:6e:b0 (oui Unknown), length 28
> 15:17:12.834327 IP 10.2.1.1 > 10.1.1.2: ICMP echo request, id 27759, seq 1, length 1480
> 15:17:12.834329 IP 10.2.1.1 > 10.1.1.2: icmp
> 15:17:12.834330 IP 10.2.1.1 > 10.1.1.2: icmp
> 15:17:12.834332 IP 10.2.1.1 > 10.1.1.2: icmp
>
> tcpdump -r tests/system-userspace-testsuite.dir/078/p1.pcap
> 15:17:12.833542 ARP, Request who-has 10.2.1.2 tell 10.2.1.1, length 28
> 15:17:12.834994 IP 10.1.1.1 > 10.1.1.2: ICMP echo request, id 27759, seq 1, length 1480
> 15:17:12.834999 IP 10.1.1.1 > 10.1.1.2: icmp
> 15:17:12.835002 IP 10.1.1.1 > 10.1.1.2: icmp
> 15:17:12.835004 IP 10.1.1.1 > 10.1.1.2: icmp
>
> ping -c 1 10.1.1.2 -M dont -s 4500 | grep "transmitted" | sed 's/time.*ms$/time 0ms/'
> NS_EXEC_HEREDOC
> --- -   2021-08-16 15:17:22.844535052 -0400
> +++ /root/ovs/tests/system-userspace-testsuite.dir/at-groups/78/stdout
> @@ -1,2 +1,2 @@
> -1 packets transmitted, 1 received, 0% packet loss, time 0ms
> +1 packets transmitted, 0 received, 100% packet loss, time 0ms
>
> # uname -a
> Linux rhel8 4.18.0-305.3.1.el8_4.x86_64
>
> I'm not sure what is going on.  Could you please re-check?

I'll boot a RHEL 8.4 instance and try it out.
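
While re-checking, comparing captures like the ones in this thread by eye
gets tedious.  Below is a rough sketch of how the text output of
'tcpdump -r' could be scanned for fragments that were not source-NATed.
The helper names (source_counts, untranslated) are made up for
illustration, and the regex assumes port-less lines like the ICMP ones
quoted above; it is not part of the OVS test suite.

```python
import re
from collections import Counter

# Matches the "IP <src> > <dst>:" part of a tcpdump text line.  Only
# port-less lines (such as the ICMP ones quoted in this thread) match.
IP_LINE = re.compile(r"\bIP (\d+(?:\.\d+){3}) > (\d+(?:\.\d+){3}):")

# Pre-patch p1.pcap output, copied from the first capture in this thread.
SAMPLE = """\
12:07:21.364925 ARP, Request who-has 10.2.1.2 tell 10.2.1.1, length 28
12:07:21.364928 ARP, Reply 10.2.1.2 is-at e6:45:4a:80:7c:61 (oui Unknown), length 28
12:07:21.365095 IP 10.1.1.1 > 10.1.1.2: ICMP echo request, id 40165, seq 1, length 1480
12:07:21.365099 IP 10.2.1.1 > 10.1.1.2: icmp
12:07:21.365101 IP 10.2.1.1 > 10.1.1.2: icmp
12:07:21.365102 IP 10.2.1.1 > 10.1.1.2: icmp
"""


def source_counts(tcpdump_text):
    """Count IPv4 packets per source address; non-IP lines are ignored."""
    counts = Counter()
    for line in tcpdump_text.splitlines():
        match = IP_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def untranslated(tcpdump_text, nat_src):
    """Return the IPv4 lines whose source address differs from nat_src."""
    bad = []
    for line in tcpdump_text.splitlines():
        match = IP_LINE.search(line)
        if match and match.group(1) != nat_src:
            bad.append(line)
    return bad


if __name__ == "__main__":
    print(source_counts(SAMPLE))
    for line in untranslated(SAMPLE, "10.1.1.1"):
        print("not translated:", line)
```

Run against the p1 side, any line reported by untranslated() would point
at a fragment that left the switch with its pre-NAT source address.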

> I will not apply this patch for now until we figure out how to test it.

Okay.

Thanks, Ilya!

> Best regards, Ilya Maximets.
>
>> 
>> ---
>>  tests/system-traffic.at | 46 +++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 46 insertions(+)
>> 
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index f400cfabc9..feb335c783 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -3305,6 +3305,52 @@ NS_CHECK_EXEC([at_ns0], [ping6 -s 3200 -q -c 3 -i 0.3 -w 2 fc00::2 | FORMAT_PING
>>  OVS_TRAFFIC_VSWITCHD_STOP
>>  AT_CLEANUP
>>  
>> +AT_SETUP([conntrack - IPv4 Fragmentation + nat])
>> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>> +CHECK_CONNTRACK()
>> +
>> +OVS_TRAFFIC_VSWITCHD_START(
>> +   [set-fail-mode br0 secure -- ])
>> +
>> +ADD_NAMESPACES(at_ns0, at_ns1)
>> +
>> +ADD_VETH(p0, at_ns0, br0, "10.2.1.1/24")
>> +NS_CHECK_EXEC([at_ns0], [ip link set dev p0 address 80:88:88:88:88:88])
>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
>> +
>> +# create a dummy route for NAT
>> +ADD_VETH(p2, at_ns1, br0, "10.2.1.2/24")
>> +NS_CHECK_EXEC([at_ns0], [ip route add 10.1.1.0/24 via 10.2.1.2])
>> +NS_CHECK_EXEC([at_ns1], [ip neigh add 10.1.1.1 nud permanent lladdr 80:88:88:88:88:88 dev p1])
>> +
>> +# disable iptables from getting in the way
>> +NS_EXEC([at_ns1], [iptables --flush])
>> +
>> +# solely for debugging when things go wrong
>> +NS_EXEC([at_ns0], [tcpdump -i p0 -w p0.pcap -xx >tcpdump.out &])
>> +NS_EXEC([at_ns1], [tcpdump -i p1 -w p1.pcap -xx >tcpdump1.out &])
>> +NS_EXEC([at_ns1], [tcpdump -i p2 -w p2.pcap -xx >tcpdump2.out &])
>> +
>> +AT_DATA([flows.txt], [dnl
>> +table=0,arp,actions=normal
>> +table=0,ct_state=-trk,ip,in_port=ovs-p0, actions=ct(table=1, nat)
>> +table=0,ct_state=-trk,ip,in_port=ovs-p1, actions=ct(table=1, nat)
>> +table=1,ct_state=+trk+new,ip,in_port=ovs-p0, actions=ct(commit, nat(src=10.1.1.1)),ovs-p1
>> +table=1,ct_state=+trk+est,ip,in_port=ovs-p0, actions=ovs-p1
>> +table=1,ct_state=+trk+est,ip,in_port=ovs-p1, actions=ovs-p0
>> +])
>> +
>> +AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
>> +
>> +# check connectivity
>> +NS_CHECK_EXEC([at_ns0], [ping -c 1 10.1.1.2 -M dont -s 4500 | FORMAT_PING], [0], [dnl
>> +1 packets transmitted, 1 received, 0% packet loss, time 0ms
>> +])
>> +
>> +OVS_TRAFFIC_VSWITCHD_STOP
>> +AT_CLEANUP
>> +
>> +
>>  AT_SETUP([conntrack - resubmit to ct multiple times])
>>  CHECK_CONNTRACK()
>>  
>> 
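
For readers following the commit message, the difference between the
reported behaviour and the intended one can be illustrated with a toy
model.  This is only a sketch and not the OVS ipf code: the Fragment
type and the two propagate_* helpers are hypothetical, and the fragment
offsets assume a 1500-byte MTU for the 'ping -s 4500' used in the test.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fragment:
    src: str
    dst: str
    offset: int  # fragment offset in bytes


def propagate_first_only(frags, nat_src):
    """Reported behaviour: only the offset-0 fragment gets the NAT source."""
    return [replace(f, src=nat_src) if f.offset == 0 else f for f in frags]


def propagate_all(frags, nat_src):
    """Intended behaviour: every fragment gets the NAT source address."""
    return [replace(f, src=nat_src) for f in frags]


# Four fragments of one echo request, as seen on the sending side.
# Offsets are an assumption (1480-byte fragment payloads).
FRAGS = [Fragment("10.2.1.1", "10.1.1.2", off) for off in (0, 1480, 2960, 4440)]

if __name__ == "__main__":
    print([f.src for f in propagate_first_only(FRAGS, "10.1.1.1")])
    print([f.src for f in propagate_all(FRAGS, "10.1.1.1")])
```

The first helper reproduces the source-address pattern of the pre-patch
p1 capture; the second matches what the receiver needs in order to
reassemble the request and reply.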


