[ovs-dev] [PATCH v6 2/2] conntrack: handle SNAT with all-zero IP address

Paolo Valerio pvalerio at redhat.com
Wed May 26 14:34:35 UTC 2021


Hi Dumitru,

Dumitru Ceara <dceara at redhat.com> writes:

> On 4/28/21 11:56 PM, Paolo Valerio wrote:
>> this patch introduces for the userspace datapath the handling
>> of rules like the following:
>> 
>> ct(commit,nat(src=0.0.0.0),...)
>> 
>> Kernel datapath already handle this case that is particularly
>> handy in scenarios like the following:
>> 
>> Given A: 10.1.1.1, B: 192.168.2.100, C: 10.1.1.2
>> 
>> A opens a connection toward B on port 80 selecting as source port 10000.
>> B's IP gets dnat'ed to C's IP (10.1.1.1:10000 -> 192.168.2.100:80).
>> 
>> This will result in:
>> 
>> tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=10000,dport=80),reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=10000),protoinfo=(state=ESTABLISHED)
>> 
>> A now tries to establish another connection with C using source port
>> 10000, this time using C's IP address (10.1.1.1:10000 -> 10.1.1.2:80).
>> 
>> This second connection, if processed by conntrack with no SNAT/DNAT
>> involved, collides with the reverse tuple of the first connection,
>> so the entry for this valid connection doesn't get created.
>> 
>> With this commit, and adding a SNAT rule with 0.0.0.0 for
>> 10.1.1.1:10000 -> 10.1.1.2:80 will allow to create the conn entry:
>> 
>> tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=10000,dport=80),reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=10001),protoinfo=(state=ESTABLISHED)
>> tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=10000,dport=80),reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=10000),protoinfo=(state=ESTABLISHED)
>> 
>> The issue exists even in the opposite case (with A trying to connect
>> to C using B's IP after establishing a direct connection from A to C).
>> 
>> This commit refactors the relevant function in a way that both of the
>> previously mentioned cases are handled as well.
>> 
>> Suggested-by: Eelco Chaudron <echaudro at redhat.com>
>> Signed-off-by: Paolo Valerio <pvalerio at redhat.com>
>> ---
>
> Hi Paolo,
>
> There seems to be a functionality difference between kernel and
> userspace datapath, wrt. all-zero IP SNAT.
>
> With the sample script below, when using the kernel datapath, p1 is able
> to connect to the VIP (DNAT) with backend p2.  However, when using the
> userspace datapath, it doesn't work because userspace conntrack will
> always perform an additional SNAT if two commit operations happen for
> the same packet in the same zone.
>
> Arguably this pipeline could be optimized, but it's actually what OVN
> generates today, with the addition that on the second commit we also
> perform SNAT(0.0.0.0).  OVN first executes the load balancing stage
> (DNAT + commit) and later executes the firewalling stage (ACL) and if
> the packet should be allowed the connection gets committed again.  This
> also covers the case when packets are not load balanced, and only
> committed in the ACL stage.
>

The behavior is present in the userspace dp without this series.
As of today, multiple commits appear to be managed in kernel dp but not
in userspace dp.

E.g. with the current release (so no null snat involved), using your
script, setting an IP address other than all-zero (even the same source
IP of the sender in the orig direction) in the snat rule:

ovs-ofctl add-flow br-test 'table=1,tcp actions=ct(commit,nat(src=42.42.42.3),table=2)'

there's basically the same behavior for both datapaths:
- one working entry for kernel dp
- two for userspace with the subsequent failure

Kernel dp tries to prevent the second one from being executed
caching and doing other checks, while the userspace dp doesn't take
care of it.

Of course, in this case, the example is made up and just like you said,
the problem can be avoided changing the rules.

> While I think we can try and work around this in OVN by using additional
> openflow register bits to differentiate between traffic that was load
> balanced or not, it would make it way easier if the sequence:
>
> table=T: ct(commit, nat(dst=x.y.z.t), zone=Z, table=T+1)
>
> followed by (without changing conntrack zone)
>
> table=T+1: ct(commit, nat(src=0.0.0.0), zone=Z)
>
> would translate the second commit to a no-op similar to what the kernel
> does.
>

It would be good to prevent failures (as long as it is possible) and
have a consistent behavior between datapaths.

Nice catch though!

Paolo

> Thanks,
> Dumitru
>
> ovs-vsctl add-br br-test
> ovs-vsctl set bridge br-test datapath_type=netdev
> ovs-vsctl add-port br-test p1 -- set interface p1 type=internal
> ovs-vsctl add-port br-test p2 -- set interface p2 type=internal
> ip netns add p1
> ip link set p1 netns p1
> ip netns exec p1 ip link set up dev p1
> ip netns exec p1 ip link set p1 address 00:00:00:00:00:01
> ip netns exec p1 ip addr add dev p1 42.42.42.1/24
> ip netns exec p1 ip neigh add 42.42.42.42 lladdr 00:00:00:00:00:02 dev p1
>
> ip netns add p2
> ip link set p2 netns p2
> ip netns exec p2 ip link set up dev p2
> ip netns exec p2 ip link set p2 address 00:00:00:00:00:02
> ip netns exec p2 ip addr add dev p2 42.42.42.2/24
>
> ovs-ofctl del-flows br-test
> ovs-ofctl add-flow br-test 'table=0,priority=2,in_port=1,tcp,nw_dst=42.42.42.42,tp_dst=8080 actions=ct(commit,nat(dst=42.42.42.2:80),table=1)'
> ovs-ofctl add-flow br-test 'table=0,priority=2,in_port=2,tcp actions=ct(nat,table=1)'
> ovs-ofctl add-flow br-test 'table=0,priority=1 actions=resubmit(,2)'
> ovs-ofctl add-flow br-test 'table=1,tcp actions=ct(commit,nat(src=0.0.0.0),table=2)'
> ovs-ofctl add-flow br-test 'table=2,in_port=1 actions=2'
> ovs-ofctl add-flow br-test 'table=2,in_port=2 actions=1'
>
> # Start a server.
> ip netns exec p2 nc -l -v -k 42.42.42.2 80
>
> # Start client1 (should succeed):
> ip netns exec p1 nc -v -z -s 42.42.42.1 -p 4000 42.42.42.42 8080



More information about the dev mailing list