[ovs-dev] [RFC] conntrack: cross zone nat

Joe Stringer joe at ovn.org
Fri Jan 8 22:03:38 UTC 2016


On 22 December 2015 at 22:05, Zang MingJie <zealot0630 at gmail.com> wrote:
>
>
> On Wed, Dec 23, 2015 at 3:10 AM, Joe Stringer <joe at ovn.org> wrote:
>>
>> On 21 December 2015 at 23:52, Zang MingJie <zealot0630 at gmail.com> wrote:
>> > Hi:
>> >
>> > Problem
>> > =======
>> >
>> > I'm glad to see that OVS has added conntrack support. The conntrack
>> > support is great, but I want to push it further.
>> >
>> > Consider this scenario:
>> > Multiple tenants share a single global IP by using NAT. IP addresses in
>> > different tenants can overlap. Let's say tenant A with IP x and tenant B
>> > with IP x both want to access the Internet via NAT.
>> >
>> > Currently we accomplish this by using double-nat
>> > tenant A:  x:port ---natA--> internal-ip-a:port ---nat--> global-ip:port
>> > tenant B:  x:port ---natB--> internal-ip-b:port ---nat--> global-ip:port
>> >
>> > natA and natB are done in their own per-tenant namespaces, so there is
>> > no problem even if they have the same IP. The second NAT translates
>> > their assigned internal IPs to the public IP; these internal IPs don't
>> > conflict.
>> >
>> > Idea
>> > ====
>> >
>> > Now I want to simplify the process by using a single NAT in OVS. I want
>> > to translate an ip:port pair from a tenant zone to the public zone
>> > directly:
>> > ZoneA:x:port ---nat--> ZonePublic:global-ip:port
>> > ZoneB:x:port ---nat--> ZonePublic:global-ip:port
>> >
>> >
>> > Implementation consideration
>> > ============================
>> > Currently the kernel cf table is not zone/tenant aware; it can only
>> > handle ip:port pairs. It may be extended to handle a zone ID.
>> >
>> > So the cf table could look similar to this:
>> >
>> > zone:s-ip:s-port:d-ip:d-port <------> zone:s-ip:s-port:d-ip:d-port
>> >
>> >
>> > For a new connection, the src/dst zones are specified by the flow:
>> >
>> > Match:   in_port(1),tcp,conn_state=-tracked
>> > Action:  nat(src_zone=10,dst_zone=20,masq=x.x.x.x)
>> >
>> > Then a new cf entry can be generated like this one:
>> >
>> > 10:192.168.0.10:4562:8.8.8.8:53 <----> 20:masq-ip:random-port:8.8.8.8:53
>> >
>> > The returning packets can be handled by another flow:
>> >
>> > Match:   in_port(2),tcp,conn_state=+established
>> > Action:  nat(reverse,zone=20)
>> >
>> > By looking up the cf table using the 4-tuple plus zone, the cf entry
>> > can easily be found. Zone ID '10' can also be read from the cf entry,
>> > so OVS knows the packet is now translated to zone 10.
>>
>>
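The zone-aware table proposed in the quoted message can be sketched in a few lines of Python. This is only an illustration of the idea, not kernel or OVS code: the `Key` and `CrossZoneTable` names, the address 203.0.113.1 standing in for the global IP, and the port 40000 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Key:
    """Connection key: zone plus the usual 4-tuple (hypothetical layout)."""
    zone: int
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class CrossZoneTable:
    """Toy table mapping an original-zone key to a translated-zone key."""

    def __init__(self) -> None:
        self._forward: Dict[Key, Key] = {}
        self._reply: Dict[Key, Key] = {}

    def add(self, orig: Key, translated: Key) -> None:
        # Reply packets arrive in the translated zone with src/dst swapped.
        reply = Key(translated.zone, translated.dst_ip, translated.dst_port,
                    translated.src_ip, translated.src_port)
        if reply in self._reply:
            raise ValueError("translated tuple already in use in that zone")
        self._forward[orig] = translated
        self._reply[reply] = orig

    def lookup_reply(self, packet: Key) -> Optional[Key]:
        # Returns the original key, including the tenant zone, or None.
        return self._reply.get(packet)


table = CrossZoneTable()
table.add(Key(10, "192.168.0.10", 4562, "8.8.8.8", 53),
          Key(20, "203.0.113.1", 40000, "8.8.8.8", 53))

# A returning packet seen in zone 20 resolves back to the tenant in zone 10.
found = table.lookup_reply(Key(20, "8.8.8.8", 53, "203.0.113.1", 40000))
```

Because the zone is part of the key on both sides, two tenants using the same private address never collide in the table, and the reply lookup recovers the tenant zone without a second translation step.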
>> Would something like this work?
>>
>> From tenant to outside:
>> in_port=1,tcp,
>> actions=ct(commit,zone=1,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:1->ct_mark)),output:3
>> in_port=2,tcp,
>> actions=ct(commit,zone=2,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:2->ct_mark)),output:3
>
>
> Probably not. NAT also needs to choose an unused port, but in zone 1 it
> doesn't know which ports are unused in zone 0.

OK, I see. Does the port allocation occur differently based on the
zone? (If so, I agree this is a problem; if not, this approach seems
plausible)
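To make the concern concrete, here is a minimal Python sketch of a source NAT that only tracks used tuples within its own zone. It is a toy model under stated assumptions, not how the kernel allocates ports: the `ZoneNat` class, the "keep the original port if it is free" policy, and the address 203.0.113.1 standing in for GLOBAL are all hypothetical.

```python
from typing import Dict, Set, Tuple

# (src_ip, src_port, dst_ip, dst_port)
FourTuple = Tuple[str, int, str, int]

GLOBAL_IP = "203.0.113.1"  # hypothetical stand-in for the shared global IP


class ZoneNat:
    """Toy SNAT whose port choice only consults its own zone's entries."""

    def __init__(self) -> None:
        self.used: Dict[int, Set[FourTuple]] = {}

    def snat(self, zone: int, src_port: int,
             dst_ip: str, dst_port: int) -> FourTuple:
        used = self.used.setdefault(zone, set())
        port = src_port  # assumed policy: keep the original port if free
        while (GLOBAL_IP, port, dst_ip, dst_port) in used:
            port = port + 1 if port < 65535 else 1024
        translated = (GLOBAL_IP, port, dst_ip, dst_port)
        used.add(translated)
        return translated


nat = ZoneNat()
# Tenant A (zone 1) and tenant B (zone 2) both use x:4562 -> 8.8.8.8:53.
a = nat.snat(1, 4562, "8.8.8.8", 53)
b = nat.snat(2, 4562, "8.8.8.8", 53)
same_zone = nat.snat(1, 4562, "8.8.8.8", 53)

# Each zone saw the port as free, so both tenants got the same tuple,
# which would clash once both connections are committed in zone 0.
collides_in_zone0 = a == b
```

Within a single zone the allocator moves on to another port, but across zones it has no view of what the other zone picked, which is the collision described above.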

>>
>> From outside to tenant:
>> table=0,in_port=3,tcp, actions=ct(zone=0,table=1)
>> table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=1,
>> actions=ct(zone=1,nat,table=2)
>> table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=2,
>> actions=ct(zone=2,nat,table=2)
>> table=2,in_port=3,tcp,ct_state=+est,ct_zone=1 actions=output:1
>> table=2,in_port=3,tcp,ct_state=+est,ct_zone=2 actions=output:2
>
> And there will be lots of upcalls per connection.

On second thought, I think that tables 1 and 2 could be squashed
together - the extra recirculation/upcall should be unnecessary unless
you've got other logic that wants to do something based on that zone.
This is dependent on my previous question though.

<snip>

I discussed this with Jarno offline. It seems that what you're really
looking for is OVS support for this new feature that's in Linux 4.3:
https://github.com/torvalds/linux/commit/deedb59039f111c41aa5a54ee384c8e7c08bc78a

If you were to work on adding support for this feature to OVS, I don't
see any particular reason that someone would block the change. The
kernel changes would need to be made against upstream net-next kernel
first (with OVS userspace code changes to test with). The API changes
would need to make sure they don't break the existing functionality.
In terms of the kernel module in the OVS tree, it looks like it would
be a large amount of work to support this on kernels older than v4.3,
so I wouldn't count on being able to run older kernels with this
feature.


