[ovs-dev] [PATCH net 2/2] act_ct: support asymmetric conntrack

Aaron Conole aconole at redhat.com
Mon Nov 18 21:24:18 UTC 2019


Paul Blakey <paulb at mellanox.com> writes:

> On 11/14/2019 4:22 PM, Roi Dayan wrote:
>>
>> On 2019-11-08 11:07 PM, Aaron Conole wrote:
>>> The act_ct TC module shares a common conntrack and NAT infrastructure
>>> exposed via netfilter.  It's possible that a packet needs both SNAT and
>>> DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
>>> this because it runs through the NAT table twice - once on ingress and
>>> again after egress.  The act_ct action doesn't have such capability.
>>>
>>> Like netfilter hook infrastructure, we should run through NAT twice to
>>> keep the symmetry.
>>>
>>> Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
>>>
>>> Signed-off-by: Aaron Conole <aconole at redhat.com>
>>> ---
>>>   net/sched/act_ct.c | 13 ++++++++++++-
>>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>>> index fcc46025e790..f3232a00970f 100644
>>> --- a/net/sched/act_ct.c
>>> +++ b/net/sched/act_ct.c
>>> @@ -329,6 +329,7 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>>>   			  bool commit)
>>>   {
>>>   #if IS_ENABLED(CONFIG_NF_NAT)
>>> +	int err;
>>>   	enum nf_nat_manip_type maniptype;
>>>   
>>>   	if (!(ct_action & TCA_CT_ACT_NAT))
>>> @@ -359,7 +360,17 @@ static int tcf_ct_act_nat(struct sk_buff *skb,
>>>   		return NF_ACCEPT;
>>>   	}
>>>   
>>> -	return ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>>> +	err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>>> +	if (err == NF_ACCEPT &&
>>> +	    ct->status & IPS_SRC_NAT && ct->status & IPS_DST_NAT) {
>>> +		if (maniptype == NF_NAT_MANIP_SRC)
>>> +			maniptype = NF_NAT_MANIP_DST;
>>> +		else
>>> +			maniptype = NF_NAT_MANIP_SRC;
>>> +
>>> +		err = ct_nat_execute(skb, ct, ctinfo, range, maniptype);
>>> +	}
>>> +	return err;
>>>   #else
>>>   	return NF_ACCEPT;
>>>   #endif
>>>
>> +paul
>
> Hi Aaron,
>
> I think I understand the issue and this looks good,
>
> Can you describe the scenario to reproduce this?

It reproduces with OpenShift 3.10, which pumps forward-direction packets
between namespaces through a tun device that applies NAT rules to
rewrite the destination.  Limit the number of ephemeral ports available
in the client namespace by editing net.ipv4.ip_local_port_range there,
and connect to the server namespace.  That's the mechanism for OvS.  But
for TC I guess there wouldn't be anything convenient available.

I'll try to script up something that doesn't use openshift.
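
As a rough illustration of what the patch changes, here is a small
user-space model of the two-pass logic.  It is only a sketch: every name
in it (model_pkt, model_ct, model_nat_execute, model_act_nat, the ST_*
bits and VERDICT_ACCEPT) is a hypothetical stand-in for the kernel's
sk_buff, nf_conn, ct_nat_execute(), IPS_SRC_NAT/IPS_DST_NAT and
NF_ACCEPT, and NAT is reduced to rewriting a port number.

```c
#include <assert.h>

/* Stand-ins for kernel definitions; this models the logic only. */
enum manip { MANIP_SRC, MANIP_DST };
#define ST_SRC_NAT     (1u << 0)   /* models IPS_SRC_NAT */
#define ST_DST_NAT     (1u << 1)   /* models IPS_DST_NAT */
#define VERDICT_ACCEPT 1           /* models NF_ACCEPT */

struct model_pkt {
	unsigned int src_port;
	unsigned int dst_port;
};

struct model_ct {
	unsigned int status;       /* which translations the entry carries */
	unsigned int new_src_port; /* SNAT result, e.g. after a tuple collision */
	unsigned int new_dst_port; /* DNAT result */
};

/* Hypothetical stand-in for ct_nat_execute(): one manipulation per call. */
static int model_nat_execute(struct model_pkt *p, const struct model_ct *ct,
			     enum manip m)
{
	assert(p && ct);
	if (m == MANIP_SRC && (ct->status & ST_SRC_NAT))
		p->src_port = ct->new_src_port;
	else if (m == MANIP_DST && (ct->status & ST_DST_NAT))
		p->dst_port = ct->new_dst_port;
	return VERDICT_ACCEPT;
}

/* Models the patched tail of tcf_ct_act_nat(): run the first manip type,
 * then the opposite one when the entry has both SNAT and DNAT set. */
static int model_act_nat(struct model_pkt *p, const struct model_ct *ct,
			 enum manip first)
{
	int err = model_nat_execute(p, ct, first);

	if (err == VERDICT_ACCEPT &&
	    (ct->status & ST_SRC_NAT) && (ct->status & ST_DST_NAT)) {
		enum manip second = (first == MANIP_SRC) ? MANIP_DST
							 : MANIP_SRC;

		err = model_nat_execute(p, ct, second);
	}
	return err;
}
```

With a packet {40000, 80} and an entry carrying both bits (new source
port 40001, new destination port 8080), calling model_act_nat() with
MANIP_DST rewrites both ports.  Without the second pass only the
destination would change, which is the asymmetry the patch addresses.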

>
> Thanks,
>
> Paul.


