[ovs-dev] [PATCH] conntrack: Reset ct_state when entering a new zone.

Dumitru Ceara dceara at redhat.com
Thu Mar 5 11:32:56 UTC 2020


On 3/4/20 8:44 PM, Dumitru Ceara wrote:
> On 3/4/20 7:45 PM, Ilya Maximets wrote:
>> On 3/4/20 2:01 PM, Dumitru Ceara wrote:
>>> On 1/30/20 3:16 PM, Dumitru Ceara wrote:
>>>> When a new conntrack zone is entered, the ct_state field is zeroed in
>>>> order to avoid using state information from different zones.
>>>>
>>>> One such scenario is when a packet is double NATed. Assuming two zones
>>>> and 3 flows performing the following actions in order on the packet:
>>>> 1. ct(zone=5,nat), recirc
>>>> 2. ct(zone=1), recirc
>>>> 3. ct(zone=1,nat)
>>>>
>>>> If at step #1 the packet matches an existing NAT entry, it will get
>>>> translated and pkt->md.ct_state is set to CS_DST_NAT or CS_SRC_NAT.
>>>> At step #2 the new tuple might match an existing connection and
>>>> pkt->md.ct_zone is set to 1.
>>>> If at step #3 the packet matches an existing NAT entry in zone 1,
>>>> handle_nat() will be called to perform the translation but it will
>>>> return early because the packet's zone matches the conntrack zone and
>>>> the ct_state field still contains CS_DST_NAT or CS_SRC_NAT from the
>>>> translations in zone 5.
>>>>
>>>> In order to reliably detect when a packet enters a new conntrack zone
>>>> we also need to zero out the pkt->md.ct_zone field when initializing
>>>> metadata in pkt_metadata_init().
>>>>
>>>> CC: Darrell Ball <dlu998 at gmail.com>
>>>> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
>>>> ---
>>>>  lib/conntrack.c | 5 +++++
>>>>  lib/packets.h   | 5 +++++
>>>>  2 files changed, 10 insertions(+)
>>>>
>>>> diff --git a/lib/conntrack.c b/lib/conntrack.c
>>>> index ff5a894..e4d934a 100644
>>>> --- a/lib/conntrack.c
>>>> +++ b/lib/conntrack.c
>>>> @@ -1277,6 +1277,11 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
>>>>              const struct nat_action_info_t *nat_action_info,
>>>>              ovs_be16 tp_src, ovs_be16 tp_dst, const char *helper)
>>>>  {
>>>> +    /* Reset ct_state whenever entering a new zone. */
>>>> +    if (pkt->md.ct_zone != zone) {
>>>> +        pkt->md.ct_state = 0;
>>>> +    }
>>>> +
>>>>      bool create_new_conn = false;
>>>>      conn_key_lookup(ct, &ctx->key, ctx->hash, now, &ctx->conn, &ctx->reply);
>>>>      struct conn *conn = ctx->conn;
>>>> diff --git a/lib/packets.h b/lib/packets.h
>>>> index 5d7f82c..fae64bb 100644
>>>> --- a/lib/packets.h
>>>> +++ b/lib/packets.h
>>>> @@ -161,6 +161,11 @@ pkt_metadata_init(struct pkt_metadata *md, odp_port_t port)
>>>>       */
>>>>      memset(md, 0, offsetof(struct pkt_metadata, ct_orig_tuple_ipv6));
>>>>  
>>>> +    /* Explicitly zero out ct_zone in order to be able to properly determine
>>>> +     * when a packet enters a new conntrack zone.
>>>> +     */
>>>> +    md->ct_zone = 0;
>>
> 
> Hi Ilya,
> 
> Thanks for reviewing this!
> 
>> I'm not an expert in conntrack, but I'm bothered about this change in
>> init function.  This function works for every single packet and any
>> modification might significantly affect performance.
> 
> I agree, I'm not too happy with this either but it seemed the safest
> from a correctness perspective.
> 
>>
>> There is an assumption that conntrack fields of packet metadata are
>> never used if ct_state is zero.  So, every user of these fields must
>> be sure that ct_state is correctly initialized.
> 
> Yes, but as far as I understand, this doesn't imply that if ct_state is
> non-zero the pkt->md.ct_zone field is initialized and safe to use.
> 
>> Will it work if we'll change your 'if' statement in 'process_one'
>> function to something like:
>>
>>     if (pkt->md.ct_state && pkt->md.ct_zone != zone) {
>>         pkt->md.ct_state = 0;
>>     }
>>
>> and will not change metadata initialization code?
>>
> 
> I think one example of reading unitialized pkt->md.ct_zone is if
> process_one() was called for a packet and returned early in case of
> CT_CONN_TYPE_UN_NAT:
> 
> https://github.com/openvswitch/ovs/blob/master/lib/conntrack.c#L1303
> 
> Here we set pkt->md.ct_state to "CS_TRACKED | CS_INVALID" but we don't
> touch pkt->md.ct_zone. I think there are cases when this packet could
> reach process_one() again further down the pipeline and we'd be reading
> the uninitialized pkt->md.ct_zone.
> 
> Right now I don't see any other places where we set pkt->md.ct_state
> while leaving pkt->md.ct_zone uninitialized. If that's true I can make
> sure ct_zone is properly initialized here too and then the check you
> suggested would be ok in all cases and we could get rid of the change in
> pkt_metadata_init().
> 
> I'll give it a try and test it out to see if all scenarios are covered.
> 
> Thanks,
> Dumitru
> 

Hi Ilya, Darrell,

I sent out a v2 with the changes mentioned above:
https://patchwork.ozlabs.org/patch/1249513/

Thanks,
Dumitru

>>>> +
>>>>      /* It can be expensive to zero out all of the tunnel metadata. However,
>>>>       * we can just zero out ip_dst and the rest of the data will never be
>>>>       * looked at. */
>>>>
>>>
>>> Hi,
>>>
>>> Just a reminder about this patch.
>>>
>>> OVN LB harpinning system tests on OVN master and OVN-20.03 with OVS
>>> userspace datapath fail because of the issue this patch addresses.
>>>
>>> Thanks,
>>> Dumitru
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>
> 



More information about the dev mailing list