[ovs-dev] [PATCH][v2] conntrack: Fix conntrack tw expiration

Aaron Conole aconole at redhat.com
Thu Jan 14 15:22:37 UTC 2021


"Li,Rongqing" <lirongqing at baidu.com> writes:

>  
>
>  
>
>> -----Original Message-----
>
>> From: Aaron Conole [mailto:aconole at redhat.com]
>
>> Sent: Friday, January 08, 2021 11:15 PM
>
>> To: Li,Rongqing <lirongqing at baidu.com>
>
>> Cc: ovs-dev at openvswitch.org; William Tu <u9012063 at gmail.com>
>
>> Subject: Re: [ovs-dev] [PATCH][v2] conntrack: Fix conntrack tw expiration
>
>> 
>
>> "Li,Rongqing" <lirongqing at baidu.com> writes:
>
>> 
>
>> > struct conntrack *ct, struct conn
>
>> >> > *conn_,
>
>> >> >
>
>> >> >          if (tcp_flags & TCP_RST) {
>
>> >> >              src->state = dst->state = CT_DPIF_TCPS_TIME_WAIT;
>
>> >> > +            conn_update_expiration(ct, &conn->up,
>
>> >> CT_TM_TCP_CLOSED,
>
>> >> > + now);
>
>> >> >          }
>
>> >> >      } else {
>
>> >> >          COVERAGE_INC(conntrack_tcp_seq_chk_failed);
>
>> >>
>
>> >> Does it make more sense to move the entire conn_update_expiration
>
>> >> block in the previous condition to a common place after the 3 cases
>
>> >> here (seq_chk, shotgun-syn, and in-window blocks)?
>
>> >>
>
>> >> That should cover this case, and the fin assignment as well.  Then
>
>> >> again, there are only two assignments in this block.
>
>> >>
>
>> >
>
>> >
>
>> > Do you mean we should change like below; I think your suggestion make
>
>> > sense, if no one has objection, I will send a new version
>
>> >
>
>> > diff --git a/lib/conntrack-tcp.c b/lib/conntrack-tcp.c index
>
>> > 18a2aa7c7..b827eab62 100644
>
>> > --- a/lib/conntrack-tcp.c
>
>> > +++ b/lib/conntrack-tcp.c
>
>> > @@ -338,21 +338,6 @@ tcp_conn_update(struct conntrack *ct, struct conn
>
>> *conn_,
>
>> >              src->state = dst->state = CT_DPIF_TCPS_TIME_WAIT;
>
>> >          }
>
>> >
>
>> > -        if (src->state >= CT_DPIF_TCPS_FIN_WAIT_2
>
>> > -            && dst->state >= CT_DPIF_TCPS_FIN_WAIT_2) {
>
>> > -            conn_update_expiration(ct, &conn->up, CT_TM_TCP_CLOSED,
>
>> now);
>
>> > -        } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > -                   && dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > -            conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_FIN_WAIT, now);
>
>> > -        } else if (src->state < CT_DPIF_TCPS_ESTABLISHED
>
>> > -                   || dst->state < CT_DPIF_TCPS_ESTABLISHED) {
>
>> > -            conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_OPENING, now);
>
>> > -        } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > -                   || dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > -            conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_CLOSING, now);
>
>> > -        } else {
>
>> > -            conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_ESTABLISHED, now);
>
>> > -        }
>
>> >      } else if ((dst->state < CT_DPIF_TCPS_SYN_SENT
>
>> >                  || dst->state >= CT_DPIF_TCPS_FIN_WAIT_2
>
>> >                  || src->state >= CT_DPIF_TCPS_FIN_WAIT_2) @@
>
>> -412,6
>
>> > +397,22 @@ tcp_conn_update(struct conntrack *ct, struct conn *conn_,
>
>> >          return CT_UPDATE_INVALID;
>
>> >      }
>
>> >
>
>> > +    if (src->state >= CT_DPIF_TCPS_FIN_WAIT_2
>
>> > +        && dst->state >= CT_DPIF_TCPS_FIN_WAIT_2) {
>
>> > +        conn_update_expiration(ct, &conn->up, CT_TM_TCP_CLOSED,
>
>> now);
>
>> > +    } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > +               && dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > +        conn_update_expiration(ct, &conn->up, CT_TM_TCP_FIN_WAIT,
>
>> now);
>
>> > +    } else if (src->state < CT_DPIF_TCPS_ESTABLISHED
>
>> > +               || dst->state < CT_DPIF_TCPS_ESTABLISHED) {
>
>> > +        conn_update_expiration(ct, &conn->up, CT_TM_TCP_OPENING,
>
>> now);
>
>> > +    } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > +               || dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > +        conn_update_expiration(ct, &conn->up, CT_TM_TCP_CLOSING,
>
>> now);
>
>> > +    } else {
>
>> > +        conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_ESTABLISHED, now);
>
>> > +    }
>
>> > +
>
>> >      return CT_UPDATE_VALID;
>
>> >  }
>
>> 
>
>> I'm much less sure now, re-reading the comments, especially:
>
>> 
>
>>          * This must be a little more careful than the above code
>
>>          * since packet floods will also be caught here. We don't
>
>>          * update the TTL here to mitigate the damage of a packet
>
>>          * flood and so the same code can handle awkward establishment
>
>>          * and a loosened connection close.
>
>>          * In the establishment case, a correct peer response will
>
>>          * validate the connection, go through the normal state code
>
>>          * and keep updating the state TTL.
>
>  
>
>  
>
> Sorry, I am not clear how these codes related to TLL?

As I understand the code, the state TTL is the expiration time for the
state.

>> 
>
>> I think given that, it would be wrong to make any change to the connection
>
>> expiration time in this branch.
>
>> 
>
>> I guess you're hitting the 'loosened connection close' case, which will not
>
>  
>
>  
>
> the connection that I see is blow:
>
>  
>
> *
>
>  
>
>  
>
>> change the TTL and keep the connection in the table longer.  Looking at other
>
>> PF based firewalls, none of them ever update the xon timeout in this
>
>> fall-through case (I looked at openbsd, dragonfly bsd, and freebsd).
>
>> 
>
> Is it possible this is a common issue for BSD? I did not find the similar logic in linux
>

It is possible that such issues don't impact linux, but it could be for
other reasons.  If you have a traffic capture that you can post we can
look at it.  Best would be to create a test case using sendpkt.py to
send the packets - that would demonstrate the issue between the two (and
help us to keep it cohesive in the future).

I think we do want to keep the datapaths as similar as possible, so I
would welcome this test to be added.

>
>> Have you looked at using the CT timeout policy framework to setup a timeout
>
>  
>
>  
>
> What is CT timeout policy? Could you give some links, thanks

I guess there isn't a whole lot of documentation.  We should fix that.

>From ovs-vsctl(8):
       [--may-exist] add-zone-tp datapath zone=zone_id policies
              Creates  a  conntrack  zone timeout policy with zone_id in data‐
              path.  The policies consist of  key=value  pairs,  separated  by
              spaces.   For  example,  icmp_first=30 icmp_reply=60 specifies a
              30-second timeout policy for the first ICMP packet and a 60-sec‐
              ond  policy  for  ICMP reply packets.  See the CT_Timeout_Policy
              table in ovs-vswitchd.conf.db(5) for the supported keys.

              Without --may-exist, attempting to add a  zone_id  that  already
              exists is an error.  With --may-exist, this command does nothing
              if zone_id already exists.

>From the CT_Timeout_Policy section of ovs-vswitchd.conf.db(5):

CT_Timeout_Policy TABLE
       Connection tracking timeout policy configuration

   Summary:
       Timeouts:
         timeouts                    map of string-integer pairs, key  one  of
                                     icmp_first,     icmp_reply,    tcp_close,
                                     tcp_close_wait,          tcp_established,
                                     tcp_fin_wait,  tcp_last_ack, tcp_retrans‐
                                     mit,     tcp_syn_recv,     tcp_syn_sent2,
                                     tcp_syn_sent,  tcp_time_wait,  tcp_unack,
                                     udp_first, udp_multiple,  or  udp_single,
                                     value in range 0 to 4,294,967,295
         TCP Timeouts:
            timeouts : tcp_syn_sent  optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_syn_recv  optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_established
                                     optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_fin_wait  optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_close_wait
                                     optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_last_ack  optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_time_wait optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_close     optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_syn_sent2 optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_retransmit
                                     optional   integer,   in   range   0   to
                                     4,294,967,295
            timeouts : tcp_unack     optional   integer,   in   range   0   to
                                     4,294,967,295

  ....

As an example (just illustration - I didn't test any of this):

  ovs-vsctl --may-exist add-zone-tp netdev zone=0 tcp_time_wait=30 \
  tcp_close=30 tcp_last_ack=30 tcp_fin_wait=30

>  
>
> Thanks
>
>  
>
> -Li



More information about the dev mailing list