[ovs-dev] [PATCH][v2] conntrack: Fix conntrack tw expiration
Aaron Conole
aconole at redhat.com
Thu Jan 14 15:22:37 UTC 2021
"Li,Rongqing" <lirongqing at baidu.com> writes:
>
>
>
>
>> -----Original Message-----
>
>> From: Aaron Conole [mailto:aconole at redhat.com]
>
>> Sent: Friday, January 08, 2021 11:15 PM
>
>> To: Li,Rongqing <lirongqing at baidu.com>
>
>> Cc: ovs-dev at openvswitch.org; William Tu <u9012063 at gmail.com>
>
>> Subject: Re: [ovs-dev] [PATCH][v2] conntrack: Fix conntrack tw expiration
>
>>
>
>> "Li,Rongqing" <lirongqing at baidu.com> writes:
>
>>
>
>> > struct conntrack *ct, struct conn
>
>> >> > *conn_,
>
>> >> >
>
>> >> > if (tcp_flags & TCP_RST) {
>
>> >> > src->state = dst->state = CT_DPIF_TCPS_TIME_WAIT;
>
>> >> > + conn_update_expiration(ct, &conn->up,
>
>> >> CT_TM_TCP_CLOSED,
>
>> >> > + now);
>
>> >> > }
>
>> >> > } else {
>
>> >> > COVERAGE_INC(conntrack_tcp_seq_chk_failed);
>
>> >>
>
>> >> Does it make more sense to move the entire conn_update_expiration
>
>> >> block in the previous condition to a common place after the 3 cases
>
>> >> here (seq_chk, shotgun-syn, and in-window blocks)?
>
>> >>
>
>> >> That should cover this case, and the fin assignment as well. Then
>
>> >> again, there are only two assignments in this block.
>
>> >>
>
>> >
>
>> >
>
>> > Do you mean we should change like below; I think your suggestion make
>
>> > sense, if no one has objection, I will send a new version
>
>> >
>
>> > diff --git a/lib/conntrack-tcp.c b/lib/conntrack-tcp.c index
>
>> > 18a2aa7c7..b827eab62 100644
>
>> > --- a/lib/conntrack-tcp.c
>
>> > +++ b/lib/conntrack-tcp.c
>
>> > @@ -338,21 +338,6 @@ tcp_conn_update(struct conntrack *ct, struct conn
>
>> *conn_,
>
>> > src->state = dst->state = CT_DPIF_TCPS_TIME_WAIT;
>
>> > }
>
>> >
>
>> > - if (src->state >= CT_DPIF_TCPS_FIN_WAIT_2
>
>> > - && dst->state >= CT_DPIF_TCPS_FIN_WAIT_2) {
>
>> > - conn_update_expiration(ct, &conn->up, CT_TM_TCP_CLOSED,
>
>> now);
>
>> > - } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > - && dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > - conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_FIN_WAIT, now);
>
>> > - } else if (src->state < CT_DPIF_TCPS_ESTABLISHED
>
>> > - || dst->state < CT_DPIF_TCPS_ESTABLISHED) {
>
>> > - conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_OPENING, now);
>
>> > - } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > - || dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > - conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_CLOSING, now);
>
>> > - } else {
>
>> > - conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_ESTABLISHED, now);
>
>> > - }
>
>> > } else if ((dst->state < CT_DPIF_TCPS_SYN_SENT
>
>> > || dst->state >= CT_DPIF_TCPS_FIN_WAIT_2
>
>> > || src->state >= CT_DPIF_TCPS_FIN_WAIT_2) @@
>
>> -412,6
>
>> > +397,22 @@ tcp_conn_update(struct conntrack *ct, struct conn *conn_,
>
>> > return CT_UPDATE_INVALID;
>
>> > }
>
>> >
>
>> > + if (src->state >= CT_DPIF_TCPS_FIN_WAIT_2
>
>> > + && dst->state >= CT_DPIF_TCPS_FIN_WAIT_2) {
>
>> > + conn_update_expiration(ct, &conn->up, CT_TM_TCP_CLOSED,
>
>> now);
>
>> > + } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > + && dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > + conn_update_expiration(ct, &conn->up, CT_TM_TCP_FIN_WAIT,
>
>> now);
>
>> > + } else if (src->state < CT_DPIF_TCPS_ESTABLISHED
>
>> > + || dst->state < CT_DPIF_TCPS_ESTABLISHED) {
>
>> > + conn_update_expiration(ct, &conn->up, CT_TM_TCP_OPENING,
>
>> now);
>
>> > + } else if (src->state >= CT_DPIF_TCPS_CLOSING
>
>> > + || dst->state >= CT_DPIF_TCPS_CLOSING) {
>
>> > + conn_update_expiration(ct, &conn->up, CT_TM_TCP_CLOSING,
>
>> now);
>
>> > + } else {
>
>> > + conn_update_expiration(ct, &conn->up,
>
>> CT_TM_TCP_ESTABLISHED, now);
>
>> > + }
>
>> > +
>
>> > return CT_UPDATE_VALID;
>
>> > }
>
>>
>
>> I'm much less sure now, re-reading the comments, especially:
>
>>
>
>> * This must be a little more careful than the above code
>
>> * since packet floods will also be caught here. We don't
>
>> * update the TTL here to mitigate the damage of a packet
>
>> * flood and so the same code can handle awkward establishment
>
>> * and a loosened connection close.
>
>> * In the establishment case, a correct peer response will
>
>> * validate the connection, go through the normal state code
>
>> * and keep updating the state TTL.
>
>
>
>
>
> Sorry, I am not clear how these codes related to TLL?
As I understand the code, the state TTL is the expiration time for the
state.
>>
>
>> I think given that, it would be wrong to make any change to the connection
>
>> expiration time in this branch.
>
>>
>
>> I guess you're hitting the 'loosened connection close' case, which will not
>
>
>
>
>
> the connection that I see is blow:
>
>
>
> *
>
>
>
>
>
>> change the TTL and keep the connection in the table longer. Looking at other
>
>> PF based firewalls, none of them ever update the xon timeout in this
>
>> fall-through case (I looked at openbsd, dragonfly bsd, and freebsd).
>
>>
>
> Is it possible this is a common issue for BSD? I did not find the similar logic in linux
>
It is possible that such issues don't impact linux, but it could be for
other reasons. If you have a traffic capture that you can post we can
look at it. Best would be to create a test case using sendpkt.py to
send the packets - that would demonstrate the issue between the two (and
help us to keep it cohesive in the future).
I think we do want to keep the datapaths as similar as possible, so I
would welcome this test to be added.
>
>> Have you looked at using the CT timeout policy framework to setup a timeout
>
>
>
>
>
> What is CT timeout policy? Could you give some links, thanks
I guess there isn't a whole lot of documentation. We should fix that.
>From ovs-vsctl(8):
[--may-exist] add-zone-tp datapath zone=zone_id policies
Creates a conntrack zone timeout policy with zone_id in data‐
path. The policies consist of key=value pairs, separated by
spaces. For example, icmp_first=30 icmp_reply=60 specifies a
30-second timeout policy for the first ICMP packet and a 60-sec‐
ond policy for ICMP reply packets. See the CT_Timeout_Policy
table in ovs-vswitchd.conf.db(5) for the supported keys.
Without --may-exist, attempting to add a zone_id that already
exists is an error. With --may-exist, this command does nothing
if zone_id already exists.
>From the CT_Timeout_Policy section of ovs-vswitchd.conf.db(5):
CT_Timeout_Policy TABLE
Connection tracking timeout policy configuration
Summary:
Timeouts:
timeouts map of string-integer pairs, key one of
icmp_first, icmp_reply, tcp_close,
tcp_close_wait, tcp_established,
tcp_fin_wait, tcp_last_ack, tcp_retrans‐
mit, tcp_syn_recv, tcp_syn_sent2,
tcp_syn_sent, tcp_time_wait, tcp_unack,
udp_first, udp_multiple, or udp_single,
value in range 0 to 4,294,967,295
TCP Timeouts:
timeouts : tcp_syn_sent optional integer, in range 0 to
4,294,967,295
timeouts : tcp_syn_recv optional integer, in range 0 to
4,294,967,295
timeouts : tcp_established
optional integer, in range 0 to
4,294,967,295
timeouts : tcp_fin_wait optional integer, in range 0 to
4,294,967,295
timeouts : tcp_close_wait
optional integer, in range 0 to
4,294,967,295
timeouts : tcp_last_ack optional integer, in range 0 to
4,294,967,295
timeouts : tcp_time_wait optional integer, in range 0 to
4,294,967,295
timeouts : tcp_close optional integer, in range 0 to
4,294,967,295
timeouts : tcp_syn_sent2 optional integer, in range 0 to
4,294,967,295
timeouts : tcp_retransmit
optional integer, in range 0 to
4,294,967,295
timeouts : tcp_unack optional integer, in range 0 to
4,294,967,295
....
As an example (just illustration - I didn't test any of this):
ovs-vsctl --may-exist add-zone-tp netdev zone=0 tcp_time_wait=30 \
tcp_close=30 tcp_last_ack=30 tcp_fin_wait=30
>
>
> Thanks
>
>
>
> -Li
More information about the dev
mailing list