[ovs-dev] [PATCH v5 05/16] conntrack: Periodically delete expired connections.

Thu Jul 28 06:13:47 UTC 2016

On 27/07/2016 21:13, "Joe Stringer" <joe at ovn.org> wrote:

>On 27 July 2016 at 19:01, Daniele Di Proietto <diproiettod at vmware.com> wrote:
>>
>>
>>
>>
>>
>> On 27/07/2016 17:14, "Joe Stringer" <joe at ovn.org> wrote:
>>
>>>On 26 July 2016 at 17:58, Daniele Di Proietto <diproiettod at vmware.com> wrote:
>>>> This commit adds a thread that periodically removes expired connections.
>>>>
>>>> The expiration time of a connection can be expressed by:
>>>>
>>>> expiration = now + timeout
>>>>
>>>> For each possible 'timeout' value (there aren't many) we keep a list.
>>>> When the expiration is updated, we move the connection to the back of the
>>>> corresponding 'timeout' list. This ways, the list is always ordered by
>>>> 'expiration'.
>>>>
>>>> When the cleanup thread iterates through the lists for expired
>>>> connections, it can stop at the first non expired connection.
>>>>
>>>> Suggested-by: Joe Stringer <joe at ovn.org>
>>>> Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
>>>
>>>Acked-by: Joe Stringer <joe at ovn.org>
>>
>> Thanks for the review!
>>
>>>
>>>Minor comments on comments below. Thanks!
>>>
>>><snip>
>>>
>>>> +/* Cleanup:
>>>> + *
>>>
>>>Extra line.
>>
>> Fixed
>>
>>>
>>>> + * We must call conntrack_clean() periodically.  conntrack_clean() return
>>>> + * value gives an hint on when the next cleanup must be done (either because
>>>> + * there is an actual connection that expires, or because a new connection
>>>> + * might be created with the minimum timeout).
>>>> + *
>>>> + * The logic below has two goals:
>>>> + *
>>>> + * - Avoid calling conntrack_clean() too often.  If we call conntrack_clean()
>>>> + *   each time a connection expires, the thread will consume 100% CPU, so we
>>>> + *   try to call the function _at most_ once every CT_CLEAN_INTERVAL, to batch
>>>> + *   removal.
>>>
>>>Isn't it CT_CLEAN_MIN_INTERVAL that prevents calls happening too
>>>often? I would imagine that under high cps conditions where you're
>>>likely to peg 100% on cleanup, cleanup is behind and CT_CLEAN_INTERVAL
>>>logic won't come into the picture.
>>>
>>>> + * - On the other hand, it's not a good idea to keep the buckets locked for
>>>> + *   too long, as we might prevent traffic from flowing.  If conntrack_clean()
>>>> + *   returns a value which is in the past, it means that the internal limit
>>>> + *   has been reached and more cleanup is required.  In this case, just wait
>>>> + *   CT_CLEAN_MIN_INTERVAL before the next call.
>>>> + */
>>>
>>>Keeping the buckets locked too long also happens if you constantly
>>>call conntrack_clean(), so I think these two paragraphs are arguing
>>>slightly different angles for the same parameter.
>>>
>>>CT_CLEAN_MIN_INTERVAL ensures that if cleanup is behind, there is
>>>atleast some 200ms blocks of time when buckets will be left alone so
>>>the datapath can operate unhindered.
>>>
>>>CT_CLEAN_INTERVAL ensures that if we are coping with the current
>>>cleanup tasks, then we wait at least 5 seconds to do further cleanup.
>>>This seems like it's more targeted towards reducing wakeups when there
>>>is a wide distribution of timeouts but relatively small number of
>>>connections, that could be handled by less frequent cleanups.
>>>
>>>I like the "the logic has two goals" presentation of this, but maybe
>>>there is a better way we can frame the comment above?
>>
>> I couldn't have said it better, I almost stole your wording entirely:
>>
>> + * The logic below has two goals:
>> + *
>> + * - Avoid calling conntrack_clean() too often.  If we call conntrack_clean()
>> + *   each time a connection expires, the thread will consume 100% CPU, so we
>> + *   try to call the function _at most_ once every CT_CLEAN_INTERVAL, to batch
>> + *   removal.
>> + *
>> + * - On the other hand, it's not a good idea to keep the buckets locked for
>> + *   too long, as we might prevent traffic from flowing.  If conntrack_clean()
>> + *   returns a value which is in the past, it means that the internal limit
>> + *   has been reached and more cleanup is required.  In this case, just wait
>> + *   CT_CLEAN_MIN_INTERVAL before the next call.
>
>Is the new wording missing? This looks the same as the old wording.

Oops, you're right, copy paste mistake, this is what I changed it to:

* The logic below has two goals:
 *
 * - We want to reduce the number of wakeups and batch connection cleanup
 *   when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
 *   are coping with the current cleanup tasks, then we wait at least
 *   5 seconds to do further cleanup.
 *
 * - We don't want to keep the buckets locked too long, as we might prevent
 *   traffic from flowing.  CT_CLEAN_MIN_INTERVAL ensures that if cleanup is
 *   behind, there is at least some 200ms blocks of time when buckets will be
 *   left alone, so the datapath can operate unhindered.