[ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
Ilya Maximets
i.maximets at ovn.org
Wed Jun 3 14:33:28 UTC 2020
On 6/3/20 1:08 PM, Linhaifeng wrote:
>
>
>> -----Original Message-----
>> From: Ilya Maximets [mailto:i.maximets at ovn.org]
>> Sent: Wednesday, June 3, 2020 6:50 PM
>> To: Linhaifeng <haifeng.lin at huawei.com>; Ben Pfaff <blp at ovn.org>
>> Cc: i.maximets at ovn.org; dev at openvswitch.org; Lilijun (Jerry)
>> <jerry.lilijun at huawei.com>; Lichunhe <lichunhe at huawei.com>; nd
>> <nd at arm.com>; chenchanghu <chenchanghu at huawei.com>
>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
>>
>> On 6/3/20 9:04 AM, Linhaifeng wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ben Pfaff [mailto:blp at ovn.org]
>>>> Sent: Wednesday, June 3, 2020 1:26 PM
>>>> To: Linhaifeng <haifeng.lin at huawei.com>
>>>> Cc: Yanqin Wei <Yanqin.Wei at arm.com>; dev at openvswitch.org; nd
>>>> <nd at arm.com>; Lilijun (Jerry) <jerry.lilijun at huawei.com>; chenchanghu
>>>> <chenchanghu at huawei.com>; Lichunhe <lichunhe at huawei.com>
>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
>>>>
>>>> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote:
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ben Pfaff [mailto:blp at ovn.org]
>>>>> Sent: Wednesday, June 3, 2020 1:28 AM
>>>>> To: Linhaifeng <haifeng.lin at huawei.com>
>>>>> Cc: Yanqin Wei <Yanqin.Wei at arm.com>; dev at openvswitch.org; nd
>>>>> <nd at arm.com>; Lilijun (Jerry) <jerry.lilijun at huawei.com>;
>>>>> chenchanghu <chenchanghu at huawei.com>; Lichunhe
>> <lichunhe at huawei.com>
>>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
>>>>>
>>>>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote:
>>>>>> We should update rcu pointer first then use ovsrcu_postpone to free
>>>>>> otherwise maybe cause use-after-free.
>>>>>> e.g.,reader indicates momentary quiescent and access old pointer
>>>>>> after writer postpone free old pointer and before setting new pointer.
>>>>>>
>>>>>> Signed-off-by: Linhaifeng <haifeng.lin at huawei.com>
>>>>>
>>>>> I don't see how that's possible, since the writer hasn't quiesced.
>>>>>
>>>>> I think the logic is as follow, Could you help me find out where is incorrect?
>>>>>
>>>>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4
>>>>> ->
>>>>> 3.3 -> 2.2(use after free)
>>>>>
>>>>> wirter:
>>>>> 1.1 use postone to free old pointer
>>>>> 1.2 flush cbsets to flushed_cbsets
>>>>> 1.3 update new pointer
>>>>> 1.4 quiesced
>>>>>
>>>>> Read:
>>>>> 2.1. read pointer
>>>>> 2.2. use pointer
>>>>> 2.3. quiesced
>>>>>
>>>>> Rcu:
>>>>> 3.1 pop flushed_cbsets
>>>>> 3.2 ovsrcu_synchronize
>>>>> 3.3 call all cb to free
>>>>
>>>> So you're saying this:
>>>>
>>>> 1.1 use postone to free old pointer (A)
>>>> 1.2 flush cbsets to flushed_cbsets
>>>>
>>>> 3.1 pop flushed_cbsets
>>>> 3.2 ovsrcu_synchronize
>>>>
>>>> 2.1. read pointer (A)
>>>> 2.2. use pointer (A)
>>>> 2.3. quiesced
>>>>
>>>> 2.1. read pointer (A)
>>>>
>>>> 1.3 update new pointer (B)
>>>> 1.4 quiesced
>>>>
>>>> 3.3 call all cb to free (A)
>>>>
>>>> 2.2. use pointer (A)
>>>>
>>>> Wow, you are absolutely right. This had never occurred to me. Thank
>> you!
>>>> I'll review your patch.
>>>
>>> Yes, it's really hard to happen. If it happened it's also hard to find the reason
>> so I suggest it can be a rule for using rcu.
>>
>> I agree that there is an issue here, but I think that we should not force users to
>> call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do
>> anything illegal since pointer must not be freed before the next grace period
>> from their point of view.
>>
>> For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets
>> while writer is not quiesced yet. And we need to fix this inside rcu library itself.
>> For example, we could avoid flushing inside
>> ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using
>> x2nrealloc instead of flushing.
>>
>> Thoughts?
>>
> Hi, Ilya Maximets
>
> May be this is a good idea therefor the users not need to think about call ovsrcu_set() first or ovsrcu_postpone().
> How about you think, ben? May be you can send a patch to modify the ovsrcu_postpone() not to flush cbsets to
> replace of my patches.
The change could look like this:
diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
index ebc8120f0..cde1e925b 100644
--- a/lib/ovs-rcu.c
+++ b/lib/ovs-rcu.c
@@ -30,6 +30,8 @@
VLOG_DEFINE_THIS_MODULE(ovs_rcu);
+#define MIN_CBS 16
+
struct ovsrcu_cb {
void (*function)(void *aux);
void *aux;
@@ -37,7 +39,8 @@ struct ovsrcu_cb {
struct ovsrcu_cbset {
struct ovs_list list_node;
- struct ovsrcu_cb cbs[16];
+ struct ovsrcu_cb *cbs;
+ size_t n_allocated;
int n_cbs;
};
@@ -310,16 +313,19 @@ ovsrcu_postpone__(void (*function)(void *aux), void *aux)
cbset = perthread->cbset;
if (!cbset) {
cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset);
+ cbset->cbs = xmalloc(MIN_CBS * sizeof *cbset->cbs);
+ cbset->n_allocated = MIN_CBS;
cbset->n_cbs = 0;
}
+ if (cbset->n_cbs == cbset->n_allocated) {
+ cbset->cbs = x2nrealloc(cbset->cbs, &cbset->n_allocated,
+ sizeof *cbset->cbs);
+ }
+
cb = &cbset->cbs[cbset->n_cbs++];
cb->function = function;
cb->aux = aux;
-
- if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) {
- ovsrcu_flush_cbset(perthread);
- }
}
static bool
@@ -341,6 +347,7 @@ ovsrcu_call_postponed(void)
for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) {
cb->function(cb->aux);
}
+ free(cbset->cbs);
free(cbset);
}
---
I could submit a formal patch, if agreed.
Ben, what do you think?
More information about the dev
mailing list