[ovs-discuss] problem about cls pvector may cause ovs-vswitchd crash
Yinpeijun
yinpeijun at huawei.com
Tue Sep 15 11:22:28 UTC 2020
Hi All,
Recently we found a problem, as follows:
1. Problem description:
PVECTOR_FOR_EACH uses ovsrcu_get to read the pvector's current impl pointer, and the memory_order_consume
in ovsrcu_get ensures that *impl is read after that load, so we do get the correct ptr value stored in
impl->vector[0]; but it seems that we cannot be sure that the data that ptr points to (*ptr) is also correct.
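To make the concern concrete, here is a minimal sketch of the reader-side load chain, written from memory of lib/pvector.h; the field names are only illustrative, do_something() just stands in for whatever the caller does with the entry, and this is not the actual code:

/* Illustrative sketch of the reader's load chain (not the real lib/pvector.h code). */
struct pvector_impl *impl = ovsrcu_get(struct pvector_impl *, &pvec->impl);  /* consume load */
void *ptr = impl->vector[0].ptr;  /* carries a data dependency, so ordered after the consume load */
do_something(*ptr);               /* the question: is the pointed-to data itself guaranteed to be current? */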
2. Verification through testing:
Copy the test code below into lib/dpif-netdev.c, then modify the function pmd_thread_main and insert the following
line into its for (;;) loop (see the placement sketch below):
do_atomic_test(pmd);
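For clarity, the intended placement is roughly the following; the rest of the loop body is elided and this is not an exact patch:

/* Inside pmd_thread_main() in lib/dpif-netdev.c; everything else in the loop is elided. */
for (;;) {
    ...
    do_atomic_test(pmd);
    ...
}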
If do_atomic_consumer is implemented without taking the mutex, we can easily get log lines like the following in ovs-vswitchd.log:
----my_itrator:143, my_value:144
(We may get other values, but my_itrator = my_value - 1 always stays true.)
So the consumer thread can read dirty memory under memory_order_consume. If we change ovsrcu_get to read the value
with memory_order_acquire instead, we still get the same error message (my_itrator = my_value - 1).
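To spell out what the two orderings being compared mean for this access pattern, here is a small standalone C11 sketch; it is not OVS code, and the struct and names are made up purely for illustration:

#include <stdatomic.h>

struct impl { unsigned char *vector[1]; };
static _Atomic(struct impl *) g_impl;

/* Producer side: the release store pairs with the consume/acquire loads below. */
static void publish(struct impl *new_impl)
{
    atomic_store_explicit(&g_impl, new_impl, memory_order_release);
}

/* consume: only loads that carry a data dependency from "impl" are ordered
 * after the pointer load (compilers typically promote consume to acquire). */
static unsigned char read_consume(void)
{
    struct impl *impl = atomic_load_explicit(&g_impl, memory_order_consume);
    return *impl->vector[0];
}

/* acquire: every subsequent load is ordered after the pointer load, but this
 * still says nothing about writes the producer performs after its release store. */
static unsigned char read_acquire(void)
{
    struct impl *impl = atomic_load_explicit(&g_impl, memory_order_acquire);
    return *impl->vector[0];
}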
We can fix this problem if the consumer thread accesses the data under the protection of g_pvector_lock, but pvector is designed
to be used without locks. Where did it go wrong? Can anyone here give some comments?
--------------------------------------------- An error case that actually happened while using pvector --------------------------
Suppose a scenario as follows:
1) a handler thread inserts a dpcls_subtable, whose address is p,
into the datapath classifier's (struct dpcls) pvector;
2) a pmd thread calls fast_path_processing to find the subtable and accesses address p;
3) the subtable above is destroyed by someone;
4) the handler thread inserts another subtable, whose address happens to be p again,
into the dpcls's pvector;
5) the pmd thread calls fast_path_processing -> dp_netdev_pmd_lookup_flow
-> dpcls_lookup to access the subtable from the pvector.
In step 5 we use PVECTOR_FOR_EACH to iterate the pvector without any lock, so a dirty memory access may
cause the ovs-vswitchd process to core dump. A simplified sketch of that reader loop follows below.
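The lock-free reader path in question looks roughly like this; it is a heavily abbreviated paraphrase of dpcls_lookup() in lib/dpif-netdev.c, not the exact code:

/* Simplified sketch of the lock-free reader described in step 5. */
struct dpcls_subtable *subtable;

PVECTOR_FOR_EACH (subtable, &cls->subtables) {
    /* If the subtable at this address was freed (step 3) and the address was
     * reused for a new subtable (step 4) while we iterate, the fields read
     * here may belong to a different object than the one we looked up. */
    ...   /* lookup within the subtable */
}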
----------------------------------------------Testing code is as follows---------------------------------------------
/* g_atomic_test is assumed to be initialized with pvector_init() before the
 * test runs (the initialization is not shown here). */
static struct pvector g_atomic_test;
static unsigned char g_my_data[65536];
static unsigned char g_value = 0;
static int g_index = 0;
static struct ovs_mutex g_pvector_lock = OVS_MUTEX_INITIALIZER;

static void do_atomic_producer(void)
{
    ovs_mutex_lock(&g_pvector_lock);
    if (g_index < 65536) {
        g_my_data[g_index] = g_value;
        pvector_insert(&g_atomic_test, (void *)&g_my_data[g_index], 0);
        g_index++;
    } else {
        for (int i = 0; i < g_index; i++) {
            pvector_remove(&g_atomic_test, (void *)&g_my_data[i]);
        }
        g_index = 0;
        g_value++;  /* g_value may wrap around to zero; that is fine for the test. */
    }
    pvector_publish(&g_atomic_test);
    ovs_mutex_unlock(&g_pvector_lock);
}

static void do_atomic_consumer(void)
{
    unsigned char *my_itr;
    unsigned char my_value = 0;
    bool first = true;

    /* ovs_mutex_lock(&g_pvector_lock); */
    PVECTOR_FOR_EACH (my_itr, &g_atomic_test) {
        if (first) {
            /* Remember the value of the first entry; the producer wrote every
             * published entry with the same g_value. */
            my_value = *my_itr;
            first = false;
        } else if (*my_itr != my_value) {
            VLOG_ERR("----my_itrator:%u, my_value:%u\n", *my_itr, my_value);
            break;
        }
    }
    /* ovs_mutex_unlock(&g_pvector_lock); */
}

static void do_atomic_test(struct dp_netdev_pmd_thread *pmd)
{
    /* One pmd thread acts as the producer; all others act as consumers. */
    if (pmd->core_id == 1) {
        do_atomic_producer();
    } else {
        do_atomic_consumer();
    }
}