[ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first

Linhaifeng haifeng.lin at huawei.com
Wed Jun 3 02:15:31 UTC 2020



> -----Original Message-----
> From: Ben Pfaff [mailto:blp at ovn.org]
> Sent: Wednesday, June 3, 2020 8:35 AM
> To: Yanqin Wei <Yanqin.Wei at arm.com>
> Cc: Linhaifeng <haifeng.lin at huawei.com>; dev at openvswitch.org; nd
> <nd at arm.com>; Lilijun (Jerry) <jerry.lilijun at huawei.com>; chenchanghu
> <chenchanghu at huawei.com>; Lichunhe <lichunhe at huawei.com>
> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
> 
> This is not how RCU works in OVS.  Every thread is by default considered
> active.  They rarely quiesce except implicitly inside poll_block().
> Please read the large comment at the top of ovs-rcu.h.
> 
> Is your patch based on actual bugs that you have found, or is it just some kind
> of precaution?  If it is the latter, then it is not needed.
> 
It is an actual bug found in an old version, but it also applies to other code in OVS.

Here is the debug info:
linux-mNuKFc:/Images/linhf/830/Euler_compile_env # gdb -p `pidof ovs-vswitchd`
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-3.h2
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-Huawei-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 102706
[New LWP 109133]
[New LWP 109134]
[New LWP 109297]
[New LWP 109298]
[New LWP 109299]
[New LWP 109300]
[New LWP 109303]
[New LWP 109304]
[New LWP 109308]
[New LWP 109309]
[New LWP 109310]
[New LWP 109311]
[New LWP 109522]
[New LWP 109523]
[New LWP 109603]
[New LWP 109615]
[New LWP 109619]
[New LWP 109655]
[New LWP 109673]
[New LWP 109794]
[New LWP 109795]
[New LWP 113953]
[New LWP 114362]
[New LWP 114364]
[New LWP 114368]
[New LWP 114370]
[New LWP 114373]
[New LWP 114377]
[New LWP 115594]
[New LWP 115595]
[New LWP 115596]
[New LWP 115597]
[New LWP 115598]
[New LWP 115600]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x0000ffff879981ac in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glib2-2.54.2-2.h1.aarch64 glibc-2.28-9.h17.aarch64 keyutils-libs-1.5.8-3.aarch64 krb5-libs-1.15.1-34.h2.aarch64 libcgroup-0.41-15.h3.aarch64 libcom_err-1.44.3-1.h4.aarch64 libgcc-7.3.0-20190804.h18.aarch64 libselinux-2.5-12.aarch64 numactl-libs-2.0.9-7.h1.aarch64 openssl-libs-1.0.2k-16.h6.aarch64 pcre-8.32-17.h9.aarch64 uvpkmc-1.0.1-807.aarch64 zlib-1.2.7-17.aarch64
(gdb) b dpcls_destroy_subtable
Breakpoint 1 at 0x508bcc: file lib/dpif-netdev.c, line 6919.
(gdb) b ovsrcu_call_postponed
Breakpoint 2 at 0x5b7d34: file lib/ovs-rcu.c, line 336.
(gdb) c
Continuing.
[Switching to Thread 0xffff83b97860 (LWP 109304)]

Thread 9 "urcu2" hit Breakpoint 2, ovsrcu_call_postponed () at lib/ovs-rcu.c:336
warning: Source file is more recent than executable.
336	{
(gdb) n
339	    int wait_del = 0;
(gdb) 
340	    while(wait_del);
(gdb) set wait_del = 1
(gdb) c
Continuing.
[Switching to Thread 0xffff51748860 (LWP 115598)]

Thread 34 "revalidator19" hit Breakpoint 1, dpcls_destroy_subtable (cls=0xffff1c00a420, subtable=0xffff3c009250) at lib/dpif-netdev.c:6919
6919	    int wait_get = 0;
(gdb) n
6920	    VLOG_DBG("Destroying subtable %p for in_port %d", subtable, cls->in_port);
(gdb) 
6921	    pvector_remove(&cls->subtables, subtable);
(gdb) 
6922	    cmap_remove(&cls->subtables_map, &subtable->cmap_node,
(gdb) set wait_get = 1
(gdb) n
6924	    cmap_destroy(&subtable->rules);
(gdb) p subtable->rules
$1 = {impl = {p = 0xffff30008940}}
(gdb) s
cmap_destroy (cmap=0xffff3c009258) at lib/cmap.c:288
288	    if (cmap) {
(gdb) n
289	        struct cmap_impl *impl = cmap_get_impl(cmap);
(gdb) 
290	        if (impl != &empty_cmap) {
(gdb) 
291	            ovsrcu_postpone(free_cacheline, impl);
(gdb) s
ovsrcu_postpone__ (function=0x6029b0 <free_cacheline>, aux=0xffff30008940) at lib/ovs-rcu.c:315
315	    struct ovsrcu_perthread *perthread = ovsrcu_perthread_get();
(gdb) n
318	    int size = ARRAY_SIZE(cbset->cbs);
(gdb) 
319	    cbset = perthread->cbset;
(gdb) 
320	    if (!cbset) {
(gdb) 
325	    cb = &cbset->cbs[cbset->n_cbs++];
(gdb) 
326	    cb->function = function;
(gdb) 
327	    cb->aux = aux;
(gdb) 
329	    if (cbset->n_cbs >= size) {
(gdb) set size = cbset->n_cbs
(gdb) n
330	        ovsrcu_flush_cbset(perthread);
(gdb) s
ovsrcu_flush_cbset (perthread=0xffff30001210) at lib/ovs-rcu.c:397
397	    ovsrcu_flush_cbset__(perthread, false);
(gdb) s
ovsrcu_flush_cbset__ (perthread=0xffff30001210, protected=false) at lib/ovs-rcu.c:380
380	    struct ovsrcu_cbset *cbset = perthread->cbset;
(gdb) n
382	    if (cbset) {
(gdb) 
383	        guarded_list_push_back(&flushed_cbsets, &cbset->list_node, SIZE_MAX);
(gdb) 
384	        perthread->cbset = NULL;
(gdb) 
386	        if (protected) {
(gdb) 
389	            seq_change(flushed_cbsets_seq);
(gdb) 
392	}
(gdb) 
ovsrcu_flush_cbset (perthread=0xffff30001210) at lib/ovs-rcu.c:398
398	}
(gdb) 
ovsrcu_postpone__ (function=0x6029b0 <free_cacheline>, aux=0xffff30008940) at lib/ovs-rcu.c:332
332	}
(gdb) 
cmap_destroy (cmap=0xffff3c009258) at lib/cmap.c:294
294	}
(gdb) 
dpcls_destroy_subtable (cls=0xffff1c00a420, subtable=0xffff3c009250) at lib/dpif-netdev.c:6925
6925	    while(wait_get);
(gdb) p wait_get
$2 = 1
(gdb) thir 9
Undefined command: "thir".  Try "help".
(gdb) thr 9
[Switching to thread 9 (Thread 0xffff83b97860 (LWP 109304))]
#0  ovsrcu_call_postponed () at lib/ovs-rcu.c:340
340	    while(wait_del);
(gdb) set wait_del = 0
(gdb) n
342	    guarded_list_pop_all(&flushed_cbsets, &cbsets);
(gdb) 
343	    if (ovs_list_is_empty(&cbsets)) {
(gdb) pcbsets cbsets
$173 = {list_node = {prev = 0xfffefc041660, next = 0xffff83b96f98}, cbs = {{function = 0x6029b0 <free_cacheline>, aux = 0xffff3c0094c0}, {function = 0x6029b0 <free_cacheline>, aux = 0xffff30008940}, {function = 0x508b9c <subtable_free>, aux = 0xffff3c009c10}, {function = 0xffff87947448 <free>, aux = 0xffff30001390}, {function = 0x4f9b74 <dp_netdev_flow_free>, aux = 0xffff34009b80}, {function = 0x0, aux = 0x0}, {function = 0x0, aux = 0x0}, {function = 0x90, aux = 0x74}, {function = 0x0, aux = 0xabc7c8 <ovsrcu_threads>}, {function = 0x0, aux = 0x0}, {function = 0xffffffff, aux = 0x0}, {function = 0x0, aux = 0x0}, {function = 0x0, aux = 0x2263e3}, {function = 0x0, aux = 0x3275637275}, {function = 0x0, aux = 0x21}, {function = 0xffff3000ea30, aux = 0xffff300013f0}}, n_cbs = 2}
(gdb) prcus ovsrcu_threads
$174 = {list_node = {prev = 0xabc7c8 <ovsrcu_threads>, next = 0xffff30001390}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3301106, cbset = 0x0, name = "pmd14\000\000\000\000\000\000\000\000\000\000"}
$175 = {list_node = {prev = 0xffff38003970, next = 0xffff30001210}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3297445, cbset = 0x0, name = "urcu2\000\000\000\000\000\000\000\000\000\000"}
$176 = {list_node = {prev = 0xffff30001390, next = 0xabc7c8 <ovsrcu_threads>}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3300573, cbset = 0x0, name = "revalidator19\000\000"}
(gdb) n
347	    ovsrcu_synchronize();
(gdb) s
ovsrcu_synchronize () at lib/ovs-rcu.c:225
225	{
(gdb) n
226	    unsigned int warning_threshold = INIT_WARNING_THRESHOLD_MS;
(gdb) 
227	    unsigned int block_report_frequent = BLOCK_REPORT_FREQUENT_MS;
(gdb) 
230	    int wait_round = 0;
(gdb) 
232	    if (single_threaded()) {
(gdb) 
236	    target_seqno = seq_read(global_seqno);
(gdb) 
237	    ovsrcu_quiesce_start();
(gdb) p target_seqno
$177 = 3301107
(gdb) thr 23    // switch to the pmd thread and keep issuing next until ovsrcu_try_quiesce is executed
(gdb) 
4975	                if (!ovsrcu_try_quiesce()) {
(gdb) 
4976	                    emc_cache_slow_sweep(&pmd->flow_cache);
(gdb) 
4980	                if (OVS_UNLIKELY(pmd->pmd_xinfo.flow_table_on_dp != flow_table_on_dp)) {
(gdb) prcus ovsrcu_threads
$194 = {list_node = {prev = 0xabc7c8 <ovsrcu_threads>, next = 0xffff30001210}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3301109, cbset = 0x0, name = "pmd14\000\000\000\000\000\000\000\000\000\000"}
$195 = {list_node = {prev = 0xffff38003970, next = 0xabc7c8 <ovsrcu_threads>}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3300573, cbset = 0x0, name = "revalidator19\000\000"}
(gdb) b util.c:261 if (p == 0xffff30008940)
Breakpoint 3 at 0x6029bc: file lib/util.c, line 261.
(gdb) b dpif-netdev.c:7129 if (subtable->rules->impl->p == 0xffff30008940)
Breakpoint 4 at 0x509444: file lib/dpif-netdev.c, line 7129.
(gdb) thr 34
[Switching to thread 34 (Thread 0xffff51748860 (LWP 115598))]
(gdb) set wait_get = 0
(gdb) c
Continuing.
[Switching to Thread 0xffff43ffe860 (LWP 113953)]

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129
7129	        while(wait_free);
(gdb) p wait_free
$200 = 0
(gdb) set wait_free = 1
(gdb) c
Continuing.
[Switching to Thread 0xffff43ffe860 (LWP 113953)]

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129
7129	        while(wait_free);
(gdb) c
Continuing.

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129
7129	        while(wait_free);
(gdb) c
Continuing.

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129
7129	        while(wait_free);
(gdb) c
Continuing.
[Switching to Thread 0xffff83b97860 (LWP 109304)]

Thread 9 "urcu2" hit Breakpoint 3, free_cacheline (p=0xffff30008940) at lib/util.c:261
261	    free(p);
(gdb) c
(gdb) c
Continuing.
[Switching to Thread 0xffff43ffe860 (LWP 113953)]

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129
7129	        while(wait_free);
(gdb) set wait_free = 0
(gdb) c
Continuing.

Thread 23 "pmd14" received signal SIGSEGV, Segmentation fault.
0x00000000004e1e50 in read_counter (bucket_=0xffff3003bfc0) at lib/cmap.c:333
333	    atomic_read_explicit(&bucket->counter, &counter, memory_order_acquire);
(gdb) bt
#0  0x00000000004e1e50 in read_counter (bucket_=0xffff3003bfc0) at lib/cmap.c:333
#1  0x00000000004e1e9c in read_even_counter (bucket=0xffff3003bfc0) at lib/cmap.c:344
#2  0x00000000004e2270 in cmap_find_batch (cmap=0xffff3c009258, map=1, hashes=0xffff43ffcb18, nodes=0xffff43ffcb98) at lib/cmap.c:459
#3  0x0000000000509468 in dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7131
#4  0x00000000004fabdc in dp_netdev_lookup_flow (classifiers=0xffff5237b0c8, key=0xffff43ffd3c0, lookup_num_p=0xffff43ffcd84) at lib/dpif-netdev.c:2111
#5  0x000000000050748c in fast_path_processing (pmd=0xffff51f4b000, packets_=0xffff43ffddb8, keys=0xffff43ffd3c0, batches=0xffff43ffd270, n_batches=0xffff43ffd6c0, in_port=11, now=862863019) at lib/dpif-netdev.c:6027
#6  0x0000000000507bb0 in dp_netdev_input__ (pmd=0xffff51f4b000, packets=0xffff43ffddb8, md_is_valid=false, port_no=11) at lib/dpif-netdev.c:6138
#7  0x0000000000507fa0 in dp_netdev_input (pmd=0xffff51f4b000, packets=0xffff43ffddb8, port_no=11) at lib/dpif-netdev.c:6222
#8  0x000000000067d130 in dp_uevs_forward (evs_ctx=0xffff43ffdd60, port_no=11, md_is_valid=false) at lib/evs/evs-dpdk.c:2717
#9  0x00000000005035e4 in evs_netdev_process_rxq_port (enable_blc=0, port_args=0xffff43ffdd40, pmd_args=0xffff43ffdd88) at lib/evs/evs-dpdk-inline.h:162
#10 evs_forward_loop (lacp_only_mode=false, ts=0xffff43ffdd30, port_args=0xffff43ffdd40, pmd_args=0xffff43ffdd88, poll_list=0xffff38009030, poll_cnt=24) at lib/dpif-netdev.c:4788
#11 pmd_thread_main (f_=0xffff51f4b000) at lib/dpif-netdev.c:4905
#12 0x00000000005bb0fc in ovsthread_wrapper (aux_=0x282d2280) at lib/ovs-thread.c:715
#13 0x0000ffff87f5c8bc in start_thread () from /lib64/libpthread.so.0
#14 0x0000ffff879a1e7c in thread_start () from /lib64/libc.so.6
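
To help read the seqno values in the dumps above, here is a rough standalone model of the grace-period check. It is only a sketch, not the OVS code: it assumes a thread counts as having quiesced once its seqno is greater than target_seqno, and struct thread_rec and first_stalled() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the per-thread records printed
 * in the dumps above (only the fields needed for the check). */
struct thread_rec {
    const char *name;
    uint64_t seqno;     /* value recorded the last time the thread quiesced */
};

/* Assumed rule: the grace period is complete only when every registered
 * thread has a seqno greater than the target.  Returns NULL when it is
 * complete, otherwise the name of the first thread still holding it up. */
const char *
first_stalled(const struct thread_rec *threads, size_t n, uint64_t target)
{
    for (size_t i = 0; i < n; i++) {
        if (threads[i].seqno <= target) {
            return threads[i].name;
        }
    }
    return NULL;
}
```

With the values from the trace (target_seqno 3301107, pmd14 at 3301109, revalidator19 at 3300573), this model reports revalidator19 as the only thread still blocking the grace period, so once revalidator19 quiesces the postponed free_cacheline is allowed to run even though pmd14 can still look up the subtable.
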
> On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote:
> > Hi Ben,
> >
> > If my understanding is correct, the writer could not be a rcu thread because it
> > does not need report holding or not holding pointers.
> > So old memory will be freed after all rcu thread report quiesce.
> >
> > Best Regards,
> > Wei Yanqin
> >
> > > -----Original Message-----
> > > From: Ben Pfaff <blp at ovn.org>
> > > Sent: Wednesday, June 3, 2020 1:28 AM
> > > To: Linhaifeng <haifeng.lin at huawei.com>
> > > Cc: Yanqin Wei <Yanqin.Wei at arm.com>; dev at openvswitch.org; nd
> > > <nd at arm.com>; Lilijun (Jerry) <jerry.lilijun at huawei.com>;
> > > chenchanghu <chenchanghu at huawei.com>; Lichunhe
> > > <lichunhe at huawei.com>
> > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
> > >
> > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote:
> > > > We should update rcu pointer first then use ovsrcu_postpone to
> > > > free otherwise maybe cause use-after-free.
> > > > e.g.,reader indicates momentary quiescent and access old pointer
> > > > after writer postpone free old pointer and before setting new pointer.
> > > >
> > > > Signed-off-by: Linhaifeng <haifeng.lin at huawei.com>
> > >
> > > I don't see how that's possible, since the writer hasn't quiesced.
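
The ordering argued for in the patch description can be illustrated with a small single-threaded model. This is a sketch under a stated assumption, not OVS code: it assumes a grace period can complete between the postpone and the pointer update, which is the scenario the patch description claims. All type and function names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these are OVS APIs. */
struct table {
    int value;
    bool freed;               /* set instead of calling free(), so the
                               * model can inspect the object afterwards */
};

struct model {
    struct table *published;  /* pointer a reader loads on each lookup */
    struct table *postponed;  /* object queued for deferred freeing */
};

/* Stands in for the deferred callback that runs once the grace period ends. */
static void
run_postponed(struct model *m)
{
    if (m->postponed) {
        m->postponed->freed = true;
        m->postponed = NULL;
    }
}

/* Returns true if a reader that starts a fresh lookup after the grace
 * period reaches an object that has already been marked freed. */
bool
reader_hits_freed(bool update_pointer_first)
{
    struct table old_table = { 1, false };
    struct table new_table = { 2, false };
    struct model m = { &old_table, NULL };

    if (update_pointer_first) {
        m.published = &new_table;   /* 1. readers can no longer find old_table */
        m.postponed = &old_table;   /* 2. only then queue it for freeing */
    } else {
        m.postponed = &old_table;   /* queued while it is still published */
    }

    /* Assumed: every reader quiesces here, so the grace period ends. */
    run_postponed(&m);

    /* A reader begins a new lookup through the published pointer. */
    return m.published->freed;
}
```

In this model reader_hits_freed(false) returns true and reader_hits_freed(true) returns false. Whether the grace period can really end at that point in OVS is exactly the question being discussed in this thread.
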

