[ovs-dev] [PATCH] rculist: Remove postponed poisoning.

Jarno Rajahalme jrajahalme at nicira.com
Wed Jun 10 22:32:24 UTC 2015


Postponed 'next' member poisoning was based on the faulty assumption
that postponed functions would be called in the order they were
postponed.  This assumption holds only for the functions postponed by
any single thread.  When functions are postponed by different
threads, there are no guarantees of the order in which the functions
may be called, or timing between those calls after the next grace
period has passed.

Given this, the postponed poisoning could have executed after
postponed destruction of the object containing the rculist element.

This bug was revealed after the memory leaks on rule deletion were
recently fixed.

This patch removes the postponed 'next' member poisoning and adds
documentation describing the ordering limitations in OVS RCU.

Alex Wang dug out the root cause of the resulting crashes, thanks!

Signed-off-by: Jarno Rajahalme <jrajahalme at nicira.com>
---
 lib/automake.mk  |    1 -
 lib/classifier.c |    7 +++----
 lib/ovs-rcu.c    |   13 +++++++++++++
 lib/ovs-rcu.h    |   15 ++++++++++++++-
 lib/rculist.c    |   27 ---------------------------
 lib/rculist.h    |   36 +++++++++++++++++-------------------
 6 files changed, 47 insertions(+), 52 deletions(-)
 delete mode 100644 lib/rculist.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 7a34c1a..f082115 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -197,7 +197,6 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/random.h \
 	lib/rconn.c \
 	lib/rconn.h \
-	lib/rculist.c \
 	lib/rculist.h \
 	lib/reconnect.c \
 	lib/reconnect.h \
diff --git a/lib/classifier.c b/lib/classifier.c
index 6075cf7..ee26221 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -269,11 +269,10 @@ cls_rule_destroy(struct cls_rule *rule)
 {
     ovs_assert(!rule->cls_match);   /* Must not be in a classifier. */
 
-    /* Check that the rule has been properly removed from the classifier and
-     * that the destruction only happens after the RCU grace period, or that
-     * the rule was never inserted to the classifier in the first place. */
-    ovs_assert(rculist_next_protected(&rule->node) == RCULIST_POISON
+    /* Check that the rule has been properly removed from the classifier. */
+    ovs_assert(rule->node.prev == RCULIST_POISON
                || rculist_is_empty(&rule->node));
+    rculist_poison__(&rule->node);   /* Poisons also the next pointer. */
 
     minimatch_destroy(&rule->match);
 }
diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
index e0634cf..b8f8bc4 100644
--- a/lib/ovs-rcu.c
+++ b/lib/ovs-rcu.c
@@ -212,6 +212,19 @@ ovsrcu_synchronize(void)
 /* Registers 'function' to be called, passing 'aux' as argument, after the
  * next grace period.
  *
+ * The call is guaranteed to happen after the next time all participating
+ * threads have quiesced at least once, but there is no quarantee that all
+ * registered functions are called as early as possible, or that the functions
+ * registered by different threads would be called in the order the
+ * registrations took place.  In particular, even if two threads provably
+ * register a function each in a specific order, the functions may still be
+ * called in the opposite order, depending on the timing of when the threads
+ * call ovsrcu_quiesce(), how many functions they postpone, and when the
+ * ovs-rcu thread happens to grab the functions to be called.
+ *
+ * All functions registered by a single thread are guaranteed to execute in the
+ * registering order, however.
+ *
  * This function is more conveniently called through the ovsrcu_postpone()
  * macro, which provides a type-safe way to allow 'function''s parameter to be
  * any pointer type. */
diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h
index 1d79976..c1e3d60 100644
--- a/lib/ovs-rcu.h
+++ b/lib/ovs-rcu.h
@@ -60,12 +60,25 @@
  *
  * When a quiescient state has occurred in every thread, we say that a "grace
  * period" has occurred.  Following a grace period, all of the callbacks
- * postponed before the start of the grace period may be invoked.  OVS takes
+ * postponed before the start of the grace period MAY be invoked.  OVS takes
  * care of this automatically through the RCU mechanism: while a process still
  * has only a single thread, it invokes the postponed callbacks directly from
  * ovsrcu_quiesce() and ovsrcu_quiesce_start(); after additional threads have
  * been created, it creates an extra helper thread to invoke callbacks.
  *
+ * Please note that while a postponed function call is guaranteed to happen
+ * after the next time all participating threads have quiesced at least once,
+ * there is no quarantee that all postponed functions are called as early as
+ * possible, or that the functions postponed by different threads would be
+ * called in the order the registrations took place.  In particular, even if
+ * two threads provably postpone a function each in a specific order, the
+ * postponed functions may still be called in the opposite order, depending on
+ * the timing of when the threads call ovsrcu_quiesce(), how many functions
+ * they postpone, and when the ovs-rcu thread happens to grab the functions to
+ * be called.
+ *
+ * All functions postponed by a single thread are guaranteed to execute in the
+ * order they were postponed, however.
  *
  * Use
  * ---
diff --git a/lib/rculist.c b/lib/rculist.c
deleted file mode 100644
index 61a03d0..0000000
--- a/lib/rculist.c
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at:
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-#include <config.h>
-#include "rculist.h"
-
-/* Initializes 'list' with pointers that will (probably) cause segfaults if
- * dereferenced and, better yet, show up clearly in a debugger. */
-void
-rculist_poison__(struct rculist *list)
-    OVS_NO_THREAD_SAFETY_ANALYSIS
-{
-    list->prev = RCULIST_POISON;
-    ovsrcu_set_hidden(&list->next, RCULIST_POISON);
-}
diff --git a/lib/rculist.h b/lib/rculist.h
index f3c1475..7ba20e5 100644
--- a/lib/rculist.h
+++ b/lib/rculist.h
@@ -38,10 +38,7 @@
  * - rculist_front() returns a const pointer to accommodate for an RCU reader.
  * - rculist_splice_hidden(): Spliced elements may not have been visible to
  *   RCU readers before the operation.
- * - rculist_poison(): Immediately poisons the 'prev' pointer, and schedules
- *   ovsrcu_postpone() to poison the 'next' pointer.  This issues a memory
- *   write operation to the list element, hopefully crashing the program if
- *   the list node was freed or re-used too early.
+ * - rculist_poison(): Only poisons the 'prev' pointer.
  *
  * The following functions are variations of the struct ovs_list functions with
  * similar names, but are now restricted to the writer use:
@@ -134,8 +131,6 @@ rculist_init(struct rculist *list)
 
 #define RCULIST_POISON (struct rculist *)(UINTPTR_MAX / 0xf * 0xc)
 
-void rculist_poison__(struct rculist *list);
-
 /* Initializes 'list' with pointers that will (probably) cause segfaults if
  * dereferenced and, better yet, show up clearly in a debugger. */
 static inline void
@@ -143,7 +138,19 @@ rculist_poison(struct rculist *list)
     OVS_NO_THREAD_SAFETY_ANALYSIS
 {
     list->prev = RCULIST_POISON;
-    ovsrcu_postpone(rculist_poison__, list);
+}
+
+/* Initializes 'list' with pointers that will (probably) cause segfaults if
+ * dereferenced and, better yet, show up clearly in a debugger.
+ *
+ * This variant poisons also the next pointer, so this may not be called if
+ * this list element is still visible to RCU readers. */
+static inline void
+rculist_poison__(struct rculist *list)
+    OVS_NO_THREAD_SAFETY_ANALYSIS
+{
+    rculist_poison(list);
+    ovsrcu_set_hidden(&list->next, RCULIST_POISON);
 }
 
 /* rculist insertion. */
@@ -217,10 +224,7 @@ rculist_replace(struct rculist *element, struct rculist *position)
     position_next->prev = element;
     element->prev = position->prev;
     ovsrcu_set(&element->prev->next, element);
-
-#ifndef NDEBUG
-    rculist_poison(position); /* XXX: Some overhead due to ovsrcu_postpone() */
-#endif
+    rculist_poison(position);
 }
 
 /* Initializes 'dst' with the contents of 'src', compensating for moving it
@@ -244,10 +248,7 @@ rculist_move(struct rculist *dst, struct rculist *src)
     } else {
         rculist_init(dst);
     }
-
-#ifndef NDEBUG
-    rculist_poison(src); /* XXX: Some overhead due to ovsrcu_postpone() */
-#endif
+    rculist_poison(src);
 }
 
 /* Removes 'elem' from its list and returns the element that followed it.
@@ -268,10 +269,7 @@ rculist_remove(struct rculist *elem)
 
     elem_next->prev = elem->prev;
     ovsrcu_set(&elem->prev->next, elem_next);
-
-#ifndef NDEBUG
-    rculist_poison(elem); /* XXX: Some overhead due to ovsrcu_postpone() */
-#endif
+    rculist_poison(elem);
     return elem_next;
 }
 
-- 
1.7.10.4




More information about the dev mailing list