[ovs-dev] [PATCH] ovsdb-server: Write manager status information to Manager table.
Ben Pfaff
blp at nicira.com
Tue Feb 1 17:55:24 UTC 2011
On Mon, Jan 31, 2011 at 05:05:58PM -0800, Andrew Evans wrote:
> This commit makes the status of manager connections visible via the Manager
> table in the database. Two new columns have been created for this purpose:
> 'is_connected' and 'status'. The former is a boolean flag, and the latter is a
> string-string map which may contain the keys "last_error", "state", and
> "time_in_state".
>
> Feature #3692.
> Requested-by: Keith Amidon <keith at nicira.com>
Thanks for writing this up!
I think that parse_db_column() will segfault if its 'name_' argument
does not contain a colon. Indeed:
blp at blp:~/nicira/openflow/_build(139)$ valgrind ovsdb/ovsdb-server --remote=x conf.db --unixctl=$PWD/socket
==32657== Syscall param timer_create(evp) points to uninitialised byte(s)
==32657== at 0x4854008: timer_create (timer_create.c:99)
==32657== by 0x8077022: set_up_timer (timeval.c:139)
==32657== by 0x807723C: refresh_monotonic_if_ticked (timeval.c:175)
==32657== by 0x80772D0: time_msec (timeval.c:227)
==32657== by 0x807AEA8: vlog_init (vlog.c:455)
==32657== by 0x80628A8: daemonize_start (daemon.c:474)
==32657== by 0x8055E54: main (ovsdb-server.c:105)
==32657== Address 0xfe98d3a4 is on thread 1's stack
==32657==
Feb 01 09:44:37|00001|lockfile|WARN|.conf.db.~lock~: waited 8 ms for lock file
Feb 01 09:44:38|00002|reconnect|INFO|x: connecting...
Feb 01 09:44:38|00003|reconnect|WARN|x: connection attempt failed (Address family not supported by protocol)
Feb 01 09:44:38|00004|reconnect|INFO|x: waiting 1 seconds before reconnect
==32657== Invalid read of size 1
==32657== at 0x47DC0B8: strlen (mc_replace_strmem.c:282)
==32657== by 0x80794B1: xstrdup (util.c:99)
==32657== by 0x8055070: parse_db_column (ovsdb-server.c:196)
==32657== by 0x805625E: main (ovsdb-server.c:492)
==32657== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==32657==
==32657==
==32657== Process terminating with default action of signal 11 (SIGSEGV)
==32657== Access not within mapped region at address 0x0
==32657== at 0x47DC0B8: strlen (mc_replace_strmem.c:282)
==32657== by 0x80794B1: xstrdup (util.c:99)
==32657== by 0x8055070: parse_db_column (ovsdb-server.c:196)
==32657== by 0x805625E: main (ovsdb-server.c:492)
==32657== If you believe this happened as a result of a stack
==32657== overflow in your program's main thread (unlikely but
==32657== possible), you can try to increase the size of the
==32657== main thread stack using the --main-stacksize= flag.
==32657== The main thread stack size used in this run was 8388608.
Segmentation fault
I'm not sure why you removed the warning in add_manager_options(). Did
it just get logged too much? You could use VLOG_INFO_ONCE instead.
What's the reason for doing the update in two phases, first collecting
rows in get_remote_rows() and then updating them in the caller? Does
this somehow work better than just iterating over the rows and modifying
them in a single pass? It looks like this way actually involves more
code and more work, so I'm curious about the design motivation here.
I think that write_string_string_column() could use
ovsdb_datum_sort_assert(), since its caller never adds a duplicate key.
In update_remote_status(), I'm not sure that it's guaranteed that
'status' will be nonnull. Are you sure about that?
More information about the dev
mailing list