[ovs-dev] [PATCH v4] ovsdb-tool: Add a db consistency check to the ovsdb-tool check-cluster command
Federico Paolinelli
fpaoline at redhat.com
Thu Jul 30 10:41:47 UTC 2020
A clustered database can end up in an inconsistent state in which a
server's own ID is missing from the servers list. This happened in
ovn-k8s and is described in [0].
Add a supported way to check that a given db is consistent, which is
less error prone than checking the logs.
Tested against both a valid db and the corrupted db attached to the
above bug [1]. Also tested with a fresh db that had not yet taken a
snapshot.
[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c23
[1]: https://bugzilla.redhat.com/attachment.cgi?id=1697595
Signed-off-by: Federico Paolinelli <fpaoline at redhat.com>
Suggested-by: Dumitru Ceara <dceara at redhat.com>
---
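For readers skimming the diff, the check reduces to: a server's own ID
must appear either in the snapshot's servers list or in the servers list
of some log entry. The following is only a minimal standalone sketch of
that lookup, not the code in this patch. It uses plain string arrays in
place of the shash/json structures, and the names struct server_list,
list_has_sid() and sid_is_known() are hypothetical. SID_LEN is assumed
to be 4 here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: server IDs are compared on a short prefix only. */
#define SID_LEN 4

/* Hypothetical stand-in for one servers list (snapshot or log entry).
 * 'ids' is NULL when the entry carries no servers list. */
struct server_list {
    const char **ids;
    size_t n;
};

/* Returns true if 'sid' matches the prefix of any ID in 'list'. */
bool
list_has_sid(const struct server_list *list, const char *sid)
{
    if (!list || !list->ids) {
        return false;
    }
    for (size_t i = 0; i < list->n; i++) {
        if (!strncmp(sid, list->ids[i], SID_LEN)) {
            return true;
        }
    }
    return false;
}

/* Looks in the snapshot first, then falls back to the log entries,
 * skipping entries that have no servers list. */
bool
sid_is_known(const char *sid, const struct server_list *snap,
             const struct server_list *entries, size_t n_entries)
{
    if (list_has_sid(snap, sid)) {
        return true;
    }
    for (size_t i = 0; i < n_entries; i++) {
        if (list_has_sid(&entries[i], sid)) {
            return true;
        }
    }
    return false;
}
```

The fallback to the log entries matters for a fresh db that has not
taken a snapshot yet: the server may only be listed in a log entry.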
ovsdb/ovsdb-tool.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/ovsdb/ovsdb-tool.c b/ovsdb/ovsdb-tool.c
index 91662cab8..30d0472b2 100644
--- a/ovsdb/ovsdb-tool.c
+++ b/ovsdb/ovsdb-tool.c
@@ -1497,6 +1497,44 @@ do_check_cluster(struct ovs_cmdl_context *ctx)
}
}
+    /* Check for db consistency:
+     * The serverid must be in the servers list.
+     */
+
+    for (struct server *s = c.servers; s < &c.servers[c.n_servers]; s++) {
+        struct shash *servers_obj = json_object(s->snap->servers);
+        char *server_id = xasprintf(SID_FMT, SID_ARGS(&s->header.sid));
+        bool found = false;
+        const struct shash_node *node;
+
+        SHASH_FOR_EACH (node, servers_obj) {
+            if (!strncmp(server_id, node->name, SID_LEN)) {
+                found = true;
+            }
+        }
+
+        if (!found) {
+            for (struct raft_entry *e = s->entries;
+                 e < &s->entries[s->log_end - s->log_start]; e++) {
+                if (e->servers == NULL) {
+                    continue;
+                }
+                struct shash *log_servers_obj = json_object(e->servers);
+                SHASH_FOR_EACH (node, log_servers_obj) {
+                    if (!strncmp(server_id, node->name, SID_LEN)) {
+                        found = true;
+                    }
+                }
+            }
+        }
+
+        if (!found) {
+            ovs_fatal(0, "%s: server %s not found in server list",
+                      s->filename, server_id);
+        }
+        free(server_id);
+    }
+
/* Clean up. */
for (size_t i = 0; i < c.n_servers; i++) {
--
2.26.2