[ovs-dev] [PATCH v3] [RFC] ovn: Start work on design documentation.

Gray, Mark D mark.d.gray at intel.com
Thu Feb 26 17:11:54 UTC 2015


> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Friday, February 20, 2015 7:20 AM
> To: dev at openvswitch.org
> Cc: Ben Pfaff
> Subject: [ovs-dev] [PATCH v3] [RFC] ovn: Start work on design
> documentation.
> 

I have some questions below to help my understanding, as it's somewhat
difficult to follow without diagrams, but I think I have the general idea.
I look forward to seeing a presentation on this eventually, maybe at the
next OVS conference!

> This commit adds preliminary design documentation for Open Virtual
> Network,
> or OVN, a new OVS-based project to add support for virtual networking to
> OVS, initially with OpenStack integration.
> 
> This initial design has been influenced by many people, including (in
> alphabetical order) Aaron Rosen, Chris Wright, Gurucharan Shetty, Jeremy
> Stribling, Justin Pettit, Ken Duda, Kevin Benton, Madhu Venugopal, Martin
> Casado, Natasha Gude, Pankaj Thakkar, Russell Bryant, Teemu Koponen, and
> Thomas Graf.  All blunders, however, are due to my own hubris.
> 
> Signed-off-by: Ben Pfaff <blp at nicira.com>
> ---
> v1->v2: Rebase.
> v2->v3:
>   - Multiple CMSes are possible.
>   - Whitespace and typo fixes.
>   - ovn.ovsschema: Gateway table is not a root table, other tables are.
>   - ovn.xml: Talk about deleting rows on HV shutdown.
>   - ovn-nb.xml: Clarify 'switch' column in ACL table.
>   - ovn-nb.ovssechma: A Logical_Router_Port is no longer a Logical_Port.
>   - ovn.xml: Add action for generating ARP.
>   - ovn-nb.xml: Add allow-related action for security group support.
> ---
>  Makefile.am                |   1 +
>  configure.ac               |   3 +-
>  ovn/automake.mk            |  75 +++++++
>  ovn/ovn-architecture.7.xml | 338 ++++++++++++++++++++++++++++++
>  ovn/ovn-controller.8.in    |  41 ++++
>  ovn/ovn-nb.ovsschema       |  62 ++++++
>  ovn/ovn-nb.xml             | 245 ++++++++++++++++++++++
>  ovn/ovn.ovsschema          |  50 +++++
>  ovn/ovn.xml                | 497
> +++++++++++++++++++++++++++++++++++++++++++++
>  9 files changed, 1311 insertions(+), 1 deletion(-)
>  create mode 100644 ovn/automake.mk
>  create mode 100644 ovn/ovn-architecture.7.xml
>  create mode 100644 ovn/ovn-controller.8.in
>  create mode 100644 ovn/ovn-nb.ovsschema
>  create mode 100644 ovn/ovn-nb.xml
>  create mode 100644 ovn/ovn.ovsschema
>  create mode 100644 ovn/ovn.xml
> 
> diff --git a/Makefile.am b/Makefile.am
> index 0480d20..699a580 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -370,3 +370,4 @@ include tutorial/automake.mk
>  include vtep/automake.mk
>  include datapath-windows/automake.mk
>  include datapath-windows/include/automake.mk
> +include ovn/automake.mk
> diff --git a/configure.ac b/configure.ac
> index d2d02ca..795f876 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1,4 +1,4 @@
> -# Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
> +# Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
>  #
>  # Licensed under the Apache License, Version 2.0 (the "License");
>  # you may not use this file except in compliance with the License.
> @@ -182,6 +182,7 @@ dnl This makes sure that include/openflow gets
> created in the build directory.
>  AC_CONFIG_COMMANDS([include/openflow/openflow.h.stamp])
> 
>  AC_CONFIG_COMMANDS([utilities/bugtool/dummy], [:])
> +AC_CONFIG_COMMANDS([ovn/dummy], [:])
> 
>  m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES])
> 
> diff --git a/ovn/automake.mk b/ovn/automake.mk
> new file mode 100644
> index 0000000..3889d56
> --- /dev/null
> +++ b/ovn/automake.mk
> @@ -0,0 +1,75 @@
> +# OVN schema and IDL
> +EXTRA_DIST += ovn/ovn.ovsschema
> +pkgdata_DATA += ovn/ovn.ovsschema
> +
> +# OVN E-R diagram
> +#
> +# If "python" or "dot" is not available, then we do not add graphical diagram
> +# to the documentation.
> +if HAVE_PYTHON
> +if HAVE_DOT
> +ovn/ovn.gv: ovsdb/ovsdb-dot.in ovn/ovn.ovsschema
> +	$(AM_V_GEN)$(OVSDB_DOT) --no-arrows
> $(srcdir)/ovn/ovn.ovsschema > $@
> +ovn/ovn.pic: ovn/ovn.gv ovsdb/dot2pic
> +	$(AM_V_GEN)(dot -T plain < ovn/ovn.gv | $(PERL)
> $(srcdir)/ovsdb/dot2pic -f 3) > $@.tmp && \
> +	mv $@.tmp $@
> +OVN_PIC = ovn/ovn.pic
> +OVN_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_PIC)
> +DISTCLEANFILES += ovn/ovn.gv ovn/ovn.pic
> +endif
> +endif
> +
> +# OVN schema documentation
> +EXTRA_DIST += ovn/ovn.xml
> +DISTCLEANFILES += ovn/ovn.5
> +man_MANS += ovn/ovn.5
> +ovn/ovn.5: \
> +	ovsdb/ovsdb-doc ovn/ovn.xml ovn/ovn.ovsschema $(OVN_PIC)
> +	$(AM_V_GEN)$(OVSDB_DOC) \
> +		$(OVN_DOT_DIAGRAM_ARG) \
> +		--version=$(VERSION) \
> +		$(srcdir)/ovn/ovn.ovsschema \
> +		$(srcdir)/ovn/ovn.xml > $@.tmp && \
> +	mv $@.tmp $@
> +
> +# OVN northbound schema and IDL
> +EXTRA_DIST += ovn/ovn-nb.ovsschema
> +pkgdata_DATA += ovn/ovn-nb.ovsschema
> +
> +# OVN northbound E-R diagram
> +#
> +# If "python" or "dot" is not available, then we do not add graphical diagram
> +# to the documentation.
> +if HAVE_PYTHON
> +if HAVE_DOT
> +ovn/ovn-nb.gv: ovsdb/ovsdb-dot.in ovn/ovn-nb.ovsschema
> +	$(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn/ovn-
> nb.ovsschema > $@
> +ovn/ovn-nb.pic: ovn/ovn-nb.gv ovsdb/dot2pic
> +	$(AM_V_GEN)(dot -T plain < ovn/ovn-nb.gv | $(PERL)
> $(srcdir)/ovsdb/dot2pic -f 3) > $@.tmp && \
> +	mv $@.tmp $@
> +OVN_NB_PIC = ovn/ovn-nb.pic
> +OVN_NB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_NB_PIC)
> +DISTCLEANFILES += ovn/ovn-nb.gv ovn/ovn-nb.pic
> +endif
> +endif
> +
> +# OVN northbound schema documentation
> +EXTRA_DIST += ovn/ovn-nb.xml
> +DISTCLEANFILES += ovn/ovn-nb.5
> +man_MANS += ovn/ovn-nb.5
> +ovn/ovn-nb.5: \
> +	ovsdb/ovsdb-doc ovn/ovn-nb.xml ovn/ovn-nb.ovsschema
> $(OVN_NB_PIC)
> +	$(AM_V_GEN)$(OVSDB_DOC) \
> +		$(OVN_NB_DOT_DIAGRAM_ARG) \
> +		--version=$(VERSION) \
> +		$(srcdir)/ovn/ovn-nb.ovsschema \
> +		$(srcdir)/ovn/ovn-nb.xml > $@.tmp && \
> +	mv $@.tmp $@
> +
> +man_MANS += ovn/ovn-controller.8 ovn/ovn-architecture.7
> +EXTRA_DIST += ovn/ovn-controller.8.in ovn/ovn-architecture.7.xml
> +
> +SUFFIXES += .xml
> +%: %.xml
> +	$(AM_V_GEN)$(run_python) $(srcdir)/build-aux/xml2nroff \
> +		--version=$(VERSION) $< > $@.tmp && mv $@.tmp $@
> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> new file mode 100644
> index 0000000..d51a175
> --- /dev/null
> +++ b/ovn/ovn-architecture.7.xml
> @@ -0,0 +1,338 @@
> +<?xml version="1.0" encoding="utf-8"?>
> +<manpage program="ovn-architecture" section="7" title="OVN
> Architecture">
> +  <h1>Name</h1>
> +  <p>ovn-architecture -- Open Virtual Network architecture</p>
> +
> +  <h1>Description</h1>
> +
> +  <p>
> +    OVN, the Open Virtual Network, is a system to support virtual network
> +    abstraction.  OVN complements the existing capabilities of OVS to add
> +    native support for virtual network abstractions, such as virtual L2 and L3
> +    overlays and security groups.  Just like OVS, OVN's design goal is to have
> +    a production-quality implementation that can operate at significant scale.
> +  </p>

Will security groups work with the userspace datapath or will something 
need to be added in userspace?

> +
> +  <p>
> +    An OVN deployment consists of several components:
> +  </p>
> +
> +  <ul>
> +    <li>
> +      <p>
> +        A <dfn>Cloud Management System</dfn> (<dfn>CMS</dfn>), which is
> +        OVN's ultimate client (via its users and administrators).  OVN
> +        integration requires installing a CMS-specific plugin and
> +        related software (see below).  OVN initially targets OpenStack
> +        as CMS.
> +      </p>

Are you engaging with the OpenStack community on this? Do you have a 
blueprint?

> +
> +      <p>
> +        We generally speak of ``the'' CMS, but one can imagine scenarios in
> +        which multiple CMSes manage different parts of an OVN deployment.
> +      </p>
> +    </li>
> +
> +    <li>
> +      An OVN Database physical or virtual node (or, eventually, cluster)
> +      installed in a central location.
> +    </li>
> +
> +    <li>
> +      One or more (usually many) <dfn>hypervisors</dfn>.  Hypervisors must
> run
> +      Open vSwitch and implement the interface described in
> +      <code>IntegrationGuide.md</code> in the OVS source tree.  Any
> hypervisor
> +      platform supported by Open vSwitch is acceptable.
> +    </li>
> +
> +    <li>
> +      <p>
> +	Zero or more <dfn>gateways</dfn>.  A gateway extends a tunnel-
> based
> +	logical network into a physical network by bidirectionally forwarding
> +	packets between tunnels and a physical Ethernet port.  This allows
> +	non-virtualized machines to participate in logical networks.  A
> gateway
> +	may be a physical host, a virtual machine, or an ASIC-based hardware
> +	switch that supports the <code>vtep</code>(5) schema.  (Support
> for the
> +	latter will come later in OVN implementation.)
> +      </p>
> +
> +      <p>
> +	Hypervisors and gateways are together called <dfn>transport
> node</dfn>
> +	or <dfn>chassis</dfn>.
> +      </p>
> +    </li>
> +  </ul>
> +
> +  <p>
> +    The diagram below shows how the major components of OVN and
> related
> +    software interact.  Starting at the top of the diagram, we have:
> +  </p>
> +
> +  <ul>
> +    <li>
> +      The Cloud Management System, as defined above.
> +    </li>
> +
> +    <li>
> +      <p>
> +	The <dfn>OVN/CMS Plugin</dfn> is the component of the CMS that
> +	interfaces to OVN.  In OpenStack, this is a Neutron plugin.
> +	The plugin's main purpose is to translate the CMS's notion of logical
> +	network configuration, stored in the CMS's configuration database in
> a
> +	CMS-specific format, into an intermediate representation
> understood by
> +	OVN.
> +      </p>
> +
> +      <p>
> +	This component is necessarily CMS-specific, so a new plugin needs to
> be
> +	developed for each CMS that is integrated with OVN.  All of the
> +	components below this one in the diagram are CMS-independent.
> +      </p>
> +    </li>
> +
> +    <li>
> +      <p>
> +	The <dfn>OVN Northbound Database</dfn> receives the
> intermediate
> +	representation of logical network configuration passed down by the
> +	OVN/CMS Plugin.  The database schema is meant to be ``impedance
> +	matched'' with the concepts used in a CMS, so that it directly
> supports
> +	notions of logical switches, routers, ACLs, and so on.  See
> +	<code>ovs-nb</code>(5) for details.
> +      </p>
> +
> +      <p>
> +	The OVN Northbound Database has only two clients: the OVN/CMS
> Plugin
> +	above it and <code>ovn-nbd</code> below it.
> +      </p>
> +    </li>
> +
> +    <li>
> +      <code>ovn-nbd</code>(8) connects to the OVN Northbound Database
> above it
> +      and the OVN Database below it.  It translates the logical network
> +      configuration in terms of conventional network concepts, taken from the
> +      OVN Northbound Database, into logical datapath flows in the OVN
> Database
> +      below it.
> +    </li>

As both of these databases are common to all instances of ovn-controller,
what influenced the design decision to have this double database model?
Could you have defined some kind of message-based API northbound that was
then translated and inserted into the OVN database, or directly into OVSDB?
I'm just curious; I presume it was done for ease of implementation?

> +
> +    <li>
> +      <p>
> +	The <dfn>OVN Database</dfn> is the center of the system.  Its
> clients
> +	are <code>ovn-nbd</code>(8) above it and <code>ovn-
> controller</code>(8)
> +	on every transport node below it.
> +      </p>
> +
> +      <p>
> +	The OVN Database contains three kinds of data: <dfn>Physical
> +	Network</dfn> (PN) tables that specify how to reach hypervisor and
> +	other nodes, <dfn>Logical Network</dfn> (LN) tables that describe
> the
> +	logical network in terms of ``logical datapath flows,'' and
> +	<dfn>Binding</dfn> tables that link logical network components'
> +	locations to the physical network.  The hypervisors populate the PN
> and
> +	Binding tables, whereas <code>ovn-nbd</code>(8) populates the LN
> +	tables.
> +      </p>
> +
> +      <p>
> +	OVN Database performance must scale with the number of transport
> nodes.
> +	This will likely require some work on <code>ovsdb-server</code>(1)
> as
> +	we encounter bottlenecks.  Clustering for availability may be needed.
> +      </p>
> +    </li>
> +  </ul>
> +
> +  <p>
> +    The remaining components are replicated onto each hypervisor:
> +  </p>
> +
> +  <ul>
> +    <li>
> +      <code>ovn-controller</code>(8) is OVN's agent on each hypervisor and
> +      software gateway.  Northbound, it connects to the OVN Database to
> learn
> +      about OVN configuration and status and to populate the PN and
> <code>Bindings</code>
> +      tables with the hypervisor's status.  Southbound, it connects to
> +      <code>ovs-vswitchd</code>(8) as an OpenFlow controller, for control
> over
> +      network traffic, and to the local <code>ovsdb-server</code>(1) to allow
> +      it to monitor and control Open vSwitch configuration.
> +    </li>
> +
> +    <li>
> +      <code>ovs-vswitchd</code>(8) and <code>ovsdb-server</code>(1) are
> +      conventional components of Open vSwitch.
> +    </li>
> +  </ul>
> +
> +  <pre fixed="yes">
> +                                  CMS
> +                                   |
> +                                   |
> +                       +-----------|-----------+
> +                       |           |           |
> +                       |     OVN/CMS Plugin    |
> +                       |           |           |
> +                       |           |           |
> +                       |   OVN Northbound DB   |
> +                       |           |           |
> +                       |           |           |
> +                       |        ovn-nbd        |
> +                       |           |           |
> +                       +-----------|-----------+
> +                                   |
> +                                   |
> +                                +------+
> +                                |OVN DB|
> +                                +------+
> +                                   |
> +                                   |
> +                +------------------+------------------+
> +                |                  |                  |
> + HV 1           |                  |    HV n          |
> ++---------------|---------------+  .  +---------------|---------------+
> +|               |               |  .  |               |               |
> +|        ovn-controller         |  .  |        ovn-controller         |
> +|         |          |          |  .  |         |          |          |
> +|         |          |          |     |         |          |          |
> +|  ovs-vswitchd   ovsdb-server  |     |  ovs-vswitchd   ovsdb-server  |
> +|                               |     |                               |
> ++-------------------------------+     +-------------------------------+
> +  </pre>
> +
> +  <h3>Life Cycle of a VIF</h3>
> +
> +  <p>
> +    Tables and their schemas presented in isolation are difficult to
> +    understand.  Here's an example.
> +  </p>
> +
> +  <p>
> +    The steps in this example refer often to details of the OVN and OVN
> +    Northbound database schemas.  Please see <code>ovn</code>(5) and
> +    <code>ovn-nb</code>(5), respectively, for the full story on these
> +    databases.
> +  </p>
> +
> +  <ol>
> +    <li>
> +      A VIF's life cycle begins when a CMS administrator creates a new VIF
> +      using the CMS user interface or API and adds it to a switch (one
> +      implemented by OVN as a logical switch).  The CMS updates its own
> +      configuration.  This includes associating unique, persistent identifier
> +      <var>vif-id</var> and Ethernet address <var>mac</var> with the VIF.
> +    </li>
> +
> +    <li>
> +      The CMS plugin updates the OVN Northbound database to include the
> new
> +      VIF, by adding a row to the <code>Logical_Port</code> table.  In the
> new
> +      row, <code>name</code> is <var>vif-id</var>, <code>mac</code> is
> +      <var>mac</var>, <code>switch</code> points to the OVN logical
> switch's
> +      Logical_Switch record, and other columns are initialized appropriately.
> +    </li>
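
To check my understanding of the northbound schema, I'd expect the new
Logical_Port row to look roughly like this (all values below are made up):

    name          : "vif-id-1234"
    switch        : <uuid of the Logical_Switch row>
    macs          : ["00:11:22:33:44:55"]
    port_security : []
    up            : []
    external_ids  : {"neutron:port_id"="<cms uuid>"}

Is that about right?
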
> +
> +    <li>
> +      <code>ovs-nbd</code> receives the OVN Northbound database
> update.  In
> +      turn, it makes the corresponding updates to the OVN database, by
> adding
> +      rows to the OVN database <code>Pipeline</code> table to reflect the
> new
> +      port, e.g. add a flow to recognize that packets destined to the new
> +      port's MAC address should be delivered to it, and update the flow that

OpenStack's Neutron uses the NORMAL action in both the integration and
physical bridges. What kind of flows would be added? I've sketched my
guess below, after the end of this quoted step.

Also, does this mean that ovn-controller will also be an OpenFlow
client to OVS? The diagram doesn't really indicate that.

> +      delivers broadcast and multicast packets to include the new port.
> +    </li>
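
To answer my own question above with a guess: I'd imagine ovn-nbd adding
Pipeline rows along these lines, using the match/actions syntax described
in ovn.xml (table numbers, priorities, addresses and port names are all
made up):

    table_id=1, priority=100,
        match="eth.dst == 00:11:22:33:44:55", actions="output(vif-id-1234);"
    table_id=1, priority=50,
        match="eth.dst == ff:ff:ff:ff:ff:ff", actions="broadcast;"

plus whatever update is needed to the broadcast/multicast flood flow.
Is that the general idea?
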
> +
> +    <li>
> +      On every hypervisor, <code>ovn-controller</code> receives the
> +      <code>Pipeline</code> table updates that <code>ovs-nbd</code>
> made in the
> +      previous step.  As long as the VM that owns the VIF is powered off,
> +      <code>ovn-controller</code> cannot do much; it cannot, for example,
> +      arrange to send packets to or receive packets from the VIF, because the
> +      VIF does not actually exist anywhere.
> +    </li>
> +
> +    <li>
> +      Eventually, a user powers on the VM that owns the VIF.  On the
> hypervisor
> +      where the VM is powered on, the integration between the hypervisor
> and
> +      Open vSwitch (described in <code>IntegrationGuide.md</code>) adds
> the VIF
> +      to the OVN integration bridge and stores <var>vif-id</var> in
> +      <code>external-ids</code>:<code>iface-id</code> to indicate that the
> +      interface is an instantiation of the new VIF.  (None of this code is new
> +      in OVN; this is pre-existing integration work that has already been done
> +      on hypervisors that support OVS.)
> +    </li>
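
For reference, my understanding is that this integration step amounts to
something like the following on the hypervisor (interface name and vif-id
made up), per IntegrationGuide.md:

    ovs-vsctl add-port br-int tap1234 -- \
        set Interface tap1234 external-ids:iface-id=vif-id-1234
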
> +
> +    <li>
> +      On the hypervisor where the VM is powered on, <code>ovn-
> controller</code>
> +      notices <code>external-ids</code>:<code>iface-id</code> in the new
> +      Interface.  In response, it updates the local hypervisor's OpenFlow
> +      tables so that packets to and from the VIF are properly handled.
> +      Afterward, it updates the <code>Bindings</code> table in the OVN DB,
> +      adding a row that links the logical port from
> +      <code>external-ids</code>:<code>iface-id</code> to the hypervisor.
> +    </li>
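
So at this point I'd expect a Bindings row roughly like this (values made
up):

    logical_port : "vif-id-1234"
    chassis      : "hv1"
    mac          : ["00:11:22:33:44:55"]

where "hv1" matches the name column of that hypervisor's Chassis row?
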
> +
> +    <li>
> +      Some CMS systems, including OpenStack, fully start a VM only when its
> +      networking is ready.  To support this, <code>ovn-nbd</code> notices
> the
> +      new row in the <code>Bindings</code> table, and pushes this upward
> by
> +      updating the <ref column="up" table="Logical_Port" db="OVN_NB"/>
> column
> +      in the OVN Northbound database's <ref table="Logical_Port"
> db="OVN_NB"/>
> +      table to indicate that the VIF is now up.  The CMS, if it uses this
> +      feature, can then react by allowing the VM's execution to proceed.
> +    </li>
> +
> +    <li>
> +      On every hypervisor but the one where the VIF resides,
> +      <code>ovn-controller</code> notices the new row in the
> +      <code>Bindings</code> table.  This provides <code>ovn-
> controller</code>
> +      the physical location of the logical port, so each instance updates the
> +      OpenFlow tables of its switch (based on logical datapath flows in the
> OVN
> +      DB <code>Pipeline</code> table) so that packets to and from the VIF
> can
> +      be properly handled via tunnels.
> +    </li>
> +
> +    <li>
> +      Eventually, a user powers off the VM that owns the VIF.  On the
> +      hypervisor where the VM was powered on, the VIF is deleted from the
> OVN
> +      integration bridge.
> +    </li>
> +
> +    <li>
> +      On the hypervisor where the VM was powered on,
> +      <code>ovn-controller</code> notices that the VIF was deleted.  In
> +      response, it removes the logical port's row from the
> +      <code>Bindings</code> table.
> +    </li>
> +
> +    <li>
> +      On every hypervisor, <code>ovn-controller</code> notices the row
> removed
> +      from the <code>Bindings</code> table.  This means that
> +      <code>ovn-controller</code> no longer knows the physical location of
> the
> +      logical port, so each instance updates its OpenFlow table to reflect
> +      that.
> +    </li>
> +
> +    <li>
> +      Eventually, when the VIF (or its entire VM) is no longer needed by
> +      anyone, an administrator deletes the VIF using the CMS user interface or
> +      API.  The CMS updates its own configuration.
> +    </li>
> +
> +    <li>
> +      The CMS plugin removes the VIF from the OVN Northbound database,
> +      by deleting its row in the <code>Logical_Port</code> table.
> +    </li>
> +
> +    <li>
> +      <code>ovs-nbd</code> receives the OVN Northbound update and in
> turn
> +      updates the OVN database accordingly, by removing or updating the
> +      rows from the OVN database <code>Pipeline</code> table that were
> related
> +      to the now-destroyed VIF.
> +    </li>
> +
> +    <li>
> +      On every hypervisor, <code>ovn-controller</code> receives the
> +      <code>Pipeline</code> table updates that <code>ovs-nbd</code>
> made in the
> +      previous step.  <code>ovn-controller</code> updates OpenFlow tables
> to
> +      reflect the update, although there may not be much to do, since the VIF
> +      had already become unreachable when it was removed from the
> +      <code>Bindings</code> table in a previous step.
> +    </li>
> +  </ol>
> +
> +</manpage>
> diff --git a/ovn/ovn-controller.8.in b/ovn/ovn-controller.8.in
> new file mode 100644
> index 0000000..59fcb59
> --- /dev/null
> +++ b/ovn/ovn-controller.8.in
> @@ -0,0 +1,41 @@
> +.\" -*- nroff -*-
> +.de IQ
> +.  br
> +.  ns
> +.  IP "\\$1"
> +..
> +.TH ovn\-controller 8 "@VERSION@" "Open vSwitch" "Open vSwitch
> Manual"
> +.ds PN ovn\-controller
> +.
> +.SH NAME
> +ovn\-controller \- OVN local controller
> +.
> +.SH SYNOPSIS
> +\fBovn\-controller\fR [\fIoptions\fR]
> +.
> +.SH DESCRIPTION
> +\fBovn\-controller\fR is the local controller daemon for OVN, the Open
> +Virtual Network.  It connects northbound to the OVN database (see
> +\fBovn\fR(5)) over the OVSDB protocol, and southbound to the Open
> +vSwitch database (see \fBovs-vswitchd.conf.db\fR(5)) over the OVSDB
> +protocol and to \fBovs\-vswitchd\fR(8) via OpenFlow.  Each hypervisor
> +and software gateway in an OVN deployment runs its own independent
> +copy of \fBovn\-controller\fR; thus, \fBovn\-controller\fR's
> +southbound connections are machine-local and do not run over a
> +physical network.
> +.PP
> +XXX this is completely skeletal.
> +.
> +.SH OPTIONS
> +.SS "Public Key Infrastructure Options"
> +.so lib/ssl.man
> +.so lib/ssl-peer-ca-cert.man
> +.ds DD
> +.so lib/daemon.man
> +.so lib/vlog.man
> +.so lib/unixctl.man
> +.so lib/common.man
> +.
> +.SH "SEE ALSO"
> +.
> +\fBovn\-architecture\fR(7)
> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> new file mode 100644
> index 0000000..ad675ac
> --- /dev/null
> +++ b/ovn/ovn-nb.ovsschema
> @@ -0,0 +1,62 @@
> +{
> +    "name": "OVN_Northbound",
> +    "tables": {
> +        "Logical_Switch": {
> +            "columns": {
> +                "router_port": {"type": {"key": {"type": "uuid",
> +                                                 "refTable": "Logical_Router_Port",
> +                                                 "refType": "strong"},
> +                                         "min": 0, "max": 1}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}}},

Is this analogous to a Neutron "network"?

> +        "Logical_Port": {

Is this analogous to a Neutron "port"?

> +            "columns": {
> +                "switch": {"type": {"key": {"type": "uuid",
> +                                            "refTable": "Logical_Switch",
> +                                            "refType": "strong"}}},
> +                "name": {"type": "string"},
> +                "macs": {"type": {"key": "string",
> +                                  "min": 0,
> +                                  "max": "unlimited"}},
> +                "port_security": {"type": {"key": "string",
> +                                           "min": 0,
> +                                           "max": "unlimited"}},
> +                "up": {"type": {"key": "boolean", "min": 0, "max": 1}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "indexes": [["name"]]},
> +        "ACL": {
> +            "columns": {
> +                "switch": {"type": {"key": {"type": "uuid",
> +                                            "refTable": "Logical_Switch",
> +                                            "refType": "strong"}}},
> +                "priority": {"type": {"key": {"type": "integer",
> +                                              "minInteger": 0,
> +                                              "maxInteger": 65535}}},
> +                "match": {"type": "string"},
> +                "action": {"type": {"key": {"type": "string",
> +                                            "enum": ["set", ["allow", "allow-related", "drop",
> "reject"]]}}},
> +                "log": {"type": "boolean"},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}}},
> +        "Logical_Router": {
> +            "columns": {
> +                "ip": {"type": "string"},
> +                "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}}},
> +        "Logical_Router_Port": {
> +            "columns": {
> +                "router": {"type": {"key": {"type": "uuid",
> +                                            "refTable": "Logical_Router",
> +                                            "refType": "strong"}}},
> +                "network": {"type": "string"},

Is this "network" the same as a neutron "subnet"?

> +                "mac": {"type": "string"},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}}}},
> +    "version": "1.0.0"}
> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> new file mode 100644
> index 0000000..80190ca
> --- /dev/null
> +++ b/ovn/ovn-nb.xml
> @@ -0,0 +1,245 @@
> +<?xml version="1.0" encoding="utf-8"?>
> +<database name="ovn-nb" title="OVN Northbound Database">
> +  <p>
> +    This database is the interface between OVN and the cloud management
> system
> +    (CMS), such as OpenStack, running above it.  The CMS produces almost all
> of
> +    the contents of the database.  The <code>ovs-nbd</code> program
> monitors
> +    the database contents, transforms it, and stores it into the <ref
> +    db="OVN"/> database.
> +  </p>
> +
> +  <p>
> +    We generally speak of ``the'' CMS, but one can imagine scenarios in
> +    which multiple CMSes manage different parts of an OVN deployment.
> +  </p>
> +
> +  <h2>External IDs</h2>
> +
> +  <p>
> +    Each of the tables in this database contains a special column, named
> +    <code>external_ids</code>.  This column has the same form and
> purpose each
> +    place it appears.
> +  </p>
> +
> +  <dl>
> +    <dt><code>external_ids</code>: map of string-string pairs</dt>
> +    <dd>
> +      Key-value pairs for use by the CMS.  The CMS might use certain pairs, for
> +      example, to identify entities in its own configuration that correspond to
> +      those in this database.
> +    </dd>
> +  </dl>
> +
> +  <table name="Logical_Switch" title="L2 logical switch">
> +    <p>
> +      Each row represents one L2 logical switch.  A given switch's ports are
> +      the <ref table="Logical_Port"/> rows whose <ref table="Logical_Port"
> +      column="switch"/> column points to its row.
> +    </p>
> +
> +    <column name="router_port">
> +      <p>
> +        The router port to which this logical switch is connected, or empty if
> +        this logical switch is not connected to any router.  A switch may be
> +        connected to at most one logical router, but this is not a significant
> +        restriction because logical routers may be connected into arbitrary
> +        topologies.
> +      </p>
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
> +  <table name="Logical_Port" title="L2 logical switch port">
> +    <p>
> +      A port within an L2 logical switch.
> +    </p>
> +
> +    <column name="switch">
> +      The logical switch to which the logical port is connected.
> +    </column>
> +
> +    <column name="name">
> +      The logical port name.  The name used here must match those used in
> the
> +      <ref key="iface-id" table="Interface" column="external_ids"
> +      db="Open_vSwitch"/> in the <ref db="Open_vSwitch"/> database's
> <ref
> +      table="Interface" db="Open_vSwitch"/> table, because hypervisors use
> <ref
> +      key="iface-id" table="Interface" column="external_ids"
> +      db="Open_vSwitch"/> as a lookup key for logical ports.
> +    </column>
> +
> +    <column name="up">
> +      This column is populated by <code>ovn-nbd</code>, rather than by the
> CMS
> +      plugin as is most of this database.  When a logical port is bound to a
> +      physical location in the OVN database <ref db="OVN" table="Bindings"/>
> +      table, <code>ovn-nbd</code> sets this column to <code>true</code>;
> +      otherwise, or if the port becomes unbound later, it sets it to
> +      <code>false</code>.  This allows the CMS to wait for a VM's networking
> to
> +      become active before it allows the VM to start.
> +    </column>
> +
> +    <column name="macs">
> +      The logical port's own Ethernet address or addresses, each in the form
> +
> <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:
> <var>xx</var>.
> +      Like a physical Ethernet NIC, a logical port ordinarily has a single
> +      fixed Ethernet address.  The string <code>unknown</code> is also
> allowed
> +      to indicate that the logical port has an unknown set of (additional)
> +      source addresses.
> +    </column>
> +
> +    <column name="port_security">

Is this analogous to "security groups"?

> +      <p>
> +        A set of L2 (Ethernet) or L3 (IPv4 or IPv6) addresses or L2+L3 pairs
> +        from which the logical port is allowed to send packets and to which it
> +        is allowed to receive packets.  If this column is empty, all addresses
> +        are permitted.
> +      </p>
> +
> +      <p>
> +        Exact syntax is TBD.  One could simply use comma- or space-separated
> L2
> +        and L3 addresses in each set member, or replace this by a subset of the
> +        general-purpose expression language used for the <ref
> column="match"
> +        table="Pipeline" db="OVN"/> column in the OVN database's <ref
> +        table="Pipeline" db="OVN"/> table.
> +      </p>
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
> +  <table name="ACL" title="Access Control List (ACL) rule">
> +    <p>
> +      Each row in this table represents one ACL rule for the logical switch in
> +      its <ref column="switch"/> column.  The <ref column="action"/> column
> for
> +      the highest-<ref column="priority"/> matching row in this table
> +      determines a packet's treatment.  If no row matches, packets are
> allowed
> +      by default.  (Default-deny treatment is possible: add a rule with <ref
> +      column="priority"/> 0, <code>true</code> as <ref column="match"/>,
> and
> +      <code>deny</code> as <ref column="action"/>.)
> +    </p>
> +
> +    <column name="switch">
> +      The switch to which the ACL rule applies.  The expression in the
> +      <ref column="match"/> column may match against logical ports
> +      within this switch.
> +    </column>
> +
> +    <column name="priority">
> +      The ACL rule's priority.  Rules with numerically higher priority take
> +      precedence over those with lower.  If two ACL rules with the same
> +      priority both match, then the one actually applied to a packet is
> +      undefined.
> +    </column>
> +
> +    <column name="match">
> +      The packets that the ACL should match, in the same expression language
> +      used for the <ref column="match" table="Pipeline" db="OVN"/> column
> in
> +      the OVN database's <ref table="Pipeline" db="OVN"/> table.  Match
> +      <code>inport</code> and <code>outport</code> against names of
> logical
> +      ports within <ref column="switch"/> to implement ingress and egress
> ACLs,
> +      respectively.  In logical switches connected to logical routers, the
> +      special port name <code>ROUTER</code> refers to the logical router
> port.
> +    </column>
> +
> +    <column name="action">
> +      <p>The action to take when the ACL rule matches:</p>
> +
> +      <ul>
> +	<li>
> +	  <code>allow</code>: Forward the packet.
> +	</li>
> +
> +	<li>
> +	  <code>allow-related</code>: Forward the packet and related
> traffic
> +	  (e.g. inbound replies to an outbound connection).
> +	</li>
> +
> +	<li>
> +	  <code>drop</code>: Silently drop the packet.
> +	</li>
> +
> +	<li>
> +	  <code>reject</code>: Drop the packet, replying with a RST for TCP
> or
> +	  ICMP unreachable message for other IP-based protocols.
> +	</li>
> +      </ul>
> +    </column>
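
A worked example might help here. Am I right that a typical
security-group-style rule would look something like this (priority, port
name and port number made up):

    priority : 1000
    match    : "outport == \"vif-id-1234\" && tcp.dst == 22"
    action   : allow-related

together with a lower-priority catch-all drop rule to get default-deny
behaviour?
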
> +
> +    <column name="log">
> +      If set to <code>true</code>, packets that match the ACL will trigger a
> +      log message on the transport node or nodes that perform ACL
> processing.
> +      Logging may be combined with any <ref column="action"/>.
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
> +  <table name="Logical_Router" title="L3 logical router">
> +    <p>
> +      Each row represents one L3 logical router.  A given router's ports are
> +      the <ref table="Logical_Router_Port"/> rows whose <ref
> +      table="Logical_Router_Port" column="router"/> column points to its
> row.
> +    </p>
> +
> +    <column name="ip">
> +      The logical router's own IP address.  The logical router uses this
> +      address for ICMP replies (e.g. network unreachable messages) and
> other
> +      traffic that it originates and responds to traffic destined to this
> +      address (e.g. ICMP echo requests).
> +    </column>
> +
> +    <column name="default_gw">
> +      IP address to use as default gateway, if any.
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
> +  <table name="Logical_Router_Port" title="L3 logical router port">
> +    <p>
> +      A port within an L3 logical router.
> +    </p>
> +
> +    <p>
> +      A router port is always attached to a switch port.  The connection can be
> +      identified by following the <ref column="router_port"
> +      table="Logical_Port"/> column from an appropriate <ref
> +      table="Logical_Port"/> row.
> +    </p>
> +
> +    <column name="router">
> +      The router to which the port belongs.
> +    </column>
> +
> +    <column name="network">
> +      The IP network and netmask of the network on the router port.  Used
> for
> +      routing.
> +    </column>
> +
> +    <column name="mac">
> +      The Ethernet address that belongs to this router port.
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +</database>
> diff --git a/ovn/ovn.ovsschema b/ovn/ovn.ovsschema
> new file mode 100644
> index 0000000..5597df4
> --- /dev/null
> +++ b/ovn/ovn.ovsschema
> @@ -0,0 +1,50 @@
> +{
> +    "name": "OVN",
> +    "tables": {
> +        "Chassis": {
> +            "columns": {
> +                "name": {"type": "string"},
> +                "encap": {"type": {"key": {"type": "string",
> +                                           "enum": ["set", ["stt", "vxlan", "gre"]]}}},
> +                "encap_options": {"type": {"key": "string",
> +                                           "value": "string",
> +                                           "min": 0,
> +                                           "max": "unlimited"}},
> +                "ip": {"type": "string"},
> +                "gateway_ports": {"type": {"key": "string",
> +                                           "value": {"type": "uuid",
> +                                                     "refTable": "Gateway",
> +                                                     "refType": "strong"},
> +                                           "min": 0,
> +                                           "max": "unlimited"}}},
> +            "isRoot": true,
> +            "indexes": [["name"]]},
> +        "Gateway": {
> +            "columns": {"attached_port": {"type": "string"},
> +                        "vlan_map": {"type": {"key": {"type": "integer",
> +                                                      "minInteger": 0,
> +                                                      "maxInteger": 4095},
> +                                              "value": {"type": "string"},
> +                                              "min": 0,
> +                                              "max": "unlimited"}}}},
> +        "Pipeline": {
> +            "columns": {
> +                "table_id": {"type": {"key": {"type": "integer",
> +                                              "minInteger": 0,
> +                                              "maxInteger": 127}}},
> +                "priority": {"type": {"key": {"type": "integer",
> +                                              "minInteger": 0,
> +                                              "maxInteger": 65535}}},
> +                "match": {"type": "string"},
> +                "actions": {"type": "string"}},
> +            "isRoot": true},
> +        "Bindings": {
> +            "columns": {
> +                "logical_port": {"type": "string"},
> +                "chassis": {"type": "string"},
> +                "mac": {"type": {"key": "string",
> +                                 "min": 0,
> +                                 "max": "unlimited"}}},
> +            "indexes": [["logical_port"]],
> +            "isRoot": true}},
> +    "version": "1.0.0"}
> diff --git a/ovn/ovn.xml b/ovn/ovn.xml
> new file mode 100644
> index 0000000..a233112
> --- /dev/null
> +++ b/ovn/ovn.xml
> @@ -0,0 +1,497 @@
> +<?xml version="1.0" encoding="utf-8"?>
> +<database name="ovn" title="OVN Database">
> +  <p>
> +    This database holds logical and physical configuration and state for the
> +    Open Virtual Network (OVN) system to support virtual network
> abstraction.
> +    For an introduction to OVN, please see <code>ovn-
> architecture</code>(7).
> +  </p>
> +
> +  <p>
> +    The OVN database sits at the center of the OVN architecture.  It is the one
> +    component that speaks both southbound directly to all the hypervisors
> and
> +    gateways, via <code>ovn-controller</code>, and northbound to the
> Cloud
> +    Management System, via <code>ovn-nbd</code>:
> +  </p>
> +
> +  <h2>Database Structure</h2>
> +
> +  <p>
> +    The OVN database contains three classes of data with different
> properties,
> +    as described in the sections below.
> +  </p>
> +
> +  <h3>Physical Network (PN) data</h3>
> +
> +  <p>
> +    PN tables contain information about the chassis nodes in the system.  This
> +    contains all the information necessary to wire the overlay, such as IP
> +    addresses, supported tunnel types, and security keys.
> +  </p>
> +
> +  <p>
> +    The amount of PN data is small (O(n) in the number of chassis) and it
> +    changes infrequently, so it can be replicated to every chassis.

This confuses me: the text above seemed to suggest that there is a
single OVN database, but here you talk about replication. Is the
Northbound database central while the OVN database is local to each
node? Or does the replication refer to the clustering you mention
above?

> +  </p>
> +
> +  <p>
> +    The <ref table="Chassis"/> and <ref table="Gateway"/> tables comprise
> the
> +    PN tables.
> +  </p>
> +
> +  <h3>Logical Network (LN) data</h3>
> +
> +  <p>
> +    LN tables contain the topology of logical switches and routers, ACLs,
> +    firewall rules, and everything needed to describe how packets traverse a
> +    logical network, represented as logical datapath flows (see Logical
> +    Datapath Flows, below).
> +  </p>
> +
> +  <p>
> +    LN data may be large (O(n) in the number of logical ports, ACL rules,
> +    etc.).  Thus, to improve scaling, each chassis should receive only data
> +    related to logical networks in which that chassis participates.  Past
> +    experience shows that in the presence of large logical networks, even
> +    finer-grained partitioning of data, e.g. designing logical flows so that
> +    only the chassis hosting a logical port needs related flows, pays off
> +    scale-wise.  (This is not necessary initially but it is worth bearing in
> +    mind in the design.)
> +  </p>
> +
> +  <p>
> +    The LN is a slave of the cloud management system running northbound of
> OVN.
> +    That CMS determines the entire OVN logical configuration and therefore
> the
> +    LN's content at any given time is a deterministic function of the CMS's
> +    configuration, although that happens indirectly via the OVN Northbound
> DB
> +    and <code>ovn-nvd</code>.

Typo: s/ovn-nvd/ovn-nbd/

> +  </p>
> +
> +  <p>
> +    LN data is likely to change more quickly than PN data.  This is especially
> +    true in a container environment where VMs are created and destroyed
> (and
> +    therefore added to and deleted from logical switches) quickly.
> +  </p>
> +
> +  <p>
> +    The <ref table="Pipeline"/> table is currently the only LN table.
> +  </p>
> +
> +  <h3>Bindings data</h3>
> +
> +  <p>
> +    The Bindings tables contain the current placement of logical components
> +    (such as VMs and VIFs) onto chassis and the bindings between logical
> ports
> +    and MACs.
> +  </p>
> +
> +  <p>
> +    Bindings change frequently, at least every time a VM powers up or down
> +    or migrates, and especially quickly in a container environment.  The
> +    amount of data per VM (or VIF) is small.
> +  </p>
> +
> +  <p>
> +    Each chassis is authoritative about the VMs and VIFs that it hosts at any
> +    given time and can efficiently flood that state to a central location, so
> +    the consistency needs are minimal.
> +  </p>
> +
> +  <p>
> +    The <ref table="Bindings"/> table is currently the only Bindings table.
> +  </p>
> +
> +  <table name="Chassis" title="Physical Network Hypervisor and Gateway
> Information">
> +    <p>
> +      Each row in this table represents a hypervisor or gateway (a chassis) in
> +      the physical network (PN).  Each chassis, via
> +      <code>ovn-controller</code>, adds and updates its own row, and keeps
> a
> +      copy of the remaining rows to determine how to reach other
> hypervisors.
> +    </p>
> +
> +    <p>
> +      When a chassis shuts down gracefully, it should remove its own row.
> +      (This is not critical because resources hosted on the chassis are equally
> +      unreachable regardless of whether the row is present.)  If a chassis
> +      shuts down permanently without removing its row, some kind of manual
> or
> +      automatic cleanup is eventually needed; we can devise a process for that
> +      as necessary.
> +    </p>
> +
> +    <column name="name">
> +      A chassis name, taken from <ref key="system-id" table="Open_vSwitch"
> +      column="external_ids" db="Open_vSwitch"/> in the Open_vSwitch
> +      database's <ref table="Open_vSwitch" db="Open_vSwitch"/> table.
> OVN does
> +      not prescribe a particular format for chassis names.
> +    </column>
> +
> +    <group title="Encapsulation">
> +      <p>
> +        These columns together identify how OVN may transmit logical
> dataplane
> +        packets to this chassis.
> +      </p>
> +
> +      <column name="encap">
> +        The encapsulation to use to transmit packets to this chassis.
> +      </column>
> +
> +      <column name="encap_options">
> +        Options for configuring the encapsulation, e.g. IPsec parameters when
> +        IPsec support is introduced.  No options are currently defined.
> +      </column>
> +
> +      <column name="ip">
> +        The IPv4 address of the encapsulation tunnel endpoint.
> +      </column>
> +    </group>
> +
> +    <group title="Gateway Configuration">
> +      <p>
> +        A <dfn>gateway</dfn> is a chassis that forwards traffic between a
> +        logical network and a physical VLAN.  Gateways are typically dedicated
> +        nodes that do not host VMs.
> +      </p>
> +
> +      <column name="gateway_ports">
> +        Maps from the name of a gateway port, which is typically a physical
> +        port (e.g. <code>eth1</code>) or an Open vSwitch patch port, to a <ref
> +        table="Gateway"/> record that describes the details of the gatewaying
> +        function.
> +      </column>
> +    </group>
> +  </table>
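
A concrete example row might make this clearer, something like (values
made up):

    name          : "hv1"
    encap         : vxlan
    encap_options : {}
    ip            : "192.0.2.10"
    gateway_ports : {}

i.e. one row per hypervisor, keyed on its system-id?
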
> +
> +  <table name="Gateway" title="Physical Network Gateway Ports">
> +    <p>
> +      The <ref column="gateway_ports" table="Chassis"/> column in the <ref
> +      table="Chassis"/> table refers to rows in this table to connect a chassis
> +      port to a gateway function.  Each row in this table describes the logical
> +      networks to which a gateway port is attached.  Each chassis, via
> +      <code>ovn-controller</code>(8), adds and updates its own rows, if any
> +      (since most chassis are not gateways), and keeps a copy of the remaining
> +      rows to determine how to reach other chassis.
> +    </p>
> +
> +    <column name="vlan_map">
> +      Maps from a VLAN ID to a logical port name.  Thus, each named logical
> +      port corresponds to one VLAN on the gateway port.
> +    </column>
> +
> +    <column name="attached_port">
> +      The name of the gateway port in the chassis's Open vSwitch integration
> +      bridge.
> +    </column>
> +  </table>
> +
> +  <table name="Pipeline" title="Logical Network Pipeline">
> +    <p>
> +      Each row in this table represents one logical flow.  The cloud
> management
> +      system, via its OVN integration, populates this table with logical flows
> +      that implement the L2 and L3 topology specified in the CMS
> configuration.
> +      Each hypervisor, via <code>ovn-controller</code>, translates the logical
> +      flows into OpenFlow flows specific to its hypervisor and installs them
> +      into Open vSwitch.
> +    </p>
> +
> +    <p>
> +      Logical flows are expressed in an OVN-specific format, described here.  A
> +      logical datapath flow is much like an OpenFlow flow, except that the
> +      flows are written in terms of logical ports and logical datapaths instead
> +      of physical ports and physical datapaths.  Translation between logical
> +      and physical flows helps to ensure isolation between logical datapaths.
> +      (The logical flow abstraction also allows the CMS to do less work, since
> +      it does not have to separately compute and push out physical physical
> +      flows to each chassis.)
> +    </p>
> +
> +    <p>
> +      The default action when no flow matches is to drop packets.
> +    </p>
> +
> +    <column name="table_id">
> +      The stage in the logical pipeline, analogous to an OpenFlow table
> number.
> +    </column>
> +
> +    <column name="priority">
> +      The flow's priority.  Flows with numerically higher priority take
> +      precedence over those with lower.  If two logical datapath flows with the
> +      same priority both match, then the one actually applied to the packet is
> +      undefined.
> +    </column>
> +
> +    <column name="match">
> +      <p>
> +        A matching expression.  OVN provides a superset of OpenFlow
> matching
> +        capabilities, using a syntax similar to Boolean expressions in a
> +        programming language.
> +      </p>
> +
> +      <p>
> +        Matching expressions have two important kinds of primary expression:
> +        <dfn>fields</dfn> and <dfn>constants</dfn>.  A field names a piece of
> +        data or metadata.  The supported fields are:
> +      </p>
> +
> +      <ul>
> +        <li>
> +          <code>metadata</code> <code>reg0</code> ... <code>reg7</code>
> +          <code>xreg0</code> ... <code>xreg3</code>
> +        </li>
> +        <li><code>inport</code> <code>outport</code>
> <code>queue</code></li>
> +        <li><code>eth.src</code> <code>eth.dst</code>
> <code>eth.type</code></li>
> +        <li><code>vlan.tci</code> <code>vlan.vid</code>
> <code>vlan.pcp</code> <code>vlan.present</code></li>
> +        <li><code>ip.proto</code> <code>ip.dscp</code>
> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
> +        <li><code>ip4.src</code> <code>ip4.dst</code></li>
> +        <li><code>ip6.src</code> <code>ip6.dst</code>
> <code>ip6.label</code></li>
> +        <li><code>arp.op</code> <code>arp.spa</code>
> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
> +        <li><code>tcp.src</code> <code>tcp.dst</code>
> <code>tcp.flags</code></li>
> +        <li><code>udp.src</code> <code>udp.dst</code></li>
> +        <li><code>sctp.src</code> <code>sctp.dst</code></li>
> +        <li><code>icmp4.type</code> <code>icmp4.code</code></li>
> +        <li><code>icmp6.type</code> <code>icmp6.code</code></li>
> +        <li><code>nd.target</code> <code>nd.sll</code>
> <code>nd.tll</code></li>
> +      </ul>
> +
> +      <p>
> +        Subfields may be addressed using a <code>[]</code> suffix,
> +        e.g. <code>tcp.src[0..7]</code> refers to the low 8 bits of the TCP
> +        source port.  A subfield may be used in any context a field is allowed.
> +      </p>
> +
> +      <p>
> +        Some fields have prerequisites.  OVN implicitly adds clauses to satisfy
> +        these.  For example, <code>arp.op == 1</code> is equivalent to
> +        <code>eth.type == 0x0806 &amp;&amp; arp.op == 1</code>, and
> +        <code>tcp.src == 80</code> is equivalent to <code>(eth.type ==
> 0x0800
> +        || eth.type == 0x86dd) &amp;&amp; ip.proto == 6 &amp;&amp; tcp.src
> ==
> +        80</code>.
> +      </p>
> +
> +      <p>
> +        Most fields have integer values.  Integer constants may be expressed in
> +        several forms: decimal integers, hexadecimal integers prefixed by
> +        <code>0x</code>, dotted-quad IPv4 addresses, IPv6 addresses in their
> +        standard forms, and as Ethernet addresses as colon-separated hex
> +        digits.  A constant in any of these forms may be followed by a slash
> +        and a second constant (the mask) in the same form, to form a masked
> +        constant.  IPv4 and IPv6 masks may be given as integers, to express
> +        CIDR prefixes.
> +      </p>
> +
> +      <p>
> +        The <code>inport</code> and <code>outport</code> fields have
> string
> +        values.  The useful values are <ref column="logical_port"/> names from
> +        the <ref column="Bindings"/> and <ref column="Gateway"/> table.
> +      </p>
> +
> +      <p>
> +        The available operators, from highest to lowest precedence, are:
> +      </p>
> +
> +      <ul>
> +        <li><code>()</code></li>
> +        <li><code>==   !=   &lt;   &lt;=   &gt;   &gt;=   in   not in</code></li>
> +        <li><code>!</code></li>
> +        <li><code>&amp;&amp;</code></li>
> +        <li><code>||</code></li>
> +      </ul>
> +
> +      <p>
> +        The <code>()</code> operator is used for grouping.
> +      </p>
> +
> +      <p>
> +        The equality operator <code>==</code> is the most important
> operator.
> +        Its operands must be a field and an optionally masked constant, in
> +        either order.  The <code>==</code> operator yields true when the
> +        field's value equals the constant's value for all the bits included in
> +        the mask.  The <code>==</code> operator translates simply and
> naturally
> +        to OpenFlow.
> +      </p>
> +
> +      <p>
> +        The inequality operator <code>!=</code> yields the inverse of
> +        <code>==</code> but its syntax and use are the same.
> Implementation of
> +        the inequality operator is expensive.
> +      </p>
> +
> +      <p>
> +        The relational operators are &lt;, &lt;=, &gt;, and &gt;=.  Their
> +        operands must be a field and a constant, in either order; the constant
> +        must not be masked.  These operators are most commonly useful for L4
> +        ports, e.g. <code>tcp.src &lt; 1024</code>.  Implementation of the
> +        relational operators is expensive.
> +      </p>
> +
> +      <p>
> +        The set membership operator <code>in</code>, with syntax
> +        ``<code><var>field</var> in { <var>constant1</var>,
> +        <var>constant2</var>,</code> ... <code>}</code>'', is syntactic sugar
> +        for ``<code>(<var>field</var> == <var>constant1</var> ||
> +        <var>field</var> == <var>constant2</var> ||
> </code>...<code>)</code>.
> +        Conversely, ``<code><var>field</var> not in { <var>constant1</var>,
> +        <var>constant2</var>, </code>...<code> }</code>'' is syntactic sugar
> +        for ``<code>(<var>field</var> != <var>constant1</var> &amp;&amp;
> +        <var>field</var> != <var>constant2</var> &amp;&amp;
> +        </code>...<code>)</code>''.
> +      </p>
> +
> +      <p>
> +        The unary prefix operator <code>!</code> yields its operand's inverse.
> +      </p>
> +
> +      <p>
> +        The logical AND operator <code>&amp;&amp;</code> yields true only
> if
> +        both of its operands are true.
> +      </p>
> +
> +      <p>
> +        The logical OR operator <code>||</code> yields true if at least one of
> +        its operands is true.
> +      </p>
> +
> +      <p>
> +        Finally, the keywords <code>true</code> and <code>false</code>
> may also
> +        be used in matching expressions.  <code>true</code> is useful by itself
> +        as a catch-all expression that matches every packet.
> +      </p>
> +
> +      <p>
> +        (The above is pretty ambitious.  It probably makes sense to initially
> +        implement only a subset of this specification.  The full specification
> +        is written out mainly to get an idea of what a fully general matching
> +        expression language could include.)
> +      </p>
> +    </column>
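
Some worked examples of the expression language would be useful in the
final document. From the description above, I'd expect expressions like
these to be valid (addresses and port names made up):

    eth.src == 00:11:22:33:44:55 && ip4.dst == 192.168.1.0/24
    tcp.dst in { 80, 443 }
    inport == "vif-id-1234" && arp.op == 1

with the eth.type/ip.proto prerequisites added implicitly as described.
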
> +
> +    <column name="actions">
> +      <p>
> +        Below, a <var>value</var> is either a <var>constant</var> or a
> +        <var>field</var>.  The following actions seem most likely to be useful:
> +      </p>
> +
> +      <dl>
> +        <dt><code>drop;</code></dt>
> +        <dd>syntactic sugar for no actions</dd>
> +
> +        <dt><code>output(<var>value</var>);</code></dt>
> +        <dd>output to port</dd>
> +
> +        <dt><code>broadcast;</code></dt>
> +        <dd>output to every logical port except ingress port</dd>
> +
> +        <dt><code>resubmit;</code></dt>
> +        <dd>execute next logical datapath table as subroutine</dd>
> +
> +        <dt><code>set(<var>field</var>=<var>value</var>);</code></dt>
> +        <dd>set data or metadata field, or copy between fields</dd>
> +      </dl>
> +
> +      <p>
> +        Following are not well thought out:
> +      </p>
> +
> +      <dl>
> +          <dt><code>learn</code></dt>
> +
> +          <dt><code>conntrack</code></dt>
> +
> +          <dt><code>with(<var>field</var>=<var>value</var>) {
> <var>action</var>, </code>...<code> }</code></dt>
> +          <dd>execute <var>actions</var> with temporary changes to
> <var>fields</var></dd>
> +
> +          <dt><code>dec_ttl { <var>action</var>, </code>...<code> } {
> <var>action</var>; </code>...<code>}</code></dt>
> +          <dd>
> +            decrement TTL; execute first set of actions if
> +            successful, second set if TTL decrement fails
> +          </dd>
> +
> +          <dt><code>icmp_reply { <var>action</var>, </code>...<code>
> }</code></dt>
> +          <dd>generate ICMP reply from packet, execute
> <var>action</var>s</dd>
> +
> +	  <dt><code>arp { <var>action</var>, </code>...<code>
> }</code></dt>
> +	  <dd>generate ARP from packet, execute <var>action</var>s</dd>
> +      </dl>
> +
> +      <p>
> +        Other actions can be added as needed
> +        (e.g. <code>push_vlan</code>, <code>pop_vlan</code>,
> +        <code>push_mpls</code>, <code>pop_mpls</code>).
> +      </p>
> +
> +      <p>
> +        Some of the OVN actions do not map directly to OpenFlow actions, e.g.:
> +      </p>
> +
> +      <ul>
> +        <li>
> +          <code>with</code>: Implemented as <code>stack_push;
> +          set(</code>...<code>); <var>actions</var>; stack_pop</code>.
> +        </li>
> +
> +        <li>
> +          <code>dec_ttl</code>: Implemented as <code>dec_ttl</code>
> followed
> +          by the successful actions.  The failure case has to be implemented by
> +          ovn-controller interpreting packet-ins.  It might be difficult to
> +          identify the particular place in the processing pipeline in
> +          <code>ovn-controller</code>; maybe some restrictions will be
> +          necessary.
> +        </li>
> +
> +        <li>
> +          <code>icmp_reply</code>: Implemented by sending the packet to
> +          <code>ovn-controller</code>, which generates the ICMP reply and
> sends
> +          the packet back to <code>ovs-vswitchd</code>.
> +        </li>
> +      </ul>
> +    </column>
> +  </table>
> +
> +  <table name="Bindings" title="Physical-Logical Bindings">
> +    <p>
> +      Each row in this table identifies the physical location of a logical
> +      port.  Each hypervisor, via <code>ovn-controller</code>, populates this
> +      table with rows for the logical ports that are located on its hypervisor,
> +      which <code>ovn-controller</code> in turn finds out by monitoring the
> +      local hypervisor's Open_vSwitch database, which identifies logical ports
> +      via the conventions described in <code>IntegrationGuide.md</code>.
> +    </p>
> +
> +    <p>
> +      When a chassis shuts down gracefully, it should remove its bindings.
> +      (This is not critical because resources hosted on the chassis are equally
> +      unreachable regardless of whether their rows are present.)  To handle
> the
> +      case where a VM is shut down abruptly on one chassis, then brought up
> +      again on a different one, <code>ovn-controller</code> must delete any
> +      existing <ref table="Binding"/> record for a logical port when it adds a
> +      new one.
> +    </p>
> +
> +    <column name="logical_port">
> +      A logical port, taken from <ref key="iface-id" table="Interface"
> +      column="external_ids" db="Open_vSwitch"/> in the Open_vSwitch
> database's
> +      <ref table="Interface" db="Open_vSwitch"/> table.  OVN does not
> prescribe
> +      a particular format for the logical port ID.
> +    </column>
> +
> +    <column name="chassis">
> +      The physical location of the logical port.  To successfully identify a
> +      chassis, this column must match the <ref table="Chassis"
> column="name"/>
> +      column in some row in the <ref table="Chassis"/> table.
> +    </column>
> +
> +    <column name="mac">
> +      <p>
> +        The Ethernet address or addresses used as a source address on the
> +        logical port, each in the form
> +
> <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:
> <var>xx</var>.
> +        The string <code>unknown</code> is also allowed to indicate that the
> +        logical port has an unknown set of (additional) source addresses.
> +      </p>
> +
> +      <p>
> +        A VM interface would ordinarily have a single Ethernet address.  A
> +        gateway port might initially only have <code>unknown</code>, and
> then
> +        add MAC addresses to the set as it learns new source addresses.
> +      </p>
> +    </column>
> +  </table>
> +</database>
> --
> 2.1.3
> 

I hear it's pronounced "oven"? Between that and ovs-*kettle, have you guys
got some kind of kitchen theme going on!?

