[ovs-dev] [RFC ovn] ovn: Design and Schema changes for Container integration.

Gurucharan Shetty shettyg at nicira.com
Thu Mar 5 20:28:43 UTC 2015


This design came about after input and discussions with multiple
people, including (in alphabetical order) Aaron Rosen, Ben Pfaff,
Ganesan Chandrashekhar, Justin Pettit and Somik Behera.  There
are still some chinks in the OVN schema that need to be
sorted out, so this is an early version.

Signed-off-by: Gurucharan Shetty <gshetty at nicira.com>
---
 ovn/CONTAINERS.md          |  101 ++++++++++++++++++++++++++++++++++++++++++++
 ovn/automake.mk            |    4 +-
 ovn/ovn-architecture.7.xml |   95 +++++++++++++++++++++++++++++++++++++++++
 ovn/ovn-nb.ovsschema       |    6 +++
 ovn/ovn-nb.xml             |   49 ++++++++++++++++++---
 ovn/ovn.ovsschema          |    6 +++
 ovn/ovn.xml                |   45 +++++++++++++++++---
 7 files changed, 292 insertions(+), 14 deletions(-)
 create mode 100644 ovn/CONTAINERS.md

diff --git a/ovn/CONTAINERS.md b/ovn/CONTAINERS.md
new file mode 100644
index 0000000..0bc7eee
--- /dev/null
+++ b/ovn/CONTAINERS.md
@@ -0,0 +1,101 @@
+Integration of Containers with OVN and Openstack
+------------------------------------------------
+
+In a multi-tenant environment, creating containers directly on hypervisors
+has many risks.  A container application can break out and make changes to
+the Open vSwitch flows and thus impact other tenants.  This document
+describes the creation of containers inside VMs and how they can be made part
+of the logical networks securely.  The created logical network can include VMs,
+containers and physical machines as endpoints.  To better understand the
+proposed integration of Containers with OVN and Openstack, this document
+describes the end to end workflow with an example.
+
+* An OpenStack tenant creates a VM (say VM-A) with a single network interface
+that belongs to a management logical network.  The VM is meant to host
+containers.  OpenStack Nova chooses the hypervisor on which VM-A is created.
+
+* A logical port is created in Neutron with a port id that is the same as
+the vif-id associated with the virtual network interface (VIF) of VM-A.
+
+* When VM-A is created on a hypervisor, its VIF gets added to the
+Open vSwitch integration bridge.  This creates a row in the Interface table
+of the Open_vSwitch database.  As explained in the [IntegrationGuide.md],
+the vif-id associated with the VM network interface gets added in the
+external_ids:iface-id column of the newly created row in the Interface table.
+
+* Since VM-A belongs to a logical network, it gets an IP address.  This IP
+address is used to spawn containers (either manually or through container
+orchestration systems) inside that VM and to monitor their health.
+
+* The vif-id associated with the VM's network interface can be obtained by
+making a call to Neutron using tenant credentials.
+
+* All the calls to Neutron will need tenant credentials.  These calls can
+either be made from inside the tenant VM as part of a container network plugin
+or from outside the tenant VM (if the tenant is not comfortable using temporary
+Keystone tokens from inside the tenant VMs).  For simplicity, this document
+explains the workflow using the former method.
+
+* The container hosting VM will need Open vSwitch installed in it.  Open
+vSwitch's only job inside the VM is to tag network traffic coming from
+the containers.
+
+* When a container needs to be created inside the VM with a container network
+interface that is expected to be attached to a particular logical switch, the
+network plugin in that VM chooses any unused VLAN (this VLAN tag only needs to
+be unique inside that VM, which limits the number of container interfaces to
+4096 inside a single VM).  This VLAN tag is stripped out in the hypervisor
+by OVN and is only useful as a context (or metadata) for OVN.
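The per-VM VLAN selection described above can be sketched as a tiny helper.  This is a hypothetical illustration, not part of the patch; a real network plugin would also have to persist which tags are in use inside the VM:

```python
# Hypothetical sketch: pick a VLAN tag that is unused inside this VM.
# Tags only need to be unique per VM; 0 is left for untagged traffic.
def pick_unused_vlan(used_tags):
    for tag in range(1, 4096):
        if tag not in used_tags:
            return tag
    raise RuntimeError("no free VLAN tags left inside this VM")
```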
+
+* The container network plugin then makes a call to Neutron to create a
+logical port.  In addition to all the inputs that a call to create a port in
+Neutron currently needs, it also sends the vif-id and the VLAN tag as inputs.
+
+* Neutron in turn verifies that the vif-id belongs to the tenant in question
+and then uses the OVN-specific plugin to create a new row in the Logical_Port
+table of the OVN Northbound Database.  Neutron responds with an
+IP address and MAC address for that network interface.  Neutron thus acts as
+the IPAM system and provides unique IP and MAC addresses across VMs and
+containers in the same logical network.
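Conceptually, the Logical_Port row written for a container interface carries the parent vif-id and the VLAN tag alongside the usual fields.  The values below are invented purely for illustration:

```python
# Illustrative Logical_Port row for a container interface; every value
# here is made up.  parent_name is the parent VM's vif-id and tag is
# the VLAN chosen by the network plugin inside that VM.
logical_port = {
    "name": "cif-0001",                # any unique identifier
    "parent_name": "vm-a-vif-id",      # vif-id of the parent VM
    "tag": 42,                         # VLAN tag, unique inside the VM
    "macs": ["fa:16:3e:00:00:01"],     # assigned by Neutron (IPAM)
}
```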
+
+* When a container is eventually deleted, the network plugin in that VM
+will make a call to Neutron to delete that port.  Neutron in turn will
+delete the entry in the Logical_Port table of the OVN Northbound Database.
+
+As an example, consider Docker containers.  Since Docker currently does not
+have a network plugin feature, this example uses a hypothetical wrapper
+around Docker to make calls to Neutron.
+
+* Create a Logical switch, e.g.:
+
+```
+% ovn-docker --cred=cca86bd13a564ac2a63ddf14bf45d37f create network LS1
+```
+
+The above command will make a call to Neutron with the credentials to create
+a logical switch.  This step is optional if the logical switch has already
+been created from outside the VM.
+
+* List networks available to the tenant.
+
+```
+% ovn-docker --cred=cca86bd13a564ac2a63ddf14bf45d37f list networks
+```
+
+* Create a container and attach an interface to the previously created switch
+as a logical port.
+
+```
+% ovn-docker --cred=cca86bd13a564ac2a63ddf14bf45d37f --vif-id=$VIF_ID \
+--network=LS1 run -d --net=none ubuntu:14.04 /bin/sh -c \
+"while true; do echo hello world; sleep 1; done"
+```
+
+The above command will make a call to Neutron with all the inputs it currently
+needs to create a logical port.  In addition, it passes the $VIF_ID and an
+unused VLAN.  Neutron will add that information in OVN and return
+a MAC address and IP address for that interface.  ovn-docker will then create
+a veth pair, insert one end inside the container as 'eth0' and attach the
+other end to a local OVS bridge as an access port with the chosen VLAN tag.
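That last plumbing step can be sketched as the commands ovn-docker might issue.  This is a sketch only: the interface and bridge names are invented, and actually running the commands requires root and Open vSwitch inside the VM:

```python
# Sketch of the veth/VLAN plumbing for one container interface.  The
# function only builds the command strings; it does not execute them.
def plumbing_commands(container_pid, vlan_tag, bridge="br-int"):
    veth_host, veth_cont = "veth-c0", "veth-c1"  # invented names
    return [
        f"ip link add {veth_host} type veth peer name {veth_cont}",
        f"ip link set {veth_cont} netns {container_pid}",
        f"nsenter -t {container_pid} -n ip link set {veth_cont} name eth0",
        # Attach the host end as an access port of the chosen VLAN.
        f"ovs-vsctl add-port {bridge} {veth_host} tag={vlan_tag}",
    ]
```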
+
+[IntegrationGuide.md]:IntegrationGuide.md
diff --git a/ovn/automake.mk b/ovn/automake.mk
index a4951dc..d8cc311 100644
--- a/ovn/automake.mk
+++ b/ovn/automake.mk
@@ -74,4 +74,6 @@ SUFFIXES += .xml
 	$(AM_V_GEN)$(run_python) $(srcdir)/build-aux/xml2nroff \
 		--version=$(VERSION) $< > $@.tmp && mv $@.tmp $@
 
-EXTRA_DIST += ovn/TODO
+EXTRA_DIST += \
+	ovn/TODO \
+	ovn/CONTAINERS.md
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 6971d69..878b189 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -259,6 +259,12 @@
   </p>
 
   <p>
+    A VIF on a hypervisor is a virtual network interface attached either
+    to a VM or to a Container running directly on that hypervisor (this is
+    different from the interface of a Container running inside a VM).
+  </p>
+
+  <p>
     The steps in this example refer often to details of the OVN and OVN
     Northbound database schemas.  Please see <code>ovn</code>(5) and
     <code>ovn-nb</code>(5), respectively, for the full story on these
@@ -390,4 +396,93 @@
     </li>
   </ol>
 
+  <h2>Life Cycle of a Container interface inside a VM</h2>
+
+  <p>
+    Containers can be spawned inside a hypervisor as well as inside a VM.
+    The previous section described the life cycle of a virtual network
+    interface as seen by a hypervisor.  This section describes the life
+    cycle of a container interface (CIF) when created inside a VM.
+    Network traffic from many CIFs may arrive through a single VIF, so
+    the traffic of each CIF must be distinguished with a tag.  OVN uses
+    VLANs as the tagging mechanism.
+  </p>
+
+  <ol>
+    <li>
+      A CIF's life cycle begins when a Container is spawned inside a VM by
+      either the same CMS that created the VM, a tenant that owns that VM,
+      or even a Container Orchestration System different from the CMS
+      that initially created the VM.  Whichever entity it is, it will need to
+      know the <var>vif-id</var> that is associated with the network interface
+      of the VM through which the container interface's network traffic is
+      expected to go.  The entity that creates the container interface
+      will also need to choose an unused VLAN inside that VM.
+    </li>
+
+    <li>
+      The container spawning entity (either directly or through the CMS that
+      manages the underlying infrastructure) updates the OVN Northbound
+      database to include the new CIF, by adding a row to the
+      <code>Logical_Port</code> table.  In the new row, <code>name</code> is
+      any unique identifier, <code>parent_name</code> is the <var>vif-id</var>
+      of the VM through which the CIF's network traffic is expected to go,
+      and <code>tag</code> is the VLAN tag that identifies the network
+      traffic of that CIF.
+    </li>
+
+    <li>
+      <code>ovn-nbd</code> receives the OVN Northbound database update.  In
+      turn, it makes the corresponding updates to the OVN database, by adding
+      rows to the OVN database's <code>Pipeline</code> table to reflect the new
+      port and also by creating a new row in the <code>Bindings</code> table
+      and populating all its columns except the column that identifies the
+      <code>chassis</code>.
+    </li>
+
+    <li>
+      On every hypervisor, <code>ovn-controller</code> subscribes to the
+      changes in the <code>Bindings</code> table.  When <code>ovn-nbd</code>
+      creates a new row that includes a value in the <code>parent_port</code>
+      column of the <code>Bindings</code> table, the
+      <code>ovn-controller</code> on the hypervisor whose OVN integration
+      bridge has that same <var>vif-id</var> in
+      <code>external-ids</code>:<code>iface-id</code>
+      updates the local hypervisor's OpenFlow tables so that packets to and
+      from the VIF with the particular VLAN <code>tag</code> are properly
+      handled.  Afterward, it updates the <code>chassis</code> column of
+      the <code>Bindings</code> row to reflect the physical location.
+    </li>
+
+    <li>
+      One can only start the application inside the container after the
+      underlying network is ready.  To support this, <code>ovn-nbd</code>
+      notices the updated <code>chassis</code> column in the
+      <code>Bindings</code> table and updates the <ref column="up"
+      table="Logical_Port" db="OVN_NB"/> column in the OVN Northbound
+      database's <ref table="Logical_Port" db="OVN_NB"/> table to indicate
+      that the CIF is now up.  The entity responsible for starting the
+      container application queries this value and starts the application.
+    </li>
+
+    <li>
+      Eventually the entity that created and started the container stops it.
+      The entity, through the CMS (or directly), deletes its row in the
+      <code>Logical_Port</code> table.
+    </li>
+
+    <li>
+      <code>ovn-nbd</code> receives the OVN Northbound update and in turn
+      updates the OVN database accordingly, by removing or updating the
+      rows from the OVN database <code>Pipeline</code> table that were related
+      to the now-destroyed CIF.
+    </li>
+
+    <li>
+      On every hypervisor, <code>ovn-controller</code> receives the
+      <code>Pipeline</code> table updates that <code>ovn-nbd</code> made in the
+      previous step.  <code>ovn-controller</code> updates OpenFlow tables to
+      reflect the update and then removes the logical port's row from the
+      <code>Bindings</code> table.
+    </li>
+  </ol>
 </manpage>
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index ad675ac..f7070dc 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -16,6 +16,12 @@
                                             "refTable": "Logical_Switch",
                                             "refType": "strong"}}},
                 "name": {"type": "string"},
+                "parent_name": {"type": {"key": "string", "min": 0, "max": 1}},
+                "tag": {
+                     "type": {"key": {"type": "integer",
+                                      "minInteger": 0,
+                                      "maxInteger": 4095},
+                              "min": 0, "max": 1}},
                 "macs": {"type": {"key": "string",
                                   "min": 0,
                                   "max": "unlimited"}},
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 80190ca..b64b4dc 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -64,12 +64,46 @@
     </column>
 
     <column name="name">
-      The logical port name.  The name used here must match those used in the
+      <p>
+      The logical port name.
+      </p>
+
+      <p>
+      For entities (VMs or Containers) that are spawned in the hypervisor,
+      the name used here must match those used in the <ref key="iface-id"
+      table="Interface" column="external_ids" db="Open_vSwitch"/> in the
+      <ref db="Open_vSwitch"/> database's <ref table="Interface"
+      db="Open_vSwitch"/> table, because hypervisors use <ref key="iface-id"
+      table="Interface" column="external_ids" db="Open_vSwitch"/> as a lookup
+      key to identify the network interface of that entity.
+      </p>
+
+      <p>
+      For Containers that are spawned inside a VM, the name can be
+      any unique identifier.  In such a case, <ref column="parent_name"/>
+      must be populated.
+      </p>
+    </column>
+
+    <column name="parent_name">
+      When <ref column="name"/> identifies the interface of a Container
+      spawned inside a tenant VM, this column represents the VM interface
+      through which the container interface sends its network traffic.
+      The name used here must match those used in the <ref key="iface-id"
+      table="Interface" column="external_ids" db="Open_vSwitch"/> in the
+      <ref db="Open_vSwitch"/> table, because hypervisors in this case use
       <ref key="iface-id" table="Interface" column="external_ids"
-      db="Open_vSwitch"/> in the <ref db="Open_vSwitch"/> database's <ref
-      table="Interface" db="Open_vSwitch"/> table, because hypervisors use <ref
-      key="iface-id" table="Interface" column="external_ids"
-      db="Open_vSwitch"/> as a lookup key for logical ports.
+      db="Open_vSwitch"/> as a lookup key to identify the network interface
+      of the tenant VM.
+    </column>
+
+    <column name="tag">
+      When <ref column="name"/> identifies the interface of a Container
+      spawned inside a tenant VM, this column identifies the VLAN tag in
+      the network traffic associated with that Container's network interface.
+      When there are multiple Container interfaces inside a VM, all of
+      them send their network traffic through a single VM network interface,
+      and this value helps OVN identify the correct Container interface.
     </column>
 
     <column name="up">
@@ -78,8 +112,9 @@
       physical location in the OVN database <ref db="OVN" table="Bindings"/>
       table, <code>ovn-nbd</code> sets this column to <code>true</code>;
       otherwise, or if the port becomes unbound later, it sets it to
-      <code>false</code>.  This allows the CMS to wait for a VM's networking to
-      become active before it allows the VM to start.
+      <code>false</code>.  This allows the CMS to wait for a VM's
+      (or Container's) networking to become active before it allows the
+      VM (or Container) to start.
     </column>
 
     <column name="macs">
diff --git a/ovn/ovn.ovsschema b/ovn/ovn.ovsschema
index 5597df4..d5b3205 100644
--- a/ovn/ovn.ovsschema
+++ b/ovn/ovn.ovsschema
@@ -41,6 +41,12 @@
         "Bindings": {
             "columns": {
                 "logical_port": {"type": "string"},
+                "parent_port": {"type": {"key": "string", "min": 0, "max": 1}},
+                "tag": {
+                     "type": {"key": {"type": "integer",
+                                      "minInteger": 0,
+                                      "maxInteger": 4095},
+                              "min": 0, "max": 1}},
                 "chassis": {"type": "string"},
                 "mac": {"type": {"key": "string",
                                  "min": 0,
diff --git a/ovn/ovn.xml b/ovn/ovn.xml
index ccc2001..992ba57 100644
--- a/ovn/ovn.xml
+++ b/ovn/ovn.xml
@@ -448,7 +448,11 @@
   <table name="Bindings" title="Physical-Logical Bindings">
     <p>
       Each row in this table identifies the physical location of a logical
-      port.  Each hypervisor, via <code>ovn-controller</code>, populates this
+      port.
+    </p>
+
+    <p>
+      Each hypervisor, via <code>ovn-controller</code>, populates this
       table with rows for the logical ports that are located on its hypervisor,
       which <code>ovn-controller</code> in turn finds out by monitoring the
       local hypervisor's Open_vSwitch database, which identifies logical ports
@@ -456,6 +460,13 @@
     </p>
 
     <p>
+      When the logical port is backed by a container interface created inside
+      a VM, <code>ovn-nbd</code> creates a row in this table and expects
+      <code>ovn-controller</code> to fill in the physical location of the
+      logical port.
+    </p>
+
+    <p>
       When a chassis shuts down gracefully, it should remove its bindings.
       (This is not critical because resources hosted on the chassis are equally
       unreachable regardless of whether their rows are present.)  To handle the
@@ -466,16 +477,38 @@
     </p>
 
     <column name="logical_port">
-      A logical port, taken from <ref key="iface-id" table="Interface"
-      column="external_ids" db="Open_vSwitch"/> in the Open_vSwitch database's
-      <ref table="Interface" db="Open_vSwitch"/> table.  OVN does not prescribe
-      a particular format for the logical port ID.
+      The logical port name.  For VMs and Containers created in a hypervisor,
+      this is populated by <code>ovn-controller</code> and is taken from
+      <ref key="iface-id" table="Interface" column="external_ids"
+      db="Open_vSwitch"/> in the Open_vSwitch database's
+      <ref table="Interface" db="Open_vSwitch"/> table.  For Containers
+      created inside a VM, this is taken from <ref table="Logical_Port"
+      column="name" db="OVN_Northbound"/> and populated by <code>ovn-nbd</code>.
+      OVN does not prescribe a particular format for the logical port ID.
+    </column>
+
+    <column name="parent_port">
+      For containers created inside a VM, this is taken from
+      <ref table="Logical_Port" column="parent_name" db="OVN_Northbound"/> and
+      populated by <code>ovn-nbd</code>.  It is left empty if
+      <ref column="logical_port"/> belongs to a VM or a Container created
+      in the hypervisor.
+    </column>
+
+    <column name="tag">
+      When <ref column="logical_port"/> identifies the interface of a Container
+      spawned inside a VM, this column identifies the VLAN tag in
+      the network traffic associated with that Container's network interface.
+      This is populated by <code>ovn-nbd</code>.  It is left empty if
+      <ref column="logical_port"/> belongs to a VM or a Container created
+      in the hypervisor.
     </column>
 
     <column name="chassis">
       The physical location of the logical port.  To successfully identify a
       chassis, this column must match the <ref table="Chassis" column="name"/>
-      column in some row in the <ref table="Chassis"/> table.
+      column in some row in the <ref table="Chassis"/> table.  This is
+      populated by <code>ovn-controller</code>.
     </column>
 
     <column name="mac">
-- 
1.7.9.5



