[ovs-dev] [PATCH v7 3/3] ovn: Add "localnet" logical port type.

Russell Bryant rbryant at redhat.com
Wed Aug 26 15:07:54 UTC 2015


Introduce a new logical port type called "localnet".  A logical port
with this type also has an option called "network_name".  A "localnet"
logical port represents a connection to a network that is locally
accessible from each chassis running ovn-controller.  ovn-controller
will use the ovn-bridge-mappings configuration to figure out which
patch port on br-int should be used for this port.

OpenStack Neutron has an API extension called "provider networks", which
allows an administrator to request that ports be attached directly to
some pre-existing network in their environment.  There was a previous
thread where we got into the details of this here:

  http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually
interested in virtual networks and just wants all of its compute
resources connected to externally managed networks.  Even in this
environment, OVN still has a lot of value to add.  OVN implements port
security and ACLs for all ports connected to these networks.  OVN also
provides the configuration interface and control plane to manage this
across many hypervisors.

As a specific example, consider an environment with two hypervisors
(A and B) with two VMs on each hypervisor (A1, A2, B1, B2).  Now imagine
that the desired setup from an OpenStack perspective is to have all of
these VMs attached to the same provider network, which is a physical
network we'll refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings
that tell ovn-controller that a local bridge called "br-eth1" is used to
reach the network called "physnet1".  We can simulate the initial setup
of this environment in ovs-sandbox with the following commands:

  # Set up the local hypervisor (A)
  ovs-vsctl add-br br-eth1
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

  # Create a fake remote hypervisor (B)
  ovn-sbctl chassis-add fakechassis geneve 127.0.0.1

To get the behavior we want, we model every Neutron port connected to a
Neutron provider network as an OVN logical switch with 2 ports.  The
first port is a normal logical port to be used by the VM.  The second
logical port is a special port with its type set to "localnet".

You could imagine an alternative configuration where many OVN logical
ports share a single OVN "localnet" logical port on the same OVN logical
switch.  That setup provides something different: the logical ports
would communicate with each other in logical space via tunnels between
hypervisors.  For Neutron's use case, we want all ports communicating
via an existing network without the use of an overlay.
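
For contrast, here is a minimal sketch of what that alternative layout
might look like (the switch and port names here are purely illustrative):

  # One logical switch, many VM ports, and a single shared localnet port.
  ovn-nbctl lswitch-add provnet1
  for n in 1 2 3 4; do
      ovn-nbctl lport-add provnet1 provnet1-port$n
      ovn-nbctl lport-set-macs provnet1-port$n 00:00:00:00:00:0$n
  done
  ovn-nbctl lport-add provnet1 provnet1-physnet1
  ovn-nbctl lport-set-macs provnet1-physnet1 unknown
  ovn-nbctl lport-set-type provnet1-physnet1 localnet
  ovn-nbctl lport-set-options provnet1-physnet1 network_name=physnet1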

To simulate the creation of the OVN logical switches and OVN logical
ports for A1, A2, B1, and B2, you can run the following commands:

  # Create 4 OVN logical switches.  Each logical switch has 2 ports,
  # port1 for a VM and physnet1 for the existing network we are
  # connecting to.
  for n in 1 2 3 4; do
      ovn-nbctl lswitch-add provnet1-$n

      ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
      ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
      ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n

      ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
      ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
      ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
      ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
  done

  # Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
  ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1
  ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1

  # Bind the other 2 ports to the fake remote hypervisor.
  ovn-sbctl lport-bind provnet1-3-port1 fakechassis
  ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical
configuration:

  $ ovn-nbctl show
    lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
        lport provnet1-4-physnet1
            macs: unknown
        lport provnet1-4-port1
            macs: 00:00:00:00:00:04
    lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
        lport provnet1-2-physnet1
            macs: unknown
        lport provnet1-2-port1
            macs: 00:00:00:00:00:02
    lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
        lport provnet1-3-physnet1
            macs: unknown
        lport provnet1-3-port1
            macs: 00:00:00:00:00:03
    lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
        lport provnet1-1-physnet1
            macs: unknown
        lport provnet1-1-port1
            macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound
to each hypervisor:

  $ ovn-sbctl show
  Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-1-port1"
      Port_Binding "provnet1-2-port1"
  Chassis fakechassis
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-3-port1"
      Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be
processed on hypervisor A.  The OpenFlow port numbers in this demo are:

  1 - patch port to br-eth1 (physnet1)
  2 - tunnel to fakechassis
  3 - lport1 (A1)
  4 - lport2 (A2)
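
These OpenFlow port numbers can vary from run to run.  One way to confirm
them in your own sandbox is to ask the bridge directly, for example:

  # Show the OpenFlow port numbers assigned on br-int.
  ovs-ofctl show br-int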

Packet test #1: A1 to A2 - This will be output to ofport 1.  Despite
both VMs being local to this hypervisor, all packets between the VMs go
through physnet1.  In practice, this will get optimized at br-eth1.

  ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #2: physnet1 to A2 - Consider this a continuation of test
#1.  The packet arrives from physnet1, so every logical switch the
network is attached to will be considered.  The end result should be
that the only output is to ofport 4 (A2).

  ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1
is used to reach any other port.  When the packet arrives at hypervisor
B, processing will look just like test #2.

  ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast - Again, the packet will only be sent to
physnet1.

  ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A.  This is
effectively a continuation of test #4.  When a broadcast packet arrives
from physnet1 on hypervisor A, we should see it output to both A1 and
A2 (ofports 3 and 4).

  ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate
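
To see the flows that produce these results, you can also dump everything
that ovn-controller installed on br-int:

  # Dump the full OpenFlow table on br-int.
  ovs-ofctl dump-flows br-int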

Signed-off-by: Russell Bryant <rbryant at redhat.com>
---
 ovn/controller/lflow.h     |  10 +++-
 ovn/controller/physical.c  | 124 +++++++++++++++++++++++++++++++++++++++------
 ovn/ovn-architecture.7.xml |  27 ++++++++++
 ovn/ovn-nb.xml             |  16 ++++--
 ovn/ovn-sb.xml             |  31 ++++++++++--
 5 files changed, 184 insertions(+), 24 deletions(-)

diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
index 5cac76c..02e36ff 100644
--- a/ovn/controller/lflow.h
+++ b/ovn/controller/lflow.h
@@ -58,6 +58,7 @@ struct uuid;
  * These values are documented in ovn-architecture(7), please update the
  * documentation if you change any of them. */
 #define MFF_LOG_DATAPATH MFF_METADATA /* Logical datapath (64 bits). */
+#define MFF_OVN_FLAGS    MFF_REG5     /* Bit flags used internally by OVN. */
 #define MFF_LOG_INPORT   MFF_REG6     /* Logical input port (32 bits). */
 #define MFF_LOG_OUTPORT  MFF_REG7     /* Logical output port (32 bits). */
 
@@ -69,8 +70,13 @@ struct uuid;
     MFF_LOG_REG(MFF_REG1) \
     MFF_LOG_REG(MFF_REG2) \
     MFF_LOG_REG(MFF_REG3) \
-    MFF_LOG_REG(MFF_REG4) \
-    MFF_LOG_REG(MFF_REG5)
+    MFF_LOG_REG(MFF_REG4)
+
+/* Bits used in MFF_OVN_FLAGS. */
+enum {
+    /* Indicates that the packet came in on a localnet port. */
+    OVN_FLAG_LOCALNET = (1 << 0),
+};
 
 void lflow_init(void);
 void lflow_run(struct controller_ctx *, struct hmap *flow_table);
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 2ec0ba9..e43b989 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -23,7 +23,9 @@
 #include "ovn-controller.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "openvswitch/vlog.h"
+#include "shash.h"
 #include "simap.h"
+#include "smap.h"
 #include "sset.h"
 #include "vswitch-idl.h"
 
@@ -138,6 +140,8 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 {
     struct simap lport_to_ofport = SIMAP_INITIALIZER(&lport_to_ofport);
     struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+    struct simap localnet_to_ofport = SIMAP_INITIALIZER(&localnet_to_ofport);
+
     for (int i = 0; i < br_int->n_ports; i++) {
         const struct ovsrec_port *port_rec = br_int->ports[i];
         if (!strcmp(port_rec->name, br_int->name)) {
@@ -150,6 +154,9 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             continue;
         }
 
+        const char *localnet = smap_get(&port_rec->external_ids,
+                                        "ovn-patch-port");
+
         for (int j = 0; j < port_rec->n_interfaces; j++) {
             const struct ovsrec_interface *iface_rec = port_rec->interfaces[j];
 
@@ -162,8 +169,11 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 continue;
             }
 
-            /* Record as chassis or local logical port. */
-            if (chassis_id) {
+            /* Record as patch to local net, chassis, or local logical port. */
+            if (!strcmp(iface_rec->type, "patch") && localnet) {
+                simap_put(&localnet_to_ofport, localnet, ofport);
+                break;
+            } else if (chassis_id) {
                 enum chassis_tunnel_type tunnel_type;
                 if (!strcmp(iface_rec->type, "geneve")) {
                     tunnel_type = GENEVE;
@@ -196,6 +206,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     struct ofpbuf ofpacts;
     ofpbuf_init(&ofpacts, 0);
 
+    struct localnet_flow {
+        struct shash_node node;
+        struct match match;
+        struct ofpbuf ofpacts;
+    };
+    struct shash localnet_inputs = SHASH_INITIALIZER(&localnet_inputs);
+
     /* Set up flows in table 0 for physical-to-logical translation and in table
      * 64 for logical-to-physical translation. */
     const struct sbrec_port_binding *binding;
@@ -210,7 +227,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
         int tag = 0;
         ofp_port_t ofport;
-        if (binding->parent_port) {
+        if (!strcmp(binding->type, "localnet")) {
+            const char *network = smap_get(&binding->options, "network_name");
+            if (!network) {
+                continue;
+            }
+            ofport = u16_to_ofp(simap_get(&localnet_to_ofport, network));
+        } else if (binding->parent_port) {
             ofport = u16_to_ofp(simap_get(&lport_to_ofport,
                                           binding->parent_port));
             if (ofport && binding->tag) {
@@ -235,6 +258,9 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
         struct match match;
         if (!tun) {
+            struct ofpbuf *local_ofpacts = &ofpacts;
+            bool add_input_flow = true;
+
             /* Packets that arrive from a vif can belong to a VM or
              * to a container located inside that VM. Packets that
              * arrive from containers have a tag (vlan) associated with them.
@@ -245,33 +271,65 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
              *
              * Priority 150 is for traffic belonging to containers. For such
              * traffic, match on the tags and then strip the tag.
-             * Priority 100 is for traffic belonging to VMs.
+             * Priority 100 is for traffic belonging to VMs or locally connected
+             * networks.
              *
              * For both types of traffic: set MFF_LOG_INPORT to the logical
              * input port, MFF_LOG_DATAPATH to the logical datapath, and
              * resubmit into the logical ingress pipeline starting at table
              * 16. */
-            match_init_catchall(&match);
-            ofpbuf_clear(&ofpacts);
-            match_set_in_port(&match, ofport);
-            if (tag) {
-                match_set_dl_vlan(&match, htons(tag));
+            if (!strcmp(binding->type, "localnet")) {
+                /* The same OpenFlow port may correspond to localnet ports
+                 * attached to more than one logical datapath, so keep track
+                 * of all the actions to be taken and add them as a single
+                 * flow at the end. */
+
+                const char *network = smap_get(&binding->options, "network_name");
+                struct shash_node *node;
+                struct localnet_flow *ln_flow;
+
+                node = shash_find(&localnet_inputs, network);
+                if (!node) {
+                    ln_flow = xmalloc(sizeof *ln_flow);
+                    match_init_catchall(&ln_flow->match);
+                    match_set_in_port(&ln_flow->match, ofport);
+                    ofpbuf_init(&ln_flow->ofpacts, 0);
+                    /* Set OVN_FLAG_LOCALNET to indicate that the packet
+                     * came in from a localnet port. */
+                    put_load(OVN_FLAG_LOCALNET, MFF_OVN_FLAGS, 0, 32,
+                             &ln_flow->ofpacts);
+
+                    node = shash_add(&localnet_inputs, network, ln_flow);
+                }
+                ln_flow = node->data;
+                local_ofpacts = &ln_flow->ofpacts;
+                add_input_flow = false;
+            } else {
+                ofpbuf_clear(local_ofpacts);
+                match_init_catchall(&match);
+                match_set_in_port(&match, ofport);
+                if (tag) {
+                    match_set_dl_vlan(&match, htons(tag));
+                }
             }
 
             /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
             put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
-                     &ofpacts);
-            put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32, &ofpacts);
+                     local_ofpacts);
+            put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32,
+                     local_ofpacts);
 
             /* Strip vlans. */
             if (tag) {
-                ofpact_put_STRIP_VLAN(&ofpacts);
+                ofpact_put_STRIP_VLAN(local_ofpacts);
             }
 
             /* Resubmit to first logical ingress pipeline table. */
-            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
-            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, tag ? 150 : 100,
-                            &match, &ofpacts);
+            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, local_ofpacts);
+            if (add_input_flow) {
+                ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG,
+                                tag ? 150 : 100, &match, &ofpacts);
+            }
 
             /* Table 33, priority 100.
              * =======================
@@ -341,10 +399,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             match_init_catchall(&match);
             ofpbuf_clear(&ofpacts);
 
-            /* Match MFF_LOG_DATAPATH, MFF_LOG_OUTPORT. */
+            /* Match MFF_LOG_DATAPATH and MFF_LOG_OUTPORT, and require
+             * that OVN_FLAG_LOCALNET is not set. */
             match_set_metadata(&match, htonll(binding->datapath->tunnel_key));
             match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
                           binding->tunnel_key);
+            match_set_reg_masked(&match, MFF_OVN_FLAGS - MFF_REG0,
+                                 0, OVN_FLAG_LOCALNET);
 
             put_encapsulation(mff_ovn_geneve, tun, binding->datapath,
                               binding->tunnel_key, &ofpacts);
@@ -401,6 +462,16 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             } else if (port->chassis) {
                 sset_add(&remote_chassis, port->chassis->name);
+            } else if (!strcmp(port->type, "localnet")) {
+                const char *network = smap_get(&port->options, "network_name");
+                if (!network) {
+                    continue;
+                }
+                if (!simap_contains(&localnet_to_ofport, network)) {
+                    continue;
+                }
+                put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, &ofpacts);
+                put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             }
         }
 
@@ -423,6 +494,9 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         if (!sset_is_empty(&remote_chassis)) {
             ofpbuf_clear(&ofpacts);
 
+            match_set_reg_masked(&match, MFF_OVN_FLAGS - MFF_REG0,
+                                 0, OVN_FLAG_LOCALNET);
+
             const char *chassis;
             const struct chassis_tunnel *prev = NULL;
             SSET_FOR_EACH (chassis, &remote_chassis) {
@@ -516,4 +590,22 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         free(tun);
     }
     hmap_destroy(&tunnels);
+
+    /* Table 0, priority 100
+     * =====================
+     *
+     * We have now determined the full set of actions needed on input
+     * from a locally accessible network, so we can write the flows for them.
+     */
+    struct shash_node *ln_flow_node, *ln_flow_node_next;
+    struct localnet_flow *ln_flow;
+    SHASH_FOR_EACH_SAFE (ln_flow_node, ln_flow_node_next, &localnet_inputs) {
+        ln_flow = ln_flow_node->data;
+        shash_delete(&localnet_inputs, ln_flow_node);
+        ofctrl_add_flow(flow_table, 0, 100, &ln_flow->match, &ln_flow->ofpacts);
+        ofpbuf_uninit(&ln_flow->ofpacts);
+        free(ln_flow);
+    }
+    shash_destroy(&localnet_inputs);
+    simap_destroy(&localnet_to_ofport);
 }
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index f10869d..e93e1c1 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -652,6 +652,16 @@
       tunnels as part of the tunnel key.)
     </dd>
 
+    <dt>OVN flags</dt>
+    <dd>
+      <!-- Keep the following in sync with MFF_OVN_FLAGS in
+           ovn/controller/lflow.h. -->
+      Flows may set bits in Nicira extension register number 5 to aid in
+      processing.  Currently, the only flag is a bit that indicates that a
+      packet arrived via a logical port with a type of <code>localnet</code>.
+      This field is not passed across tunnels.
+    </dd>
+
     <dt>VLAN ID</dt>
     <dd>
       The VLAN ID is used as an interface between OVN and containers nested
@@ -677,6 +687,15 @@
       </p>
 
       <p>
+        It's possible that a single ingress physical port maps to multiple
+        logical ports with a type of <code>localnet</code>.  In that case, an
+        OVN flag is set to indicate that the packet arrived on a
+        <code>localnet</code> port, which is later used to choose the
+        appropriate output.  The logical datapath and input port fields are
+        reset and the packet is resubmitted to table 16 once per such port.
+      </p>
+
+      <p>
         Packets that originate from a container nested within a VM are treated
         in a slightly different way.  The originating container can be
         distinguished based on the VIF-specific VLAN ID, so the
@@ -763,6 +782,14 @@
       </p>
 
       <p>
+        Note that there is special handling in place to ensure that a packet
+        that arrived on a <code>localnet</code> logical port is never sent
+        over a tunnel to a remote hypervisor.  This prevents loops and
+        duplicated packets.  The only outputs will be to logical ports on
+        the local hypervisor.
+      </p>
+
+      <p>
         Flows in table 33 resemble those in table 32 but for logical ports that
         reside locally rather than remotely.  For unicast logical output ports
         on the local hypervisor, the actions just resubmit to table 34.  For
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index ade8164..6e20593 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -116,13 +116,23 @@
       </p>
 
       <p>
-      There are no other logical port types implemented yet.
+      When this column is set to <em>localnet</em>, this logical port represents a
+      connection to a locally accessible network from each ovn-controller instance.
+      A logical switch can only have a single <em>localnet</em> port attached.
       </p>
     </column>
 
     <column name="options">
-        This column provides key/value settings specific to the logical port
-        <ref column="type"/>.
+      <p>
+      This column provides key/value settings specific to the logical port
+      <ref column="type"/>.
+      </p>
+
+      <p>
+      When <ref column="type"/> is set to <em>localnet</em>, you must set the option
+      <em>network_name</em>.  ovn-controller uses local configuration to determine
+      exactly how to connect to this locally accessible network.
+      </p>
     </column>
 
     <column name="parent_name">
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 57e9689..a536833 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -899,13 +899,38 @@
       </p>
 
       <p>
-      There are no other logical port types implemented yet.
+      When this column is set to <em>localnet</em>, this logical port represents a
+      connection to a locally accessible network from each ovn-controller instance.
+      A logical switch can only have a single <em>localnet</em> port attached.
       </p>
     </column>
 
     <column name="options">
-        This column provides key/value settings specific to the logical port
-        <ref column="type"/>.
+      <p>
+      This column provides key/value settings specific to the logical port
+      <ref column="type"/>.
+      </p>
+
+      <p>
+      When <ref column="type"/> is set to <em>localnet</em>, you must set the option
+      <em>network_name</em>.  ovn-controller uses the configuration entry
+      <em>ovn-bridge-mappings</em> to determine how to connect to this network.
+      <em>ovn-bridge-mappings</em> is a list of network names mapped to a local
+      OVS bridge that provides access to that network.  An example of configuring
+      <em>ovn-bridge-mappings</em> would be:
+      </p>
+
+      <p>
+      <em>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</em>
+      </p>
+
+      <p>
+      Also note that when a logical switch has a <em>localnet</em> port attached,
+      every chassis that may have a local vif attached to that logical switch
+      must have a bridge mapping configured to reach that <em>localnet</em>.
+      Traffic that arrives on a <em>localnet</em> port is never forwarded over a tunnel
+      to another chassis.
+      </p>
     </column>
 
     <column name="tunnel_key">
-- 
2.4.3



