[ovs-dev] [patch_v8] vtep: add source node replication support.

Anupam Chanda achanda at vmware.com
Wed May 4 02:09:12 UTC 2016


Hi Darrell,

Inlined below with [AC].

Thanks,
Anupam


-------- Original message --------
From: Darrell Ball <dlu998 at gmail.com>
Date: 5/3/2016 5:35 PM (GMT-08:00)
To: Anupam Chanda <achanda at vmware.com>
Cc: Bruce Davie <bdavie at vmware.com>, dev at openvswitch.com
Subject: Re: [patch_v8] vtep: add source node replication support.

Thanks Anupam

Darrell

On Tue, May 3, 2016 at 4:28 PM, Anupam Chanda <achanda at vmware.com<mailto:achanda at vmware.com>> wrote:

Hi Darrell,


Looks good, thanks for working on this. Some comments below tagged with [AC].


Thanks,

Anupam

From: Darrell Ball <dlu998 at gmail.com<mailto:dlu998 at gmail.com>>
Date: Tue, May 3, 2016 at 3:32 PM
Subject: [patch_v8] vtep: add source node replication support.
To: dlu998 at gmail.com<mailto:dlu998 at gmail.com>, bdavie at vmware.com<mailto:bdavie at vmware.com>, achanda at vmware.com<mailto:achanda at vmware.com>, dev at openvswitch.com<mailto:dev at openvswitch.com>


This patch series updates the vtep schema, vtep-ctl commands and vtep
simulator to support source node replication in addition to service node
replication per logical switch.  The default replication mode is service node
as that was the only mode previously supported.  Source node replication
mode is optionally configurable and resetting the replication mode implicitly
sets the replication mode back to a default of service node.

Signed-off-by: Darrell Ball <dlu998 at gmail.com<mailto:dlu998 at gmail.com>>
---
 tests/vtep-ctl.at<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.at&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=cqgNSY2RGg_YQcdyKlzS1orMEjgRrjG_SxtnByEef18&e=>       | 32 +++++++++++++++++++++++
 vtep/README.ovs-vtep.md<https://urldefense.proofpoint.com/v2/url?u=http-3A__README.ovs-2Dvtep.md&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=u67AnpfKI7_ta8BmE8NGlLCqf4qXFDTn9a5PuYwp3RU&e=> | 30 ++++++++++++++++++++--
 vtep/ovs-vtep           | 36 +++++++++++++++++++++++---
 vtep/vtep-ctl.8.in<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.8.in&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=giAJRMO8tjE8s2qz8px4VbvMpr6mm3gd-vHCY1a24cU&e=>      | 15 +++++++++++
 vtep/vtep-ctl.c         | 61 ++++++++++++++++++++++++++++++++++++++++++++
 vtep/vtep.ovsschema     |  9 +++++--
 vtep/vtep.xml           | 67 +++++++++++++++++++++++++++++++++++++++++++++----
 7 files changed, 237 insertions(+), 13 deletions(-)

diff --git a/tests/vtep-ctl.at<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.at&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=cqgNSY2RGg_YQcdyKlzS1orMEjgRrjG_SxtnByEef18&e=> b/tests/vtep-ctl.at<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.at&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=cqgNSY2RGg_YQcdyKlzS1orMEjgRrjG_SxtnByEef18&e=>
index 99e97e8..6a7e059 100644
--- a/tests/vtep-ctl.at<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.at&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=cqgNSY2RGg_YQcdyKlzS1orMEjgRrjG_SxtnByEef18&e=>
+++ b/tests/vtep-ctl.at<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.at&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=cqgNSY2RGg_YQcdyKlzS1orMEjgRrjG_SxtnByEef18&e=>
@@ -318,6 +318,38 @@ CHECK_LSWITCHES([a])
 VTEP_CTL_CLEANUP
 AT_CLEANUP

+AT_SETUP([add-ls a, set-replication-mode a source_node])
+AT_KEYWORDS([vtep-ctl])
+VTEP_CTL_SETUP
+AT_CHECK([RUN_VTEP_CTL(
+  [add-ls a],[set-replication-mode a source_node],
+  [get-replication-mode a])],
+  [0], [source_node
+], [], [VTEP_CTL_CLEANUP])
+VTEP_CTL_CLEANUP
+AT_CLEANUP
+
+AT_SETUP([add-ls a, set-replication-mode a service_node])
+AT_KEYWORDS([vtep-ctl])
+VTEP_CTL_SETUP
+AT_CHECK([RUN_VTEP_CTL(
+  [add-ls a],[set-replication-mode a service_node],
+  [get-replication-mode a])],
+  [0], [service_node
+], [], [VTEP_CTL_CLEANUP])
+VTEP_CTL_CLEANUP
+AT_CLEANUP
+
+AT_SETUP([add-ls a, reset-replication-mode a])
+AT_KEYWORDS([vtep-ctl])
+VTEP_CTL_SETUP
+AT_CHECK([RUN_VTEP_CTL(
+  [add-ls a],[reset-replication-mode a],
+  [get-replication-mode a])],
+  [0], [[(null)]
+], [], [VTEP_CTL_CLEANUP])
+VTEP_CTL_CLEANUP
+AT_CLEANUP

 dnl ----------------------------------------------------------------------
 AT_BANNER([vtep-ctl unit tests -- logical binding tests])
diff --git a/vtep/README.ovs-vtep.md<https://urldefense.proofpoint.com/v2/url?u=http-3A__README.ovs-2Dvtep.md&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=u67AnpfKI7_ta8BmE8NGlLCqf4qXFDTn9a5PuYwp3RU&e=> b/vtep/README.ovs-vtep.md<https://urldefense.proofpoint.com/v2/url?u=http-3A__README.ovs-2Dvtep.md&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=u67AnpfKI7_ta8BmE8NGlLCqf4qXFDTn9a5PuYwp3RU&e=>
index 6734dab..6d37e37 100644
--- a/vtep/README.ovs-vtep.md<https://urldefense.proofpoint.com/v2/url?u=http-3A__README.ovs-2Dvtep.md&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=u67AnpfKI7_ta8BmE8NGlLCqf4qXFDTn9a5PuYwp3RU&e=>
+++ b/vtep/README.ovs-vtep.md<https://urldefense.proofpoint.com/v2/url?u=http-3A__README.ovs-2Dvtep.md&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=u67AnpfKI7_ta8BmE8NGlLCqf4qXFDTn9a5PuYwp3RU&e=>
@@ -166,13 +166,39 @@ vtep-ctl bind-ls br0 p0 0 ls0
 vtep-ctl set Logical_Switch ls0 tunnel_key=33
       ```

-3. Direct unknown destinations out a tunnel:
+3. Optionally, change the replication mode from a default of service_node to
+   source_node, which can be done at the logical switch level:
+
+      ```
+vtep-ctl set-replication-mode ls0 source_node
+      ```
+
+The replication mode can also be reset back to the default of service node replication, if needed, at the logical switch level:
+
+      ```
+vtep-ctl reset-replication-mode ls0
+      ```
+
+Setting the replication mode back to the default of service_node replication
+mode can also be done via the set command, if needed:

[AC] How about: The replication mode may be explicitly set to the default value of service_node replication via the set command, if needed:

Better; thanks



+
+      ```
+vtep-ctl set-replication-mode ls0 service_node
+      ```
+
+The replication mode can also be queried using the command:
+
+      ```
+vtep-ctl get-replication-mode ls0
+      ```

[AC] What's the output of this command if replication-mode has not been previously set?

I mentioned it in the vtep-ctl man page; I said NULL as a simplified version of [(NULL)].
I thought mentioning these details here may be a distraction, unless you feel adding here
as well is helpful.

[AC] Agreed, don't need to add those details here.



+
+4. Direct unknown destinations out a tunnel:

       ```
 vtep-ctl add-mcast-remote ls0 unknown-dst 10.2.2.2
       ```

-4. Direct unicast destinations out a different tunnel:
+5. Direct unicast destinations out a different tunnel:

       ```
 vtep-ctl add-ucast-remote ls0 00:11:22:33:44:55 10.2.2.3
diff --git a/vtep/ovs-vtep b/vtep/ovs-vtep
index 31ff159..7b8cdb1 100755
--- a/vtep/ovs-vtep
+++ b/vtep/ovs-vtep
@@ -94,6 +94,7 @@ class Logical_Switch(object):
         self.unknown_dsts = set()
         self.tunnel_key = 0
         self.setup_ls()
+        self.replication_mode = "service_node"

     def __del__(self):
         vlog.info<https://urldefense.proofpoint.com/v2/url?u=http-3A__vlog.info&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=Qu7WpCm9rMljkyQ_V0qCtGOpg0S-xRfCz7BxxaQU71A&e=>("destroying lswitch %s" % self.name<https://urldefense.proofpoint.com/v2/url?u=http-3A__self.name&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=c9RNWdAiUHVAqA-hNtRQwXwSD3YoyV25LXbyrKp9iFw&e=>)
@@ -141,13 +142,17 @@ class Logical_Switch(object):
             ovs_ofctl("add-flow %s table=1,priority=1,in_port=%s,action=%s"
                         % (self.short_name, port_no, ",".join(flood_ports)))

-        # Traffic coming from a VTEP physical port should only be flooded to
-        # one 'unknown-dst' and to all other physical ports that belong to that
-        # VTEP device and this logical switch.
+        # Traffic coming from a VTEP physical port should always be flooded to
+        # all the other physical ports that belong to that VTEP device and
+        # this logical switch.  If the replication mode is service node then
+        # send to one unknown_dst node (the first one here); else we assume the
+        # replication mode is source node and we send the packet to all
+        # unknown_dst nodes.
         for tunnel in self.unknown_dsts:
             port_no = self.tunnels[tunnel][0]
             flood_ports.append(port_no)
-            break
+            if self.replication_mode == "service_node":
+                break

         ovs_ofctl("add-flow %s table=1,priority=0,action=%s"
                   % (self.short_name, ",".join(flood_ports)))
@@ -293,8 +298,31 @@ class Logical_Switch(object):

         self.remote_macs = remote_macs

+        replication_mode = vtep_ctl("get logical_switch %s replication_mode"
+                                    % self.name<https://urldefense.proofpoint.com/v2/url?u=http-3A__self.name&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=c9RNWdAiUHVAqA-hNtRQwXwSD3YoyV25LXbyrKp9iFw&e=>)
+
+        # Replication mode is an optional column and if it is not set,
+        # replication mode defaults to service_node.
+        if replication_mode == "[]":
+            replication_mode = "service_node"
+
+        # If the logical switch level replication mode has changed then
+        # update to that value.
+        replic_mode_change = False

[AC] s/replic_mode_change/replication_mode_change

sure


+        if replication_mode != self.replication_mode:
+            self.replication_mode = replication_mode
+            vlog.info<https://urldefense.proofpoint.com/v2/url?u=http-3A__vlog.info&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=Qu7WpCm9rMljkyQ_V0qCtGOpg0S-xRfCz7BxxaQU71A&e=>("%s replication mode changed to %s" %
+                       (self.name<https://urldefense.proofpoint.com/v2/url?u=http-3A__self.name&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=c9RNWdAiUHVAqA-hNtRQwXwSD3YoyV25LXbyrKp9iFw&e=>, self.replication_mode))
+            replic_mode_change = True
+
+        unk_dsts_change = False

[AC] s/unk_dsts_change/unknown_dsts_change


sure


         if (self.unknown_dsts != unknown_dsts):
             self.unknown_dsts = unknown_dsts
+            unk_dsts_change = True
+
+        # If either the replication mode has changed or the unknown
+        # destinations set has changed, update the flooding decision.
+        if replic_mode_change is True or unk_dsts_change is True:
             self.update_flood()

     def update_stats(self):
diff --git a/vtep/vtep-ctl.8.in<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.8.in&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=giAJRMO8tjE8s2qz8px4VbvMpr6mm3gd-vHCY1a24cU&e=> b/vtep/vtep-ctl.8.in<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.8.in&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=giAJRMO8tjE8s2qz8px4VbvMpr6mm3gd-vHCY1a24cU&e=>
index 129c7ed..f0da108 100644
--- a/vtep/vtep-ctl.8.in<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.8.in&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=giAJRMO8tjE8s2qz8px4VbvMpr6mm3gd-vHCY1a24cU&e=>
+++ b/vtep/vtep-ctl.8.in<https://urldefense.proofpoint.com/v2/url?u=http-3A__vtep-2Dctl.8.in&d=BQMFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=qV-i3RRHfITiNOOHCqe5JTEo_Co9bBvDWO_i0mHUWqU&m=TRP9_xzf3amc74NNDwbxinFEBUmjNHAkuOwQ-_Yh0zA&s=giAJRMO8tjE8s2qz8px4VbvMpr6mm3gd-vHCY1a24cU&e=>
@@ -195,6 +195,21 @@ combination on the physical switch \fIpswitch\fR.
 List the logical switch bindings for \fIport\fR on the physical switch
 \fIpswitch\fR.
 .
+.IP "\fBset\-replication\-mode \fIlswitch replication\-mode\fR"
+Set logical switch \fIlswitch\fR replication mode to
+\fIreplication\-mode\fR; the only valid values presently for replication
+mode are "service_node" and "source_node".
+.
+.IP "\fBget\-replication\-mode \fIlswitch\fR"
+Get logical switch \fIlswitch\fR replication mode.  The only valid values
+presently for replication mode are "service_node" and "source_node".
+A return value of NULL indicates the replication mode column is not set
+and therefore a default of "service_node" is implied.
+.
+.IP "\fBreset\-replication\-mode \fIlswitch\fR"
+Reset a logical switch \fIlswitch\fR replication mode to the default of
+"service_node".
+.
 .SS "Logical Router Commands"
 These commands examine and manipulate logical routers.
 .
diff --git a/vtep/vtep-ctl.c b/vtep/vtep-ctl.c
index 29d9a17..2254447 100644
--- a/vtep/vtep-ctl.c
+++ b/vtep/vtep-ctl.c
@@ -335,6 +335,9 @@ Logical Switch commands:\n\
   bind-ls PS PORT VLAN LS     bind LS to VLAN on PORT\n\
   unbind-ls PS PORT VLAN      unbind logical switch on VLAN from PORT\n\
   list-bindings PS PORT       list bindings for PORT on PS\n\
+  set-ls-replication-mode LS MODE  set replication mode on LS\n\
+  get-ls-replication-mode LS       get replication mode on LS\n\
+  reset-ls-replication-mode LS     reset replication mode on LS\n\
 \n\
 Logical Router commands:\n\
   add-lr LR                   create a new logical router named LR\n\
@@ -851,6 +854,8 @@ pre_get_info(struct ctl_context *ctx)
     ovsdb_idl_add_column(ctx->idl, &vteprec_physical_port_col_vlan_bindings);

     ovsdb_idl_add_column(ctx->idl, &vteprec_logical_switch_col_name);
+    ovsdb_idl_add_column(ctx->idl,
+                         &vteprec_logical_switch_col_replication_mode);

     ovsdb_idl_add_column(ctx->idl, &vteprec_logical_router_col_name);

@@ -1523,6 +1528,56 @@ cmd_unbind_ls(struct ctl_context *ctx)
     vtep_ctl_context_invalidate_cache(ctx);
 }

+static void
+cmd_set_ls_replication_mode(struct ctl_context *ctx)
+{
+    struct vtep_ctl_context *vtepctl_ctx = vtep_ctl_context_cast(ctx);
+    struct vtep_ctl_lswitch *ls;
+    const char *ls_name = ctx->argv[1];
+
+    vtep_ctl_context_populate_cache(ctx);
+
+    if (strcmp(ctx->argv[2], "service_node") &&
+        strcmp(ctx->argv[2], "source_node")) {
+        ctl_fatal("Replication mode must be 'service_node' or 'source_node'");
+    }
+
+    ls = find_lswitch(vtepctl_ctx, ls_name, true);
+    vteprec_logical_switch_set_replication_mode(ls->ls_cfg, ctx->argv[2]);
+
+    vtep_ctl_context_invalidate_cache(ctx);

[AC] invalidate cache only if the set command actually changes the replication mode?


I prefer to leave it as is and keep the logic as simple as possible; adding if/else to skip invalidate cache for a redundant setting of mode seems an unnecessary complication.

[AC] Agreed. Makes sense to keep this logic simple.

A mode change will be rare and an unnecessary/redundant mode
change will be very rare unless the operator is typing with his beer can.



+}
+
+static void
+cmd_get_ls_replication_mode(struct ctl_context *ctx)
+{
+    struct vtep_ctl_context *vtepctl_ctx = vtep_ctl_context_cast(ctx);
+    struct vtep_ctl_lswitch *ls;
+    const char *ls_name = ctx->argv[1];
+
+    vtep_ctl_context_populate_cache(ctx);
+
+    ls = find_lswitch(vtepctl_ctx, ls_name, true);
+    ds_put_format(&ctx->output, "%s\n", ls->ls_cfg->replication_mode);
+
+    vtep_ctl_context_invalidate_cache(ctx);

[AC] I think the above invalidate cache operation is not needed.

Right, its not needed for "get"



+}
+
+static void
+cmd_reset_ls_replication_mode(struct ctl_context *ctx)
+{
+    struct vtep_ctl_context *vtepctl_ctx = vtep_ctl_context_cast(ctx);
+    struct vtep_ctl_lswitch *ls;
+    const char *ls_name = ctx->argv[1];
+
+    vtep_ctl_context_populate_cache(ctx);
+
+    ls = find_lswitch(vtepctl_ctx, ls_name, true);
+    vteprec_logical_switch_set_replication_mode(ls->ls_cfg, NULL);
+
+    vtep_ctl_context_invalidate_cache(ctx);
+}
+
 static struct vtep_ctl_lrouter *
 find_lrouter(struct vtep_ctl_context *vtepctl_ctx,
              const char *name, bool must_exist)
@@ -2459,6 +2514,12 @@ static const struct ctl_command_syntax vtep_commands[] = {
     {"list-bindings", 2, 2, NULL, pre_get_info, cmd_list_bindings, NULL, "", RO},
     {"bind-ls", 4, 4, NULL, pre_get_info, cmd_bind_ls, NULL, "", RO},
     {"unbind-ls", 3, 3, NULL, pre_get_info, cmd_unbind_ls, NULL, "", RO},
+    {"set-replication-mode", 2, 2, "LS MODE", pre_get_info,
+        cmd_set_ls_replication_mode, NULL, "", RW},
+    {"get-replication-mode", 1, 1, "LS", pre_get_info,
+        cmd_get_ls_replication_mode, NULL, "", RO},
+    {"reset-replication-mode", 1, 1, "LS", pre_get_info,
+        cmd_reset_ls_replication_mode, NULL, "", RW},

     /* Logical Router commands. */
     {"add-lr", 1, 1, NULL, pre_get_info, cmd_add_lr, NULL, "--may-exist", RW},
diff --git a/vtep/vtep.ovsschema b/vtep/vtep.ovsschema
index 533fd2e..e409d8d 100644
--- a/vtep/vtep.ovsschema
+++ b/vtep/vtep.ovsschema
@@ -1,6 +1,6 @@
 {
   "name": "hardware_vtep",
-  "cksum": "770244945 11113",
+  "cksum": "4127261095<tel:4127261095> 11302",
   "tables": {
     "Global": {
       "columns": {
@@ -96,6 +96,11 @@
         "name": {"type": "string"},
         "description": {"type": "string"},
         "tunnel_key": {"type": {"key": "integer", "min": 0, "max": 1}},
+        "replication_mode": {
+          "type": {
+            "key": {
+              "enum": ["set", ["service_node", "source_node"]],
+              "type": "string"},"min": 0, "max": 1}},
         "other_config": {
           "type": {"key": "string", "value": "string",
                    "min": 0, "max": "unlimited"}}},
@@ -296,4 +301,4 @@
           "ephemeral": true}},
       "indexes": [["target"]],
       "isRoot": false}},
-  "version": "1.5.1"}
+  "version": "1.6.0"}
diff --git a/vtep/vtep.xml b/vtep/vtep.xml
index a3a6988..8a486bb 100644
--- a/vtep/vtep.xml
+++ b/vtep/vtep.xml
@@ -357,6 +357,28 @@
         Indicates that an error has occurred in the switch but that no
         more specific information is available.
       </column>
+
+      <column name="switch_fault_status"
+        key="unsupported_source_node_replication">
+        Indicates that the requested source node replication mode cannot be
+        supported by the physical switch;  this specifically means in this
+        context that the physical switch lacks the capability to support
+        source node replication mode.  This error occurs when a controller
+        attempts to set source node replication mode for one of the logical
+        switches that the physical switch is keeping context for.  An NVC
+        that observes this error should take appropriate action (for example
+        reverting the logical switch to service node replication mode).
+        It is recommended that an NVC be proactive and test for support of
+        source node replication by using a test logical switch on vtep
+        physical switch nodes and then trying to change the replication mode
+        to source node on this logical switch, checking for error.  The NVC
+        could remember this capability per vtep physical switch.  Using
+        mixed replication modes on a given logical switch is not recommended.
+        Service node replication mode is considered a basic requirement
+        since it only requires sending a packet to a single transport node,
+        hence its not expected that a switch should report that service node
+        mode cannot be supported.
+      </column>
     </group>

     <group title="Common Column">
@@ -754,6 +776,38 @@
       </column>
     </group>

+    <group title="Replication Mode">
+      <p>
+        For handling broadcast, multicast (in default manner) and unknown

[AC] What is "default manner" here?

BUM multicast default is broadcast on the logical switch.
I can clarify as - "in a default, broadcast manner". What do you think ?

[AC] I would suggest just removing that phrase "in default manner" to keep it simple.



+        unicast traffic, packets can be sent to all members of a logical
+        switch referenced by a physical switch.  There are different modes
+        to replicate the packets.  The default mode of replication is to
+        send the traffic to a service node, which can be a hypervisor,
+        server or appliance, and let the service node handle replication to
+        other transport nodes (hypervisors or other VTEP physical
+        switches).  This mode is called service node replication.  An
+        alternate mode of replication, called source node replication
+        involves the source node sending to all other transport nodes.
+        Hypervisors are always responsible for doing their own
+        replication for locally attached VMs in both modes.  Service node
+        mode is the default and was the only option for prior versions of
+        the schema.  Service node replication mode is considered a basic
+        requirement because it only requires sending the the packet to a
+        single transport node.
+      </p>
+
+      <column name="replication_mode">
+        <p>
+          This optional column defines the replication mode per
+          <ref table="Logical_Switch"/>.  There are 2 valid values
+          presently, <code>service_node</code> and
+          <code>source_node</code>.  If the column is not set, the
+          replication mode defaults to service_node.
+        </p>
+
+      </column>
+    </group>
+
     <group title="Identification">
       <column name="name">
         Symbolic name for the logical switch.
@@ -887,8 +941,8 @@
       Multicast packet replication may be handled by a service node,
       in which case the physical locators will be IP addresses of
       service nodes. If the VTEP supports replication onto multiple
-      tunnels, then this may be used to replicate directly onto
-      VTEP-hypervisor tunnels.
+      tunnels, using source node replication, then this may be used to
+      replicate directly onto VTEP-hypervisor or VTEP-VTEP tunnels.
     </p>

     <column name="MAC">
@@ -911,9 +965,12 @@

     <column name="locator_set">
       The physical locator set to be used to reach this MAC address. In
-      this table, the physical locator set will be either a service node IP
-      address or a set of tunnel IP addresses of hypervisors (and
-      potentially other VTEPs).
+      this table, the physical locator set will be either a set of service
+      node when service node replication is used or the set of transport

[AC] s/"node when"/"nodes when"

Missed this one earlier when Justin mentioned it :-)




+      nodes (defined as hypervisors or VTEPs) participating in the associated
+      logical switch. When service node replication is used, the VTEP should

[AC] in the associated logical switch when source node replication is used.

thanks



+      send packets to one member of the locator set that is known to be
+      healthy and reachable, which could be determined by BFD.

[AC] When source node replication is used, the VTEP should send packets to all members of the locator set.

I might as well mention that here as well; thanks.




     </column>

     <column name="ipaddr">
--
1.9.1





More information about the dev mailing list