[ovs-dev] [PATCH] datapath-windows: check source port during tunnel Tx

Alin Serdean aserdean at cloudbasesolutions.com
Wed Aug 13 20:07:24 UTC 2014


Hi Nithin,

At first glance everything seems good.

I even tested it out between a HyperV and KVM hypervisor setup.

On HyperV:
PS C:\package\binaries> .\ovsset.ps1
system at ovs-system:
        lookups: hit:34 missed:273 lost:0
        flows: 0
        masks: hit:0 total:0 hit/pkt:0.00
        port 16777216: internal
        port 16777225: external.1
        port 16777288: vmNICSyn.1000048
d97cd8bd-60a4-4de3-94a5-686c355c63fd
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
    Bridge br-pif
        Port "vmNICSyn.1000048"
            Interface "vmNICSyn.1000048"
        Port br-pif
            Interface br-pif
                type: internal
        Port "vxlan-1"
            Interface "vxlan-1"
                type: vxlan
                options: {in_key=flow, local_ip="10.13.8.102", out_key=flow, remote_ip="10.13.8.2"}
        Port internal
            Interface internal
        Port "external.1"
            Interface "external.1"

No crash until now, will leave it overnight.

Although one thing worries me:
PS C:\package\binaries> ping -t 10.13.8.2

Pinging 10.13.8.2 with 32 bytes of data:
Reply from 10.13.8.2: bytes=32 time=400ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=406ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=404ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=406ms TTL=64
Reply from 10.13.8.2: bytes=32 time<1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=452ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=420ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time<1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=221ms TTL=64
Reply from 10.13.8.2: bytes=32 time=2ms TTL=64
Reply from 10.13.8.2: bytes=32 time=402ms TTL=64
Reply from 10.13.8.2: bytes=32 time<1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64
Reply from 10.13.8.2: bytes=32 time=406ms TTL=64
Reply from 10.13.8.2: bytes=32 time=17ms TTL=64
Reply from 10.13.8.2: bytes=32 time=405ms TTL=64
Reply from 10.13.8.2: bytes=32 time=5ms TTL=64
Reply from 10.13.8.2: bytes=32 time=1ms TTL=64

Very sporadic times.

Will continue my testing and review tomorrow.

Just wanted to let you know :).

Thanks,
Alin.

-----Mesaj original-----
De la: dev [mailto:dev-bounces at openvswitch.org] În numele Nithin Raju
Trimis: Wednesday, August 13, 2014 8:59 AM
Către: dev at openvswitch.org; ssaurabh at vmware.com
Subiect: [ovs-dev] [PATCH] datapath-windows: check source port during tunnel Tx

In the Windows datapath, Tx tunneling functionality is implemented by checking if the outport in the action is a tunnel port. If so, the packet is encapsulated and sent out on the PIF bridge for as second flow lookup.
Basically, we don't use the hypervisor's IP stack to send out a packet, and short circuit the path ourselves. On the PIF bridge, the source port of the encapsulated packet is the VTEP port ie. the internal port.

If a Tunneling port is added to the PIF bridge (a possible misconfiguration), where the VTEP(internal) port and the external port (physical NIC) also reside, a flooding action can cause a loop, by re-injecting the packet on the same PIF bridge which again floods to the tunnel port.

In this change, we break the loop by encapsulating packets only if they are sent out by a VIF or if they originate from userspace ie. userspace generated.
We make use of the input port attribute in the packet execute ioctl.

This change is based off of the legacy datapath interface published in OvsPub.h. This interface has a input port field in the packet execute ioctl.
I looked in dpif-linux.c that uses the netlink based datapath interface and even in that case, we do add the the source port in:
    dpif_linux_encode_execute() -> odp_key_from_pkt_metadata().
So, this fix is applicable when we adopt the netlink based datapath interface as well.

The Rx side of OvsDetectTunnelPkt() has only documentation updates. The fix is on the Tx side.

Validation (using dpif-windows.c):
- Was able to perform VTEP <-> VTEP ping with the configuration posted in the issue.
- Was able to perform VIF <-> VIF ping when the setup was configured correctly.

Signed-off-by: Nithin Raju <nithin at vmware.com>
Reported-by: Alin Serdean <aserdean at cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/20
---
 datapath-windows/ovsext/OvsActions.c | 58 +++++++++++++++++++++++++-----------
 datapath-windows/ovsext/OvsUser.c    | 19 ++++++++----
 datapath-windows/ovsext/OvsVport.h   |  7 +++++
 3 files changed, 60 insertions(+), 24 deletions(-)

diff --git a/datapath-windows/ovsext/OvsActions.c b/datapath-windows/ovsext/OvsActions.c
index 4a2c117..635becc 100644
--- a/datapath-windows/ovsext/OvsActions.c
+++ b/datapath-windows/ovsext/OvsActions.c
@@ -224,14 +224,18 @@ OvsDetectTunnelRxPkt(OvsForwardingContext *ovsFwdCtx,
 /*
  * --------------------------------------------------------------------------
  * OvsDetectTunnelPkt --
- *     Utility function to detect the tunnel type of a TX/RX packet.
+ *     Utility function to detect if a packet is to be subjected to
+ *     tunneling (Tx) or de-tunneling (Rx). Various factors such as source
+ *     port, destination port, packet contents, and previously setup tunnel
+ *     context are used.
  *
  * Result:
- *  True  - if the tunnel type was detected.
- *  False - if not a tunnel packet or tunnel type not supported.
- *
- *  if result==True, the forwarding context gets initialized with the
- *  right tunnel vport.
+ *  True  - If the packet is to be subjected to tunneling.
+ *          In case of invalid tunnel context, the tunneling functionality is
+ *          a no-op and is completed within this function itself by consuming
+ *          all of the tunneling context.
+ *  False - If not a tunnel packet or tunnel type not supported. Caller should
+ *          process the packet as a non-tunnel packet.
  * --------------------------------------------------------------------------
  */
 static __inline BOOLEAN
@@ -239,16 +243,17 @@ OvsDetectTunnelPkt(OvsForwardingContext *ovsFwdCtx,
                    const POVS_VPORT_ENTRY dstVport,
                    const OvsFlowKey *flowKey)  {
-    /*
-     * The source of NBL during tunneling Rx could be the external port or if
-     * it being executed from userspace, the source port is default port.
-     */
-
     if (OvsIsInternalVportType(dstVport->ovsType)) {
+        /*
+         * Rx:
+         * The source of NBL during tunneling Rx could be the external
+         * port or if it is being executed from userspace, the source port is
+         * default port.
+         */
         BOOLEAN validSrcPort = (ovsFwdCtx->fwdDetail->SourcePortId ==
-                            ovsFwdCtx->switchContext->externalPortId)
-                || (ovsFwdCtx->fwdDetail->SourcePortId ==
-                            NDIS_SWITCH_DEFAULT_PORT_ID);
+                                ovsFwdCtx->switchContext->externalPortId) ||
+                               (ovsFwdCtx->fwdDetail->SourcePortId ==
+                                NDIS_SWITCH_DEFAULT_PORT_ID);
 
         if (validSrcPort && OvsDetectTunnelRxPkt(ovsFwdCtx, flowKey)) {
             ASSERT(ovsFwdCtx->tunnelTxNic == NULL); @@ -258,9 +263,27 @@ OvsDetectTunnelPkt(OvsForwardingContext *ovsFwdCtx,
     } else if (OvsIsTunnelVportType(dstVport->ovsType)) {
         ASSERT(ovsFwdCtx->tunnelTxNic == NULL);
         ASSERT(ovsFwdCtx->tunnelRxNic == NULL);
-        ASSERT(ovsFwdCtx->tunKey.dst != 0);
-        ovsActionStats.txVxlan++;
-        ovsFwdCtx->tunnelTxNic = dstVport;
+
+        /*
+         * Tx:
+         * The destination port is a tunnel port. Encapsulation must be
+         * performed only on packets that originate from a VIF port or from
+         * userspace (default port)
+         *
+         * If the packet will not be encapsulated, consume the tunnel context
+         * by clearing it.
+         */
+        if (ovsFwdCtx->srcVportNo != 0 &&
+            !OvsIsVifVportNo(ovsFwdCtx->srcVportNo)) {
+            ovsFwdCtx->tunKey.dst = 0;
+        }
+
+        /* Tunnel the packet only if tunnel context is set. */
+        if (ovsFwdCtx->tunKey.dst != 0) {
+            ovsActionStats.txVxlan++;
+            ovsFwdCtx->tunnelTxNic = dstVport;
+        }
+
         return TRUE;
     }
 
@@ -318,7 +341,6 @@ OvsAddPorts(OvsForwardingContext *ovsFwdCtx,
         NET_BUFFER_DATA_LENGTH(NET_BUFFER_LIST_FIRST_NB(ovsFwdCtx->curNbl));
 
     if (OvsDetectTunnelPkt(ovsFwdCtx, vport, flowKey)) {
-        ASSERT(ovsFwdCtx->tunnelTxNic || ovsFwdCtx->tunnelRxNic);
         return NDIS_STATUS_SUCCESS;
     }
 
diff --git a/datapath-windows/ovsext/OvsUser.c b/datapath-windows/ovsext/OvsUser.c
index 8271d52..5093f20 100644
--- a/datapath-windows/ovsext/OvsUser.c
+++ b/datapath-windows/ovsext/OvsUser.c
@@ -314,6 +314,7 @@ OvsExecuteDpIoctl(PVOID inputBuffer,
     PNDIS_SWITCH_FORWARDING_DETAIL_NET_BUFFER_LIST_INFO fwdDetail;
     OvsFlowKey key;
     OVS_PACKET_HDR_INFO layers;
+    POVS_VPORT_ENTRY vport;
 
     if (inputLength < sizeof(*execute) || outputLength != 0) {
         return STATUS_INFO_LENGTH_MISMATCH; @@ -351,8 +352,14 @@ OvsExecuteDpIoctl(PVOID inputBuffer,
     }
 
     fwdDetail = NET_BUFFER_LIST_SWITCH_FORWARDING_DETAIL(pNbl);
-    fwdDetail->SourcePortId = NDIS_SWITCH_DEFAULT_PORT_ID;
-    fwdDetail->SourceNicIndex = 0;
+    vport = OvsFindVportByPortNo(gOvsSwitchContext, execute->inPort);
+    if (vport) {
+        fwdDetail->SourcePortId = vport->portId;
+        fwdDetail->SourceNicIndex = vport->nicIndex;
+    } else {
+        fwdDetail->SourcePortId = NDIS_SWITCH_DEFAULT_PORT_ID;
+        fwdDetail->SourceNicIndex = 0;
+    }
     // XXX: Figure out if any of the other members of fwdDetail need to be set.
 
     ndisStatus = OvsExtractFlow(pNbl, fwdDetail->SourcePortId, &key, &layers, @@ -362,10 +369,10 @@ OvsExecuteDpIoctl(PVOID inputBuffer,
         NdisAcquireRWLockRead(gOvsSwitchContext->dispatchLock, &lockState,
                               NDIS_RWL_AT_DISPATCH_LEVEL);
         ndisStatus = OvsActionsExecute(gOvsSwitchContext, NULL, pNbl,
-                                     0, // XXX: we are passing 0 for srcVportNo
-                                     NDIS_SEND_FLAGS_SWITCH_DESTINATION_GROUP,
-                                     &key, NULL, &layers, actions,
-                                     execute->actionsLen);
+                                       vport ? vport->portNo : 0,
+                                       NDIS_SEND_FLAGS_SWITCH_DESTINATION_GROUP,
+                                       &key, NULL, &layers, actions,
+                                       execute->actionsLen);
         pNbl = NULL;
         NdisReleaseRWLock(gOvsSwitchContext->dispatchLock, &lockState);
     }
diff --git a/datapath-windows/ovsext/OvsVport.h b/datapath-windows/ovsext/OvsVport.h
index 8fe23f1..4ab0019 100644
--- a/datapath-windows/ovsext/OvsVport.h
+++ b/datapath-windows/ovsext/OvsVport.h
@@ -161,6 +161,13 @@ OvsIsTunnelVportNo(UINT32 portNo)
     return (idx >= OVS_TUNNEL_INDEX_START && idx <= OVS_TUNNEL_INDEX_END);  }
 
+static __inline BOOLEAN
+OvsIsVifVportNo(UINT32 portNo)
+{
+    UINT32 idx = OVS_VPORT_INDEX(portNo);
+    return (idx >= OVS_VM_VPORT_START && idx <= OVS_VM_VPORT_MAX); }
+
 static __inline POVS_VPORT_ENTRY
 OvsGetTunnelVport(OVS_VPORT_TYPE type)
 {
--
1.9.1

_______________________________________________
dev mailing list
dev at openvswitch.org
http://openvswitch.org/mailman/listinfo/dev



More information about the dev mailing list