[ovs-dev] [PATCH V3] datapath-windows: Improved offloading on STT tunnel

Sairam Venugopal vsairam at vmware.com
Tue May 3 16:48:25 UTC 2016


Hi Paul,

I have added my reply below.

Thanks,
Sairam

On 5/3/16, 5:17 AM, "Paul Boca" <pboca at cloudbasesolutions.com> wrote:

>Hi Sai,
>
>Please see the comments inline.
>
>Thanks,
>Paul
>
>> -----Original Message-----
>> From: Sairam Venugopal [mailto:vsairam at vmware.com]
>> Sent: Saturday, April 30, 2016 11:38 AM
>> To: Paul Boca; dev at openvswitch.org
>> Subject: Re: [ovs-dev] [PATCH V3] datapath-windows: Improved offloading on
>> STT tunnel
>> 
>> Hey Paul,
>> 
>> Thanks for clarifying what OvsExtractLayers was doing. You shouldn't have
>> to extract the layers info for the inner packet since they can be derived
>> from the STT Header Flags - especially when it's CSUM_Partial. That's even
>> mentioned in the STT IETF draft under the STT Flags section -
>> https://tools.ietf.org/html/draft-davie-stt-08 . This definitely impacts
>> the efficiency of STT.
>
>PB: That's true, in case of CSUM_Partial I extract the info from the STT
>header, with a few exceptions explained below.
>For CSUM_Verified I'll move extract layers after because it isn't needed.
>
>> 
>> Are you running into issues when the inner checksum isn't computed and the
>> STT flags mention otherwise? If yes, then that's a bug in the Encap side
>> of things. I would test a similar setup on KVM and verify if it works. The
>> function - OvsDecapSetOffloads mimics what KVM does -
>> https://github.com/openvswitch/ovs/blob/d271907f817db25be8da8d425ac256d7ed8c96a9/datapath/linux/compat/stt.c#L1222
>> and handles the case when the
>> inner checksum is partially computed. When inner checksum is already
>> computed, why is it necessary to update the IP version of the csumInfo?
>> Since we don't request any further offloads.
>
>PB: I've tested Win-to-Win and Lin-to-Win connections and encountered
>some issues regarding checksums and how flags are set.
>In the __push_stt_header function
>https://github.com/openvswitch/ovs/blob/d271907f817db25be8da8d425ac256d7ed8c96a9/datapath/linux/compat/stt.c#L566
>only CHECKSUM_PARTIAL and CHECKSUM_UNNECESSARY are checked, ignoring
>CHECKSUM_COMPLETE and CHECKSUM_NONE in skb->ip_summed. This leads to
>packets with STT header flags set to 0 even if the checksum is not
>computed/verified.
>In case of LSO I added code to recompute the pseudo-checksum because I get
>an invalid pseudo-checksum from Linux in the CSUM_Partial case. If I
>understood the explanation correctly from
>http://lxr.free-electrons.com/source/include/linux/skbuff.h#L204
>the skb->ip_summed field is always set to CHECKSUM_PARTIAL in case of LSO.
>I will rename back OvsDecapApplyOffloads to OvsDecapSetOffloads.

I still don't think we should be computing the checksum for the inner packet
on the receiving side. Can you check what happens in KVM when the inner
checksum isn't computed and the flags are wrong? We can discuss this in the
meeting.

If Linux sends down CSUM_Partial, then it definitely computed the
PseudoChecksum once and we shouldn't have to recompute it. We should
probably file a bug for this in OVS-Issues and fix KVM since it doesn't
send down the right STT header for the Pseudo Checksum. Adding more info on
reproducing the issue will help.
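
To make the pseudo checksum point concrete, here is a minimal sketch of the
IPv4 pseudo-header sum that a CSUM_Partial sender is expected to leave in the
inner TCP checksum field. pseudo_hdr_sum is a hypothetical standalone helper
for illustration only, not the IPPseudoChecksum in ovsext, and it assumes
host-order inputs:

```c
#include <stdint.h>

/* Hypothetical helper (not the ovsext IPPseudoChecksum): folds the IPv4
 * pseudo-header (src, dst, protocol, L4 length) into a 16-bit one's
 * complement sum. Inputs are in host byte order. The result is NOT
 * inverted; it is the partial sum that the rest of the TCP checksum is
 * later accumulated onto. */
static uint16_t
pseudo_hdr_sum(uint32_t src, uint32_t dst, uint8_t proto, uint16_t l4_len)
{
    uint32_t sum = 0;

    sum += (src >> 16) + (src & 0xffff);
    sum += (dst >> 16) + (dst & 0xffff);
    sum += proto;
    sum += l4_len;

    /* Fold any carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)sum;
}
```

For 10.0.0.1 -> 10.0.0.2, TCP (protocol 6) and a 20-byte segment this gives
0x141d. Comparing a value computed this way against what arrives on the wire
for a failing packet should show which side is off.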


>
>> 
>> Moreover, using OvsGetExternalMTU for the inner VM's MSS is also
>> incorrect. Ideally we should be able to retrieve the MTU of the inner VM
>> and use that or require that a certain default MTU should be set for the
>> inner VMs. Were you able to test this by changing the MTU of external
>> adapter without changing that of the inner VM?
>
>PB: When I tested, I changed the MTU only in the VM, and now I tried to
>change the MTU for the external adapter

Sai: Try disabling and re-enabling the OVS switch extension in Hyper-V. We
have a bug where the adapter properties don’t get refreshed on
modification.

>but it doesn't seem to have any effect on sent packets, OvsGetExternalMtu
>returns 1500 always.
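
For reference, the encap MSS in the patch is derived from the external MTU by
subtracting the outer IP and TCP header sizes. A minimal sketch of that
arithmetic, assuming 20-byte headers with no options (encap_mss_from_mtu and
the two constants are hypothetical names, not ovsext code):

```c
#include <stdint.h>

/* Assumed header sizes: IPv4 and TCP without options. */
#define IPV4_HDR_LEN 20u
#define TCP_HDR_LEN  20u

/* Hypothetical helper: MSS available to the outer TCP segment for a
 * given MTU. Returns 0 if the MTU cannot even hold the two headers. */
static uint32_t
encap_mss_from_mtu(uint32_t mtu)
{
    uint32_t overhead = IPV4_HDR_LEN + TCP_HDR_LEN;

    return mtu > overhead ? mtu - overhead : 0;
}
```

With the 1500 that OvsGetExternalMtu keeps returning this yields 1460, so a
stale adapter MTU would show up directly as a fixed MSS on sent packets.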

>
>> 
>> I don't think the STT header flags can be 0. We should probably get it
>> confirmed from some other person on the OVS team. If you are encountering
>> any issues with the current STT setup, please file a bug. There is a
>> higher likelihood of having a bug in OvsSttEncap(). Although, I see an
>> advantage in differentiating OvsExtractLayers from OvsExtractFlows, I am
>> still unsure if this patch improves STT offloading. Do correct me if my
>> understanding is off.
>
>PB: I couldn't find in the STT specs whether this can be 0, but in some
>cases it is 0 in Linux-to-Windows connections.

Sai: Yes, from what I gathered, it was by design. Unlike Linux, we don't
have CHECKSUM_COMPLETE and CHECKSUM_NONE support and hence the
discrepancies. There have been a few bug fixes in the KVM-STT implementation
after Hyper-V went live. We should look at those and see if those changes
address the issues that you see.
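
To illustrate the three cases being discussed (verified, partial, and flags
of 0), here is a rough sketch of how a receiver could classify the STT header
flags. The bit values are my reading of the STT draft's flags field and should
be treated as an assumption; classify_stt_flags and the enum are hypothetical
names, not what OvsDecapSetOffloads does today:

```c
#include <stdint.h>

/* Flag bits as read from the STT draft; values are an assumption here. */
#define STT_CSUM_VERIFIED 0x01  /* inner checksum already verified */
#define STT_CSUM_PARTIAL  0x02  /* only the pseudo-header sum is filled in */
#define STT_PROTO_IPV4    0x04  /* inner packet is IPv4 (else IPv6) */
#define STT_PROTO_TCP     0x08  /* inner L4 is TCP (else UDP) */

enum inner_csum_state {
    INNER_CSUM_UNKNOWN,   /* neither checksum bit set: sender said nothing */
    INNER_CSUM_VERIFIED,  /* nothing left to do on receive */
    INNER_CSUM_PARTIAL    /* checksum must be completed downstream */
};

/* Hypothetical helper: maps the flags byte to a checksum state.
 * Verified takes precedence if both checksum bits are set. */
static enum inner_csum_state
classify_stt_flags(uint8_t flags)
{
    if (flags & STT_CSUM_VERIFIED) {
        return INNER_CSUM_VERIFIED;
    }
    if (flags & STT_CSUM_PARTIAL) {
        return INNER_CSUM_PARTIAL;
    }
    return INNER_CSUM_UNKNOWN;
}
```

The INNER_CSUM_UNKNOWN branch is the flags == 0 case from Linux; what the
receiver should do there is the open question in this thread.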

>
>> 
>> Thanks,
>> Sairam
>> 
>> 
>> On 4/27/16, 12:25 AM, "Paul Boca" <pboca at cloudbasesolutions.com> wrote:
>> 
>> >Added OvsExtractLayers - populates only the layers field without
>> >unnecessary
>> >memory operations for flow part
>> >If in STT header the flags are 0 then force packets checksums
>>calculation
>> >Ensure correct pseudo checksum is set for LSO both on send and receive
>> >
>> >Signed-off-by: Paul-Daniel Boca <pboca at cloudbasesolutions.com>
>> >---
>> >v2: Fixed a NULL pointer dereference.
>> >    Removed some unused local variables and multiple initializations.
>> >v3: Use LSO V2 in OvsDoEncapStt
>> >    Fixed alignment and code style
>> >    Use IpHdr TTL for fragment expiration on receive instead 30s
>> >---
>> > datapath-windows/ovsext/Flow.c         | 243
>> >++++++++++++++++++++++++++++-----
>> > datapath-windows/ovsext/Flow.h         |   2 +
>> > datapath-windows/ovsext/PacketParser.c |  97 +++++++------
>> > datapath-windows/ovsext/PacketParser.h |   8 +-
>> > datapath-windows/ovsext/Stt.c          | 157 +++++++++++++++++----
>> > datapath-windows/ovsext/Stt.h          |   1 -
>> > datapath-windows/ovsext/User.c         |  20 +--
>> > 7 files changed, 413 insertions(+), 115 deletions(-)
>> >
>> >diff --git a/datapath-windows/ovsext/Flow.c
>> >b/datapath-windows/ovsext/Flow.c
>> >index 1f23625..a49a60c 100644
>> >--- a/datapath-windows/ovsext/Flow.c
>> >+++ b/datapath-windows/ovsext/Flow.c
>> >@@ -1566,7 +1566,8 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
>> >
>> >                     ndKey = NlAttrGet(keyAttrs[OVS_KEY_ATTR_ND]);
>> >                     RtlCopyMemory(&icmp6FlowPutKey->ndTarget,
>> >-                                  ndKey->nd_target, sizeof
>> >(icmp6FlowPutKey->ndTarget));
>> >+                                  ndKey->nd_target,
>> >+                                  sizeof (icmp6FlowPutKey->ndTarget));
>> >                     RtlCopyMemory(icmp6FlowPutKey->arpSha,
>> >                                   ndKey->nd_sll, ETH_ADDR_LEN);
>> >                     RtlCopyMemory(icmp6FlowPutKey->arpTha,
>> >@@ -1596,8 +1597,10 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
>> >             arpFlowPutKey->nwSrc = arpKey->arp_sip;
>> >             arpFlowPutKey->nwDst = arpKey->arp_tip;
>> >
>> >-            RtlCopyMemory(arpFlowPutKey->arpSha, arpKey->arp_sha,
>> >ETH_ADDR_LEN);
>> >-            RtlCopyMemory(arpFlowPutKey->arpTha, arpKey->arp_tha,
>> >ETH_ADDR_LEN);
>> >+            RtlCopyMemory(arpFlowPutKey->arpSha, arpKey->arp_sha,
>> >+                          ETH_ADDR_LEN);
>> >+            RtlCopyMemory(arpFlowPutKey->arpTha, arpKey->arp_tha,
>> >+                          ETH_ADDR_LEN);
>> >             /* Kernel datapath assumes 'arpFlowPutKey->nwProto' to be
>>in
>> >host
>> >              * order. */
>> >             arpFlowPutKey->nwProto = (UINT8)ntohs((arpKey->arp_op));
>> >@@ -1846,29 +1849,195 @@ OvsGetFlowMetadata(OvsFlowKey *key,
>> >     return status;
>> > }
>> >
>> >+
>> > /*
>> >-
>> 
>>>*-----------------------------------------------------------------------
>>>--
>> >---
>> >- * Initializes 'flow' members from 'packet', 'skb_priority', 'tun_id',
>> >and
>> >- * 'ofp_in_port'.
>> >- *
>> >- * Initializes 'packet' header pointers as follows:
>> >- *
>> >- *    - packet->l2 to the start of the Ethernet header.
>> >- *
>> >- *    - packet->l3 to just past the Ethernet header, or just past the
>> >- *      vlan_header if one is present, to the first byte of the
>>payload
>> >of the
>> >- *      Ethernet frame.
>> >- *
>> >- *    - packet->l4 to just past the IPv4 header, if one is present and
>> >has a
>> >- *      correct length, and otherwise NULL.
>> >- *
>> >- *    - packet->l7 to just past the TCP, UDP, SCTP or ICMP header, if
>> >one is
>> >- *      present and has a correct length, and otherwise NULL.
>> >- *
>> >- * Returns NDIS_STATUS_SUCCESS normally.  Fails only if packet data
>> >cannot be accessed
>> >- * (e.g. if Pkt_CopyBytesOut() returns an error).
>> >-
>> 
>>>*-----------------------------------------------------------------------
>>>--
>> >---
>> >- */
>> 
>>>+*----------------------------------------------------------------------
>>>--
>> >----
>> >+* Initializes 'layers' members from 'packet'
>> >+*
>> >+* Initializes 'layers' header pointers as follows:
>> >+*
>> >+*    - layers->l2 to the start of the Ethernet header.
>> >+*
>> >+*    - layers->l3 to just past the Ethernet header, or just past the
>> >+*      vlan_header if one is present, to the first byte of the payload
>> >of the
>> >+*      Ethernet frame.
>> >+*
>> >+*    - layers->l4 to just past the IPv4 header, if one is present and
>> >has a
>> >+*      correct length, and otherwise NULL.
>> >+*
>> >+*    - layers->l7 to just past the TCP, UDP, SCTP or ICMP header, if
>>one
>> >is
>> >+*      present and has a correct length, and otherwise NULL.
>> >+*
>> >+*    - layers->isIPv4/isIPv6/isTcp/isUdp/isSctp based on the packet
>>type
>> >+*
>> >+* Returns NDIS_STATUS_SUCCESS normally.
>> >+* Fails only if packet data cannot be accessed.
>> >+* (e.g. if OvsParseIPv6() returns an error).
>> 
>>>+*----------------------------------------------------------------------
>>>--
>> >----
>> >+*/
>> >+NDIS_STATUS
>> >+OvsExtractLayers(const NET_BUFFER_LIST *packet,
>> >+                 POVS_PACKET_HDR_INFO layers)
>> >+{
>> >+    struct Eth_Header *eth;
>> >+    UINT8 offset = 0;
>> >+    PVOID vlanTagValue;
>> >+    ovs_be16 dlType;
>> >+
>> >+    layers->value = 0;
>> >+
>> >+    /* Link layer. */
>> >+    eth = (Eth_Header *)GetStartAddrNBL((NET_BUFFER_LIST *)packet);
>> >+
>> >+    /*
>> >+    * vlan_tci.
>> >+    */
>> >+    vlanTagValue = NET_BUFFER_LIST_INFO(packet,
>> >Ieee8021QNetBufferListInfo);
>> >+    if (!vlanTagValue) {
>> >+        if (eth->dix.typeNBO == ETH_TYPE_802_1PQ_NBO) {
>> >+            offset = sizeof(Eth_802_1pq_Tag);
>> >+        }
>> >+
>> >+        /*
>> >+        * XXX Please note after this point, src mac and dst mac should
>> >+        * not be accessed through eth
>> >+        */
>> >+        eth = (Eth_Header *)((UINT8 *)eth + offset);
>> >+    }
>> >+
>> >+    /*
>> >+    * dl_type.
>> >+    *
>> >+    * XXX assume that at least the first
>> >+    * 12 bytes of received packets are mapped.  This code has the
>> >stronger
>> >+    * assumption that at least the first 22 bytes of 'packet' is
>>mapped
>> >(if my
>> >+    * arithmetic is right).
>> >+    */
>> >+    if (ETH_TYPENOT8023(eth->dix.typeNBO)) {
>> >+        dlType = eth->dix.typeNBO;
>> >+        layers->l3Offset = ETH_HEADER_LEN_DIX + offset;
>> >+    } else if (OvsPacketLenNBL(packet) >= ETH_HEADER_LEN_802_3 &&
>> >+               eth->e802_3.llc.dsap == 0xaa &&
>> >+               eth->e802_3.llc.ssap == 0xaa &&
>> >+               eth->e802_3.llc.control == ETH_LLC_CONTROL_UFRAME &&
>> >+               eth->e802_3.snap.snapOrg[0] == 0x00 &&
>> >+               eth->e802_3.snap.snapOrg[1] == 0x00 &&
>> >+               eth->e802_3.snap.snapOrg[2] == 0x00) {
>> >+        dlType = eth->e802_3.snap.snapType.typeNBO;
>> >+        layers->l3Offset = ETH_HEADER_LEN_802_3 + offset;
>> >+    } else {
>> >+        dlType = htons(OVSWIN_DL_TYPE_NONE);
>> >+        layers->l3Offset = ETH_HEADER_LEN_DIX + offset;
>> >+    }
>> >+
>> >+    /* Network layer. */
>> >+    if (dlType == htons(ETH_TYPE_IPV4)) {
>> >+        struct IPHdr ip_storage;
>> >+        const struct IPHdr *nh;
>> >+
>> >+        layers->isIPv4 = 1;
>> >+        nh = OvsGetIp(packet, layers->l3Offset, &ip_storage);
>> >+        if (nh) {
>> >+            layers->l4Offset = layers->l3Offset + nh->ihl * 4;
>> >+
>> >+            if (!(nh->frag_off & htons(IP_OFFSET))) {
>> >+                if (nh->protocol == SOCKET_IPPROTO_TCP) {
>> >+                    OvsParseTcp(packet, NULL, layers);
>> >+                } else if (nh->protocol == SOCKET_IPPROTO_UDP) {
>> >+                    OvsParseUdp(packet, NULL, layers);
>> >+                } else if (nh->protocol == SOCKET_IPPROTO_SCTP) {
>> >+                    OvsParseSctp(packet, NULL, layers);
>> >+                } else if (nh->protocol == SOCKET_IPPROTO_ICMP) {
>> >+                    ICMPHdr icmpStorage;
>> >+                    const ICMPHdr *icmp;
>> >+
>> >+                    icmp = OvsGetIcmp(packet, layers->l4Offset,
>> >&icmpStorage);
>> >+                    if (icmp) {
>> >+                        layers->l7Offset = layers->l4Offset + sizeof
>> >*icmp;
>> >+                    }
>> >+                }
>> >+            }
>> >+        }
>> >+    } else if (dlType == htons(ETH_TYPE_IPV6)) {
>> >+        NDIS_STATUS status;
>> >+        Ipv6Key ipv6Key;
>> >+
>> >+        status = OvsParseIPv6(packet, &ipv6Key, layers);
>> >+        if (status != NDIS_STATUS_SUCCESS) {
>> >+            return status;
>> >+        }
>> >+        layers->isIPv6 = 1;
>> >+
>> >+        if (ipv6Key.nwProto == SOCKET_IPPROTO_TCP) {
>> >+            OvsParseTcp(packet, &(ipv6Key.l4), layers);
>> >+        } else if (ipv6Key.nwProto == SOCKET_IPPROTO_UDP) {
>> >+            OvsParseUdp(packet, &(ipv6Key.l4), layers);
>> >+        } else if (ipv6Key.nwProto == SOCKET_IPPROTO_SCTP) {
>> >+            OvsParseSctp(packet, &ipv6Key.l4, layers);
>> >+        } else if (ipv6Key.nwProto == SOCKET_IPPROTO_ICMPV6) {
>> >+            Icmp6Key icmp6Key;
>> >+            OvsParseIcmpV6(packet, NULL, &icmp6Key, layers);
>> >+        }
>> >+    } else if (OvsEthertypeIsMpls(dlType)) {
>> >+        MPLSHdr mplsStorage;
>> >+        const MPLSHdr *mpls;
>> >+
>> >+        /*
>> >+        * In the presence of an MPLS label stack the end of the L2
>> >+        * header and the beginning of the L3 header differ.
>> >+        *
>> >+        * A network packet may contain multiple MPLS labels, but we
>> >+        * are only interested in the topmost label stack entry.
>> >+        *
>> >+        * Advance network header to the beginning of the L3 header.
>> >+        * layers->l3Offset corresponds to the end of the L2 header.
>> >+        */
>> >+        for (UINT32 i = 0; i < FLOW_MAX_MPLS_LABELS; i++) {
>> >+            mpls = OvsGetMpls(packet, layers->l3Offset, &mplsStorage);
>> >+            if (!mpls) {
>> >+                break;
>> >+            }
>> >+
>> >+            layers->l3Offset += MPLS_HLEN;
>> >+            layers->l4Offset += MPLS_HLEN;
>> >+
>> >+            if (mpls->lse & htonl(MPLS_BOS_MASK)) {
>> >+                /*
>> >+                * Bottom of Stack bit is set, which means there are no
>> >+                * remaining MPLS labels in the packet.
>> >+                */
>> >+                break;
>> >+            }
>> >+        }
>> >+    }
>> >+
>> >+    return NDIS_STATUS_SUCCESS;
>> >+}
>> >+
>> >+/*
>> 
>>>+*----------------------------------------------------------------------
>>>--
>> >----
>> >+* Initializes 'flow' members from 'packet', 'skb_priority', 'tun_id',
>>and
>> >+* 'ofp_in_port'.
>> >+*
>> >+* Initializes 'packet' header pointers as follows:
>> >+*
>> >+*    - packet->l2 to the start of the Ethernet header.
>> >+*
>> >+*    - packet->l3 to just past the Ethernet header, or just past the
>> >+*      vlan_header if one is present, to the first byte of the payload
>> >of the
>> >+*      Ethernet frame.
>> >+*
>> >+*    - packet->l4 to just past the IPv4 header, if one is present and
>> >has a
>> >+*      correct length, and otherwise NULL.
>> >+*
>> >+*    - packet->l7 to just past the TCP, UDP, SCTP or ICMP header, if
>>one
>> >is
>> >+*      present and has a correct length, and otherwise NULL.
>> >+*
>> >+* Returns NDIS_STATUS_SUCCESS normally.
>> >+* Fails only if packet data cannot be accessed.
>> >+* (e.g. if Pkt_CopyBytesOut() returns an error).
>> 
>>>+*----------------------------------------------------------------------
>>>--
>> >----
>> >+*/
>> > NDIS_STATUS
>> > OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >                UINT32 inPort,
>> >@@ -1900,8 +2069,8 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >
>> >     /* Link layer. */
>> >     eth = (Eth_Header *)GetStartAddrNBL((NET_BUFFER_LIST *)packet);
>> >-    memcpy(flow->l2.dlSrc, eth->src, ETH_ADDR_LENGTH);
>> >-    memcpy(flow->l2.dlDst, eth->dst, ETH_ADDR_LENGTH);
>> >+    RtlCopyMemory(flow->l2.dlSrc, eth->src, ETH_ADDR_LENGTH);
>> >+    RtlCopyMemory(flow->l2.dlDst, eth->dst, ETH_ADDR_LENGTH);
>> >
>> >     /*
>> >      * vlan_tci.
>> >@@ -1923,8 +2092,7 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >             flow->l2.vlanTci = 0;
>> >         }
>> >         /*
>> >-         * XXX
>> >-         * Please note after this point, src mac and dst mac should
>> >+         * XXX Please note after this point, src mac and dst mac
>>should
>> >          * not be accessed through eth
>> >          */
>> >         eth = (Eth_Header *)((UINT8 *)eth + offset);
>> >@@ -1955,7 +2123,8 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >         layers->l3Offset = ETH_HEADER_LEN_DIX + offset;
>> >     }
>> >
>> >-    flow->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE -
>> >flow->l2.offset;
>> >+    flow->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
>> >+                      - flow->l2.offset;
>> >     /* Network layer. */
>> >     if (flow->l2.dlType == htons(ETH_TYPE_IPV4)) {
>> >         struct IPHdr ip_storage;
>> >@@ -2012,9 +2181,9 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >     } else if (flow->l2.dlType == htons(ETH_TYPE_IPV6)) {
>> >         NDIS_STATUS status;
>> >         flow->l2.keyLen += OVS_IPV6_KEY_SIZE;
>> >-        status = OvsParseIPv6(packet, flow, layers);
>> >+        status = OvsParseIPv6(packet, &flow->ipv6Key, layers);
>> >         if (status != NDIS_STATUS_SUCCESS) {
>> >-            memset(&flow->ipv6Key, 0, sizeof (Ipv6Key));
>> >+            RtlZeroMemory(&flow->ipv6Key, sizeof (Ipv6Key));
>> >             return status;
>> >         }
>> >         layers->isIPv6 = 1;
>> >@@ -2029,7 +2198,7 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >         } else if (flow->ipv6Key.nwProto == SOCKET_IPPROTO_SCTP) {
>> >             OvsParseSctp(packet, &flow->ipv6Key.l4, layers);
>> >         } else if (flow->ipv6Key.nwProto == SOCKET_IPPROTO_ICMPV6) {
>> >-            OvsParseIcmpV6(packet, flow, layers);
>> >+            OvsParseIcmpV6(packet, &flow->ipv6Key, &flow->icmp6Key,
>> >layers);
>> >             flow->l2.keyLen += (OVS_ICMPV6_KEY_SIZE -
>>OVS_IPV6_KEY_SIZE);
>> >         }
>> >     } else if (flow->l2.dlType == htons(ETH_TYPE_ARP)) {
>> >@@ -2051,10 +2220,10 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
>> >             }
>> >             if (arpKey->nwProto == ARPOP_REQUEST
>> >                 || arpKey->nwProto == ARPOP_REPLY) {
>> >-                memcpy(&arpKey->nwSrc, arp->arp_spa, 4);
>> >-                memcpy(&arpKey->nwDst, arp->arp_tpa, 4);
>> >-                memcpy(arpKey->arpSha, arp->arp_sha, ETH_ADDR_LENGTH);
>> >-                memcpy(arpKey->arpTha, arp->arp_tha, ETH_ADDR_LENGTH);
>> >+                RtlCopyMemory(&arpKey->nwSrc, arp->arp_spa, 4);
>> >+                RtlCopyMemory(&arpKey->nwDst, arp->arp_tpa, 4);
>> >+                RtlCopyMemory(arpKey->arpSha, arp->arp_sha,
>> >ETH_ADDR_LENGTH);
>> >+                RtlCopyMemory(arpKey->arpTha, arp->arp_tha,
>> >ETH_ADDR_LENGTH);
>> >             }
>> >         }
>> >     } else if (OvsEthertypeIsMpls(flow->l2.dlType)) {
>> >diff --git a/datapath-windows/ovsext/Flow.h
>> >b/datapath-windows/ovsext/Flow.h
>> >index 310c472..88240b5 100644
>> >--- a/datapath-windows/ovsext/Flow.h
>> >+++ b/datapath-windows/ovsext/Flow.h
>> >@@ -53,6 +53,8 @@ NDIS_STATUS OvsAllocateFlowTable(OVS_DATAPATH
>> *datapath,
>> >
>> > NDIS_STATUS OvsGetFlowMetadata(OvsFlowKey *key,
>> >                                PNL_ATTR *keyAttrs);
>> >+NDIS_STATUS OvsExtractLayers(const NET_BUFFER_LIST *packet,
>> >+                             POVS_PACKET_HDR_INFO layers);
>> > NDIS_STATUS OvsExtractFlow(const NET_BUFFER_LIST *pkt, UINT32 inPort,
>> >                            OvsFlowKey *flow, POVS_PACKET_HDR_INFO
>>layers,
>> >                            OvsIPv4TunnelKey *tunKey);
>> >diff --git a/datapath-windows/ovsext/PacketParser.c
>> >b/datapath-windows/ovsext/PacketParser.c
>> >index 93df342..c4a04d0 100644
>> >--- a/datapath-windows/ovsext/PacketParser.c
>> >+++ b/datapath-windows/ovsext/PacketParser.c
>> >@@ -84,14 +84,13 @@ OvsGetPacketBytes(const NET_BUFFER_LIST *nbl,
>> >
>> > NDIS_STATUS
>> > OvsParseIPv6(const NET_BUFFER_LIST *packet,
>> >-             OvsFlowKey *key,
>> >+             Ipv6Key *ipv6Key,
>> >              POVS_PACKET_HDR_INFO layers)
>> > {
>> >     UINT16 ofs = layers->l3Offset;
>> >     IPv6Hdr ipv6HdrStorage;
>> >     const IPv6Hdr *nh;
>> >     UINT32 nextHdr;
>> >-    Ipv6Key *flow= &key->ipv6Key;
>> >
>> >     nh = OvsGetPacketBytes(packet, sizeof *nh, ofs, &ipv6HdrStorage);
>> >     if (!nh) {
>> >@@ -99,15 +98,15 @@ OvsParseIPv6(const NET_BUFFER_LIST *packet,
>> >     }
>> >
>> >     nextHdr = nh->nexthdr;
>> >-    memcpy(&flow->ipv6Src, nh->saddr.s6_addr, 16);
>> >-    memcpy(&flow->ipv6Dst, nh->daddr.s6_addr, 16);
>> >+    RtlCopyMemory(&ipv6Key->ipv6Src, nh->saddr.s6_addr, 16);
>> >+    RtlCopyMemory(&ipv6Key->ipv6Dst, nh->daddr.s6_addr, 16);
>> >
>> >-    flow->nwTos = ((nh->flow_lbl[0] & 0xF0) >> 4) | (nh->priority <<
>>4);
>> >-    flow->ipv6Label =
>> >+    ipv6Key->nwTos = ((nh->flow_lbl[0] & 0xF0) >> 4) | (nh->priority
>><<
>> >4);
>> >+    ipv6Key->ipv6Label =
>> >         ((nh->flow_lbl[0] & 0x0F) << 16) | (nh->flow_lbl[1] << 8) |
>> >nh->flow_lbl[2];
>> >-    flow->nwTtl = nh->hop_limit;
>> >-    flow->nwProto = SOCKET_IPPROTO_NONE;
>> >-    flow->nwFrag = OVS_FRAG_TYPE_NONE;
>> >+    ipv6Key->nwTtl = nh->hop_limit;
>> >+    ipv6Key->nwProto = SOCKET_IPPROTO_NONE;
>> >+    ipv6Key->nwFrag = OVS_FRAG_TYPE_NONE;
>> >
>> >     // Parse extended headers and compute L4 offset
>> >     ofs += sizeof(IPv6Hdr);
>> >@@ -160,9 +159,9 @@ OvsParseIPv6(const NET_BUFFER_LIST *packet,
>> >             /* We only process the first fragment. */
>> >             if (fragHdr->offlg != htons(0)) {
>> >                 if ((fragHdr->offlg & IP6F_OFF_HOST_ORDER_MASK) ==
>> >htons(0)) {
>> >-                    flow->nwFrag = OVS_FRAG_TYPE_FIRST;
>> >+                    ipv6Key->nwFrag = OVS_FRAG_TYPE_FIRST;
>> >                 } else {
>> >-                    flow->nwFrag = OVS_FRAG_TYPE_LATER;
>> >+                    ipv6Key->nwFrag = OVS_FRAG_TYPE_LATER;
>> >                     nextHdr = SOCKET_IPPROTO_FRAGMENT;
>> >                     break;
>> >                 }
>> >@@ -170,7 +169,7 @@ OvsParseIPv6(const NET_BUFFER_LIST *packet,
>> >         }
>> >     }
>> >
>> >-    flow->nwProto = (UINT8)nextHdr;
>> >+    ipv6Key->nwProto = (UINT8)nextHdr;
>> >     layers->l4Offset = ofs;
>> >     return NDIS_STATUS_SUCCESS;
>> > }
>> >@@ -183,10 +182,14 @@ OvsParseTcp(const NET_BUFFER_LIST *packet,
>> >     TCPHdr tcpStorage;
>> >     const TCPHdr *tcp = OvsGetTcp(packet, layers->l4Offset,
>>&tcpStorage);
>> >     if (tcp) {
>> >-        flow->tpSrc = tcp->source;
>> >-        flow->tpDst = tcp->dest;
>> >-        layers->isTcp = 1;
>> >-        layers->l7Offset = layers->l4Offset + 4 * tcp->doff;
>> >+        if (flow) {
>> >+            flow->tpSrc = tcp->source;
>> >+            flow->tpDst = tcp->dest;
>> >+        }
>> >+        if (layers) {
>> >+            layers->isTcp = 1;
>> >+            layers->l7Offset = layers->l4Offset + 4 * tcp->doff;
>> >+        }
>> >     }
>> > }
>> >
>> >@@ -198,10 +201,14 @@ OvsParseSctp(const NET_BUFFER_LIST *packet,
>> >     SCTPHdr sctpStorage;
>> >     const SCTPHdr *sctp = OvsGetSctp(packet, layers->l4Offset,
>> >&sctpStorage);
>> >     if (sctp) {
>> >-        flow->tpSrc = sctp->source;
>> >-        flow->tpDst = sctp->dest;
>> >-        layers->isSctp = 1;
>> >-        layers->l7Offset = layers->l4Offset + sizeof *sctp;
>> >+        if (flow) {
>> >+            flow->tpSrc = sctp->source;
>> >+            flow->tpDst = sctp->dest;
>> >+        }
>> >+        if (layers) {
>> >+            layers->isSctp = 1;
>> >+            layers->l7Offset = layers->l4Offset + sizeof *sctp;
>> >+        }
>> >     }
>> > }
>> >
>> >@@ -213,29 +220,33 @@ OvsParseUdp(const NET_BUFFER_LIST *packet,
>> >     UDPHdr udpStorage;
>> >     const UDPHdr *udp = OvsGetUdp(packet, layers->l4Offset,
>>&udpStorage);
>> >     if (udp) {
>> >-        flow->tpSrc = udp->source;
>> >-        flow->tpDst = udp->dest;
>> >-        layers->isUdp = 1;
>> >-        if (udp->check == 0) {
>> >-            layers->udpCsumZero = 1;
>> >+        if (flow) {
>> >+            flow->tpSrc = udp->source;
>> >+            flow->tpDst = udp->dest;
>> >+        }
>> >+        if (layers) {
>> >+            layers->isUdp = 1;
>> >+            if (udp->check == 0) {
>> >+                layers->udpCsumZero = 1;
>> >+            }
>> >+            layers->l7Offset = layers->l4Offset + sizeof *udp;
>> >         }
>> >-        layers->l7Offset = layers->l4Offset + sizeof *udp;
>> >     }
>> > }
>> >
>> > NDIS_STATUS
>> > OvsParseIcmpV6(const NET_BUFFER_LIST *packet,
>> >-            OvsFlowKey *key,
>> >-            POVS_PACKET_HDR_INFO layers)
>> >+               Ipv6Key *ipv6Key,
>> >+               Icmp6Key *icmp6Key,
>> >+               POVS_PACKET_HDR_INFO layers)
>> > {
>> >     UINT16 ofs = layers->l4Offset;
>> >     ICMPHdr icmpStorage;
>> >     const ICMPHdr *icmp;
>> >-    Icmp6Key *flow = &key->icmp6Key;
>> >
>> >-    memset(&flow->ndTarget, 0, sizeof(flow->ndTarget));
>> >-    memset(flow->arpSha, 0, sizeof(flow->arpSha));
>> >-    memset(flow->arpTha, 0, sizeof(flow->arpTha));
>> >+    memset(&icmp6Key->ndTarget, 0, sizeof(icmp6Key->ndTarget));
>> >+    memset(icmp6Key->arpSha, 0, sizeof(icmp6Key->arpSha));
>> >+    memset(icmp6Key->arpTha, 0, sizeof(icmp6Key->arpTha));
>> >
>> >     icmp = OvsGetIcmp(packet, ofs, &icmpStorage);
>> >     if (!icmp) {
>> >@@ -247,8 +258,10 @@ OvsParseIcmpV6(const NET_BUFFER_LIST *packet,
>> >      * The ICMPv6 type and code fields use the 16-bit transport port
>> >      * fields, so we need to store them in 16-bit network byte order.
>> >      */
>> >-    key->ipv6Key.l4.tpSrc = htons(icmp->type);
>> >-    key->ipv6Key.l4.tpDst = htons(icmp->code);
>> >+    if (ipv6Key) {
>> >+        ipv6Key->l4.tpSrc = htons(icmp->type);
>> >+        ipv6Key->l4.tpDst = htons(icmp->code);
>> >+    }
>> >
>> >     if (icmp->code == 0 &&
>> >         (icmp->type == ND_NEIGHBOR_SOLICIT ||
>> >@@ -261,7 +274,7 @@ OvsParseIcmpV6(const NET_BUFFER_LIST *packet,
>> >         if (!ndTarget) {
>> >             return NDIS_STATUS_FAILURE;
>> >         }
>> >-        flow->ndTarget = *ndTarget;
>> >+        icmp6Key->ndTarget = *ndTarget;
>> >
>> >         while ((UINT32)(ofs + 8) <= OvsPacketLenNBL(packet)) {
>> >             /*
>> >@@ -288,14 +301,14 @@ OvsParseIcmpV6(const NET_BUFFER_LIST *packet,
>> >              * layer option is specified twice.
>> >              */
>> >             if (ndOpt->type == ND_OPT_SOURCE_LINKADDR && optLen == 8)
>>{
>> >-                if (Eth_IsNullAddr(flow->arpSha)) {
>> >-                    memcpy(flow->arpSha, ndOpt + 1, ETH_ADDR_LENGTH);
>> >+                if (Eth_IsNullAddr(icmp6Key->arpSha)) {
>> >+                    memcpy(icmp6Key->arpSha, ndOpt + 1,
>>ETH_ADDR_LENGTH);
>> >                 } else {
>> >                     goto invalid;
>> >                 }
>> >             } else if (ndOpt->type == ND_OPT_TARGET_LINKADDR && optLen
>> >== 8) {
>> >-                if (Eth_IsNullAddr(flow->arpTha)) {
>> >-                    memcpy(flow->arpTha, ndOpt + 1, ETH_ADDR_LENGTH);
>> >+                if (Eth_IsNullAddr(icmp6Key->arpTha)) {
>> >+                    memcpy(icmp6Key->arpTha, ndOpt + 1,
>>ETH_ADDR_LENGTH);
>> >                 } else {
>> >                     goto invalid;
>> >                 }
>> >@@ -309,9 +322,9 @@ OvsParseIcmpV6(const NET_BUFFER_LIST *packet,
>> >     return NDIS_STATUS_SUCCESS;
>> >
>> > invalid:
>> >-    memset(&flow->ndTarget, 0, sizeof(flow->ndTarget));
>> >-    memset(flow->arpSha, 0, sizeof(flow->arpSha));
>> >-    memset(flow->arpTha, 0, sizeof(flow->arpTha));
>> >+    RtlZeroMemory(&icmp6Key->ndTarget, sizeof(icmp6Key->ndTarget));
>> >+    RtlZeroMemory(icmp6Key->arpSha, sizeof(icmp6Key->arpSha));
>> >+    RtlZeroMemory(icmp6Key->arpTha, sizeof(icmp6Key->arpTha));
>> >
>> >     return NDIS_STATUS_FAILURE;
>> > }
>> >diff --git a/datapath-windows/ovsext/PacketParser.h
>> >b/datapath-windows/ovsext/PacketParser.h
>> >index 47d227f..f1d7f28 100644
>> >--- a/datapath-windows/ovsext/PacketParser.h
>> >+++ b/datapath-windows/ovsext/PacketParser.h
>> >@@ -22,7 +22,7 @@
>> >
>> > const VOID* OvsGetPacketBytes(const NET_BUFFER_LIST *_pNB, UINT32 len,
>> >                               UINT32 SrcOffset, VOID *storage);
>> >-NDIS_STATUS OvsParseIPv6(const NET_BUFFER_LIST *packet, OvsFlowKey
>> *key,
>> >+NDIS_STATUS OvsParseIPv6(const NET_BUFFER_LIST *packet, Ipv6Key *key,
>> >                         POVS_PACKET_HDR_INFO layers);
>> > VOID OvsParseTcp(const NET_BUFFER_LIST *packet, L4Key *flow,
>> >                  POVS_PACKET_HDR_INFO layers);
>> >@@ -30,8 +30,10 @@ VOID OvsParseUdp(const NET_BUFFER_LIST *packet,
>> L4Key
>> >*flow,
>> >                  POVS_PACKET_HDR_INFO layers);
>> > VOID OvsParseSctp(const NET_BUFFER_LIST *packet, L4Key *flow,
>> >                   POVS_PACKET_HDR_INFO layers);
>> >-NDIS_STATUS OvsParseIcmpV6(const NET_BUFFER_LIST *packet, OvsFlowKey
>> >*key,
>> >-                            POVS_PACKET_HDR_INFO layers);
>> >+NDIS_STATUS OvsParseIcmpV6(const NET_BUFFER_LIST *packet,
>> >+                           Ipv6Key *ipv6Key,
>> >+                           Icmp6Key *flow,
>> >+                           POVS_PACKET_HDR_INFO layers);
>> >
>> > static __inline ULONG
>> > OvsPacketLenNBL(const NET_BUFFER_LIST *_pNB)
>> >diff --git a/datapath-windows/ovsext/Stt.c
>>b/datapath-windows/ovsext/Stt.c
>> >index dd7bf92..c547837 100644
>> >--- a/datapath-windows/ovsext/Stt.c
>> >+++ b/datapath-windows/ovsext/Stt.c
>> >@@ -217,6 +217,8 @@ OvsDoEncapStt(POVS_VPORT_ENTRY vport,
>> >         } else {
>> >             innerPartialChecksum = TRUE;
>> >         }
>> >+    } else if (!layers->isIPv4) {
>> >+            innerChecksumVerified = TRUE;
>> >     }
>> >
>> >     status = NdisRetreatNetBufferDataStart(curNb, headRoom, 0, NULL);
>> >@@ -231,8 +233,8 @@ OvsDoEncapStt(POVS_VPORT_ENTRY vport,
>> >      * memory.
>> >      */
>> >     curMdl = NET_BUFFER_CURRENT_MDL(curNb);
>> >-    ASSERT((int) (MmGetMdlByteCount(curMdl) - NET_BUFFER_CURRENT_MDL_OFFSET(curNb))
>> >-                >= (int) headRoom);
>> >+    ASSERT((int) (MmGetMdlByteCount(curMdl) -
>> >+                NET_BUFFER_CURRENT_MDL_OFFSET(curNb)) >= (int) headRoom);
>> >
>> >     buf = (PUINT8) MmGetSystemAddressForMdlSafe(curMdl, LowPagePriority);
>> >     if (!buf) {
>> >@@ -288,8 +290,10 @@ OvsDoEncapStt(POVS_VPORT_ENTRY vport,
>> >     /* Calculate pseudo header chksum */
>> >     tcpChksumLen = sizeof(TCPHdr) + STT_HDR_LEN + innerFrameLen;
>> >     ASSERT(tcpChksumLen < 65535);
>> >-    outerTcpHdr->check = IPPseudoChecksum(&fwdInfo->srcIpAddr,(uint32 *) &tunKey->dst,
>> >-                                          IPPROTO_TCP, (uint16) tcpChksumLen);
>> >+    outerTcpHdr->check = IPPseudoChecksum(&fwdInfo->srcIpAddr,
>> >+                                          (uint32 *) &tunKey->dst,
>> >+                                          IPPROTO_TCP,
>> >+                                          (uint16) tcpChksumLen);
>> >     sttHdr->version = 0;
>> >
>> >     /* Set STT Header */
>> >@@ -327,8 +331,16 @@ OvsDoEncapStt(POVS_VPORT_ENTRY vport,
>> >     NET_BUFFER_LIST_INFO(curNbl,
>> >                          TcpIpChecksumNetBufferListInfo) = csumInfo.Value;
>> >
>> >-    UINT32 encapMss = OvsGetExternalMtu(switchContext) - sizeof(IPHdr) - sizeof(TCPHdr);
>> >+    UINT32 encapMss = OvsGetExternalMtu(switchContext)
>> >+                      - sizeof(IPHdr)
>> >+                      - sizeof(TCPHdr);
>> >     if (ipTotalLen > encapMss) {
>> >+        outerIpHdr->check = IPChecksum((UINT8 *)outerIpHdr,
>> >+                                       sizeof *outerIpHdr, 0);
>> >+        outerTcpHdr->check = IPPseudoChecksum(&fwdInfo->srcIpAddr,
>> >+                                              (uint32 *) &tunKey->dst,
>> >+                                              IPPROTO_TCP, (uint16) 0);
>> >+
>> >         lsoInfo.Value = 0;
>> >         lsoInfo.LsoV2Transmit.TcpHeaderOffset = tcpHeaderOffset;
>> >         lsoInfo.LsoV2Transmit.MSS = encapMss;
>> >@@ -616,7 +628,8 @@ OvsSttReassemble(POVS_SWITCH_CONTEXT switchContext,
>> >
>> >         UINT64 currentTime;
>> >         NdisGetCurrentSystemTime((LARGE_INTEGER *) &currentTime);
>> >-        entry->timeout = currentTime + STT_ENTRY_TIMEOUT;
>> >+        // use IpHdr TTL for fragment expiration
>> >+        entry->timeout = currentTime + ((UINT64)ipHdr->ttl*1000*1000*10);
>> >
>> >         if (segOffset == 0) {
>> >             entry->sttHdr = *sttHdr;
>> >@@ -655,7 +668,8 @@ handle_error:
>> >     if (lastPacket) {
>> >         /* Retrieve the original STT header */
>> >         NdisMoveMemory(newSttHdr, &pktFragEntry->sttHdr, sizeof (SttHdr));
>> >-        targetPNbl = OvsAllocateNBLFromBuffer(switchContext, pktFragEntry->packetBuf,
>> >+        targetPNbl = OvsAllocateNBLFromBuffer(switchContext,
>> >+                                              pktFragEntry->packetBuf,
>> >                                               innerPacketLen);
>> >
>> >         /* Delete this entry and free up the memory/ */
>> >@@ -668,16 +682,68 @@ handle_error:
>> >     return lastPacket ? targetPNbl : NULL;
>> > }
>> >
>> >-VOID
>> >-OvsDecapSetOffloads(PNET_BUFFER_LIST curNbl, SttHdr *sttHdr)
>> >+
>> >+/*
>> >+*----------------------------------------------------------------------------
>> >+* OvsDecapApplyOffloads
>> >+*     Processes received STT header and sets TcpIpChecksumNetBufferListInfo
>> >+*     accordingly.
>> >+*     For TCP packets with total length bigger than destination MSS it
>> >+*     populates TcpLargeSendNetBufferListInfo.
>> >+*
>> >+* Returns NDIS_STATUS_SUCCESS normally.
>> >+* Fails only if packet data is invalid.
>> >+* (e.g. if OvsExtractLayers() returns an error).
>> >+*----------------------------------------------------------------------------
>> >+*/
>> >+NDIS_STATUS
>> >+OvsDecapApplyOffloads(POVS_SWITCH_CONTEXT switchContext,
>> >+                    PNET_BUFFER_LIST *curNbl, SttHdr *sttHdr)
>> > {
>> >-    if ((sttHdr->flags & STT_CSUM_VERIFIED)
>> >-        || !(sttHdr->flags & STT_CSUM_PARTIAL)) {
>> >-        return;
>> >+    NDIS_STATUS status;
>> >+    OVS_PACKET_HDR_INFO layers;
>> >+    NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo;
>> >+
>> >+    // if STT_CSUM_PARTIAL is not set we have two options:
>> >+    //     - STT_CSUM_VERIFIED is set - we only set IsIPv4/IsIPv6
>> >+    //     - no flag set - we must compute the checksums
>> >+    if (!(sttHdr->flags & STT_CSUM_PARTIAL)) {
>> >+        status = OvsExtractLayers(*curNbl, &layers);
>> >+        if (status != NDIS_STATUS_SUCCESS) {
>> >+            return status;
>> >+        }
>> >+
>> >+        csumInfo.Value = 0;
>> >+        csumInfo.Transmit.IsIPv4 = layers.isIPv4;
>> >+        csumInfo.Transmit.IsIPv6 = layers.isIPv6;
>> >+
>> >+        if (sttHdr->flags & STT_CSUM_VERIFIED) {
>> >+            NET_BUFFER_LIST_INFO(*curNbl,
>> >+                TcpIpChecksumNetBufferListInfo) = csumInfo.Value;
>> >+            return NDIS_STATUS_SUCCESS;
>> >+        }
>> >+
>> >+        /* Set Transmit fields in order to calculate the checksums */
>> >+        csumInfo.Transmit.IpHeaderChecksum = layers.isIPv4;
>> >+        csumInfo.Transmit.TcpChecksum = layers.isTcp;
>> >+        csumInfo.Transmit.UdpChecksum = layers.isUdp;
>> >+
>> >+        status = OvsApplySWChecksumOnNB(&layers, *curNbl, &csumInfo);
>> >+        if (status != NDIS_STATUS_SUCCESS) {
>> >+            return status;
>> >+        }
>> >+
>> >+        csumInfo.Value = 0;
>> >+        csumInfo.Transmit.IpHeaderChecksum = 0;
>> >+        csumInfo.Transmit.TcpChecksum = 0;
>> >+        csumInfo.Transmit.UdpChecksum = 0;
>> >+        NET_BUFFER_LIST_INFO(*curNbl,
>> >+            TcpIpChecksumNetBufferListInfo) = csumInfo.Value;
>> >+
>> >+        return NDIS_STATUS_SUCCESS;
>> >     }
>> >
>> >     UINT8 protoType;
>> >-    NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo;
>> >     csumInfo.Value = 0;
>> >     csumInfo.Transmit.IpHeaderChecksum = 0;
>> >     csumInfo.Transmit.TcpHeaderOffset = sttHdr->l4Offset;
>> >@@ -703,14 +769,58 @@ OvsDecapSetOffloads(PNET_BUFFER_LIST curNbl, SttHdr *sttHdr)
>> >             csumInfo.Transmit.IsIPv6 = 1;
>> >             csumInfo.Transmit.UdpChecksum = 1;
>> >     }
>> >-    NET_BUFFER_LIST_INFO(curNbl,
>> >+    NET_BUFFER_LIST_INFO(*curNbl,
>> >                          TcpIpChecksumNetBufferListInfo) = csumInfo.Value;
>> >
>> >     if (sttHdr->mss) {
>> >         NDIS_TCP_LARGE_SEND_OFFLOAD_NET_BUFFER_LIST_INFO lsoInfo;
>> >+
>> >+        if (sttHdr->flags & STT_PROTO_TCP)
>> >+        {
>> >+            PMDL curMdl = NULL;
>> >+            PNET_BUFFER curNb;
>> >+            PUINT8 buf = NULL;
>> >+
>> >+            status = OvsExtractLayers(*curNbl, &layers);
>> >+            if (status != NDIS_STATUS_SUCCESS) {
>> >+                return status;
>> >+            }
>> >+
>> >+            curNb = NET_BUFFER_LIST_FIRST_NB(*curNbl);
>> >+            curMdl = NET_BUFFER_CURRENT_MDL(curNb);
>> >+
>> >+            buf = (PUINT8)MmGetSystemAddressForMdlSafe(curMdl,
>> >+                LowPagePriority);
>> >+            buf += NET_BUFFER_CURRENT_MDL_OFFSET(curNb);
>> >+
>> >+            // apply pseudo checksum on extracted packet
>> >+            if (sttHdr->flags & STT_PROTO_IPV4) {
>> >+                IPHdr *ipHdr;
>> >+                TCPHdr *tcpHdr;
>> >+
>> >+                ipHdr = (IPHdr *)(buf + layers.l3Offset);
>> >+                tcpHdr = (TCPHdr *)(buf + layers.l4Offset);
>> >+
>> 
>> Sai: Why are you recomputing the checksum here? Doesn't this mean your
>> previous changes have been reset?
>> 
>> >+                tcpHdr->check = IPPseudoChecksum(&ipHdr->saddr,
>> >+                                                 (uint32 *)&ipHdr->daddr,
>> >+                                                 IPPROTO_TCP, 0);
>> >+            } else {
>> >+                IPv6Hdr *ipHdr;
>> >+                TCPHdr *tcpHdr;
>> >+
>> >+                ipHdr = (IPv6Hdr *)(buf + layers.l3Offset);
>> >+                tcpHdr = (TCPHdr *)(buf + layers.l4Offset);
>> >+
>> >+                tcpHdr->check = IPv6PseudoChecksum((UINT32*)&ipHdr->saddr,
>> >+                                           (UINT32*)&ipHdr->daddr,
>> >+                                           IPPROTO_TCP, 0);
>> >+            }
>> >+        }
>> >+
>> >+        // setup LSO
>> >         lsoInfo.Value = 0;
>> >         lsoInfo.LsoV2Transmit.TcpHeaderOffset = sttHdr->l4Offset;
>> >-        lsoInfo.LsoV2Transmit.MSS = ETH_DEFAULT_MTU
>> 
>> Sai: We can't use the ExternalMTU as the MSS for the inner VM.
>> 
>> >+        lsoInfo.LsoV2Transmit.MSS = OvsGetExternalMtu(switchContext)
>> >                                     - sizeof(IPHdr)
>> >                                     - sizeof(TCPHdr);
>> >         lsoInfo.LsoV2Transmit.Type = NDIS_TCP_LARGE_SEND_OFFLOAD_V2_TYPE;
>> >@@ -719,9 +829,11 @@ OvsDecapSetOffloads(PNET_BUFFER_LIST curNbl, SttHdr *sttHdr)
>> >         } else {
>> >             lsoInfo.LsoV2Transmit.IPVersion = NDIS_TCP_LARGE_SEND_OFFLOAD_IPv6;
>> >         }
>> >-        NET_BUFFER_LIST_INFO(curNbl,
>> >+        NET_BUFFER_LIST_INFO(*curNbl,
>> >                              TcpLargeSendNetBufferListInfo) = lsoInfo.Value;
>> >     }
>> >+
>> >+    return NDIS_STATUS_SUCCESS;
>> > }
>> >
>> > /*
>> >@@ -736,15 +848,14 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
>> >             OvsIPv4TunnelKey *tunKey,
>> >             PNET_BUFFER_LIST *newNbl)
>> > {
>> >-    NDIS_STATUS status = NDIS_STATUS_FAILURE;
>> >-    PNET_BUFFER curNb, newNb;
>> >+    NDIS_STATUS status;
>> >+    PNET_BUFFER curNb;
>> >     IPHdr *ipHdr;
>> >     char *ipBuf[sizeof(IPHdr)];
>> >     SttHdr stt;
>> >     SttHdr *sttHdr;
>> >     char *sttBuf[STT_HDR_LEN];
>> >     UINT32 advanceCnt, hdrLen;
>> >-    BOOLEAN isLsoPacket = FALSE;
>> >
>> >     curNb = NET_BUFFER_LIST_FIRST_NB(curNbl);
>> >     ASSERT(NET_BUFFER_NEXT_NB(curNb) == NULL);
>> >@@ -767,7 +878,7 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
>> >     TCPHdr *tcp = (TCPHdr *)((PCHAR)ipHdr + ipHdr->ihl * 4);
>> >
>> >     /* Skip IP & TCP headers */
>> >-    hdrLen = sizeof(IPHdr) + sizeof(TCPHdr),
>> >+    hdrLen = (ipHdr->ihl * 4) + (tcp->doff * 4);
>> >     NdisAdvanceNetBufferDataStart(curNb, hdrLen, FALSE, NULL);
>> >     advanceCnt += hdrLen;
>> >
>> >@@ -775,7 +886,7 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
>> >     UINT32 totalLen = (seq >> STT_SEQ_LEN_SHIFT);
>> >     UINT16 payloadLen = (UINT16)ntohs(ipHdr->tot_len)
>> >                         - (ipHdr->ihl * 4)
>> >-                        - (sizeof * tcp);
>> >+                        - (tcp->doff * 4);
>> >
>> >     /* Check if incoming packet requires reassembly */
>> >     if (totalLen != payloadLen) {
>> >@@ -788,7 +899,6 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
>> >         }
>> >
>> >         *newNbl = pNbl;
>> >-        isLsoPacket = TRUE;
>> >     } else {
>> >         /* STT Header */
>> >         sttHdr = NdisGetDataBuffer(curNb, sizeof *sttHdr,
>> >@@ -812,7 +922,6 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
>> >         OvsCompleteNBL(switchContext, *newNbl, TRUE);
>> >         return NDIS_STATUS_FAILURE;
>> >     }
>> >-    newNb = NET_BUFFER_LIST_FIRST_NB(*newNbl);
>> >
>> >     ASSERT(sttHdr);
>> >
>> >@@ -826,7 +935,7 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
>> >     tunKey->pad = 0;
>> >
>> >     /* Set Checksum and LSO offload flags */
>> >-    OvsDecapSetOffloads(*newNbl, sttHdr);
>> >+    OvsDecapApplyOffloads(switchContext, newNbl, sttHdr);
>> >
>> >     return NDIS_STATUS_SUCCESS;
>> > }
>> >diff --git a/datapath-windows/ovsext/Stt.h b/datapath-windows/ovsext/Stt.h
>> >index a3e3915..20066e6 100644
>> >--- a/datapath-windows/ovsext/Stt.h
>> >+++ b/datapath-windows/ovsext/Stt.h
>> >@@ -36,7 +36,6 @@
>> >
>> > #define STT_HASH_TABLE_SIZE ((UINT32)1 << 10)
>> > #define STT_HASH_TABLE_MASK (STT_HASH_TABLE_SIZE - 1)
>> >-#define STT_ENTRY_TIMEOUT 300000000   // 30s
>> > #define STT_CLEANUP_INTERVAL 300000000 // 30s
>> >
>> > #define STT_ETH_PAD 2
>> >diff --git a/datapath-windows/ovsext/User.c b/datapath-windows/ovsext/User.c
>> >index 34f38f4..33cbd89 100644
>> >--- a/datapath-windows/ovsext/User.c
>> >+++ b/datapath-windows/ovsext/User.c
>> >@@ -459,7 +459,8 @@ NTSTATUS
>> > OvsPurgeDpIoctl(PFILE_OBJECT fileObject)
>> > {
>> >     POVS_OPEN_INSTANCE instance = (POVS_OPEN_INSTANCE)fileObject->FsContext;
>> >-    POVS_USER_PACKET_QUEUE queue = (POVS_USER_PACKET_QUEUE)instance->packetQueue;
>> >+    POVS_USER_PACKET_QUEUE queue =
>> >+        (POVS_USER_PACKET_QUEUE)instance->packetQueue;
>> >
>> >     if (queue == NULL) {
>> >         return STATUS_INVALID_PARAMETER;
>> >@@ -736,7 +737,8 @@ OvsCreateAndAddPackets(PVOID userData,
>> >         NDIS_TCP_LARGE_SEND_OFFLOAD_NET_BUFFER_LIST_INFO tsoInfo;
>> >         UINT32 packetLength;
>> >
>> >-        tsoInfo.Value = NET_BUFFER_LIST_INFO(nbl, TcpLargeSendNetBufferListInfo);
>> >+        tsoInfo.Value = NET_BUFFER_LIST_INFO(nbl,
>> >+            TcpLargeSendNetBufferListInfo);
>> >         nb = NET_BUFFER_LIST_FIRST_NB(nbl);
>> >         packetLength = NET_BUFFER_DATA_LENGTH(nb);
>> >
>> >@@ -838,7 +840,8 @@ OvsCompletePacketHeader(UINT8 *packet,
>> >                                          (UINT32 *)&ipHdr->DestinationAddress,
>> >                                          IPPROTO_TCP, hdrInfoOut->l4PayLoad);
>> >         } else {
>> >-            PIPV6_HEADER ipv6Hdr = (PIPV6_HEADER)(packet + hdrInfoIn->l3Offset);
>> >+            PIPV6_HEADER ipv6Hdr = (PIPV6_HEADER)(packet +
>> >+                hdrInfoIn->l3Offset);
>> >             hdrInfoOut->l4PayLoad =
>> >                 (UINT16)(ntohs(ipv6Hdr->PayloadLength) +
>> >                 hdrInfoIn->l3Offset + sizeof(IPV6_HEADER)-
>> >@@ -852,9 +855,9 @@ OvsCompletePacketHeader(UINT8 *packet,
>> >         hdrInfoOut->tcpCsumNeeded = 1;
>> >         ovsUserStats.recalTcpCsum++;
>> >     } else if (!isRecv) {
>> >-        if (csumInfo.Transmit.TcpChecksum) {
>> >+        if (hdrInfoIn->isTcp && csumInfo.Transmit.TcpChecksum) {
>> >             hdrInfoOut->tcpCsumNeeded = 1;
>> >-        } else if (csumInfo.Transmit.UdpChecksum) {
>> >+        } else if (hdrInfoIn->isUdp && csumInfo.Transmit.UdpChecksum) {
>> >             hdrInfoOut->udpCsumNeeded = 1;
>> >         }
>> >         if (hdrInfoOut->tcpCsumNeeded || hdrInfoOut->udpCsumNeeded) {
>> >@@ -864,7 +867,8 @@ OvsCompletePacketHeader(UINT8 *packet,
>> >                 hdrInfoOut->tcpCsumNeeded ? IPPROTO_TCP : IPPROTO_UDP;
>> > #endif
>> >             if (hdrInfoIn->isIPv4) {
>> >-                PIPV4_HEADER ipHdr = (PIPV4_HEADER)(packet + hdrInfoIn->l3Offset);
>> >+                PIPV4_HEADER ipHdr = (PIPV4_HEADER)(packet +
>> >+                    hdrInfoIn->l3Offset);
>> >                 hdrInfoOut->l4PayLoad = (UINT16)(ntohs(ipHdr->TotalLength) -
>> >                     (ipHdr->HeaderLength << 2));
>> > #ifdef DBG
>> >@@ -972,8 +976,8 @@ OvsCreateQueueNlPacket(PVOID userData,
>> >     csumInfo.Value = NET_BUFFER_LIST_INFO(nbl, TcpIpChecksumNetBufferListInfo);
>> >
>> >     if (isRecv && (csumInfo.Receive.TcpChecksumFailed ||
>> >-                  (csumInfo.Receive.UdpChecksumFailed && !hdrInfo->udpCsumZero) ||
>> >-                  csumInfo.Receive.IpChecksumFailed)) {
>> >+            (csumInfo.Receive.UdpChecksumFailed && !hdrInfo->udpCsumZero) ||
>> >+            csumInfo.Receive.IpChecksumFailed)) {
>> >         OVS_LOG_INFO("Packet dropped due to checksum failure.");
>> >         ovsUserStats.dropDuetoChecksum++;
>> >         return NULL;
>> >--
>> >2.7.2.windows.1
>> >
>> 
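
Note on the OvsDecapStt hunk: the patch replaces the fixed sizeof(IPHdr) + sizeof(TCPHdr) skip with lengths read from the outer headers themselves. A minimal standalone sketch of that arithmetic follows. The helper names (outer_hdr_len, stt_payload_len) are hypothetical and not part of the patch, and plain C types stand in for the NDIS ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): size in bytes of the outer IPv4
 * and TCP headers. ihl and doff count 32-bit words, so each is scaled by 4. */
uint32_t outer_hdr_len(uint8_t ihl, uint8_t doff)
{
    return (uint32_t)ihl * 4u + (uint32_t)doff * 4u;
}

/* Hypothetical helper: bytes that follow the outer TCP header.
 * tot_len_host is the IPv4 total length already converted to host order.
 * Returns 0 when the headers claim more bytes than the datagram holds. */
uint16_t stt_payload_len(uint16_t tot_len_host, uint8_t ihl, uint8_t doff)
{
    uint32_t hdr = outer_hdr_len(ihl, doff);
    return (tot_len_host < hdr) ? 0 : (uint16_t)(tot_len_host - hdr);
}

/* Usage: without options the field-based and sizeof-based forms agree;
 * with options only the field-based form skips the right byte count. */
void outer_hdr_len_example(void)
{
    assert(outer_hdr_len(5, 5) == 40);  /* 20-byte IP + 20-byte TCP */
    assert(outer_hdr_len(5, 8) == 52);  /* 12 bytes of TCP options */
    assert(stt_payload_len(1500, 5, 5) == 1460);
}
```

The zero return for inconsistent lengths is a choice made for this sketch only; the patch itself subtracts without such a check.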
>
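
Note on the OvsSttReassemble hunk: the fragment timeout is now derived from the outer IP TTL instead of the STT_ENTRY_TIMEOUT constant. A small sketch of the unit conversion, assuming the timestamp counts 100-nanosecond ticks (which is what the "30s" comment on the 300000000 constant implies). The name ttl_to_100ns is hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 100 ns ticks per second: 1000 * 1000 * 10, as written in the patch. */
#define TICKS_PER_SECOND 10000000ULL

/* Hypothetical helper: treat the TTL as a lifetime in seconds and convert
 * it to 100 ns ticks. Widening to 64 bits first matters, because
 * 255 * 10,000,000 does not fit in a signed 32-bit int. */
uint64_t ttl_to_100ns(uint8_t ttl)
{
    return (uint64_t)ttl * TICKS_PER_SECOND;
}

/* Usage: a TTL of 30 reproduces the old fixed timeout. */
void ttl_to_100ns_example(void)
{
    assert(ttl_to_100ns(30) == 300000000ULL);    /* old STT_ENTRY_TIMEOUT */
    assert(ttl_to_100ns(64) == 640000000ULL);
    assert(ttl_to_100ns(255) == 2550000000ULL);  /* above INT32_MAX */
}
```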
>
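
Note on the pseudo checksum question: in the LSO paths of the patch, IPPseudoChecksum and IPv6PseudoChecksum are called with a length of 0, so the TCP checksum field is seeded from the addresses and the protocol only. The sketch below shows that kind of IPv4 pseudo-header sum in a generic form. It is not the OVS IPPseudoChecksum implementation: pseudo_sum_ipv4 is a hypothetical name, addresses are taken in host order for readability, and the function returns the folded sum without complementing it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement sum of the IPv4 pseudo-header
 * (source, destination, protocol, L4 length), folded to 16 bits. */
uint16_t pseudo_sum_ipv4(uint32_t src, uint32_t dst, uint8_t proto,
                         uint16_t l4_len)
{
    uint32_t sum = 0;

    sum += (src >> 16) + (src & 0xFFFFu);
    sum += (dst >> 16) + (dst & 0xFFFFu);
    sum += proto;   /* zero byte + protocol form one 16-bit word */
    sum += l4_len;

    /* Fold carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return (uint16_t)sum;
}

/* Usage: 192.168.0.1 -> 192.168.0.2, TCP (protocol 6). */
void pseudo_sum_ipv4_example(void)
{
    /* Length left at 0, as in the LSO paths of the patch. */
    assert(pseudo_sum_ipv4(0xC0A80001u, 0xC0A80002u, 6, 0) == 0x815Au);
    /* A non-zero length changes the seed value. */
    assert(pseudo_sum_ipv4(0xC0A80001u, 0xC0A80002u, 6, 20) == 0x816Eu);
}
```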
