[ovs-dev] [PATCH RFC] netdev-dpdk: Rx checksum offloading feature on DPDK physical ports.

Tue Aug 16 10:06:48 UTC 2016

Hi Jesse,
Thank you for looking into the patch
Please find my comments below,

Regards
_Sugesh

> -----Original Message-----
> From: Jesse Gross [mailto:jesse at kernel.org]
> Sent: Monday, August 15, 2016 6:44 PM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> Cc: ovs dev <dev at openvswitch.org>
> Subject: Re: [PATCH RFC] netdev-dpdk: Rx checksum offloading feature on
> DPDK physical ports.
> 
> On Mon, Aug 8, 2016 at 3:03 AM, Sugesh Chandran
> <sugesh.chandran at intel.com> wrote:
> > To enable checksum offloading at rx side while adding a port, add the
> > 'rx-checksum-offload' option to the 'ovs-vsctl add-port' command-line
> > as below,
> >
> > 'ovs-vsctl add-port br0 dpdk0 -- \
> > set Interface dpdk0 type=dpdk options:rx-checksum-offload=true'
> 
> Is there a reason not to enable this by default? It seems like it would be
> beneficial to more users that way.
> 
> I think we should also add some documentation for this new option.
> 
[Sugesh] : I will add the documentation in the v2 release of patch.
My concern over make it enabled by default is, the possible performance 
impact of being  Vectorization OFF which is completely depends on the traffic pattern
and packet size. Though I haven’t find any thing as such in my testing. This will
avoid any surprises for the customer. Any comments?

> > 2) Normally, OVS generates checksum for tunnel packets. It is more
> > efficient to calculate the checksum at 'tunnel_push' operation itself
> > as all the relevant fields are already present in the cache and also
> > no additional validation overhead is involved as in checksum offloading.
> 
> I don't quite understand this comment. This is more efficient compared to
> what? Presumably, at least for large packets, there would still be some
> benefit if the issue with losing vectorization wasn't present.
[Sugesh] Sorry for the confusion. This is about generating checksum 
for tunnel packets at encap side(i.e tx checksum offload). What I meant here 
is generating checksum for tunnel packets at the time of 'tunnel_push' is more efficient  than 
offloading into the NIC. 
In detail, At the time of packet  encapsulation  in 'tunnel_push' , the egress port for 
the tunnel packet is unknown. The tunnel packets are recirculated in OVS to decide the 
action, which can be  either send to a phy port with tx checksum offload or send to phy port 
without tx offload, or even send to any other virtual/kernel port.
So to offload the Tx checksum in NIC, 
*) Mark the packet for checksum offload at the time of tunnel_push
*) At 'netdev_send' validate every packet for tx_checksum offload flag,
*) If tx checksum offload-flag is set,  Calculate the checksum in software or
hardware based on NIC offload configuration.

While implementing, we found that the validating every packet at netdev_send
+ vectorization OFF make performance worse than calculating checksum at the time of
'tunnel_push' because the cache is warm.
The worst case is sending packet to a port with no tx checksum offload. It makes additional
overhead of 
1) Marking every packet with offload flag at the time of tunnel_push
2) validate every packet for offload flag at netdev_send,
3) Since the port doesn’t support offload, load all the fields into cache and calculate the checksum 
in software before sending out.
Please let me know if anything is wrong / not clear?

> 
> > diff --git a/lib/dp-packet.h b/lib/dp-packet.h index 7c1e637..f2e9d4f
> > 100644
> > --- a/lib/dp-packet.h
> > +++ b/lib/dp-packet.h
> > @@ -38,6 +38,14 @@ enum OVS_PACKED_ENUM dp_packet_source {
> >
> >  #define DP_PACKET_CONTEXT_SIZE 64
> >
> > +enum hw_pkt_ol_flags {
> > +    OL_RX_IP_CKSUM_GOOD = (1 << 0),
> > +    OL_RX_L4_CKSUM_GOOD = (1 << 1),
> > +    OL_RX_IP_CKSUM_BAD = (1 << 2),
> > +    OL_RX_L4_CKSUM_BAD = (1 << 3),
> 
> Is it necessary to redefine this fields in OVS instead of just using them from
> the DPDK mbuf directly? This means that we have to do mapping and other
> checks at runtime, which if I remember correctly resulted in a performance
> hit when you posted numbers before.
[Sugesh] : Yes, the performance improvement will be slightly less due to these new
flags. The idea here is  to generalize the hardware offloading , so that it can be 
extended for other hardware offloading as well. Something like exposing the hardware
offloading to virtual ports/tx offloading as well in the future? Comments?

 Just to confirm, if we are not using the new flags, still we need to verify 
the source of packet like below,
	if (packet->source == DPBUF_DPDK) 
			/* Do the checksum validation */

The problem with this approach is, In future, how a packet from virtual port will use the 
hardware offload ? How to enable the any offload at the tx side?

> 
> I think we could also simplify things by not checking
> NETDEV_RX_CHECKSUM_OFFLOAD for each packet. Presumably if checksum
> offloading isn't configured on the interface then it wouldn't set checksum
> good on the packet.
[Sugesh] : Yes, the packet must have CHECKUM_UNKNOWN flag is set on 
Interface without checksum offloading. Again still we need to check
for the packet source , before checking these flags.

I can change the logic to avoid using these flags if its OK to limit the scope
only for rx checksum offloading for now.

> 
> > diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index
> > ce2582f..bad2c55 100644
> > --- a/lib/netdev-native-tnl.c
> > +++ b/lib/netdev-native-tnl.c
> > @@ -178,7 +179,10 @@ udp_extract_tnl_md(struct dp_packet *packet,
> struct flow_tnl *tnl,
> >          return NULL;
> >      }
> >
> > -    if (udp->udp_csum) {
> > +    if (packet->md.ol_flags & OL_RX_L4_CKSUM_GOOD) {
> > +        tnl->flags |= FLOW_TNL_F_CSUM;
> > +    }
> > +    else if (udp->udp_csum) {
> >          uint32_t csum;
> >          if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
> >              csum = packet_csum_pseudoheader6(dp_packet_l3(packet));
> 
> I think I would put the check for OL_RX_L4_CKSUM_GOOD inside the if (udp-
> >udp_csum). It's more logical that way and also avoids any possible problems
> with drivers that might mistakenly report checksum of 0 as a validated,
> correct checksum.
[Sugesh] Sure, will make the change in the v2 version of patch.