[ovs-dev] [PATCH 0/3] Add VxLAN encap support for tc offload.

William Tu u9012063 at gmail.com
Wed Jul 8 19:07:10 UTC 2020


On Wed, Jul 08, 2020 at 07:55:58PM +0200, Ilya Maximets wrote:
> On 7/8/20 6:10 PM, William Tu wrote:
> > The patch adds VxLAN encap tc-offload support.  The flow format of the userspace
> > datapath, dpif-netdev, differs from that of the kernel datapath in the case of tunnel
> > encap.  Unlike the kernel, dpif-netdev does not use set and output actions, but a single
> > clone action with all the tunnel info nested inside.  An example is shown below:
> > actions:clone(tnl_push(tnl_port(5),
> >   header(size=50,type=4,eth(dst=06:1d:6e:a3:f1:61,src=26:df:25:f6:7b:4f,dl_type=0x0800),
> >     ipv4(src=172.31.1.100,dst=172.31.1.1,proto=17,tos=0,ttl=64,frag=0x4000),
> >     udp(src=0,dst=4789,csum=0x0),
> >     vxlan(flags=0x8000000,vni=0x0)),out_port(2)
> >   ), 3)
> > 
> > The patch parses the above tunnel encap format and passes it to tc for
> > offloading the VxLAN tunnel. The idea is similar to the recent dpdk
> > offload patchset:
> >   netdev-offload-dpdk: Support offload of clone tnl_push/output actions
snip

> 
> Hi.
> 
> That is an interesting thing.  I didn't look at the code, but I have a question.
> IIUC, you're running userspace datapath with some linux ports and linux_tc
> offloading provider enabled for them.  I tried this combination previously
> and it has an issue that having a RAW socket open, even if packet was redirected
> by TC to another OVS port, we will still receive it via RAW socket at least
> on the destination port.  I'm not sure how to work around this issue.
> Do you have any thoughts?

Yes, I encountered the same issue.
IIUC, the reason is that once a raw socket is registered, the kernel's
__netif_receive_skb_core() delivers the packet to the raw socket first and only
then calls sch_handle_ingress().  So even though we return TC_ACT_SHOT at the TC
layer, the packet has already been delivered to the raw socket and seen by OVS.
This causes my ping test to report duplicates:

64 bytes from 10.1.1.2: icmp_seq=7 ttl=64 time=0.503 ms (DUP!)
64 bytes from 10.1.1.2: icmp_seq=7 ttl=64 time=0.508 ms (DUP!)

Using an afxdp socket has the same issue, because its skb delivery point,
do_xdp_generic(), is also before tc.
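
To make the ordering concrete, here is a toy model in Python.  It is only a
sketch, not kernel code: the stage order follows my understanding described
above, the TC_ACT_* values match linux/pkt_cls.h, and rx_trace() and the stage
labels are made-up names used purely for illustration.

```python
# Toy model of the ingress ordering discussed above (NOT kernel code).
# Assumption: packet taps (raw sockets) run before tc ingress, as in
# __netif_receive_skb_core(); rx_trace() is a hypothetical helper.

TC_ACT_OK = 0    # value from linux/pkt_cls.h: let the packet continue
TC_ACT_SHOT = 2  # value from linux/pkt_cls.h: drop the packet


def rx_trace(tc_verdict, raw_socket_open=True):
    """Return, in order, the consumers that see one received packet."""
    trace = []

    # Step 1: taps such as an AF_PACKET raw socket get a copy first.
    if raw_socket_open:
        trace.append("raw_socket")

    # Step 2: only now does tc ingress (sch_handle_ingress) run.
    trace.append("tc_ingress")
    if tc_verdict == TC_ACT_SHOT:
        # The drop stops further processing, but the tap already has its copy.
        return trace

    # Step 3: the packet continues to the regular protocol handlers.
    trace.append("protocol_handlers")
    return trace


if __name__ == "__main__":
    print(rx_trace(TC_ACT_SHOT))
    print(rx_trace(TC_ACT_SHOT, raw_socket_open=False))
```

In this model, rx_trace(TC_ACT_SHOT) still lists raw_socket ahead of
tc_ingress, which corresponds to the extra copy that OVS receives and that
shows up as DUP in the ping output.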

> 
> Or you're using HW offloading with afxdp or/and skip_sw flag?  I guess, there
> should be no such issue in this case if packet never reaches the kernel tc.
> 

I don't have a solution to this problem yet.  For testing, I'm using the
software tc-flower path (skip_hw).  Once everything works, the plan is to use
HW offload (skip_sw) with afxdp.

> BTW, I merged the patch-set from Eli, so first two patches are in repository
> now.
> 

Thanks for your comment, I will work on v2.
William

