[ovs-dev] [PATCH 00/25] netdev datapath vxlan offload

William Tu u9012063 at gmail.com
Wed Jul 8 23:36:16 UTC 2020


On Thu, Mar 5, 2020 at 11:12 AM Pravin Shelar <pshelar at ovn.org> wrote:
>
> On Sun, Mar 1, 2020 at 8:25 PM Sriharsha Basavapatna via dev
> <ovs-dev at openvswitch.org> wrote:
> >
> > On Tue, Feb 18, 2020 at 3:30 PM Eli Britstein <elibr at mellanox.com> wrote:
> > >
> > >
> > > On 2/10/2020 11:16 PM, Hemal Shah wrote:
> > > > Eli,
> > > >
> > > > There are some fundamental architecture issues (multi HW tables vs.
> > > > single HW table, use of rte_flow JUMP action for mapping ovs sw
> > > > tnl_pop action, etc.) that we need to discuss before we can get into
> > > > the details of the patchset.
> > > > I'm inlining my comments on the discussion between you and Harsha below.
> > > >
> > > > Hemal
> > > >
> > > >
> > > > On Wed, Feb 5, 2020 at 6:31 AM Eli Britstein <elibr at mellanox.com
> > > > <mailto:elibr at mellanox.com>> wrote:
> > > >
> > > >
> > > >     On 2/3/2020 7:05 PM, Sriharsha Basavapatna wrote:
> > > >     > Hi Eli,
> > > >     >
> > > >     > Thanks for sending this patchset. I have some questions about the
> > > >     > design, please see my comments below.
> > > >     >
> > > >     > Thanks,
> > > >     > -Harsha
> > > >     >
> > > >     > On Mon, Jan 20, 2020 at 8:39 PM Eli Britstein
> > > >     <elibr at mellanox.com <mailto:elibr at mellanox.com>> wrote:
> > > >     >> In the netdev datapath, packets arriving over a tunnel are
> > > >     processed by
> > > >     >> more than one flow. For example a packet that should be
> > > >     decapsulated and
> > > >     >> forwarded to another port is processed by two flows:
> > > >     >> 1. in_port=<PF>, outer-header matches, dst-port=4789 (VXLAN),
> > > >     >>     actions:tnl_pop(vxlan_sys_4789).
> > > >     >> 2. in_port=vxlan_sys_4789, tunnel matches (VNI for example),
> > > >     >>     inner-header matches, actions: vm1.
> > > >     >>
> > > >     >> In order to offload such a multi-flow processing path, a
> > > >     multi-table HW
> > > >     >> model is used.
> > > >     > Did you explore other ways to avoid this multi-flow processing
> > > >     model ?
> > > >     > For example, maybe support some additional logic to merge the two
> > > >     > datapath flow-rules for decap into a single flow-rule in the offload
> > > >     > layer ?
> > > >     Yes, we did explore this approach and our research led us to
> > > >     conclude
> > > >     that following the OVS multi-table SW model in HW proves to be a more
> > > >     robust architecture.
> > > >
> > > > [Hemal] It will be good and prudent for you to share details of that
> > > > research to help the community understand how you arrived at the
> > > > conclusion above. Can you share it?
> > > The HW offload scheme should follow the SW model. It is not only about
> > > current VXLAN patch-set, but rather an infrastructure for offloading
> > > more actions. Once HW offloading does not follow the SW model it becomes
> > > difficult or not practical to support them. Squashing multi-table flows
> > > into single rules might work but only for specific use cases. VXLAN with
> > > fixed sized outer header is one of them, but not with other use cases.
> > > There are actions, like DP_HASH, that perform calculations on the
> > > packets, to be used in the next recirc.
> > > For example:
> > > recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no),
> > > packets:26, bytes:1924, used:0.105s, flags:S,
> > > actions:hash(sym_l4(0)),recirc(0x5)
> > >
> > > recirc_id(0x5),dp_hash(0x315c2571/0x3),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no),
> > > packets:0, bytes:0, used:never, actions:,3
> > > recirc_id(0x5),dp_hash(0x315ca562/0x3),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no),
> > > packets:0, bytes:0, used:never, actions:,2
> > > ..
> > > 4 rules
> > >
> > > 1000000 TCP/UDP sessions will not require more data plane flows.
> > > In single-table, every separate 5-tuple must be matched, so 1000000
> > > sessions will create 1000000 data plane flows.
> >
> > [Harsha] I understand your concerns and the requirement from different
> > use cases (like dp-hash) that might need multitable flows. But for
> > VXLAN decap offload, we are questioning the need for multitable. We
> > are looking for design options that you considered to support VXLAN
> > use case with single-table. Can you please provide specific details on
> > the challenges ?
> >
> I think we need to align offload with software datapath. So if you
> want to explore single table tunnel offload processing you need to
> show how would it work in software. That will eliminate the mapping of
> two software tables to single hardware table.
>
I thought about merging multiple flows into one in order to use a single table.
I think it's pretty hard to do correctly: we would have to build the
cartesian product of the outer and inner flows, and then revalidate
the merged flows as well.

As an example for vxlan decap,
FLOW1: recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=fa:4d:c4:81:25:a9,dst=f6:95:3f:d2:ea:42),eth_type(0x0800),ipv4(dst=172.31.1.100,proto=17,frag=no),udp(dst=4789),
packets:6, bytes:812, used:0.007s, actions:tnl_pop(4)
innerFLOW1: tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,flags(-df+csum+key)),recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=aa:0f:92:44:46:df,dst=b2:e3:8d:97:f0:4e),eth_type(0x0800),ipv4(dst=10.1.1.100,proto=1,frag=no),
packets:2, bytes:196, used:0.007s, actions:1
innerFLOW2: tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,flags(-df+csum+key)),recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=aa:0f:92:44:46:df,dst=33:33:ff:44:46:df),eth_type(0x86dd),ipv6(frag=no),
packets:0, bytes:0, used:never, actions:1

Basically we have to merge and create
a. FLOW1 + innerFLOW1
b. FLOW1 + innerFLOW2
and for the inner flow we usually do connection tracking, which
introduces another recirc.
So it's almost impossible to do everything in a single table.
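
To make the merging cost concrete, here is a rough Python sketch of the
bookkeeping. It is not OVS code: the flow records are reduced to short
strings, and merge_single_table() and rule_counts() are hypothetical
helpers made up for illustration. It only shows that a single table needs
one merged rule per (outer, inner) pair, while a multi-table layout keeps
the two sets separate.

```python
from itertools import product

# Hypothetical, heavily simplified flow records (not real OVS structures).
outer_flows = [
    {"name": "FLOW1", "match": "in_port(3),udp(dst=4789)", "actions": "tnl_pop(4)"},
]
inner_flows = [
    {"name": "innerFLOW1", "match": "in_port(4),eth_type(0x0800)", "actions": "1"},
    {"name": "innerFLOW2", "match": "in_port(4),eth_type(0x86dd)", "actions": "1"},
]


def merge_single_table(outer, inner):
    """Build one merged rule for every (outer, inner) pair."""
    return [
        {
            "name": f"{o['name']}+{i['name']}",
            "match": f"{o['match']} && {i['match']}",
            # The merged rule carries the inner flow's final action.
            "actions": i["actions"],
        }
        for o, i in product(outer, inner)
    ]


def rule_counts(n_outer, n_inner):
    """Return (single-table rules, multi-table rules) for the given flow counts."""
    return n_outer * n_inner, n_outer + n_inner


merged = merge_single_table(outer_flows, inner_flows)
print([m["name"] for m in merged])
print(rule_counts(len(outer_flows), len(inner_flows)))
print(rule_counts(10, 1000))
```

With only one outer flow the merged table is still small, but the count
grows as (outer x inner) once there are several outer flows, and a change
to one outer flow means every merged rule derived from it has to be
revalidated. Each extra recirc (conntrack, for example) adds another
factor to that product.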

At this moment, I can't think of any better way than translating
tnl_pop to jump and group.
William

