[ovs-dev] Wildcard Matching optimization idea

Fischetti, Antonio antonio.fischetti at intel.com
Fri Feb 5 09:14:44 UTC 2016


Hi Jarno,
Below my reply.

> -----Original Message-----
> From: Jarno Rajahalme [mailto:jarno at ovn.org]
> Sent: Thursday, February 4, 2016 10:00 PM
> To: Fischetti, Antonio
> Cc: u9012063 at gmail.com; dev at openvswitch.org
> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> 
> 
> > On Feb 4, 2016, at 6:08 AM, Fischetti, Antonio
> <antonio.fischetti at intel.com> wrote:
> >
> > Hi William,
> > Thanks for your interest on this idea.
> > I'm currently working to provide a patch, once it is ready I'll be glad to
> share.
> >
> > In order to compare the throughput between the classifier and the acl
> > I changed the code to bypass the EMC table. So in both cases any packet
> > was hitting the classifier/acl. That's just for my tests, of course.
> >
> 
> Could you share the comparable numbers WITH the EMC cache?

At the time I was focused to compare the classifier with the acl, I did all 
my tests bypassing the EMC cache.
Now I'm almost done with a change to eliminate the big ACL latency when 
inserting new rules. 
So I will run the tests again and as you suggest will add also tests to compare 
the numbers with the EMC enabled.

Thanks.

> 
>   Jarno
> 
> > I installed flows like
> > sudo ./utilities/ovs-ofctl add-flow br0
> dl_type=0x0800,nw_dst=34.34.34.34,action=output:2
> > sudo ./utilities/ovs-ofctl add-flow br0
> idle_timeout=0,dl_type=0x0800,nw_dst=34.34.34.35,dl_src=01:02:03:04:05:
> 08,action=output:2
> > sudo ./utilities/ovs-ofctl add-flow br0
> dl_type=0x0800,nw_proto=17,udp_dst=64,action=output:2
> > sudo ./utilities/ovs-ofctl add-flow br0
> dl_src=01:02:03:04:05:06,action=output:2
> > sudo ./utilities/ovs-ofctl add-flow br0
> dl_type=0x0800,nw_proto=17,actions=output:2
> > sudo ./utilities/ovs-ofctl add-flow br0
> dl_type=0x0800,nw_tos=2,actions=output:2
> > sudo ./utilities/ovs-ofctl add-flow br0 vlan_tci=0,actions=output:2
> >
> > so that the classifier was using 6 Subtables.
> >
> >
> > Antonio
> >
> >
> > From: William Tu [mailto:u9012063 at gmail.com]
> > Sent: Wednesday, February 3, 2016 11:52 PM
> > To: Fischetti, Antonio
> > Cc: dev at openvswitch.org
> > Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >
> > Hi Fischetti,
> > I'm interested in trying your experiment with 2 DPDK ACL table swapping.
> Do you have a patch for me to start with? Or any comments about how to
> reproduce your experiment results? (the one comparing classifier and DPDK
> ACL)
> >>                        +------------+------------+
> >>                        | Classifier |   2 ACLs   |
> >> +----------------+------------+------------+
> >> | Max Throughput |    2.2     |    5.4     |
> >> |     [Mpps]     |                |               |
> >> +----------------+------------+------------+
> > Thank you
> > William
> >
> > On Thu, Jan 14, 2016 at 7:28 AM, Fischetti, Antonio
> <antonio.fischetti at intel.com> wrote:
> > Thanks Wei Li, please see my replies inline.
> > If I missed something or for further detail, just let me know.
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Wei Li
> >> Sent: Thursday, January 14, 2016 6:38 AM
> >> To: dev at openvswitch.org
> >> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>
> >> insert a rule to "mirror" and swap "mirror" and "real"
> >>
> >> then real does not contain the new inserted rule
> >
> > Correct, it needs to be updated as well.
> >
> >>
> >> how to sync?  take the "long" time as inserting to "mirror"?
> >>
> >
> > Yes, it will take the same long time.
> > The thread that has previously inserted the rule into what was the Shadow
> > (or "mirror") will repeat the same work on the other table.
> >
> > Below is the sequence with few more details to show the latency for a new
> rule.
> >
> > 1. Insert the new rule into the background table
> > ----------------------------------------------------------
> > Please Note: while this is happening the new rule will not affect the
> lookups
> > as they're always carried out on the foreground Table.
> > Background table updated? Go to Step #2.
> >
> > 2. Swap tables
> > ------------------
> > The updated table is 'moved' to the foreground so that lookups are now
> > affected by the new rule.
> > Anyway we're not done yet, we updated just one of the 2 tables.
> > In case further new rules need to be added at this stage, they will get
> buffered
> > in a 'waiting list' to be processed later.
> > Background table updated? Go to Step #3.
> >
> > 3. Done
> > ----------
> > Both tables are synced up.
> > Is there a new rule to insert and/or anything in the 'waiting list'? Go to Step
> #1.
> >
> >
> >> 在 2016/1/12 18:37, Fischetti, Antonio 写道:
> >>> Thanks Zoltan for your questions.
> >>>
> >>> Anyone else has any new feedback?
> >>>
> >>>> -----Original Message-----
> >>>> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of
> Fischetti,
> >>>> Antonio
> >>>> Sent: Friday, December 18, 2015 3:44 PM
> >>>> To: Zoltan Kiss; dev at openvswitch.org
> >>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >>>>> Sent: Friday, December 18, 2015 12:38 PM
> >>>>> To: Fischetti, Antonio; dev at openvswitch.org
> >>>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 17/12/15 16:23, Fischetti, Antonio wrote:
> >>>>>> Hi Zoltan, thanks for your questions.
> >>>>>> Please find below my answers inline.
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >>>>>>> Sent: Thursday, December 17, 2015 2:33 PM
> >>>>>>> To: Fischetti, Antonio; dev at openvswitch.org
> >>>>>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 17/12/15 10:41, Fischetti, Antonio wrote:
> >>>>>>>> Hi All,
> >>>>>>>> Here's an optimization idea for the datapath classifier table.
> >>>>>>>> I'd like to get some feedback.
> >>>>>>>>
> >>>>>>>> I used the DPDK ACL tables. They can perform a wildcarded
> matching
> >>>>> and
> >>>>>>> each
> >>>>>>>> lookup requires less CPU cycles than the Classifier.
> >>>>>>>> Anyway there's a negative aspect with ACLs. They take a very long
> >> time
> >>>>> to
> >>>>>>>> insert a new Rule.
> >>>>>>>> It can be 50 times greater than an insertion into the Classifier. See
> >>>> Note
> >>>>>>> below
> >>>>>>>> for further details.
> >>>>>>>>
> >>>>>>>> So a simple 1:1 replacement of the Classifier with an ACL table is
> not
> >> a
> >>>>>>> viable
> >>>>>>>> solution.
> >>>>>>>>
> >>>>>>>> The idea described below is instead to replace the Classifier with 2
> >> ACL
> >>>>>>>> tables. One is the 'Operating', while the other is a 'Shadow' table.
> >>>>>>>>
> >>>>>>>> Any lookup will be performed on the Operating table.
> >>>>>>>>
> >>>>>>>> Instead any new insertion will be executed on the Shadow table by
> >>>>> means
> >>>>>>> of a
> >>>>>>>> separate thread.
> >>>>>>>> After the insertion is done, the 2 tables will be swapped.
> >>>>> Are you swapping after each insertion, or in batches?
> >>>> In batches.
> >>>>
> >>>>> The new shadow table needs to get updated first to be sync with
> >>>> Operating, does it take
> >>>>> a similar amount of time?
> >>>> No, the shadow table acts as a 'mirror' of the operating. So the 2
> >>>> tables are supposed to contain exactly the same entries.
> >>>> An exception is during the transient insertion procedure. But after it
> >>>> is completed the 2 tables will contain again the same entries.
> >>>>
> >>>>> Instead of having this 2 table, how about have one, and make it
> possible
> >>>>> that you can look up while an insertion is on place? Something in an
> RCU
> >>>>> fashion?
> >>>> Unluckly not. While an insertion is taking place it is not possible to
> access
> >> the
> >>>> ACL
> >>>> to read its entries.
> >>>> More precisely, an ACL insertion means 2 actions: add + rebuild. The
> >> rebuild
> >>>> takes
> >>>> the 95% cpu cycles of all the insertion.
> >>>> You could read while the 'add' is in progress. Instead you can't read
> while
> >> the
> >>>> 'rebuild' is still happening.
> >>>> That's why I'm using 2 ACL tables.
> >>>>
> >>>>>>> So while this insertion happens, you still look up in the actual
> >>>>>>> Operating table.
> >>>>>> Yes, while insertion is in progress any lookup will still be carried out
> on
> >>>> the
> >>>>>> Operating table.
> >>>>> I don't know how the classifier works exactly, but is the following
> >>>>> scenario possible?:
> >>>>>
> >>>>> Rule A matches a flow and specifies an action. A new insertion would
> >>>>> essentially remove Rule A and add B which matches the same flow but
> >>>>> specifies a different action. While that happens, packets would still
> >>>>> match A, while the expectation probably would be to match B.
> >>>>>
> >>>> The same issue can happen with the Classifier, in this case it will be
> worse.
> >>>> The solution with 2 ACLs have an insertion latency much longer.
> >>>> That is because an ACL insertion can be about 50 times greater than an
> >>>> insertion into the Classifier.
> >>>>
> >>>>>>> What happens if you have a new insertion in the meantime?
> >>>>>> The new Rule gets buffered into a 'wait' queue.
> >>>>>>
> >>>>>>> Especially, what happens if your lookup yields the same rule
> >>>>>>> which is inserted at the moment?
> >>>>>> That's a good point.  At the current stage it is simply added into the
> >> wait
> >>>>> queue.
> >>>>>> So I could potentially have duplications where different rules into the
> >> ACL
> >>>>> are
> >>>>>> referring to the same netdev-flow.
> >>>>>> To avoid these duplications there could be 2 approaches.
> >>>>>> One option would be to check that in the wait queue that rule is not
> >>>>> present.
> >>>>>> Another option would be to store it into the wait queue anyway and
> >> then
> >>>>> check
> >>>>>> that the ACL does not already contain that rule.
> >>>>>
> >>>>>
> >>>>>
> >>>>>>>> Thus the Shadow table will now become the Operating one, and
> >>>>> viceversa.
> >>>>>>>>
> >>>>>>>> Is the following ok with real use cases?
> >>>>>>>> ========================================
> >>>>>>>> An Assumption was made: new sets of Rules arrive with a
> frequency
> >>>>> lower
> >>>>>>>> than 1 (Rule Sets)/sec.
> >>>>>>>> Would this be ok with real use cases?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Performance Figures
> >>>>>>>> ===================
> >>>>>>>> The table below refers to a mono-directional test where the
> >>>>> performance
> >>>>>>> is
> >>>>>>>> compared between the 2 implementations.
> >>>>>>>> Some Flows were installed so that the Classifier was using 7
> >> SubTables.
> >>>>>>>> The ACL Rule format was {Protocol, IPdest, MACsrc, UdpPortDest,
> >> ToS,
> >>>>>>> VlanTci}.
> >>>>>>>> The performance figures are expressed in Mpps.
> >>>>>>>>
> >>>>>>>>                      +------------+------------+
> >>>>>>>>                      | Classifier |   2 ACLs   |
> >>>>>>>> +----------------+------------+------------+
> >>>>>>>> | Max Throughput |    2.2     |    5.4     |
> >>>>>>>> |     [Mpps]     |            |            |
> >>>>>>>> +----------------+------------+------------+
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Conclusions
> >>>>>>>> ===========
> >>>>>>>> At this stage it would really be helpful to have an initial feedback
> >> from
> >>>>> the
> >>>>>>>> Community. Any comment or suggestion will be useful to drive
> >> further
> >>>>>>>> developments.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> References
> >>>>>>>> ==========
> >>>>>>>> DPDK ACL Rules, how to:
> >>>>>>>>
> >>>>>
> http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html
> >>>>>>>>
> >>>>>>>> Notes
> >>>>>>>> =====
> >>>>>>>> When an ACL table contains about 2000 Rules with a structure like
> >>>>>>>> {Protocol, IPsource, IPdest, PortSource, PortDest}
> >>>>>>>> a new insertion costs about 69000 CPUcycles/Rule.
> >>>>>>>> Instead under similar operating conditions the Classifier would
> >> require
> >>>>>>> about
> >>>>>>>> 1300 CPUcycles/Rule.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Antonio
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> dev mailing list
> >>>>>>>> dev at openvswitch.org
> >>>>>>>> http://openvswitch.org/mailman/listinfo/dev
> >>>>>>>>
> >>>> _______________________________________________
> >>>> dev mailing list
> >>>> dev at openvswitch.org
> >>>> http://openvswitch.org/mailman/listinfo/dev
> >>> _______________________________________________
> >>> dev mailing list
> >>> dev at openvswitch.org
> >>> http://openvswitch.org/mailman/listinfo/dev
> >>
> >> _______________________________________________
> >> dev mailing list
> >> dev at openvswitch.org
> >> http://openvswitch.org/mailman/listinfo/dev
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev




More information about the dev mailing list