[ovs-dev] Wildcard Matching optimization idea

Fischetti, Antonio antonio.fischetti at intel.com
Thu Jan 14 15:28:23 UTC 2016


Thanks Wei Li, please see my replies inline.
If I missed something or for further detail, just let me know.

> -----Original Message-----
> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Wei Li
> Sent: Thursday, January 14, 2016 6:38 AM
> To: dev at openvswitch.org
> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> 
> insert a rule to "mirror" and swap "mirror" and "real"
> 
> then real does not contain the new inserted rule

Correct, it needs to be updated as well.

> 
> how to sync?  take the "long" time as inserting to "mirror"?
> 

Yes, it will take the same long time.
The thread that has previously inserted the rule into what was the Shadow 
(or "mirror") will repeat the same work on the other table.

Below is the sequence with few more details to show the latency for a new rule.

1. Insert the new rule into the background table
----------------------------------------------------------
Please Note: while this is happening the new rule will not affect the lookups 
as they're always carried out on the foreground Table.
Background table updated? Go to Step #2.

2. Swap tables 
------------------
The updated table is 'moved' to the foreground so that lookups are now 
affected by the new rule.
Anyway we're not done yet, we updated just one of the 2 tables. 
In case further new rules need to be added at this stage, they will get buffered 
in a 'waiting list' to be processed later.
Background table updated? Go to Step #3.

3. Done
----------
Both tables are synced up.
Is there a new rule to insert and/or anything in the 'waiting list'? Go to Step #1.


> 在 2016/1/12 18:37, Fischetti, Antonio 写道:
> > Thanks Zoltan for your questions.
> >
> > Anyone else has any new feedback?
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Fischetti,
> >> Antonio
> >> Sent: Friday, December 18, 2015 3:44 PM
> >> To: Zoltan Kiss; dev at openvswitch.org
> >> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>
> >>
> >>> -----Original Message-----
> >>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >>> Sent: Friday, December 18, 2015 12:38 PM
> >>> To: Fischetti, Antonio; dev at openvswitch.org
> >>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>>
> >>>
> >>>
> >>> On 17/12/15 16:23, Fischetti, Antonio wrote:
> >>>> Hi Zoltan, thanks for your questions.
> >>>> Please find below my answers inline.
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >>>>> Sent: Thursday, December 17, 2015 2:33 PM
> >>>>> To: Fischetti, Antonio; dev at openvswitch.org
> >>>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 17/12/15 10:41, Fischetti, Antonio wrote:
> >>>>>> Hi All,
> >>>>>> Here's an optimization idea for the datapath classifier table.
> >>>>>> I'd like to get some feedback.
> >>>>>>
> >>>>>> I used the DPDK ACL tables. They can perform a wildcarded matching
> >>> and
> >>>>> each
> >>>>>> lookup requires less CPU cycles than the Classifier.
> >>>>>> Anyway there's a negative aspect with ACLs. They take a very long
> time
> >>> to
> >>>>>> insert a new Rule.
> >>>>>> It can be 50 times greater than an insertion into the Classifier. See
> >> Note
> >>>>> below
> >>>>>> for further details.
> >>>>>>
> >>>>>> So a simple 1:1 replacement of the Classifier with an ACL table is not
> a
> >>>>> viable
> >>>>>> solution.
> >>>>>>
> >>>>>> The idea described below is instead to replace the Classifier with 2
> ACL
> >>>>>> tables. One is the 'Operating', while the other is a 'Shadow' table.
> >>>>>>
> >>>>>> Any lookup will be performed on the Operating table.
> >>>>>>
> >>>>>> Instead any new insertion will be executed on the Shadow table by
> >>> means
> >>>>> of a
> >>>>>> separate thread.
> >>>>>> After the insertion is done, the 2 tables will be swapped.
> >>> Are you swapping after each insertion, or in batches?
> >> In batches.
> >>
> >>> The new shadow table needs to get updated first to be sync with
> >> Operating, does it take
> >>> a similar amount of time?
> >> No, the shadow table acts as a 'mirror' of the operating. So the 2
> >> tables are supposed to contain exactly the same entries.
> >> An exception is during the transient insertion procedure. But after it
> >> is completed the 2 tables will contain again the same entries.
> >>
> >>> Instead of having this 2 table, how about have one, and make it possible
> >>> that you can look up while an insertion is on place? Something in an RCU
> >>> fashion?
> >> Unluckly not. While an insertion is taking place it is not possible to access
> the
> >> ACL
> >> to read its entries.
> >> More precisely, an ACL insertion means 2 actions: add + rebuild. The
> rebuild
> >> takes
> >> the 95% cpu cycles of all the insertion.
> >> You could read while the 'add' is in progress. Instead you can't read while
> the
> >> 'rebuild' is still happening.
> >> That's why I'm using 2 ACL tables.
> >>
> >>>>> So while this insertion happens, you still look up in the actual
> >>>>> Operating table.
> >>>> Yes, while insertion is in progress any lookup will still be carried out on
> >> the
> >>>> Operating table.
> >>> I don't know how the classifier works exactly, but is the following
> >>> scenario possible?:
> >>>
> >>> Rule A matches a flow and specifies an action. A new insertion would
> >>> essentially remove Rule A and add B which matches the same flow but
> >>> specifies a different action. While that happens, packets would still
> >>> match A, while the expectation probably would be to match B.
> >>>
> >> The same issue can happen with the Classifier, in this case it will be worse.
> >> The solution with 2 ACLs have an insertion latency much longer.
> >> That is because an ACL insertion can be about 50 times greater than an
> >> insertion into the Classifier.
> >>
> >>>>> What happens if you have a new insertion in the meantime?
> >>>> The new Rule gets buffered into a 'wait' queue.
> >>>>
> >>>>> Especially, what happens if your lookup yields the same rule
> >>>>> which is inserted at the moment?
> >>>> That's a good point.  At the current stage it is simply added into the
> wait
> >>> queue.
> >>>> So I could potentially have duplications where different rules into the
> ACL
> >>> are
> >>>> referring to the same netdev-flow.
> >>>> To avoid these duplications there could be 2 approaches.
> >>>> One option would be to check that in the wait queue that rule is not
> >>> present.
> >>>> Another option would be to store it into the wait queue anyway and
> then
> >>> check
> >>>> that the ACL does not already contain that rule.
> >>>
> >>>
> >>>
> >>>>>> Thus the Shadow table will now become the Operating one, and
> >>> viceversa.
> >>>>>>
> >>>>>> Is the following ok with real use cases?
> >>>>>> ========================================
> >>>>>> An Assumption was made: new sets of Rules arrive with a frequency
> >>> lower
> >>>>>> than 1 (Rule Sets)/sec.
> >>>>>> Would this be ok with real use cases?
> >>>>>>
> >>>>>>
> >>>>>> Performance Figures
> >>>>>> ===================
> >>>>>> The table below refers to a mono-directional test where the
> >>> performance
> >>>>> is
> >>>>>> compared between the 2 implementations.
> >>>>>> Some Flows were installed so that the Classifier was using 7
> SubTables.
> >>>>>> The ACL Rule format was {Protocol, IPdest, MACsrc, UdpPortDest,
> ToS,
> >>>>> VlanTci}.
> >>>>>> The performance figures are expressed in Mpps.
> >>>>>>
> >>>>>>                     +------------+------------+
> >>>>>>                     | Classifier |   2 ACLs   |
> >>>>>> +----------------+------------+------------+
> >>>>>> | Max Throughput |    2.2     |    5.4     |
> >>>>>> |     [Mpps]     |            |            |
> >>>>>> +----------------+------------+------------+
> >>>>>>
> >>>>>>
> >>>>>> Conclusions
> >>>>>> ===========
> >>>>>> At this stage it would really be helpful to have an initial feedback
> from
> >>> the
> >>>>>> Community. Any comment or suggestion will be useful to drive
> further
> >>>>>> developments.
> >>>>>>
> >>>>>>
> >>>>>> References
> >>>>>> ==========
> >>>>>> DPDK ACL Rules, how to:
> >>>>>>
> >>> http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html
> >>>>>>
> >>>>>> Notes
> >>>>>> =====
> >>>>>> When an ACL table contains about 2000 Rules with a structure like
> >>>>>> {Protocol, IPsource, IPdest, PortSource, PortDest}
> >>>>>> a new insertion costs about 69000 CPUcycles/Rule.
> >>>>>> Instead under similar operating conditions the Classifier would
> require
> >>>>> about
> >>>>>> 1300 CPUcycles/Rule.
> >>>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Antonio
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> dev mailing list
> >>>>>> dev at openvswitch.org
> >>>>>> http://openvswitch.org/mailman/listinfo/dev
> >>>>>>
> >> _______________________________________________
> >> dev mailing list
> >> dev at openvswitch.org
> >> http://openvswitch.org/mailman/listinfo/dev
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
> 
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev


More information about the dev mailing list