[ovs-dev] [PATCH monitor_cond V10] RFC OVN: Implementation of conditional monitoring usage

Liran Schour LIRANS at il.ibm.com
Mon Jul 25 06:21:35 UTC 2016


"dev" <dev-bounces at openvswitch.org> wrote on 25/07/2016 09:03:55 AM:

> From: Liran Schour/Haifa/IBM at IBMIL
> To: "Ryan Moats" <rmoats at us.ibm.com>
> Cc: dev at openvswitch.org
> Date: 25/07/2016 09:04 AM
> Subject: Re: [ovs-dev] [PATCH monitor_cond V10] RFC OVN: 
> Implementation of conditional monitoring usage
> Sent by: "dev" <dev-bounces at openvswitch.org>
> 
> Ryan Moats/Omaha/IBM wrote on 24/07/2016 11:03:21 PM:
> 
> > From: Ryan Moats/Omaha/IBM
> > To: Liran Schour <lirans at il.ibm.com>
> > Cc: Ben Pfaff <blp at ovn.org>, dev at openvswitch.org
> > Date: 24/07/2016 11:03 PM
> > Subject: Re: [ovs-dev] [PATCH monitor_cond V10] RFC OVN: 
> > Implementation of conditional monitoring usage
> > 
> > "dev" <dev-bounces at openvswitch.org> wrote on 07/19/2016 03:44:40 AM:
> > 
> > > From: Liran Schour <lirans at il.ibm.com>
> > > To: Ben Pfaff <blp at ovn.org>
> > > Cc: dev at openvswitch.org
> > > Date: 07/19/2016 03:45 AM
> > > Subject: [ovs-dev] [PATCH monitor_cond V10] RFC OVN: Implementation 
> > > of conditional monitoring usage
> > > Sent by: "dev" <dev-bounces at openvswitch.org>
> > > 
> > > Conditional monitoring of the Port_Binding, Logical_Flow, Multicast_Group,
> > > and MAC_Binding tables. As a result, ovn-controller will be notified only
> > > about records belonging to a datapath that is being served by this
> > > hypervisor.
> > > 
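(For context: the idea is that the client attaches a per-table condition
to its monitor request so that the server only sends rows matching that
condition. An illustrative invocation, assuming the monitor-cond client
support added by this series and using a placeholder datapath UUID and a
placeholder SB address, might look roughly like this:

    # Sketch only: monitor Port_Binding rows for a single datapath
    # instead of the whole table.  UUID and address are placeholders.
    ovsdb-client monitor-cond tcp:127.0.0.1:6642 OVN_Southbound \
        '[["datapath", "==", ["uuid", "11111111-2222-3333-4444-555555555555"]]]' \
        Port_Binding

ovn-controller would install equivalent conditions on the other tables
listed above, keyed on the datapaths it actually serves.)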
> > > Performance evaluation:
> > > OVN is the main candidate for conditional monitoring usage. It is clear
> > > that conditional monitoring reduces computation on the ovn-controller
> > > (client) side due to the reduced size of flow tables and update messages.
> > > Performance evaluation shows up to 75% computation reduction.
> > > However, the evaluation also shows a reduction in computation on the SB
> > > ovsdb-server side proportional to the degree to which each logical
> > > network is spread over physical hosts in the DC. The evaluation shows
> > > that in realistic scenarios there is a computation reduction on the
> > > server side as well.
> > > 
> > > Evaluation on a simulated environment of 50 hosts and 1000 logical ports
> > > shows the following results (cycle counts):
> > > 
> > > LN spread over # hosts|    master    | patch        | change
> > > -------------------------------------------------------------
> > >             1         | 24597200127  | 24339235374  |  1.0%
> > >             6         | 23788521572  | 19145229352  | 19.5%
> > >            12         | 23886405758  | 17913143176  | 25.0%
> > >            18         | 25812686279  | 23675094540  |  8.2%
> > >            24         | 28414671499  | 24770202308  | 12.8%
> > >            30         | 31487218890  | 28397543436  |  9.8%
> > >            36         | 36116993930  | 34105388739  |  5.5%
> > >            42         | 37898342465  | 38647139083  | -1.9%
> > >            48         | 41637996229  | 41846616306  | -0.5%
> > >            50         | 41679995357  | 43455565977  | -4.2%
> > > 
> > > Signed-off-by: Liran Schour <lirans at il.ibm.com>
> > > ---
> > 
> > It looks like this patch needs to be rebased - it appears to conflict
> > in both binding.c and patch.c
> 
> I have a rebased version on top of the new incremental processing code.
> However, I have a problem with one unit test (2189: ovn -- 2 HVs, 2 LRs
> connected via LS, gateway router FAILED (ovn.at:3238)) that fails 100% of
> the time when I combine the incremental processing and conditional
> monitoring patches.
> 
> That said, the conditional monitoring patch improves the stability of the
> tests overall. I ran the following test:
> for i in `seq 100`; do
>     echo `pwd`
>     make check TESTSUITEFLAGS="-k ovn" >> check.out 2>&1
> done
> grep 'FAILED (' check.out | sort | uniq -c
> 
>       4 2180: ovn -- 3 HVs, 1 VIFs/HV, 1 software GW, 1 LS    FAILED (ovn.at:1473)
>       1 2181: ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR           FAILED (ovn.at:1887)
>      75 2183: ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs      FAILED (ovn.at:2416)
>      15 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR             FAILED (ovn.at:2529)
>       1 2185: ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR            FAILED (ovn.at:2668)
>      88 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
>      98 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
>       6 2189: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovn.at:3237)
>      94 2189: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovn.at:3240)
>       2 2190: ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:3389)
> 
> And with the conditional monitoring patch rebased on master I get:
>      40 2180: ovn -- 3 HVs, 1 VIFs/HV, 1 software GW, 1 LS    FAILED (ovn.at:1474)
>       5 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR             FAILED (ovn.at:2530)
>     100 2189: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovn.at:3238)
> 
> As you can see, things are much more stable after applying the conditional
> monitoring patch, but test 2189 fails 100% of the time.
> Conditional monitoring without incremental processing does not fail this
> test.
> 
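(For anyone who wants to reproduce just the failing case, a single
Autotest case can be rerun on its own; a sketch, with the test number
taken from the run above (it may differ between trees):

    # Sketch: rerun only the failing gateway-router test verbosely and
    # keep its test directory for inspection.  "2189" is the number
    # reported above; adjust it to match your tree.
    make check TESTSUITEFLAGS="2189 -v -d"

Here -v asks Autotest for verbose output and -d keeps the debugging
material around instead of cleaning it up.)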
> I will submit the rebased patch and we will all have the opportunity to
> look at this.
> 

Sorry, but I see that I had not rebased onto the most recent commits.
It seems that the two commits:
- "ovn-controller: Handle physical changes correctly"
- "ovn-controller: eliminate stall in ofctrl state machine"

really solve the problem :-)
I will submit my patch soon.



