[ovs-dev] [PATCH] ovn-controller: Handle physical changes correctly
Ryan Moats
rmoats at us.ibm.com
Sat Jul 23 12:57:37 UTC 2016
Ben Pfaff <blp at ovn.org> wrote on 07/22/2016 06:57:59 PM:
> From: Ben Pfaff <blp at ovn.org>
> To: Ryan Moats/Omaha/IBM at IBMUS
> Cc: dev at openvswitch.org
> Date: 07/22/2016 06:58 PM
> Subject: Re: [ovs-dev] [PATCH] ovn-controller: Handle physical
> changes correctly
>
> On Fri, Jul 22, 2016 at 09:54:26PM +0000, Ryan Moats wrote:
> > [1] reported increased failure rates in certain tests
> > with incremental processing (the numbers are the number of failures
> > seen in 100 tests):
> >
> > 2 ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS
> > 10 ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs
> > 52 ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR
> > 45 ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR
> > 23 ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes
> > 53 ovn -- 2 HVs, 3 LRs connected via LS, static routes
> > 32 ovn -- 2 HVs, 2 LRs connected via LS, gateway router
> > 50 ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR
> >
> > These failures were caused by a combination of problems in
> > handling physical changes:
> >
> > 1. When a vif was removed, the localvif_to_ofport entry was not
> > removed.
> > 2. When a physical change was detected, ovn-controller would wait
> > a poll cycle before processing the logical flow table.
> >
> > This patch set addresses both of these issues while simultaneously
> > cleaning up the code in physical.c. A side effect is a modification
> > of where OF flows are dumped in the gateway router case that allowed
> > the root causes of this issue to be found.
> >
> > With these changes, all of the above tests had a 100/100 success rate.
> >
> > [1] http://openvswitch.org/pipermail/dev/2016-July/075803.html
> >
> > Signed-off-by: Ryan Moats <rmoats at us.ibm.com>
>
> Thank you!
>
> This made a major improvement for me, too. I'm not seeing 100/100 but I
> also run my tests with TESTSUITEFLAGS=-j10, which is really stressful.
>
> I folded in the following changes (most importantly, to make it obvious
> why we're calling poll_immediate_wake(), which the commit message
> explained but the code didn't) and applied this to master.
>
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index dc4ef77..11b2c2b 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -55,8 +55,6 @@ physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
>
> static struct simap localvif_to_ofport =
> SIMAP_INITIALIZER(&localvif_to_ofport);
> -static struct simap new_localvif_to_ofport =
> - SIMAP_INITIALIZER(&new_localvif_to_ofport);
> static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
>
> /* UUID to identify OF flows not associated with ovsdb rows. */
> @@ -606,6 +604,8 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
> uuid_generate(hc_uuid);
> }
>
> + struct simap new_localvif_to_ofport =
> + SIMAP_INITIALIZER(&new_localvif_to_ofport);
> for (int i = 0; i < br_int->n_ports; i++) {
> const struct ovsrec_port *port_rec = br_int->ports[i];
> if (!strcmp(port_rec->name, br_int->name)) {
> @@ -674,8 +674,10 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
> tun->ofport = u16_to_ofp(ofport);
> tun->type = tunnel_type;
> full_binding_processing = true;
> - lflow_reset_processing();
> binding_reset_processing();
> +
> + /* Reprocess logical flow table immediately. */
> + lflow_reset_processing();
> poll_immediate_wake();
> break;
> } else {
> @@ -712,6 +714,8 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
> }
> if (localvif_map_changed) {
> full_binding_processing = true;
> +
> + /* Reprocess logical flow table immediately. */
> lflow_reset_processing();
> poll_immediate_wake();
> }
> @@ -853,9 +857,10 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
> ofctrl_add_flow(OFTABLE_DROP_LOOPBACK, 0, &match, &ofpacts,
hc_uuid);
>
> ofpbuf_uninit(&ofpacts);
> - simap_clear(&new_localvif_to_ofport);
> HMAP_FOR_EACH_POP (tun, hmap_node, &tunnels) {
> free(tun);
> }
> hmap_clear(&tunnels);
> +
> + simap_destroy(&new_localvif_to_ofport);
> }
>
I should have said 100/100 with -j1 and the above changes make sense.
Thanks for applying!
Ryan
More information about the dev
mailing list