[ovs-dev] [PATCH] ovn-controller: Handle physical changes correctly

Ryan Moats rmoats at us.ibm.com
Sat Jul 23 12:57:37 UTC 2016



Ben Pfaff <blp at ovn.org> wrote on 07/22/2016 06:57:59 PM:

> From: Ben Pfaff <blp at ovn.org>
> To: Ryan Moats/Omaha/IBM at IBMUS
> Cc: dev at openvswitch.org
> Date: 07/22/2016 06:58 PM
> Subject: Re: [ovs-dev] [PATCH] ovn-controller: Handle physical
> changes correctly
>
> On Fri, Jul 22, 2016 at 09:54:26PM +0000, Ryan Moats wrote:
> > [1] reported increased failure rates in certain tests
> > with incremental processing (the numbers are the number of failures
> > seen in 100 tests):
> >
> >    2  ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS
> >   10  ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs
> >   52  ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR
> >   45  ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR
> >   23  ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes
> >   53  ovn -- 2 HVs, 3 LRs connected via LS, static routes
> >   32  ovn -- 2 HVs, 2 LRs connected via LS, gateway router
> >   50  ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR
> >
> > These failures were caused by a combination of problems in
> > handling physical changes:
> >
> >   1. When a vif was removed, the localvif_to_ofport entry was not
> >      removed.
> >   2. When a physical change was detected, ovn-controller would wait
> >      a poll cycle before processing the logical flow table.
> >
> > This patch set addresses both of these issues while simultaneously
> > cleaning up the code in physical.c.  A side effect is a modification
> > of where OF flows are dumped in the gateway router case that allowed
> > the root causes of this issue to be found.
> >
> > With these changes, all of the above tests had a 100/100 success rate.
> >
> > [1] http://openvswitch.org/pipermail/dev/2016-July/075803.html
> >
> > Signed-off-by: Ryan Moats <rmoats at us.ibm.com>
>
> Thank you!
>
> This made a major improvement for me, too.  I'm not seeing 100/100 but I
> also run my tests with TESTSUITEFLAGS=-j10, which is really stressful.
>
> I folded in the following changes (most importantly, to make it obvious
> why we're calling poll_immediate_wake(), which the commit message
> explained but the code didn't) and applied this to master.
>
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index dc4ef77..11b2c2b 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -55,8 +55,6 @@ physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
>
>  static struct simap localvif_to_ofport =
>      SIMAP_INITIALIZER(&localvif_to_ofport);
> -static struct simap new_localvif_to_ofport =
> -    SIMAP_INITIALIZER(&new_localvif_to_ofport);
>  static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
>
>  /* UUID to identify OF flows not associated with ovsdb rows. */
> @@ -606,6 +604,8 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
>          uuid_generate(hc_uuid);
>      }
>
> +    struct simap new_localvif_to_ofport =
> +        SIMAP_INITIALIZER(&new_localvif_to_ofport);
>      for (int i = 0; i < br_int->n_ports; i++) {
>          const struct ovsrec_port *port_rec = br_int->ports[i];
>          if (!strcmp(port_rec->name, br_int->name)) {
> @@ -674,8 +674,10 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
>                  tun->ofport = u16_to_ofp(ofport);
>                  tun->type = tunnel_type;
>                  full_binding_processing = true;
> -                lflow_reset_processing();
>                  binding_reset_processing();
> +
> +                /* Reprocess logical flow table immediately. */
> +                lflow_reset_processing();
>                  poll_immediate_wake();
>                  break;
>              } else {
> @@ -712,6 +714,8 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
>      }
>      if (localvif_map_changed) {
>          full_binding_processing = true;
> +
> +        /* Reprocess logical flow table immediately. */
>          lflow_reset_processing();
>          poll_immediate_wake();
>      }
> @@ -853,9 +857,10 @@ physical_run(struct controller_ctx *ctx, enum
> mf_field_id mff_ovn_geneve,
>      ofctrl_add_flow(OFTABLE_DROP_LOOPBACK, 0, &match, &ofpacts,
hc_uuid);
>
>      ofpbuf_uninit(&ofpacts);
> -    simap_clear(&new_localvif_to_ofport);
>      HMAP_FOR_EACH_POP (tun, hmap_node, &tunnels) {
>          free(tun);
>      }
>      hmap_clear(&tunnels);
> +
> +    simap_destroy(&new_localvif_to_ofport);
>  }
>

I should have said 100/100 with -j1 and the above changes make sense.

Thanks for applying!

Ryan



More information about the dev mailing list