[ovs-dev] [PATCH ovn] ovn-controller: Fix IP engine run with in-flight messages

Dumitru Ceara dceara at redhat.com
Mon Aug 5 07:12:45 UTC 2019


On Mon, Aug 5, 2019 at 8:52 AM Numan Siddique <nusiddiq at redhat.com> wrote:
>
>
>
> On Mon, Aug 5, 2019 at 2:11 AM Han Zhou <zhouhan at gmail.com> wrote:
>>
>> On Fri, Aug 2, 2019 at 2:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
>> >
>> > When trying to incrementally process changes even if there are in-flight
>> > messages we were incorrectly setting the engine_aborted variable to the
>> > value returned by engine_run. However, engine_run returns true if the
>> > run was completed.
>> >
>> > This was causing discrepancies between logical flows and openflow flows
>> > due to the fact that the rerun wasn't happening after an aborted run.
>> >
>> > In order to avoid confusion the engine_aborted variable is now renamed to
>> > engine_run_done thus avoiding the negated logic.
>> >
>> > CC: Han Zhou <hzhou8 at ebay.com>
>> > Fixes: a6b7d9f4f047 ("ovn-controller: Fix flow installation latency
>> caused by recompute.")
>> > Signed-off-by: Dumitru Ceara <dceara at redhat.com>
>> > ---
>> >  controller/ovn-controller.c | 14 +++++++-------
>> >  1 file changed, 7 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>> > index ad33d17..15b9f4e 100644
>> > --- a/controller/ovn-controller.c
>> > +++ b/controller/ovn-controller.c
>> > @@ -1896,7 +1896,7 @@ main(int argc, char *argv[])
>> >
>> >      uint64_t engine_run_id = 0;
>> >      uint64_t old_engine_run_id = 0;
>> > -    bool engine_aborted = false;
>> > +    bool engine_run_done = true;
>> >
>> >      unsigned int ovs_cond_seqno = UINT_MAX;
>> >      unsigned int ovnsb_cond_seqno = UINT_MAX;
>> > @@ -1997,14 +1997,14 @@ main(int argc, char *argv[])
>> >                               * this round of engine_run and continue
>> processing
>> >                               * acculated changes incrementally later when
>> >                               * ofctrl_can_put() returns true. */
>> > -                            if (!engine_aborted) {
>> > +                            if (engine_run_done) {
>> >                                  engine_set_abort_recompute(true);
>> > -                                engine_aborted =
>> engine_run(&en_flow_output,
>> > -
>>  ++engine_run_id);
>> > +                                engine_run_done =
>> engine_run(&en_flow_output,
>> > +
>> ++engine_run_id);
>> >                              }
>> >                          } else {
>> >                              engine_set_abort_recompute(false);
>> > -                            engine_aborted = false;
>> > +                            engine_run_done = true;
>> >                              engine_run(&en_flow_output, ++engine_run_id);
>> >                          }
>> >                      }
>> > @@ -2048,8 +2048,8 @@ main(int argc, char *argv[])
>> >                  }
>> >
>> >              }
>> > -            if (old_engine_run_id == engine_run_id || engine_aborted) {
>> > -                if (engine_aborted || engine_need_run(&en_flow_output)) {
>> > +            if (old_engine_run_id == engine_run_id || !engine_run_done) {
>> > +                if (!engine_run_done ||
>> engine_need_run(&en_flow_output)) {
>> >                      VLOG_DBG("engine did not run, force recompute next
>> time: "
>> >                               "br_int %p, chassis %p", br_int, chassis);
>> >                      engine_set_force_recompute(true);
>> > --
>> > 1.8.3.1
>> >
>> > _______________________________________________
>> > dev mailing list
>> > dev at openvswitch.org
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>> Acked-by: Han Zhou <hzhou8 at ebay.com>
>
>
>
> Thanks.
> I applied this patch to master.
>
> Numan
>
>>
>>
>> Thanks Dumitru for this critical fix! It was my silly mistake. I tried to
>> reproduce the test case failure you reported, but it still didn't happen on
>> my laptop, even with -j10. Later we may need to add a test case that
>> triggers a lot of flows being installed so that the scenario with in-flight
>> messages are easily covered.
>>
>> The original patch is still waiting to be backported to 2.12. I will send a
>> new version followed by your fix on 2.12.
>> _______________________________________________


Thanks Han, Mark, Numan for reviewing and testing the patch!


More information about the dev mailing list