[ovs-dev] [PATCH ovn] ovn-controller: Fix IP engine run with in-flight messages
Han Zhou
zhouhan at gmail.com
Sun Aug 4 20:40:42 UTC 2019
On Fri, Aug 2, 2019 at 2:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> When trying to incrementally process changes even if there are in-flight
> messages we were incorrectly setting the engine_aborted variable to the
> value returned by engine_run. However, engine_run returns true if the
> run was completed.
>
> This was causing discrepancies between logical flows and openflow flows
> due to the fact that the rerun wasn't happening after an aborted run.
>
> In order to avoid confusion the engine_aborted variable is now renamed to
> engine_run_done thus avoiding the negated logic.
>
> CC: Han Zhou <hzhou8 at ebay.com>
> Fixes: a6b7d9f4f047 ("ovn-controller: Fix flow installation latency
caused by recompute.")
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> ---
> controller/ovn-controller.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index ad33d17..15b9f4e 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1896,7 +1896,7 @@ main(int argc, char *argv[])
>
> uint64_t engine_run_id = 0;
> uint64_t old_engine_run_id = 0;
> - bool engine_aborted = false;
> + bool engine_run_done = true;
>
> unsigned int ovs_cond_seqno = UINT_MAX;
> unsigned int ovnsb_cond_seqno = UINT_MAX;
> @@ -1997,14 +1997,14 @@ main(int argc, char *argv[])
> * this round of engine_run and continue
processing
> * acculated changes incrementally later when
> * ofctrl_can_put() returns true. */
> - if (!engine_aborted) {
> + if (engine_run_done) {
> engine_set_abort_recompute(true);
> - engine_aborted =
engine_run(&en_flow_output,
> -
++engine_run_id);
> + engine_run_done =
engine_run(&en_flow_output,
> +
++engine_run_id);
> }
> } else {
> engine_set_abort_recompute(false);
> - engine_aborted = false;
> + engine_run_done = true;
> engine_run(&en_flow_output, ++engine_run_id);
> }
> }
> @@ -2048,8 +2048,8 @@ main(int argc, char *argv[])
> }
>
> }
> - if (old_engine_run_id == engine_run_id || engine_aborted) {
> - if (engine_aborted || engine_need_run(&en_flow_output)) {
> + if (old_engine_run_id == engine_run_id || !engine_run_done) {
> + if (!engine_run_done ||
engine_need_run(&en_flow_output)) {
> VLOG_DBG("engine did not run, force recompute next
time: "
> "br_int %p, chassis %p", br_int, chassis);
> engine_set_force_recompute(true);
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Acked-by: Han Zhou <hzhou8 at ebay.com>
Thanks Dumitru for this critical fix! It was my silly mistake. I tried to
reproduce the test case failure you reported, but it still didn't happen on
my laptop, even with -j10. Later we may need to add a test case that
triggers a lot of flows being installed so that the scenario with in-flight
messages are easily covered.
The original patch is still waiting to be backported to 2.12. I will send a
new version followed by your fix on 2.12.
More information about the dev
mailing list