[ovs-discuss] Why ovn's code structure is getting messy

Tue Feb 2 20:56:09 UTC 2021

Hi. In general, I agree that the code could use a more rigid and 
easier-to-follow structure. Please see this email I sent last week to 
the ovs-dev list: 
https://mail.openvswitch.org/pipermail/ovs-dev/2021-January/379774.html

I'll address some of your points below.

On 2/2/21 6:35 AM, gongming chen wrote:
> To be honest, I think the direction of the ovn community is not towards 
> a stable and clear structure. Especially in the patch integration in the 
> past year, the new patch does not take too much into consideration for 
> compatibility.Many of them are completely logical changes, so the cost 
> of learning and following the ovn community is greatly increased. And 
> many logical changes are completely unnecessary.

I disagree here. Every patch series submitted for OVN either fixes a 
bug, adds a new feature, or fixes a performance bottleneck. Some 
individual patches within a series might reorganize code in order to 
make later patches in the same series easier to follow, though.

> For example, ofctrl_dup_flow--->ovn_flow_dup, when looking at the new 
> github code, I found that ofctrl_dup_flow was not found, and the new 
> function did not change anything except the name. I think this change is 
> completely unnecessary, except Will increase the time cost of reading 
> new code.

It also changed the function to be callable only from within ofctrl.c. 
The name change helps to hint that it is not part of the public ofctrl_* 
API. It's also moot since ovn_flow_dup() was removed in a later patch in 
the same series [1].

> Complexity such as (no other maliciousness), Incremental processing for 
> flow installation by tracking:
> Compared with the small benefits it brings (ofctl put itself is not a 
> big bottleneck), I think the complexity it brings, and the changes 
> brought by the code structure (the change of the existing version is a 
> big cost) Is bigger. This is not conducive to the long-term stable 
> development of ovn, because the code structure of ovn is becoming 
> increasingly unclear.

I agree with the final part of this: the structure of ovn is becoming 
increasingly unclear. It is harder to make new changes to ovn without 
causing regressions, because a change in one area of the code has the 
potential to affect a seemingly unrelated area.

However, I will contend that incremental processing for flow processing 
was a large bottleneck in ovn-controller. According to the cover letter 
for that patch series 
(https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/374099.html), adding 
incremental processing dropped CPU usage by 39% for the port binding 
operation on a 1200 HV, 12K lport network. This is in keeping with 
previous profiling results that had shown that reconciling desired and 
installed flows was the second highest consumer of CPU in ovn-controller 
after lflow expression parsing [2].

> As a general community participant, with long-term ovn production 
> environment operating experience, I think the big bottleneck of ovn now 
> is stability and structural clarity (for custom development). I hope ovn 
> has clear code (like the code structure of the kernel), which can reduce 
> the time cost of using it. I hope ovn is small, clear, stable, and 
> reliable. Some personal suggestions, and finally hope that the ovn 
> community will get better and better.
> Best wishes.

Thanks, I'm in agreement that the code needs clearer structure. The 
hardest of those descriptors to deliver on is "small". Users are 
demanding larger and more complex scenarios than when ovn was initially 
developed. With those scenarios, we are finding that delivering the 
requested performance has come at the cost of code complexity. However, 
the suggested refactor for OVN is made with the idea of better defining 
the structure of the code so that it is easier to document, follow, and 
can be more easily tested.

Mark Michelson

> 
> _______________________________________________
> discuss mailing list
> discuss at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> 
[1] You may be asking "Why would this function be added, only to be 
removed later in the same patch series?" That's due to the review 
process. The first version of the series changed the function name to 
ovn_flow_dup() and left it that way for the entire series. In code 
review, it was suggested to split desired and installed flows into 
separate types. The result is that a later version of the patch series 
created ovn_flow_dup() in patch 1 and then changed it to 
installed_flow_dup() in patch 6.

[2] lflow_expression_parsing is avoided when possible due to incremental 
processing of logical flows, as well as caching expression results when 
possible. Both of these changes fall into the category of "increases 
performance but makes the code more complex".