[ovs-dev] [RFC 0/8] ovn-controller Incremental Processing

Fri Feb 23 22:00:54 UTC 2018

On 02/23/2018 02:04 PM, Han Zhou wrote:
> Hi Mark, thank you very much for the feedback!!
> 
> On Fri, Feb 23, 2018 at 10:56 AM, Mark Michelson <mmichels at redhat.com 
> <mailto:mmichels at redhat.com>> wrote:
>  >
>  > Hi Han,
>  >
>  > First off, great work! This is fantastic to see, and I'm super 
> excited to see such a drop in CPU usage with this patch.
>  >
>  > Since this is an RFC patch, I won't go too deep into nitty-gritty 
> details on this. Rather, I'll look at it from a high level.
>  >
>  > The API design idea is great. I like the use of a directed graph 
> since it clearly indicates dependencies. The patch progression is done 
> really well since it allows for us to easily understand each individual 
> change. It also makes it easy to see additional incremental changes that 
> could be added in the future.
>  >
>  > My main concern is that the specific setup you have used here results 
> in data encapsulation not being very strong. For instance, the 
> en_flow_output node consistently looks inside the en_runtime_data node 
> for input data. It appears from my quick look that en_flow_output is 
> treating this data as immutable, but I'm not 100% sure if I'm correct. 
> There's nothing in place to stop it from modifying that data if it 
> wanted to, though.
> 
> Agree. It is also my concern for the lack of strong encapsulation. The 
> principle to follow is that each node take input node data as immutable, 
> and write its own data. I will think about how to make this explicitly 
> from code perspective. I didn't have a good idea on this. Do you have 
> any suggestion?
One thought is that we could create better delineations between the 
working data of a node and its output data. In other words, there could 
be one structure of mutable data that the node can manipulate, and a 
separate structure of const data that it exports for other nodes to 
read. Right now, there is a function called engine_get_input() that 
retrieves an engine_node. Maybe a better idea would be to replace that 
with something like engine_get_input_data() that retrieves read-only 
data from an engine node instead.

> 
>  > Also it is common for nodes to modify global data rather than just 
> modifying node-local data(e.g. making calls to ofctrl_add_flow(), which 
> modifies a global hmap).
> 
> flow-table can be considered as data of en_flow_output. Since 
> en_flow_output is the leaf node of the DAG and no other engine node 
> depends on it, so I didn't think it is quit necessary to maintain its 
> own local data. But I agree it is better to be clear from the engine 
> node perspective. I can move the flow-table to en_flow_output data, and 
> update ofctrl module as consumer of that data.
> 
>  > Ideally, this entire thing would be doable without the need for the 
> engine_get_input() function. Each node would operate independently on 
> its own data. Doing this would aid in the ability to reason about what 
> each node is responsible for, and it would open the door for introducing 
> parallel processing of sibling nodes in the DAG.
> 
> Sorry that I didn't get this point. Each node would operate on its own 
> data but they also depends on their inputs, and that's why 
> engine_get_input() is here for. Did I misunderstand your point?

I think taking a second look at the patch has made things a bit more 
clear for me. In my initial reading, I thought that engine_get_input() 
was an anti-pattern since it was being used to look into the "private" 
data of another node. Now that I look again, it makes more sense that 
the input node data should be accessible since they provide input data 
for the parent node. You can ignore my initial finding.

> For parallel processing, I am a little hesitating, because the goal is 
> to reduce the CPU cost on HVs (to save CPU for real workloads). The 
> processing engine should be non-blocking, so if we need parallel 
> processing in the engine, it means we are requesting too much CPU on the 
> HV. I hope we don't need parallel processing for flow computing.

That's a good point.

> 
>  >
>  > All that being said, I would be happy to see a change like this go in 
> as-is (pending more in-depth code review, of course) if the data 
> encapsulation work would creep the scope out too much for the initial 
> changeset.
>  >
>  > Mark!
>  >
>  >
>  > On 02/20/2018 01:04 PM, Han Zhou wrote:
>  >>
>  >> ovn-controller currently recomputes everything when there are any 
> changes
>  >> of input, which leads to high CPU usages and slow in end-to-end flow
>  >> enforcement in response to changes. It even wastes CPU to recompute 
> flows
>  >> for unrelated inputs such as pinctrl events.
>  >>
>  >> This patch series implements incremental processing in ovn-controller to
>  >> solve above problems. There has been a similar attempt of solve the 
> problem
>  >> earlier but was reverted (see commit: 926c34fd). This patch series takes
>  >> a different approach with an incremental processing engine, to make the
>  >> dependencies clear and easier to maintain. The engine is a DAG 
> representing
>  >> dependencies between different nodes. Each node maintains its own 
> data, which
>  >> depends on its inputs and the data can also be inputs of other 
> nodes. Each
>  >> node implements a method to recompute its data based on all the 
> inputs, but
>  >> also implements methods to handle changes of different inputs 
> incrementally.
>  >> The engine will be responsible to try incremental processing for 
> each node
>  >> based on the dependencies or fallback to recompute when changes 
> cannot be
>  >> handled incrementally.
>  >>
>  >> This patch series can incrementally process the most common changes:
>  >> logical flows and port bindings from OVNSB. It can be expanded 
> further for
>  >> more fine grained incremental processing gradually.
>  >>
>  >> With the patch series, the CPU time of ovn-controller in ovn-scale-test
>  >> for 500 lports creating and binding on 50 HVs decreased 90%.
>  >>
>  >> This is RFC version to get feedback to see if there is any major 
> issue of
>  >> this approach, before refining it future for formal review. There 
> are still
>  >> two test cases failed and debugging ongoing.
>  >>
>  >> Han Zhou (8):
>  >>    ovn-controller: Incremental processing engine
>  >>    ovn-controller: Track OVSDB changes
>  >>    ovn-controller: Initial use of incremental engine in main
>  >>    ovn-controller: Split SB inputs as separate incremental engine nodes
>  >>    ovn-controller: split ovs_idl inputs in incremental engine
>  >>    ovn-controller: Incremental logical flow processing
>  >>    ovn-controller: runtime_data change handler for SB port-binding
>  >>    ovn-controller: port-binding incremental processing for physical 
> flows
>  >>
>  >>   include/ovn/actions.h           |   3 +
>  >>   ovn/controller/bfd.c            |   4 +-
>  >>   ovn/controller/binding.c        | 101 +++++++-
>  >>   ovn/controller/binding.h        |   6 +
>  >>   ovn/controller/encaps.c         |  12 +-
>  >>   ovn/controller/lflow.c          | 104 ++++++--
>  >>   ovn/controller/lflow.h          |  10 +-
>  >>   ovn/controller/ofctrl.c         | 226 ++++++++++++-----
>  >>   ovn/controller/ofctrl.h         |  16 +-
>  >>   ovn/controller/ovn-controller.c | 546 
> +++++++++++++++++++++++++++++++---------
>  >>   ovn/controller/ovn-controller.h |   5 +
>  >>   ovn/controller/physical.c       | 140 +++++++----
>  >>   ovn/controller/physical.h       |   8 +-
>  >>   ovn/lib/actions.c               |   6 +-
>  >>   ovn/lib/automake.mk <http://automake.mk>             |   4 +-
>  >>   ovn/lib/extend-table.c          |  31 ++-
>  >>   ovn/lib/extend-table.h          |   9 +-
>  >>   ovn/lib/inc-proc-eng.c          |  97 +++++++
>  >>   ovn/lib/inc-proc-eng.h          | 118 +++++++++
>  >>   19 files changed, 1143 insertions(+), 303 deletions(-)
>  >>   create mode 100644 ovn/lib/inc-proc-eng.c
>  >>   create mode 100644 ovn/lib/inc-proc-eng.h
>  >>
>  >