[ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.

Fri Nov 8 19:26:52 UTC 2019

On Fri, Nov 8, 2019 at 11:22 AM Han Zhou <hzhou at ovn.org> wrote:
>
> 1. storage data and the void *arg of init() breaks the engine node data
encapsulation.
> 2. engine_node_valid(&en_flow_output, engine_run_id) is not needed?
Should use storage to access instead?
> 3. refactor of engine is good but better to be a separate commit
> 4. we can have a new interface: engine_get_data(), which returns data if
it is valid. we should never expose the data directly. We should init the
engine node with dynamically allocated engine data structure (and remember
to free during destroy)

Oops! please ignore the above part since it was draft and I forgot to
delete after editing the formal response, mostly redundant :-)
Real response started here =>
>
> Hi Dumitru,
>
> Sorry for late response.
> On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara <dceara at redhat.com> wrote:
> >
> > The incremental processing engine might stop a run before the
> > en_runtime_data node is processed. In such cases the ed_runtime_data
> > fields might contain pointers to already deleted SB records. For
> > example, if a port binding corresponding to a patch port is removed from
> > the SB database and the incremental processing engine aborts before the
> > en_runtime_data node is processed then the corresponding local_datapath
> > hashtable entry in ed_runtime_data is stale and will store a pointer to
> > the already freed sbrec_port_binding record.
> >
> > This will cause invalid memory accesses in various places (e.g.,
> > pinctrl_run() -> prepare_ipv6_ras()).
> >
> > To fix the issue we need a way to track how each node was processed
> > during an engine run. This commit transforms the 'changed' field in
> > struct engine_node in a 'state' field. Possible node states are:
> > - "New": the node is not yet initialized.
> > - "Stale": data in the node is not up to date with the DB.
> > - "Updated": data in the node is valid but was updated during
> >   the last run of the engine.
> > - "Valid": data in the node is valid and didn't change during
> >   the last run of the engine.
> > - "Aborted": during the last run, processing was aborted for
> >   this node.
> > - "Destroyed": the node was already cleaned up.
> >
> > We also add a separation between engine node data that can be accessed
> > at any time (regardless if the last engine run was successful or not)
> > and data that may be accessed only if the nodes are up to date. This
> > helps avoiding custom "engine_node_valid" handlers for different
> > nodes.
> >
> > The commit also simplifies the logic of calling engine_run and
> > engine_need_run in order to reduce the number of external variables
> > required to track the result of the last engine execution.
> >
> > Functions that need to be called from the main loop and depend on
> > various data contents of the engine's nodes are now called only if
> > the data is up to date.
> >
> > CC: Han Zhou <hzhou8 at ebay.com>
> > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine
- quiet mode.")
> > Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> >
> > ---
> > v2: Address Han's comments:
> > - call engine_node_valid() in all the places where node local data is
> >   used.
> > - move out "global" data outside the engine nodes. Make a clear
> >   separation between data that can be safely used at any time and data
> >   that can be used only when the engine run was successful.
>
> I am concerned with this kind of separation of *global* data, which
breaks the data encapsulation of engine node, and can easily result in
hidden dependency. As you know the main purpose of the I-P engine is to
force explicit dependency exposed between different engine nodes thus
ensure the correctness (at least it helps to ensure) of incremental
processing.
>
> Here is my proposal to address the problem with better encapsulation.
>
> Firstly, let's avoid direct engine data access outside of engine module.
At engine node construction, instead of using reference of stack variable
(such as struct ed_type_runtime_data ed_runtime_data), we can allocate the
memory in the engine node's init() interface, and free in the cleanup()
interface. This way, there will be no way to directly access engine data
like &ed_runtime_data.local_datapaths.
>
> Secondly, let's add a engine module interface engine_get_data() to
retrieve *and validate* data for an engine node:
> void *
> engine_get_data(struct engine_node *node, uint64_t run_id)
> {
>     if (engine_node_valid(node, run_id)) {
>         return node->data;
>     }
>     return NULL;
> }
>
> This should be used whenever an engine node data need to be accessed. (It
may be even better to use node name as input instead of node structure, but
it seems ok to me either way)
>
> Now let's address the always-valid node problem. I was proposing an
is_valid() interface for engine node. It can be NULL by default, but if a
node xxx's data is always valid, it can be implemented like:
>
> static bool
> en_xxx_is_valid(struct engine_node *node)
> {
>     /* This node's data will always be valid */
>     return true;
> }
>
> For the engine_node_valid() function, it can be changed slightly:
>
> bool
> engine_node_valid(struct engine_node *node, uint64_t run_id)
> {
>     if (node->is_valid) {
>         return node->is_valid();
>     }
>     return node->run_id == run_id &&
>         (node->state == EN_UPDATED || node->state == EN_VALID);
> }
>
> So if is_valid is not implemented, it will be the default check,
otherwise, follow whatever logic is used for is_valid(). This is flexible
and may fit for some new cases if a node's data validity depends on some
other factors it can always be customized in is_valid() interface of that
node.
>
> This may result in a little bit more code, since we will have to use
engine_get_data() to retrieve data from a node (and convert to its node's
data type) before accessing, but it would look more clean and safe, while
keeping the dependency well maintained. Does this sound plausible?
>
> Thanks for the other updates and refactors. Please see some minor
comments inlined below.
>
> >
> > -                    if (en_runtime_data.changed) {
> > +                    /* We need to make sure the en_flow_output node was
> > +                     * properly computed before trying to install OF
flows.
> > +                     */
> > +                    if (engine_node_valid(&en_flow_output,
engine_run_id)) {
>
> I guess your intention of using storage.flow_table here was to avoid the
engine_node_valid check, so that this "if" can be avoided, right? (However,
I suggest not using storage at all as discussed above)
>
> > +                        ofctrl_put(&storage.flow_table,
> > +                                   &storage.pending_ct_zones,
> > +
sbrec_meter_table_get(ovnsb_idl_loop.idl),
> > +
get_nb_cfg(sbrec_sb_global_table_get(
> > +
ovnsb_idl_loop.idl)),
> > +                                   engine_node_changed(&en_flow_output,
> > +                                                       engine_run_id));
> > +                    }
>
>
>
>
> >      if (engine_force_recompute) {
> > -        need_recompute = true;
> > -    } else {
> > -        for (size_t i = 0; i < node->n_inputs; i++) {
> > -            if (node->inputs[i].node->changed) {
> > -                need_compute = true;
> > -                if (!node->inputs[i].change_handler) {
> > -                    need_recompute = true;
> > -                    break;
> > -                }
> > +        engine_recompute(node, true, !engine_abort_recompute);
> > +        return;
> > +    }
> > +
> > +    /* If one of the inputs updated data then we need to recompute the
> > +     * current node too.
> > +     */
>
> The comment could be more accurate as: If any of the inputs updated data
but there is no change_handler, then ...
>
> > +    for (size_t i = 0; i < node->n_inputs; i++) {
> > +        if (node->inputs[i].node->state == EN_UPDATED) {
> > +            need_compute = true;
> > +
> > +            /* Trigger a recompute if we don't have a change handler.
*/
> > +            if (!node->inputs[i].change_handler) {
> > +                engine_recompute(node, false, !engine_abort_recompute);
> > +                return;
> >              }
> >          }
> >      }
> >
>
> Thanks,
> Han