[ovs-dev] [PATCH 7/7] ofproto-dpif: Restore metadata and registers on recirculation.

Jarno Rajahalme jrajahalme at nicira.com
Tue Mar 3 23:00:34 UTC 2015


Ben,

Thanks for your detailed comments, and sorry for only compiling with gcc before posting.

> On Mar 3, 2015, at 11:22 AM, Ben Pfaff <blp at nicira.com> wrote:
> 

(snip)

> ofproto-dpif-rid.c
> ------------------
> 
> This comment in recirc_run() gave me pause.  250 ms "should be" enough?
> What happens if someone odd happens; I hope that nothing
> e.g. dereferences a wild pointer and explodes?
> 
>        /* Delete the expired.  These have been lingering for at least 250 ms,
>         * which should be enough for any ongoing recirculations to be
>         * finished. */
> 

The worst that can happen is that a recirculated packet cannot be associated with a recirculation context (and needs to be dropped). This is no worse than dropping a packet on the initial input, which, while undesirable, is not catastrophic.

Lingering is useful in cases where we are only executing actions without setting up a datapath flow, such as packet-outs, but also with slow-pathed flows, or cases where datapath flow set-up fails. It also helps testing with “trace -generate” followed by dummy-receive of the recirculated packet. In these cases the recirculation context would be likely released before the recirculated packet comes back from the datapath. If the datapath manages to hold on to the recirculated packet longer than 250ms IMO it is OK to drop it.

recirc_run() is called from the revalidator, which normally runs every 500ms.

> I think that recirc_find__() requires 'mutex', because it calls
> cmap_find_protected() and 'mutex' is what keeps the cmap from changing.
> I guess that should be annotated for Clang?  It appears that to me that
> recirc_free_id() should take the mutex.

I changed record_free_id() to use public recirc_id_node_find() instead, so it does not need the mutex, thanks!

> 
> You might annotate next_id, expiring, and expired for Clang as protected
> by mutex.

Annotated, thanks!

> 
> This line in recirc_metadata_hash() makes me hope we're careful enough
> about zeroing holes (if any) in ofpacts:
> 
>        hash = hash_words64((const uint64_t *)ofpacts,
>                            OFPACT_ALIGN(ofpacts_len) / OFPACT_ALIGNTO, hash);
> 

We are already using bitwise comparison (ofpbuf_equal()) of rule’s actions in revalidation, so I have assumed holes are already taken care of. The worst that can happen if not is that a new recirc id will be allocated when an old one could have been reused - no catastrophic failures. Recirculated packets get a lookup on the ID, which is not affected by any hash differences due to potential holes in actions.

> 
> recirc_alloc_id() creates and initializes a struct flow but it doesn't
> use it for anything later.

A remnant from an early version, thanks for pointing it out!

> 
> In recirc_free_id(), s/exiting/existing/ or better yet
> s/non-exiting/nonexistent/.  Also I think this is more of a "warn"
> situation (isn't this something that can be recovered from? it's not a
> fatal error as far as I can tell):
>        VLOG_ERR("Freeing non-exiting recirculation ID: %"PRIu32, id);
> 

Changed, thanks!

> I'm not sure why recirc_free_ofproto() logs errors.  Can't there still
> be recirc_ids if there were bonds or if the ofproto didn't go idle and
> free all its recirc ids before exit?  Or is the higher-level code
> supposed to clean up before it kills the recirc id engine?
> 

I adopted this from the existing recirculation code. I think the intent has been to make sure clean up has been done so that there are no stale ofproto pointers after an ofproto is destroyed.

> 
> I need to continue reading this patch starting from
> ofproto-dpif-upcall.c.




More information about the dev mailing list