[ovs-dev] [PATCH 10/10] Implement "closures".

Jarno Rajahalme jarno at ovn.org
Thu Jan 28 22:15:29 UTC 2016


IMO you have included sufficient testing to warrant merging this code; maybe remove the “RFC” reference from the commit message?

Some comments for future consideration below.

With this incremental:

diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 42c8f1e..4a080cd 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -3887,7 +3887,7 @@ parse_actions_property(struct ofpbuf *property, enum ofp_version version,
 {
     if (!ofpbuf_try_pull(property, ROUND_UP(ofpbuf_headersize(property), 8))) {
         VLOG_WARN_RL(&bad_ofmsg_rl,
-                     "actions property has bad length %"PRIuSIZE,
+                     "actions property has bad length %"PRIu32,
                      property->size);
         return OFPERR_OFPBPC_BAD_LEN;
     }


Acked-by: Jarno Rajahalme <jarno at ovn.org>

> On Jan 27, 2016, at 9:51 AM, Ben Pfaff <blp at ovn.org> wrote:
> 
> One purpose of OpenFlow packet-in messages is to allow a controller to
> interpose on the path of a packet through the flow tables.  If, for
> example, the controller needs to modify a packet in some way that the
> switch doesn't directly support, the controller should be able to
> program the switch to send it the packet, then modify the packet and
> send it back to the switch to continue through the flow table.
> 
> That's the theory.  In practice, this doesn't work with any but the
> simplest flow tables.  Packet-in messages simply don't include enough
> context to allow the flow table traversal to continue.  For example:
> 
>    * Via "resubmit" actions, an Open vSwitch packet can have an
>      effective "call stack", but a packet-in can't describe it, and
>      so it would be lost.
> 
>    * Via "patch ports", an Open vSwitch packet can traverse multiple
>      OpenFlow logical switches.  A packet-in can't describe or resume
>      this context.
> 
>    * A packet-in can't preserve the stack used by NXAST_PUSH and
>      NXAST_POP actions.
> 
>    * A packet-in can't preserve the OpenFlow 1.1+ action set.
> 
>    * A packet-in can't preserve the state of Open vSwitch mirroring
>      or connection tracking.
> 
> This commit introduces a solution called "closures".  A closure is the
> state of a packet's traversal through OpenFlow flow tables.  A new
> NXAST_PAUSE action generates a closure and sends it to the OpenFlow
> controller as an NXT_CLOSURE asynchronous message (which must be
> enabled through an extension to OFPT_SET_ASYNC or NXT_SET_ASYNC2).

Should the latter be NXT_SET_ASYNC_CONFIG?

> The controller processes the closure, possibly modifying some of its
> publicly accessible data (for now, the packet and its metadata), and
> sends it back to the switch with an NXT_RESUME request, which
> causes flow table traversal to continue.  In principle, a single
> packet can be paused and resumed multiple times.
> 
> Another way to look at it is:
> 
>    - NXAST_PAUSE is an extension of the existing OFPAT_CONTROLLER
>      action.  It sends the packet to the controller, with full
>      pipeline context (some of which is switch implementation
>      dependent, and may thus vary from switch to switch).
> 
>    - NXT_CLOSURE is an extension of OFPT_PACKET_IN, allowing for
>      implementation dependent metadata.
> 
>    - NXT_RESUME is an extension of OFPT_PACKET_OUT, with the
>      semantics that the pipeline processing is continued with the
>      original translation context from where it was left at the time
>      of NXAST_PAUSE.
> 
> This RFC patch has a number of caveats:
> 
>    * No real tests yet.
> 
> Signed-off-by: Ben Pfaff <blp at ovn.org>
> ---
> NEWS                          |   4 +
> include/openflow/nicira-ext.h |  99 +++++++++++-
> lib/meta-flow.c               |   9 +-
> lib/meta-flow.h               |   3 +-
> lib/ofp-actions.c             |  74 ++++++++-
> lib/ofp-actions.h             |   9 +-
> lib/ofp-errors.h              |  15 +-
> lib/ofp-msgs.h                |   8 +
> lib/ofp-print.c               |  68 +++++++++
> lib/ofp-util.c                | 341 ++++++++++++++++++++++++++++++++++++++++++
> lib/ofp-util.h                |  56 ++++++-
> lib/rconn.c                   |   4 +-
> ofproto/connmgr.c             |  65 ++++++--
> ofproto/connmgr.h             |   3 +
> ofproto/ofproto-dpif-xlate.c  | 132 ++++++++++++++--
> ofproto/ofproto-dpif-xlate.h  |   4 +
> ofproto/ofproto-dpif.c        |  35 +++++
> ofproto/ofproto-provider.h    |   5 +-
> ofproto/ofproto.c             |  25 ++++
> tests/ofp-print.at            |   6 +
> tests/ofproto-dpif.at         | 174 +++++++++++++++++++++
> tests/ofproto-macros.at       |  40 ++++-
> tests/ofproto.at              |   2 +
> utilities/ovs-ofctl.8.in      |  19 ++-
> utilities/ovs-ofctl.c         | 111 ++++++++++----
> vswitchd/vswitch.xml          |  17 ++-
> 26 files changed, 1247 insertions(+), 81 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 04925f6..712d9b9 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -5,6 +5,10 @@ Post-v2.5.0
>    - OpenFlow:
>      * OpenFlow 1.1+ OFPT_QUEUE_GET_CONFIG_REQUEST now supports OFPP_ANY.
>      * OpenFlow 1.4+ OFPMP_QUEUE_DESC is now supported.
> +     * New extension message NXT_SET_ASYNC_CONFIG2 to allow OpenFlow 1.4-like

Should this be just NXT_SET_ASYNC_CONFIG?

> +       control over asynchronous messages in earlier versions of OpenFlow.
> +     * New "closure" extension to pause and resume flow table traversal.
> +       See NXAST_PAUSE, NXT_CLOSURE, and NXT_RESUME for more information.
>    - ovs-ofctl:
>      * queue-get-config command now allows a queue ID to be specified.
> 
> diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
> index dad8707..8572496 100644
> --- a/include/openflow/nicira-ext.h
> +++ b/include/openflow/nicira-ext.h
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
> + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
>  *
>  * Licensed under the Apache License, Version 2.0 (the "License");
>  * you may not use this file except in compliance with the License.
> @@ -246,6 +246,103 @@ struct nx_packet_in {
> };
> OFP_ASSERT(sizeof(struct nx_packet_in) == 24);
> 
> +/* NXT_CLOSURE.
> + *
> + * A closure serializes the state of a packet's trip through Open vSwitch flow
> + * tables.  This permits an OpenFlow controller to interpose on a packet midway
> + * through processing in Open vSwitch.
> + *
> + * Closures fit into packet processing this way:
> + *
> + * 1. A packet ingresses into Open vSwitch, which runs it through the OpenFlow
> + *    tables.
> + *
> + * 2. An OpenFlow flow executes a NXAST_PAUSE action.  Open vSwitch serializes
> + *    the packet processing state into a closure and sends it (as an
> + *    NXT_CLOSURE) to the OpenFlow controller.
> + *
> + * 3. The controller receives the NXT_CLOSURE and processes it.  The controller
> + *    can interpret and, if desired, modify some of the contents of the
> + *    closure, such as the packet and the metadata being processed.
> + *
> + * 4. The controller sends the NXT_CLOSURE back to the switch, using an
> + *    NXT_RESUME message.  Packet processing resumes where it left off.
> + *
> + * The controller might change the pipeline configuration concurrently with
> + * steps 2 through 4.  For example, it might add or remove OpenFlow flows.  If
> + * that happens, then the packet will experience a mix of processing from the
> + * two configurations, that is, the initial processing (before NXAST_PAUSE)
> + * uses the initial flow table, and the later processing (after NXT_RESUME)
> + * uses the later flow table.
> + *
> + * External side effects (e.g. "output") of OpenFlow actions processed before
> + * NXAST_PAUSE is encountered might be executed during step 2 or step 4, and
> + * the details may vary among Open vSwitch features and versions.  Thus, a
> + * controller that wants to make sure that side effects are executed must pass
> + * the closure back to the switch, that is, must not skip step 4.
> + *
> + * Closures may be "stateful" or "stateless", that is, they may or may not
> + * refer to buffered state maintained in Open vSwitch.  This means that a
> + * controller should not attempt to resume a given closure more than once
> + * (because the switch might have discarded the buffered state after the first
> + * use).  For the same reason, closures might become "stale" if the controller
> + * takes too long to resume them (because the switch might have discarded old
> + * buffered state).  Taken together with the previous note, this means that a
> + * controller should resume each closure exactly once (and promptly).
> + *
> + * NXT_CLOSURE is something like OFPT_PACKET_IN (if OFPT_PACKET_IN were
> + * extensible then it would make sense to implement NXT_CLOSURE as an
> + * extension).  The important difference is the extra information that captures
> + * the state of the pipeline.  Without this information, the controller can
> + * (with careful design, and help from the flow cookie) determine where the
> + * packet is in the pipeline, but in the general case it can't determine what
> + * nested "resubmit"s that may be in progress, or what data is on the stack
> + * maintained by NXAST_STACK_PUSH and NXAST_STACK_POP actions, what is in the
> + * OpenFlow action set, etc.
> + *
> + * Closures are expensive because they require a round trip between the switch
> + * and the controller.  Thus, they should not be used to implement processing
> + * that needs to happen at "line rate".
> + *
> + * Closures are serialized as type-length-value properties in the format
> + * commonly used in OpenFlow 1.4 and later.  Some properties, with 'type'
> + * values less than 0x8000, are meant to be interpreted by the controller, and
> + * are documented here.  Other properties, with 'type' values of 0x8000 or
> + * greater, are private to the switch, may change unpredictably from one
> + * version of Open vSwitch to another, and are not documented here.
> + *
> + * An NXT_CLOSURE is tied to a given Open vSwitch process and bridge, so that
> + * restarting Open vSwitch or deleting and recreating a bridge will cause the
> + * corresponding NXT_RESUME to be rejected.
> + *
> + * In the current implementation, Open vSwitch forks the packet processing
> + * pipeline across patch ports.  Suppose, for example, that the pipeline for
> + * br0 outputs to a patch port whose peer belongs to br1, and that the pipeline
> + * for br1 executes a "pause" action.  This "pause" only pauses processing
> + * within br1, and processing in br0 continues and possibly completes with
> + * visible side effects, such as outputting to ports, before br1's controller
> + * receives or processes the closure.  This implementation maintains the
> + * independence of separate bridges and, since processing in br1 cannot affect
> + * the behavior of br0 anyway, should not cause visible behavioral changes.
> + */
> +
> +enum nx_closure_prop_type {
> +    /* Public properties. */
> +    NXCPT_PACKET,               /* Raw packet data. */
> +    NXCPT_METADATA,             /* NXM or OXM for metadata fields. */
> +
> +    /* Private properties.  These are subject to change and not architectural.
> +     * Do not depend on them. */
> +    NXCPT_BRIDGE = 0x8000,
> +    NXCPT_STACK,
> +    NXCPT_MIRRORS,
> +    NXCPT_CONNTRACKED,
> +    NXCPT_TABLE_ID,
> +    NXCPT_COOKIE,
> +    NXCPT_ACTIONS,
> +    NXCPT_ACTION_SET,

When I wrote the following I had not yet seen that there may be multiple table_id, cookie, and actions properties. I guess the comment below applies to the table_id of the table where the pause action was located, and to the cookie of the rule that has the pause action:

It seems to me that controller parsing of incoming closures might be eased by making table_id and cookie public properties. I do not have a specific example in mind, but controllers explicitly install rules with specific cookies into specific tables, so a controller could use, e.g., a special cookie value (or bits) to separate different PAUSE reasons from each other. With that thought, maybe the PAUSE action could take a ‘reason’ parameter that would then be passed on in the CLOSURE message?

(snip)

> +static void
> +ofputil_put_closure_private(const struct ofputil_closure_private *closure,
> +                            enum ofputil_protocol protocol, struct ofpbuf *msg)
> +{
> +    enum ofp_version version = ofputil_protocol_to_ofp_version(protocol);
> +
> +    ofputil_put_closure(&closure->public, protocol, msg);
> +
> +    ofpprop_put_uuid(msg, NXCPT_BRIDGE, &closure->bridge);
> +
> +    for (size_t i = 0; i < closure->n_stack; i++) {
> +        const union mf_subvalue *s = &closure->stack[i];
> +        size_t ofs;
> +        for (ofs = 0; ofs < sizeof *s; ofs++) {
> +            if (s->u8[ofs]) {
> +                break;
> +            }
> +        }
> +
> +        ofpprop_put(msg, NXCPT_STACK, &s->u8[ofs], sizeof *s - ofs);
> +    }
> +
> +    if (closure->mirrors) {
> +        ofpprop_put_u32(msg, NXCPT_MIRRORS, closure->mirrors);
> +    }
> +
> +    if (closure->conntracked) {
> +        ofpprop_put_flag(msg, NXCPT_CONNTRACKED);
> +    }
> +
> +    if (closure->actions_len) {
> +        const struct ofpact *const end = ofpact_end(closure->actions,
> +                                                    closure->actions_len);
> +        const struct ofpact_unroll_xlate *unroll = NULL;
> +        uint8_t table_id = 0;
> +        ovs_be64 cookie = 0;
> +
> +        const struct ofpact *a;
> +        for (a = closure->actions; ; a = ofpact_next(a)) {
> +            if (a == end || a->type == OFPACT_UNROLL_XLATE) {
> +                if (unroll) {
> +                    if (table_id != unroll->rule_table_id) {
> +                        ofpprop_put_u8(msg, NXCPT_TABLE_ID,
> +                                       unroll->rule_table_id);
> +                        table_id = unroll->rule_table_id;
> +                    }
> +                    if (cookie != unroll->rule_cookie) {
> +                        ofpprop_put_be64(msg, NXCPT_COOKIE,
> +                                         unroll->rule_cookie);
> +                        cookie = unroll->rule_cookie;
> +                    }
> +                }
> +
> +                const struct ofpact *start
> +                    = unroll ? ofpact_next(&unroll->ofpact) : closure->actions;
> +                put_actions_property(msg, NXCPT_ACTIONS, version,
> +                                     start, (a - start) * sizeof *a);
> +
> +                if (a == end) {
> +                    break;
> +                }
> +                unroll = ofpact_get_UNROLL_XLATE(a);
> +            }
> +        }

So the above emits runs of actions separated by the unroll_xlate actions, and instead of encoding the separating unroll_xlate action itself, a pair of table_id and cookie properties is encoded (but only when they differ from the previous ones). Is this done to keep unroll_xlate private to OVS, or are there other reasons as well? Maybe the loop could be preceded by a comment explaining all this?
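Something along these lines, perhaps (my wording, not from the patch; adjust as you see fit):

```c
/* 'closure->actions' is a sequence of runs of actions separated by
 * OFPACT_UNROLL_XLATE markers, where each marker records the table_id
 * and cookie of the rule whose actions follow it.  Rather than encoding
 * the (OVS-internal) unroll_xlate action itself, emit each run as an
 * NXCPT_ACTIONS property, preceded by NXCPT_TABLE_ID and NXCPT_COOKIE
 * properties whenever those values change from the previous run. */
```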

> +    }
> +
> +    if (closure->action_set_len) {
> +        size_t start = ofpprop_start(msg, NXCPT_ACTION_SET);
> +        ofpbuf_padto(msg, ROUND_UP(msg->size, 8));
> +        ofpacts_put_openflow_actions(closure->action_set,
> +                                     closure->action_set_len, msg, version);
> +        ofpprop_end(msg, start);
> +    }
> +}
> +
> +struct ofpbuf *
> +ofputil_encode_resume(const struct ofputil_closure *closure,
> +                      const struct ofpbuf *private_properties,
> +                      enum ofputil_protocol protocol)
> +{
> +    enum ofp_version version = ofputil_protocol_to_ofp_version(protocol);
> +    size_t extra = (closure->packet_len
> +                    + NXM_TYPICAL_LEN   /* flow_metadata */
> +                    + private_properties->size);
> +    struct ofpbuf *msg = ofpraw_alloc_xid(OFPRAW_NXT_RESUME, version,
> +                                          0, extra);
> +    ofputil_put_closure(closure, protocol, msg);
> +    ofpbuf_put(msg, private_properties->data, private_properties->size);
> +    ofpmsg_update_length(msg);
> +    return msg;
> +}
> +
> +struct ofpbuf *
> +ofputil_encode_closure_private(const struct ofputil_closure_private *closure,
> +                               enum ofputil_protocol protocol)
> +{
> +    enum ofp_version version = ofputil_protocol_to_ofp_version(protocol);
> +    size_t extra = (closure->public.packet_len
> +                    + NXM_TYPICAL_LEN   /* flow_metadata */
> +                    + closure->n_stack * 16
> +                    + closure->actions_len
> +                    + closure->action_set_len
> +                    + 256);     /* fudge factor */
> +    struct ofpbuf *msg = ofpraw_alloc_xid(OFPRAW_NXT_CLOSURE, version,
> +                                          0, extra);
> +    ofputil_put_closure_private(closure, protocol, msg);
> +    ofpmsg_update_length(msg);
> +    return msg;
> +}
> +
> +static enum ofperr
> +parse_subvalue_prop(const struct ofpbuf *property, union mf_subvalue *sv)
> +{
> +    unsigned int len = ofpbuf_msgsize(property);
> +    if (len > sizeof *sv) {
> +        VLOG_WARN_RL(&bad_ofmsg_rl, "NXCPT_STACK property has bad length %u",
> +                     len);
> +        return OFPERR_OFPBPC_BAD_LEN;
> +    }
> +    memset(sv, 0, sizeof *sv);
> +    memcpy(&sv->u8[sizeof *sv - len], property->msg, len);
> +    return 0;
> +}
> +
> +enum ofperr
> +ofputil_decode_closure(const struct ofp_header *oh,
> +                       struct ofputil_closure *closure,
> +                       struct ofpbuf *private_properties)
> +{
> +    memset(closure, 0, sizeof *closure);
> +
> +    struct ofpbuf properties;
> +    ofpbuf_use_const(&properties, oh, ntohs(oh->length));
> +    ofpraw_pull_assert(&properties);
> +
> +    while (properties.size > 0) {
> +        struct ofpbuf payload;
> +        uint64_t type;
> +
> +        enum ofperr error = ofpprop_pull(&properties, &payload, &type);
> +        if (error) {
> +            return error;
> +        }
> +
> +        if (type >= 0x8000) {
> +            if (private_properties) {
> +                ofpbuf_put(private_properties, payload.data, payload.size);
> +                ofpbuf_padto(private_properties,
> +                             ROUND_UP(private_properties->size, 8));
> +            }
> +            continue;
> +        }
> +
> +        switch (type) {
> +        case NXCPT_PACKET:
> +            free(closure->packet);
> +            closure->packet_len = ofpbuf_msgsize(&payload);
> +            closure->packet = xmemdup(payload.msg, closure->packet_len);
> +            break;
> +
> +        case NXCPT_METADATA:
> +            error = oxm_decode_match(payload.msg, ofpbuf_msgsize(&payload),
> +                                     &closure->metadata);
> +            break;
> +
> +        default:
> +            error = OFPPROP_UNKNOWN(false, "closure", type);
> +            break;
> +        }
> +        if (error) {
> +            ofputil_closure_destroy(closure);
> +            return error;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +static enum ofperr
> +parse_actions_property(struct ofpbuf *property, enum ofp_version version,
> +                       struct ofpbuf *ofpacts)
> +{
> +    if (!ofpbuf_try_pull(property, ROUND_UP(ofpbuf_headersize(property), 8))) {
> +        VLOG_WARN_RL(&bad_ofmsg_rl,
> +                     "actions property has bad length %"PRIuSIZE,
> +                     property->size);

‘property->size’ is a uint32_t, not a size_t, so the format specifier is wrong.

> +        return OFPERR_OFPBPC_BAD_LEN;
> +    }
> +
> +    return ofpacts_pull_openflow_actions(property, property->size,
> +                                         version, ofpacts);
> +}
> +

(snip)

> @@ -1661,24 +1670,44 @@ connmgr_send_async_msg(struct connmgr *mgr,
>     LIST_FOR_EACH (ofconn, node, &mgr->all_conns) {
>         enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
>         if (protocol == OFPUTIL_P_NONE || !rconn_is_connected(ofconn->rconn)
> -            || ofconn->controller_id != am->controller_id
> -            || !ofconn_receives_async_msg(ofconn, am->oam,
> -                                          am->pin.up.reason)) {
> +            || ofconn->controller_id != am->controller_id) {
>             continue;
>         }
> 
> -        struct ofpbuf *msg = ofputil_encode_packet_in(
> -            &am->pin.up, protocol, ofconn->packet_in_format,
> -            am->pin.max_len >= 0 ? am->pin.max_len : ofconn->miss_send_len,
> -            ofconn->pktbuf);
> +        struct ofpbuf *msg;
> +        enum ofconn_scheduler sched;
> +        ofp_port_t port;
> +        if (am->oam == OAM_PACKET_IN) {
> +            if (!ofconn_receives_async_msg(ofconn, am->oam,
> +                                           am->pin.up.reason)) {
> +                continue;
> +            }
> +
> +            uint16_t max_len = (am->pin.max_len >= 0
> +                                ? am->pin.max_len
> +                                : ofconn->miss_send_len);
> +            msg = ofputil_encode_packet_in(&am->pin.up, protocol,
> +                                           ofconn->packet_in_format,
> +                                           max_len, ofconn->pktbuf);
> +            sched = ((am->pin.up.reason == OFPR_NO_MATCH ||
> +                      am->pin.up.reason == OFPR_EXPLICIT_MISS ||
> +                      am->pin.up.reason == OFPR_IMPLICIT_MISS)
> +                     ? SCHED_MISS : SCHED_ACTION);
> +            port = am->pin.up.flow_metadata.flow.in_port.ofp_port;
> +        } else if (am->oam == OAM_CLOSURE) {
> +            if (!ofconn_receives_async_msg(ofconn, am->oam, 0)) {
> +                continue;
> +            }
> +
> +            msg = ofputil_encode_closure_private(&am->closure, protocol);
> +            sched = SCHED_CLOSURE;
> +            port = 0; /* XXX */

Maybe we should just document that closures use port number 0?

> +        } else {
> +            OVS_NOT_REACHED();
> +        }
> 
>         struct ovs_list txq;
> -        bool is_miss = (am->pin.up.reason == OFPR_NO_MATCH ||
> -                        am->pin.up.reason == OFPR_EXPLICIT_MISS ||
> -                        am->pin.up.reason == OFPR_IMPLICIT_MISS);
> -        pinsched_send(ofconn->schedulers[is_miss],
> -                      am->pin.up.flow_metadata.flow.in_port.ofp_port,
> -                      msg, &txq);
> +        pinsched_send(ofconn->schedulers[sched], port, msg, &txq);
>         do_send_packet_ins(ofconn, &txq);
>     }
> }
> @@ -2247,6 +2276,10 @@ ofmonitor_wait(struct connmgr *mgr)
> void
> ofproto_async_msg_free(struct ofproto_async_msg *am)
> {
> -    free(CONST_CAST(void *, am->pin.up.packet));
> +    if (am->oam == OAM_PACKET_IN) {
> +        free(CONST_CAST(void *, am->pin.up.packet));
> +    } else if (am->oam == OAM_CLOSURE) {
> +        ofputil_closure_private_destroy(&am->closure);
> +    }
>     free(am);
> }
> diff --git a/ofproto/connmgr.h b/ofproto/connmgr.h
> index fb7573e..e3c6b59 100644
> --- a/ofproto/connmgr.h
> +++ b/ofproto/connmgr.h
> @@ -66,6 +66,9 @@ struct ofproto_async_msg {
>             struct ofputil_packet_in up;
>             int max_len;                /* From action, or -1 if none. */
>         } pin;
> +
> +        /* OAM_CLOSURE. */
> +        struct ofputil_closure_private closure;
>     };
> };
> void ofproto_async_msg_free(struct ofproto_async_msg *);
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 6e928da..a71f778 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -294,6 +294,7 @@ struct xlate_ctx {
>                                  * executed after recirculation, or -1. */
>     int last_unroll_offset;     /* Offset in 'action_set' to the latest unroll
>                                  * action, or -1. */
> +    const struct ofpact_pause *pause;
> 
>     /* True if a packet was but is no longer MPLS (due to an MPLS pop action).
>      * This is a trigger for recirculation in cases where translating an action
> @@ -3628,6 +3629,35 @@ execute_controller_action(struct xlate_ctx *ctx, int len,
> }
> 
> static void
> +emit_closure(struct xlate_ctx *ctx, const struct recirc_state *state)
> +{
> +    struct ofproto_async_msg *am = xmalloc(sizeof *am);
> +    *am = (struct ofproto_async_msg) {
> +        .controller_id = ctx->pause->controller_id,
> +        .oam = OAM_CLOSURE,
> +        .closure = {
> +            .public = {
> +                .packet = xmemdup(dp_packet_data(ctx->xin->packet),
> +                                  dp_packet_size(ctx->xin->packet)),
> +                .packet_len = dp_packet_size(ctx->xin->packet),
> +            },
> +            .bridge = *ofproto_dpif_get_uuid(ctx->xbridge->ofproto),
> +            .stack = xmemdup(state->stack,
> +                             state->n_stack * sizeof *state->stack),
> +            .n_stack = state->n_stack,
> +            .mirrors = state->mirrors,
> +            .conntracked = state->conntracked,
> +            .actions = xmemdup(state->ofpacts, state->ofpacts_len),
> +            .actions_len = state->ofpacts_len,
> +            .action_set = xmemdup(state->action_set, state->action_set_len),
> +            .action_set_len = state->action_set_len,
> +        }
> +    };
> +    flow_get_metadata(&ctx->xin->flow, &am->closure.public.metadata);
> +    ofproto_dpif_send_async_msg(ctx->xbridge->ofproto, am);
> +}
> +
> +static void
> compose_recirculate_action__(struct xlate_ctx *ctx, uint8_t table)
> {
>     struct recirc_metadata md;
> @@ -3652,20 +3682,26 @@ compose_recirculate_action__(struct xlate_ctx *ctx, uint8_t table)
>         .action_set_len = ctx->recirc_action_offset,
>     };
> 
> -    /* Allocate a unique recirc id for the given metadata state in the
> -     * flow.  An existing id, with a new reference to the corresponding
> -     * recirculation context, will be returned if possible.
> -     * The life-cycle of this recirc id is managed by associating it
> -     * with the udpif key ('ukey') created for each new datapath flow. */
> -    id = recirc_alloc_id_ctx(&state);
> -    if (!id) {
> -        XLATE_REPORT_ERROR(ctx, "Failed to allocate recirculation id");
> -        ctx->error = XLATE_NO_RECIRCULATION_CONTEXT;
> -        return;
> -    }
> -    recirc_refs_add(&ctx->xout->recircs, id);
> +    if (ctx->pause) {
> +        if (ctx->xin->packet) {
> +            emit_closure(ctx, &state);
> +        }
> +    } else {
> +        /* Allocate a unique recirc id for the given metadata state in the
> +         * flow.  An existing id, with a new reference to the corresponding
> +         * recirculation context, will be returned if possible.
> +         * The life-cycle of this recirc id is managed by associating it
> +         * with the udpif key ('ukey') created for each new datapath flow. */
> +        id = recirc_alloc_id_ctx(&state);
> +        if (!id) {
> +            XLATE_REPORT_ERROR(ctx, "Failed to allocate recirculation id");
> +            ctx->error = XLATE_NO_RECIRCULATION_CONTEXT;
> +            return;
> +        }
> +        recirc_refs_add(&ctx->xout->recircs, id);
> 
> -    nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_RECIRC, id);
> +        nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_RECIRC, id);
> +    }

This function’s name could be changed to reflect the fact that it now either emits a closure or composes the recirculation action.

> 
>     /* Undo changes done by recirculation. */
>     ctx->action_set.size = ctx->recirc_action_offset;
> @@ -4178,6 +4214,7 @@ recirc_unroll_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
>         case OFPACT_GROUP:
>         case OFPACT_OUTPUT:
>         case OFPACT_CONTROLLER:
> +        case OFPACT_PAUSE:
>         case OFPACT_DEC_MPLS_TTL:
>         case OFPACT_DEC_TTL:
>             /* These actions may generate asynchronous messages, which include
> @@ -4746,6 +4783,13 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
>             break;
>         }
> 
> +        case OFPACT_PAUSE:
> +            ctx->pause = ofpact_get_PAUSE(a);
> +            ctx->xout->slow |= SLOW_CONTROLLER;
> +            ctx_trigger_recirculation(ctx);

ctx_trigger_recirculation() could be renamed ctx_pause_translation(), which could then lead either to emitting the closure or to recirculation.

> +            a = ofpact_next(a);
> +            break;
> +
>         case OFPACT_EXIT:
>             ctx->exit = true;
>             break;
> @@ -5132,6 +5176,7 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout)
> 
>         .recirc_action_offset = -1,
>         .last_unroll_offset = -1,
> +        .pause = NULL,
> 
>         .was_mpls = false,
>         .conntracked = false,
> @@ -5434,6 +5479,67 @@ exit:
>     return ctx.error;
> }
> 
> +enum ofperr
> +xlate_closure(struct ofproto_dpif *ofproto,
> +              const struct ofputil_closure_private *closure,
> +              struct ofpbuf *odp_actions,
> +              enum slow_path_reason *slow)
> +{
> +    struct dp_packet packet;
> +    dp_packet_use_const(&packet, closure->public.packet,
> +                        closure->public.packet_len);
> +
> +    struct flow flow;
> +    flow_extract(&packet, &flow);
> +
> +    struct xlate_in xin;
> +    xlate_in_init(&xin, ofproto, &flow, 0, NULL, flow.tcp_flags, &packet,
> +                  NULL, odp_actions);
> +
> +    struct ofpact_note noop;
> +    ofpact_init_NOTE(&noop);
> +    noop.length = 0;
> +
> +    bool any_actions = closure->actions_len > 0;
> +    struct recirc_state recirc = {
> +        .table_id = 0,     /* Not the table where NXAST_PAUSE was executed. */
> +        .ofproto_uuid = closure->bridge,
> +        .stack = closure->stack,
> +        .n_stack = closure->n_stack,
> +        .mirrors = closure->mirrors,
> +        .conntracked = closure->conntracked,
> +
> +        /* When there are no actions, xlate_actions() will search the flow
> +         * table.  We don't want it to do that (we want it to resume), so
> +         * supply a no-op action if there aren't any.
> +         *
> +         * (We can't necessarily avoid translating actions entirely if there
> +         * aren't any actions, because there might be some finishing-up to do
> +         * at the end of the pipeline, and we don't check for those
> +         * conditions.)) */

So the part in parentheses is the explanation for the use of the ‘noop’. Would it be correct to state that even if the closure has no actions, it may still have an action set that is executed during the translation? And if the action set is also empty, is the translation still needed?

Also, there is a stray extra closing parenthesis above: ‘))’.

> +        .ofpacts = any_actions ? closure->actions : &noop.ofpact,
> +        .ofpacts_len = any_actions ? closure->actions_len : sizeof noop,
> +
> +        .action_set = closure->action_set,
> +        .action_set_len = closure->action_set_len,
> +    };
> +    recirc_metadata_from_flow(&recirc.metadata,
> +                              &closure->public.metadata.flow);
> +    xin.recirc = &recirc;
> +
> +    struct xlate_out xout;
> +    enum xlate_error error = xlate_actions(&xin, &xout);
> +    *slow = xout.slow;
> +    xlate_out_uninit(&xout);
> +
> +    /* xlate_actions() can generate a number of errors, but only
> +     * XLATE_BRIDGE_NOT_FOUND really stands out to me as one that we should be
> +     * sure to report over OpenFlow.  The others could come up in packet-outs
> +     * or regular flow translation and I don't think that it's going to be too
> +     * useful to report them to the controller. */
> +    return error == XLATE_BRIDGE_NOT_FOUND ? OFPERR_NXR_STALE : 0;
> +}
> +
> /* Sends 'packet' out 'ofport'.
>  * May modify 'packet'.
>  * Returns 0 if successful, otherwise a positive errno value. */

(snip)




