[ovs-dev] [PATCH v4 07/14] Implement serializing the state of packet traversal in "continuations".
Jarno Rajahalme
jarno at ovn.org
Fri Feb 19 23:43:35 UTC 2016
With small comments below:
Acked-by: Jarno Rajahalme <jarno at ovn.org>
> On Feb 19, 2016, at 12:34 AM, Ben Pfaff <blp at ovn.org> wrote:
>
> One purpose of OpenFlow packet-in messages is to allow a controller to
> interpose on the path of a packet through the flow tables. If, for
> example, the controller needs to modify a packet in some way that the
> switch doesn't directly support, the controller should be able to
> program the switch to send it the packet, then modify the packet and
> send it back to the switch to continue through the flow table.
>
> That's the theory. In practice, this doesn't work with any but the
> simplest flow tables. Packet-in messages simply don't include enough
> context to allow the flow table traversal to continue. For example:
>
> * Via "resubmit" actions, an Open vSwitch packet can have an
> effective "call stack", but a packet-in can't describe it, and
> so it would be lost.
>
> * Via "patch ports", an Open vSwitch packet can traverse multiple
> OpenFlow logical switches. A packet-in can't describe or resume
> this context.
>
Is there any context regarding this that needs to be described?
> * A packet-in can't preserve the stack used by NXAST_PUSH and
> NXAST_POP actions.
>
> * A packet-in can't preserve the OpenFlow 1.1+ action set.
>
> * A packet-in can't preserve the state of Open vSwitch mirroring
> or connection tracking.
>
> This commit introduces a solution called "continuations". A continuation
> is the state of a packet's traversal through OpenFlow flow tables. A
> "controller" action with the "pause" flag, which is newly implemented in
> this comit, generates a continuation and sends it to the OpenFlow
"commit"
> controller in a packet-in asynchronous message (only NXT_PACKET_IN2
> supports continuations, so the controller must configure them with
> NXT_SET_PACKET_IN_FORMAT). The controller processes the packet-in,
> possibly modifying some of its data, and sends it back to the switch with
> an NXT_RESUME request, which causes flow table traversal to continue. In
> principle, a single packet can be paused and resumed multiple times.
>
> Another way to look at it is:
>
> - "pause" is an extension of the existing OFPAT_CONTROLLER
> action. It sends the packet to the controller, with full
> pipeline context (some of which is switch implementation
> dependent, and may thus vary from switch to switch).
>
> - A continuation is an extension of OFPT_PACKET_IN, allowing for
> implementation dependent metadata.
>
> - NXT_RESUME is an extension of OFPT_PACKET_OUT, with the
> semantics that the pipeline processing is continued with the
> original translation context from where it was left at the time
> it was paused.
>
> Signed-off-by: Ben Pfaff <blp at ovn.org>
> Acked-by: Jarno Rajahalme <jarno at ovn.org>
> ---
> NEWS | 5 +-
> include/openflow/nicira-ext.h | 96 ++++++++-
> lib/learning-switch.c | 3 +-
> lib/meta-flow.c | 9 +-
> lib/meta-flow.h | 3 +-
> lib/ofp-actions.c | 28 ++-
> lib/ofp-actions.h | 5 +
> lib/ofp-errors.h | 16 +-
> lib/ofp-msgs.h | 4 +
> lib/ofp-print.c | 78 +++++--
> lib/ofp-util.c | 470 ++++++++++++++++++++++++++++++++++++------
> lib/ofp-util.h | 57 ++++-
> lib/rconn.c | 3 +-
> ofproto/connmgr.c | 23 ++-
> ofproto/connmgr.h | 2 +-
> ofproto/fail-open.c | 16 +-
> ofproto/ofproto-dpif-xlate.c | 199 ++++++++++++++----
> ofproto/ofproto-dpif-xlate.h | 4 +
> ofproto/ofproto-dpif.c | 34 +++
> ofproto/ofproto-provider.h | 3 +
> ofproto/ofproto.c | 24 +++
> ovn/TODO | 57 -----
> ovn/controller/pinctrl.c | 4 +-
> tests/ofp-actions.at | 13 +-
> tests/ofp-print.at | 12 ++
> tests/ofproto-dpif.at | 172 ++++++++++++++++
> tests/ofproto-macros.at | 35 +++-
> utilities/ovs-ofctl.8.in | 11 +-
> utilities/ovs-ofctl.c | 109 +++++++---
> 29 files changed, 1239 insertions(+), 256 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 9ab6cae..ba4b7f7 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -6,7 +6,10 @@ Post-v2.5.0
> * OpenFlow 1.1+ OFPT_QUEUE_GET_CONFIG_REQUEST now supports OFPP_ANY.
> * OpenFlow 1.4+ OFPMP_QUEUE_DESC is now supported.
> * New property-based packet-in message format NXT_PACKET_IN2 with support
> - for arbitrary user-provided data.
> + for arbitrary user-provided data and for serializing flow table
> + traversal into a continuation for later resumption.
> + * New extension message NXT_SET_ASYNC_CONFIG2 to allow OpenFlow 1.4-like
> + control over asynchronous messages in earlier versions of OpenFlow.
> - ovs-ofctl:
> * queue-get-config command now allows a queue ID to be specified.
> - DPDK:
> diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
> index 7e56066..77a735d 100644
> --- a/include/openflow/nicira-ext.h
> +++ b/include/openflow/nicira-ext.h
> @@ -260,12 +260,103 @@ struct nx_packet_in {
> };
> OFP_ASSERT(sizeof(struct nx_packet_in) == 24);
>
> -/* NXT_PACKET_IN2.
> +/* NXT_PACKET_IN2
> + * ==============
> *
> * NXT_PACKET_IN2 is conceptually similar to OFPT_PACKET_IN but it is expressed
> * as an extensible set of properties instead of using a fixed structure.
> *
> - * Added in Open vSwitch 2.6. */
> + * Added in Open vSwitch 2.6
> + *
> + *
> + * Continuations
> + * -------------
> + *
> + * When a "controller" action specifies the "pause" flag, the controller action
> + * freezes the packet's trip through Open vSwitch flow tables and serializes
> + * that state into the packet-in message as a "continuation". The controller
> + * can later send the continuation back to the switch, which will restart the
> + * packet's traversal from the point where it was interrupted. This permits an
> + * OpenFlow controller to interpose on a packet midway through processing in
> + * Open vSwitch.
> + *
> + * Continuations fit into packet processing this way:
> + *
> + * 1. A packet ingresses into Open vSwitch, which runs it through the OpenFlow
> + * tables.
> + *
> + * 2. An OpenFlow flow executes a "controller" action that includes the "pause"
> + * flag. Open vSwitch serializes the packet processing state and sends it,
> + * as an NXT_PACKET_IN2 that includes an additional NXPINT_CONTINUATION
> + * property (the continuation), to the OpenFlow controller.
> + *
> + * (The controller must use NXAST_CONTROLLER2 to generate the packet-in,
> + * because only this form of the "controller" action has a "pause" flag.
> + * Similarly, the controller must use NXT_SET_PACKET_IN_FORMAT to select
> + * NXT_PACKET_IN2 as the packet-in format, because this is the only format
> + * that supports continuation passing.)
> + *
> + * 3. The controller receives the NXT_PACKET_IN2 and processes it. The
> + * controller can interpret and, if desired, modify some of the contents of
> + * the packet-in, such as the packet and the metadata being processed.
> + *
> + * 4. The controller sends the continuation back to the switch, using an
> + * NXT_RESUME message. Packet processing resumes where it left off.
> + *
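The four steps above can be pictured with a toy model. This is only a sketch and not Open vSwitch code: every name in it (toy_run, struct toy_continuation, the TOY_* ops) is hypothetical, and the real contents of NXPINT_CONTINUATION are private to the switch. It only shows the idea of saving "where we are" plus the stack, letting the controller adjust metadata, and then continuing from the saved position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy pipeline, for illustration only. */
enum toy_op {
    TOY_ADD,    /* reg += arg */
    TOY_PUSH,   /* push reg onto the stack */
    TOY_POP,    /* pop the stack into reg */
    TOY_PAUSE   /* stop and hand the state to the "controller" */
};

struct toy_action {
    enum toy_op op;
    uint32_t arg;
};

#define TOY_STACK_MAX 8

/* Everything this toy needs in order to restart traversal later. */
struct toy_continuation {
    size_t next_action;             /* index of the next action to run */
    uint32_t stack[TOY_STACK_MAX];  /* push/pop stack */
    size_t depth;                   /* number of valid stack entries */
    uint32_t reg;                   /* metadata the controller may edit */
};

/* Runs actions starting at c->next_action.  Returns true if a TOY_PAUSE
 * was hit (c then holds the state needed to resume), false if the
 * pipeline ran to completion. */
static bool
toy_run(const struct toy_action *actions, size_t n,
        struct toy_continuation *c)
{
    while (c->next_action < n) {
        const struct toy_action *a = &actions[c->next_action++];

        switch (a->op) {
        case TOY_ADD:
            c->reg += a->arg;
            break;
        case TOY_PUSH:
            if (c->depth < TOY_STACK_MAX) {
                c->stack[c->depth++] = c->reg;
            }
            break;
        case TOY_POP:
            if (c->depth > 0) {
                c->reg = c->stack[--c->depth];
            }
            break;
        case TOY_PAUSE:
            return true;
        }
    }
    return false;
}
```

A first call to toy_run() corresponds to steps 1 and 2, editing c->reg between calls corresponds to step 3, and a second call with the same structure corresponds to step 4.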
> + * The controller might change the pipeline configuration concurrently with
> + * steps 2 through 4. For example, it might add or remove OpenFlow flows. If
> + * that happens, then the packet will experience a mix of processing from the
> + * two configurations, that is, the initial processing (before
> + * NXAST_CONTROLLER2) uses the initial flow table, and the later processing
> + * (after NXT_RESUME) uses the later flow table.
Maybe it should be noted here that if the layout of the data pushed to or popped from the stack changes between the pause and the resume, then the resumed packet processing might behave unpredictably. But maybe this is true for pipeline "shape" changes in general.
> + *
> + * External side effects (e.g. "output") of OpenFlow actions processed before
> + * NXAST_CONTROLLER2 is encountered might be executed during step 2 or step 4,
> + * and the details may vary among Open vSwitch features and versions. Thus, a
> + * controller that wants to make sure that side effects are executed must pass
> + * the continuation back to the switch, that is, must not skip step 4.
> + *
> + * Architecturally, continuations may be "stateful" or "stateless", that is,
> + * they may or may not refer to buffered state maintained in Open vSwitch.
> + * This means that a controller should not attempt to resume a given
> + * continuations more than once (because the switch might have discarded the
> + * buffered state after the first use). For the same reason, continuations
> + * might become "stale" if the controller takes too long to resume them
> + * (because the switch might have discarded old buffered state). Taken
> + * together with the previous note, this means that a controller should resume
> + * each continuation exactly once (and promptly).
> + *
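On the controller side, the "exactly once (and promptly)" rule could be enforced with a small bookkeeping table. The sketch below is an assumption about how a controller might do that; it is not part of this patch or of any OVS library. The names (struct tracker, tracker_add, tracker_claim) and the one-second MAX_AGE_MS budget are made up for illustration, and the switch itself does not advertise any particular timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRACKER_SLOTS 16
#define MAX_AGE_MS 1000     /* Assumed staleness budget, not an OVS value. */

/* One continuation that was received but not yet resumed. */
struct pending {
    bool in_use;
    uint64_t id;            /* Controller-chosen key for the packet-in. */
    uint64_t received_ms;   /* When the packet-in arrived. */
};

struct tracker {
    struct pending slots[TRACKER_SLOTS];
};

enum resume_verdict {
    RESUME_OK,              /* Safe to send the resume now. */
    RESUME_UNKNOWN,         /* Never seen, or already claimed once. */
    RESUME_STALE            /* Held too long; the switch may reject it. */
};

/* Records a newly received continuation.  Returns false if 'id' is
 * already pending or the table is full. */
static bool
tracker_add(struct tracker *t, uint64_t id, uint64_t now_ms)
{
    struct pending *free_slot = NULL;

    for (size_t i = 0; i < TRACKER_SLOTS; i++) {
        struct pending *p = &t->slots[i];

        if (p->in_use && p->id == id) {
            return false;
        } else if (!p->in_use && !free_slot) {
            free_slot = p;
        }
    }
    if (!free_slot) {
        return false;
    }
    free_slot->in_use = true;
    free_slot->id = id;
    free_slot->received_ms = now_ms;
    return true;
}

/* Claims 'id' for resumption.  The entry is removed whatever the verdict,
 * so a second claim of the same 'id' yields RESUME_UNKNOWN. */
static enum resume_verdict
tracker_claim(struct tracker *t, uint64_t id, uint64_t now_ms)
{
    for (size_t i = 0; i < TRACKER_SLOTS; i++) {
        struct pending *p = &t->slots[i];

        if (p->in_use && p->id == id) {
            p->in_use = false;
            return (now_ms - p->received_ms > MAX_AGE_MS
                    ? RESUME_STALE : RESUME_OK);
        }
    }
    return RESUME_UNKNOWN;
}
```

A controller would call tracker_add() when the packet-in arrives and send the resume only when tracker_claim() returns RESUME_OK.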
> + * Without the information in NXPINT_CONTINUATION, the controller can (with
> + * careful design, and help from the flow cookie) determine where the packet is
> + * in the pipeline, but in the general case it can't determine what nested
> + * "resubmit"s that may be in progress, or what data is on the stack maintained
> + * by NXAST_STACK_PUSH and NXAST_STACK_POP actions, what is in the OpenFlow
> + * action set, etc.
> + *
> + * Continuations are expensive because they require a round trip between the
> + * switch and the controller. Thus, they should not be used to implement
> + * processing that needs to happen at "line rate".
> + *
> + * The contents of NXPINT_CONTINUATION are private to the switch, may change
> + * unpredictably from one version of Open vSwitch to another, and are not
> + * documented here. The contents are also tied to a given Open vSwitch process
> + * and bridge, so that restarting Open vSwitch or deleting and recreating a
> + * bridge will cause the corresponding NXT_RESUME to be rejected.
> + *
> + * In the current implementation, Open vSwitch forks the packet processing
> + * pipeline across patch ports. Suppose, for example, that the pipeline for
> + * br0 outputs to a patch port whose peer belongs to br1, and that the pipeline
> + * for br1 executes a controller action with the "pause" flag. This only
> + * pauses processing within br1, and processing in br0 continues and possibly
> + * completes with visible side effects, such as outputting to ports, before
> + * br1's controller receives or processes the continuation. This
> + * implementation maintains the independence of separate bridges and, since
> + * processing in br1 cannot affect the behavior of br0 anyway, should not cause
> + * visible behavioral changes.
> + *
> + * A packet-in that includes a continuation always includes the entire packet
> + * and is never buffered.
Does this need to be the case? Does it not contradict the stateful/stateless comment above?
> + */
> enum nx_packet_in2_prop_type {
> /* Packet. */
> NXPINT_PACKET, /* Raw packet data. */
> @@ -280,6 +371,7 @@ enum nx_packet_in2_prop_type {
> NXPINT_REASON, /* uint8_t, one of OFPR_*. */
> NXPINT_METADATA, /* NXM or OXM for metadata fields. */
> NXPINT_USERDATA, /* From NXAST_CONTROLLER2 userdata. */
> + NXPINT_CONTINUATION, /* Private data for continuing processing. */
> };
>
> /* Configures the "role" of the sending controller. The default role is:
>
(snip)