[ovs-dev] [PATCH] Add packet recirculation
Simon Horman
horms at verge.net.au
Thu Mar 28 05:45:44 UTC 2013
On Mon, Mar 25, 2013 at 08:18:02AM +0900, Simon Horman wrote:
> Recirculation is a technique to allow a frame to re-enter
> frame processing. This is intended to be used after actions
> have been applied to the frame with modify the frame in
> some way that makes it possible for richer processing to occur.
>
> An example is and indeed targeted use case is MPLS. If an MPLS frame has an
> mpls_pop action applied with the IPv4 ethernet type then it becomes
> possible to decode the IPv4 portion of the frame. This may be used to
> construct a facet that modifies the IPv4 portion of the frame. This is not
> possible prior to the mpls_pop action as the contents of the frame after
> the MPLS stack is not known to be IPv4.
>
> Design:
> * New recirculation action.
>
> ovs-vswitchd adds a recirculation action to the end of a list of
> datapath actions for a flow when the actions are truncated because
> insufficient flow match information is available to add the next
> OpenFlow action. The recirculation action is preceded by an action
> to set the skb_mark to an id which can be used to scope a facet lookup
> of a recirculated packet.
>
> e.g. pop_mpls(0x0800),dec_ttl becomes pop_mpls(0x800),set(skb_mark(id)),recirculate
>
> * Datapath behaviour
>
> Then the datapath encounters a recirculate action it:
> + Recalculates the flow key based on the packet
> which will typically have been modified by previous actions
> + As the recirculate action is preceded by a set(skb_mark(id)) action,
> the new match key will now include skb_mark=id.
> + Performs a lookup using the new match key
> + Processes the packet if a facet matches the key or;
> + Makes an upcall if necessary
>
> Note that the implementation corresponding to the design above
> currently requires facets to be used. It does not support packet out.
>
> Signed-off-by: Simon Horman <horms at verge.net.au>
Hi Jesse,
a gentle nudge for a review of this.
I believe it addresses all of the issues you raised in regards to rfc1.
> ---
>
> This patch depends on "[PATCH v2.23] datapath: Add basic MPLS support to kernel".
> There are currently no other patches in the recirculation series.
>
> For reference this patch is available in git at:
> git://github.com/horms/openvswitch.git devel/mpls-recirculate.rfc2
>
> TODO:
>
> * More sensible handling of recirculation IDs [ovs-vswtichd]
> * More sensible lookup of facets based on recirculation IDs [ovs-vswtichd]
> * Look more closely at facet revalidation [ovs-vswtichd]
> * Handle recirculation for packet out
> * Do not force misses all MPLS frame misses to be handled with facets.
> Instead either more accurately detect if recirculation will occur
> or allow recirculation to occur without facets.
>
> rfc2
> * As suggested by Jesse Gross
> - Update for changes to ovs_dp_process_received_packet()
> to no longer check if OVS_CB(skb)->flow is pre-initialised.
> - Do not add spurious printk debugging to ovs_execute_actions()
> - Do not add spurious debugging messages to commit_set_nw_action()
> - Correct typo in comment above commit_odp_actions().
> - Do not execute recirculation in ovs-vswitchd, rather allow
> the datapath to make an upcall when a recirculation action
> is encountered on execute.
> + This implicitly breaks support for recirculation without facets,
> so for now force all misses of MPLS frames to be handled with
> a facet; and treat handling of recirculation for packet_out as
> a todo item.
> - Use skb_mark for recirculation_id in match. This avoids
> both expanding the match and including a recirculation_id parameter
> with the recirculation action: set_skb_mark should be used before
> the recirculation action.
> - Tidy up ownership of skb in ovs_execute_actions
>
> rfc1
> * Initial post
>
> ---
> datapath/actions.c | 9 +-
> datapath/datapath.c | 98 +++++++++++-------
> datapath/datapath.h | 2 +-
> include/linux/openvswitch.h | 4 +
> lib/dpif-netdev.c | 89 +++++++++++-----
> lib/flow.h | 3 +
> lib/odp-util.c | 15 ++-
> lib/odp-util.h | 1 +
> ofproto/ofproto-dpif.c | 236 +++++++++++++++++++++++++++++++++++++------
> 9 files changed, 359 insertions(+), 98 deletions(-)
>
> diff --git a/datapath/actions.c b/datapath/actions.c
> index be2b8a3..e69fcb1 100644
> --- a/datapath/actions.c
> +++ b/datapath/actions.c
> @@ -609,6 +609,9 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> case OVS_ACTION_ATTR_SAMPLE:
> err = sample(dp, skb, a);
> break;
> +
> + case OVS_ACTION_ATTR_RECIRCULATE:
> + return 1;
> }
>
> if (unlikely(err)) {
> @@ -649,7 +652,7 @@ static int loop_suppress(struct datapath *dp, struct sw_flow_actions *actions)
> }
>
> /* Execute a list of actions against 'skb'. */
> -int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb)
> +struct sk_buff *ovs_execute_actions(struct datapath *dp, struct sk_buff *skb)
> {
> struct sw_flow_actions *acts = rcu_dereference(OVS_CB(skb)->flow->sf_acts);
> struct loop_counter *loop;
> @@ -668,6 +671,8 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb)
> OVS_CB(skb)->tun_key = NULL;
> error = do_execute_actions(dp, skb, acts->actions,
> acts->actions_len, false);
> + if (likely(error <= 0))
> + skb = NULL;
>
> /* Check whether sub-actions looped too much. */
> if (unlikely(loop->looping))
> @@ -678,5 +683,5 @@ out_loop:
> if (!--loop->count)
> loop->looping = false;
>
> - return error;
> + return (error < 0) ? ERR_PTR(error) : skb;
> }
> diff --git a/datapath/datapath.c b/datapath/datapath.c
> index 9c30168..fb92f08 100644
> --- a/datapath/datapath.c
> +++ b/datapath/datapath.c
> @@ -195,52 +195,63 @@ void ovs_dp_detach_port(struct vport *p)
> ovs_vport_del(p);
> }
>
> +#define MAX_RECIRCULATION_DEPTH 4 /* Completely arbitrary */
> +
> /* Must be called with rcu_read_lock. */
> void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb)
> {
> struct datapath *dp = p->dp;
> - struct sw_flow *flow;
> struct dp_stats_percpu *stats;
> - struct sw_flow_key key;
> - u64 *stats_counter;
> - int error;
> - int key_len;
> + int limit = MAX_RECIRCULATION_DEPTH;
>
> stats = this_cpu_ptr(dp->stats_percpu);
>
> - /* Extract flow from 'skb' into 'key'. */
> - error = ovs_flow_extract(skb, p->port_no, &key, &key_len);
> - if (unlikely(error)) {
> - kfree_skb(skb);
> - return;
> - }
> + while (1) {
> + u64 *stats_counter;
> + struct sw_flow *flow;
> + struct sw_flow_key key;
> + int error, key_len;
>
> - /* Look up flow. */
> - flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table), &key, key_len);
> - if (unlikely(!flow)) {
> - struct dp_upcall_info upcall;
> -
> - upcall.cmd = OVS_PACKET_CMD_MISS;
> - upcall.key = &key;
> - upcall.userdata = NULL;
> - upcall.portid = p->upcall_portid;
> - ovs_dp_upcall(dp, skb, &upcall);
> - consume_skb(skb);
> - stats_counter = &stats->n_missed;
> - goto out;
> - }
> + /* Extract flow from 'skb' into 'key'. */
> + error = ovs_flow_extract(skb, p->port_no, &key, &key_len);
> + if (unlikely(error)) {
> + kfree_skb(skb);
> + return;
> + }
>
> - OVS_CB(skb)->flow = flow;
> + /* Look up flow. */
> + flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table),
> + &key, key_len);
> + if (unlikely(!flow)) {
> + struct dp_upcall_info upcall;
> +
> + upcall.cmd = OVS_PACKET_CMD_MISS;
> + upcall.key = &key;
> + upcall.userdata = NULL;
> + upcall.portid = p->upcall_portid;
> + ovs_dp_upcall(dp, skb, &upcall);
> + consume_skb(skb);
> + stats_counter = &stats->n_missed;
> + skb = NULL;
> + } else {
> + OVS_CB(skb)->flow = flow;
> + stats_counter = &stats->n_hit;
> + ovs_flow_used(flow, skb);
> + skb = ovs_execute_actions(dp, skb);
> + }
>
> - stats_counter = &stats->n_hit;
> - ovs_flow_used(OVS_CB(skb)->flow, skb);
> - ovs_execute_actions(dp, skb);
> + /* Update datapath statistics. */
> + u64_stats_update_begin(&stats->sync);
> + (*stats_counter)++;
> + u64_stats_update_end(&stats->sync);
>
> -out:
> - /* Update datapath statistics. */
> - u64_stats_update_begin(&stats->sync);
> - (*stats_counter)++;
> - u64_stats_update_end(&stats->sync);
> + if (likely(!skb) || IS_ERR(skb)) {
> + break;
> + } else if (unlikely(!limit--)) {
> + kfree_skb(skb);
> + return;
> + }
> + }
> }
>
> static struct genl_family dp_packet_genl_family = {
> @@ -730,6 +741,7 @@ static int validate_and_copy_actions(const struct nlattr *attr,
> [OVS_ACTION_ATTR_POP_MPLS] = sizeof(__be16),
> [OVS_ACTION_ATTR_PUSH_VLAN] = sizeof(struct ovs_action_push_vlan),
> [OVS_ACTION_ATTR_POP_VLAN] = 0,
> + [OVS_ACTION_ATTR_RECIRCULATE] = 0,
> [OVS_ACTION_ATTR_SET] = (u32)-1,
> [OVS_ACTION_ATTR_SAMPLE] = (u32)-1
> };
> @@ -808,6 +820,9 @@ static int validate_and_copy_actions(const struct nlattr *attr,
> skip_copy = true;
> break;
>
> + case OVS_ACTION_ATTR_RECIRCULATE:
> + break;
> +
> default:
> return -EINVAL;
> }
> @@ -905,12 +920,23 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
> goto err_unlock;
>
> local_bh_disable();
> - err = ovs_execute_actions(dp, packet);
> + packet = ovs_execute_actions(dp, packet);
> + if (unlikely(packet) && !IS_ERR(packet)) {
> + struct vport *vport;
> + vport = ovs_lookup_vport(dp, flow->key.phy.in_port);
> + if (!vport) {
> + err = -ENODEV;
> + goto err_unlock;
> + }
> + /* Recirculate */
> + ovs_dp_process_received_packet(vport, packet);
> + packet = NULL;
> + }
> local_bh_enable();
> rcu_read_unlock();
>
> ovs_flow_free(flow);
> - return err;
> + return PTR_ERR(packet);
>
> err_unlock:
> rcu_read_unlock();
> diff --git a/datapath/datapath.h b/datapath/datapath.h
> index 7665742..8da5e8a 100644
> --- a/datapath/datapath.h
> +++ b/datapath/datapath.h
> @@ -188,7 +188,7 @@ const char *ovs_dp_name(const struct datapath *dp);
> struct sk_buff *ovs_vport_cmd_build_info(struct vport *, u32 portid, u32 seq,
> u8 cmd);
>
> -int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb);
> +struct sk_buff *ovs_execute_actions(struct datapath *dp, struct sk_buff *skb);
>
> unsigned char *skb_cb_mpls_stack(const struct sk_buff *skb);
> #endif /* datapath.h */
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index 0dd3ee4..abb4e00 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -513,6 +513,9 @@ struct ovs_action_push_vlan {
> * indicate the new packet contents This could potentially still be
> * %ETH_P_MPLS_* if the resulting MPLS label stack is not empty. If there
> * is no MPLS label stack, as determined by ethertype, no action is taken.
> + * @OVS_ACTION_ATTR_RECIRCULATE: Restart processing of packet.
> + * The packet must have been modified by a previous action in such a way
> + * that it does not match its original flow again.
> *
> * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all
> * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
> @@ -529,6 +532,7 @@ enum ovs_action_attr {
> OVS_ACTION_ATTR_SAMPLE, /* Nested OVS_SAMPLE_ATTR_*. */
> OVS_ACTION_ATTR_PUSH_MPLS, /* struct ovs_action_push_mpls. */
> OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */
> + OVS_ACTION_ATTR_RECIRCULATE, /* No argument */
> __OVS_ACTION_ATTR_MAX
> };
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e4a2f75..31255f6 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -152,10 +152,14 @@ static int dpif_netdev_open(const struct dpif_class *, const char *name,
> static int dp_netdev_output_userspace(struct dp_netdev *, const struct ofpbuf *,
> int queue_no, const struct flow *,
> const struct nlattr *userdata);
> -static void dp_netdev_execute_actions(struct dp_netdev *,
> +static bool dp_netdev_execute_actions(struct dp_netdev *,
> struct ofpbuf *, struct flow *,
> const struct nlattr *actions,
> - size_t actions_len);
> + size_t actions_len,
> + uint32_t *skb_mark);
> +static void dp_netdev_port_input(struct dp_netdev *dp,
> + struct dp_netdev_port *port,
> + struct ofpbuf *packet);
>
> static struct dpif_netdev *
> dpif_netdev_cast(const struct dpif *dpif)
> @@ -940,8 +944,22 @@ dpif_netdev_execute(struct dpif *dpif, const struct dpif_execute *execute)
> error = dpif_netdev_flow_from_nlattrs(execute->key, execute->key_len,
> &key);
> if (!error) {
> - dp_netdev_execute_actions(dp, ©, &key,
> - execute->actions, execute->actions_len);
> + bool recirculate;
> + uint32_t skb_mark = 0;
> +
> + recirculate = dp_netdev_execute_actions(dp, ©, &key,
> + execute->actions,
> + execute->actions_len,
> + &skb_mark);
> + if (recirculate) {
> + struct dp_netdev_port *port;
> + port = (key.in_port < MAX_PORTS) ? dp->ports[key.in_port] : NULL;
> + if (port) {
> + dp_netdev_port_input(dp, port, ©);
> + return 0;
> + }
> + error = ENOENT;
> + }
> }
>
> ofpbuf_uninit(©);
> @@ -1028,23 +1046,32 @@ static void
> dp_netdev_port_input(struct dp_netdev *dp, struct dp_netdev_port *port,
> struct ofpbuf *packet)
> {
> - struct dp_netdev_flow *flow;
> - struct flow key;
> + bool recirculate;
> + uint32_t skb_mark = 0;
> + int limit = MAX_RECIRCULATION_DEPTH;
>
> - if (packet->size < ETH_HEADER_LEN) {
> - return;
> - }
> - flow_extract(packet, 0, 0, NULL, port->port_no, &key);
> - flow = dp_netdev_lookup_flow(dp, &key);
> - if (flow) {
> - dp_netdev_flow_used(flow, packet);
> - dp_netdev_execute_actions(dp, packet, &key,
> - flow->actions, flow->actions_len);
> - dp->n_hit++;
> - } else {
> - dp->n_missed++;
> - dp_netdev_output_userspace(dp, packet, DPIF_UC_MISS, &key, NULL);
> - }
> + do {
> + struct dp_netdev_flow *flow;
> + struct flow key;
> +
> + if (packet->size < ETH_HEADER_LEN) {
> + return;
> + }
> + flow_extract(packet, 0, skb_mark, NULL, port->port_no, &key);
> + flow = dp_netdev_lookup_flow(dp, &key);
> + if (flow) {
> + dp_netdev_flow_used(flow, packet);
> + recirculate = dp_netdev_execute_actions(dp, packet, &key,
> + flow->actions,
> + flow->actions_len,
> + &skb_mark);
> + dp->n_hit++;
> + } else {
> + dp->n_missed++;
> + dp_netdev_output_userspace(dp, packet, DPIF_UC_MISS, &key, NULL);
> + recirculate = false;
> + }
> + } while (recirculate && limit--);
> }
>
> static void
> @@ -1163,6 +1190,7 @@ dp_netdev_sample(struct dp_netdev *dp,
> const struct nlattr *subactions = NULL;
> const struct nlattr *a;
> size_t left;
> + uint32_t skb_mark;
>
> NL_NESTED_FOR_EACH_UNSAFE (a, left, action) {
> int type = nl_attr_type(a);
> @@ -1186,7 +1214,7 @@ dp_netdev_sample(struct dp_netdev *dp,
> }
>
> dp_netdev_execute_actions(dp, packet, key, nl_attr_get(subactions),
> - nl_attr_get_size(subactions));
> + nl_attr_get_size(subactions), &skb_mark);
> }
>
> static void
> @@ -1201,7 +1229,8 @@ dp_netdev_action_userspace(struct dp_netdev *dp,
> }
>
> static void
> -execute_set_action(struct ofpbuf *packet, const struct nlattr *a)
> +execute_set_action(struct ofpbuf *packet, const struct nlattr *a,
> + uint32_t *skb_mark)
> {
> enum ovs_key_attr type = nl_attr_type(a);
> const struct ovs_key_ipv4 *ipv4_key;
> @@ -1211,11 +1240,14 @@ execute_set_action(struct ofpbuf *packet, const struct nlattr *a)
>
> switch (type) {
> case OVS_KEY_ATTR_PRIORITY:
> - case OVS_KEY_ATTR_SKB_MARK:
> case OVS_KEY_ATTR_TUNNEL:
> /* not implemented */
> break;
>
> + case OVS_KEY_ATTR_SKB_MARK:
> + *skb_mark = nl_attr_get_u32(a);
> + break;
> +
> case OVS_KEY_ATTR_ETHERNET:
> dp_netdev_set_dl(packet,
> nl_attr_get_unspec(a, sizeof(struct ovs_key_ethernet)));
> @@ -1263,11 +1295,11 @@ execute_set_action(struct ofpbuf *packet, const struct nlattr *a)
> }
> }
>
> -static void
> +static bool
> dp_netdev_execute_actions(struct dp_netdev *dp,
> struct ofpbuf *packet, struct flow *key,
> const struct nlattr *actions,
> - size_t actions_len)
> + size_t actions_len, uint32_t *skb_mark)
> {
> const struct nlattr *a;
> unsigned int left;
> @@ -1305,18 +1337,23 @@ dp_netdev_execute_actions(struct dp_netdev *dp,
> break;
>
> case OVS_ACTION_ATTR_SET:
> - execute_set_action(packet, nl_attr_get(a));
> + execute_set_action(packet, nl_attr_get(a), skb_mark);
> break;
>
> case OVS_ACTION_ATTR_SAMPLE:
> dp_netdev_sample(dp, packet, key, a);
> break;
>
> + case OVS_ACTION_ATTR_RECIRCULATE:
> + return true;
> +
> case OVS_ACTION_ATTR_UNSPEC:
> case __OVS_ACTION_ATTR_MAX:
> NOT_REACHED();
> }
> }
> +
> + return false;
> }
>
> const struct dpif_class dpif_netdev_class = {
> diff --git a/lib/flow.h b/lib/flow.h
> index 6e169d6..66f89e3 100644
> --- a/lib/flow.h
> +++ b/lib/flow.h
> @@ -296,4 +296,7 @@ uint32_t minimask_hash(const struct minimask *, uint32_t basis);
> bool minimask_has_extra(const struct minimask *, const struct minimask *);
> bool minimask_is_catchall(const struct minimask *);
>
> +#define MAX_RECIRCULATION_DEPTH 4 /* Completely arbitrary value to
> + * guard against infinite loops */
> +
> #endif /* flow.h */
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 1a3e386..8565269 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -75,6 +75,7 @@ odp_action_len(uint16_t type)
> case OVS_ACTION_ATTR_POP_VLAN: return 0;
> case OVS_ACTION_ATTR_PUSH_MPLS: return sizeof(struct ovs_action_push_mpls);
> case OVS_ACTION_ATTR_POP_MPLS: return sizeof(ovs_be16);
> + case OVS_ACTION_ATTR_RECIRCULATE: return 0;
> case OVS_ACTION_ATTR_SET: return -2;
> case OVS_ACTION_ATTR_SAMPLE: return -2;
>
> @@ -376,6 +377,10 @@ format_odp_action(struct ds *ds, const struct nlattr *a)
> ds_put_format(ds, "pop_mpls(eth_type=0x%"PRIx16")", ntohs(ethertype));
> break;
> }
> + case OVS_ACTION_ATTR_RECIRCULATE: {
> + ds_put_format(ds, "recirculate");
> + break;
> + }
> case OVS_ACTION_ATTR_SAMPLE:
> format_odp_sample_action(ds, a);
> break;
> @@ -2171,6 +2176,12 @@ commit_odp_tunnel_action(const struct flow *flow, struct flow *base,
> }
> }
>
> +void
> +commit_odp_recirculate_action(struct ofpbuf *odp_actions)
> +{
> + nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_RECIRCULATE);
> +}
> +
> static void
> commit_set_ether_addr_action(const struct flow *flow, struct flow *base,
> struct ofpbuf *odp_actions)
> @@ -2384,14 +2395,14 @@ commit_set_skb_mark_action(const struct flow *flow, struct flow *base,
> return;
> }
> base->skb_mark = flow->skb_mark;
> -
> odp_put_skb_mark_action(base->skb_mark, odp_actions);
> }
> /* If any of the flow key data that ODP actions can modify are different in
> * 'base' and 'flow', appends ODP actions to 'odp_actions' that change the flow
> * key from 'base' into 'flow', and then changes 'base' the same way. Does not
> * commit set_tunnel actions. Users should call commit_odp_tunnel_action()
> - * in addition to this function if needed. */
> + * and commit_odp_recirculate_action() in addition to those functions are
> + * needed. */
> void
> commit_odp_actions(const struct flow *flow, struct flow *base,
> struct ofpbuf *odp_actions)
> diff --git a/lib/odp-util.h b/lib/odp-util.h
> index ad0fb30..da62aa5 100644
> --- a/lib/odp-util.h
> +++ b/lib/odp-util.h
> @@ -115,6 +115,7 @@ const char *odp_key_fitness_to_string(enum odp_key_fitness);
>
> void commit_odp_tunnel_action(const struct flow *, struct flow *base,
> struct ofpbuf *odp_actions);
> +void commit_odp_recirculate_action(struct ofpbuf *odp_actions);
> void commit_odp_actions(const struct flow *, struct flow *base,
> struct ofpbuf *odp_actions);
>
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 6e9023a..226eda6 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -119,7 +119,8 @@ static struct rule_dpif *rule_dpif_miss_rule(struct ofproto_dpif *ofproto,
>
> static void rule_credit_stats(struct rule_dpif *,
> const struct dpif_flow_stats *);
> -static void flow_push_stats(struct facet *, const struct dpif_flow_stats *);
> +static void flow_push_stats(struct facet *, const struct dpif_flow_stats *,
> + const struct ofpact *, size_t ofpacts_len);
> static tag_type rule_calculate_tag(const struct flow *,
> const struct minimask *, uint32_t basis);
> static void rule_invalidate(const struct rule_dpif *);
> @@ -276,6 +277,16 @@ struct action_xlate_ctx {
> uint16_t nf_output_iface; /* Output interface index for NetFlow. */
> mirror_mask_t mirrors; /* Bitmap of associated mirrors. */
>
> + size_t ofpacts_len; /* The number of bytes of the ofpacts
> + * argument to xlate_actions() processed
> + * by it. This is used to calculate an
> + * offset into ofpacts for calls to
> + * xlate_actions on recirculated packets */
> +
> + uint32_t recirculation_id; /* skb_mark to use to identify recirculation.
> + * Zero if the context does not add a
> + * recirculate action. */
> +
> /* xlate_actions() initializes and uses these members, but the client has no
> * reason to look at them. */
>
> @@ -312,7 +323,8 @@ static void action_xlate_ctx_init(struct action_xlate_ctx *,
> struct ofproto_dpif *, const struct flow *,
> const struct initial_vals *initial_vals,
> struct rule_dpif *,
> - uint8_t tcp_flags, const struct ofpbuf *);
> + uint8_t tcp_flags, const struct ofpbuf *,
> + uint32_t recirculation_id);
> static void xlate_actions(struct action_xlate_ctx *,
> const struct ofpact *ofpacts, size_t ofpacts_len,
> struct ofpbuf *odp_actions);
> @@ -491,13 +503,38 @@ struct facet {
> * always be valid, since it could have been removed after newer
> * subfacets were pushed onto the 'subfacets' list.) */
> struct subfacet one_subfacet;
> +
> + const struct ofpact *ofpacts; /* ofpacts for this facet.
> + * Will differ from rule->up.ofpacts
> + * if facet is for a recirculated packet. */
> + size_t ofpacts_len; /* ofpacts_len for this facet
> + * Will differ from * rule->up.ofpacts_len
> + * if facet is for a recirculated packet. */
> +
> + uint32_t recirculation_id; /* Recirculation id.
> + * Non-sero for a facet
> + * that recirculates packets;
> + * used as the value of flow.skb_mark
> + * in the facet of recirculated packets.
> + * Zero otherwise. */
> + const struct ofpact *recirculation_ofpacts;
> + /* ofpacts for facets of packets
> + * recirculated by this facet */
> + size_t recirculation_ofpacts_len;
> + /* ofpacts_len for facets of packets
> + * recirculated by this facet */
> +
> + bool recirculated; /* Facet of a recirculated packet? */
> +
> };
>
> -static struct facet *facet_create(struct rule_dpif *,
> - const struct flow *, uint32_t hash);
> +static struct facet *facet_create(struct rule_dpif *, const struct flow *,
> + const struct ofpact *, size_t ofpacts_len,
> + bool recirculated, uint32_t hash);
> static void facet_remove(struct facet *);
> static void facet_free(struct facet *);
>
> +static struct facet *facet_find_by_id(struct ofproto_dpif *, ovs_be32 id);
> static struct facet *facet_find(struct ofproto_dpif *,
> const struct flow *, uint32_t hash);
> static struct facet *facet_lookup_valid(struct ofproto_dpif *,
> @@ -3434,6 +3471,15 @@ static bool
> flow_miss_should_make_facet(struct ofproto_dpif *ofproto,
> struct flow_miss *miss, uint32_t hash)
> {
> + /* A facet is currently required to handle recirculation.
> + * There currently isn't a good way to detect if recirculation will
> + * occur or not. So in the mean time assume that it can't occur
> + * for non-MPLS packets and it may occur for MPLS packets
> + */
> + if (eth_type_mpls(miss->flow.dl_type)) {
> + return true;
> + }
> +
> if (!ofproto->governor) {
> size_t n_subfacets;
>
> @@ -3449,12 +3495,26 @@ flow_miss_should_make_facet(struct ofproto_dpif *ofproto,
> list_size(&miss->packets));
> }
>
> +/* XXX: This is just a stub */
> +static uint32_t get_recirculate_id(void)
> +{
> + static uint32_t id = 0;
> + /* Skip 0, it is reserved as a null value */
> + if (!id)
> + id++;
> + /* Skip IPSEC_MARK bit it is reserved */
> + if (id & IPSEC_MARK)
> + id++;
> + ovs_assert(!(id & IPSEC_MARK));
> + return id++;
> +}
> +
> /* Handles 'miss', which matches 'rule', without creating a facet or subfacet
> * or creating any datapath flow. May add an "execute" operation to 'ops' and
> * increment '*n_ops'. */
> static void
> -handle_flow_miss_without_facet(struct flow_miss *miss,
> - struct rule_dpif *rule,
> +handle_flow_miss_without_facet(struct flow_miss *miss, struct rule_dpif *rule,
> + const struct ofpact *ofpacts, size_t ofpacts_len,
> struct flow_miss_op *ops, size_t *n_ops)
> {
> struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto);
> @@ -3475,10 +3535,9 @@ handle_flow_miss_without_facet(struct flow_miss *miss,
> rule_credit_stats(rule, &stats);
>
> action_xlate_ctx_init(&ctx, ofproto, &miss->flow,
> - &miss->initial_vals, rule, 0, packet);
> + &miss->initial_vals, rule, 0, packet, 0);
> ctx.resubmit_stats = &stats;
> - xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len,
> - &odp_actions);
> + xlate_actions(&ctx, ofpacts, ofpacts_len, &odp_actions);
>
> if (odp_actions.size) {
> struct dpif_execute *execute = &op->dpif_op.u.execute;
> @@ -3592,14 +3651,30 @@ handle_flow_miss(struct flow_miss *miss, struct flow_miss_op *ops,
>
> facet = facet_lookup_valid(ofproto, &miss->flow, hash);
> if (!facet) {
> - struct rule_dpif *rule = rule_dpif_lookup(ofproto, &miss->flow);
> + struct rule_dpif *rule;
> + const struct ofpact *ofpacts;
> + size_t ofpacts_len;
> + struct facet *parent_facet;
> +
> + parent_facet = facet_find_by_id(ofproto, miss->flow.skb_mark);
> + if (parent_facet) {
> + rule = parent_facet->rule;
> + ofpacts = parent_facet->recirculation_ofpacts;
> + ofpacts_len = parent_facet->recirculation_ofpacts_len;
> + } else {
> + rule = rule_dpif_lookup(ofproto, &miss->flow);
> + ofpacts = rule->up.ofpacts;
> + ofpacts_len = rule->up.ofpacts_len;
> + }
>
> if (!flow_miss_should_make_facet(ofproto, miss, hash)) {
> - handle_flow_miss_without_facet(miss, rule, ops, n_ops);
> + handle_flow_miss_without_facet(miss, rule, ofpacts,
> + ofpacts_len, ops, n_ops);
> return;
> }
>
> - facet = facet_create(rule, &miss->flow, hash);
> + facet = facet_create(rule, &miss->flow, ofpacts, ofpacts_len,
> + parent_facet != NULL, hash);
> now = facet->used;
> } else {
> now = time_msec();
> @@ -4345,6 +4420,28 @@ rule_expire(struct rule_dpif *rule)
>
> /* Facets. */
>
> +/* XXX: This is just a stub */
> +static struct facet *
> +facet_find_by_id(struct ofproto_dpif *ofproto, uint32_t id)
> +{
> + struct facet *facet;
> +
> + /* id=0 is never used */
> + if (!id)
> + return NULL;
> +
> + /* This is a ridiculous way to look things up, most likely the id
> + * should be cooked somehow to allow a more efficient lookup.
> + */
> + HMAP_FOR_EACH(facet, hmap_node, &ofproto->facets) {
> + if (facet->recirculation_id == id) {
> + return facet;
> + }
> + }
> +
> + return NULL;
> +}
> +
> /* Creates and returns a new facet owned by 'rule', given a 'flow'.
> *
> * The caller must already have determined that no facet with an identical
> @@ -4356,7 +4453,9 @@ rule_expire(struct rule_dpif *rule)
> * The facet will initially have no subfacets. The caller should create (at
> * least) one subfacet with subfacet_create(). */
> static struct facet *
> -facet_create(struct rule_dpif *rule, const struct flow *flow, uint32_t hash)
> +facet_create(struct rule_dpif *rule, const struct flow *flow,
> + const struct ofpact *ofpacts, size_t ofpacts_len,
> + bool recirculated, uint32_t hash)
> {
> struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto);
> struct facet *facet;
> @@ -4367,6 +4466,9 @@ facet_create(struct rule_dpif *rule, const struct flow *flow, uint32_t hash)
> list_push_back(&rule->facets, &facet->list_node);
> facet->rule = rule;
> facet->flow = *flow;
> + facet->ofpacts = ofpacts;
> + facet->ofpacts_len = ofpacts_len;
> + facet->recirculated = recirculated;
> list_init(&facet->subfacets);
> netflow_flow_init(&facet->nf_flow);
> netflow_flow_update_time(ofproto->netflow, &facet->nf_flow, facet->used);
> @@ -4456,10 +4558,10 @@ facet_learn(struct facet *facet)
>
> action_xlate_ctx_init(&ctx, ofproto, &facet->flow,
> &subfacet->initial_vals,
> - facet->rule, facet->tcp_flags, NULL);
> + facet->rule, facet->tcp_flags, NULL,
> + facet->recirculation_id);
> ctx.may_learn = true;
> - xlate_actions_for_side_effects(&ctx, facet->rule->up.ofpacts,
> - facet->rule->up.ofpacts_len);
> + xlate_actions_for_side_effects(&ctx, facet->ofpacts, facet->ofpacts_len);
> }
>
> static void
> @@ -4682,6 +4784,12 @@ facet_check_consistency(struct facet *facet)
> bool may_log = false;
> bool ok;
>
> + if (facet->recirculated) {
> + /* XXX: This is a hack and needs to be replaced with facet
> + * validation for facets of recirculated packets */
> + return true;
> + }
> +
> /* Check the rule for consistency. */
> rule = rule_dpif_lookup(ofproto, &facet->flow);
> ok = rule == facet->rule;
> @@ -4713,7 +4821,8 @@ facet_check_consistency(struct facet *facet)
> struct ds s;
>
> action_xlate_ctx_init(&ctx, ofproto, &facet->flow,
> - &subfacet->initial_vals, rule, 0, NULL);
> + &subfacet->initial_vals, rule, 0, NULL,
> + facet->recirculation_id);
> xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len,
> &odp_actions);
>
> @@ -4844,7 +4953,8 @@ facet_revalidate(struct facet *facet)
> enum slow_path_reason slow;
>
> action_xlate_ctx_init(&ctx, ofproto, &facet->flow,
> - &subfacet->initial_vals, new_rule, 0, NULL);
> + &subfacet->initial_vals, new_rule, 0, NULL,
> + facet->recirculation_id);
> xlate_actions(&ctx, new_rule->up.ofpacts, new_rule->up.ofpacts_len,
> &odp_actions);
>
> @@ -4942,11 +5052,13 @@ facet_push_stats(struct facet *facet)
> stats.tcp_flags = 0;
>
> if (stats.n_packets || stats.n_bytes || facet->used > facet->prev_used) {
> +
> facet->prev_packet_count = facet->packet_count;
> facet->prev_byte_count = facet->byte_count;
> facet->prev_used = facet->used;
>
> - flow_push_stats(facet, &stats);
> + flow_push_stats(facet, &stats,
> + facet->ofpacts, facet->ofpacts_len);
>
> update_mirror_stats(ofproto_dpif_cast(facet->rule->up.ofproto),
> facet->mirrors, stats.n_packets, stats.n_bytes);
> @@ -4964,7 +5076,8 @@ rule_credit_stats(struct rule_dpif *rule, const struct dpif_flow_stats *stats)
> /* Pushes flow statistics to the rules which 'facet->flow' resubmits
> * into given 'facet->rule''s actions and mirrors. */
> static void
> -flow_push_stats(struct facet *facet, const struct dpif_flow_stats *stats)
> +flow_push_stats(struct facet *facet, const struct dpif_flow_stats *stats,
> + const struct ofpact *ofpacts, size_t ofpacts_len)
> {
> struct rule_dpif *rule = facet->rule;
> struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto);
> @@ -4974,10 +5087,11 @@ flow_push_stats(struct facet *facet, const struct dpif_flow_stats *stats)
> ofproto_rule_update_used(&rule->up, stats->used);
>
> action_xlate_ctx_init(&ctx, ofproto, &facet->flow,
> - &subfacet->initial_vals, rule, 0, NULL);
> + &subfacet->initial_vals, rule, 0, NULL,
> + facet->recirculation_id);
> ctx.resubmit_stats = stats;
> - xlate_actions_for_side_effects(&ctx, rule->up.ofpacts,
> - rule->up.ofpacts_len);
> +
> + xlate_actions_for_side_effects(&ctx, ofpacts, ofpacts_len);
> }
>
> /* Subfacets. */
> @@ -5130,14 +5244,22 @@ subfacet_make_actions(struct subfacet *subfacet, const struct ofpbuf *packet,
> struct action_xlate_ctx ctx;
>
> action_xlate_ctx_init(&ctx, ofproto, &facet->flow,
> - &subfacet->initial_vals, rule, 0, packet);
> - xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len, odp_actions);
> + &subfacet->initial_vals, rule, 0, packet,
> + facet->recirculation_id);
> + xlate_actions(&ctx, facet->ofpacts, facet->ofpacts_len, odp_actions);
> facet->tags = ctx.tags;
> facet->has_learn = ctx.has_learn;
> facet->has_normal = ctx.has_normal;
> facet->has_fin_timeout = ctx.has_fin_timeout;
> facet->nf_flow.output_iface = ctx.nf_output_iface;
> facet->mirrors = ctx.mirrors;
> + if (ctx.recirculation_id) {
> + facet->recirculation_id = ctx.recirculation_id;
> + facet->recirculation_ofpacts = ofpact_end(rule->up.ofpacts,
> + ctx.ofpacts_len);
> + facet->recirculation_ofpacts_len =
> + rule->up.ofpacts_len - ctx.ofpacts_len;
> + }
>
> subfacet->slow = (subfacet->slow & SLOW_MATCH) | ctx.slow;
> if (subfacet->actions_len != odp_actions->size
> @@ -5461,7 +5583,7 @@ rule_dpif_execute(struct rule_dpif *rule, const struct flow *flow,
> initial_vals.tunnel_ip_tos = flow->tunnel.ip_tos;
> ofpbuf_use_stub(&odp_actions, odp_actions_stub, sizeof odp_actions_stub);
> action_xlate_ctx_init(&ctx, ofproto, flow, &initial_vals,
> - rule, stats.tcp_flags, packet);
> + rule, stats.tcp_flags, packet, 0);
> ctx.resubmit_stats = &stats;
> xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len, &odp_actions);
>
> @@ -6143,6 +6265,15 @@ execute_dec_mpls_ttl_action(struct action_xlate_ctx *ctx)
> }
>
> static void
> +execute_recircualte_action(struct action_xlate_ctx *ctx)
> +{
> + if (!ctx->recirculation_id) {
> + ctx->recirculation_id = get_recirculate_id();
> + }
> + ctx->flow.skb_mark = ctx->recirculation_id;
> +}
> +
> +static void
> xlate_output_action(struct action_xlate_ctx *ctx,
> uint16_t port, uint16_t max_len, bool may_packet_in)
> {
> @@ -6383,6 +6514,7 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
> struct action_xlate_ctx *ctx)
> {
> bool was_evictable = true;
> + bool may_recirculate = false;
> const struct ofpact *a;
>
> if (ctx->rule) {
> @@ -6451,18 +6583,30 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
>
> case OFPACT_SET_IPV4_SRC:
> if (ctx->flow.dl_type == htons(ETH_TYPE_IP)) {
> + if (may_recirculate) {
> + execute_recircualte_action(ctx);
> + goto out;
> + }
> ctx->flow.nw_src = ofpact_get_SET_IPV4_SRC(a)->ipv4;
> }
> break;
>
> case OFPACT_SET_IPV4_DST:
> if (ctx->flow.dl_type == htons(ETH_TYPE_IP)) {
> + if (may_recirculate) {
> + execute_recircualte_action(ctx);
> + goto out;
> + }
> ctx->flow.nw_dst = ofpact_get_SET_IPV4_DST(a)->ipv4;
> }
> break;
>
> case OFPACT_SET_IPV4_DSCP:
> /* OpenFlow 1.0 only supports IPv4. */
> + if (may_recirculate) {
> + execute_recircualte_action(ctx);
> + goto out;
> + }
> if (ctx->flow.dl_type == htons(ETH_TYPE_IP)) {
> ctx->flow.nw_tos &= ~IP_DSCP_MASK;
> ctx->flow.nw_tos |= ofpact_get_SET_IPV4_DSCP(a)->dscp;
> @@ -6471,12 +6615,20 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
>
> case OFPACT_SET_L4_SRC_PORT:
> if (is_ip_any(&ctx->flow)) {
> + if (may_recirculate) {
> + execute_recircualte_action(ctx);
> + goto out;
> + }
> ctx->flow.tp_src = htons(ofpact_get_SET_L4_SRC_PORT(a)->port);
> }
> break;
>
> case OFPACT_SET_L4_DST_PORT:
> if (is_ip_any(&ctx->flow)) {
> + if (may_recirculate) {
> + execute_recircualte_action(ctx);
> + goto out;
> + }
> ctx->flow.tp_dst = htons(ofpact_get_SET_L4_DST_PORT(a)->port);
> }
> break;
> @@ -6517,10 +6669,15 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
>
> case OFPACT_PUSH_MPLS:
> execute_mpls_push_action(ctx, ofpact_get_PUSH_MPLS(a)->ethertype);
> + may_recirculate = false;
> break;
>
> case OFPACT_POP_MPLS:
> execute_mpls_pop_action(ctx, ofpact_get_POP_MPLS(a)->ethertype);
> + if (ctx->flow.dl_type == htons(ETH_TYPE_IP) ||
> + ctx->flow.dl_type == htons(ETH_TYPE_IPV6)) {
> + may_recirculate = true;
> + }
> break;
>
> case OFPACT_SET_MPLS_TTL:
> @@ -6536,7 +6693,10 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
> break;
>
> case OFPACT_DEC_TTL:
> - if (compose_dec_ttl(ctx, ofpact_get_DEC_TTL(a))) {
> + if (may_recirculate) {
> + execute_recircualte_action(ctx);
> + goto out;
> + } else if (compose_dec_ttl(ctx, ofpact_get_DEC_TTL(a))) {
> goto out;
> }
> break;
> @@ -6623,6 +6783,7 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
> }
>
> out:
> + ctx->ofpacts_len = (char *)(a) - (char *)ofpacts;
> if (ctx->rule) {
> ctx->rule->up.evictable = was_evictable;
> }
> @@ -6633,7 +6794,8 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
> struct ofproto_dpif *ofproto, const struct flow *flow,
> const struct initial_vals *initial_vals,
> struct rule_dpif *rule,
> - uint8_t tcp_flags, const struct ofpbuf *packet)
> + uint8_t tcp_flags, const struct ofpbuf *packet,
> + uint32_t recirculation_id)
> {
> ovs_be64 initial_tun_id = flow->tunnel.tun_id;
>
> @@ -6656,7 +6818,13 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
> * registers.
> * - Tunnel 'base_flow' is completely cleared since that is what the
> * kernel does. If we wish to maintain the original values an action
> - * needs to be generated. */
> + * needs to be generated.
> + * - The recirculation_id element of flow and base flow are set to
> + * recirculate_id, which is the id that will be used by a recirculation
> + * action of one is added. It is stored in flow and base_flow for
> + * convenience as the recirculation_id element of flow and base flow
> + * are otherwise unused by action_xlate_ctx_init().
> + */
>
> ctx->ofproto = ofproto;
> ctx->flow = *flow;
> @@ -6672,6 +6840,7 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
> ctx->resubmit_hook = NULL;
> ctx->report_hook = NULL;
> ctx->resubmit_stats = NULL;
> + ctx->recirculation_id = recirculation_id;
> }
>
> /* Translates the 'ofpacts_len' bytes of "struct ofpacts" starting at 'ofpacts'
> @@ -6756,6 +6925,11 @@ xlate_actions(struct action_xlate_ctx *ctx,
>
> if (tunnel_ecn_ok(ctx) && (!in_port || may_receive(in_port, ctx))) {
> do_xlate_actions(ofpacts, ofpacts_len, ctx);
> + if (ctx->recirculation_id) {
> + commit_odp_actions(&ctx->flow, &ctx->base_flow,
> + ctx->odp_actions);
> + commit_odp_recirculate_action(odp_actions);
> + }
>
> /* We've let OFPP_NORMAL and the learning action look at the
> * packet, so drop it now if forwarding is disabled. */
> @@ -7515,7 +7689,7 @@ packet_out(struct ofproto *ofproto_, struct ofpbuf *packet,
> initial_vals.vlan_tci = flow->vlan_tci;
> initial_vals.tunnel_ip_tos = 0;
> action_xlate_ctx_init(&ctx, ofproto, flow, &initial_vals, NULL,
> - packet_get_tcp_flags(packet, flow), packet);
> + packet_get_tcp_flags(packet, flow), packet, 0);
> ctx.resubmit_stats = &stats;
>
> ofpbuf_use_stub(&odp_actions,
> @@ -7900,7 +8074,7 @@ ofproto_trace(struct ofproto_dpif *ofproto, const struct flow *flow,
> ofpbuf_use_stub(&odp_actions,
> odp_actions_stub, sizeof odp_actions_stub);
> action_xlate_ctx_init(&trace.ctx, ofproto, flow, initial_vals,
> - rule, tcp_flags, packet);
> + rule, tcp_flags, packet, 0);
> trace.ctx.resubmit_hook = trace_resubmit;
> trace.ctx.report_hook = trace_report;
> xlate_actions(&trace.ctx, rule->up.ofpacts, rule->up.ofpacts_len,
> --
> 1.7.10.4
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
More information about the dev
mailing list