[ovs-dev] [L3 In-Band 9/9] in-band: Implement L3-based in-band control

Ben Pfaff blp at nicira.com
Tue Sep 1 17:10:51 UTC 2009


Justin Pettit <jpettit at nicira.com> writes:

> Previously, in-band control was L2-based.  This worked well when the
> controller was on the same network segment as the switch.  However, many
> configuration are not setup this way.  These changes allow a switch and

"configurations".  Also, "setup" is a noun; "set up" is the verb.

> controller to be on different subnets.
>
> This set of changes also fixes some problems related to passing DHCP
> traffic as described in Bug #1618.

Could you add something to the change log to indicate that you
are going to add further explanations?  Then the reader of the
change log will know to look for them.

This code has changed in master, so the merge may be a bit of
trouble.

> @@ -89,51 +91,39 @@ struct in_band {
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(60, 60);
>  
>  static const uint8_t *
> -get_controller_mac(struct in_band *ib)
> +get_remote_mac(struct in_band *ib)
>  {
> +    int retval;
> +    bool have_mac;
> +    struct in_addr c_in4, r_in4;
> +    char dev_name[IFNAMSIZ];

I noticed that you used IFNAMSIZ here and IF_NAMESIZE earlier.
It would be better to use just one of them consistently.
IFNAMSIZ is Linux-specific, IF_NAMESIZE is in POSIX, so the
latter is a better choice.

> +    if (now >= ib->next_remote_refresh) {
> +        c_in4.s_addr = ib->controller_ip;
> +        memset(ib->remote_mac, 0, sizeof ib->remote_mac);
> +        retval = netdev_get_next_hop(&c_in4, &r_in4, dev_name);
> +        if (retval) {
> +            VLOG_WARN("cannot find route for controller ("IP_FMT"): %s",
> +                    IP_ARGS(&ib->controller_ip), strerror(retval));
> +            return NULL;

I think that this will fail to update next_remote_refresh in the
failure case, so that in that case we will try to refresh every
time we arrive here until we succeed.  Was that change
intentional?  I was trying to reduce the throttle the number of
potentially expensive system calls here.

> +        have_mac = !eth_addr_is_zero(ib->remote_mac);
> +
> +        if (have_mac 
> +                && memcmp(ib->last_remote_mac, ib->remote_mac, ETH_ADDR_LEN)) {

It would be nice to use !eth_addr_equals(ib->last_remote_mac,
ib->remote_mac) here.  (I realize that you only moved this code
around.)

> @@ -214,72 +206,163 @@ setup_flow(struct in_band *in_band, int rule_idx, const flow_t *flow,
>      }
>  }
>  
> +/* Returns true if 'packet' should be sent to the local port regardless
> + * of the flow table. */ 
> +bool
> +in_band_msg_in_hook(struct in_band *in_band, const flow_t *flow, 
> +                    const struct ofpbuf *packet)
> +{
> +    if (!in_band) {
> +        return false;
> +    }
> +
> +    /* Regardless of how the flow table is configured, we want to be
> +     * able to see replies to our DHCP requests. */
> +    if ((flow->nw_proto == IP_TYPE_UDP)
> +            && (flow->tp_src == htons(DHCP_SERVER_PORT))
> +            && (flow->tp_dst == htons(DHCP_CLIENT_PORT))
> +            && (packet->size >= DHCP_HEADER_LEN)) {
> +        struct dhcp_header *dhcp = packet->l7;

Shouldn't this check for dl_type == htons(ETH_TYPE_IP)?

I think that the size test is ignoring the size of the Ethernet
and IP headers.  And I'm kind of paranoid about using packet->l7
without testing it for nonnull (even though I *think* it should
always be nonnull here).

How about:
    && packet->l7) {
        struct dhcp_header *dhcp;
        const uint8_t *local_mac;

        dhcp = ofpbuf_at(packet, (char *) packet->l7 - packet->data,
                         sizeof *dhcp);
        if (!dhcp) {
            return false;
        }

Or you could test the test against the length of the L7 header
directly if you like that better.

> +        const uint8_t *local_mac;
> +
> +        local_mac = get_local_mac(in_band);
> +        if (!memcmp(dhcp->chaddr, local_mac, ETH_ADDR_LEN)) {

eth_addr_equals() again.

> /* Returns true if the rule that would match 'flow' with 'actions' is 
>  * allowed to be setup in the datapath. */
> bool
> in_band_rule_check(struct in_band *in_band, const flow_t *flow,
>                    const struct odp_actions *actions)

Hmm, is there a reason that we disable setting up a flow instead
of adding a deliver-to-local-port action?  Probably, but I don't
recall it.

> +    /* Don't allow flows that would prevent DHCP replies from being seen
> +     * by the local port. */
> +    if ((flow->nw_proto == IP_TYPE_UDP)
> +            && (flow->tp_src == htons(DHCP_SERVER_PORT)) 
> +            && (flow->tp_dst == htons(DHCP_CLIENT_PORT))) {

Also test dl_type?

> -    if (time_now() < MIN(in_band->next_refresh, in_band->next_local_refresh)) {
> +    if (now < MIN(in_band->next_remote_refresh, in_band->next_local_refresh)) {

MIN is such an ugly macro.  Now that we have time_now() factored
out, let's rewrite this as
    if (now < in_band->next_remote_refresh
        && now < in_band->next_local_refresh)
Even though it needs to be broken across lines, I still like it
better.

> -    /* Switch traffic sent by the local port. */
> +    /* Allow DHCP requests to be sent from the local port. */
>      memset(&flow, 0, sizeof flow);
>      flow.in_port = ODPP_LOCAL;
> -    setup_flow(in_band, IBR_FROM_LOCAL_PORT, &flow, OFPFW_IN_PORT,
> -               OFPP_NORMAL);
> +    flow.dl_type = htons(ETH_TYPE_IP);
> +    flow.nw_proto = IP_TYPE_UDP;
> +    flow.tp_src = htons(DHCP_CLIENT_PORT);
> +    flow.tp_dst = htons(DHCP_SERVER_PORT);
> +    setup_flow(in_band, IBR_FROM_LOCAL_DHCP, &flow,
> +               (OFPFW_IN_PORT | OFPFW_DL_TYPE | OFPFW_NW_PROTO 
> +                | OFPFW_TP_SRC | OFPFW_TP_DST), OFPP_NORMAL);

Can we fix dl_src as local_mac here?

> +    if (controller_ip) {
> +        /* Allow ARP requests for the controller's IP. */
>          memset(&flow, 0, sizeof flow);
>          flow.dl_type = htons(ETH_TYPE_ARP);
> -        memcpy(flow.dl_dst, eth_addr_broadcast, ETH_ADDR_LEN);
> -        memcpy(flow.dl_src, controller_mac, ETH_ADDR_LEN);
> -        setup_flow(in_band, IBR_ARP_FROM_CTL, &flow,
> -                   OFPFW_DL_TYPE | OFPFW_DL_DST | OFPFW_DL_SRC,
> +        flow.nw_proto = ARP_OP_REQUEST;
> +        flow.nw_src = controller_ip;
> +        setup_flow(in_band, IBR_TO_CTL_ARP, &flow,
> +                   (OFPFW_DL_TYPE | OFPFW_NW_PROTO | OFPFW_NW_SRC_MASK),
>                     OFPP_NORMAL);

The comment on this flow setup is confusing, since an ARP request
for the controller's IP will be sent *to* the controller (nw_dst
== controller_ip), but the flow here comes *from the controller
(nw_src == controller_ip).

> +        /* Allow ARP replies from the controller's IP. */
> +        memset(&flow, 0, sizeof flow);
> +        flow.dl_type = htons(ETH_TYPE_ARP);
> +        flow.nw_proto = ARP_OP_REPLY;
> +        flow.nw_dst = controller_ip;
> +        setup_flow(in_band, IBR_FROM_CTL_ARP, &flow,
> +                   (OFPFW_DL_TYPE | OFPFW_NW_PROTO | OFPFW_NW_DST_MASK),
> +                   OFPP_NORMAL);

Imagine a comment similar to the previous one here, in reverse.

> @@ -2123,6 +2123,13 @@ xlate_actions(const union ofp_action *in, size_t n_in,
>      ctx.tags = tags ? tags : &no_tags;
>      ctx.may_setup_flow = true;
>      do_xlate_actions(in, n_in, &ctx);
> +
> +    /* Check with in-band control to see if we're allowed to setup this
> +     * flow. */

...to "set up" this flow.

> +    if (!in_band_rule_check(ofproto->in_band, flow, out)) {
> +        ctx.may_setup_flow = false;
> +    }
> +

> @@ -3034,6 +3041,21 @@ handle_odp_msg(struct ofproto *p, struct ofpbuf *packet)
>      struct ofpbuf payload;
>      flow_t flow;
>  
> +    payload.data = msg + 1;
> +    payload.size = msg->length - sizeof *msg;
> +    flow_extract(&payload, msg->port, &flow);
> +
> +    /* Check with in-band control to see if this packet should be sent
> +     * to the local port regardless of the flow table. */
> +    if (in_band_msg_in_hook(p->in_band, &flow, &payload)) {
> +        union odp_action action;
> +
> +        memset(&action, 0, sizeof(action));
> +        action.output.type = ODPAT_OUTPUT;
> +        action.output.port = ODPP_LOCAL;
> +        dpif_execute(&p->dpif, flow.in_port, &action, 1, &payload);
> +    }

I think that this can safely go below the _ODPL_ACTION_NR test,
to avoid extracting the flow data in that case.  _ODPL_ACTION_NR
can only be sent by a flow, and in the case we care about here we
will never set up a flow.

>      /* Handle controller actions. */
>      if (msg->type == _ODPL_ACTION_NR) {
>          COVERAGE_INC(ofproto_ctlr_action);

Otherwise, the code looks fine.  I have no idea whether it works,
since it's magic, but as long as you think so...




More information about the dev mailing list