[ovs-dev] [bfd] bfd: Implement Bidirectional Forwarding Detection.

Ethan Jackson ethan at nicira.com
Wed Apr 10 00:40:44 UTC 2013


>> +/* XXX Finish Experimnetal BFD support.
>
> What part is experimental?  (The protocol or the implementation?)

I consider the implementation experimental until we've tested it more
thoroughly.  That said, it's a bit odd to put that in the code, so I've removed
it.

>> + * - CFM random VLAN option equivalent.  Instead of vlan, randomly modulate
>> the
>> + *   UDP source port to get the required entropy.
>
> Ugh.  Do we really need random anything?  I've never liked that
> "feature".

Not sure, we can shelve it for now.

> I think the following is microseconds, could you add a comment /*
> microseconds */, put _USEC in the name, etc.?
>
>> +#define OVS_BFD_MIN_RX 200000

Not used anywhere, I've deleted it.

> The BFD timers are 32-bit counts of microseconds.  The timers in
> struct bfd are 64-bit and appear to be milliseconds, but I'm not
> certain that all of them have the same unit.  Comments on units would
> be welcome.  Do they need to be 64-bit?

All the timers are in milliseconds, and converted to microseconds when put into
the packet.  Typically we've used long long int for time, so I've stayed with
that convention.  Any reason to change it?

> Spaces are OK in the OVSDB protocol but I don't think we use them
> much.  Usually we use hyphen or underscores (probably should
> standardize that in fact):

At this point, I think we'd have to change the database to treat them as
equivalent.  Otherwise, maintaining backwards compatibility is going to be
difficult.  At any rate I've changed it.

> The interface for bfd_configure() is unusual.  It does seem to be
> appropriate.

Yeah it's a bit odd.  I think it's better than sprinkling a little bit of bfd
config code everywhere like we do with CFM.  I'm happy to change it if you
prefer however.

> This might be a personal problem, but I find the following:
>> +        udp_src++;
>> +        udp_src = udp_src < 49152 ? 49152 : udp_src;
>> +        bfd->udp_src = udp_src;
> a little harder to verify as correct than:
>         bfd->udp_src = (udp_src++ & 16383) + 49152;

I assume you mean modulo here?

>
> In bfd_put_packet(), it might be wise to start out with
> ofpbuf_reserve(p, 2); to properly align what comes afterward.

I'm happy to do it, but I'm curious why?  Performance?  Some strange
architecture will be upset?

> I understand why we're using ip_ttl == 1, but I wonder whether you
> considered following RFC 5082 and therefore using (and checking for)
> ip_ttl == 255.

Makes sense, my oversight.

> It looks like bfd_process_packet() expects the caller to have pulled
> off everything except the L7 payload.  It would be nice to put this in
> a comment.

Perhaps I'm confused.  Based on my reading of the code, we don't expect the
headers to have been removed.  Instead, we simply require the l7 pointer to
have been properly populated.

>
> I don't see a check for this requirement from 6.8.6 (I think it only
> requires comparing p->size to msg->length):

I think we implicitly make this check by using ofpbuf_at() which will return
NULL if there aren't at least BFD_PACKET_LEN bytes.  


> The pseudocode at the end of 6.8.6 only mentions updating
> bfd.LocalDiag in a couple of cases, but the implementation appears to
> update LocalDiag in every case where it changes the session state.

I found the RFC a bit confusing in this respect.  For example, suppose you
follow their instructions with respect to the diagnostic strictly.  Then if the
remote neighbor goes down, you will set DIAG_RMT_DOWN, and have no way of going
back to DIAG_NONE when the session comes back up.  It isn't done upon reception
of BFD packet in either the DOWN or INIT states.  This didn't seem correct, to
me, so I choose to interpret the spec as setting the diagnostic to none if not
otherwise specified.  Does that sound reasonable?

> I don't see anything in bfd_process_packet() that corresponds to:
>
>       If the packet was not discarded, it has been received for purposes
>       of the Detection Time expiration rules in section 6.8.4.
>
> but maybe I just missed it.

I interpreted this line as satisfying the requirement:

    bfd->detect_time = bfd_rx_interval(bfd) * bfd->mult + time_msec();

I could be wrong though.

> The following loses a bit of information by truncating to 0.  Did you
> consider a format like "now%+lld ms" so that you get output like
> "now+5 ms" or "now-10 ms"?
>> +    ds_put_format(ds, "\tDetect Time: %lldms from now\n",
>> +                  msec_diff(time_msec(), bfd->detect_time));
>> +    ds_put_format(ds, "\tNext TX Time: %lldms from now\n",
>> +                  msec_diff(time_msec(), bfd->next_tx));
>> +    ds_put_format(ds, "\tLast TX Time: %lldms ago\n",
>> +                  msec_diff(bfd->last_tx, time_msec()));

Just to make sure I follow.  The advantage of doing this is that when we
truncate to 0, we know whether we were slightly above, or slightly below zero?
I don't think it matters that much, but I'm happy to change.

> ofproto-dpif.c
> --------------
>
> Why do we now need to call time_refresh() in run_fast()?

I was thinking that we'd want to assure that the results of time_msec() are
accurate when jumping into the CFM/BFD code.  However, Since we've increased
the accuracy of the cached time, It's probably not worth it.  At any rate, it
should be in a different patch.

An incremental follows.  Let me know if you'd like me to resend the whole patch
as well.

Ethan


---
 lib/bfd.c              |  157 +++++++++++++++++++++++++-----------------------
 lib/bfd.h              |    5 +-
 lib/odp-util.h         |    2 +-
 ofproto/ofproto-dpif.c |    3 +-
 vswitchd/vswitch.xml   |   70 ++++++++++++---------
 5 files changed, 129 insertions(+), 108 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index cbaefc2..c74ffce 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -13,7 +13,7 @@
  * limitations under the License. */
 
 #include <config.h>
-#include <bfd.h>
+#include "bfd.h"
 
 #include <arpa/inet.h>
 
@@ -39,7 +39,7 @@
 
 VLOG_DEFINE_THIS_MODULE(bfd);
 
-/* XXX Finish Experimnetal BFD support.
+/* XXX Finish BFD.
  *
  * The goal of this module is to replace CFM with something both more flexible
  * and standards compliant.  In service of this goal, the following needs to be
@@ -49,6 +49,8 @@ VLOG_DEFINE_THIS_MODULE(bfd);
  *   * Implement Demand mode.
  *   * Go through the RFC line by line and verify we comply.
  *   * Test against a hardware implementation.  Preferably a popular one.
+ *   * Delete BFD packets with nw_ttl != 255 in the datapath to prevent DOS
+ *     attacks.
  *
  * - Unit tests.
  *
@@ -65,9 +67,6 @@ VLOG_DEFINE_THIS_MODULE(bfd);
  * - Scale testing.  How does it operate when there are large number of bfd
  *   sessions?  Do we ever have random flaps?  What's the CPU utilization?
  *
- * - CFM random VLAN option equivalent.  Instead of vlan, randomly modulate the
- *   UDP source port to get the required entropy.
- *
  * - Rely on data traffic for liveness by using BFD demand mode.
  *   If we're receiving traffic on a port, we can safely assume it's up (modulo
  *   unidrectional failures).  BFD has a demand mode in which it can stay quiet
@@ -76,7 +75,6 @@ VLOG_DEFINE_THIS_MODULE(bfd);
  *   interfaces. */
 
 #define BFD_VERSION 1
-#define OVS_BFD_MIN_RX 200000
 
 enum flags {
     FLAG_MULTIPOINT = 1 << 0,
@@ -88,10 +86,10 @@ enum flags {
 };
 
 enum state {
-    STATE_ADMIN_DOWN = 0,
-    STATE_DOWN = 1,
-    STATE_INIT = 2,
-    STATE_UP = 3
+    STATE_ADMIN_DOWN = 0 << 6,
+    STATE_DOWN = 1 << 6,
+    STATE_INIT = 2 << 6,
+    STATE_UP = 3 << 6
 };
 
 enum diag {
@@ -107,6 +105,7 @@ enum diag {
 };
 
 /* RFC 5880 Section 4.1
+ *  0                   1                   2                   3
  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  * |Vers |  Diag   |Sta|P|F|C|A|D|M|  Detect Mult  |    Length     |
@@ -136,11 +135,13 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg));
 
 #define DIAG_MASK 0x1f
 #define VERS_SHIFT 5
-#define STATE_SHIFT 6
+#define STATE_MASK 0xC0
 #define FLAGS_MASK 0x3f
 
 struct bfd {
-    struct list node;             /* In 'all_bfds'. */
+    struct hmap_node node;        /* In 'all_bfds'. */
+    uint32_t disc;                /* bfd.LocalDiscr. Key in 'all_bfds' hmap. */
+
     char *name;                   /* Name used for logging. */
 
     bool cpath_down;              /* Concatenated Path Down. */
@@ -155,9 +156,11 @@ struct bfd {
     enum flags flags;             /* Flags sent on messages. */
     enum flags rmt_flags;         /* Flags last received. */
 
-    uint32_t disc;                /* bfd.LocalDiscr. */
     uint32_t rmt_disc;            /* bfd.RemoteDiscr. */
 
+    uint16_t udp_src;             /* UDP source port. */
+
+    /* All timers in milliseconds. */
     long long int rmt_min_rx;     /* bfd.RemoteMinRxInterval. */
     long long int rmt_min_tx;     /* Remote minimum TX interval. */
 
@@ -171,8 +174,6 @@ struct bfd {
     long long int last_tx;        /* Last TX time. */
     long long int next_tx;        /* Next TX time. */
     long long int detect_time;    /* RFC 5880 6.8.4 Detection time. */
-
-    uint16_t udp_src;             /* UDP source port. */
 };
 
 static bool bfd_in_poll(const struct bfd *);
@@ -192,7 +193,7 @@ static void log_msg(enum vlog_level, const struct msg *, const char *message,
                     const struct bfd *);
 
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 20);
-static struct list all_bfds = LIST_INITIALIZER(&all_bfds);
+static struct hmap all_bfds = HMAP_INITIALIZER(&all_bfds);
 
 /* Returns true if the interface on which 'bfd' is running may be used to
  * forward traffic according to the BFD session state. */
@@ -212,8 +213,8 @@ bfd_get_status(const struct bfd *bfd, struct smap *smap)
     smap_add(smap, "diagnostic", bfd_diag_str(bfd->diag));
 
     if (bfd->state != STATE_DOWN) {
-        smap_add(smap, "remote state", bfd_state_str(bfd->rmt_state));
-        smap_add(smap, "remote diagnostic", bfd_diag_str(bfd->rmt_diag));
+        smap_add(smap, "remote_state", bfd_state_str(bfd->rmt_state));
+        smap_add(smap, "remote_diagnostic", bfd_diag_str(bfd->rmt_diag));
     }
 }
 
@@ -239,7 +240,7 @@ bfd_configure(struct bfd *bfd, const char *name,
 
     if (!smap_get_bool(cfg, "enable", false)) {
         if (bfd) {
-            list_remove(&bfd->node);
+            hmap_remove(&all_bfds, &bfd->node);
             free(bfd->name);
             free(bfd);
         }
@@ -249,9 +250,9 @@ bfd_configure(struct bfd *bfd, const char *name,
     if (!bfd) {
         bfd = xzalloc(sizeof *bfd);
         bfd->name = xstrdup(name);
-        list_push_back(&all_bfds, &bfd->node);
-
         bfd->disc = generate_discriminator();
+        hmap_insert(&all_bfds, &bfd->node, bfd->disc);
+
         bfd->diag = DIAG_NONE;
         bfd->min_tx = 1000;
         bfd->mult = 3;
@@ -261,14 +262,13 @@ bfd_configure(struct bfd *bfd, const char *name,
          * UDP source port number MUST be used for all BFD Control packets
          * associated with a particular session.  The source port number SHOULD
          * be unique among all BFD sessions on the system. */
-        udp_src++;
-        udp_src = udp_src < 49152 ? 49152 : udp_src;
-        bfd->udp_src = udp_src;
+        bfd->udp_src = (udp_src++ % 16383) + 49152;
 
         bfd_set_state(bfd, STATE_DOWN, DIAG_NONE);
     }
 
-    min_tx = MAX(smap_get_int(cfg, "min_tx", 100), 100);
+    min_tx = smap_get_int(cfg, "min_tx", 100);
+    min_tx = MAX(min_tx, 100);
     if (bfd->cfg_min_tx != min_tx) {
         bfd->cfg_min_tx = min_tx;
         if (bfd->state != STATE_UP
@@ -278,7 +278,8 @@ bfd_configure(struct bfd *bfd, const char *name,
         bfd_poll(bfd);
     }
 
-    min_rx = MAX(smap_get_int(cfg, "min_rx", 1000), 100);
+    min_rx = smap_get_int(cfg, "min_rx", 1000);
+    min_rx = MAX(min_rx, 100);
     if (bfd->cfg_min_rx != min_rx) {
         bfd->cfg_min_rx = min_rx;
         if (bfd->state != STATE_UP
@@ -315,7 +316,7 @@ bfd_wait(const struct bfd *bfd)
 void
 bfd_run(struct bfd *bfd)
 {
-    if (bfd->state > STATE_DOWN && time_msec() > bfd->detect_time) {
+    if (bfd->state > STATE_DOWN && time_msec() >= bfd->detect_time) {
         bfd_set_state(bfd, STATE_DOWN, DIAG_EXPIRED);
     }
 
@@ -325,7 +326,7 @@ bfd_run(struct bfd *bfd)
 }
 
 bool
-bfd_should_send_packet(struct bfd *bfd)
+bfd_should_send_packet(const struct bfd *bfd)
 {
     return bfd->flags & FLAG_FINAL || time_msec() >= bfd->next_tx;
 }
@@ -354,15 +355,16 @@ bfd_put_packet(struct bfd *bfd, struct ofpbuf *p,
      * set. */
     ovs_assert(!(bfd->flags & FLAG_POLL) || !(bfd->flags & FLAG_FINAL));
 
+    ofpbuf_reserve(p, 2); /* Properly align after the ethernet header. */
     eth = ofpbuf_put_uninit(p, sizeof *eth);
-    memcpy(eth->eth_dst, eth_addr_broadcast, sizeof eth->eth_dst);
+    memcpy(eth->eth_dst, eth_addr_broadcast, ETH_ADDR_LEN);
     memcpy(eth->eth_src, eth_src, ETH_ADDR_LEN);
     eth->eth_type = htons(ETH_TYPE_IP);
 
     ip = ofpbuf_put_zeros(p, sizeof *ip);
     ip->ip_ihl_ver = IP_IHL_VER(5, 4);
-    ip->ip_tot_len = htons(sizeof *ip + sizeof *udp + sizeof(struct msg));
-    ip->ip_ttl = 1;
+    ip->ip_tot_len = htons(sizeof *ip + sizeof *udp + sizeof *msg);
+    ip->ip_ttl = 255;
     ip->ip_proto = IPPROTO_UDP;
     ip->ip_src = htonl(0xA9FE0100); /* 169.254.1.0 Link Local. */
     ip->ip_dst = htonl(0xA9FE0101); /* 169.254.1.1 Link Local. */
@@ -371,11 +373,11 @@ bfd_put_packet(struct bfd *bfd, struct ofpbuf *p,
     udp = ofpbuf_put_zeros(p, sizeof *udp);
     udp->udp_src = htons(bfd->udp_src);
     udp->udp_dst = htons(BFD_DEST_PORT);
-    udp->udp_len = htons(sizeof *udp + sizeof(struct msg));
+    udp->udp_len = htons(sizeof *udp + sizeof *msg);
 
-    msg = ofpbuf_put_uninit(p, sizeof(struct msg));
+    msg = ofpbuf_put_uninit(p, sizeof *msg);
     msg->vers_diag = (BFD_VERSION << 5) | bfd->diag;
-    msg->flags = (bfd->state << 6) | bfd->flags;
+    msg->flags = (bfd->state & STATE_MASK) | bfd->flags;
 
     msg->mult = bfd->mult;
     msg->length = BFD_PACKET_LEN;
@@ -411,7 +413,8 @@ bfd_should_process_flow(const struct flow *flow)
 }
 
 void
-bfd_process_packet(struct bfd *bfd, const struct ofpbuf *p)
+bfd_process_packet(struct bfd *bfd, const struct flow *flow,
+                   const struct ofpbuf *p)
 {
     uint32_t rmt_min_rx, pkt_your_disc;
     enum state rmt_state;
@@ -421,6 +424,11 @@ bfd_process_packet(struct bfd *bfd, const struct ofpbuf *p)
 
     /* This function is designed to follow section RFC 5880 6.8.6 closely. */
 
+    if (flow->nw_ttl != 255) {
+        /* XXX Should drop in the kernel to prevent DOS. */
+        return;
+    }
+
     msg = ofpbuf_at(p, (uint8_t *)p->l7 - (uint8_t *)p->data, BFD_PACKET_LEN);
     if (!msg) {
         VLOG_INFO_RL(&rl, "%s: Received unparseable BFD control message.",
@@ -429,7 +437,7 @@ bfd_process_packet(struct bfd *bfd, const struct ofpbuf *p)
     }
 
     flags = msg->flags & FLAGS_MASK;
-    rmt_state = msg->flags >> STATE_SHIFT;
+    rmt_state = msg->flags & STATE_MASK;
     version = msg->vers_diag >> VERS_SHIFT;
 
     log_msg(VLL_DBG, msg, "Received BFD control message", bfd);
@@ -570,7 +578,7 @@ bfd_poll(struct bfd *bfd)
         bfd->poll_min_rx = bfd->cfg_min_rx;
         bfd->flags |= FLAG_POLL;
         bfd->next_tx = 0;
-        VLOG_INFO_RL(&rl, "Initiating poll sequence");
+        VLOG_INFO_RL(&rl, "%s: Initiating poll sequence", bfd->name);
     }
 }
 
@@ -589,7 +597,8 @@ bfd_min_tx(const struct bfd *bfd)
 static long long int
 bfd_tx_interval(const struct bfd *bfd)
 {
-    return MAX(bfd_min_tx(bfd), bfd->rmt_min_rx);
+    long long int interval = bfd_min_tx(bfd);
+    return MAX(interval, bfd->rmt_min_rx);
 }
 
 static long long int
@@ -606,41 +615,41 @@ bfd_set_next_tx(struct bfd *bfd)
     bfd->next_tx = bfd->last_tx + interval;
 }
 
-static char *
+static const char *
 bfd_flag_str(enum flags flags)
 {
     struct ds ds = DS_EMPTY_INITIALIZER;
     static char flag_str[128];
 
     if (!flags) {
-        return "NONE";
+        return "none";
     }
 
     if (flags & FLAG_MULTIPOINT) {
-        ds_put_cstr(&ds, "MULTIPOINT ");
+        ds_put_cstr(&ds, "multipoint ");
     }
 
     if (flags & FLAG_DEMAND) {
-        ds_put_cstr(&ds, "DEMAND ");
+        ds_put_cstr(&ds, "demand ");
     }
 
     if (flags & FLAG_AUTH) {
-        ds_put_cstr(&ds, "AUTH ");
+        ds_put_cstr(&ds, "auth ");
     }
 
     if (flags & FLAG_CTL) {
-        ds_put_cstr(&ds, "CTL ");
+        ds_put_cstr(&ds, "ctl ");
     }
 
     if (flags & FLAG_FINAL) {
-        ds_put_cstr(&ds, "FINAL ");
+        ds_put_cstr(&ds, "final ");
     }
 
     if (flags & FLAG_POLL) {
-        ds_put_cstr(&ds, "POLL ");
+        ds_put_cstr(&ds, "poll ");
     }
 
-    strncpy(flag_str, ds_cstr(&ds), sizeof flag_str);
+    ovs_strlcpy(flag_str, ds_cstr(&ds), sizeof flag_str);
     ds_destroy(&ds);
     return flag_str;
 }
@@ -649,11 +658,11 @@ static const char *
 bfd_state_str(enum state state)
 {
     switch (state) {
-    case STATE_ADMIN_DOWN: return "ADMIN_DOWN";
-    case STATE_DOWN: return "DOWN";
-    case STATE_INIT: return "INIT";
-    case STATE_UP: return "UP";
-    default: return "Invalid State";
+    case STATE_ADMIN_DOWN: return "admin_down";
+    case STATE_DOWN: return "down";
+    case STATE_INIT: return "init";
+    case STATE_UP: return "up";
+    default: return "invalid";
     }
 }
 
@@ -679,7 +688,7 @@ log_msg(enum vlog_level level, const struct msg *p, const char *message,
 {
     struct ds ds = DS_EMPTY_INITIALIZER;
 
-    if (level != VLL_DBG && vlog_should_drop(THIS_MODULE, level, &rl)) {
+    if (vlog_should_drop(THIS_MODULE, level, &rl)) {
         return;
     }
 
@@ -689,15 +698,17 @@ log_msg(enum vlog_level level, const struct msg *p, const char *message,
                   " length:%"PRIu8
                   "\n\tflags: %s"
                   "\n\tmy_disc:0x%"PRIx32" your_disc:0x%"PRIx32
-                  " min_tx:%"PRIu32"ms min_rx:%"PRIu32"ms min_rx_echo:%"
-                  PRIu32"ms\n\n",
+                  "\n\tmin_tx:%"PRIu32"us (%"PRIu32"ms)"
+                  "\n\tmin_rx:%"PRIu32"us (%"PRIu32"ms)"
+                  "\n\tmin_rx_echo:%"PRIu32"us (%"PRIu32"ms)",
                   bfd->name, message, p->vers_diag >> VERS_SHIFT,
                   bfd_diag_str(p->vers_diag & DIAG_MASK),
-                  bfd_state_str(p->flags >> STATE_SHIFT),
+                  bfd_state_str(p->flags & STATE_MASK),
                   p->mult, p->length, bfd_flag_str(p->flags & FLAGS_MASK),
                   ntohl(p->my_disc), ntohl(p->your_disc),
-                  ntohl(p->min_tx) / 1000, ntohl(p->min_rx) / 1000,
-                  ntohl(p->min_rx_echo) / 1000);
+                  ntohl(p->min_tx), ntohl(p->min_tx) / 1000,
+                  ntohl(p->min_rx), ntohl(p->min_rx) / 1000,
+                  ntohl(p->min_rx_echo), ntohl(p->min_rx_echo) / 1000);
     bfd_put_details(&ds, bfd);
     VLOG(level, "%s", ds_cstr(&ds));
     ds_destroy(&ds);
@@ -751,8 +762,10 @@ generate_discriminator(void)
     while (!disc) {
         struct bfd *bfd;
 
+        /* 'disc' is by defnition random, so there's no reason to waste time
+         * hashing it. */
         disc = random_uint32();
-        LIST_FOR_EACH (bfd, node, &all_bfds) {
+        HMAP_FOR_EACH_IN_BUCKET (bfd, node, disc, &all_bfds) {
             if (bfd->disc == disc) {
                 disc = 0;
                 break;
@@ -764,11 +777,11 @@ generate_discriminator(void)
 }
 
 static struct bfd *
-bfd_find(const char *name)
+bfd_find_by_name(const char *name)
 {
     struct bfd *bfd;
 
-    LIST_FOR_EACH (bfd, node, &all_bfds) {
+    HMAP_FOR_EACH (bfd, node, &all_bfds) {
         if (!strcmp(bfd->name, name)) {
             return bfd;
         }
@@ -776,12 +789,6 @@ bfd_find(const char *name)
     return NULL;
 }
 
-static long long int
-msec_diff(long long int earlier, long long int later) {
-    later -= earlier;
-    return later < 0 ? 0 : later;
-}
-
 static void
 bfd_put_details(struct ds *ds, const struct bfd *bfd)
 {
@@ -792,12 +799,12 @@ bfd_put_details(struct ds *ds, const struct bfd *bfd)
                   bfd->cpath_down ? "true" : "false");
     ds_put_format(ds, "\tTX Interval: Approx %lldms\n", bfd_tx_interval(bfd));
     ds_put_format(ds, "\tRX Interval: Approx %lldms\n", bfd_rx_interval(bfd));
-    ds_put_format(ds, "\tDetect Time: %lldms from now\n",
-                  msec_diff(time_msec(), bfd->detect_time));
-    ds_put_format(ds, "\tNext TX Time: %lldms from now\n",
-                  msec_diff(time_msec(), bfd->next_tx));
-    ds_put_format(ds, "\tLast TX Time: %lldms ago\n",
-                  msec_diff(bfd->last_tx, time_msec()));
+    ds_put_format(ds, "\tDetect Time: now %+lldms\n",
+                  time_msec() - bfd->detect_time);
+    ds_put_format(ds, "\tNext TX Time: now %+lldms\n",
+                  time_msec() - bfd->next_tx);
+    ds_put_format(ds, "\tLast TX Time: now %+lldms\n",
+                  time_msec() - bfd->last_tx);
 
     ds_put_cstr(ds, "\n");
 
@@ -832,14 +839,14 @@ bfd_unixctl_show(struct unixctl_conn *conn, int argc, const char *argv[],
     struct bfd *bfd;
 
     if (argc > 1) {
-        bfd = bfd_find(argv[1]);
+        bfd = bfd_find_by_name(argv[1]);
         if (!bfd) {
             unixctl_command_reply_error(conn, "no such bfd object");
             return;
         }
         bfd_put_details(&ds, bfd);
     } else {
-        LIST_FOR_EACH (bfd, node, &all_bfds) {
+        HMAP_FOR_EACH (bfd, node, &all_bfds) {
             ds_put_format(&ds, "---- %s ----\n", bfd->name);
             bfd_put_details(&ds, bfd);
         }
diff --git a/lib/bfd.h b/lib/bfd.h
index 19ce435..21203b3 100644
--- a/lib/bfd.h
+++ b/lib/bfd.h
@@ -29,12 +29,13 @@ struct smap;
 void bfd_wait(const struct bfd *);
 void bfd_run(struct bfd *);
 
-bool bfd_should_send_packet(struct bfd *);
+bool bfd_should_send_packet(const struct bfd *);
 void bfd_put_packet(struct bfd *bfd, struct ofpbuf *packet,
                     uint8_t eth_src[6]);
 
 bool bfd_should_process_flow(const struct flow *);
-void bfd_process_packet(struct bfd *, const struct ofpbuf *);
+void bfd_process_packet(struct bfd *, const struct flow *,
+                        const struct ofpbuf *);
 
 struct bfd *bfd_configure(struct bfd *, const char *name,
                           const struct smap *smap);
diff --git a/lib/odp-util.h b/lib/odp-util.h
index f195e58..a50f509 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -167,7 +167,7 @@ enum slow_path_reason {
     SLOW_IN_BAND = 1 << 3,      /* In-band control needs every packet. */
     SLOW_BFD = 1 << 4,          /* BFD packets need per-packet processing. */
 
-    /* Mutually exclusive with SLOW_CFM, SLOW_LACP, SLOW_STP.
+    /* Mutually exclusive with SLOW_BFD, SLOW_CFM, SLOW_LACP, SLOW_STP.
      * Could possibly appear with SLOW_IN_BAND. */
     SLOW_CONTROLLER = 1 << 5,   /* Packets must go to OpenFlow controller. */
 
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 5bbf34d..79e2814 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -1556,7 +1556,6 @@ run_fast(struct ofproto *ofproto_)
     struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
     struct ofport_dpif *ofport;
 
-    time_refresh();
     HMAP_FOR_EACH (ofport, up.hmap_node, &ofproto->up.ports) {
         port_run_fast(ofport);
     }
@@ -3526,7 +3525,7 @@ process_special(struct ofproto_dpif *ofproto, const struct flow *flow,
         return SLOW_CFM;
     } else if (ofport->bfd && bfd_should_process_flow(flow)) {
         if (packet) {
-            bfd_process_packet(ofport->bfd, packet);
+            bfd_process_packet(ofport->bfd, flow, packet);
         }
         return SLOW_BFD;
     } else if (ofport->bundle && ofport->bundle->lacp
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 26b0668..3762f6f 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1694,17 +1694,18 @@
 
     <group title="Bidirectional Forwarding Detection (BFD)">
         <p>
-            BFD, defined in RFC 5880, allows point to point detection of
-            connectivity failures by occasional transmission of BFD control
-            messages.  It is implemented in Open vSwitch to serve as a more
-            popular and standards compliant alternative to CFM.
+            BFD, defined in RFC 5880 and RFC 5881, allows point to point
+            detection of connectivity failures by occasional transmission of
+            BFD control messages.  It is implemented in Open vSwitch to serve
+            as a more popular and standards compliant alternative to CFM.
         </p>
 
         <p>
             BFD operates by regularly transmitting BFD control messages at a
             rate negotiated independently in each direction.  Each endpoint
             specifies the rate at which it expects to receive control messages,
-            and the rate at which it's willing to transmit them.  An endpoint
+            and the rate at which it's willing to transmit them.  Open vSwitch
+            uses a detection multiplier of three, meaning that an endpoint
             which fails to receive BFD control messages for a period of three
             times the expected reception rate, will signal a connectivity
             fault.  In the case of a unidirectional connectivity issue, the
@@ -1713,46 +1714,57 @@
         </p>
 
         <p>
-            The Open vSwitch implementation of BFD aims comply faithfully with
-            the requirements put forth in RFC 5880.  Currently, the only known
-            omission is "Demand Mode", which we hope to include in future.
-            Open vSwitch does not implement the optional Authentication or
-            "Echo Mode" features.
+            The Open vSwitch implementation of BFD aims to comply faithfully
+            with the requirements put forth in RFC 5880.  Currently, the only
+            known omission is ``Demand Mode'', which we hope to include in
+            future.  Open vSwitch does not implement the optional
+            Authentication or ``Echo Mode'' features.
         </p>
 
-      <column name="bfd" key="min_rx">
-          The minimum rate, in milliseconds, at which this BFD session is
-          willing to receive BFD control messages.  The actual rate may slower
-          if the remote endpoint isn't willing to transmit as quickly as
+      <column name="bfd" key="enable">
+          When <code>true</code> BFD is enabled on this
+          <ref table="Interface"/>, otherwise it's disabled.  Defaults to
+          <code>false</code>.
+      </column>
+
+      <column name="bfd" key="min_rx"
+          type='{"type": "integer", "minInteger": 1}'>
+          The fastest rate, in milliseconds, at which this BFD session is
+          willing to receive BFD control messages.  The actual rate may be
+          slower if the remote endpoint isn't willing to transmit as quickly as
           specified.  Defaults to <code>1000</code>.
       </column>
 
-      <column name="bfd" key="min_tx">
-          The minimum rate, in milliseconds, at which this BFD session is
+      <column name="bfd" key="min_tx"
+          type='{"type": "integer", "minInteger": 1}'>
+          The fastest rate, in milliseconds, at which this BFD session is
           willing to transmit BFD control messages.  The actual rate may be
           slower if the remote endpoint isn't willing to receive as quickly as
           specified.  Defaults to <code>100</code>.
       </column>
 
-      <column name="bfd" key="cpath_down">
+      <column name="bfd" key="cpath_down" type='{"type": "boolean"}'>
           Concatenated path down may be used when the local system should not
           have traffic forwarded to it for some reason other than a connectivty
-          failure on the interface being monitored.  The local BFD session will
-          notify the remote session of the connectivity problem, at which time
-          the remote session may choose not to forward traffic.  Defaults to
+          failure on the interface being monitored.  When a controller thinks
+          this may be the case, it may set <code>cpath_down</code> to
+          <code>true</code> which may cause the remote BFD session not to
+          forward traffic to this <ref table="Interface"/>. Defaults to
           <code>false</code>.
       </column>
 
-      <column name="bfd_status" key="state">
-          State of the BFD session.  One of <code>ADMIN_DOWN</code>,
-          <code>DOWN</code>, <code>INIT</code>, or <code>UP</code>.
+      <column name="bfd_status" key="state"
+          type='{"type": "string",
+          "enum": ["set", ["admin_down", "down", "init", "up"]]}'>
+          State of the BFD session.  The BFD session is fully healthy and
+          negotiated if <code>UP</code>.
       </column>
 
-      <column name="bfd_status" key="forwarding">
+      <column name="bfd_status" key="forwarding" type='{"type": "boolean"}'>
           True if the BFD session believes this <ref table="Interface"/> may be
           used to forward traffic.  Typically this means the local session is
-          up, and the remote system isn't signalling a problem such as
-          concatenated path down.
+          signaling <code>UP</code>, and the remote system isn't signaling a
+          problem such as concatenated path down.
       </column>
 
       <column name="bfd_status" key="diagnostic">
@@ -1760,11 +1772,13 @@
           case of a problem.
       </column>
 
-      <column name="bfd_status" key="remote state">
+      <column name="bfd_status" key="remote_state"
+          type='{"type": "string",
+          "enum": ["set", ["admin_down", "down", "init", "up"]]}'>
           State of the remote endpoint's BFD session.
       </column>
 
-      <column name="bfd_status" key="remote diagnostic">
+      <column name="bfd_status" key="remote_diagnostic">
           A short message indicating what the remote endpoint's BFD session
           thinks is wrong in case of a problem.
       </column>
-- 
1.7.9.5




More information about the dev mailing list