[ovs-dev] [bfd] bfd: Implement Bidirectional Forwarding Detection.

Ethan Jackson ethan at nicira.com
Mon May 6 19:10:29 UTC 2013


Sorry to not have responded in so long on this.  It's become important
again, so I'm going to try to get it merged soon.  Seems pretty close.

> No, that's fine, but I found it hard to infer the units other than
> from the type.  Could some comments mention that they are msecs?

I added a comment in the struct indicating that all timers are in msecs.

> (x & 16383) == (x % 16384) for nonnegative integer x.  If you like %
> better then by all means use x % 16384.

I agree, doesn't matter much. I find modulo easier to reason about so
I've used that.

>> > I don't see a check for this requirement from 6.8.6 (I think it only
>> > requires comparing p->size to msg->length):
>>
>> I think we implicitly make this check by using ofpbuf_at() which will return
>> NULL if there aren't at least BFD_PACKET_LEN bytes.
>
> I agree that we know there are at least BFD_PACKET_LEN bytes.  We
> don't know that there are at least msg->length bytes, since
> msg->length might be greater than BFD_PACKET_LEN.

Actually, msg->length can't be greater that BFD_PACKET_LEN, Since we
don't allow authenticated mode, we require msg->length to be precisely
BFD_PACKET_LEN, otherwise we drop the packet. The real issue here is
that the code is confusing to someone reading the spec and cross
checking it.  I'll add a comment explaining what's going on.

>>
>> > The pseudocode at the end of 6.8.6 only mentions updating
>> > bfd.LocalDiag in a couple of cases, but the implementation appears to
>> > update LocalDiag in every case where it changes the session state.
>>
>> I found the RFC a bit confusing in this respect.  For example,
>> suppose you follow their instructions with respect to the diagnostic
>> strictly.  Then if the remote neighbor goes down, you will set
>> DIAG_RMT_DOWN, and have no way of going back to DIAG_NONE when the
>> session comes back up.  It isn't done upon reception of BFD packet
>> in either the DOWN or INIT states.  This didn't seem correct, to me,
>> so I choose to interpret the spec as setting the diagnostic to none
>> if not otherwise specified.  Does that sound reasonable?
>
> LocalDiag is described as:
>
>       The diagnostic code specifying the reason for the most recent
>       change in the local session state.  This MUST be initialized to
>       zero (No Diagnostic).
>
> I assumed, without carefully thinking it through, that this was sort
> of like errno, in that it can get set to some kind of error but
> nothing ever sets it back to 0, so that you can always use it to find
> out what the most recent error was, even after recovery.  I don't know
> whether that is the intent, but if it is then it would explain why it
> is never set back to "No Diagnostic".

It's all a bit confusing.  How about this: I'll change it to never set
"No Diagnostic" and add a note in the TODO at the top of the file to
check what another implementation does.  That should clear up the
ambiguity.

>
>> > I don't see anything in bfd_process_packet() that corresponds to:
>> >
>> >       If the packet was not discarded, it has been received for purposes
>> >       of the Detection Time expiration rules in section 6.8.4.
>> >
>> > but maybe I just missed it.
>>
>> I interpreted this line as satisfying the requirement:
>>
>>     bfd->detect_time = bfd_rx_interval(bfd) * bfd->mult + time_msec();
>
> Oh, then I think it's in the wrong place.  For example, this:
>
>       If bfd.SessionState is AdminDown
>
>           Discard the packet
>
> is implemented as:
>
>     if (bfd->state == STATE_ADMIN_DOWN) {
>         VLOG_DBG_RL(&rl, "Administratively down, dropping control message.");
>         return;
>     }
>
> but
>
>      bfd->detect_time = bfd_rx_interval(bfd) * bfd->mult + time_msec();
>
> happens before that, so that the detection time is reset even if the
> packet was discarded.

I think the spec is inconsistent here, or at the very least I don't
know how to interpret it.

First it tells us

      Update the Detection Time as described in section 6.8.4.

      If bfd.SessionState is AdminDown

          Discard the packet

Then much later it tells us

      If the packet was not discarded, it has been received for purposes
      of the Detection Time expiration rules in section 6.8.4.

It's not clear to me what it's expecting us to do in this second
location that we haven't already done in the first.  At any rate, I
think the code is reasonable as is.  If we find it's inconsistent with
other implementations, we can change it later.

In debugging the BFD implementation as I developed it, I found the
current format easier to interpret.  I'll go ahead and leave it for
now.

Since it's been so long, I'll go ahead and resend the latest version
of the patch.  I think it's pretty much reviewed and ready to go, so
don't feel obligated to comb over it as nothings changed.  Feel free
to make additional comments if you'd like of course.

Ethan



More information about the dev mailing list