[ovs-dev] [PATCH v3] ovn-northd: logical router icmp response should not care about inport

Thu May 26 18:05:18 UTC 2016

On Thu, May 26, 2016 at 12:39:36PM -0400, Flavio Fernandes wrote:
> To be discussed:
> 
> One caveat here is that ttl should be decremented before vm
> reaches <IP2.2>, assuming the ping packet is coming from
> a remote subnet.
> 
> In short, we ideally would have the match augmented to look
> like this.
> 
>   "(inport == %s || ip.ttl > 1) && (ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && ...
> 
> Afer talking to Russel [1] it seems that we have either a bug or
> a known limitation on how a match on ip.ttl can be used with the
> greater than operator. I tried a couple of tweaks [2] against a
> test ping with ttl 10 and concluded that both
> negation and greater than are not working as expected. Assuming
> ip.ttl should be supported in the match, there are 3 issues that
> should be followed upon, which may be outside the scope of this
> patch:
>   1) fix/support negation and greater than operator for ip.ttl
>   2) update doc on supported matches (ovn-sb);
>   3) expand unit tests to verify ip.ttl in match

Inequalities are intentionally unsupported for ip.ttl.  You can see this
pretty clearly from ovstest output:

    $ echo 'ip.ttl == 1' | tests/ovstest test-ovn parse-expr
    ip.ttl == 0x1
    $ echo 'ip.ttl > 1' | tests/ovstest test-ovn parse-expr
    Only == and != operators may be used with nominal field ip.ttl.

The proximate reason for this is that ip.ttl is a nominal field.  That's
because the OpenFlow field for IP TTL (NXM_NX_IP_TTL) is nonmaskable;
that is, at the OpenFlow layer, OVS rejects attempts to match on a
subset of bits in the TTL.

It would be easy to change both of these decisions.  Making
NXM_NX_IP_TTL maskable is a one-line change to a comment (!), and making
ip.ttl ordinal would be equally trivial.

But I made NXM_NX_IP_TTL nonmaskable for good reason.  That is because
inequality matches are expensive.  The comparison "ip.ttl > 1" needs 7
OpenFlow flows, each of which tests only one bit.  This can be a big
deal because each of those flows ends up in a separate hash table in the
classifier, adding 7 separate hash table lookups to any classifier
search, which is inefficient.

That's why the existing ip.ttl checks instead use "ip.ttl == {0,1}",
which only uses 2 OpenFlow flows, both of which can fit in the same
classifier lookup table.

So, it's better if we can use a design that can use a higher-priority
flow to handle too-low TTLs as a special case, then handle the common
case with a lower priority flow.

I may be microoptimizing.  I am open to that discussion.