[ovs-dev] DNS support feature (was: Re: DNS support options)

Mark Michelson mmichels at redhat.com
Fri Oct 27 16:29:19 UTC 2017


Usually the software that performs DNS lookups and caches results is
referred to as a "DNS forwarder". You configure resolv.conf's nameserver as
127.0.0.1. This way, DNS queries go to the DNS forwarder running on
localhost. The forwarder then has real nameservers configured to send the
queries to. The forwarder receives the DNS response (or lack of response if
the nameserver is unreachable), caches the result, and sends the result
back to the system resolver. From the perspective of OVS, all it is doing
is performing a simple DNS lookup. Further DNS lookups hit the forwarder,
which has cached the previous queries and can immediately return the cached
records (or cached failure).
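
To make the "simple DNS lookup" point concrete, here is a minimal sketch of the application side. Python is used only for brevity (OVS itself is C), and the helper name resolve() is made up for illustration. The point is that the caller goes through the system resolver and never needs to know whether a forwarder on 127.0.0.1 is answering:

```python
import socket


def resolve(host, port=0):
    """Return the sorted, de-duplicated addresses for host.

    Hypothetical helper for illustration. It calls the system resolver via
    getaddrinfo(), which consults resolv.conf; if resolv.conf points at a
    local forwarder, any caching happens there, not in this code.
    """
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling resolve() with a hostname looks identical whether or not a forwarder is installed; only the latency of repeated lookups changes.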

One common DNS forwarder that works this way is dnsmasq. There are likely
lots of others. I believe that bind can be configured to act as a DNS
forwarder as well. I recommend searching for "DNS forwarder" and seeing
what you can find. The common thread here, though, is that OVS would not
actually package or depend on a DNS forwarder. Rather, we would add
documentation that strongly suggests that applications that consider DNS
resolution to be critical configure their favorite DNS forwarder on systems
running OVS.

Since we are looking into using a third party library for DNS resolution,
it may also be worth looking into what sort of caching those libraries
provide. I am not familiar with that information off the top of my head.

Mark

On Fri, Oct 27, 2017 at 11:14 AM Yifeng Sun <pkusunyifeng at gmail.com> wrote:

> Mark, thanks a lot for the detailed and thorough explanation.
>
> Do you happen to know of any other projects that we can take a peek at?
>
> On Fri, Oct 27, 2017 at 8:57 AM, Mark Michelson <mmichels at redhat.com>
> wrote:
>
>> Yep, that makes good sense. I'd recommend having some min and max
>> threshold though. That way, if a record has a TTL of multiple days, you can
>> round that down to something more reasonable. Similarly, a ridiculously low
>> TTL can be rounded up.
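
The min/max idea can be sketched in a couple of lines. The bounds below (30 seconds and one hour) are assumed values picked for illustration, not anything proposed in this thread:

```python
MIN_TTL = 30      # seconds; assumed floor for very low TTLs
MAX_TTL = 3600    # seconds; assumed ceiling for multi-day TTLs


def clamp_ttl(ttl, min_ttl=MIN_TTL, max_ttl=MAX_TTL):
    """Clamp a record's declared TTL (in seconds) into [min_ttl, max_ttl]."""
    return max(min_ttl, min(ttl, max_ttl))
```

With these bounds, a three-day TTL is cached for one hour, a one-second TTL for 30 seconds, and anything in between is used as declared.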
>>
>> There's another aspect to DNS caching that I briefly mentioned in my last
>> e-mail: negative caching. Consider that you attempt to look up
>> exampledomain.com, and for some reason, you either can't reach your DNS
>> server or the DNS server returns NXDOMAIN (or some other error). Each
>> additional lookup of exampledomain.com will likely hit this same problem
>> until either you can restore the link to your DNS server or the proper
>> records get added to DNS. If exampledomain.com is a frequent destination
>> for traffic, you don't want to be doing full DNS lookups of it each time,
>> just to find out you can't send to it. This is especially true in the case
>> that the DNS server is unreachable. With resolv.conf defaults, the resolver
>> waits 5 seconds per attempt and makes two attempts before giving up. This
>> means each DNS query will take at least 10 seconds just to ultimately fail.
>> This can lead to large backlogs of queries. With negative caching, you
>> cache the fact that an attempt to reach exampledomain.com failed, and
>> for some amount of time, any further attempts to query that domain will
>> immediately fail rather than attempting the query again.
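
A rough sketch of negative caching, under some loud assumptions: the class name, the fixed 60-second failure lifetime, and the injectable clock are all made up for illustration, and a fixed lifetime is a simplification (RFC 2308 derives the negative TTL from the zone's SOA record). It wraps any lookup function that raises OSError on failure, which socket.getaddrinfo() does:

```python
import time


class NegativeCache:
    """Remember failed lookups so repeats fail fast instead of timing out."""

    def __init__(self, lookup, negative_ttl=60.0, clock=time.monotonic):
        self._lookup = lookup              # callable(name) -> result, raises OSError
        self._negative_ttl = negative_ttl  # assumed fixed lifetime, in seconds
        self._clock = clock
        self._failed_until = {}            # name -> time the cached failure expires

    def resolve(self, name):
        expiry = self._failed_until.get(name)
        if expiry is not None:
            if self._clock() < expiry:
                # Fail immediately; do not repeat the slow lookup.
                raise LookupError(f"{name}: cached failure")
            del self._failed_until[name]
        try:
            return self._lookup(name)
        except OSError:
            # Record the failure so later callers skip the real lookup.
            self._failed_until[name] = self._clock() + self._negative_ttl
            raise
```

With an unreachable server, only the first lookup of a name pays the full resolver timeout; later lookups within the negative TTL fail immediately.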
>>
>> I mention negative caching because in some ways, it's more important than
>> positive caching. I've seen certain applications get completely crippled by
>> having the DNS server unavailable since they'll keep attempting to perform
>> DNS lookups for domains that will ultimately fail. Caching positive results
>> based on TTL of records won't help in a case like this.
>>
>> I'm not going to take a hard-line stand that DNS caching absolutely
>> should not be added to OVS. But I point out negative caching as one of
>> those things that may not be obvious at first but that is very important.
>> If we can implement it successfully, then that is fantastic. But I just
>> know that other projects have already gone through the pains of trying to
>> implement this properly and if we can piggyback on their efforts, we'll
>> probably end up happier in the long run.
>>
>> Mark
>>
>> PS There's actually an RFC dedicated to negative caching of DNS
>> responses: https://tools.ietf.org/html/rfc2308
>>
>> On Fri, Oct 27, 2017 at 10:19 AM Ben Pfaff <blp at ovn.org> wrote:
>>
>>> Does it make sense to cache the entry until its declared TTL expires?
>>>
>>> On Fri, Oct 27, 2017 at 01:30:41PM +0000, Mark Michelson wrote:
>>> > This opens the can of worms that is DNS caching.
>>> >
>>> > On one end of the spectrum, you can always perform a full DNS lookup of
>>> > a target and never store the result. If the result of the lookup
>>> > changes, then you will know about it as soon as possible. However, the
>>> > repeated DNS lookups are very expensive, and most of the time, those
>>> > lookups will be redundant.
>>> >
>>> > On the other end of the spectrum, you can look up a target once and
>>> > cache the result forever. It is much less expensive since you only ever
>>> > do one DNS lookup, but if the DNS record ever changes, you will send
>>> > traffic to the wrong address.
>>> >
>>> > Usually what works best is some middle ground. Essentially, cache the
>>> > result of a DNS lookup for some configured time. After that time is up,
>>> > perform the lookup again. It's a compromise. You don't have to perform
>>> > lookups as often, which is good. However, most of the lookups you still
>>> > perform will be redundant since it is unlikely that results will change.
>>> > And if the address does change, then you will not detect it until the
>>> > current cached result is stale.
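
The middle-ground approach can be sketched as a cache with one configured lifetime. Everything here (the class name, the 300-second default, the injectable clock) is an assumption for illustration, not a proposed OVS design:

```python
import time


class TimedCache:
    """Cache each lookup result for a fixed, configured lifetime."""

    def __init__(self, lookup, lifetime=300.0, clock=time.monotonic):
        self._lookup = lookup      # callable(name) -> result
        self._lifetime = lifetime  # assumed configured time, in seconds
        self._clock = clock
        self._entries = {}         # name -> (result, expiry time)

    def resolve(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]        # still fresh: no lookup performed
        result = self._lookup(name)
        self._entries[name] = (result, now + self._lifetime)
        return result
```

The trade-off described above shows up directly: a changed address stays invisible until the entry expires, and most refreshes return the same answer as before.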
>>> >
>>> > Having said all that, caching is hard. If we can avoid having to do it
>>> > ourselves, that's a good thing. One way to go with this is to go with the
>>> > extreme of always performing DNS lookups. We can then recommend that
>>> > users of OVS that will be performing frequent DNS lookups also run a DNS
>>> > forwarder that has built-in caching (as well as negative caching. That's
>>> > another headache). That separates concerns more evenly.
>>> >
>>> > Anyways, that's just my opinion. If you decide that a DNS caching layer
>>> > in OVS is appropriate, then that's fine too.
>>> >
>>> > Mark
>>> >
>>> > On Thu, Oct 26, 2017 at 5:14 PM Yifeng Sun <pkusunyifeng at gmail.com>
>>> wrote:
>>> >
>>> > > Thanks Mark for your reply.
>>> > >
>>> > > There is one more thing. If we bring DNS into play, we may need a
>>> > > mechanism to watch for changes to IP addresses that have already been
>>> > > resolved and are in use.
>>> > >
>>> > > Thanks,
>>> > > Yifeng
>>> > >
>>> > > On Thu, Oct 26, 2017 at 12:10 PM, Mark Michelson <
>>> mmichels at redhat.com>
>>> > > wrote:
>>> > >
>>> > >> On Wed, Oct 25, 2017 at 4:16 PM Yifeng Sun <pkusunyifeng at gmail.com>
>>> > >> wrote:
>>> > >>
>>> > >>> I feel that unbound stands out among the available open source DNS
>>> > >>> resolvers.
>>> > >>>
>>> > >>> Below is the summary for unbound:
>>> > >>> * The actual resolving work is done by a background process or thread.
>>> > >>> A background process or thread seems unavoidable. Linux's
>>> > >>> getaddrinfo_a clones a thread similarly.
>>> > >>> * It has been ported to Linux, BSD, Windows, MacOS/X and Solaris/SPARC.
>>> > >>> This is good because OVS runs on a large range of platforms.
>>> > >>> * It complies with the standard, with optional DNSSEC support. Some of
>>> > >>> its features may not be needed in our case.
>>> > >>> * The unbound context is thread-safe. Its internal locks may bring some
>>> > >>> overhead. But since DNS resolution is not frequent in OVS, I suppose
>>> > >>> this small overhead is not an issue.
>>> > >>>
>>> > >>> Unbound looks like a good option. Another option is to create a
>>> > >>> background thread which processes DNS resolving requests from the main
>>> > >>> thread and sends back the resulting events to the main thread. This
>>> > >>> method is quite simple and straightforward.
>>> > >>>
>>> > >>> The above is what I have so far. Please give your thoughts, thanks a lot.
>>> > >>>
>>> > >>
>>> > >> If portability to all of the systems you mentioned in your second bullet
>>> > >> point is important, then you can rule out a couple of options:
>>> > >> * getaddrinfo_a is a GNU extension and is only available with glibc
>>> > >> * The resolver functions[1] are a BSD specification so they'd be
>>> > >> available on most platforms, but not on Windows. I don't personally
>>> > >> recommend these because of the need to manually parse the DNS responses
>>> > >> you receive.
>>> > >>
>>> > >> That leaves two options:
>>> > >> * Run a background thread that uses getaddrinfo() to perform resolution.
>>> > >> * Use a third-party library (like unbound).
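
For comparison, the background-thread option might look roughly like the sketch below. It is in Python rather than C purely to keep it short, and the function name and queue-based hand-off are assumptions for illustration. A worker thread makes the blocking getaddrinfo() calls and posts results back, so the main thread never blocks on DNS:

```python
import queue
import socket
import threading


def start_resolver():
    """Start a worker thread that resolves hostnames off the main thread.

    Returns (requests, results, thread). Put hostnames on `requests`
    (or None to stop the worker); read (host, addresses, error) tuples
    from `results`.
    """
    requests = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            host = requests.get()
            if host is None:          # sentinel: shut the worker down
                break
            try:
                infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
                addresses = sorted({info[4][0] for info in infos})
                results.put((host, addresses, None))
            except OSError as err:    # socket.gaierror is a subclass of OSError
                results.put((host, [], err))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return requests, results, thread
```

A main loop would typically poll `results` with get_nowait() on each iteration rather than blocking. Note that this sketch handles one request at a time, so a single slow lookup delays everything queued behind it.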
>>> > >>
>>> > >> Of these two options, I feel like the third-party library is the better
>>> > >> option. The only downside I can think of is the extra dependency for
>>> > >> OVS. And as far as what third-party library to use, I was the one that
>>> > >> suggested unbound in the first place, so obviously I'm fine with using
>>> > >> it :)
>>> > >>
>>> > >> Mark
>>> > >>
>>> > >> [1] http://man7.org/linux/man-pages/man3/resolver.3.html
>>> > >>
>>> > >>
>>> > >>>
>>> > >>> Below is the link for original discussion:
>>> > >>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337038.html
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> On Wed, Oct 25, 2017 at 2:11 PM, Ben Pfaff <blp at ovn.org> wrote:
>>> > >>>
>>> > >>>> Hello everyone, please allow me to introduce Yifeng Sun (CCed), who
>>> > >>>> recently joined VMware's Open vSwitch team.  I've asked Yifeng to
>>> > >>>> start out by working on DNS support for Open vSwitch.  Yifeng, can you
>>> > >>>> tell us about what you've discovered so far, based on this thread from
>>> > >>>> August that I'm reviving, and your tentative conclusions?
>>> > >>>>
>>> > >>>> Thanks,
>>> > >>>>
>>> > >>>> Ben.
>>> > >>>>
>>> > >>>
>>> > >>>
>>> > >
>>>
>>
>

