[ovs-dev] DNS support feature (was: Re: DNS support options)

Yifeng Sun pkusunyifeng at gmail.com
Fri Oct 27 16:14:25 UTC 2017


Mark, thanks a lot for the detailed and thorough explanation.

Do you happen to know of any other projects that we could take a peek at?

On Fri, Oct 27, 2017 at 8:57 AM, Mark Michelson <mmichels at redhat.com> wrote:

> Yep, that makes good sense. I'd recommend having some min and max
> threshold though. That way, if a record has a TTL of multiple days, you can
> round that down to something more reasonable. Similarly, a ridiculously low
> TTL can be rounded up.
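
A minimal sketch of the min/max threshold idea, in Python for brevity. The bounds (30 seconds and 1 hour) and the name clamp_ttl are assumptions for illustration only; nothing in this thread proposes specific values.

```python
# Hypothetical bounds; the thread does not propose specific values.
MIN_TTL_SECONDS = 30
MAX_TTL_SECONDS = 3600


def clamp_ttl(record_ttl, min_ttl=MIN_TTL_SECONDS, max_ttl=MAX_TTL_SECONDS):
    """Return how long to cache a record, bounded to [min_ttl, max_ttl]."""
    if min_ttl > max_ttl:
        raise ValueError("min_ttl must not exceed max_ttl")
    return max(min_ttl, min(record_ttl, max_ttl))


print(clamp_ttl(3 * 24 * 3600))  # multi-day TTL is rounded down to max_ttl
print(clamp_ttl(1))              # very low TTL is rounded up to min_ttl
print(clamp_ttl(300))            # in-range TTL is kept as-is
```
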
>
> There's another aspect to DNS caching that I briefly mentioned in my last
> e-mail: negative caching. Consider that you attempt to look up
> exampledomain.com, and for some reason, you either can't reach your DNS
> server or the DNS server returns NXDOMAIN (or some other error). Each
> additional lookup of exampledomain.com will likely hit this same problem
> until either you can restore the link to your DNS server or the proper
> records get added to DNS. If exampledomain.com is a frequent destination
> for traffic, you don't want to be doing full DNS lookups of it each time,
> just to find out you can't send to it. This is especially true in the case
> that the DNS server is unreachable. With the default resolv.conf options, the
> resolver waits 5 seconds per attempt and tries the query twice before giving
> up. This means each DNS query takes 10 seconds just to ultimately fail.
> This can lead to large backlogs of queries. With negative caching, you
> cache the fact that an attempt to reach exampledomain.com failed, and for
> some amount of time, any further attempts to query that domain will
> immediately fail rather than attempting the query again.
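
To make the negative caching idea concrete, here is a rough Python sketch. The names NegativeCache and LookupFailed, the injected resolve callable, and the fixed 60-second negative lifetime are all assumptions for illustration; this is not how unbound or any other resolver implements it.

```python
import time


class LookupFailed(Exception):
    """Raised when a name cannot be resolved (hypothetical error type)."""


class NegativeCache:
    """Remember failed lookups so repeated queries fail fast."""

    def __init__(self, resolve, negative_ttl=60.0, clock=time.monotonic):
        self._resolve = resolve          # callable: name -> list of addresses
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._failed_until = {}          # name -> time the failure entry expires

    def lookup(self, name):
        expires = self._failed_until.get(name)
        if expires is not None:
            if self._clock() < expires:
                # Fail immediately instead of waiting on the resolver again.
                raise LookupFailed(name + " (cached failure)")
            del self._failed_until[name]
        try:
            return self._resolve(name)
        except LookupFailed:
            # Record the failure so later lookups skip the slow path.
            self._failed_until[name] = self._clock() + self._negative_ttl
            raise
```

With this in place, only the first lookup of an unreachable name pays the full resolver timeout; later lookups within negative_ttl raise right away. A fixed negative lifetime is a simplification chosen for the sketch.
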
>
> I mention negative caching because in some ways, it's more important than
> positive caching. I've seen certain applications get completely crippled by
> having the DNS server unavailable since they'll keep attempting to perform
> DNS lookups for domains that will ultimately fail. Caching positive results
> based on TTL of records won't help in a case like this.
>
> I'm not going to take a hard-line stand that DNS caching absolutely should
> not be added to OVS. But I point out negative caching as one of those
> things that may seem non-obvious at first but that is very important. If we
> can implement it successfully, then that is fantastic. But I just know that
> other projects have already gone through the pains of trying to implement
> this properly and if we can piggyback on their efforts, we'll probably end
> up happier in the long run.
>
> Mark
>
> PS There's actually an RFC dedicated to negative caching of DNS responses:
> https://tools.ietf.org/html/rfc2308
>
> On Fri, Oct 27, 2017 at 10:19 AM Ben Pfaff <blp at ovn.org> wrote:
>
>> Does it make sense to cache the entry until its declared TTL expires?
>>
>> On Fri, Oct 27, 2017 at 01:30:41PM +0000, Mark Michelson wrote:
>> > This opens the can of worms that is DNS caching.
>> >
>> > On one end of the spectrum, you can always perform a full DNS lookup of a
>> > target and never store the result. If the result of the lookup changes,
>> > then you will know about it as soon as possible. However, the repeated DNS
>> > lookups are very expensive, and most of the time, those lookups will be
>> > redundant.
>> >
>> > On the other end of the spectrum, you can look up a target once and cache
>> > the result forever. It is much less expensive since you only ever do one
>> > DNS lookup, but if the DNS record ever changes, you will send traffic to
>> > the wrong address.
>> >
>> > Usually what works best is some middle ground. Essentially, cache the
>> > result of a DNS lookup for some configured time. After that time is up,
>> > perform the lookup again. It's a compromise. You don't have to perform
>> > lookups as often, which is good. However, most of the lookups you still
>> > perform will be redundant since it is unlikely that results will change.
>> > And if the address does change, then you will not detect it until the
>> > current cached result is stale.
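
The middle ground described above can be sketched in a few lines of Python. TimedCache, the injected resolve callable, and the 300-second default lifetime are assumptions made for illustration, not an existing OVS interface.

```python
import time


class TimedCache:
    """Cache each lookup result for a fixed, configured lifetime."""

    def __init__(self, resolve, lifetime=300.0, clock=time.monotonic):
        self._resolve = resolve      # callable: name -> list of addresses
        self._lifetime = lifetime
        self._clock = clock
        self._entries = {}           # name -> (addresses, expiry time)

    def lookup(self, name):
        entry = self._entries.get(name)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]          # still fresh: no DNS query needed
        addresses = self._resolve(name)
        self._entries[name] = (addresses, self._clock() + self._lifetime)
        return addresses
```

The trade-off is visible in lookup(): while an entry is fresh the resolver is never consulted, so an address change goes unnoticed until the entry expires.
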
>> >
>> > Having said all that, caching is hard. If we can avoid having to do it
>> > ourselves, that's a good thing. One way to go with this is to go with the
>> > extreme of always performing DNS lookups. We can then recommend that users
>> > of OVS that will be performing frequent DNS lookups also run a DNS
>> > forwarder that has built-in caching (as well as negative caching. That's
>> > another headache). That separates concerns more evenly.
>> >
>> > Anyways, that's just my opinion. If you decide that a DNS caching layer in
>> > OVS is appropriate, then that's fine too.
>> >
>> > Mark
>> >
>> > On Thu, Oct 26, 2017 at 5:14 PM Yifeng Sun <pkusunyifeng at gmail.com> wrote:
>> >
>> > > Thanks Mark for your reply.
>> > >
>> > > There is one more thing. If we bring DNS into play, we may need a
>> > > mechanism to watch for changes to IP addresses that were already
>> > > resolved and are being used.
>> > >
>> > > Thanks,
>> > > Yifeng
>> > >
>> > > On Thu, Oct 26, 2017 at 12:10 PM, Mark Michelson <mmichels at redhat.com> wrote:
>> > >
>> > >> On Wed, Oct 25, 2017 at 4:16 PM Yifeng Sun <pkusunyifeng at gmail.com> wrote:
>> > >>
>> > >>> I feel that unbound stands out among the available open source DNS
>> > >>> resolvers.
>> > >>>
>> > >>> Below is the summary for unbound:
>> > >>> * The actual resolving work is done by a background process or thread. A
>> > >>> background process or thread seems unavoidable. Linux's getaddrinfo_a
>> > >>> clones a thread similarly.
>> > >>> * It has been ported to Linux, BSD, Windows, MacOS/X and Solaris/SPARC.
>> > >>> This is good because OVS runs on a large range of platforms.
>> > >>> * It complies with the standard, with optional DNSSEC support. Some of
>> > >>> its features may not be needed in our case.
>> > >>> * The unbound context is thread-safe. Its internal locks may bring some
>> > >>> overhead. But since DNS resolving is not frequent in OVS, I suppose
>> > >>> this small overhead is not an issue.
>> > >>>
>> > >>> Unbound looks like a good option. Another option is to create a
>> > >>> background thread which processes DNS resolving requests from the main
>> > >>> thread and sends back the resulting events to the main thread. This
>> > >>> method is quite simple and straightforward.
>> > >>>
>> > >>> The above is what I have so far. Please share your thoughts. Thanks a lot.
>> > >>>
>> > >>
>> > >> If portability to all of the systems you mentioned in your second bullet
>> > >> point is important, then you can rule out a couple of options:
>> > >> * getaddrinfo_a is a GNU extension and is only available with glibc
>> > >> * The resolver functions[1] are a BSD specification so they'd be
>> > >> available on most platforms, but not on Windows. I don't personally
>> > >> recommend these because of the need to manually parse the DNS responses
>> > >> you receive.
>> > >>
>> > >> That leaves two options:
>> > >> * Run a background thread that uses getaddrinfo() to perform resolution.
>> > >> * Use a third-party library (like unbound).
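
For reference, the first option (a background thread that calls getaddrinfo() and hands results back to the main thread) might look roughly like this in Python. BackgroundResolver, system_resolve, and the polling interface are hypothetical names chosen for the sketch; OVS itself would do this in C against its own event loop.

```python
import queue
import socket
import threading


def system_resolve(name):
    """Blocking lookup via getaddrinfo(); returns a sorted list of addresses."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


class BackgroundResolver:
    """Resolve names on a worker thread; the main thread polls for results."""

    def __init__(self, resolve=system_resolve):
        self._resolve = resolve
        self._requests = queue.Queue()
        self._results = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            name = self._requests.get()
            if name is None:             # sentinel: shut down the worker
                return
            try:
                self._results.put((name, self._resolve(name), None))
            except OSError as error:     # socket.gaierror is a subclass
                self._results.put((name, None, error))

    def request(self, name):
        """Queue a lookup without blocking the caller."""
        self._requests.put(name)

    def poll(self):
        """Return one (name, addresses, error) tuple, or None if none is ready."""
        try:
            return self._results.get_nowait()
        except queue.Empty:
            return None

    def close(self):
        """Stop the worker after it finishes the requests already queued."""
        self._requests.put(None)
        self._thread.join()
```

The main loop would call request() when it needs a name and poll() on each iteration, so a slow or unreachable DNS server only stalls the worker thread.
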
>> > >>
>> > >> Of these two options, I feel like the third-party library is the better
>> > >> option. The only downside I can think of is the extra dependency for OVS.
>> > >> And as far as what third-party library to use, I was the one that
>> > >> suggested unbound in the first place, so obviously I'm fine with using it :)
>> > >>
>> > >> Mark
>> > >>
>> > >> [1] http://man7.org/linux/man-pages/man3/resolver.3.html
>> > >>
>> > >>
>> > >>>
>> > >>> Below is the link for original discussion:
>> > >>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337038.html
>> > >>>
>> > >>>
>> > >>>
>> > >>> On Wed, Oct 25, 2017 at 2:11 PM, Ben Pfaff <blp at ovn.org> wrote:
>> > >>>
>> > >>>> Hello everyone, please allow me to introduce Yifeng Sun (CCed), who
>> > >>>> recently joined VMware's Open vSwitch team.  I've asked Yifeng to start
>> > >>>> out by working on DNS support for Open vSwitch.  Yifeng, can you tell us
>> > >>>> about what you've discovered so far, based on this thread from August
>> > >>>> that I'm reviving, and your tentative conclusions?
>> > >>>>
>> > >>>> Thanks,
>> > >>>>
>> > >>>> Ben.
>> > >>>>
>> > >>>
>> > >>>
>> > >
>>
>

