[ovs-dev] DNS support feature (was: Re: DNS support options)

Miguel Angel Ajo Pelayo majopela at redhat.com
Mon Oct 30 09:52:14 UTC 2017


I don't believe rounding up TTLs would be good practice. Administrators
are aware of the risks of a very low TTL, so if they pick one they
probably have good reasons (for example an imminent move of the IP
address, some sort of failover mechanism, or a way of dealing with
dynamic DNS). If you round the TTL up, you add unexpected extra downtime.
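
To make that concrete, here is a minimal Python sketch of the policy I have
in mind: keep the upper bound Mark suggested, but never raise a short TTL.
The function name effective_ttl and the one-hour cap are assumptions made
for illustration only; nothing like this exists in OVS today.

```python
def effective_ttl(record_ttl, max_ttl=3600):
    """Return how many seconds a record may be cached.

    Long TTLs are capped at max_ttl (an assumed, configurable limit).
    Short TTLs are honoured exactly as published, because rounding
    them up would delay failover.
    """
    if record_ttl < 0:
        raise ValueError("TTL must be non-negative")
    return min(record_ttl, max_ttl)
```

With this, a record published with a 5 second TTL is cached for 5 seconds,
while a record published with a TTL of two days is cached for one hour.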



On Fri, Oct 27, 2017 at 6:38 PM, Yifeng Sun <pkusunyifeng at gmail.com> wrote:

> Thanks a lot. I will keep your guidance in mind.
>
> On Fri, Oct 27, 2017 at 9:29 AM, Mark Michelson <mmichels at redhat.com>
> wrote:
>
> > Usually the software that performs DNS lookups and caches results is
> > referred to as a "DNS forwarder". You configure resolv.conf's nameserver
> > as 127.0.0.1. This way, DNS queries go to the DNS forwarder running on
> > localhost. The forwarder then has real nameservers configured to send the
> > queries to. The forwarder receives the DNS response (or lack of response
> > if the nameserver is unreachable), caches the result, and sends the result
> > back to the system resolver. From the perspective of OVS, all it is doing
> > is performing a simple DNS lookup. Further DNS lookups hit the forwarder,
> > which has cached the previous queries and can immediately return the
> > cached records (or cached failure).
> >
> > One common DNS forwarder that works this way is dnsmasq. There are likely
> > lots of others. I believe that bind can be configured to act as a DNS
> > forwarder as well. I recommend searching for "DNS forwarder" and seeing
> > what you can find. The common thread here, though, is that OVS would not
> > actually package or depend on a DNS forwarder. Rather, we would add
> > documentation that strongly suggests that applications that consider DNS
> > resolution to be critical configure their favorite DNS forwarder on
> > systems running OVS.
> >
> > Since we are looking into using a third party library for DNS resolution,
> > it may also be worth looking into what sort of caching those libraries
> > provide. I am not familiar with that information off the top of my head.
> >
> > Mark
> >
> > On Fri, Oct 27, 2017 at 11:14 AM Yifeng Sun <pkusunyifeng at gmail.com>
> > wrote:
> >
> >> Mark, thanks a lot for the detailed and thorough explanation.
> >>
> >> Do you happen to know any other projects that we can take a peek at?
> >>
> >> On Fri, Oct 27, 2017 at 8:57 AM, Mark Michelson <mmichels at redhat.com>
> >> wrote:
> >>
> >>> Yep, that makes good sense. I'd recommend having some min and max
> >>> threshold though. That way, if a record has a TTL of multiple days,
> >>> you can round that down to something more reasonable. Similarly, a
> >>> ridiculously low TTL can be rounded up.
> >>>
> >>> There's another aspect to DNS caching that I briefly mentioned in my
> >>> last e-mail: negative caching. Consider that you attempt to look up
> >>> exampledomain.com, and for some reason, you either can't reach your
> >>> DNS server or the DNS server returns NXDOMAIN (or some other error).
> >>> Each additional lookup of exampledomain.com will likely hit this same
> >>> problem until either you can restore the link to your DNS server or the
> >>> proper records get added to DNS. If exampledomain.com is a frequent
> >>> destination for traffic, you don't want to be doing full DNS lookups
> >>> of it each time, just to find out you can't send to it. This is
> >>> especially true in the case that the DNS server is unreachable. With
> >>> the resolv.conf defaults, the resolver waits 5 seconds per attempt and
> >>> attempts the query twice before giving up. This means each DNS query
> >>> will take 10 seconds just to ultimately fail. This can lead to large
> >>> backlogs of queries. With negative caching, you cache the fact that an
> >>> attempt to reach exampledomain.com failed, and for some amount of time,
> >>> any further attempts to query that domain will immediately fail rather
> >>> than attempting the query again.
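
For illustration, the negative caching described above could be sketched in
Python roughly as follows. The class name NegativeCache, the 30 second
default and the injected resolve/clock callables are all assumptions made
so the example is self-contained; RFC 2308 covers details (such as deriving
the negative TTL from the SOA record) that this sketch ignores.

```python
import time


class NegativeCache:
    """Remember failed lookups so that repeats fail immediately for a while."""

    def __init__(self, resolve, negative_ttl=30.0, clock=time.monotonic):
        self._resolve = resolve            # callable: name -> result, raises OSError on failure
        self._negative_ttl = negative_ttl  # seconds to remember a failure (assumed value)
        self._clock = clock
        self._failed_until = {}            # name -> time at which a retry is allowed

    def lookup(self, name):
        retry_at = self._failed_until.get(name)
        if retry_at is not None:
            if self._clock() < retry_at:
                # Fail fast instead of waiting for another resolver timeout.
                raise LookupError(f"{name}: cached failure")
            del self._failed_until[name]
        try:
            return self._resolve(name)
        except OSError:
            # Record the failure, then let the caller see the original error.
            self._failed_until[name] = self._clock() + self._negative_ttl
            raise
```

A real resolver can be plugged in as
NegativeCache(lambda name: socket.getaddrinfo(name, None)), since
socket.gaierror is a subclass of OSError.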
> >>>
> >>> I mention negative caching because in some ways, it's more important
> >>> than positive caching. I've seen certain applications get completely
> >>> crippled by having the DNS server unavailable since they'll keep
> >>> attempting to perform DNS lookups for domains that will ultimately
> >>> fail. Caching positive results based on TTL of records won't help in a
> >>> case like this.
> >>>
> >>> I'm not going to take a hard-line stand that DNS caching absolutely
> >>> should not be added to OVS. But I point out negative caching as one of
> >>> those things that may seem non-obvious at first but that is very
> >>> important. If we can implement it successfully, then that is fantastic.
> >>> But I just know that other projects have already gone through the pains
> >>> of trying to implement this properly and if we can piggyback on their
> >>> efforts, we'll probably end up happier in the long run.
> >>>
> >>> Mark
> >>>
> >>> PS There's actually an RFC dedicated to negative caching of DNS
> >>> responses: https://tools.ietf.org/html/rfc2308
> >>>
> >>> On Fri, Oct 27, 2017 at 10:19 AM Ben Pfaff <blp at ovn.org> wrote:
> >>>
> >>>> Does it make sense to cache the entry until its declared TTL expires?
> >>>>
> >>>> On Fri, Oct 27, 2017 at 01:30:41PM +0000, Mark Michelson wrote:
> >>>> > This opens the can of worms that is DNS caching.
> >>>> >
> >>>> > On one end of the spectrum, you can always perform a full DNS lookup
> >>>> > of a target and never store the result. If the result of the lookup
> >>>> > changes, then you will know about it as soon as possible. However, the
> >>>> > repeated DNS lookups are very expensive, and most of the time, those
> >>>> > lookups will be redundant.
> >>>> >
> >>>> > On the other end of the spectrum, you can look up a target once and
> >>>> > cache the result forever. It is much less expensive since you only
> >>>> > ever do one DNS lookup, but if the DNS record ever changes, you will
> >>>> > send traffic to the wrong address.
> >>>> >
> >>>> > Usually what works best is some middle ground. Essentially, cache the
> >>>> > result of a DNS lookup for some configured time. After that time is
> >>>> > up, perform the lookup again. It's a compromise. You don't have to
> >>>> > perform lookups as often, which is good. However, most of the lookups
> >>>> > you still perform will be redundant since it is unlikely that results
> >>>> > will change. And if the address does change, then you will not detect
> >>>> > it until the current cached result is stale.
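
A minimal Python sketch of that middle ground, assuming a single configured
lifetime for every entry. TimedCache, its 60 second default and the
injected resolve/clock callables are hypothetical names chosen for the
example, not an existing OVS or library API.

```python
import time


class TimedCache:
    """Cache each lookup result for a fixed, configured lifetime."""

    def __init__(self, resolve, lifetime=60.0, clock=time.monotonic):
        self._resolve = resolve    # callable: name -> address
        self._lifetime = lifetime  # seconds a cached result stays valid (assumed value)
        self._clock = clock
        self._entries = {}         # name -> (address, expires_at)

    def lookup(self, name):
        entry = self._entries.get(name)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]        # still fresh: no DNS traffic at all
        # Missing or stale: do the full lookup and restart the timer.
        address = self._resolve(name)
        self._entries[name] = (address, self._clock() + self._lifetime)
        return address
```

The trade-off from the paragraph above is visible here: a changed address
is only noticed on the first lookup after the entry expires.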
> >>>> >
> >>>> > Having said all that, caching is hard. If we can avoid having to do
> >>>> > it ourselves, that's a good thing. One way to go with this is to go
> >>>> > with the extreme of always performing DNS lookups. We can then
> >>>> > recommend that users of OVS that will be performing frequent DNS
> >>>> > lookups also run a DNS forwarder that has built-in caching (as well as
> >>>> > negative caching. That's another headache). That separates concerns
> >>>> > more evenly.
> >>>> >
> >>>> > Anyways, that's just my opinion. If you decide that a DNS caching
> >>>> > layer in OVS is appropriate, then that's fine too.
> >>>> >
> >>>> > Mark
> >>>> >
> >>>> > On Thu, Oct 26, 2017 at 5:14 PM Yifeng Sun <pkusunyifeng at gmail.com>
> >>>> > wrote:
> >>>> >
> >>>> > > Thanks Mark for your reply.
> >>>> > >
> >>>> > > There is one more thing. If we bring DNS into play, we may need a
> >>>> > > mechanism to watch for changes of IP addresses that were already
> >>>> > > resolved and are being used.
> >>>> > >
> >>>> > > Thanks,
> >>>> > > Yifeng
> >>>> > >
> >>>> > > On Thu, Oct 26, 2017 at 12:10 PM, Mark Michelson
> >>>> > > <mmichels at redhat.com> wrote:
> >>>> > >
> >>>> > >> On Wed, Oct 25, 2017 at 4:16 PM Yifeng Sun
> >>>> > >> <pkusunyifeng at gmail.com> wrote:
> >>>> > >>
> >>>> > >>> I feel that unbound stands out among the available open source DNS
> >>>> > >>> resolvers.
> >>>> > >>>
> >>>> > >>> Below is the summary for unbound:
> >>>> > >>> * The actual resolving work is done by a background process or
> >>>> > >>> thread. A background process or thread seems unavoidable. Linux's
> >>>> > >>> getaddrinfo_a clones a thread similarly.
> >>>> > >>> * It is ported to Linux, BSD, Windows, MacOS/X and Solaris/SPARC.
> >>>> > >>> This is good because OVS runs on a large range of platforms.
> >>>> > >>> * It complies with the standard, with optional DNSSEC support. Some
> >>>> > >>> of its features may not be needed in our case.
> >>>> > >>> * The unbound context is thread-safe. Its internal locks may bring
> >>>> > >>> some overhead. But since DNS resolving is not frequent in OVS, I
> >>>> > >>> suppose this small overhead is not an issue.
> >>>> > >>>
> >>>> > >>> Unbound looks like a good option. Another option is to create a
> >>>> > >>> background thread which processes DNS resolving requests from the
> >>>> > >>> main thread and sends back the resulting events to the main thread.
> >>>> > >>> This method is quite simple and straightforward.
> >>>> > >>>
> >>>> > >>> The above is what I have so far. Please give your thoughts, thanks
> >>>> > >>> a lot.
> >>>> > >>>
> >>>> > >>
> >>>> > >> If portability to all of the systems you mentioned in your second
> >>>> > >> bullet point is important, then you can rule out a couple of options:
> >>>> > >> * getaddrinfo_a is a GNU extension and is only available with glibc
> >>>> > >> * The resolver functions[1] are a BSD specification so they'd be
> >>>> > >> available on most platforms, but not on Windows. I don't personally
> >>>> > >> recommend these because of the need to manually parse the DNS
> >>>> > >> responses you receive.
> >>>> > >>
> >>>> > >> That leaves two options:
> >>>> > >> * Run a background thread that uses getaddrinfo() to perform
> >>>> > >> resolution.
> >>>> > >> * Use a third-party library (like unbound).
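
As a rough illustration of the first option, here is a Python sketch of a
background thread that calls getaddrinfo() and hands results back through a
queue. OVS itself is written in C, so this only shows the shape of the
design; BackgroundResolver and its result tuple layout are assumptions made
for the example.

```python
import queue
import socket
import threading


class BackgroundResolver:
    """Resolve names on a worker thread so the main thread never blocks."""

    def __init__(self):
        self._requests = queue.Queue()
        self.results = queue.Queue()   # items: (name, addresses or None, error or None)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, name):
        """Queue a name for resolution; returns immediately."""
        self._requests.put(name)

    def close(self):
        """Ask the worker to exit and wait for it."""
        self._requests.put(None)
        self._thread.join()

    def _run(self):
        while True:
            name = self._requests.get()
            if name is None:
                return
            try:
                # Blocking call, but only this worker thread waits on it.
                infos = socket.getaddrinfo(name, None, type=socket.SOCK_STREAM)
                addresses = sorted({info[4][0] for info in infos})
                self.results.put((name, addresses, None))
            except OSError as exc:
                self.results.put((name, None, exc))
```

The main thread calls submit() and later polls results (for example with
results.get_nowait()) from its own event loop.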
> >>>> > >>
> >>>> > >> Of these two options, I feel like the third-party library is the
> >>>> > >> better option. The only downside I can think of is the extra
> >>>> > >> dependency for OVS. And as far as what third-party library to use, I
> >>>> > >> was the one that suggested unbound in the first place, so obviously
> >>>> > >> I'm fine with using it :)
> >>>> > >>
> >>>> > >> Mark
> >>>> > >>
> >>>> > >> [1] http://man7.org/linux/man-pages/man3/resolver.3.html
> >>>> > >>
> >>>> > >>
> >>>> > >>>
> >>>> > >>> Below is the link to the original discussion:
> >>>> > >>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337038.html
> >>>> > >>>
> >>>> > >>>
> >>>> > >>>
> >>>> > >>> On Wed, Oct 25, 2017 at 2:11 PM, Ben Pfaff <blp at ovn.org> wrote:
> >>>> > >>>
> >>>> > >>>> Hello everyone, please allow me to introduce Yifeng Sun (CCed), who
> >>>> > >>>> recently joined VMware's Open vSwitch team.  I've asked Yifeng to
> >>>> > >>>> start out by working on DNS support for Open vSwitch.  Yifeng, can
> >>>> > >>>> you tell us about what you've discovered so far, based on this
> >>>> > >>>> thread from August that I'm reviving, and your tentative conclusions?
> >>>> > >>>>
> >>>> > >>>> Thanks,
> >>>> > >>>>
> >>>> > >>>> Ben.