[ovs-dev] [PATCH v3] ovn-nbctl: Fix the ovn-nbctl test "LBs - daemon" which fails during rpm build

Mark Michelson mmichels at redhat.com
Thu Nov 1 13:20:32 UTC 2018


On 10/31/2018 06:24 PM, Yifeng Sun wrote:
> Hi Ben,
> 
> The dns resolving depends on libunbound's ub_resolve, which, from
> Numan's experience as well as my reading on its documentation,
> doesn't support timeout. I agree there is a bug and we should fix it.
> 
> Thanks,
> Yifeng
> 

I don't think you're going to find many resolvers that support timeouts 
being passed to them directly. Most of the time, the system settings are 
going to be honored. On Linux distributions, this means using the 
resolv.conf timeout and attempts values. By default, these values are 
set to 5 and 2 respectively. This means that each attempt waits 5 
seconds for a response before it is considered timed out, and the query 
is attempted 2 times before the resolver decides that it has failed.
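
To put a number on that, here is a rough sketch (in Python, just for 
illustration) that reads the timeout and attempts options out of 
resolv.conf-style text and multiplies them. The function names are made 
up for this example, and the product is only a simplified per-server 
estimate. Real resolvers may rotate through several servers or back off 
between attempts, so treat the result as a ballpark figure.

```python
def resolver_limits(resolv_conf_text, default_timeout=5, default_attempts=2):
    """Return (timeout_seconds, attempts) from resolv.conf-style text.

    Hypothetical helper: falls back to the defaults mentioned above
    (5 and 2) when no 'options' line overrides them.
    """
    timeout, attempts = default_timeout, default_attempts
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        fields = line.split()
        if fields[0] != "options":
            continue
        for opt in fields[1:]:
            key, _, value = opt.partition(":")
            if key == "timeout" and value.isdigit():
                timeout = int(value)
            elif key == "attempts" and value.isdigit():
                attempts = int(value)
    return timeout, attempts


def worst_case_seconds(resolv_conf_text):
    """Simplified estimate: every attempt runs into the full timeout."""
    timeout, attempts = resolver_limits(resolv_conf_text)
    return timeout * attempts
```

With the defaults, worst_case_seconds("") gives 10, while a file 
containing "options timeout:1 attempts:3" gives 3. That spread is 
exactly why a test shouldn't depend on it.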

Working this way is great when it comes to user-friendliness. System 
admins are accustomed to using resolv.conf to control resolver behavior, 
so the DNS library isn't doing anything unexpected.

However, this *sucks* when it comes to trying to test your application. 
Those defaults I specified before are not guaranteed to be the same 
across different Linux distributions, not to mention other platforms. 
Predicting how long your DNS query will take to time out is going to 
be a pain.

If you want to implement an upper bound on a timeout, your best bet is 
to use an asynchronous query and start your own timer. When your timer 
expires, cancel the query. However, I would only recommend doing 
this in a test environment. Like I said before, administrators won't 
like it if we're messing with their configured DNS timeouts.
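
As a minimal sketch of the "start your own timer" idea (Python for 
brevity, all names hypothetical): run the lookup in a helper thread and 
stop waiting once the deadline passes. Note that this version only 
abandons the blocking lookup rather than cancelling it. With a real 
asynchronous API (libunbound has ub_resolve_async() and ub_cancel(), for 
instance) you would cancel the outstanding query instead.

```python
import queue
import threading


class ResolveTimeout(Exception):
    """Raised when the lookup does not finish before our own deadline."""


def resolve_with_deadline(resolve, name, deadline_seconds):
    """Run resolve(name) in a helper thread, waiting at most deadline_seconds.

    'resolve' is any blocking callable, e.g. socket.gethostbyname.
    The helper thread is a daemon, so an abandoned lookup does not keep
    the process alive.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put(("ok", resolve(name)))
        except Exception as exc:  # hand the failure back to the caller
            results.put(("error", exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = results.get(timeout=deadline_seconds)
    except queue.Empty:
        raise ResolveTimeout(name) from None
    if status == "error":
        raise value
    return value
```

A test could then call something like 
resolve_with_deadline(socket.gethostbyname, "example.invalid", 1.0) and 
treat ResolveTimeout as a failure, independent of whatever resolv.conf 
says on the build machine.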

I think you're onto the right idea here by modifying the behavior when 
there are no servers configured. This way, you're not relying on a 
timeout in your test for something that really should fail immediately.
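
Roughly what I have in mind, again as a Python sketch with made-up 
names and not the actual OVS code: look for nameserver lines before 
attempting any lookup, and fail right away if there are none. This 
assumes that "no servers configured" simply means "no nameserver lines 
in resolv.conf". Whether a given resolver library falls back to some 
default server in that case is something you'd have to check.

```python
def configured_nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Commented-out entries ('#nameserver ...', '; nameserver ...')
        # do not have 'nameserver' as their first field, so they are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def resolve_or_fail_fast(name, resolv_conf_text, resolve):
    """Call resolve(name) only if at least one DNS server is configured.

    'resolve' stands in for the real (possibly blocking) lookup.
    """
    if not configured_nameservers(resolv_conf_text):
        raise LookupError("no DNS servers configured; not resolving %r" % name)
    return resolve(name)
```

With /dev/null bind-mounted over /etc/resolv.conf, as in the reproducer 
quoted below, the text is empty, so the lookup fails immediately 
instead of blocking.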

> On Wed, Oct 31, 2018 at 1:59 PM Ben Pfaff <blp at ovn.org> wrote:
> 
>> On Thu, Oct 25, 2018 at 03:27:41PM +0530, nusiddiq at redhat.com wrote:
>>> From: Numan Siddique <nusiddiq at redhat.com>
>>>
>>> When 'make check' is called by the mock rpm build (which disables
>>> networking), the test "ovn-nbctl: LBs - daemon" fails when it runs the
>>> command "ovn-nbctl lb-add lb0 30.0.0.1a 192.168.10.10:80,192.168.10.20:80".
>>> ovn-nbctl extracts the vip by calling the socket util function
>>> 'inet_parse_active()', and this function blocks when libunbound function
>>> ub_resolve() is called further down. ub_resolve() is a blocking function
>>> without timeout and all the ovs/ovn utilities use this function.
>>>
>>> As reported by Timothy Redaelli, the issue can also be reproduced by
>>> running the below commands
>>>
>>> $ sudo unshare -mn -- sh -c 'ip addr add dev lo 127.0.0.1 && \
>>>    mount --bind /dev/null /etc/resolv.conf && runuser $SUDO_USER'
>>> $ make sandbox SANDBOXFLAGS="--ovn"
>>> $ ovn-nbctl -vsocket_util:off lb-add lb0 30.0.0.1a \
>>>    192.168.10.10:80,192.168.10.20:80
>>>
>>> To address this issue, this patch adds a new function -
>>> inet_parse_ip_addr_and_port() which expects IP:[port] address in the
>>> 'target_' argument and disables resolving the host. This new function is
>>> now used in ovn-northd, ovn-nbctl and ovn-trace. It is fine to use this
>>> function as load balancer VIP cannot be a hostname.
>>>
>>> Reported-by: Timothy Redaelli <tredaelli at redhat.com>
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1641672
>>> Tested-by: Timothy Redaelli <tredaelli at redhat.com>
>>> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
>>
>> I have multiple thoughts here.
>>
>> First, if the resolver in OVS never times out, then that seems like a
>> bug in the OVS resolver.  Yifeng, you wrote the DNS code.  Is it true
>> that it never times out?  If so, we should fix that.
>>
>> Second, about the mock RPM build with disabled networking.  Does this
>> environment have a /etc/resolv.conf that specifies a DNS server?  If it
>> does, then that seems like a bug in the build environment.  If it does
>> not, then that seems like a bug in our DNS resolver code, because DNS
>> resolution should immediately fail if no DNS servers are available.
>>
>> Third, again about naming.  If we are going to have two functions that
>> act similarly, with the only difference being that one resolves DNS
>> names and the other does not, then the naming should reflect that
>> clearly.  It still isn't obvious to me with the new names.
>>
>> Thanks,
>>
>> Ben.
>>