[ovs-dev] [PATCH v4 1/2] ovs python: ovs.stream.open_block() returns success even if the remote is unreachable

Mark Michelson mmichels at redhat.com
Wed Aug 8 13:00:03 UTC 2018


On 08/08/2018 03:24 AM, Numan Siddique wrote:
> 
> 
> On Wed, Aug 8, 2018 at 2:58 AM Ben Pfaff <blp at ovn.org 
> <mailto:blp at ovn.org>> wrote:
> 
>     On Tue, Aug 07, 2018 at 05:12:32PM -0400, Mark Michelson wrote:
>      > On 08/07/2018 07:37 AM, nusiddiq at redhat.com
>     <mailto:nusiddiq at redhat.com> wrote:
>      > >From: Numan Siddique <nusiddiq at redhat.com
>     <mailto:nusiddiq at redhat.com>>
>      > >
>      > >The python function
>     ovs.socket_util.check_connection_completion() uses select()
>      > >(provided by python) to monitor the socket file descriptor. The
>     select()
>      > >returns 1 when the file descriptor becomes ready. For error
>     cases like -
>      > >111 (Connection refused) and 113 (No route to host) (POLLERR),
>     ovs.poller._SelectSelect.poll()
>      > >expects the exceptfds list to be set by select(). But that is
>     not the case.
>      > >As per the select() man page, writefds list will be set for POLLERR.
>      > >Please see "Correspondence between select() and poll()
>     notifications" section of select(2)
>      > >man page.
>      > >
>      > >Because of this behavior,
>     ovs.socket_util.check_connection_completion() returns success
>      > >even if the remote is unreachable or not listening on the port.
>      > >
>      > >This patch fixes this issue by using poll() to check the
>     connection status similar to
>      > >the C implementation of check_connection_completion().
>      > >
>      > >A new function 'get_system_poll() is added in ovs/poller.py
>     which returns the
>      > >select.poll() object. If select.poll is monkey patched by
>     eventlet/gevent, it
>      > >gets the original select.poll() and returns it.
>      >
>      > Is this safe? My concern is that eventlet/gevent monkey patches
>      > select.poll() so that the green thread yields properly. Using the
>     system
>      > select.poll() might lead to unexpected blocking.
> 
>     gevent monkey patches Python to get rid of poll() because poll() might
>     block and thereby stall Python.  This use of poll() is OK because it
>     will never block (because it uses a timeout of 0).
> 
>      > I read your conversation with Ben on the previous iteration of
>     this series.
>      > It looks like you attempted to use the system select.poll() and
>     ran into
>      > this blocking problem (the pastebin for your patch is now deleted
>     so I can't
>      > see what you actually tried). Why would this approach work
>     differently?
> 
>     I don't recall what was in the pastebin, but the use of poll() here
>     should be sound.  We use the same approach in C and we don't want to
>     block in C either.
> 
> 
> When I tried last time, I modified the class 'SelectPoll' class here
> - https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L145
> to use system.poll and this blocked the whole process. But as Ben 
> suggested earlier
> and mentioned above , we use select.poll only when checking the 
> connection status
> with timeout of 0. So we are fine. I tested this patch with openstack 
> neutron and it worked
> fine.
> 
> Thanks
> Numan
> 

Thanks, guys. I'll give the patchset a more thorough review now.


More information about the dev mailing list