[ovs-discuss] "ovs|01253|reconnect|ERR|tcp:127.0.0.1:50814: no response to inactivity probe after 5.01 seconds, disconnecting" messages and lost packets

Guru Shetty guru at ovn.org
Thu Sep 27 19:39:49 UTC 2018


ovs-vswitchd is multi-threaded; ovsdb-server is single-threaded.
(You still have not said which log file the messages in your email
came from.)

Who is at 127.0.0.1:45928 and 127.0.0.1:45930?
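
To narrow that down, it may help to list which peer addresses hit the probe timeout and how often. Below is a minimal Python sketch, assuming each log entry sits on a single line in the file (the entries in this thread were wrapped by the mail client). The function name probe_timeouts and the sample list are illustrative only and not part of any OVS tool.

```python
import re
from collections import Counter

# Layout seen in this thread: timestamp|sequence|module|level|message.
# Only "reconnect" ERR entries about inactivity probes are matched.
PROBE_RE = re.compile(
    r"^[^|]+\|\d+\|reconnect\|ERR\|"
    r"(?P<peer>tcp:[\d.]+:\d+): no response to inactivity probe"
)


def probe_timeouts(lines):
    """Count inactivity-probe disconnects per peer address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = PROBE_RE.match(line.strip())
        if match:
            counts[match.group("peer")] += 1
    return counts


# Sample entries copied from the log excerpt in this thread, unwrapped.
sample = [
    "2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: "
    "receive error: Connection reset by peer",
]

for peer, count in probe_timeouts(sample).most_common():
    print(peer, count)
```

The ports are ephemeral, so this only tells you which connections dropped, not which process owned them. While a connection is still open, something like `ss -tnp` on the node should show the owning process.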

On Thu, 27 Sep 2018 at 11:14, Jean-Philippe Méthot <
jp.methot at planethoster.info> wrote:

> Thank you for your reply.
>
> This is OpenStack with the ML2 plugin. There are no other third-party
> applications used with our network, so no OVN or anything of the sort. To
> give a quick idea of the topology, our VMs on the compute nodes go
> through GRE tunnels toward network nodes, where they are routed in
> network namespaces toward a flat external network.
>
> Generally, the above indicates that a daemon fronting an Open vSwitch
> database hasn't been able to connect to its client. This usually happens
> when CPU consumption is very high.
>
>
> Our network nodes' CPUs are practically idle. Is Open vSwitch
> single-threaded or multi-threaded, though? If OVS saturated a single
> thread, it’s possible I may have missed it.
>
> Jean-Philippe Méthot
> Openstack system administrator
> Administrateur système Openstack
> PlanetHoster inc.
>
>
>
>
> On 27 Sep 2018 at 14:04, Guru Shetty <guru at ovn.org> wrote:
>
>
>
> On Wed, 26 Sep 2018 at 12:59, Jean-Philippe Méthot via discuss <
> ovs-discuss at openvswitch.org> wrote:
>
>> Hi,
>>
>> I’ve been using Open vSwitch as my networking backend on OpenStack for
>> several years now. Lately, as our network has grown, we’ve started noticing
>> intermittent packet drops accompanied by the following error messages
>> in the Open vSwitch logs:
>>
>> 2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: no
>> response to inactivity probe after 5 seconds, disconnecting
>>
>
> Open vSwitch is a project with multiple daemons. Since you are using
> OpenStack, it is not clear from your message what type of networking
> plugin you are using. Do you use OVN?
> Also, you did not mention which file the above errors came from.
>
> Generally, the above indicates that a daemon fronting an Open vSwitch
> database hasn't been able to connect to its client. This usually happens
> when CPU consumption is very high.
>
>
>
>> 2018-09-26T04:15:30.409Z|00007|reconnect|ERR|tcp:127.0.0.1:45874: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:15:33.661Z|00008|reconnect|ERR|tcp:127.0.0.1:45934: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:15:33.847Z|00009|reconnect|ERR|tcp:127.0.0.1:45894: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:03.247Z|00010|reconnect|ERR|tcp:127.0.0.1:45958: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:21.534Z|00011|reconnect|ERR|tcp:127.0.0.1:45956: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:21.786Z|00012|reconnect|ERR|tcp:127.0.0.1:45974: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:47.085Z|00013|reconnect|ERR|tcp:127.0.0.1:45988: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:49.618Z|00014|reconnect|ERR|tcp:127.0.0.1:45982: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:16:53.321Z|00015|reconnect|ERR|tcp:127.0.0.1:45964: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:17:15.543Z|00016|reconnect|ERR|tcp:127.0.0.1:45986: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:17:24.767Z|00017|reconnect|ERR|tcp:127.0.0.1:45990: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:17:31.735Z|00018|reconnect|ERR|tcp:127.0.0.1:45998: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:20:12.593Z|00019|reconnect|ERR|tcp:127.0.0.1:46014: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:23:51.996Z|00020|reconnect|ERR|tcp:127.0.0.1:46028: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:25:12.187Z|00021|reconnect|ERR|tcp:127.0.0.1:46022: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:25:28.871Z|00022|reconnect|ERR|tcp:127.0.0.1:46056: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:27:11.663Z|00023|reconnect|ERR|tcp:127.0.0.1:46046: no
>> response to inactivity probe after 5 seconds, disconnecting
>> 2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: receive
>> error: Connection reset by peer
>> 2018-09-26T04:29:56.161Z|00025|reconnect|WARN|tcp:127.0.0.1:46018:
>> connection dropped (Connection reset by peer)
>>
>> This definitely kills the connection for a few seconds before it
>> reconnects. So, I’ve been wondering, what is this probe and what is really
>> happening here? What’s the cause and is there a way to fix this?
>>
>> Openvswitch version is 2.9.0-3 on CentOS 7 with Openstack Pike running on
>> it (but the issues show up on Queens too).
>>
>>
>> Jean-Philippe Méthot
>> Openstack system administrator
>> Administrateur système Openstack
>> PlanetHoster inc.
>>
>>
>>
>>
>
>
>