[ovs-discuss] OVS stops working after 1 hour with repetitive errors in neutron-openvswitch-agent.log

Matthias Hüther matthias.huether at twenty20.eu
Fri Dec 14 12:26:08 UTC 2018


I operate two neutron gateways ( https://jujucharms.com/neutron-gateway/ ) in a freshly installed OpenStack environment. After about an hour, the following problem occurs: when I create a new virtual router and attach a virtual network to it, the router interface stays DOWN.
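
In case the exact steps matter, the reproduction is roughly the following (resource names are made up for illustration):

  openstack router create test-router
  openstack router add subnet test-router test-subnet
  openstack port list --router test-router -c ID -c Status   # the router port stays DOWN instead of going ACTIVE

Once a port is stuck like that, the following errors appear in neutron-openvswitch-agent.log every 10 seconds: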

2018-12-06 07:49:19.011 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-664e8eff-df0a-4c04-b51a-ddab7806809f - - - - -] ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x5515ee88,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out: eventlet.timeout.Timeout: 10 seconds
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-664e8eff-df0a-4c04-b51a-ddab7806809f - - - - -] Failed to communicate with the switch: RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x5515ee88,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 92, in _send_msg
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int result = ofctl_api.send_msg(self._app, msg, reply_cls, reply_multi)
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/ryu/app/ofctl/api.py", line 89, in send_msg
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int reply_multi=reply_multi))()
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/ryu/base/app_manager.py", line 279, in send_request
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return req.reply_q.get()
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/eventlet/queue.py", line 313, in get
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return waiter.wait()
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/eventlet/queue.py", line 141, in wait
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return get_hub().switch()
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return self.greenlet.switch()
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int eventlet.timeout.Timeout: 10 seconds
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int During handling of the above exception, another exception occurred:
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 52, in check_canary_table
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int flows = self.dump_flows(constants.CANARY_TABLE)
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 162, in dump_flows
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int reply_multi=True)
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 110, in _send_msg
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int raise RuntimeError(m)
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x5515ee88,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2018-12-06 07:49:19.012 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2018-12-06 07:49:19.013 4782 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-664e8eff-df0a-4c04-b51a-ddab7806809f - - - - -] OVS is dead. OVSNeutronAgent will keep running and checking OVS status periodically.
2018-12-06 07:49:29.015 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-664e8eff-df0a-4c04-b51a-ddab7806809f - - - - -] ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x5515ee8a,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out: eventlet.timeout.Timeout: 10 seconds
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-664e8eff-df0a-4c04-b51a-ddab7806809f - - - - -] Failed to communicate with the switch: RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x5515ee8a,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 92, in _send_msg
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int result = ofctl_api.send_msg(self._app, msg, reply_cls, reply_multi)
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/ryu/app/ofctl/api.py", line 89, in send_msg
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int reply_multi=reply_multi))()
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/ryu/base/app_manager.py", line 279, in send_request
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return req.reply_q.get()
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/eventlet/queue.py", line 313, in get
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return waiter.wait()
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/eventlet/queue.py", line 141, in wait
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return get_hub().switch()
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int return self.greenlet.switch()
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int eventlet.timeout.Timeout: 10 seconds
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int During handling of the above exception, another exception occurred:
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 52, in check_canary_table
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int flows = self.dump_flows(constants.CANARY_TABLE)
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 162, in dump_flows
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int reply_multi=True)
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 110, in _send_msg
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int raise RuntimeError(m)
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0x5515ee8a,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2018-12-06 07:49:29.017 4782 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2018-12-06 07:49:29.018 4782 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-664e8eff-df0a-4c04-b51a-ddab7806809f - - - - -] OVS is dead. OVSNeutronAgent will keep running and checking OVS status periodically.
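
For what it's worth, the request that keeps timing out is the agent's periodic poll of its canary table (table_id=23 in the OFPFlowStatsRequest above) on br-int. The same table can be queried by hand with the standard OVS tools, which should show whether ovs-vswitchd still answers OpenFlow at all while the agent is failing (a manual sketch, not something the agent runs itself):

  ovs-vsctl show                                       # bridge layout, and whether ovs-vswitchd still responds on its own socket
  ovs-ofctl -O OpenFlow13 dump-flows br-int table=23   # the canary table the agent keeps polling (version=0x4 above is OpenFlow 1.3)
  ovs-vsctl get-controller br-int                      # the controller endpoint the agent registered on br-int

Checking /var/log/openvswitch/ovs-vswitchd.log around the same timestamps is probably worthwhile as well.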



Ubuntu release: Ubuntu 18.04.1 LTS

Package versions:

neutron-common 2:13.0.1-0ubuntu1~cloud0
neutron-dhcp-agent 2:13.0.1-0ubuntu1~cloud0
neutron-fwaas-common 1:13.0.0-0ubuntu1~cloud0
neutron-l3-agent 2:13.0.1-0ubuntu1~cloud0
neutron-lbaas-common 2:13.0.0-0ubuntu1~cloud0
neutron-lbaasv2-agent 2:13.0.0-0ubuntu1~cloud0
neutron-metadata-agent 2:13.0.1-0ubuntu1~cloud0
neutron-metering-agent 2:13.0.1-0ubuntu1~cloud0
neutron-openvswitch-agent 2:13.0.1-0ubuntu1~cloud0

openvswitch-switch 2.10.0-0ubuntu2~cloud0
openvswitch-common 2.10.0-0ubuntu2~cloud0

The CPU load average stays between 2 and 3 on 8 cores.
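
To see whether ovs-vswitchd itself is the busy process, a per-thread view seems like the obvious check (generic Linux/OVS commands, nothing agent-specific):

  top -H -p $(pidof ovs-vswitchd)   # per-thread CPU usage of the ovs-vswitchd process
  sudo ovs-appctl coverage/show     # internal event counters of the running daemon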


When I run the command service ovs-vswitchd restart, the connection goes down and comes back online.

After that, I can create virtual routers and attach interfaces again, and they go active.

The errors are then gone for about an hour, but come back later. (Sometimes it works for 4-5 hours.)
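
For completeness, the workaround plus the check I would use afterwards looks like this (the agent-list step is only my assumption about a reasonable way to confirm recovery):

  sudo systemctl restart ovs-vswitchd                              # same effect as "service ovs-vswitchd restart"
  sudo tail -n 20 /var/log/neutron/neutron-openvswitch-agent.log   # the ofctl timeouts should stop appearing
  openstack network agent list                                     # all agents should report Alive again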

Any ideas what I can do?