[ovs-discuss] ovs-vswitchd.service crashes

Koukal Petr p.koukal at radiokomunikace.cz
Thu Nov 14 14:12:27 UTC 2019


Here is some additional information.
With hw-offload disabled, creating new instances works without problems.

ovs-vsctl set Open_vSwitch . other-config:hw-offload=false

We have the kernel package linux-image-generic-hwe-18.04-edge 4.18.0.16.65 installed.

Thank you for your help.
Petr

On 11/12/19 12:49 PM, Koukal Petr wrote:

Hello,
I installed the linux-generic-hwe-18.04-edge kernel.

apt list --installed | grep linux-image
linux-image-generic-hwe-18.04-edge/bionic,now 4.18.0.16.65 amd64 [installed]

hwe-support-status --verbose
Your Hardware Enablement Stack (HWE) is supported until April 2023.


The previously reported error, where the service crashed shortly after several new instances were created, with this assertion:
"ovs|00001|util(handler8)|EMER|../include/openvswitch/ofpbuf.h:190: assertion offset + size <= b->size failed in ofpbuf_at_assert()"
has not occurred so far.

However, there is another problem.
When creating a new VM, the following error appears in /var/log/openvswitch/ovs-vswitchd.log:

2019-11-11T15:54:46.937Z|00108|connmgr|INFO|br-int<->tcp:127.0.0.1:6633: 16 flow_mods in the 4 s starting 10 s ago (12 adds, 4 deletes)
2019-11-11T16:03:53.798Z|00001|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51
2019-11-11T16:03:53.810Z|00002|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51
2019-11-11T16:03:54.154Z|00003|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51
2019-11-11T16:03:55.178Z|00004|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51
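To see which ports are affected and how often, the offload errors in a log like the one above can be tallied per port. This is only a minimal sketch: the function name count_offload_failures is made up for illustration, and it assumes the port name is the last whitespace-separated field of each error line, as in the excerpt.

```shell
#!/bin/sh
# Hypothetical helper: count "failed to offload flow" errors per port.
# Reads ovs-vswitchd log lines on stdin and prints "<count> <port>" lines.
# Assumption: the port name is the last field of the error line.
count_offload_failures() {
    grep 'failed to offload flow' |
        awk '{ n[$NF]++ } END { for (p in n) print n[p], p }' |
        sort -k2
}
```

It could be run as: count_offload_failures < /var/log/openvswitch/ovs-vswitchd.log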


The status output of ovs-vswitchd then shows the same errors about offloading flows.

root@dev-node1:/usr/share/doc# service ovs-vswitchd status
* ovs-vswitchd.service - Open vSwitch Forwarding Unit
   Loaded: loaded (/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: enabled)
   Active: active (running) since Mon 2019-11-11 16:28:43 CET; 36min ago
  Process: 26869 ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server stop (code=exited, status=0/SUCCESS)
  Process: 26993 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --no-monitor --system-id=random start $OPTIONS (code=exited,
 Main PID: 27037 (ovs-vswitchd)
    Tasks: 4 (limit: 4915)
   CGroup: /system.slice/ovs-vswitchd.service
           `-27037 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/l

Nov 11 16:28:43 dev-node1 ovs-vsctl[27057]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=dev-node1
Nov 11 16:28:43 dev-node1 ovs-ctl[26993]:  * Enabling remote OVSDB managers
Nov 11 16:28:43 dev-node1 systemd[1]: Started Open vSwitch Forwarding Unit.
Nov 11 17:03:53 dev-node1 ovs-vswitchd[27037]: ovs|00001|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51
Nov 11 17:03:53 dev-node1 ovs-vswitchd[27037]: ovs|00002|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51
Nov 11 17:03:54 dev-node1 ovs-vswitchd[27037]: ovs|00003|dpif_netlink(handler7)|ERR|failed to offload flow: Numerical result out of range: eth51


This error does not happen with every VM creation, but it always appears after a few VMs have been created successfully.


We have hw-offload turned on:

root@dev-node1:~# ovs-vsctl get Open_vSwitch . other-config
{hw-offload="true"}

Thank you for your help.

Petr Koukal





On 11/7/19 10:46 AM, James Page wrote:
Hi Koukal

I note that you're using the 4.18 kernel from Ubuntu. For Mellanox hardware offload I'd suggest you switch to the hwe-edge kernel (5.3), as it is generally more mature for this and other mlx5_core features; the package name is 'linux-generic-hwe-18.04-edge'.
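Whether the running kernel meets the suggested 5.3 minimum can be checked against the output of uname -r. The helper below is only a sketch: the name kernel_at_least is hypothetical, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel release $1 is at least version $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
kernel_at_least() {
    release=${1%%-*}    # drop a suffix such as "-16-generic"
    lowest=$(printf '%s\n%s\n' "$release" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example: kernel_at_least "$(uname -r)" 5.3 && echo "kernel is 5.3 or newer"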


On Thu, Nov 7, 2019 at 9:23 AM Koukal Petr <p.koukal at radiokomunikace.cz> wrote:

If I turn off hw-offload with
ovs-vsctl set Open_vSwitch . other-config:hw-offload=false
then ovs-vswitchd runs without problems.

Is it possible to reach someone who knows what to do with the offload problem?

Thank you for your help.

Petr

On 11/6/19 6:48 PM, Ben Pfaff wrote:

On Wed, Nov 06, 2019 at 01:59:36PM +0100, Koukal Petr wrote:


The problem is the same even if hw-offload is off.

I'm sending a log from ovs-vswitchd.log just after restarting the whole
openvswitch-switch.
Here you can see what happens before the assertion pops up.

ethtool -K phys1-1 hw-tc-offload off
ethtool -K phys1-2 hw-tc-offload off


This demonstrates turning off hardware offload at the ethtool level.
However, even with that, I believe that OVS will still try to use it if
OVS is configured for hardware offload.  It looks like your OVS does
have hardware offload enabled.  To turn it off, run:

ovs-vsctl set Open_vSwitch . other-config:hw-offload=false

Then restart OVS and see if it makes a difference.
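The steps above can be sketched as a small script. The function names run and disable_hw_offload and the DRY_RUN variable are invented for illustration, and the service name openvswitch-switch is an assumption taken from the name used earlier in this thread; it may differ on other distributions. With DRY_RUN set, the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: disable OVS hardware offload, restart OVS, read the value back.
# With DRY_RUN set to a non-empty value, commands are printed, not executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_hw_offload() {
    # Same setting as the command quoted above.
    run ovs-vsctl set Open_vSwitch . other-config:hw-offload=false &&
    # Assumption: the service is called openvswitch-switch, as on Ubuntu.
    run service openvswitch-switch restart &&
    # Read the key back to confirm the new value.
    run ovs-vsctl get Open_vSwitch . other-config:hw-offload
}
```

Running it once with DRY_RUN=1 shows what would be executed before anything is changed on the node.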


