[ovs-discuss] null ptr exception in ovs_vport_get_stats+0x6a/0x130 [openvswitch]

Flaviof flavio at flaviof.com
Mon Dec 28 03:32:36 UTC 2015


Hello ovs gurus,

I seem to have hit a null pointer dereference in the openvswitch kernel
module that has not been mentioned on the mailing list [ml] in the last
3 months. The full stack trace and the kdump core are here [1] and
here [2].

   ovs_vport_get_stats+0x6a/0x130 [openvswitch]
   internal_dev_get_stats+0x39/0xb0 [openvswitch]
   dev_get_stats+0x6e/0x200
   rtnl_fill_ifinfo+0x459/0xf60
   blk_rq_map_sg+0x9b/0x210
   swiotlb_map_sg_attrs+0x78/0x150
   update_curr+0xcc/0x150
   account_entity_dequeue+0xae/0xd0
   dequeue_entity+0x106/0x510

I'm running CentOS 7.2 (3.10.0-327.3.1.el7.x86_64 x86_64) with
VirtualBox 5 (VirtualBox-5.0-5.0.12_104815_el7-1.x86_64). In
my setup, I start with an ovs bridge br1 which has the internal
address 192.168.50.1 [3]. With that, I see the address assigned
to the internal port br1 of bridge br1, as shown here [4]. I believe
that is the right config for using a vm with ovs, based on the
FAQ [faq] by Ben.

I attempt to start a vm using Vagrant, with the following
Vagrantfile [5]. The important line in that Vagrantfile, the one that
triggers the issue, is this:

   node.vm.network "public_network", ip: "192.168.50.254", bridge: "br1"

With that, I'm attempting to create a bridged port on br1 for
the vm, which is given the ip address 192.168.50.254.
I have also seen folks running the vm on top of a tap [tap] [tap2], but
I don't quite understand why that is needed. Independent of that, I hit
this kernel panic whether I use the tap interface or the bridge's
internal port.

As the trace shows, the crux of the problem is that the code is
attempting to extract stats from an internal ovs port, where the function
ovs_vport_get_stats <http://lxr.oss.org.cn/ident?i=ovs_vport_get_stats>
is given null as the vport param. One potential fix would be to check
for null in internal_dev_get_stats
<http://lxr.oss.org.cn/ident?i=internal_dev_get_stats>, but I would like
to hear from the ovs gurus on this mailing list whether that is the best
way to go. I have tried both ovs 2.3.2 and 2.4.0 and hit this issue on
both versions. Given this is a kernel crash, the user-space version does
not really matter.
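
To make the proposed check concrete, here is a minimal user-space sketch
of the guard. It is not the kernel code: the mock_* types and functions
are hypothetical stand-ins for the real structures, and returning zeroed
stats for a device without a vport is an assumption about what a fix
could do, not something confirmed by the ovs developers. It also says
nothing about why the vport is null in the first place.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct mock_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
};

struct mock_vport {
    struct mock_stats stats;
};

struct mock_netdev {
    struct mock_vport *vport;   /* null in the failing case described above */
};

/* Plays the role of ovs_vport_get_stats: dereferences vport unconditionally,
 * so calling it with a null vport would crash. */
static void mock_vport_get_stats(const struct mock_vport *vport,
                                 struct mock_stats *out)
{
    *out = vport->stats;
}

/* Plays the role of internal_dev_get_stats, with the null check added.
 * Assumption: reporting all-zero stats is acceptable when no vport exists. */
static struct mock_stats *
mock_internal_dev_get_stats(const struct mock_netdev *dev,
                            struct mock_stats *out)
{
    memset(out, 0, sizeof(*out));
    if (dev->vport == NULL)
        return out;             /* guard: skip the dereference entirely */

    mock_vport_get_stats(dev->vport, out);
    return out;
}
```

A guard like this would only hide the symptom; if the vport pointer is
never supposed to be null at this point, the underlying cause would
still need to be found.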

I'd be happy to propose a patch in the code, once I hear from you.

Thanks and happy new year!

-- flavio

[ml]: http://openvswitch.org/pipermail/discuss/
[1]: https://gist.github.com/daf71120b99a5d57cb26
[2]: https://drive.google.com/folderview?id=0BxKY67eIRXcDNmNYU05aVWxWQmc&usp=sharing
[3]: https://gist.github.com/d6fbf119c99dbba61b43
[4]: https://gist.github.com/1a3d0795574c42a6530f
[faq]: https://github.com/openvswitch/ovs/blame/master/FAQ.md#L971
[tap]: https://youtu.be/rYW7kQRyUvA?t=7m22s
[tap2]: https://ariscahyadi.wordpress.com/2013/07/16/virtual-networking-for-virtualbox-using-open-vswitch/
[5]: https://gist.github.com/446d7b5aa59164dd2bc0
[6]: http://lxr.oss.org.cn/source/net/openvswitch/vport.c#L291
[7]: http://lxr.oss.org.cn/source/net/openvswitch/vport-internal_dev.c#L53