[ovs-dev] [PATCH] netlink-socket: Check for null sock in nl_sock_recv__()

Flavio Leitner fbl at sysclose.org
Wed Nov 24 13:16:33 UTC 2021


On Tue, Nov 23, 2021 at 07:58:52PM -0300, Murilo Opsfelder Araújo wrote:
> Hi, Flavio.
> 
> On 11/23/21 15:32, Flavio Leitner wrote:
> > On Mon, Nov 22, 2021 at 03:46:28PM -0300, Murilo Opsfelder Araújo wrote:
> > > Hi, Ilya Maximets.
> > > 
> > > On 11/19/21 13:23, Ilya Maximets wrote:
> > > > On 11/18/21 22:19, David Christensen wrote:
> > > > > 
> > > > > 
> > > > > On 11/18/21 11:56 AM, Murilo Opsfelder Araújo wrote:
> > > > > > On 11/16/21 19:31, Ilya Maximets wrote:
> > > > > > > On 10/25/21 19:45, David Christensen wrote:
> > > > > > > > In certain high load situations, such as when creating a large number of
> > > > > > > > ports on a switch, the parameter 'sock' may be passed to nl_sock_recv__()
> > > > > > > > as null, resulting in a segmentation fault when 'sock' is later
> > > > > > > > dereferenced, such as when calling recvmsg().
> > > > > > > 
> > > > > > > Hi, David.  Thanks for the patch.
> > > > > > > 
> > > > > > > It's OK to check for a NULL pointer there, I guess.  However,
> > > > > > > do you know from where it was actually called?  This function,
> > > > > > > in general, should not be called without the actual socket,
> > > > > > > so we, probably, should fix the caller instead.
> > > > > > > 
> > > > > > > Best regards, Ilya Maximets.
> > > > > > 
> > > > > > Hi, Ilya Maximets.
> > > > > > 
> > > > > > When I looked at the coredump file, ch->sock was nil and was passed to nl_sock_recv():
> > > > > > 
> > > > > > (gdb) l
> > > > > > 2701
> > > > > > 2702        while (handler->event_offset < handler->n_events) {
> > > > > > 2703            int idx = handler->epoll_events[handler->event_offset].data.u32;
> > > > > > 2704            struct dpif_channel *ch = &dpif->channels[idx];
> > > > > > 
> > > > > > (gdb) p idx
> > > > > > $26 = 4
> > > > > > (gdb) p *dpif->channels at 5
> > > > > > $27 = {{sock = 0x1001ae88240, last_poll = -9223372036854775808}, {sock = 0x1001aa9a8a0, last_poll = -9223372036854775808}, {sock = 0x1001ae09510, last_poll = 60634070}, {sock = 0x1001a9dbb60, last_poll = 60756950}, {sock = 0x0,
> > > > > >        last_poll = 61340749}}
> > > > > > 
> > > > > > 
> > > > > > The above snippet is from lib/dpif-netlink.c and the caller is dpif_netlink_recv_vport_dispatch().
> > > > > > 
> > > > > > The channel at idx=4 had sock=0x0, which was passed to nl_sock_recv() via ch->sock parameter.
> > > > > > In that function, it tried to access sock->fd when calling recvmsg(), causing the segfault.
> > > > > > 
> > > > > > I'm not enough experienced in Open vSwitch to explain why sock was nil at that given index.
> > > > > > The fix seems worth, though.
> > > > > 
> > > > > A few other points of note:
> > > > > 
> > > > > - Test system was a very large configuration (2K CPUs, > 1TB RAM)
> > > > > - OVS Switch was configured with 6K ports as follows:
> > > > > 
> > > > > # b=br0; cmds=; for i in {1..6000}; do cmds+=" -- add-port $b p$i -- set interface p$i type=internal"; done
> > > > > # time sudo ovs-vsctl $cmds
> > > > > 
> > > > > - OVS was installed from RHEL RPM.  Build from source did not exhibit the same problem.
> > > > > - Unable to reproduce on a different system (128 CPUs, 256GB RAM), even with 10K ports.
> > > > > 
> > > > > Dave
> > > > 
> > > > Thanks for all the information.  Having a NULL socket in the 'channels'
> > > > array doesn't look a good sign.  This may mean that even if OVS will
> > > > not crash, it may not receive upcalls for certain ports, which is not
> > > > good eihter.
> > > > 
> > > > Could you provide more info on which RHEL package did you use and which
> > > > version you built from source?  Maybe we can find the difference.
> > > 
> > > When the bug was reported internally, the distro package was actually from Fedora 33 ppc64le.
> > > 
> > > One difference I've found at that time was regarding the configure options.
> > > 
> > >  From the logs at [0], the distro package was configured with these options:
> > > 
> > >      + ./configure --build=ppc64le-redhat-linux-gnu --host=ppc64le-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-libcapng --disable-static --enable-shared --enable-ssl --with-dpdk=shared --with-pkidir=/var/lib/openvswitch/pki
> > > 
> > > And the built-from-source binaries from Dave were built like this:
> > > 
> > >      git clone https://github.com/openvswitch/ovs.git
> > >      cd ovs/
> > >      git checkout v2.15.0
> > 
> > Do you mean 2.16.0? Because the dpif_netlink_recv_vport_dispatch()
> > from the snippet above doesn't exist in 2.15.0.
> > 
> > And the FC33 uses 2.15.0:
> > https://koji.fedoraproject.org/koji/buildinfo?buildID=1713250
> > 
> 
> I'm sorry.  I'd looked into the wrong revision.
> The correct name of the caller is dpif_netlink_recv__(), in lib/dpif-netlink.c.
> 
> The version tag is correct: v2.15.0
> It matches the version from Fedora 33.

OK, that makes sense.

> > Can you reproduce the issue?
> > The reason I am asking is because in 2.16.0 we added another way to
> > handle upcalls, so I am wondering if the bug was introduced with
> > that change 2.16.0 or if it was already there in 2.15.0.
> 
> The system where this issue happened is being used for other purposes now.
> I'll see if we can manage to reproduce on another one.

At least we know if there is a bug, it's prior to the big change
adding the new upcall dispatching. 
>
> We may also want to use latest bits of OVS and Fedora.
> 
> > 
> > Also, could you provide the full backtrace of all threads?
> > (gdb) thread apply all bt
> > 
> > It would be nice to have a dump of the struct handler too.
> 
> Absolutely!  We'll make sure to capture these data next time.

That might be available in the core you already have.
Actually, if you can put the core somewhere maybe I can
take a look here using fedora debuginfo package.

fbl


More information about the dev mailing list