[ovs-dev] Historic bug: old LTS kernel panic, triggerable via plain userland network traffic (iperf); a Linux Test Project case?

Fotis Georgatos fotis.georgatos at epfl.ch
Fri Mar 12 01:44:21 UTC 2021


Hello,

(As discussed, I am adding some more audience in cc., up to dev at openvswitch.org)

TL;DR / Summary:
* With LTS kernel v4.19.0, I was able to send certain VMs into kernel panic remotely, simply by spamming them with modest traffic from userland:
  - First of all, the rpm for LTS kernel 4.19.0 came from centos/elrepo, as-is; so this was a stock kernel, likely affecting several other users. I am surprised it passed QA by so many teams.
  - The bug-inducing one-liner (a fuller two-node sketch follows right after this list):
     * `echo 1 2 4 8 16 32 64 128|xargs -n1 iperf3 -c <my_iperf_server> -P` ## the `iperf3 -s` server was running on the crashing side
  - The effect was 100% reproducible with the above, falling apart at about 8 or 16 parallel streams; see [2] below for the stack traces.
  - The configuration involved just a couple of k8s pods (i.e. containers) on top of my openstack provider; fyi, I do not own the hardware resources, they belong to my upstream provider.
  - I established that the crash required at least 2 nodes (VMs) _and_ that openvswitch was among the prerequisites (i.e. I could not trigger the kernel panic with the iperf server outside of the container).
  - Perhaps a crucial piece of information would have been the openstack hypervisor's kernel & qemu versions; alas, I do not have that, nor has it been readily available to me.
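
For completeness, here is a rough two-node sketch of how I ran the reproduction; treat it as an outline under my own assumptions (placeholder host/pod names, stock iperf3 settings) rather than a ready-made script:

  # On node A (VM 1), inside a pod on the openvswitch-backed overlay network:
  iperf3 -s                        # this is the side that ended up in kernel panic

  # On node B (VM 2), inside a second pod; ramp up the number of parallel streams,
  # the panic consistently hit at around -P 8 or -P 16:
  for p in 1 2 4 8 16 32 64 128; do
      iperf3 -c <my_iperf_server> -P "$p"
  done

(The loop is merely the unrolled equivalent of the `xargs` one-liner above.)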

I may have difficulty reproducing it today, but I think the simplicity of the test makes it very attractive for LTP, so I am sending this to see what kind of feedback we might collect.

cheers and thanks for reading so far,
Fotis

p.s. @Richard, a bit of feedback follows inline; thanks for the follow-up.

On 11 Mar 2021, at 12:08, Richard Palethorpe <rpalethorpe at suse.de> wrote:

Hello Fotis,

Fotis Georgatos <fotis.georgatos at epfl.ch> writes:

Dear Jaime, Richard,

I am motivated by Richard’s recent FOSDEM’21 talk [1] to reach out to
both of you.

That's great to know!


Back in 2019 I had to deal with an obscure and esoteric Linux 4.19.0 kernel bug, described in [2], whereby any 2 k8s pods could crash each other given the right network pressure and conditions; that was reproducible.
Then, Linux kernel 4.19.1 made the bug go away, presumably/possibly
due to the bug fix in [3], but this is just my personal guess and I
never got around to cornering it.

I guess this is not specific to k8s, but to any OVS setup. A GPF caused by
packet processing could be serious, especially if the bug is not
actually fixed and the commit mentioned merely made it more difficult to
reproduce.

My thoughts as well. Even if it is fixed, the generality of the test would make it very attractive to adopt under LTP; it would cover several more bug types.


Have you tried recompiling 4.19 with KASAN and lockdep enabled and then
reproducing the bug? It may fail earlier, giving a more accurate picture
of what is happening.

No, I was discouraged, since I did not have full system ownership down to the bare metal, which I would have needed to corner all the factors.
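
For the record, and purely as an assumption on my part (I never actually built such a debug kernel), the relevant .config switches should be roughly the following, in case someone with full access to the metal wants to retry:

  CONFIG_KASAN=y            # Kernel Address Sanitizer: catches use-after-free / out-of-bounds accesses
  CONFIG_KASAN_INLINE=y     # faster instrumentation, at the cost of a larger kernel image
  CONFIG_PROVE_LOCKING=y    # lockdep: proves locking correctness at runtime
  CONFIG_DEBUG_LOCKDEP=y    # extra self-consistency checks for the lockdep machinery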



Given that the testing command was very simplistic, being just a one-liner, I wonder if there is more juice to extract out of this, to the benefit of LTP itself and of future kernels' stability:
* `iperf3 -s & echo 1 2 4 8 16 32 64 128 256|xargs -n1 iperf3 -c my_iperf_server -P` ## this would crash `my_iperf_server` reproducibly, beyond parallelism values of ~8 or ~16

Would we like to try to make a test case out of it? Is it worthwhile for you, or do you know whether it is merely an instance of another bug report?
(i.e. my hunt there has been fruitless; possibly I am not searching with
the right keywords)

Possibly it could be reproduced in LTP by creating two processes in
different network namespaces, then linking them with OVS, then
recreating something similar to what iperf is doing. This is a pure
guess though.

Yes, it’s a reasonable guess; I’d add that the containers need to be on distinct nodes, since the bug was irreproducible within a single node.
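
For whoever picks this up: below is a minimal sketch of the single-host variant Richard describes, i.e. two network namespaces joined by an OVS bridge and hammered with iperf3. The bridge and interface names and the addresses are placeholders of mine, and, given the observation above, a faithful reproducer may additionally need the traffic to cross a tunnel between two hosts rather than stay on one:

  # Single-host sketch; assumes iproute2, openvswitch (ovs-vsctl) and iperf3 are installed.
  ovs-vsctl add-br br-test
  for i in 1 2; do
      ip netns add ns$i
      ip link add veth$i type veth peer name ovsp$i
      ip link set veth$i netns ns$i
      ip netns exec ns$i ip addr add 10.0.0.$i/24 dev veth$i
      ip netns exec ns$i ip link set veth$i up
      ovs-vsctl add-port br-test ovsp$i
      ip link set ovsp$i up
  done

  # Then apply the same load pattern as the original one-liner:
  ip netns exec ns1 iperf3 -s &
  ip netns exec ns2 sh -c 'echo 1 2 4 8 16 32 64 128 | xargs -n1 iperf3 -c 10.0.0.1 -P'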


It is probably worth reporting to the OVS maintainers first, e.g.:

$ scripts/get_maintainer.pl net/openvswitch/
Pravin B Shelar <pshelar at ovn.org> (maintainer:OPENVSWITCH)
"David S. Miller" <davem at davemloft.net> (maintainer:NETWORKING [GENERAL])
Jakub Kicinski <kuba at kernel.org> (maintainer:NETWORKING [GENERAL])
netdev at vger.kernel.org (open list:OPENVSWITCH)
dev at openvswitch.org (open list:OPENVSWITCH)
linux-kernel at vger.kernel.org (open list)

Then, once we know exactly what causes it, we can create a minimal
reproducer in LTP, or in the OVS test suite if there is too much setup
involved for LTP. (I'm not sure what test suite OVS has, but IIRC there
is one.)
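
If it helps: my recollection, to be double-checked against the OVS documentation (Documentation/topics/testing.rst), is that the in-tree suite is driven roughly like this from a source checkout:

  ./boot.sh && ./configure && make
  make check            # userspace unit tests
  make check-kernel     # system tests exercising the kernel datapath module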



All the best and thanks to both of you,
Fotis

P.S. Needless to say, I very much believe this bug could be food for the teeth of `fzsync` itself [4]!

[1] https://fosdem.org/2021/schedule/event/reproducing_kernel_data_races/
[2] https://github.com/weaveworks/weave/issues/3684 ## reproducible kernel panic with 4.19.0 & parallel iperf threads P>8, weave/2.5.*; disclaimer: `weave` merely accelerates the effect.
[3] https://patchwork.ozlabs.org/project/openvswitch/patch/20181102114514.7023-1-jcaamano@suse.com/
[4] https://gitlab.com/Palethorpe/fuzzy-sync



--
Thank you,
Richard.


—
Eur Ing Fotis Georgatos
Senior Systems Engineer
__________________________
Swiss Data Science Center, EPFL SDSC-GE, INN 218 (Bâtiment INN) Station 14, CH-1015 Lausanne
Email: fotis.georgatos at epfl.ch Tel: +41 21 69 34067



