[ovs-discuss] kernel panic when setting QoS parameters more than once
Riccardo R.
riccardoravaioli at gmail.com
Thu Jan 12 10:00:05 UTC 2017
Hi Joe,
In my first message I forgot to say that the interface on which to run the
command must be an existing interface, not a dummy one as in my example.
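A sketch of the corrected reproduction, assuming `myiface` is replaced by the name of a real interface on the host. The `run` helper, the `repro` function and the `DRY_RUN` switch are hypothetical additions, not part of Open vSwitch. They are there so that the commands are only printed by default, because the last one panics an affected kernel.

```shell
#!/bin/sh
# Reproduction sketch. WARNING: with DRY_RUN=0 the last command is expected
# to panic the host on an affected kernel.
# BRIDGE and IFACE are placeholders; IFACE must name an existing interface.
BRIDGE=${BRIDGE:-mybridge}
IFACE=${IFACE:-myiface}
DRY_RUN=${DRY_RUN:-1}   # default: print the commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro() {
    run ovs-vsctl add-br "$BRIDGE"
    run ovs-vsctl add-port "$BRIDGE" "$IFACE"
    run ovs-vsctl set interface "$IFACE" ingress_policing_rate=100
    # Setting the rate a second time is what triggered the panic,
    # regardless of the value given.
    run ovs-vsctl set interface "$IFACE" ingress_policing_rate=1000
}
```

Source the file and call `repro` to print the four commands; set `DRY_RUN=0` beforehand to actually execute them.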
Anyway, I recently tried a 4.8 kernel and the problem disappeared. For
convenience I used the version of openvswitch that was already installed
on the system, which is 2.4. I strongly suspect that the problem lies in
kernel 4.6, independently of the openvswitch version.
So in the setup which worked:
$ uname -a
Linux OVP800 4.8.0-0.bpo.2-amd64 #1 SMP Debian 4.8.11-1~bpo8+1 (2016-12-14) x86_64 GNU/Linux
$ ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.4.90
And the setup which didn't work:
$ uname -a
Linux OVPi7RD2 4.6.0-1-amd64 #1 SMP Debian 4.6.1-1 (2016-06-06) x86_64 GNU/Linux
$ ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.4.90
I use Debian machines running Jessie. I can't install kernels 4.4 or 4.9
right away, but here is the log from the kernel 4.6 when it crashed:
Debian GNU/Linux 8 mymachine ttyS0 - 1.0.4.4:N
mymachine login: [ 433.265229] BUG: unable to handle kernel NULL pointer
dereference at 000000000000000c
[ 433.273210] IP: [<ffffffff815c6237>] _raw_spin_lock_bh+0x17/0x30
[ 433.279317] PGD 0
[ 433.281425] Oops: 0002 [#1] SMP
[ 433.284803] Modules linked in: act_police(E) cls_basic(E) sch_ingress(E)
vhost_net(E) vhost(E) macvtap(E) macvlan(E) openvswitch(E)
nf_conntrack_ipv6(E) nf_nat_ipv6(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E)
nf_nat_ipv4(E) nf_defrag_ipv6(E) nf_nat(E) nf_conntrack(E) libcrc32c(E)
veth(E) tun(E) bridge(E) stp(E) llc(E) cpufreq_conservative(E)
cpufreq_userspace(E) cpufreq_powersave(E) cpufreq_stats(E) w83627ehf(E)
hwmon_vid(E) igb_uio(OE) uio_pci_generic(E) uio(E) intel_rapl(E)
x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E)
irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E)
iTCO_wdt(E) jitterentropy_rng(E) iTCO_vendor_support(E) i915(E) hmac(E)
evdev(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E)
gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd_hda_intel(E)
snd_hda_codec(E) pcspkr(E) snd_hda_core(E) snd_hwdep(E) drm_kms_helper(E)
snd_pcm(E) drm(E) i2c_i801(E) snd_timer(E) 8250_fintek(E) snd(E) tpm_tis(E)
soundcore(E) lpc_ich(E) tpm(E) mfd_core(E) shpchp(E) battery(E) video(E)
processor(E) button(E) ext4(E) crc16(E) jbd2(E) crc32c_generic(E)
mbcache(E) sg(E) sd_mod(E) ahci(E) libahci(E) crc32c_intel(E) libata(E)
xhci_pci(E) ehci_pci(E) scsi_mod(E) ehci_hcd(E) xhci_hcd(E) igb(E)
i2c_algo_bit(E) dca(E) ptp(E) usbcore(E) pps_core(E) usb_common(E) fan(E)
thermal(E) fjes(E)
[ 433.408872] CPU: 0 PID: 4244 Comm: qemu-system-x86 Tainted: G
OE 4.6.0-1-amd64 #1 Debian 4.6.1-1
[ 433.418763] Hardware name: To be filled by O.E.M. To be filled by
O.E.M./SHARKBAY, BIOS 4.6.5 10/08/2014
[ 433.428303] task: ffff8800d6674fc0 ti: ffff88040b7f0000 task.ti:
ffff88040b7f0000
[ 433.435850] RIP: 0010:[<ffffffff815c6237>] [<ffffffff815c6237>]
_raw_spin_lock_bh+0x17/0x30
[ 433.444417] RSP: 0018:ffff88041fa03e70 EFLAGS: 00010246
[ 433.449779] RAX: 0000000000000000 RBX: 000000000000000c RCX:
ffff88040a3c2200
[ 433.456966] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
000000000000000c
[ 433.464151] RBP: ffff88040a3c2200 R08: ffff88040a0e12e8 R09:
0000000000000100
[ 433.471335] R10: 00000000000001da R11: 000000000000303a R12:
ffff88040a02b200
[ 433.478519] R13: ffff880023fda110 R14: 0000000000000001 R15:
ffff88041fa16940
[ 433.485705] FS: 00007fcfcf7ed700(0000) GS:ffff88041fa00000(0000)
knlGS:0000000000000000
[ 433.493856] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 433.499657] CR2: 000000000000000c CR3: 000000040b745000 CR4:
00000000001426f0
[ 433.506844] Stack:
[ 433.508906] ffffffff814f5042 ffff880023fda0f8 ffff88040a02b218
ffffffff814f573c
[ 433.516590] ffff880023fda108 ffff880023fda110 ffff88040a02b258
000000000000000a
[ 433.524270] 0000000000000008 ffff88041fa16940 ffffffff814f388b
ffff880023fda100
[ 433.531956] Call Trace:
[ 433.534458] <IRQ>
[ 433.536422] [<ffffffff814f5042>] ? __tcf_hash_release+0x72/0xf0
[ 433.542724] [<ffffffff814f573c>] ? tcf_action_destroy+0x6c/0xa0
[ 433.548779] [<ffffffff814f388b>] ? tcf_exts_destroy+0x1b/0x30
[ 433.554666] [<ffffffffc0580213>] ? basic_delete_filter+0x13/0x30
[cls_basic]
[ 433.561855] [<ffffffff810dba59>] ? rcu_process_callbacks+0x1f9/0x5e0
[ 433.568354] [<ffffffff815c90b8>] ? __do_softirq+0xf8/0x28e
[ 433.573982] [<ffffffff8107fe3b>] ? irq_exit+0x9b/0xa0
[ 433.579173] [<ffffffff815c8ece>] ? smp_apic_timer_interrupt+0x3e/0x50
[ 433.585753] [<ffffffff815c71e2>] ? apic_timer_interrupt+0x82/0x90
[ 433.591985] <EOI>
[ 433.593952] [<ffffffffc0c02240>] ? vmx_invpcid_supported+0x20/0x20
[kvm_intel]
[ 433.601568] [<ffffffffc0c058d5>] ? vmx_interrupt_allowed+0x25/0x30
[kvm_intel]
[ 433.608958] [<ffffffffc0691140>] ? kvm_arch_vcpu_runnable+0xa0/0xd0
[kvm]
[ 433.615896] [<ffffffffc0675a8e>] ? kvm_vcpu_check_block+0xe/0x50 [kvm]
[ 433.622569] [<ffffffffc06761de>] ? kvm_vcpu_block+0x1ee/0x2b0 [kvm]
[ 433.628982] [<ffffffffc06918cc>] ? kvm_arch_vcpu_ioctl_run+0x75c/0x1520
[kvm]
[ 433.636274] [<ffffffffc0678c36>] ? kvm_vcpu_ioctl+0x316/0x5d0 [kvm]
[ 433.642678] [<ffffffff810ba0be>] ? __wake_up_common+0x4e/0x90
[ 433.648560] [<ffffffff81204a6d>] ? do_vfs_ioctl+0x9d/0x5c0
[ 433.654193] [<ffffffffc068390e>] ? kvm_on_user_return+0x3e/0x70 [kvm]
[ 433.660772] [<ffffffff81205004>] ? SyS_ioctl+0x74/0x80
[ 433.666048] [<ffffffff815c65b6>] ? system_call_fast_compare_end+0xc/0x96
[ 433.672888] Code: 01 00 00 00 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00
00 00 0f 1f 44 00 00 65 81 05 20 71 a4 7e 00 02 00 00 31 c0 ba 01 00 00 00
<f0> 0f b1 17 85 c0 75 02 f3 c3 89 c6 e8 18 a0 af ff 66 90 c3 0f
[ 433.695297] RIP [<ffffffff815c6237>] _raw_spin_lock_bh+0x17/0x30
[ 433.701501] RSP <ffff88041fa03e70>
[ 433.705047] CR2: 000000000000000c
[ 433.708417] ---[ end trace befdabb552866e5d ]---
[ 433.713082] Kernel panic - not syncing: Fatal exception in interrupt
[ 433.719503] Kernel Offset: disabled
[ 433.723046] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt
[ 433.730244] ------------[ cut here ]------------
[ 433.734915] WARNING: CPU: 0 PID: 4244 at
/build/linux-Jaz5I6/linux-4.6.1/arch/x86/kernel/smp.c:125
update_process_times+0x4a/0x60
[ 433.746629] Modules linked in: act_police(E) cls_basic(E) sch_ingress(E)
vhost_net(E) vhost(E) macvtap(E) macvlan(E) openvswitch(E)
nf_conntrack_ipv6(E) nf_nat_ipv6(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E)
nf_nat_ipv4(E) nf_defrag_ipv6(E) nf_nat(E) nf_conntrack(E) libcrc32c(E)
veth(E) tun(E) bridge(E) stp(E) llc(E) cpufreq_conservative(E)
cpufreq_userspace(E) cpufreq_powersave(E) cpufreq_stats(E) w83627ehf(E)
hwmon_vid(E) igb_uio(OE) uio_pci_generic(E) uio(E) intel_rapl(E)
x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E)
irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E)
iTCO_wdt(E) jitterentropy_rng(E) iTCO_vendor_support(E) i915(E) hmac(E)
evdev(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E)
gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd_hda_intel(E)
snd_hda_codec(E) pcspkr(E) snd_hda_core(E) snd_hwdep(E) drm_kms_helper(E)
snd_pcm(E) drm(E) i2c_i801(E) snd_timer(E) 8250_fintek(E) snd(E) tpm_tis(E)
soundcore(E) lpc_ich(E) tpm(E) mfd_core(E) shpchp(E) battery(E) video(E)
processor(E) button(E) ext4(E) crc16(E) jbd2(E) crc32c_generic(E)
mbcache(E) sg(E) sd_mod(E) ahci(E) libahci(E) crc32c_intel(E) libata(E)
xhci_pci(E) ehci_pci(E) scsi_mod(E) ehci_hcd(E) xhci_hcd(E) igb(E)
i2c_algo_bit(E) dca(E) ptp(E) usbcore(E) pps_core(E) usb_common(E) fan(E)
thermal(E) fjes(E)
[ 433.870854] CPU: 0 PID: 4244 Comm: qemu-system-x86 Tainted: G D
OE 4.6.0-1-amd64 #1 Debian 4.6.1-1
[ 433.880743] Hardware name: To be filled by O.E.M. To be filled by
O.E.M./SHARKBAY, BIOS 4.6.5 10/08/2014
[ 433.890283] 0000000000000086 000000003b22f910 ffffffff81311425
0000000000000000
[ 433.897969] 0000000000000000 ffffffff8107a50e ffff8800d6674fc0
0000000000000000
[ 433.905655] ffff88041fa03c18 ffffffff810f1dd0 0000000000000003
ffff88041fa0ee40
[ 433.913344] Call Trace:
[ 433.915843] <IRQ> [<ffffffff81311425>] ? dump_stack+0x5c/0x77
[ 433.921872] [<ffffffff8107a50e>] ? __warn+0xbe/0xe0
[ 433.926890] [<ffffffff810f1dd0>] ? tick_sched_handle.isra.13+0x50/0x50
[ 433.933558] [<ffffffff810e2e5a>] ? update_process_times+0x4a/0x60
[ 433.939789] [<ffffffff810f1da0>] ? tick_sched_handle.isra.13+0x20/0x50
[ 433.946456] [<ffffffff810f1e08>] ? tick_sched_timer+0x38/0x70
[ 433.952336] [<ffffffff810e36ea>] ? __hrtimer_run_queues+0xea/0x280
[ 433.958657] [<ffffffff810e3e69>] ? hrtimer_interrupt+0x99/0x190
[ 433.964716] [<ffffffff815c8ec9>] ? smp_apic_timer_interrupt+0x39/0x50
[ 433.971294] [<ffffffff815c71e2>] ? apic_timer_interrupt+0x82/0x90
[ 433.977526] [<ffffffff8116f9fe>] ? panic+0x1e0/0x220
[ 433.982626] [<ffffffff8116f9f7>] ? panic+0x1d9/0x220
[ 433.987733] [<ffffffff8102faa3>] ? oops_end+0xc3/0xd0
[ 433.992925] [<ffffffff81064011>] ? no_context+0x131/0x370
[ 433.998460] [<ffffffff815c85f8>] ? page_fault+0x28/0x30
[ 434.003827] [<ffffffff815c6237>] ? _raw_spin_lock_bh+0x17/0x30
[ 434.009798] [<ffffffff814f5042>] ? __tcf_hash_release+0x72/0xf0
[ 434.015857] [<ffffffff814f573c>] ? tcf_action_destroy+0x6c/0xa0
[ 434.021915] [<ffffffff814f388b>] ? tcf_exts_destroy+0x1b/0x30
[ 434.027798] [<ffffffffc0580213>] ? basic_delete_filter+0x13/0x30
[cls_basic]
[ 434.034987] [<ffffffff810dba59>] ? rcu_process_callbacks+0x1f9/0x5e0
[ 434.041475] [<ffffffff815c90b8>] ? __do_softirq+0xf8/0x28e
[ 434.047099] [<ffffffff8107fe3b>] ? irq_exit+0x9b/0xa0
[ 434.052289] [<ffffffff815c8ece>] ? smp_apic_timer_interrupt+0x3e/0x50
[ 434.058871] [<ffffffff815c71e2>] ? apic_timer_interrupt+0x82/0x90
[ 434.065099] <EOI> [<ffffffffc0c02240>] ?
vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[ 434.073139] [<ffffffffc0c058d5>] ? vmx_interrupt_allowed+0x25/0x30
[kvm_intel]
[ 434.080516] [<ffffffffc0691140>] ? kvm_arch_vcpu_runnable+0xa0/0xd0
[kvm]
[ 434.087450] [<ffffffffc0675a8e>] ? kvm_vcpu_check_block+0xe/0x50 [kvm]
[ 434.094123] [<ffffffffc06761de>] ? kvm_vcpu_block+0x1ee/0x2b0 [kvm]
[ 434.100538] [<ffffffffc06918cc>] ? kvm_arch_vcpu_ioctl_run+0x75c/0x1520
[kvm]
[ 434.107831] [<ffffffffc0678c36>] ? kvm_vcpu_ioctl+0x316/0x5d0 [kvm]
[ 434.114235] [<ffffffff810ba0be>] ? __wake_up_common+0x4e/0x90
[ 434.120117] [<ffffffff81204a6d>] ? do_vfs_ioctl+0x9d/0x5c0
[ 434.125747] [<ffffffffc068390e>] ? kvm_on_user_return+0x3e/0x70 [kvm]
[ 434.132330] [<ffffffff81205004>] ? SyS_ioctl+0x74/0x80
[ 434.137605] [<ffffffff815c65b6>] ? system_call_fast_compare_end+0xc/0x96
[ 434.144445] ---[ end trace befdabb552866e5e ]---
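To read the crash path at a glance, the symbol names can be pulled out of the trace with a small filter. This is only a sketch: `trace_symbols` is a hypothetical helper name, and it only recognises entries of the form `[<addr>] ? symbol+0xOFF/0xSIZE` that sit on a single line, so entries split by mail wrapping are skipped.

```shell
#!/bin/sh
# Print the function names found in kernel call-trace lines read from stdin,
# one per line, in the order they appear.
trace_symbols() {
    sed -n 's/.*\[<[0-9a-f]*>\] ? \([A-Za-z0-9_.]*\)+0x[0-9a-f]*\/0x[0-9a-f]*.*/\1/p'
}
```

Fed the log as pasted above (for example `trace_symbols < oops.log | head -n 4`, with `oops.log` being an assumed file name), the first four names are `__tcf_hash_release`, `tcf_action_destroy`, `tcf_exts_destroy` and `basic_delete_filter`, i.e. the filter teardown running from an RCU callback. All entries are marked `?` in the trace, so the chain should be read as likely rather than certain. CR2 and RDI are both 0xc, which would be consistent with a spinlock taken at a small offset from a NULL pointer.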
Let me know if more details are needed.
Thanks!
Riccardo
On Wed, Jan 11, 2017 at 11:30 PM, Joe Stringer <joe at ovn.org> wrote:
> On 10 January 2017 at 05:46, Riccardo R. <riccardoravaioli at gmail.com>
> wrote:
> > Hi,
> >
> > By setting QoS parameters more than once on an OpenvSwitch bridge, we
> > caused a kernel panic on our machine. This is reproducible with the
> > following commands:
> >
> > $ ovs-vsctl add-br mybridge
> > $ ovs-vsctl add-port mybridge myiface
> > $ ovs-vsctl set interface myiface ingress_policing_rate=100
> > # with the command below the kernel will crash, regardless of the value given as input:
> > $ ovs-vsctl set interface myiface ingress_policing_rate=1000
> >
> > We tried this with openvswitch 2.4 and 2.6.1 on a 4.6 Linux kernel.
> > Interestingly, it works correctly on openvswitch 2.3.0 installed on a
> > 3.16 kernel.
>
> Are you able to get the kernel backtrace from the console when it crashes?
>
> Most likely this is a regression in upstream Linux for the kernel on
> which you are seeing the problem. Can you provide more details? Eg, where
> did you get your 4.6 kernel from? What is the full "uname -r" version
> for it? What .config was used when compiling it? Are you able to also
> try other versions, eg 4.4 or 4.9?
>