[ovs-discuss] Null pointer when allocating OVS flow in ovs_packet_cmd_execute

Aaron Conole aconole at redhat.com
Fri May 29 15:03:25 UTC 2020


Daryl Wang via discuss <ovs-discuss at openvswitch.org> writes:

> We noticed that an OVS datapath stopped responding to stats requests. On checking dmesg, we found that OVS
> hit a NULL pointer dereference in ovs_flow_alloc while in ovs_packet_cmd_execute. Is this a known bug?

Not as far as I am aware.

> We have not seen the failure again, so we don't have a good sense of what triggers the error. Logs didn't record
> any noteworthy changes to the datapath around the time of failure.
> Open vSwitch version is 2.11.2. Unfortunately we did not have kernel debugging enabled at the time of the
> crash:
>
> May 15 09:23:14 hostname kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
> May 15 09:23:14 hostname kernel: #PF: supervisor read access in kernel mode
> May 15 09:23:14 hostname kernel: #PF: error_code(0x0000) - not-present page
> May 15 09:23:14 hostname kernel: PGD 0 P4D 0 
> May 15 09:23:14 hostname kernel: Oops: 0000 [#1] SMP PTI
> May 15 09:23:14 hostname kernel: CPU: 8 PID: 158558 Comm: handler91 Tainted: G           O      5.2.17-1rodete3-amd64 #1 Debian 5.2.17-1rodete3
> May 15 09:23:14 hostname kernel: Hardware name: "Hardware information"
> May 15 09:23:14 hostname kernel: RIP: 0010:kmem_cache_alloc_node+0x7e/0x1f0

Note that the faulting function is in the kmem_cache infrastructure,
which ovs_flow_alloc calls to get either the flow or the flow_stats
object during allocation.
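
To see at a glance which frames lead into the allocator, the trace can
be reduced to its reliable entries. A minimal sketch, assuming the oops
was saved as text in the format quoted here: it drops the frames the
kernel prefixes with `?` (addresses found on the stack that the unwinder
could not confirm). The names `FRAME_RE` and `reliable_frames` are
made up for this illustration.

```python
import re

# Matches a symbolized frame such as "ovs_flow_alloc+0x4d/0x90 [openvswitch]",
# optionally prefixed with "? " for an unreliable entry.
FRAME_RE = re.compile(
    r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def reliable_frames(log_lines):
    """Return (symbol, module) for call-trace frames not marked with '?'."""
    frames = []
    in_trace = False
    for line in log_lines:
        # Drop the email quote marker and the syslog prefix up to "kernel: ".
        text = line.lstrip("> ").split("kernel: ", 1)[-1].strip()
        if text.startswith("Call Trace:"):
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.fullmatch(text)
        if match is None:
            break  # the first non-frame line ends the trace
        if match.group(1) is None:  # no leading "?": a reliable frame
            frames.append((match.group(2), match.group(3)))
    return frames
```

Run over the trace in this report, the first two entries should be
ovs_flow_alloc and ovs_packet_cmd_execute, both in [openvswitch]; frames
from the core kernel come back with None as the module.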

How easily can you reproduce this?  It looks like something broke that
cache object.  Was something unloading or reloading the ovs module
around that time?  I'm trying to work out how to reproduce it.
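
One way to check for module activity is to look at what the kernel log
says about the module before the crash. A rough sketch, assuming the
log was saved to a text file: it collects every line that mentions the
module name and precedes the first oops. It makes no assumption about
what the module prints on load or unload, so the hits still need to be
read by hand. The function name `module_lines_before_oops` is
hypothetical.

```python
def module_lines_before_oops(log_lines, module="openvswitch"):
    """Collect log lines that mention `module` and precede the first oops."""
    hits = []
    for line in log_lines:
        if "BUG:" in line or "Oops:" in line:
            break  # stop at the crash report itself
        if module in line:
            hits.append(line.rstrip("\n"))
    return hits
```

Stopping at the first "BUG:" or "Oops:" line keeps the call-trace frames,
which also carry the [openvswitch] tag, out of the result.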

> May 15 09:23:14 hostname kernel: Code: 75 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 70 0e fc 74 4d 8b 30 4d 85 f6 74 1a 41 83 fc ff 0f 84 83 00 00 00 49 8b 40 10 <48> 8b 00 48 c1 e8 3a 41 39 c4 74 73 48 8b 0c 24 44 89 e2 89 ee 4c
> May 15 09:23:14 hostname kernel: RSP: 0018:ffffbc910e0afa68 EFLAGS: 00010213
> May 15 09:23:14 hostname kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> May 15 09:23:14 hostname kernel: RDX: 0000000000043ce7 RSI: 0000000000000dc0 RDI: ffff9a7f80126280
> May 15 09:23:14 hostname kernel: RBP: 0000000000000dc0 R08: ffffdc90fc317330 R09: ffff9a7b06225b60
> May 15 09:23:14 hostname kernel: R10: 0000000000000003 R11: ffffffffc0c5e510 R12: 0000000000000000
> May 15 09:23:14 hostname kernel: R13: ffff9a7f80126280 R14: ffff9a7f64dbada0 R15: ffff9a7f80126280
> May 15 09:23:14 hostname kernel: FS:  00007fdb877fe700(0000) GS:ffff9a803f100000(0000) knlGS:0000000000000000
> May 15 09:23:14 hostname kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> May 15 09:23:14 hostname kernel: CR2: 0000000000000000 CR3: 0000000ed7d46004 CR4: 00000000003606e0
> May 15 09:23:14 hostname kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> May 15 09:23:14 hostname kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> May 15 09:23:14 hostname kernel: Call Trace:
> May 15 09:23:14 hostname kernel: ? ovs_flow_alloc+0x4d/0x90 [openvswitch]
> May 15 09:23:14 hostname kernel: ovs_flow_alloc+0x4d/0x90 [openvswitch]
> May 15 09:23:14 hostname kernel: ovs_packet_cmd_execute+0xd0/0x2a0 [openvswitch]
> May 15 09:23:14 hostname kernel: ? _cond_resched+0x15/0x30
> May 15 09:23:14 hostname kernel: genl_family_rcv_msg+0x1d2/0x410
> May 15 09:23:14 hostname kernel: ? recalibrate_cpu_khz+0x10/0x10
> May 15 09:23:14 hostname kernel: ? ktime_get_raw_ts64+0x32/0xc0
> May 15 09:23:14 hostname kernel: genl_rcv_msg+0x47/0x90
> May 15 09:23:14 hostname kernel: ? __kmalloc_node_track_caller+0x1cb/0x290
> May 15 09:23:14 hostname kernel: ? genl_family_rcv_msg+0x410/0x410
> May 15 09:23:14 hostname kernel: netlink_rcv_skb+0x49/0x110
> May 15 09:23:14 hostname kernel: genl_rcv+0x24/0x40
> May 15 09:23:15 hostname kernel: netlink_unicast+0x17e/0x200
> May 15 09:23:15 hostname kernel: netlink_sendmsg+0x204/0x3d0
> May 15 09:23:15 hostname kernel: sock_sendmsg+0x4c/0x50
> May 15 09:23:15 hostname kernel: ___sys_sendmsg+0x29f/0x300
> May 15 09:23:15 hostname kernel: ? ep_send_events_proc+0xf7/0x250
> May 15 09:23:15 hostname kernel: ? ep_read_events_proc+0xe0/0xe0
> May 15 09:23:15 hostname kernel: ? ep_scan_ready_list.constprop.21+0x1fe/0x230
> May 15 09:23:15 hostname kernel: __sys_sendmsg+0x57/0xa0
> May 15 09:23:15 hostname kernel: do_syscall_64+0x53/0x130
> May 15 09:23:15 hostname kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
> May 15 09:23:15 hostname kernel: RIP: 0033:0x7fdd0141f22d
> May 15 09:23:15 hostname kernel: Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 0a ed ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 48 89 44 24 08 e8 3e ed ff ff 48
> May 15 09:23:15 hostname kernel: RSP: 002b:00007fdb8779f100 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
> May 15 09:23:15 hostname kernel: RAX: ffffffffffffffda RBX: 00007fdb8779ff80 RCX: 00007fdd0141f22d
> May 15 09:23:15 hostname kernel: RDX: 0000000000000000 RSI: 00007fdb8779f190 RDI: 0000000000000015


