[ovs-dev] [PATCH RFC v2 0/8] Introduce connection tracking tc offload
Marcelo Ricardo Leitner
marcelo.leitner at gmail.com
Tue Jul 23 15:42:17 UTC 2019
On Sat, Jul 20, 2019 at 08:26:59AM +0000, Paul Blakey wrote:
> Hi Marcelo, thanks for reporting this, can you dump the datapath rules via ovs-appctl dpctl/dump-flows -m --names
> Also running "tc filter show dev ns2-veth-ab ingress" while it happened can show us more details.
>
> I'll try and reproduce it on my end.
> How often does this happen? and what is the setup ?
Quite often.
Setup is:
ns1 ns2
--------------. .--------------
| v------ br0 ------v |
ns1-veth-ba--x----ns1-veth-ab ns2-veth-ab----x-- ns2-veth-ba
| |
--------------' '--------------
Seems related to the skb_ext patches.. with a kernel with more
debugging stuff enabled, I'm seeing:
[ 17.467576] =============================
[ 17.468717] WARNING: suspicious RCU usage
[ 17.469687] 5.2.0.c1f3d.g83fb7bc8ff16+ #2 Not tainted
[ 17.471496] -----------------------------
[ 17.472922] net/sched/sch_ingress.c:52 suspicious rcu_dereference_protected() usage!
[ 17.474715]
other info that might help us debug this:
[ 17.476219]
rcu_scheduler_active = 2, debug_locks = 1
[ 17.477408] no locks held by ovs-vswitchd/1063.
[ 17.478267]
stack backtrace:
[ 17.479545] CPU: 10 PID: 1063 Comm: ovs-vswitchd Not tainted 5.2.0.c1f3d.g83fb7bc8ff16+ #2
[ 17.481932] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014
[ 17.483725] Call Trace:
[ 17.484179] dump_stack+0x85/0xc0
[ 17.484782] ingress_tcf_block+0x4c/0x50 [sch_ingress]
[ 17.485717] __tcf_block_find+0x28/0x80
[ 17.486419] tc_new_tfilter+0x1b6/0x960
[ 17.487093] ? tc_del_tfilter+0x720/0x720
[ 17.487854] rtnetlink_rcv_msg+0x389/0x4b0
[ 17.488590] ? netlink_deliver_tap+0x95/0x400
[ 17.489343] ? rtnl_dellink+0x2d0/0x2d0
[ 17.490034] netlink_rcv_skb+0x49/0x110
[ 17.490726] netlink_unicast+0x171/0x200
[ 17.491409] netlink_sendmsg+0x21e/0x3e0
[ 17.492096] sock_sendmsg+0x5e/0x60
[ 17.492792] ___sys_sendmsg+0x2ae/0x330
[ 17.493725] ? ___sys_recvmsg+0x159/0x1f0
[ 17.494734] ? up_write+0x1c/0xc0
[ 17.495515] ? ext4_file_write_iter+0xd1/0x3b0
[ 17.496624] ? find_held_lock+0x2b/0x80
[ 17.497362] ? ksys_write+0xc0/0xe0
[ 17.498000] __sys_sendmsg+0x59/0xa0
[ 17.498653] do_syscall_64+0x5c/0xa0
[ 17.499307] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 17.500225] RIP: 0033:0x7f6bd4ab2b45
And I could trigger a crash with this trace:
(crash was on a different run, while the RCU trace above and outputs below are
from the same run)
[ 382.670318] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 382.671927] #PF: supervisor read access in kernel mode
[ 382.673070] #PF: error_code(0x0000) - not-present page
[ 382.674170] PGD 0 P4D 0
[ 382.674873] Oops: 0000 [#1] SMP NOPTI
[ 382.675753] CPU: 1 PID: 6988 Comm: nc Kdump: loaded Not tainted 5.2.0.c1f3d.g83fb7bc8ff16+ #1
[ 382.677350] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014
[ 382.679255] RIP: 0010:tcf_classify+0x104/0x1e0
[ 382.680942] Code: 5c 41 5d 41 5e 41 5f c3 48 8b 55 00 be 01 00 00 00 4c 89 ff 48 89 14 24 e8 79 e3 f9 ff 48 8b 14 24 48 85 c0 0f 84 cb 00 00
00 <48> 8b 4a 30 8b 49 40 89 08 41 8d 45 01 41 83 fd 03 7e ac e8 84 37
[ 382.684792] RSP: 0018:ffffbd19000f4dd0 EFLAGS: 00010282
[ 382.685746] RAX: ffff9e04772b0008 RBX: ffff9e0479088360 RCX: 0000000000000001
[ 382.686929] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9e047ad9a140
[ 382.688144] RBP: ffffbd19000f4e40 R08: ffff9e047b870b60 R09: 0000000000000002
[ 382.689347] R10: ffff9e04550ccd68 R11: ffff9e0476050a00 R12: 0000000000000001
[ 382.690534] R13: 0000000000000000 R14: ffff9e0479088360 R15: ffff9e0479229ce0
[ 382.691730] FS: 00007efd4d75bb80(0000) GS:ffff9e047b840000(0000) knlGS:0000000000000000
[ 382.693033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 382.694061] CR2: 0000000000000030 CR3: 0000000163abc000 CR4: 00000000003406e0
[ 382.695283] Call Trace:
[ 382.695962] <IRQ>
[ 382.696573] __netif_receive_skb_core+0x3c0/0xcf0
[ 382.697492] ? reweight_entity+0x15b/0x1a0
[ 382.698340] __netif_receive_skb_one_core+0x37/0x90
[ 382.699274] process_backlog+0x9c/0x150
[ 382.700199] net_rx_action+0xff/0x350
[ 382.701093] __do_softirq+0xee/0x2ff
[ 382.701976] do_softirq_own_stack+0x2a/0x40
[ 382.702948] </IRQ>
[ 382.703626] do_softirq.part.0+0x41/0x50
[ 382.704556] __local_bh_enable_ip+0x4b/0x50
[ 382.705536] ip_finish_output2+0x1a9/0x580
[ 382.706518] ip_output+0x71/0xf0
[ 382.707392] ? __ip_finish_output+0x1c0/0x1c0
[ 382.708422] __ip_queue_xmit+0x16c/0x420
[ 382.709423] ? __kmalloc_reserve.isra.0+0x2d/0x70
[ 382.711184] __tcp_transmit_skb+0x52c/0xab0
[ 382.712582] tcp_connect+0xaed/0xda0
[ 382.713403] ? kvm_clock_get_cycles+0xd/0x10
[ 382.714315] tcp_v4_connect+0x3f1/0x500
[ 382.715156] __inet_stream_connect+0xc5/0x360
[ 382.716070] inet_stream_connect+0x37/0x50
[ 382.716947] __sys_connect+0xd6/0x100
[ 382.717766] ? do_fcntl+0x1a5/0x650
[ 382.718588] ? __x64_sys_fcntl+0x84/0xb0
[ 382.719450] __x64_sys_connect+0x16/0x20
[ 382.720310] do_syscall_64+0x5b/0x180
[ 382.721146] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Then on the test:
+ ovs-appctl dpctl/dump-conntrack
tcp,orig=(src=192.168.0.2,dst=192.168.0.1,sport=35948,dport=5001),reply=(src=192.168.0.1,dst=233.185.30.138,sport=5001,dport=35948),zone=2,protoinfo=(state=SYN_SENT)
tcp,orig=(src=192.168.0.2,dst=192.168.0.1,sport=35946,dport=5001),reply=(src=192.168.0.1,dst=192.168.0.2,sport=5001,dport=35946),protoinfo=(state=TIME_WAIT)
+ ovs-appctl dpctl/dump-flows -m --names
ufid:a7998e58-919b-4d05-9cf5-4ab05527bf91, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x20),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns2-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=0/0), packets:3, bytes:180, used:2.890s, dp:tc, actions:ct(zone=2),recirc(0x1)
ufid:89ec315b-9068-4c2c-a795-3ea78c406616, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns2-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=12:1d:05:89:52:0b,dst=33:33:00:00:00:02),eth_type(0x86dd),ipv6(src=::/::,dst=::/::,label=0/0,proto=0/0,tclass=0/0,hlimit=0/0,frag=no), packets:1, bytes:56, used:1.480s, dp:tc, actions:br0,ns1-veth-ab
ufid:7becab91-3e24-4ba5-abe0-39b7650243ff, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns2-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=12:1d:05:89:52:0b,dst=33:33:ff:89:52:0b),eth_type(0x86dd),ipv6(src=::/::,dst=::/::,label=0/0,proto=0/0,tclass=0/0,hlimit=0/0,frag=no), packets:0, bytes:0, used:6.010s, dp:tc, actions:br0,ns1-veth-ab
ufid:868ba402-078f-4220-9865-6c58ee6e02bb, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns2-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=12:1d:05:89:52:0b,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(src=::/::,dst=::/::,label=0/0,proto=0/0,tclass=0/0,hlimit=0/0,frag=no), packets:0, bytes:0, used:6.010s, dp:tc, actions:br0,ns1-veth-ab
ufid:10757d31-40c8-4359-8735-b40e2e16b07c, skb_priority(0/0),skb_mark(0/0),ct_state(0x21/0x21),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0x1),dp_hash(0/0),in_port(ns2-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=6,tos=0/0,ttl=0/0,frag=no),tcp(src=0/0,dst=0/0), packets:3, bytes:180, used:2.890s, dp:tc, actions:ct(commit,zone=2,nat(src=192.168.0.30)),ns1-veth-ab
ufid:9e322377-a849-45d5-8c9e-719bb60bf76a, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns1-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=0e:78:05:f3:ea:8c,dst=33:33:ff:f3:ea:8c),eth_type(0x86dd),ipv6(src=::/::,dst=::/::,label=0/0,proto=0/0,tclass=0/0,hlimit=0/0,frag=no), packets:0, bytes:0, used:6.010s, dp:tc, actions:br0,ns2-veth-ab
ufid:1c0c9b46-ceb1-4c6f-95ae-f44169f6395b, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns1-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=0e:78:05:f3:ea:8c,dst=33:33:00:00:00:02),eth_type(0x86dd),ipv6(src=::/::,dst=::/::,label=0/0,proto=0/0,tclass=0/0,hlimit=0/0,frag=no), packets:1, bytes:56, used:2.500s, dp:tc, actions:br0,ns2-veth-ab
ufid:6e12004f-a24f-42d5-b946-d213491f93c6, skb_priority(0/0),skb_mark(0/0),ct_state(0/0x23),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ns1-veth-ab),packet_type(ns=0/0,id=0/0),eth(src=0e:78:05:f3:ea:8c,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(src=::/::,dst=::/::,label=0/0,proto=0/0,tclass=0/0,hlimit=0/0,frag=no), packets:0, bytes:0, used:6.000s, dp:tc, actions:br0,ns2-veth-ab
+ tc filter show dev ns2-veth-ab ingress
filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
eth_type ipv4
ip_proto tcp
ip_flags nofrag
ct_state -trk
not_in_hw
action order 1: ct zone 2 pipe
index 1 ref 1 bind 1
cookie 588e99a7054d9b91b04af59c91bf2755
action order 2: gact action goto chain 1
random type none pass val 0
index 1 ref 1 bind 1
cookie 588e99a7054d9b91b04af59c91bf2755
filter protocol ipv6 pref 4 flower chain 0
filter protocol ipv6 pref 4 flower chain 0 handle 0x1
dst_mac 33:33:00:00:00:02
src_mac 12:1d:05:89:52:0b
eth_type ipv6
ip_flags nofrag
ct_state -trk-new-est
not_in_hw
action order 1: skbedit ptype host pipe
index 1 ref 1 bind 1
action order 2: mirred (Ingress Mirror to device br0) pipe
index 2 ref 1 bind 1
cookie 5b31ec892c4c6890a73e95a71666408c
action order 3: mirred (Egress Redirect to device ns1-veth-ab) stolen
index 3 ref 1 bind 1
cookie 5b31ec892c4c6890a73e95a71666408c
filter protocol ipv6 pref 4 flower chain 0 handle 0x2
dst_mac 33:33:ff:89:52:0b
src_mac 12:1d:05:89:52:0b
eth_type ipv6
ip_flags nofrag
ct_state -trk-new-est
not_in_hw
action order 1: skbedit ptype host pipe
index 2 ref 1 bind 1
action order 2: mirred (Ingress Mirror to device br0) pipe
index 4 ref 1 bind 1
cookie 91abec7ba54b243eb739e0abff430265
action order 3: mirred (Egress Redirect to device ns1-veth-ab) stolen
index 5 ref 1 bind 1
cookie 91abec7ba54b243eb739e0abff430265
filter protocol ipv6 pref 4 flower chain 0 handle 0x3
dst_mac 33:33:00:00:00:16
src_mac 12:1d:05:89:52:0b
eth_type ipv6
ip_flags nofrag
ct_state -trk-new-est
not_in_hw
action order 1: skbedit ptype host pipe
index 3 ref 1 bind 1
action order 2: mirred (Ingress Mirror to device br0) pipe
index 6 ref 1 bind 1
cookie 02a48b8620428f07586c6598bb026eee
action order 3: mirred (Egress Redirect to device ns1-veth-ab) stolen
index 7 ref 1 bind 1
cookie 02a48b8620428f07586c6598bb026eee
filter protocol ip pref 3 flower chain 1
filter protocol ip pref 3 flower chain 1 handle 0x1
eth_type ipv4
ip_proto tcp
ip_flags nofrag
ct_state +trk+new
not_in_hw
action order 1: ct commit zone 2 nat src addr 192.168.0.30-0.0.0.0 pipe
index 2 ref 1 bind 1
cookie 317d75105943c8400eb435877cb0162e
action order 2: mirred (Egress Redirect to device ns1-veth-ab) stolen
index 1 ref 1 bind 1
cookie 317d75105943c8400eb435877cb0162e
Test script is attached.
More information about the dev
mailing list