[ovs-discuss] Re: Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack

txfh2007 txfh2007 at aliyun.com
Thu Nov 7 05:37:24 UTC 2019


Hi Darrell:
    Sorry, I forgot to mention that the attached flows are for the VM tx-direction rate limit, so the datapath action order is conntrack -> meter -> forward decision -> output. For the VM rx-direction rate limit, the datapath flows are below; please help check them. Thank you!
    Also, with the same flow table and meter configuration, the kernel datapath rate limit is more accurate than the userspace datapath's.

For VM rx direction rate limit:

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x29),in_port(5),packet_type(ns=0,id=0),eth(src=fa:16:3e:33:02:d8,dst=fa:16:3e:12:d7:77),eth_type(0x0800),ipv4(dst=192.168.1.10/255.255.255.248,proto=6,frag=no),tcp_flags(ack), packets:1031455, bytes:1481163900, used:0.149s, flags:., actions:ct(zone=4),recirc(0x2a)

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x2a),in_port(5),packet_type(ns=0,id=0),eth(src=fa:16:3e:33:02:d8,dst=fa:16:3e:12:d7:77),eth_type(0x0800),ipv4(src=192.168.1.8,dst=192.168.1.10,proto=6,frag=no), packets:1685180, bytes:2415638857, used:0.118s, flags:P., actions:meter(1),6




------------------------------------------------------------------
Subject: Re: [ovs-discuss] Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Timo

On Wed, Nov 6, 2019 at 2:06 AM txfh2007 <txfh2007 at aliyun.com> wrote:

Hi Darrell:

The flow dump result is below; please help check it.
BEFORE:

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x1b),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(dst=192.168.1.8/255.255.255.248,proto=6,frag=no),tcp_flags(psh|ack), packets:18934, bytes:26602222, used:0.000s, flags:P., actions:ct(zone=1),recirc(0x1c)

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x1c),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(src=192.168.1.10,dst=192.168.1.8,proto=6,frag=no), packets:5345996, bytes:7676256441, used:0.000s, flags:P., actions:5

AFTER:

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x19),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(dst=192.168.1.8/255.255.255.248,proto=6,frag=no),tcp_flags(ack), packets:2473174, bytes:3551472384, used:0.136s, flags:., actions:meter(0),ct(zone=1),recirc(0x1a)


The meter is applied by the rule above, and then the pipeline continues below with another pass through the datapath;
this would likely explain the numbers.
Can you double-check the OpenFlow rules and do the metering at the output rule?


 
ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x1a),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(src=192.168.1.10,dst=192.168.1.8,proto=6,frag=no), packets:5292889, bytes:7599875381, used:0.046s, flags:P., actions:5
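
A quick way to check the action order in a dumped flow is to split out the text after "actions:". The sketch below is a hypothetical helper, not part of OVS; the function names are made up for illustration, and the two sample lines are abbreviated copies of flows posted in this thread (match fields shortened). It does not handle nested parentheses.

```python
import re

def datapath_actions(flow_line):
    """Return the top-level actions of one dumped datapath flow, in order."""
    _, _, actions = flow_line.partition("actions:")
    # Split on commas that are not inside parentheses, e.g. ct(zone=1,nat).
    return [a for a in re.split(r",(?![^()]*\))", actions.strip()) if a]

def meter_before_ct(flow_line):
    """True if a meter action runs before a ct action in the same flow."""
    names = [a.split("(")[0] for a in datapath_actions(flow_line)]
    return ("meter" in names and "ct" in names
            and names.index("meter") < names.index("ct"))

# Abbreviated from the AFTER dump (tx direction) and the rx-direction dump.
tx_flow = ("recirc_id(0x19),in_port(6), packets:2473174, bytes:3551472384, "
           "used:0.136s, flags:., actions:meter(0),ct(zone=1),recirc(0x1a)")
rx_flow = ("recirc_id(0x2a),in_port(5), packets:1685180, bytes:2415638857, "
           "used:0.118s, flags:P., actions:meter(1),6")

for name, flow in (("tx", tx_flow), ("rx", rx_flow)):
    print(name, datapath_actions(flow), "meter before ct:", meter_before_ct(flow))
```

For the tx flow the meter sits ahead of ct and recirc, which is the ordering questioned above; for the rx flow the meter is followed directly by the output port.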



The meter rate is 1 Gbps; the iperf result is around 800 Mbps:

[  5]  95.00-96.00  sec   104 MBytes   869 Mbits/sec
[  5]  96.00-97.00  sec  79.4 MBytes   666 Mbits/sec
[  5]  97.00-98.00  sec   107 MBytes   896 Mbits/sec
[  5]  98.00-99.00  sec  75.4 MBytes   632 Mbits/sec
[  5]  99.00-100.00 sec  98.3 MBytes   824 Mbits/sec
[  5] 100.00-100.04 sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-100.04 sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-100.04 sec  9.29 GBytes   798 Mbits/sec                  receiver
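
As a sanity check on the summary line, the receiver rate can be recomputed from the transfer size and duration. This is a small sketch that assumes iperf reports transfer in binary units (1 GByte = 1024**3 bytes) and bandwidth in decimal units (1 Mbit = 10**6 bits); the function name is made up for illustration.

```python
def mbits_per_sec(gbytes, seconds):
    """Convert a transfer of `gbytes` (binary GBytes) over `seconds` to Mbit/s."""
    bits = gbytes * 1024**3 * 8   # assumed: binary units for the transfer column
    return bits / seconds / 1e6   # assumed: decimal units for the bandwidth column

rate = mbits_per_sec(9.29, 100.04)
print(f"{rate:.1f} Mbit/s, {rate / 1000:.0%} of the 1 Gbps meter rate")
```

Under those assumptions the result agrees with the reported 798 Mbits/sec, roughly 80% of the configured rate.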



------------------------------------------------------------------
From: Darrell Ball <dlu998 at gmail.com>
Date: Wednesday, November 6, 2019 02:46
To: txfh2007 <txfh2007 at aliyun.com>
Cc: Ben Pfaff <blp at ovn.org>; ovs-discuss <ovs-discuss at openvswitch.org>
Subject: Re: [ovs-discuss] Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Timo

On Mon, Nov 4, 2019 at 11:29 PM txfh2007 <txfh2007 at aliyun.com> wrote:

Hi Darrell:
    The meter rate limit is set to 1 Gbps, but the actual rate is around 500 Mbps. I have read the meter patch, but that patch prevents delta_t from becoming 0; in my case, delta_t is around 35500 ms.
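
To make the role of delta_t concrete, here is a generic single-band token bucket. It is only a sketch of the general technique and is NOT the OVS dpif-netdev meter code; the class name, parameters, and the millisecond granularity are assumptions made for illustration.

```python
class TokenBucket:
    """Generic token bucket (illustrative only, not the OVS implementation)."""

    def __init__(self, rate_kbps, burst_kbits):
        self.rate_kbps = rate_kbps            # 1 kbit/s equals 1 bit per millisecond
        self.burst_bits = burst_kbits * 1000  # bucket capacity in bits
        self.tokens = self.burst_bits         # start with a full bucket
        self.last_ms = 0

    def allow(self, now_ms, packet_bytes):
        delta_t = now_ms - self.last_ms       # elapsed whole milliseconds
        self.last_ms = now_ms
        # Refill in proportion to delta_t, capped at the burst size.
        self.tokens = min(self.burst_bits, self.tokens + delta_t * self.rate_kbps)
        need = packet_bytes * 8
        if self.tokens >= need:
            self.tokens -= need
            return True
        return False

# Offer about 2 Gbit/s of 1500-byte packets to a 1 Gbit/s bucket for one second.
bucket = TokenBucket(rate_kbps=1_000_000, burst_kbits=1_000)
passed_bits = 0
for now_ms in range(1, 1001):
    for _ in range(166):
        if bucket.allow(now_ms, 1500):
            passed_bits += 1500 * 8
print(passed_bits / 1e6, "Mbit passed in 1 s")

# A very large delta_t only refills the bucket up to its burst size.
idle = TokenBucket(rate_kbps=1_000_000, burst_kbits=1_000)
idle.tokens = 0
print(idle.allow(35_500, 1500), idle.tokens)
```

In this model packets that arrive within the same millisecond see delta_t = 0 and add no tokens, while a long idle gap such as 35500 ms simply saturates the bucket at the burst size.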


It might be good to just include all known related fixes anyway, including this other one:


https://github.com/openvswitch/ovs/commit/acc5df0e3cb036524d49891fdb9ba89b609dd26a




In my case, the meter action is in OpenFlow table 46, the ct action is in table 44, and the output action is in table 65, so I guess the order is right?


Could you dump the 'relevant' datapath flows before and after adding the meter rule?
ovs-appctl dpif/dump-flows <bridge>



Thanks 

Timo 



------------------------------------------------------------------
From: Darrell Ball <dlu998 at gmail.com>
Date: Tuesday, November 5, 2019 06:56
To: txfh2007 <txfh2007 at aliyun.com>
Cc: Ben Pfaff <blp at ovn.org>; ovs-discuss <ovs-discuss at openvswitch.org>
Subject: Re: [ovs-discuss] Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Timo

On Sun, Nov 3, 2019 at 5:12 PM txfh2007 <txfh2007 at aliyun.com> wrote:

Hi Darrell:
    Sorry for my late reply. Yes, the two VMs under test are on the same compute node, and packets are received and transmitted via vhost-user type ports.

Got it


First, if I don't configure a meter table, the iperf TCP bandwidth from VM1 to VM2 is around 5 Gbps. Then I set the meter entry to constrain the rate, and the deviation is larger than I thought.


IIUC, pre-meter you get 5 Gbps, then post-meter 0.5 Gbps, which is less than you expected?
What did you expect the metered rate to be?
Note that Ben pointed you to a meter-related bug fix on the alias before.

    I guess the recalculation of l4 checksum during conntrack would impact the actual rate?


Are you applying the meter rule at the end of the complete pipeline?


Thank you 
Timo 




To: txfh2007 <txfh2007 at aliyun.com>
Cc: Ben Pfaff <blp at ovn.org>; ovs-discuss <ovs-discuss at openvswitch.org>
Subject: Re: [ovs-discuss] Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Timo


I read through this thread to get more context on what you are doing; you have a base OVS-DPDK
use case and are measuring VM-to-VM performance across 2 compute nodes. You are probably using
vhost-user-client ports? Please correct me if I am wrong.
In this case, "per direction" you have one rx virtual interface to handle in OVS; there will be a tradeoff between
checksum validation security and performance.
To be clear, in terms of your measurements, how did you arrive at the 5 Gbps: instrumented code or otherwise?
(I can verify that later when I have a setup.)


Darrell










On Thu, Oct 31, 2019 at 9:23 AM Darrell Ball <dlu998 at gmail.com> wrote:




On Thu, Oct 31, 2019 at 3:04 AM txfh2007 via discuss <ovs-discuss at openvswitch.org> wrote:

Hi Ben && Darrell:
     This patch works, but after merging it I found that the iperf throughput decreased from 5 Gbps+ to 500 Mbps.

What is the 5 Gbps number? Is that the number with all packets marked as invalid in the initial sanity checks?


Typically one wants to offload checksum checks. The code checks whether that has been done and skips
doing it in software; can you verify that you have the capability and are using it ?


Skipping checksum checks reduces security, of course, but it can be added if there is a common case of
not being able to offload checksumming. 



 I guess maybe we should add a switch to turn off layer 4 checksum validation when doing userspace conntrack? I found that for kernel conntrack there is a related knob named "nf_conntrack_checksum".

Any advice?

Thank you !

------------------------------------------------------------------

To: Ben Pfaff <blp at ovn.org>
Cc: ovs-discuss <ovs-discuss at openvswitch.org>
Subject: Re:Re:[ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Ben && Darrell:
     Thanks, this patch works! Now the issue seems fixed 

Timo


Re: Re:[ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack


I see.

It sounds like Darrell pointed out the solution, but please let me know
if it did not help.

On Fri, Oct 11, 2019 at 08:57:58AM +0800, txfh2007 wrote:
> Hi Ben:
> 
>      I just found the GCC_UNALIGNED_ACCESSORS error during a gdb trace and am not sure whether this is a misalignment error or something else. What I can confirm is that during "extract_l4" of this ICMP reply packet, when we do "check_l4_icmp", the unaligned error is emitted and "extract_l4" returns false, so the packet is marked as ct_state=invalid.
> 
> Thank you for your help.
> 
> Timo
> 
> Topic:Re: [ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack
> 
> 
> It's very surprising.
> 
> Are you using a RISC architecture that insists on aligned accesses?  On
> the other hand, if you are using x86-64 or some other architecture that
> ordinarily does not care, are you sure that this is about a misaligned
> access (it is more likely to simply be a bad pointer)?
> 
> On Thu, Oct 10, 2019 at 10:50:33PM +0800, txfh2007 via discuss wrote:
> > 
> > Hi all:
> >     I was using OVS-DPDK (version 2.10-1) and found that pinging between two VMs on different compute nodes failed. I checked my environment and found that one node's NIC cannot strip the CRC of a frame, while the other node's NIC is normal (I mean it can strip the CRC). The ping fails because the ICMP reply packet (from the node whose NIC cannot strip the CRC) is marked as invalid. So the ICMP request from Node A is 64 bytes, but the ICMP reply from Node B is 68 bytes (with 4 bytes of CRC). When doing "check_l4_icmp", when we call the csum routine (in lib/csum.c), GCC emits an unaligned accessor error. The backtrace is below.
> > 
> >     I just want to confirm whether this behavior is expected?
> > 
> > Many thanks
> > 
> > Timo
> > 
> > 
> > get_unaligned_be16 (p=0x7f2ad0b1ed5c) at lib/unaligned.h:89
> > 89 GCC_UNALIGNED_ACCESSORS(ovs_be16, be16);
> > (gdb) bt
> > #0  get_unaligned_be16 (p=0x7f2ad0b1ed5c) at lib/unaligned.h:89
> > #1  0x000000000075a584 in csum_continue (partial=0, data_=0x7f2ad0b1ed5c, n=68) at lib/csum.c:46
> > #2  0x000000000075a552 in csum (data=0x7f2ad0b1ed5c, n=68) at lib/csum.c:33
> > #3  0x00000000008ddf18 in check_l4_icmp (data=0x7f2ad0b1ed5c, size=68, validate_checksum=true) at lib/conntrack.c:1638
> > #4  0x00000000008de650 in extract_l4 (key=0x7f32a20df120, data=0x7f2ad0b1ed5c, size=68, related=0x7f32a20df15d, l3=0x7f2ad0b1ed48, 
> >     validate_checksum=true) at lib/conntrack.c:1888
> > #5  0x00000000008de90d in conn_key_extract (ct=0x7f32b42a2d98, pkt=0x7f2ad0b1e9c0, dl_type=8, ctx=0x7f32a20df120, zone=4)
> >     at lib/conntrack.c:1973
> > #6  0x00000000008dd49c in conntrack_execute (ct=0x7f32b42a2d98, pkt_batch=0x7f32a20e08b0, dl_type=8, force=false, commit=false, 
> >     zone=4, setmark=0x0, setlabel=0x0, tp_src=0, tp_dst=0, helper=0x0, nat_action_info=0x0, now=5395897849) at lib/conntrack.c:1318
> > #7  0x000000000076d651 in dp_execute_cb (aux_=0x7f32a20dfb00, packets_=0x7f32a20e08b0, a=0x7f32a20e0ac8, should_steal=false)
> >     at lib/dpif-netdev.c:6711
> > #8  0x00000000007b2d49 in odp_execute_actions (dp=0x7f32a20dfb00, batch=0x7f32a20e08b0, steal=true, actions=0x7f32a20e0ac8, 
> >     actions_len=20, dp_execute_action=0x76ca60 <dp_execute_cb>) at lib/odp-execute.c:726
> > #9  0x000000000076d71b in dp_netdev_execute_actions (pmd=0x7f2a6e1ce010, packets=0x7f32a20e08b0, should_steal=true, 
> >     flow=0x7f32a20dfb60, actions=0x7f32a20e0ac8, actions_len=20) at lib/dpif-netdev.c:6754
> > #10 0x000000000076b900 in handle_packet_upcall (pmd=0x7f2a6e1ce010, packet=0x7f2ad0b1e9c0, key=0x7f32a20e1100, 
> >     actions=0x7f32a20e0a40, put_actions=0x7f32a20e0a80) at lib/dpif-netdev.c:6056
> > #11 0x000000000076bdf0 in fast_path_processing (pmd=0x7f2a6e1ce010, packets_=0x7f32a20e2b60, keys=0x7f32a20e10c0, 
> >     batches=0x7f32a20e0f90, n_batches=0x7f32a20e13c0, in_port=15) at lib/dpif-netdev.c:6153
> > #12 0x000000000076c3df in dp_netdev_input__ (pmd=0x7f2a6e1ce010, packets=0x7f32a20e2b60, md_is_valid=true, port_no=0)
> >     at lib/dpif-netdev.c:6230
> > #13 0x000000000076c4d4 in dp_netdev_recirculate (pmd=0x7f2a6e1ce010, packets=0x7f32a20e2b60) at lib/dpif-netdev.c:6265
> > #14 0x000000000076ceae in dp_execute_cb (aux_=0x7f32a20e1db0, packets_=0x7f32a20e2b60, a=0x7f32a20e2d78, should_steal=true) 
> > 
> > 
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> 
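
The 64-byte versus 68-byte observation in the original report quoted above can be illustrated with the standard Internet checksum (RFC 1071). The sketch below is not the code in lib/csum.c; the helper names, the payload pattern, and the 4 trailing bytes standing in for an unstripped CRC are all made up for illustration. It shows one plausible reading of the report: a checksum computed over the message plus 4 extra trailing bytes no longer verifies.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over `data`."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd lengths with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def make_echo_reply(payload: bytes, ident: int = 1, seq: int = 1) -> bytes:
    """Build an ICMP echo reply (type 0, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 0, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 0, 0, csum, ident, seq) + payload

reply = make_echo_reply(bytes(range(56)))    # 8-byte header + 56-byte payload
with_crc = reply + bytes.fromhex("deadbeef") # hypothetical unstripped 4-byte CRC

for label, msg in (("stripped", reply), ("with CRC", with_crc)):
    print(label, len(msg), "bytes, verifies:", internet_checksum(msg) == 0)
```

A receiver that validates over the full 68 bytes would reject the second message, which is consistent with the reply being marked invalid when validate_checksum is true.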


