[ovs-discuss] RE: Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack

txfh2007 txfh2007 at aliyun.com
Mon Nov 4 01:12:32 UTC 2019


Hi Darrell:
    Sorry for my late reply. Yes, the two VMs under test are on the same compute node, and packets are received and transmitted via vhost-user ports. First, if I don't configure a meter table, the iperf TCP bandwidth from VM1 to VM2 is around 5 Gbps. Then I set a meter entry to constrain the rate, and the deviation is larger than I thought.
    I guess the recalculation of the L4 checksum during conntrack impacts the actual rate?

Thank you 
Timo 




txfh2007 <txfh2007 at aliyun.com>
Ben Pfaff <blp at ovn.org>; ovs-discuss <ovs-discuss at openvswitch.org>
Re: [ovs-discuss] Re:Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Timo


I read through this thread to get more context on what you are doing; you have a base OVS-DPDK
use case and are measuring VM-to-VM performance across 2 compute nodes. You are probably using
vhost-user-client ports? Please correct me if I am wrong.
In this case, "per direction" you have one rx virtual interface to handle in OVS; there will be a tradeoff between
checksum validation security and performance.
Just to be clear, in terms of your measurements, how did you arrive at the 5 Gbps: instrumented code or otherwise?
(I can verify that later when I have a setup.)


Darrell
 









On Thu, Oct 31, 2019 at 9:23 AM Darrell Ball <dlu998 at gmail.com> wrote:




On Thu, Oct 31, 2019 at 3:04 AM txfh2007 via discuss <ovs-discuss at openvswitch.org> wrote:

Hi Ben && Darrell:
     This patch works, but after merging it I found that the iperf throughput decreased from 5 Gbps+ to 500 Mbps.

What is the 5 Gbps number? Is that the number with all packets marked as invalid in the initial sanity checks?


Typically one wants to offload checksum checks. The code checks whether that has been done and skips
doing it in software; can you verify that you have the capability and are using it?


Skipping checksum checks reduces security, of course, but it can be added if there is a common case of
not being able to offload checksumming. 


 
 I guess maybe we should add a switch to turn off layer 4 checksum validation when doing userspace conntrack? For kernel conntrack, I found a related knob named "nf_conntrack_checksum".

Any advice?

Thank you !
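
The two ideas in this exchange (skip the software check when the NIC has already validated the checksum, and an optional switch to turn validation off entirely) can be sketched roughly as below. This is a minimal illustration and not the OVS code: the Packet class, the hw_csum_good flag and the validate parameter are hypothetical names made up for the example, and the checksum routine is the generic RFC 1071 algorithm rather than lib/csum.c.

```python
from dataclasses import dataclass


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


@dataclass
class Packet:
    l4: bytes                   # L4 header + payload, checksum field included
    hw_csum_good: bool = False  # hypothetical "NIC already validated" flag


def l4_checksum_ok(pkt: Packet, validate: bool = True) -> bool:
    """Return True if the packet may proceed to connection tracking."""
    if not validate:        # hypothetical switch: validation disabled
        return True
    if pkt.hw_csum_good:    # offload already did the work, skip software
        return True
    # A message carrying a correct checksum field sums to zero.
    return internet_checksum(pkt.l4) == 0


def icmp_echo(payload: bytes) -> bytes:
    """Build an ICMP echo request with a valid checksum (illustrative values)."""
    hdr = bytes([8, 0, 0, 0, 0x12, 0x34, 0x00, 0x01])
    csum = internet_checksum(hdr + payload)
    return hdr[:2] + csum.to_bytes(2, "big") + hdr[4:] + payload


good = icmp_echo(b"ping" * 4)
bad = good[:-1] + bytes([good[-1] ^ 0xFF])  # corrupt the last payload byte

print(l4_checksum_ok(Packet(good)))                    # software check passes
print(l4_checksum_ok(Packet(bad)))                     # software check fails
print(l4_checksum_ok(Packet(bad, hw_csum_good=True)))  # trusted to the NIC
print(l4_checksum_ok(Packet(bad), validate=False))     # switch off: accepted
```

The last call shows the security tradeoff mentioned above: with validation switched off, a corrupted packet is accepted as-is.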

------------------------------------------------------------------

:Ben Pfaff <blp at ovn.org>
:ovs-discuss <ovs-discuss at openvswitch.org>
:Re:Re:[ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack


Hi Ben && Darrell:
     Thanks, this patch works! Now the issue seems fixed.

Timo


Re: Re:[ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack


I see.

It sounds like Darrell pointed out the solution, but please let me know
if it did not help.

On Fri, Oct 11, 2019 at 08:57:58AM +0800, txfh2007 wrote:
> Hi Ben:
> 
>      I just found the GCC_UNALIGNED_ACCESSORS error during a gdb trace, and I am not sure whether this is a misaligned access error or something else. What I can confirm is that during "extract_l4" of this ICMP reply packet, when we do "check_l4_icmp", the unaligned error is emitted and "extract_l4" returns false. So this packet is marked as ct_state=invalid.
> 
> Thank you for your help.
> 
> Timo
> 
> Topic:Re: [ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack
> 
> 
> It's very surprising.
> 
> Are you using a RISC architecture that insists on aligned accesses?  On
> the other hand, if you are using x86-64 or some other architecture that
> ordinarily does not care, are you sure that this is about a misaligned
> access (it is more likely to simply be a bad pointer)?
> 
> On Thu, Oct 10, 2019 at 10:50:33PM +0800, txfh2007 via discuss wrote:
> > 
> > Hi all:
> >     I was using OVS-DPDK (version 2.10-1) and found that pinging between two VMs on different compute nodes failed. I checked my environment and found that one node's NIC cannot strip the CRC of a frame, while the other node's NIC is normal (I mean it can strip the CRC). The reason the ping fails is that the ICMP reply packet (from the node whose NIC cannot strip the CRC) is marked as invalid. So the ICMP request from Node A is 64 bytes, but the ICMP reply from Node B is 68 bytes (with 4 bytes of CRC). In "check_l4_icmp", when we call the csum function (in lib/csum.c), GCC emits an unaligned accessor error. The backtrace is as below:
> > 
> >     I just want to confirm whether this behavior is expected?
> > 
> > Many thanks
> > 
> > Timo
> > 
> > 
> > get_unaligned_be16 (p=0x7f2ad0b1ed5c) at lib/unaligned.h:89
> > 89 GCC_UNALIGNED_ACCESSORS(ovs_be16, be16);
> > (gdb) bt
> > #0  get_unaligned_be16 (p=0x7f2ad0b1ed5c) at lib/unaligned.h:89
> > #1  0x000000000075a584 in csum_continue (partial=0, data_=0x7f2ad0b1ed5c, n=68) at lib/csum.c:46
> > #2  0x000000000075a552 in csum (data=0x7f2ad0b1ed5c, n=68) at lib/csum.c:33
> > #3  0x00000000008ddf18 in check_l4_icmp (data=0x7f2ad0b1ed5c, size=68, validate_checksum=true) at lib/conntrack.c:1638
> > #4  0x00000000008de650 in extract_l4 (key=0x7f32a20df120, data=0x7f2ad0b1ed5c, size=68, related=0x7f32a20df15d, l3=0x7f2ad0b1ed48, 
> >     validate_checksum=true) at lib/conntrack.c:1888
> > #5  0x00000000008de90d in conn_key_extract (ct=0x7f32b42a2d98, pkt=0x7f2ad0b1e9c0, dl_type=8, ctx=0x7f32a20df120, zone=4)
> >     at lib/conntrack.c:1973
> > #6  0x00000000008dd49c in conntrack_execute (ct=0x7f32b42a2d98, pkt_batch=0x7f32a20e08b0, dl_type=8, force=false, commit=false, 
> >     zone=4, setmark=0x0, setlabel=0x0, tp_src=0, tp_dst=0, helper=0x0, nat_action_info=0x0, now=5395897849) at lib/conntrack.c:1318
> > #7  0x000000000076d651 in dp_execute_cb (aux_=0x7f32a20dfb00, packets_=0x7f32a20e08b0, a=0x7f32a20e0ac8, should_steal=false)
> >     at lib/dpif-netdev.c:6711
> > #8  0x00000000007b2d49 in odp_execute_actions (dp=0x7f32a20dfb00, batch=0x7f32a20e08b0, steal=true, actions=0x7f32a20e0ac8, 
> >     actions_len=20, dp_execute_action=0x76ca60 <dp_execute_cb>) at lib/odp-execute.c:726
> > #9  0x000000000076d71b in dp_netdev_execute_actions (pmd=0x7f2a6e1ce010, packets=0x7f32a20e08b0, should_steal=true, 
> >     flow=0x7f32a20dfb60, actions=0x7f32a20e0ac8, actions_len=20) at lib/dpif-netdev.c:6754
> > #10 0x000000000076b900 in handle_packet_upcall (pmd=0x7f2a6e1ce010, packet=0x7f2ad0b1e9c0, key=0x7f32a20e1100, 
> >     actions=0x7f32a20e0a40, put_actions=0x7f32a20e0a80) at lib/dpif-netdev.c:6056
> > #11 0x000000000076bdf0 in fast_path_processing (pmd=0x7f2a6e1ce010, packets_=0x7f32a20e2b60, keys=0x7f32a20e10c0, 
> >     batches=0x7f32a20e0f90, n_batches=0x7f32a20e13c0, in_port=15) at lib/dpif-netdev.c:6153
> > #12 0x000000000076c3df in dp_netdev_input__ (pmd=0x7f2a6e1ce010, packets=0x7f32a20e2b60, md_is_valid=true, port_no=0)
> >     at lib/dpif-netdev.c:6230
> > #13 0x000000000076c4d4 in dp_netdev_recirculate (pmd=0x7f2a6e1ce010, packets=0x7f32a20e2b60) at lib/dpif-netdev.c:6265
> > #14 0x000000000076ceae in dp_execute_cb (aux_=0x7f32a20e1db0, packets_=0x7f32a20e2b60, a=0x7f32a20e2d78, should_steal=true) 
> > 
> > 
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> 
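
To illustrate the size=68 seen in frame #3 of the backtrace: if the 4 trailing CRC bytes are included in the span handed to the ICMP checksum, the ones'-complement sum no longer comes out to zero and the packet looks corrupt, even though the 64-byte ICMP message itself is fine. The sketch below reproduces that with a hand-built packet. It is not the OVS code and not the patch discussed in this thread (the patch is not shown here); bounding the L4 span by the IPv4 total-length field is an assumption used only for illustration, the helper names are hypothetical, and the trailing bytes are a placeholder rather than a real Ethernet FCS.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_icmp_reply(payload_len: int = 56) -> bytes:
    """IPv4 header (20 bytes) + ICMP echo reply (8 + payload_len bytes)."""
    payload = bytes(i & 0xFF for i in range(payload_len))
    hdr = struct.pack("!BBHHH", 0, 0, 0, 0x1234, 1)  # type 0 = echo reply
    csum = internet_checksum(hdr + payload)
    icmp = struct.pack("!BBHHH", 0, 0, csum, 0x1234, 1) + payload
    # The IP header checksum is left at 0: it is not examined in this sketch.
    ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(icmp), 0, 0,
                         64, 1, 0, bytes([10, 0, 0, 2]), bytes([10, 0, 0, 1]))
    return ip_hdr + icmp


def l4_slice(l3: bytes, trim_to_ip_len: bool) -> bytes:
    """Return the L4 bytes, optionally bounded by the IPv4 total length."""
    ihl = (l3[0] & 0x0F) * 4
    if trim_to_ip_len:
        total_len = struct.unpack("!H", l3[2:4])[0]
        return l3[ihl:total_len]
    return l3[ihl:]  # everything after the IP header, trailer included


pkt_with_crc = build_ipv4_icmp_reply() + bytes.fromhex("deadbeef")

untrimmed = l4_slice(pkt_with_crc, trim_to_ip_len=False)
trimmed = l4_slice(pkt_with_crc, trim_to_ip_len=True)

print(len(untrimmed), internet_checksum(untrimmed) == 0)  # 68 False
print(len(trimmed), internet_checksum(trimmed) == 0)      # 64 True
```

The get_unaligned_be16 frame at lib/unaligned.h:89 appears to be just the accessor that csum_continue uses to read 16-bit words, so gdb stopping there does not by itself indicate a misaligned-access fault.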



_______________________________________________
discuss mailing list
discuss at openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss