[ovs-dev] group dp_hash method works incorrectly when using snat

ychen ychen103103 at 163.com
Mon Sep 30 04:09:18 UTC 2019


Hi,
   We found that when a TCP session uses SNAT with a dp_hash group as the output action,
   the SYN packet and the remaining packets of the same session behave differently: the SYN packet is output to one group bucket, and the other packets are output to another.


   Here are the OVS flows:
   table=0,in_port=DOWN_PORT,tun_id=vni,ip,actions=ct(nat,zone=ZID,table=1)
   table=1,ip,ct_state=+new,actions=ct(commit,nat,src=SNAT_PUB_IP,zone=ZID,table=2)
   table=1,ip,ct_state=-new,actions=goto_table:2
   table=2,ip,actions=group:1
   group=1,type=select,selection_method=dp_hash,bucket=actions=output:UP_PORT1,bucket=actions=output:UP_PORT2


  Here are the datapath flows:
  tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(src=192.168.100.16/255.255.255.240,frag=no), packets:5, bytes:455, used:2.978s, flags:FP., actions:meter(248),meter(249),ct(zone=1298,nat),recirc(0x176)
flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),ct_state(+new-inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:meter(250),ct(commit,zone=1298,nat(src=172.16.1.152:1024-65535)),recirc(0x177)
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),ct_state(-new-inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:4, bytes:389, used:3.002s, flags:FP., actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),hash(hash_l4(0)),recirc(0x178)
flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x178),dp_hash(0x8a6c9809/0xf),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:4, bytes:389, used:3.025s, flags:FP., actions:2
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x178),dp_hash(0xbab97b2e/0xf),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:3
flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x177),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:0, bytes:0, used:never, actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),hash(hash_l4(0)),recirc(0x178)


From the datapath flows above, we conclude:
 1. the first SYN packet matches ct_state=+new and recirculates 3 times
 2. the other packets match ct_state=-new and recirculate only 2 times
 3. packets matching +new and packets matching -new get different dp_hash values, and hence may be output to different ports
    (sending packets of the same TCP session to different ports increases the risk of reordering)


We looked at the OVS code and found the following:
static inline uint32_t
dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
                                const struct miniflow *mf)
{
    uint32_t hash, recirc_depth;


    if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
        hash = dp_packet_get_rss_hash(packet);
    } else {
        hash = miniflow_hash_5tuple(mf, 0);
        dp_packet_set_rss_hash(packet, hash);
    }


    /* The RSS hash must account for the recirculation depth to avoid
     * collisions in the exact match cache */
    recirc_depth = *recirc_depth_get_unsafe();
    if (OVS_UNLIKELY(recirc_depth)) {
        /* ===> This changes the RSS hash, and this function is called
         * before the EMC lookup. */
        hash = hash_finish(hash, recirc_depth);
        dp_packet_set_rss_hash(packet, hash);
    }
    return hash;
}
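
To make the effect concrete, here is a minimal sketch of how the stored hash drifts with the number of recirculated passes. It is a model, not the OVS code: sketch_hash_finish() is a stand-in built on a murmur3-style finalizer (the real hash_finish() may be implemented differently depending on the build), and sketch_rss_after_passes() and sketch_masked_hash() are hypothetical helpers. It assumes the recirculation depth is 1 in the first recirculated pass, 2 in the second, and so on, and that each pass folds its depth into the hash already stored in the packet.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for hash_finish(): murmur3-style finalizer over (hash ^ final).
 * Assumption: the real OVS implementation may differ by build. */
static uint32_t
sketch_hash_finish(uint32_t hash, uint32_t final)
{
    hash ^= final;
    hash ^= hash >> 16;
    hash *= 0x85ebca6bu;
    hash ^= hash >> 13;
    hash *= 0xc2b2ae35u;
    hash ^= hash >> 16;
    return hash;
}

/* Hypothetical helper: model the hash stored in the packet after 'passes'
 * recirculated passes, folding in depth 1, 2, ..., passes in turn. */
static uint32_t
sketch_rss_after_passes(uint32_t rss_hash, uint32_t passes)
{
    for (uint32_t depth = 1; depth <= passes; depth++) {
        rss_hash = sketch_hash_finish(rss_hash, depth);
    }
    return rss_hash;
}

/* Hypothetical helper: the bits a masked match such as dp_hash(.../0xf)
 * looks at.  Only contiguous low-bit masks are handled here. */
static uint32_t
sketch_masked_hash(uint32_t hash, uint32_t mask)
{
    assert((mask & (mask + 1)) == 0);
    return hash & mask;
}
```

Under these assumptions, a -new packet reaches the hash action after one recirculated pass and a +new packet after two, so sketch_rss_after_passes(h, 1) and sketch_rss_after_passes(h, 2) model the two inputs to the hash action. They generally differ, and so can their masked bits, which is consistent with the two dp_hash entries for recirc_id(0x178) in the dump above.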


So is there any way to fix this problem?
We tried changing the OVS flow to:
 table=1,ip,ct_state=-new,actions=ct(commit,table=2)
and the problem disappears, but now packets matching ct_state=-new also need to recirculate 3 times, which may decrease performance.

