[ovs-discuss] [OVN] OVN Load balancing algorithm

Numan Siddique numans at ovn.org
Tue Apr 21 07:37:54 UTC 2020


On Fri, Apr 17, 2020 at 12:56 PM Han Zhou <zhouhan at gmail.com> wrote:
>
>
>
> On Tue, Apr 7, 2020 at 7:03 AM Maciej Jozefczyk <mjozefcz at redhat.com> wrote:
> >
> > Hello!
> >
> > I would like to ask you to clarify how the OVN Load balancing algorithm works.
> >
> > Based on the action [1]:
> > 1) If connection is alive the same 'backend' will be chosen,
> >
> > 2) If it is a new connection the backend will be chosen based on selection_method=dp_hash [2].
> > Based on changelog the dp_hash uses '5 tuple hash' [3].
> > The hash is calculated based on values: source and destination IP,  source port, protocol and arbitrary value - 42. [4]
> > Based on that information we could name it SOURCE_IP_PORT.
> >
> > Unfortunately we recently got a bug report in the OVN Octavia provider driver project saying that load balancing in OVN
> > works differently [5]. The report shows that even when the test uses the same source IP and port, each new TCP connection
> > is distributed randomly, but based on [2] it shouldn't be?
> >
> > Is it a bug? Is something else taken into account when creating the hash? Can it be fixed in OVS/OVN?
> >
> >
> >
> > Thanks,
> > Maciej
> >
> >
> > [1] https://github.com/ovn-org/ovn/blob/branch-20.03/lib/actions.c#L1017
> > [2] https://github.com/ovn-org/ovn/blob/branch-20.03/lib/actions.c#L1059
> > [3] https://github.com/openvswitch/ovs/blob/d58b59c17c70137aebdde37d3c01c26a26b28519/NEWS#L364-L371
> > [4] https://github.com/openvswitch/ovs/blob/74286173f4d7f51f78e9db09b07a6d4d65263252/lib/flow.c#L2217
> > [5] https://bugs.launchpad.net/neutron/+bug/1871239
> >
> > --
> > Best regards,
> > Maciej Józefczyk
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
> Hi Maciej,
>
> Thanks for reporting. It is definitely strange that the same 5-tuple flow hit different backends. I haven't observed such behavior before (maybe I should try again myself to confirm). Can you make sure the group buckets didn't change during the testing? You can do so with:
> # ovs-ofctl dump-groups br-int
> and also check the group stats to see whether the counters of multiple buckets increased during the test:
> # ovs-ofctl dump-group-stats br-int [group]
>
> The 5-tuple hash function you are looking at, flow_hash_5tuple(), uses all five fields of the tuple. It adds both ports (src and dst) at once:
>     /* Add both ports at once. */
>     hash = hash_add(hash,
>                     ((const uint32_t *)flow)[offsetof(struct flow, tp_src)
>                                              / sizeof(uint32_t)]);
>
> tp_src is the starting offset and the word read is 32 bits wide, so it covers both the src and dst ports, each of which is 16 bits. (Although I am not sure whether the dp_hash method uses this function. I need to check more code.)
>
> BTW, I am not sure why Neutron gives it the name SOURCE_IP_PORT. Shouldn't it be called just 5-TUPLE, since protocol, destination IP and PORT are also considered in the hash?
>


Hi Maciej and Han,

I did some testing and can confirm what you are seeing: OVN does not
choose the same backend even when the src ip and src port are fixed.

I think there is an issue with how OVN programs the group
flows.  OVN sets the selection_method to dp_hash, but when
ovs-vswitchd receives the GROUP_MOD openflow message, I
noticed that the selection_method is not set.
From the code I see that selection_method is encoded only if
ovn-controller uses openflow version 1.5 [1].

Since selection_method is NULL, vswitchd uses the dp_hash method [2].
dp_hash means it uses the hash calculated by
the datapath. In the case of kernel datapath, from what I understand
it uses skb_get_hash().

I modified the vswitchd code to use the selection_method "hash" if
selection_method is not set. In this case the load balancer
works as expected. For a fixed src ip, src port, dst ip and dst port,
the group action always selects the same bucket. [3]
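
To make the "same 5-tuple, same bucket" property concrete, here is a
minimal standalone sketch. It is not the OVS code: the struct, the
function names and the FNV-1a hash are all stand-ins I picked for
illustration, and OVS's real bucket selection is more involved than a
plain modulo. The only point is that a hash computed purely from the
packet's 5-tuple is deterministic, so every new connection with the same
addresses and ports lands on the same bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple; not OVS's struct flow. */
struct five_tuple {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t proto;
};

/* Mix one 32-bit word into the hash, byte by byte (FNV-1a, a stand-in). */
static uint32_t fnv1a_add(uint32_t hash, uint32_t value)
{
    for (int i = 0; i < 4; i++) {
        hash ^= (value >> (8 * i)) & 0xff;
        hash *= 16777619u;
    }
    return hash;
}

/* Hash all five fields; both ports go in together as one 32-bit word. */
static uint32_t toy_hash_5tuple(const struct five_tuple *t)
{
    uint32_t hash = 2166136261u;

    hash = fnv1a_add(hash, t->src_ip);
    hash = fnv1a_add(hash, t->dst_ip);
    hash = fnv1a_add(hash, ((uint32_t) t->src_port << 16) | t->dst_port);
    hash = fnv1a_add(hash, t->proto);
    return hash;
}

/* Illustrative bucket choice: same tuple and bucket count, same index. */
static size_t toy_select_bucket(const struct five_tuple *t, size_t n_buckets)
{
    assert(n_buckets > 0);
    return toy_hash_5tuple(t) % n_buckets;
}
```

Calling toy_select_bucket() twice with the same tuple and the same
number of buckets returns the same index, which is the behaviour I saw
with the "hash" method. Changing any of the five fields changes the
input to the hash, so the connection may move to another bucket.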

I think we need to fix a few issues in OVN
  - Use openflow 1.5 so that ovn can set selection_method
  - Use "hash" method if dp_hash is not choosing the same bucket for
    5-tuple hash.
  - Maybe provide the option for the CMS to choose an algorithm i.e.
    to use dp_hash or hash.
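
As a rough sketch of the last item, the CMS-facing setting could be a
plain string that gets mapped to a selection method. Nothing below
exists in OVN today: the enum and the function name are hypothetical,
and the only values assumed are the two methods discussed in this
thread. An unset value falls back to dp_hash here, mirroring what
vswitchd does now when selection_method is NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical selection methods a CMS could request. */
enum lb_selection {
    LB_SEL_DP_HASH,   /* hash computed by the datapath */
    LB_SEL_HASH,      /* hash computed from the packet's 5-tuple */
    LB_SEL_INVALID    /* unrecognized value */
};

/* Map an option string to a method; NULL or "" means "not set". */
static enum lb_selection lb_selection_from_string(const char *value)
{
    if (value == NULL || value[0] == '\0') {
        return LB_SEL_DP_HASH;
    }
    if (strcmp(value, "dp_hash") == 0) {
        return LB_SEL_DP_HASH;
    }
    if (strcmp(value, "hash") == 0) {
        return LB_SEL_HASH;
    }
    return LB_SEL_INVALID;
}
```

Returning a distinct invalid value rather than silently falling back
lets the caller log a warning for a mistyped option.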

I'll look into how to support this.

[1] - https://github.com/openvswitch/ovs/blob/master/lib/ofp-group.c#L2120
       https://github.com/openvswitch/ovs/blob/master/lib/ofp-group.c#L2082

[2] - https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif.c#L5108
[3] - https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-xlate.c#L4553


Thanks
Numan


> Thanks,
> Han

