[ovs-discuss] [PATCH-RFC 1/2] Improve ARP latency
Jesse Gross
jesse at nicira.com
Thu Oct 1 00:06:18 UTC 2015
On Tue, Sep 29, 2015 at 10:50 PM, <dwilder at us.ibm.com> wrote:
> Hi-
>
> I have been conducting scaling tests with OVS and docker. My tests revealed
> that the latency of ARP packets can become very large, resulting in many ARP
> retransmissions and timeouts. I found the source of the poor latency to be
> the handling of ARP packets in ovs_vport_find_upcall_portid(). Each
> packet is hashed in ovs_vport_find_upcall_portid() by calling
> skb_get_hash(). This hash is used to select the netlink socket on which to
> send the packet to userspace. However, skb_get_hash() does not support ARP
> packets and returns 0 (invalid hash) for every ARP. As a result, a
> single ovs-vswitchd handler thread processes every ARP packet, severely
> increasing the average latency of ARPs. I am proposing a change to
> ovs_vport_find_upcall_portid() that spreads the ARP packets evenly between
> all the handler threads (patch to follow). Please let me know if you have
> suggestions/comments.
This is definitely an interesting analysis, but I'm a little surprised
at the basic scenario. First, it seems to me that the L2
domain is too large if there are this many ARPs. The speed also
generally seems slower than I would expect, but in any case I don't
disagree that it is better to spread the load among all the cores.

On the patch itself, can't we just make skb_get_hash() able to
decode ARP? That seems cleaner and more generic.
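
To make the effect concrete, here is a rough userspace sketch (not
kernel code, and not the proposed patch). It assumes the handler is
chosen as hash modulo the number of sockets. pick_handler(),
arp_hash(), count_used_handlers() and N_HANDLERS are hypothetical names
invented for illustration, and the mixing constants are arbitrary. With
a hash of 0 for every ARP, all packets land on one handler. Hashing the
sender and target IPv4 addresses spreads them out.

```c
#include <stdint.h>

#define N_HANDLERS 4u  /* assumed number of handler sockets/threads */

/* Hypothetical stand-in for choosing an upcall socket from a flow hash. */
static uint32_t pick_handler(uint32_t hash)
{
    return hash % N_HANDLERS;
}

/* Hypothetical ARP hash: mix sender and target IPv4 addresses.
 * The constants are arbitrary odd multipliers, not taken from the kernel. */
static uint32_t arp_hash(uint32_t sip, uint32_t tip)
{
    uint32_t h = sip ^ (tip * 0x9e3779b1u);

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h ? h : 1;  /* keep 0 reserved for "no valid hash" */
}

/* Count how many distinct handlers receive at least one of n_packets
 * ARPs whose sender addresses are first_sip, first_sip + 1, ...
 * With use_arp_hash == 0 every packet gets hash 0, as described above. */
static unsigned count_used_handlers(uint32_t first_sip, uint32_t tip,
                                    unsigned n_packets, int use_arp_hash)
{
    uint32_t seen = 0;  /* bitmask of handlers hit so far */
    unsigned used = 0;

    for (unsigned i = 0; i < n_packets; i++) {
        uint32_t hash = use_arp_hash ? arp_hash(first_sip + i, tip) : 0;

        seen |= 1u << pick_handler(hash);
    }
    for (unsigned k = 0; k < N_HANDLERS; k++)
        used += (seen >> k) & 1u;
    return used;
}
```

Calling count_used_handlers() with use_arp_hash set to 0 always reports
a single handler, whatever the addresses are. With it set to 1, packets
from different senders can reach different handlers.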