[ovs-discuss] Megaflow Inspection

Matan Rosenberg matan129 at gmail.com
Thu Jan 9 07:52:19 UTC 2020


Thanks for the quick responses.

Levi - you've provided a lot of info, and I'm still looking into some of the
points. At this point, this is what I know:

1) No, Scapy is used only to create the packets. I can make a very diverse
pcap and then send it with tcpreplay; that should be fast enough - I'll look
into it (see the sketch after this list).
2) In general, I don't think that the veth pair performance is the
bottleneck here.
3) In production, according to dpctl/dump-flows I see ~7k megaflows and
about ~3k masks (!). The masks hit/pkt is around 1k, which is *huge*.
4) About the TCP offloading: I don't think this is it, but I'll check.
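
As a rough sketch of what I mean in 1) and 3) - the pcap file name and loop
count are made up, and a-red is the interface from the repro quoted below -
I'd pre-build a diverse pcap with the Scapy script and then replay it while
recounting the installed megaflows:

# tcpreplay -i a-red --topspeed --loop=100 diverse.pcap
# ovs-appctl dpctl/dump-flows -m | wc -l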

Ben - I've taken a second look at the megaflows in production, and mapped
the fields that are matched beyond the default dl_type and IP fragmentation.
I have some bridges with flood rules, some with normal (MAC learning), and
some of them also contain vxlan ports (remote_ip and vni are defined per
port, not flow-based).

The fields that I see matched are:

With both actions=flood and actions=normal (or even just output to manually
specified ports):

   - VLAN IDs
      - We use about 3K VLANs out of the available 4K VLAN range, so it's
        quite a lot.
      - Most of the traffic is VLAN-tagged, so this applies to most
        megaflows.
      - Just to clarify, I don't actually need OVS to care about the VLANs.
   - If a packet is VLAN-tagged, its eth_type and fragmentation are also
     matched via encap(eth_type(...)).
      - This makes a Cartesian product: the handful of eth_types we have
        times the number of active VLANs (see the illustrative entry after
        this list).
   - Tunnel-related fields, but that's expected for the vxlan ports.
   - I also see some other IP header fields being matched, like tos and
     tclass.
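
As an illustration of the Cartesian product above (the port number, VLAN id
and counters are made up, but this is roughly the shape of the entries I see
on the flood/output bridges, one per (vlan, eth_type) combination):

recirc_id(0),in_port(2),eth(),eth_type(0x8100),vlan(vid=100,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:..., bytes:..., used:..., actions:3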

Only with actions=normal (MAC learning):

   - I obviously also see dl_src/dst addresses, which is sensible.
   - Additionally, I see OVS matching against specific ARP fields (for
     example, src/dst IP) - illustrated below.
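
Again purely as an illustration (the addresses are made up), a typical ARP
entry on the MAC-learning bridges looks roughly like:

recirc_id(0),in_port(2),eth(src=11:22:33:44:55:66,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.0.0.1,tip=10.0.0.2,op=1), packets:..., bytes:..., used:..., actions:...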



On Wed, 8 Jan 2020 at 02:25, Levente Csikor <levente.csikor at gmail.com>
wrote:

> Hi Matan,
>
> I guess you were referring to my talk at OVS Fall 2018 :)
> As Ben has pointed out in his last email, even if you are matching only
> on the in_port, because of your (not-manually-inserted) default drop
> rule(s) you will still have a couple of megaflow entries to handle
> different packets (as you could see, usually they are an IPv6-related
> discovery message and an ARP).
>
> Before going into the megaflow cache code, could you confirm the
> following things about your setup?
>
> 1) By using Scapy to generate the packets, are you actually able to
> achieve the intended packet rate at the generator?
>
> 2) If YES: without OVS, do you see the same rate at the other end of
> your veth pair as at the generator?
> ----
> These two things can easily be the bottleneck, so we have to verify that
> they are not the culprits in your case.
>
>
> 3) After checking the megaflow entries with the commands Ben has shared
> (ovs-dpctl show/dump-flows), how many entries/masks did you see?
> (Note, I did not thoroughly go through your flow rules and packets.)
> If the number is just a handful, then megaflow won't be your issue!
> If the number is more than ~100, it still would not be an issue;
> however, if it is, it can be caused by two things:
>  - you are using an OVS (version) that is delivered by your
> distribution -> we realized (in 2018, with Ben et al.) that the default
> kernel module coming with the distribution has the microflow cache
> switched OFF (the main networking guys responsible for the kernel
> modules are not huge fans of caching) - so either enable it (somehow,
> if possible) or simply install OVS from source.
>  - OR there are still some issues with your veths! We have experienced
> such bad performance with a relatively low number of masks and little
> traffic when TCP offload was switched off on the physical NIC, or when
> we were using UDP packets (as there is no offloading function for UDP).
> Have you tried playing with these settings for your veths
> (ethtool -K <iface>)?
> TL;DR: recently I have seen that switching off TCP offloading for a
> veth (which I don't think should have an effect) produced better
> throughput :/
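>
> For reference, a rough sketch of the checks I have in mind (the interface
> name is just an example; exact feature names vary by driver):
>
> # ethtool -k a-blue | grep -E 'segmentation|generic-receive'
> # ethtool -K a-blue tso off gso off gro off
> # modinfo openvswitch | grep filename
>
> The modinfo line shows which openvswitch.ko modprobe picks up: the
> in-tree module lives under kernel/net/openvswitch/, while one built from
> the OVS source tree is usually installed under extra/.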
>
> After you check these things, we will be much smarter ;)
>
> Cheers,
> Levi
>
> On Wed, 2020-01-08 at 00:52 +0200, Matan Rosenberg wrote:
> > Running ofproto/trace unfortunately does not explain why OVS chose to
> > look at these fields.
> > Using the same setup, for example:
> >
> > #  ovs-appctl ofproto/trace br0 in_port=a-blue,dl_src=11:22:33:44:55:66,dl_dst=aa:bb:cc:dd:ee:ff,ipv4,nw_src=1.2.3.4
> > Flow: ip,in_port=4,vlan_tci=0x0000,dl_src=11:22:33:44:55:66,dl_dst=aa:bb:cc:dd:ee:ff,nw_src=1.2.3.4,nw_dst=0.0.0.0,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
> >
> > bridge("br0")
> > -------------
> >  0. in_port=4, priority 32768
> >     output:5
> >
> > Final flow: unchanged
> > Megaflow: recirc_id=0,eth,ip,in_port=4,nw_frag=no
> > Datapath actions: 3
> >
> > It seems that the OpenFlow rule (not to be confused with the megaflow
> > entry) was correctly identified, and no other actions take place.
> > Since the relevant OpenFlow rule has nothing to do with the IP layer,
> > I don't understand why the megaflow is aware of it.
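> >
> > For completeness, the same kind of trace can be run for a VLAN-tagged
> > packet (the command below is only a sketch, with a made-up VLAN id and
> > the same addresses as above) to see which extra fields end up in the
> > "Megaflow:" line:
> >
> > #  ovs-appctl ofproto/trace br0 in_port=a-blue,dl_vlan=100,dl_src=11:22:33:44:55:66,dl_dst=aa:bb:cc:dd:ee:ff,ipv4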
> >
> > I'll try to look at the classifier/megaflow code (?) tomorrow, but
> > I'd like to know if there's a high-level way to avoid such trouble.
> >
> > Thanks
> >
> > On Wed, 8 Jan 2020 at 00:39, Ben Pfaff <blp at ovn.org> wrote:
> > > On Tue, Jan 07, 2020 at 10:44:57PM +0200, Matan Rosenberg wrote:
> > > > Actually, I do think I have a megaflow (or other caching) issue.
> > > >
> > > > We use OVS for L2 packet forwarding; that is, given a packet, we
> > > > don't need OVS to look at other protocols beyond the Ethernet layer.
> > > > Additionally, we use VXLAN to establish L2 overlay networks across
> > > > multiple OVS servers.
> > > >
> > > > Just to make things clear, these are some typical flow rules that
> > > > you might see on a bridge:
> > > >
> > > > - in_port=1,actions=2,3
> > > > - in_port=42,actions=FLOOD
> > > > - actions=NORMAL
> > > >
> > > > No IP matching, conntrack, etc.
> > > >
> > > > We're experiencing severe performance issues with OVS - in this use
> > > > case, it cannot handle more than a couple thousand packets/s.
> > > > After some exploring, I've noticed that the installed megaflows try
> > > > to match on fields that are not present in the rules, apparently for
> > > > no reason.
> > > > Here's a complete example to reproduce, using OVS 2.12.0:
> > > >
> > > > # ip link add dev a-blue type veth peer name a-red
> > > > # ip link add dev b-blue type veth peer name b-red
> > > >
> > > > # ovs-vsctl add-br br0
> > > > # ovs-vsctl add-port br0 a-blue
> > > > # ovs-vsctl add-port br0 b-blue
> > > >
> > > > # ovs-ofctl del-flows br0
> > > > # ovs-ofctl add-flow br0 in_port=a-blue,actions=b-blue
> > > > # ovs-ofctl add-flow br0 in_port=b-blue,actions=a-blue
> > > >
> > > > After injecting ~100 random packets (IP, IPv6, TCP, UDP, ARP with
> > > > random addresses) to one of the red interfaces (with
> > > > https://pastebin.com/Y6dPFCKJ), these are the installed flows:
> > > > # ovs-dpctl dump-flows
> > > > recirc_id(0),in_port(2),eth(),eth_type(0x0806), packets:54, bytes:2268, used:1.337s, actions:3
> > > > recirc_id(0),in_port(2),eth(),eth_type(0x86dd),ipv6(frag=no), packets:28, bytes:1684, used:1.430s, flags:S, actions:3
> > > > recirc_id(0),in_port(2),eth(),eth_type(0x0800),ipv4(frag=no), packets:15, bytes:610, used:1.270s, flags:S, actions:3
> > > >
> > > > As you can see, for some reason, OVS has split the single relevant
> > > > OpenFlow rule into three separate megaflows, one for each eth_type
> > > > (and even other fields - IP fragmentation?).
> > > > In my production scenario, the packets are even more diversified,
> > > > and we see OVS installing flows which match on even more fields,
> > > > including specific Ethernet and IP addresses.
> > > >
> > > > This leads to a large number of flows with an extremely low hit
> > > > rate - each flow handles no more than ~100 packets (!) during its
> > > > entire lifetime.
> > > >
> > > > We suspect that this causes the performance penalty; either
> > > > 1) the EMC/megaflow table is full, so vswitchd upcalls are all over
> > > > the place, or
> > > > 2) the huge number of inefficient megaflows leads to terrible lookup
> > > > times in the in-kernel megaflow table itself (due to the large
> > > > number of masks, etc.).
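> > > >
> > > > As a sketch of how I'd try to tell these two apart (the commands are
> > > > standard; the interpretation is mine):
> > > >
> > > > # ovs-dpctl show
> > > > # ovs-appctl upcall/show
> > > >
> > > > The first reports a "masks: hit:... total:... hit/pkt:..." line, and
> > > > a high hit/pkt would point at 2); the second reports the
> > > > current/avg/max number of datapath flows and the flow limit, and a
> > > > flow count pinned at the limit would point at 1).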
> > > >
> > > > In short: how can I just make OVS oblivious to these fields? Why
> > > > does it try to match on irrelevant fields?
> > >
> > > I can see how this would be distressing.
> > >
> > > You can use ofproto/trace with a few examples to help figure out why
> > > OVS is matching on more fields than you expect.
> >
>
>