[ovs-discuss] Packet drops with high rate of Packet_In

Thu Nov 21 16:56:35 UTC 2013

Please don't drop the mailing list.

You have begun to narrow down where the drops occur, but it's still not
clear exactly where.  I suggest following the troubleshooting procedure
in the FAQ.

Q: I have a sophisticated network setup involving Open vSwitch, VMs or
   multiple hosts, and other components.  The behavior isn't what I
   expect.  Help!

A: To debug network behavior problems, trace the path of a packet,
   hop-by-hop, from its origin in one host to a remote host.  If
   that's correct, then trace the path of the response packet back to
   the origin.

   Usually a simple ICMP echo request and reply ("ping") packet is
   good enough.  Start by initiating an ongoing "ping" from the origin
   host to a remote host.  If you are tracking down a connectivity
   problem, the "ping" will not display any successful output, but
   packets are still being sent.  (In this case the packets being sent
   are likely ARP rather than ICMP.)

   Tools available for tracing include the following:

       - "tcpdump" and "wireshark" for observing hops across network
         devices, such as Open vSwitch internal devices and physical
         wires.

       - "ovs-appctl dpif/dump-flows <br>" in Open vSwitch 1.10 and
         later or "ovs-dpctl dump-flows <br>" in earlier versions.
         These tools allow one to observe the actions being taken on
         packets in ongoing flows.

         See ovs-vswitchd(8) for "ovs-appctl dpif/dump-flows"
         documentation, ovs-dpctl(8) for "ovs-dpctl dump-flows"
         documentation, and "Why are there so many different ways to
         dump flows?" above for some background.

       - "ovs-appctl ofproto/trace" to observe the logic behind how
         ovs-vswitchd treats packets.  See ovs-vswitchd(8) for
         documentation.  You can find out more details about a given
         flow that "ovs-dpctl dump-flows" displays, by cutting and
         pasting a flow from the output into an "ovs-appctl
         ofproto/trace" command.

       - SPAN, RSPAN, and ERSPAN features of physical switches, to
         observe what goes on at these physical hops.
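
   The cut-and-paste step for "ovs-appctl ofproto/trace" can be
   sketched as a small helper.  This is only an illustration: the
   function name is hypothetical, the sample line is a shortened,
   made-up example in the usual "key, packets:N, bytes:N, used:T,
   actions:..." layout, and the exact "ofproto/trace" arguments differ
   between Open vSwitch versions, so check ovs-vswitchd(8) before
   relying on the generated command.

```python
import shlex


def trace_command(dump_flows_line: str) -> str:
    """Build an "ovs-appctl ofproto/trace" command from one dump-flows line."""
    # Assumption: the flow key ends where the ", packets:" statistics begin.
    flow_key = dump_flows_line.split(", packets:", 1)[0].strip()
    # shlex.quote keeps the shell from interpreting the parentheses.
    return "ovs-appctl ofproto/trace " + shlex.quote(flow_key)


# Illustrative line only; it was not captured from a real switch.
sample = ("in_port(1),eth(src=00:00:00:00:00:01,dst=ff:ff:ff:ff:ff:ff),"
          "eth_type(0x0806), packets:3, bytes:126, used:0.2s, actions:2")
print(trace_command(sample))
```

   Some versions may also expect a datapath or bridge name before the
   flow; the helper leaves that out on purpose.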

   Starting at the origin of a given packet, observe the packet at
   each hop in turn.  For example, in one plausible scenario, you
   might:

       1. "tcpdump" the "eth" interface through which an ARP egresses
          a VM, from inside the VM.

       2. "tcpdump" the "vif" or "tap" interface through which the ARP
          ingresses the host machine.

       3. Use "ovs-dpctl dump-flows" to spot the ARP flow and observe
          the host interface through which the ARP egresses the
          physical machine.  You may need to use "ovs-dpctl show" to
          interpret the port numbers.  If the output seems surprising,
          you can use "ovs-appctl ofproto/trace" to observe details of
          how ovs-vswitchd determined the actions in the "ovs-dpctl
          dump-flows" output.

       4. "tcpdump" the "eth" interface through which the ARP egresses
          the physical machine.

       5. "tcpdump" the "eth" interface through which the ARP
          ingresses the physical machine, at the remote host that
          receives the ARP.

       6. Use "ovs-dpctl dump-flows" to spot the ARP flow on the
          remote host that receives the ARP and observe the VM "vif"
          or "tap" interface to which the flow is directed.  Again,
          "ovs-dpctl show" and "ovs-appctl ofproto/trace" might help.

       7. "tcpdump" the "vif" or "tap" interface to which the ARP is
          directed.

       8. "tcpdump" the "eth" interface through which the ARP
          ingresses a VM, from inside the VM.

   It is likely that during one of these steps you will figure out the
   problem.  If not, then follow the ARP reply back to the origin, in
   reverse.
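
To make the hop-by-hop idea concrete, here is a minimal sketch that
takes packet counts gathered by hand at each hop and reports the first
hop where the count falls.  The function name and the hop labels are
hypothetical.  The two OVS port counts (500 in, 356 out) are the ones
reported earlier in this thread; the end-host counts are placeholders
you would replace with your own tcpdump observations.

```python
from typing import Optional, Sequence, Tuple


def first_lossy_hop(
    counts: Sequence[Tuple[str, int]],
) -> Optional[Tuple[str, str, int]]:
    """Return (previous hop, hop, packets lost) for the first drop, or None."""
    # Compare each hop with the one before it, in path order.
    for (prev_name, prev_count), (name, count) in zip(counts, counts[1:]):
        if count < prev_count:
            return prev_name, name, prev_count - count
    return None


# Counts listed in the order the packet travels.
observed = [
    ("sender eth (tcpdump, placeholder)", 500),
    ("OVS ingress port (ovs-ofctl dump-ports)", 500),
    ("OVS egress port (ovs-ofctl dump-ports)", 356),
    ("receiver eth (tcpdump, placeholder)", 356),
]
print(first_lossy_hop(observed))
```

With these numbers the loss shows up between the ingress and egress
ports of the switch, which only says that the drop happens somewhere
inside that hop, not why.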

On Thu, Nov 21, 2013 at 04:55:13PM +0100, Anton Matsiuk wrote:
> I examined the log files up to debug level, namely:
> ovs-vswitchd.log
> ovs-dpctl.log
> ovs-ofctl.log
> but none of them shows any messages related to packet drops. All the
> statistics show that the correct number of flows was installed and only
> part of the packets was processed.
> That's why I am asking: are there any other possibilities (beyond log
> files) to track packet drops in input buffers, and possibly to fix them?
> Or at least, in which direction should I search for a solution?
> 
> 
> On 20 November 2013 18:13, Ben Pfaff <blp at nicira.com> wrote:
> 
> > On Wed, Nov 20, 2013 at 12:35:25PM +0100, Anton Matsiuk wrote:
> > > I test Open vSwitch in the following scheme: I use 2 hosts directly
> > > connected to OVS and an external OpenFlow Controller. Host1 generates UDP
> > > datagrams with sequential ports towards Host2; Host2 listens for these
> > > UDP datagrams. In response to every UDP datagram, OVS generates a
> > > Packet_In and the Controller sends a Flow_Mod back with L4 granularity
> > > (so for every pair of UDP port numbers it installs a separate flow). I
> > > send a bunch of UDP datagrams from Host1 and count how many of them
> > > arrive at Host2. I tried both with a detached controller and with one
> > > running on the same machine as OVS. I tested it on different machines
> > > (in Mininet and with separate real hosts). I use the out-of-band option
> > > for the controller and disable-in-band=true.
> > >
> > >
> > > Starting at some number of packets (around 300 or more), packet drops
> > > are observed. For instance, if I generate 500 UDP packets in 120 ms,
> > > only around 350 of them arrive at Host2. (Subsequent packets of the same
> > > flow can arrive at Host2, but the first packets of flows always
> > > experience drops.)
> > >
> > >
> > > ovs-ofctl dump-aggregate shows that all the flows are installed but only
> > > part of the packets are processed through them:
> > >
> > > NXST_AGGREGATE reply (xid=0x4): packet_count=356 byte_count=42364 flow_count=500
> > >
> > >
> > > ovs-ofctl dump-ports also shows that 500 packets arrive on the ingress
> > > interface and only 356 leave the egress interface.
> > >
> > >
> > > ovs-dpctl show -s shows the same: 500 flows installed and 356 packets
> > > processed.
> > >
> > >
> > > Also I tried to replace Flow_Mods with Packet_Out messages for every
> > > packet, but I experienced the same drops. It seems like OVS starts
> > > dropping packets after some threshold (or buffer overload).
> > >
> > >
> > > Is there any possibility to debug these drops and maybe to manipulate
> > > ingress buffer sizes (or queue priorities) in order to avoid such drops?
> >
> > Yes, I think you will have to do the initial debugging yourself, to find
> > out where the drop is occurring.  When you report that back to us, we
> > can help you figure out how to fix it.
> >
> 
> 
> 
> -- 
> Best regards,
> Anton Matsiuk