[ovs-discuss] Packet drops with high rate of Packet_In
Ben Pfaff
blp at nicira.com
Thu Nov 21 16:56:35 UTC 2013
Please don't drop the mailing list.
You have begun to narrow down where the drops occur, but it's still not
clear exactly where. I suggest following the troubleshooting procedure
in the FAQ.
Q: I have a sophisticated network setup involving Open vSwitch, VMs or
multiple hosts, and other components. The behavior isn't what I
expect. Help!
A: To debug network behavior problems, trace the path of a packet,
hop-by-hop, from its origin in one host to a remote host. If
that's correct, then trace the path of the response packet back to
the origin.
Usually a simple ICMP echo request and reply ("ping") packet is
good enough. Start by initiating an ongoing "ping" from the origin
host to a remote host. If you are tracking down a connectivity
problem, the "ping" will not display any successful output, but
packets are still being sent. (In this case the packets being sent
are likely ARP rather than ICMP.)
Tools available for tracing include the following:
- "tcpdump" and "wireshark" for observing hops across network
devices, such as Open vSwitch internal devices and physical
wires.
- "ovs-appctl dpif/dump-flows <br>" in Open vSwitch 1.10 and
later or "ovs-dpctl dump-flows <br>" in earlier versions.
These tools allow one to observe the actions being taken on
packets in ongoing flows.
See ovs-vswitchd(8) for "ovs-appctl dpif/dump-flows"
documentation, ovs-dpctl(8) for "ovs-dpctl dump-flows"
documentation, and "Why are there so many different ways to
dump flows?" above for some background.
- "ovs-appctl ofproto/trace" to observe the logic behind how
ovs-vswitchd treats packets. See ovs-vswitchd(8) for
documentation. You can find out more details about a given flow
that "ovs-dpctl dump-flows" displays, by cutting and pasting
a flow from the output into an "ovs-appctl ofproto/trace"
command.
- SPAN, RSPAN, and ERSPAN features of physical switches, to
observe what goes on at these physical hops.
Starting at the origin of a given packet, observe the packet at
each hop in turn. For example, in one plausible scenario, you
might:
1. "tcpdump" the "eth" interface through which an ARP egresses
a VM, from inside the VM.
2. "tcpdump" the "vif" or "tap" interface through which the ARP
ingresses the host machine.
3. Use "ovs-dpctl dump-flows" to spot the ARP flow and observe
the host interface through which the ARP egresses the
physical machine. You may need to use "ovs-dpctl show" to
interpret the port numbers. If the output seems surprising,
you can use "ovs-appctl ofproto/trace" to observe details of
how ovs-vswitchd determined the actions in the "ovs-dpctl
dump-flows" output.
4. "tcpdump" the "eth" interface through which the ARP egresses
the physical machine.
5. "tcpdump" the "eth" interface through which the ARP
ingresses the physical machine, at the remote host that
receives the ARP.
6. Use "ovs-dpctl dump-flows" to spot the ARP flow on the
remote host that receives the ARP and observe the VM "vif"
or "tap" interface to which the flow is directed. Again,
"ovs-dpctl show" and "ovs-appctl ofproto/trace" might help.
7. "tcpdump" the "vif" or "tap" interface to which the ARP is
directed.
8. "tcpdump" the "eth" interface through which the ARP
ingresses a VM, from inside the VM.
It is likely that during one of these steps you will figure out the
problem. If not, then follow the ARP reply back to the origin, in
reverse.
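
The procedure above amounts to counting the same packets at successive hops and finding the first hop where the count falls. Here is a minimal sketch of that comparison in Python, using only the numbers reported further down in this thread (500 datagrams sent, 500 received on the ingress port, 356 counted by the flows and on the egress port). The hop labels and the helper names parse_aggregate and first_lossy_hop are made up for illustration; they are not part of any Open vSwitch tool.

```python
import re


def parse_aggregate(line):
    """Extract the *_count fields from an 'ovs-ofctl dump-aggregate' reply line."""
    return {key: int(value) for key, value in re.findall(r"(\w+_count)=(\d+)", line)}


def first_lossy_hop(counts):
    """Return (previous_hop, hop, lost) for the first hop that saw fewer
    packets than the hop before it, or None if no hop lost packets."""
    for (prev_name, prev_n), (name, n) in zip(counts, counts[1:]):
        if n < prev_n:
            return prev_name, name, prev_n - n
    return None


# Reply text as quoted in this thread, joined onto one line.
reply = ("NXST_AGGREGATE reply (xid=0x4): packet_count=356 "
         "byte_count=42364 flow_count=500")
stats = parse_aggregate(reply)

# Hop labels are illustrative; the counts are the ones reported in the thread.
hops = [
    ("sent by Host1", 500),
    ("OVS ingress port (dump-ports)", 500),
    ("matched by installed flows (dump-aggregate)", stats["packet_count"]),
    ("OVS egress port (dump-ports)", 356),
]

print(first_lossy_hop(hops))
```

On these numbers the first shortfall appears between the ingress port counter and the flow counters, which only says that the loss is somewhere inside the switch or on the path to and from the controller. Adding finer-grained hops (for example "tcpdump" counts on the controller connection) is what narrows it further.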
On Thu, Nov 21, 2013 at 04:55:13PM +0100, Anton Matsiuk wrote:
> I request log files up to debug level, namely:
> ovs-vswitchd.log
> ovs-dpctl.log
> ovs-ofctl.log
> but none of them shows any messages related to packet drops. All the
> statistics shows that correct number of flows was installed and only part
> of packets was processed.
> That's why I am asking, is there any else possibilities (beyond log files)
> to track packet drops in input buffers and probably to fix them? Or at
> least in which direction I should search for a solution?
>
>
> On 20 November 2013 18:13, Ben Pfaff <blp at nicira.com> wrote:
>
> > On Wed, Nov 20, 2013 at 12:35:25PM +0100, Anton Matsiuk wrote:
> > > I test Open vSwitch in the following scheme: I use 2 hosts directly
> > > connected to OVS and external OpenFlow Controller. Host1 generates UDP
> > > datagrams with sequential ports towards Host2, Host2 listens for these
> > > UDP datagrams. In response to every UDP datagram OVS generates Packet_In and
> > > Controller sends Flow_Mod back with L4 granularity (so for every pair of
> > > UDP port numbers it installs separate flow). I send bunch of UDP datagrams
> > > from Host1 and calculate how many of them arrived to Host2. I tried both
> > > with detached controller and running in the same machine as OVS. I tested
> > > it on different machines (in Mininet and with separated real hosts). I use
> > > out-of-band option for controller and disable-in-band=true.
> > >
> > >
> > > Starting some number of packets (around >300) packet drops are observed.
> > > For instance, if I generate 500 UDP packets in 120 ms only around 350 of
> > > them arrive to Host2 (Subsequent packets of the same flow can arrive to
> > > Host2, but first packets of flows always experience drops)
> > >
> > >
> > > ovs-ofctl dump-aggregate show that all the flows are installed but only
> > > part of packets are processed through them:
> > >
> > > NXST_AGGREGATE reply (xid=0x4): packet_count=356 byte_count=42364
> > > flow_count=500
> > >
> > >
> > > ovs-ofctl dump-ports also shows that 500 packets arrive on ingress
> > > interface and only 356 leave egress.
> > >
> > >
> > > ovs-dpctl show -s shows the same: 500 flows installed and 356 packets
> > > processed.
> > >
> > >
> > > Also I tried to replace Flow_Mods with Packet_Out messages for every
> > > packet, but I experienced the same drops. It seems like OVS starts dropping
> > > packets after some threshold (or buffer overload).
> > >
> > >
> > > Is there any possibility to debug these drops and maybe to manipulate
> > > ingress buffer sizes (or queue priorities) in order to avoid such drops?
> >
> > Yes, I think you will have to do the initial debugging yourself, to find
> > out where the drop is occurring. When you report that back to us, we
> > can help you figure out how to fix it.
> >
>
>
>
> --
> Best regards,
> Anton Matsiuk