[ovs-discuss] packets not being forwarded beyond 65 hops of
fbl at redhat.com
Thu Aug 14 18:14:05 UTC 2014
On Thu, Aug 14, 2014 at 10:36:49AM -0700, Ben Pfaff wrote:
> On Thu, Aug 14, 2014 at 02:29:49PM -0300, Flavio Leitner wrote:
> > On Thu, Aug 14, 2014 at 10:01:56AM -0700, Ben Pfaff wrote:
> > > On Wed, Aug 13, 2014 at 6:57 PM, Flavio Leitner <fbl at redhat.com> wrote:
> > > > If it's kernel DP, then patch ports are internal ports which during TX
> > > > insert the packet (skb) into the CPU backlog queue (enqueue_to_backlog())
> > > > for later processing. Later on, the backlog queue is processed
> > > > (process_backlog()) and the packet is actually received by the other port.
> > >
> > > I missed that we're talking about patch ports. That explains the whole issue.
> > >
> > > Patch ports are implemented in userspace regardless of the datapath in use.
> > > They do not exist as ports visible from the kernel (definitely not as internal
> > > ports). Hops across patch ports are "optimized out" in userspace, but the
> > > recursion is limited to 64 levels, hence the issue you're seeing.
> > That is a surprise to me. So if I have a pair of patch ports
> > connecting two bridges, then all traffic goes from kernel to
> > userspace and then back to kernel DP?
> Not at all. Patch ports are implemented efficiently, but they are not
> visible at the kernel layer at all. Suppose you have br0 with ports
> vif0 and patch0 and br1 with ports vif1 and patch1, with patch0 and
> patch1 connected. If a packet is received on vif0, egresses patch0 on
> br0 and ingresses patch1 on br1, and is then sent out vif1, OVS in
> userspace just translates this to a single kernel action "output vif1".
> The result is actually more efficient than a kernel implementation of
> patch ports: if the kernel had patch ports then we'd have two trips to
> userspace to set up the flow (one for ingress on vif0, one for ingress
> on patch1), but with the userspace implementation there is just a single
> userspace trip, for ingress on vif0.

It's clear now. Thanks for the detailed explanation.
I was looking at how internal ports work and I wrongly assumed that a patch
port would be an internal port too.
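
To picture Ben's explanation, here is a toy model in Python. It is not OVS
code: the flow tables, the port names and the way the depth is counted are
all made up for illustration. Only the idea comes from the thread: userspace
follows patch ports recursively during translation, emits just the final
"output" action, and gives up past 64 levels.

```python
MAX_DEPTH = 64  # limit quoted in the thread; how OVS counts levels is assumed here


def build_chain(n_bridges):
    """Build n_bridges toy bridges in a row, joined by hypothetical patch ports."""
    flows, patch_peer = {}, {}
    for i in range(n_bridges):
        in_port = "vif_in" if i == 0 else f"patch{i}_in"
        out_port = "vif_out" if i == n_bridges - 1 else f"patch{i}_out"
        flows[f"br{i}"] = {in_port: out_port}  # toy flow table: in_port -> out_port
        if i < n_bridges - 1:
            # The patch port's peer lives on the next bridge in the chain.
            patch_peer[out_port] = (f"br{i + 1}", f"patch{i + 1}_in")
    return flows, patch_peer


def translate(flows, patch_peer, bridge, in_port, depth=0):
    """Return the datapath actions for a packet entering `bridge` on `in_port`."""
    if depth > MAX_DEPTH:
        return []  # too many patch-port hops: no action is emitted, packet is dropped
    out_port = flows[bridge][in_port]
    if out_port in patch_peer:
        # Patch port: keep translating on the peer bridge instead of
        # emitting an action, so the hop never reaches the datapath.
        peer_bridge, peer_port = patch_peer[out_port]
        return translate(flows, patch_peer, peer_bridge, peer_port, depth + 1)
    return [f"output:{out_port}"]
```

For two bridges, translate(*build_chain(2), "br0", "vif_in") collapses the
patch hop into the single action "output:vif_out". With the depth accounting
assumed in this toy, a chain of 65 bridges still translates and a chain of 66
returns no actions at all.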

Speaking of how ports work and their limits, I've heard about an interesting
problem when using containers. If you add more than 1k containers to
an OVS bridge, the network starts to fail. The reason is that when
flooding a packet such as an ARP request, OVS clones the packet for each
port. However, for veth devices each clone ends up in the CPU backlog queue,
which is limited by the sysctl net.core.netdev_max_backlog (1000 by
default), so the excess gets dropped.

I don't see how to prevent that in OVS because the DP just loops executing
the actions, sending the packet regardless of the device's type, etc.
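
A rough way to see the arithmetic is the toy model below. It is not kernel
code. It assumes that every clone lands on the same CPU's backlog queue and
that nothing drains the queue while the flood is in progress. The function
name and the exact acceptance condition are made up for illustration.

```python
from collections import deque


def flood(n_ports, max_backlog=1000):
    """Clone one packet to n_ports veth ports; return (queued, dropped)."""
    backlog = deque()  # stands in for one CPU's backlog queue
    dropped = 0
    for port in range(n_ports):
        if len(backlog) < max_backlog:
            backlog.append(port)  # clone queued for later processing
        else:
            dropped += 1  # queue full: this port never sees the packet
    return len(backlog), dropped
```

Under those assumptions, flooding to 1500 ports with the default limit queues
1000 clones and drops the other 500, which matches the "more than 1k
containers" symptom. Raising net.core.netdev_max_backlog only moves the
threshold.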