[ovs-dev] [PATCH] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted

Flavio Leitner fbl at sysclose.org
Mon Jul 8 13:41:07 UTC 2019


On Tue, Jul 02, 2019 at 08:50:16PM -0400, Vasu Dasari wrote:
> Hi Flavio,
> 
> I am trying to emulate the test case scenario you mentioned earlier, where
> in we need to clean you neighbor cache when an external interface goes
> down. To study that, I wrote a small script based off of test case
> system-traffic.at, "datapath - ping over vxlan tunnel". And this experiment
> gave me good experience around the understanding of netdev and kmod
> implementation of VxLAN.
> 
> Various tests I have performed using the new script include,
> 1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests.
> And in this mode, tested following cases
>    a. VxLAN interface being part of OVS.
>    b. VxLAN interface external to OVS
>    I see that in kernel mode of operation, native tunneling is off, and
> hence no ARP entries are learnt in tnl-neigh-cache. Please refer to
> ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we are
> making here(as part of my commit) are not effected in kernel mode of
> operation.

In either case my understanding is that OvS doesn't care about the
VXLAN and just hands off the packet to the interface. In another
words, it's the interface's job to encapsulate the traffic and
that's why it doesn't impact on the tnl-neigh-cache.


> 2. ovs-vswitchd started with "--disable-system" option for performing
> usermode VxLAN tests. And in this mode, tested following cases
>    a. VxLAN interface being part of OVS. This test case works. In this
> case, entries are cleanly removed by user deleting the ports from the
> bridge.

Right, this is the scenario you tested first, if I recall correctly.

>    b. VxLAN interface external to OVS. I could not get this case to work.

I think OvS will inject the packet to the VXLAN interface, but I
never tried this as it seems unusual scenario to have.

> 3. ovs-vswitchd started normally(without --disable-system option) for
> performing usermode VxLAN tests. And in this mode, tested following cases
>    a. VxLAN interface being part of OVS. This test case works. In this
> case, entries are cleanly removed by user deleting the ports from the
> bridge.
>    b. VxLAN interface external to OVS. I could not get this case to work.

This looks like a perfect copy from item #2 which is okay, just checking
if there was no copy&paste error somewhere :-)

 
> I could not 2b and 3b use cases to at all. Do you expect these to work
> normally?  The reason I ask this is, as I understand from the code, even
> though "ovs/route/show" shows the route to remote vtep, OVS does not know
> which ODP port to take to send the packet out of. Please refer to
> ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow() and
> hence packet transmission fails with "native tunnel routing failed".
> 
> If you agree that 2b and 3b are not supported use cases, then I can submit
> my code changes as per your review comments.

Let me step back a bit because in the scenario I told initially had
the egress device (ethX) with the endpoint address outside of the
bridge while the VXLAN interface was always part of the bridge.
Therefore we had two scenarios to cover with the difference being
the endpoint address being part or not part of the OvS bridge. 

However, looking more closely to the code if the VXLAN is part of the
bridge in userspace, then native tunnel will need the tnl-neigh-cache
to build the headers.  Then it sends out the ARP Request packet to the
datapath only, so it doesn't seem to support endpoint addresses outside
of the OvS. I guess your initial patch covering only interfaces as part
of OvS was good enough then.

Do you agree with that?

Thanks!
fbl


> 
> Please let me know what you think of.
> -Vasu
> 
> *Vasu Dasari*
> 
> 
> On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl at sysclose.org> wrote:
> 
> > On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> > > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl at sysclose.org> wrote:
> > > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > > > +{
> > > > > +    struct tnl_neigh_entry *neigh;
> > > > > +    bool changed = false;
> > > > > +
> > > > > +    ovs_mutex_lock(&mutex);
> > > > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > > > +            tnl_neigh_delete(neigh);
> > > > > +            changed = true;
> > > >
> > > > Do you expect to match on additional entries? Otherwise it
> > > > could bail out here.
> > > >
> > > >
> > > Say there are two or more neighbors on the port or on bridge, then by
> > > bailing out we would be missing others. So, will leave it there.
> >
> > You're right.
> >
> > [...]
> > > > However, as I mentioned in the discussion, the tnl IP address could
> > > > be on a device that doesn't belong to OVS at all. For example:
> > [...]
> > > > The tnl endpoint must have a valid route, so I suggest to hook the
> > > > tnl_neigh_cache_flush into route-table.c which receives a notification
> > > > when a route changes. If a route is deleted, we should flush the
> > > > device's tnl cache. Doing so, would cover both userspace and kernel
> > > > datapath for OVS and non-OVS devices.
> > > >
> > > >
> > > I see what you are saying. Let me play with code a bit and resubmit
> > patch.
> >
> > OK, looking forward to it!
> > Thanks Vasu,
> > fbl
> >


More information about the dev mailing list