[ovs-discuss] KVM/OVS/NIC bridging issue on kernel 5.x not seen on 4.19?

Gregory Rose gvrose8192 at gmail.com
Thu Feb 18 17:48:12 UTC 2021



On 2/17/2021 7:54 PM, Tyler Stachecki wrote:
> Thanks - I was worried about the unforeseen side effects, especially with
> my unfamiliarity of this part of the tree and OVS.
> 
> After looking at the tree, I see uses of dev_net(dev) for this kind of test
> within IP tunneling sections, which also handles the CONFIG_NET_NS=n case.
> 
> Thanks for your time,
> Tyler

Right, dev_net(dev) is the proper way to do that test.  Glad you got it
figured out.
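[Archive note: for anyone landing here from a search, a namespace-aware variant of the patch discussed below might look roughly like the sketch that follows. It compares the ingress device's netns against the vport device's netns with dev_net()/net_eq(), mirroring the @xnet test in skb_scrub_packet(). This is illustrative only, not the patch that was merged upstream; the hunk context and the exact condition are assumptions.]

```diff
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
 	}
 
+	/* Clear the timestamp only when the packet is crossing network
+	 * namespaces, mirroring the @xnet handling in skb_scrub_packet().
+	 */
+	if (skb->dev && !net_eq(dev_net(skb->dev), dev_net(vport->dev)))
+		skb->tstamp = 0;
 	skb->dev = vport->dev;
 	vport->ops->send(skb);
 	return;
```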

- Greg


> 
> On Tue, Feb 16, 2021 at 3:03 PM Gregory Rose <gvrose8192 at gmail.com> wrote:
> 
>>
>>
>> On 2/13/2021 8:01 AM, Tyler Stachecki wrote:
>>> I've fixed the issue in such a way that it works for me (TM), but would
>>> appreciate confirmation from an OVS expert that I'm not overlooking
>>> something here:
>>>
>>> Based on my last post, we need:
>>> --- a/net/openvswitch/vport.c
>>> +++ b/net/openvswitch/vport.c
>>> @@ -503,6 +503,7 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
>>>           }
>>>
>>>           skb->dev = vport->dev;
>>> +       skb->tstamp = 0;
>>>           vport->ops->send(skb);
>>>           return;
>>
>> Hmm... I'm not so sure about this.  The skb_scrub_packet() function only
>> clears skb->tstamp if the @xnet boolean parameter is true.  In this case
>> you are doing it unconditionally which very well might have unforeseen
>> side effects.
>>
>> Maybe test skb->dev->nd_net and if it isn't NULL then clear the
>> tstamp?
>>
>> What do you think?
>>
>> - Greg
>>
>>>
>>> As the timestamp must be cleared when forwarding packets to a different
>>> namespace ref:
>>>
>> https://patchwork.ozlabs.org/project/netdev/patch/20180307011230.24001-3-jesus.sanchez-palencia@intel.com/#1871003
>>>
>>> Cheers,
>>> Tyler
>>>
>>>> On Sat, Feb 13, 2021 at 12:04 AM Tyler Stachecki <stachecki.tyler at gmail.com> wrote:
>>>
>>>> Here's the offender:
>>>>
>>>> commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
>>>> Author: Eric Dumazet <edumazet at ...gle.com>
>>>> Date:   Fri Sep 28 10:28:44 2018 -0700
>>>>
>>>>       tcp/fq: move back to CLOCK_MONOTONIC
>>>>
>>>> Without this, I wasn't able to make it past the 4.20 series.  I
>>>> forward-ported a reversion to 5.4 LTS for fun and things still work
>>>> great.  Though it sounds like simply reverting this is not the right
>>>> fix -- some interesting discussion on other impacts of this commit:
>>>> https://lists.openwall.net/netdev/2019/01/10/36
>>>>
>>>>> Then, we probably need to clear skb->tstamp in more paths (you are
>>>>> mentioning bridge ...)
>>>>
>>>> I will try to take a peek sometime this weekend to see if I can spot
>>>> where in OVS, assuming it is there.
>>>>
>>>> On Tue, Feb 9, 2021 at 4:22 PM Gregory Rose <gvrose8192 at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 2/8/2021 4:19 PM, Tyler Stachecki wrote:
>>>>>> Thanks for the reply.  This is a router, so it is using conntrack;
>>>>>> unsure if there is additional connection tracking in OVS.
>>>>>> `ovs-ofctl dump-flows br-util` shows exactly one flow: the default
>>>>>> one.
>>>>>>
>>>>>> Here's my approx /etc/network/interfaces.  I just attach VMs to
>>>>>> this with libvirt and have nothing else added at this point:
>>>>>> allow-ovs br-util
>>>>>> iface br-util inet manual
>>>>>>            ovs_type OVSBridge
>>>>>>            ovs_ports enp0s20f1.102 vrf-util
>>>>>>
>>>>>> allow-br-util enp0s20f1.102
>>>>>> auto enp0s20f1.102
>>>>>> iface enp0s20f1.102 inet manual
>>>>>>            ovs_bridge br-util
>>>>>>            ovs_type OVSPort
>>>>>>            mtu 9000
>>>>>>
>>>>>> allow-br-util vrf-util
>>>>>> iface vrf-util inet static
>>>>>>            ovs_bridge br-util
>>>>>>            ovs_type OVSIntPort
>>>>>>            address 10.10.2.1/24
>>>>>>            mtu 9000
>>>>>>
>>>>>> I roughly transcribed what I was doing into a Linux bridge, and it
>>>>>> works as expected in 5.10... e.g. this in my /etc/network/interfaces:
>>>>>> auto enp0s20f1.102
>>>>>> iface enp0s20f1.102 inet manual
>>>>>>            mtu 9000
>>>>>>
>>>>>> auto vrf-util
>>>>>> iface vrf-util inet static
>>>>>>            bridge_ports enp0s20f1.102
>>>>>>            bridge-vlan-aware no
>>>>>>            address 10.10.2.1/24
>>>>>>            mtu 9000
>>>>>>
>>>>>> I'm having a bit of a tough time following the dataflow code, and
>>>>>> the ~1 commit or so I was missing from the kernel staging tree does
>>>>>> not seem to have fixed the issue.
>>>>>
>>>>> Hi Tyler,
>>>>>
>>>>> this does not sound like the previous issue I mentioned because
>>>>> that one was caused by flow programming for dropping packets.
>>>>>
>>>>> I hate to say it but you're probably going to have to resort to a
>>>>> bisect to find this one.
>>>>>
>>>>> - Greg
>>>>>
>>>>>>
>>>>>> On Mon, Feb 8, 2021 at 6:21 PM Gregory Rose <gvrose8192 at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2/6/2021 9:50 AM, Tyler Stachecki wrote:
>>>>>>>> I have simple forwarding issues when running the Debian stable
>>>>>>>> backports kernel (5.9) that I don't see with the stable,
>>>>>>>> non-backported 4.19 kernel.
>>>>>>>> Big fat disclaimer: I compiled my OVS (2.14.1) from source, but
>>>>>>>> given it works with the 4.19 kernel I doubt it has anything to do
>>>>>>>> with it.  For good measure, I also compiled 5.10.8 from source
>>>>>>>> and see the same issue I do in 5.9.
>>>>>>>>
>>>>>>>> The issue I see on 5.x (config snippets below):
>>>>>>>> My VM (vnet0 - 10.10.0.16/24) can ARP/ping other physical hosts
>>>>>>>> on its subnet (e.g. 00:07:32:4d:2f:71 = 10.10.0.23/24 below),
>>>>>>>> but only the first echo request in a sequence is seen by the
>>>>>>>> destination host.  I then have to wait about 10 seconds before
>>>>>>>> pinging the destination host from the VM again, but again only
>>>>>>>> the first echo in a sequence gets a reply.
>>>>>>>>
>>>>>>>> I've tried tcpdump'ing enp0s20f1.102 (the external interface on
>>>>>>>> the hypervisor) and see the pings going out that interface at
>>>>>>>> the rate I would expect.  OTOH, when I tcpdump on the destination
>>>>>>>> host, I only see the first of the ICMP echo requests in a
>>>>>>>> sequence (for which an echo reply is sent).
>>>>>>>>
>>>>>>>> I then added an OVS internal port on the hypervisor (i.e., on
>>>>>>>> br-util) and gave it an IP address (10.10.2.1/24).  It is able
>>>>>>>> to ping that same external host just fine.  Likewise, I am able
>>>>>>>> to ping between the VM and the OVS internal port just fine.
>>>>>>>>
>>>>>>>> When I roll back to 4.19, this weirdness about traffic going
>>>>>>>> out of enp0s20f1.102 *for the VM* goes away and everything just
>>>>>>>> works.  Any clues while I start ripping into code?
>>>>>>>
>>>>>>> Are you using any of the connection tracking capabilities?  I
>>>>>>> vaguely recall some issue that sounds a lot like what you're
>>>>>>> seeing but do not see anything in the git log to stir my memory.
>>>>>>> IIRC though it was a similar problem.
>>>>>>>
>>>>>>> Maybe provide a dump of your flows.
>>>>>>>
>>>>>>> - Greg
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 


More information about the discuss mailing list