[ovs-dev] [PATCH ovn 1/2] system-test: Fix flake in ECMP IPv6 symmetric reply test

Mark Gray mark.d.gray at redhat.com
Tue Jul 13 19:29:24 UTC 2021


On 13/07/2021 20:20, Han Zhou wrote:
> On Tue, Jul 13, 2021 at 2:05 AM Dumitru Ceara <dceara at redhat.com> wrote:
>>
>> On 7/13/21 9:29 AM, Mark Gray wrote:
>>> On 13/07/2021 01:40, Han Zhou wrote:
>>>> On Mon, Jul 12, 2021 at 10:28 AM Mark Gray <mark.d.gray at redhat.com>
> wrote:
>>>>>
>>>>> Statically add IPv6 neighbor MAC addresses to avoid NS messages
>>>>> evicting datapath flows causing occasional test failures.
>>>>>
>>>>> We also configure all interfaces to send only one IPv6 router
>>>>> solicitation message. These messages can cause datapath flows
>>>>> to be unexpectedly evicted causing test failures.
>>>>>
>>>>> Fixes: 7c927c0c0be1 ("ovn-northd: Fix IPv6 ECMP symmetric reply
> flows")
>>>>> Signed-off-by: Mark Gray <mark.d.gray at redhat.com>
>>>>> ---
>>>>>  tests/system-ovn.at | 51
> +++++++++++++++++++++++++--------------------
>>>>>  1 file changed, 28 insertions(+), 23 deletions(-)
>>>>>
>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>>> index 79879c6e003b..fc377bbd1a47 100644
>>>>> --- a/tests/system-ovn.at
>>>>> +++ b/tests/system-ovn.at
>>>>> @@ -5833,17 +5833,35 @@ ovn-nbctl lr-route-add R3 fd01::/64 fd02::1
>>>>>
>>>>>  # Logical port 'alice1' in switch 'alice'.
>>>>>  ADD_NAMESPACES(alice1)
>>>>> +# Only send 1 router solicitation as any additional ones can cause
>>>> datapath
>>>>> +# flows to get evicted, causing unexpected failures below.
>>>>> +NS_CHECK_EXEC([alice1], [sysctl -w
>>>> net.ipv6.conf.default.router_solicitations=1], [0], [dnl
>>>>> +net.ipv6.conf.default.router_solicitations = 1
>>>>> +])
>>>>>  ADD_VETH(alice1, alice1, br-int, "fd01::2/64", "f0:00:00:01:02:04", \
>>>>>           "fd01::1")
>>>>>  OVS_WAIT_UNTIL([test "$(ip netns exec alice1 ip a | grep fd01::2 |
> grep
>>>> tentative)" = ""])
>>>>>  ovn-nbctl lsp-add alice alice1 \
>>>>>  -- lsp-set-addresses alice1 "f0:00:00:01:02:04 fd01::2"
>>>>> +# Add neighbour MAC address to avoid sending IPv6 NS messages which
> could
>>>>> +# cause datapath flows to be evicted
>>>>> +NS_CHECK_EXEC([alice1], [ip -6 neigh add fd01::1 lladdr
>>>> 00:00:01:01:02:03 dev alice1], [0])
>>>>>
>>>>>  # Logical port 'bob1' in switch 'bob'.
>>>>>  ADD_NAMESPACES(bob1)
>>>>> +# Only send 1 router solicitation as any additional ones can cause
>>>> datapath
>>>>> +# flows to get evicted, causing unexpected failures below.
>>>>> +NS_CHECK_EXEC([bob1], [sysctl -w
>>>> net.ipv6.conf.default.router_solicitations=1], [0], [dnl
>>>>> +net.ipv6.conf.default.router_solicitations = 1
>>>>> +])
>>>>>  ADD_VETH(bob1, bob1, br-int, "fd07::1/64", "f0:00:00:01:02:06", \
>>>>>           "fd07::2")
>>>>>  OVS_WAIT_UNTIL([test "$(ip netns exec bob1 ip a | grep fd07::1 | grep
>>>> tentative)" = ""])
>>>>> +# Add neighbour MAC addresses to avoid sending IPv6 NS messages which
>>>> could
>>>>> +# cause datapath flows to be evicted
>>>>> +NS_CHECK_EXEC([bob1], [ip -6 neigh add fd07::2 lladdr
> 00:00:02:01:02:03
>>>> dev bob1], [0])
>>>>> +NS_CHECK_EXEC([bob1], [ip -6 neigh add fd07::3 lladdr
> 00:00:01:01:02:04
>>>> dev bob1], [0])
>>>>> +
>>>>>  ovn-nbctl lsp-add bob bob1 \
>>>>>  -- lsp-set-addresses bob1 "f0:00:00:01:02:06 fd07::1"
>>>>>
>>>>> @@ -5852,45 +5870,32 @@ ovn-nbctl --wait=hv sync
>>>>>
>>>>>  on_exit 'ovs-ofctl dump-flows br-int'
>>>>>
>>>>> -# Later in this test we will check for a datapath flow that matches:
>>>>> -#
> "ct_state(+new-est-rpl+trk).*ct(.*label=0x200000401020400000000/.*)".
>>>> Due
>>>>> -# to the way OVS generates datapath flows with wildcards, ICMPv6 NS
>>>> flows will
>>>>> -# evict this datapath flow. In order to ensure that the flow does not
>>>>> -# get evicted, we send one ping packet in order to carry out neighbor
>>>>> -# discovery. We then flush the datpath to remove the NS flows so that
>>>> the flow
>>>>> -#
> "ct_state(+new-est-rpl+trk).*ct(.*label=0x200000401020400000000/.*)"
>>>> will
>>>>> -# be present when we check for it.
>>>>> -NS_CHECK_EXEC([bob1], [ping -q -c 2 -i 0.3 -w 15 fd01::2 |
> FORMAT_PING],
>>>> \
>>>>> -[0], [dnl
>>>>> -2 packets transmitted, 2 received, 0% packet loss, time 0ms
>>>>> -])
>>>>> -ovs-appctl dpctl/del-flows
>>>>> -
>>>>>  # 'bob1' should be able to ping 'alice1' directly.
>>>>>  NS_CHECK_EXEC([bob1], [ping -q -c 20 -i 0.3 -w 15 fd01::2 |
>>>> FORMAT_PING], \
>>>>>  [0], [dnl
>>>>>  20 packets transmitted, 20 received, 0% packet loss, time 0ms
>>>>>  ])
>>>>>
>>>>> -# Ensure conntrack entry is present. We should not try to predict
>>>>> -# the tunnel key for the output port, so we strip it from the labels
>>>>> -# and just ensure that the known ethernet address is present.
>>>>> -AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd01::2) | \
>>>>> -sed -e 's/zone=[[0-9]]*/zone=<cleared>/' |
>>>>> -sed -e
>>>>
> 's/labels=0x[[0-9a-f]]*00000401020400000000/labels=0x00000401020400000000/'],
>>>> [0], [dnl
>>>>>
>>>>
> -icmpv6,orig=(src=fd07::1,dst=fd01::2,id=<cleared>,type=128,code=0),reply=(src=fd01::2,dst=fd07::1,id=<cleared>,type=129,code=0),zone=<cleared>,labels=0x00000401020400000000
>>>>> -])
>>>>> -
>>>>>  # Ensure datapaths show conntrack states as expected
>>>>>  # Like with conntrack entries, we shouldn't try to predict
>>>>>  # port binding tunnel keys. So omit them from expected labels.
>>>>>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep
>>>> 'ct_state(+new-est-rpl+trk).*ct(.*label=0x200000401020400000000/.*)'
> -c],
>>>> [0], [dnl
>>>>>  1
>>>>>  ])
>>>>> +
>>>>>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep
>>>> 'ct_state(-new+est+rpl+trk).*ct_label(0x.*00000401020400000000/.*)'
> -c],
>>>> [0], [dnl
>>>>>  1
>>>>>  ])
>>>>>
>>>>> +# Ensure conntrack entry is present. We should not try to predict
>>>>> +# the tunnel key for the output port, so we strip it from the labels
>>>>> +# and just ensure that the known ethernet address is present.
>>>>> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd01::2) | \
>>>>> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/' |
>>>>> +sed -e
>>>>
> 's/labels=0x[[0-9a-f]]*00000401020400000000/labels=0x00000401020400000000/'],
>>>> [0], [dnl
>>>>>
>>>>
> +icmpv6,orig=(src=fd07::1,dst=fd01::2,id=<cleared>,type=128,code=0),reply=(src=fd01::2,dst=fd07::1,id=<cleared>,type=129,code=0),zone=<cleared>,labels=0x00000401020400000000
>>>>> +])
>>>>> +
>>>>>  ovs-ofctl dump-flows br-int
>>>>>
>>>>>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>> --
>>>>> 2.27.0
>>>>>
>>>>
>>>> Thanks Mark for the fix! Just a minor comment regarding changing the
> sysctl
>>>> settings for router_solicitations. It is better to save the original
> value
>>>> before changing it and restore it afterwards. Maybe still a problem if
> the
>>>> test fails in between the settings could not be restored, but I don't
> have
>>>> a better idea.
>>>
>>> I shouldn't need to change it back because I am setting it in a network
>>> namespace so it shouldn't affect the system configuration and the
>>> namespace gets deleted at the end of the test.
>>>
> Yes this makes sense. Sorry I forgot it is in namespace.
> 
>>>>
>>>> This approach does fix the problem at least on my machine. However, I
> don't
>>>> think I fully understood the problem. I tried many times when
> reverting the
>>>> last NAT related northd change and the test always passes, while
> adding the
>>>> change it fails 100%. I understand that there are RS packets causing
> the
>>>
>>> This test has been failing intermittently for a couple of weeks (even
>>> before the NAT change you are referring to).
>>>
>>>> expected flows got evicted from DP. The older version passes the test
>>>> probably because the pipelines were different so that other flows got
>>>> evicted but not the ones we are checking in this test.
>>>
>>> It was failing before - but it is a timing issue. It fails very
>>> infrequently on my laptop (e.g. failing once in 24 hours while running
>>> the test in a loop) but was failing quite regularly on the CI (maybe 1
>>> in 3 times)
>>>
>>>> However, what I
>>>> don't understand is that we are checking the DP flows immediately
> after the
>>>> 20 ping packets succeed, how would the RS/NS packets always come right
>>>> after the ping tests causing the DP flows evicted?
>>>
>>> I don't know why it is failing more frequently on your setup but it
>>> should be clear that it is the cause because if you dump the flows
>>> before the check, you can clearly see NS (or RS) packets in the DP
> table.
>>>
> 
> Yes I did dump the flows immediately after the 20 ping packets, and
> realized that it was NS and RS packets messing things up, but just wasn't
> clear about how they came right after the ping (because if they evict the
> flows before the ping completes, then either ping would fail or ping would
> triggering reinstallation of the DP flows, for my understanding.

My guess is that this is what was causing the flakiness and the race. My
assumption is that they were getting evicted between completing the ping
and dumping the flows. In fact, on the CI, I generally saw it when
checking for the established flow (rather than the new flow) which was
the second check - leaving a larger window for the NS/RS to arrive. I
also think this is why this fails more frequently in the CI - because it
is running in a much slower environment (versus my core i7). Either way,
I think the fix is correct .. it was a nasty one to debug.

> 
> To be clear this wasn't something I think should hold the merge, since it
> is clear that without NS/RS intervention the behavior is expected. It was
> just out of curiosity :)
> And I saw Numan had already merged it. Thanks again for fixing this.
> 

Thanks for reviewing

> Han
> 
>>>> If the RS/NS packets
>>>> come at some intervals, we wouldn't be so lucky, right? Could you help
>>>> clarify my confusion?
>>>>
>>>> Thanks,
>>>> Han
>>>>
>>>
>>
>> Han has the final call on this, but the change looks good to me:
>>
>> Acked-by: Dumitru Ceara <dceara at redhat.com>
>>
>> Regards,
>> Dumitru
>>
> 



More information about the dev mailing list