[ovs-dev] [PATCH] ovn: Fix IPv6 DAD failure for child ports

Numan Siddique nusiddiq at redhat.com
Fri Oct 5 11:24:49 UTC 2018


On Fri, Oct 5, 2018 at 2:32 AM Guru Shetty <guru at ovn.org> wrote:

>
>
> On Thu, 4 Oct 2018 at 13:29, Numan Siddique <nusiddiq at redhat.com> wrote:
>
>>
>> Thanks for the review Guru.
>>
>> Please see below for the comments.
>>
>>
>>
>> On Fri, Oct 5, 2018 at 12:09 AM Guru Shetty <guru at ovn.org> wrote:
>>
>>>
>>>
>>> On Mon, 17 Sep 2018 at 02:24, <nusiddiq at redhat.com> wrote:
>>>
>>>> From: Numan Siddique <nusiddiq at redhat.com>
>>>>
>>>> When a child vlan interface is created inside a VM, the below kernel
>>>> message
>>>> is seen and IPv6 doesn't work on that interface.
>>>>
>>>
>>> On which interface doesn't IPv6 work? On the Vm's interface or on the
>>> container's interface?
>>>
>>>
>>
>> It is seen on the container's interface.
>>
>>>
>>>> [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
>>>>
>>>> When a child port sends a broadcast packet, OVN delivers the same
>>>> packet back to the child port (and hence the DAD check fails).
>>>>
>>>
>>> Is this limited to IP broadcast only? Or does this happen with mac
>>> broadcast too? (I am surprised why I hadn't seen this behavior with ARP
>>> packets of containers).
>>>
>>
>> It happens with mac broadcast. You can actually reproduce the issue by
>> applying the code in tests/ovn.at without this patch.
>>
>> Looks like I have confused you with the patch.  Let me try to give an
>> example. Suppose we have switch - sw0 - with ports p1, p2 and p3 and all of
>> them reside in the same chassis.
>> When p1 sends a broadcast/multicast packet, the below logical flow is hit
>>
>> table=17(ls_in_l2_lkup      ), priority=100  , match=(eth.mcast),
>> action=(outport = "_MC_flood"; output;)
>>
>> This translates to the OF flow
>> - metadata=0x1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00
>> actions=load:0xffff->NXM_NX_REG15[],resubmit(,32)
>>
>> In the table 33 if reg15 has 0xffff, it will resubmit the flow with reg15
>> set to each logical port
>>
>> i.e reg15=0xffff,metadata=0x1
>> actions=load:0x1->NXM_NX_REG15[],resubmit(,34),load:0x2>NXM_NX_REG15[],resubmit(,34),load:0x3->NXM_NX_REG15[],resubmit(,34),load:0xffff->NXM_NX_REG15[]
>>
>> In the normal case, the packet will not be delivered back to p1 and
>> instead will be dropped in table 34 (OFTABLE_CHECK_LOOPBACK)
>> since reg10[0] is not set.
>>
>> table=34, n_packets=0, n_bytes=0,
>> priority=100,reg10=0/0x1,reg14=0x2,reg15=0x2,metadata=0x1 actions=drop
>>
>> When the register reg10[0] is set, it means the packet needs to be loop
>> backed to the source port. So table 64 has flows to clear the in_port
>> i,.e
>> table=64,  priority=100,reg10=0x1/0x1,reg15=0x1,metadata=0x1
>> actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
>>
>> Now if we take the example of child ports, let's say "p2" is the child
>> port of "p1" and in_port of p1 is 1
>> In table 0, we will see 2 flows for in_port=1
>>
>> match=(in_port=1) -
>> actions=(load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[] and submit to
>> table 8)
>> match=(in_port=1, dl_vlan=4) actions(load:0x1->NXM_NX_REG10[0],
>> strip_vlan, load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[] , and
>> submit to table 8)
>>
>> Suppose if p2 sends a braodcast packet, reg10[0] is set in table 0.
>>
>> In this case, since reg10[0] is set, the broadcast packet is not dropped
>> in table 34 for port "p2" and it gets delivered back.
>> So this patch instead of using reg10[0], it uses a new register bit
>> -MLF_NESTED_CONTAINER_BIT
>>
>
> Thanks for the detailed response. I understand till the above point. So
> not using NXM_NX_REG10[0] (or MLF_ALLOW_LOOPBACK_BIT) actually lets the
> packet drop in table 34.  Can you please add that information to the commit
> message. I think it will be helpful in the future.
>
> I then don't understand, what we are doing with the new code
> in OFTABLE_SAVE_INPORT.  I would expect that the new code is needed to save
> inport if the destination is a nested container. Not quite understand why
> we have a match for parent port. I will wait for the comment in the code to
> understand it.
>

The code to push, clear the inport and pop the inport is already present
for the child ports -
https://github.com/openvswitch/ovs/blob/master/ovn/controller/physical.c#L265
And the code to clear the inport for the parent port or any port when
reg10[0] is set is here -
https://github.com/openvswitch/ovs/blob/master/ovn/controller/physical.c#L256

Now with this patch since we are using reg10[5] to indicate the packet is
from child ports, we need to clear the in_port if the packet from the child
port needs to be sent to the parent port.
And that's why have a new match for the parent ports.

I changed the signature of the function 'put_local_common_flows' because I
wanted to add the flow to check reg10[5] only if a port is a parent port.
Otherwise we will add the flow to check reg10[5] for all
the ports in the switch.

I will add proper documentation and make it clear.

Thanks
Numan


>
>
>>
>> We also need to clear the in_port in table 64 if MLF_NESTED_CONTAINER_BIT
>> is set.
>>
>> I will address the review comments and submit p2.
>>
>> Hope this clears your confusion.
>>
>> Thanks
>> Numan
>>
>>
>>
>>>
>>>
>>> This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
>>>> the packet is received from any child port and table
>>>> 'OFTABLE_CHECK_LOOPBACK'
>>>> doesn't drop the packet.
>>>>
>>>
>>> I had thought that the previous tables only replicate packets to ports
>>> other than the inport. But I may be wrong.
>>> Some of my comments further down could be stupid because I may have not
>>> understood the problem well.
>>>
>>>
>>>>
>>>> This patch fixes the issue by using a new register bit
>>>> (MLF_NESTED_CONTAINER_BIT)
>>>> instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the
>>>> packets received
>>>> from child ports.
>>>>
>>>> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
>>>> ---
>>>>  ovn/controller/ofctrl.c   | 29 ++++++++++++++++++++---------
>>>>  ovn/controller/ofctrl.h   |  5 +++++
>>>>  ovn/controller/physical.c | 38 ++++++++++++++++++++++++++++++++------
>>>>  ovn/lib/logical-fields.h  |  4 ++++
>>>>  tests/ovn.at              | 11 +++++++++++
>>>>  5 files changed, 72 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
>>>> index 96c57f143..2f2b185ae 100644
>>>> --- a/ovn/controller/ofctrl.c
>>>> +++ b/ovn/controller/ofctrl.c
>>>> @@ -630,9 +630,11 @@ ofctrl_recv(const struct ofp_header *oh, enum
>>>> ofptype type)
>>>>   *
>>>>   * The caller should initialize its own hmap to hold the flows. */
>>>>  void
>>>> -ofctrl_add_flow(struct hmap *desired_flows,
>>>> -                uint8_t table_id, uint16_t priority, uint64_t cookie,
>>>> -                const struct match *match, const struct ofpbuf
>>>> *actions)
>>>> +ofctrl_check_and_add_flow(struct hmap *desired_flows,
>>>> +                          uint8_t table_id, uint16_t priority,
>>>> uint64_t cookie,
>>>> +                          const struct match *match,
>>>> +                          const struct ofpbuf *actions,
>>>> +                          bool log_duplicate_flow)
>>>>  {
>>>>      struct ovn_flow *f = xmalloc(sizeof *f);
>>>>      f->table_id = table_id;
>>>> @@ -644,13 +646,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>>>      f->cookie = cookie;
>>>>
>>>>      if (ovn_flow_lookup(desired_flows, f)) {
>>>> -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>>>> -        if (!VLOG_DROP_DBG(&rl)) {
>>>> -            char *s = ovn_flow_to_string(f);
>>>> -            VLOG_DBG("dropping duplicate flow: %s", s);
>>>> -            free(s);
>>>> +        if (log_duplicate_flow) {
>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 5);
>>>> +            if (!VLOG_DROP_DBG(&rl)) {
>>>> +                char *s = ovn_flow_to_string(f);
>>>> +                VLOG_DBG("dropping duplicate flow: %s", s);
>>>> +                free(s);
>>>> +            }
>>>>          }
>>>> -
>>>>          ovn_flow_destroy(f);
>>>>          return;
>>>>      }
>>>> @@ -658,6 +661,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>>>      hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
>>>>  }
>>>>
>>>> +void
>>>> +ofctrl_add_flow(struct hmap *desired_flows,
>>>> +                uint8_t table_id, uint16_t priority, uint64_t cookie,
>>>> +                const struct match *match, const struct ofpbuf
>>>> *actions)
>>>> +{
>>>> +    ofctrl_check_and_add_flow(desired_flows, table_id, priority,
>>>> cookie,
>>>> +                              match, actions, true);
>>>> +}
>>>>
>>>>  /* ovn_flow. */
>>>>
>>>> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
>>>> index e3fc95b05..ae0cfa513 100644
>>>> --- a/ovn/controller/ofctrl.h
>>>> +++ b/ovn/controller/ofctrl.h
>>>> @@ -55,4 +55,9 @@ void ofctrl_add_flow(struct hmap *desired_flows,
>>>> uint8_t table_id,
>>>>                       uint16_t priority, uint64_t cookie,
>>>>                       const struct match *, const struct ofpbuf
>>>> *ofpacts);
>>>>
>>>> +void ofctrl_check_and_add_flow(struct hmap *desired_flows, uint8_t
>>>> table_id,
>>>> +                               uint16_t priority, uint64_t cookie,
>>>> +                               const struct match *,
>>>> +                               const struct ofpbuf *ofpacts,
>>>> +                               bool log_duplicate_flow);
>>>>  #endif /* ovn/ofctrl.h */
>>>> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
>>>> index c38d7b09f..d781ae459 100644
>>>> --- a/ovn/controller/physical.c
>>>> +++ b/ovn/controller/physical.c
>>>> @@ -189,7 +189,8 @@ get_zone_ids(const struct sbrec_port_binding
>>>> *binding,
>>>>
>>>>  static void
>>>>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
>>>> -                       bool nested_container, const struct zone_ids
>>>> *zone_ids,
>>>> +                       uint32_t parent_port_key,
>>>> +                       const struct zone_ids *zone_ids,
>>>>                         struct ofpbuf *ofpacts_p, struct hmap
>>>> *flow_table)
>>>>  {
>>>>      struct match match;
>>>> @@ -253,7 +254,7 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>>>> port_key,
>>>>      ofpbuf_clear(ofpacts_p);
>>>>      match_set_metadata(&match, htonll(dp_key));
>>>>      match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
>>>> -    if (!nested_container) {
>>>> +    if (!parent_port_key) {
>>>>          match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>>>                               MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
>>>>      }
>>>>
>>> I think before your patch, the above code would always save inport for
>>> nested containers. They continue to do so now. I hope that was your
>>> intention?
>>>
>>>
>>>
>>>> @@ -264,6 +265,25 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>>>> port_key,
>>>>      put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>>>      ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
>>>>
>>>
>>>                    &match, ofpacts_p);
>>>>
>>> +
>>>>
>>> Can you add a comment here on what we are doing below?
>>>
>>> When a packet is headed to a nested container, the outport in OVN should
>>> be that of the destination nested container.
>>> But here it looks like we are matching on the OVN outport being the
>>> parent port and saying that if the outport is parent port, we should save
>>> the physical inport.
>>> Is that what we are doing? I don't quite understand how this helps to
>>> prevent packet from going back to the originator child port. I am sure I am
>>> missing something simple here.
>>>
>>> +    if (parent_port_key) {
>>>> +        match_init_catchall(&match);
>>>> +        ofpbuf_clear(ofpacts_p);
>>>> +        match_set_metadata(&match, htonll(dp_key));
>>>> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
>>>> parent_port_key);
>>>> +        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>>> +                             MLF_NESTED_CONTAINER,
>>>> MLF_NESTED_CONTAINER);
>>>> +
>>>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
>>>> +        put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
>>>> +        put_resubmit(OFTABLE_LOG_TO_PHY, ofpacts_p);
>>>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>>> +        /* When a parent port has multiple child ports, we don't want
>>>> to
>>>> +         * log the duplicate flow message.
>>>> +         */
>>>> +        ofctrl_check_and_add_flow(flow_table, OFTABLE_SAVE_INPORT,
>>>> 100, 0,
>>>> +                                  &match, ofpacts_p, false);
>>>> +    }
>>>>  }
>>>>
>>>>  static void
>>>> @@ -328,7 +348,7 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>          }
>>>>
>>>>          struct zone_ids binding_zones = get_zone_ids(binding,
>>>> ct_zones);
>>>> -        put_local_common_flows(dp_key, port_key, false, &binding_zones,
>>>> +        put_local_common_flows(dp_key, port_key, 0, &binding_zones,
>>>>                                 ofpacts_p, flow_table);
>>>>
>>>>          match_init_catchall(&match);
>>>> @@ -452,6 +472,7 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>
>>>>      int tag = 0;
>>>>      bool nested_container = false;
>>>> +    const struct sbrec_port_binding *parent_port = NULL;
>>>>      ofp_port_t ofport;
>>>>      bool is_remote = false;
>>>>      if (binding->parent_port && *binding->parent_port) {
>>>> @@ -463,6 +484,8 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>          if (ofport) {
>>>>              tag = *binding->tag;
>>>>              nested_container = true;
>>>> +            parent_port = lport_lookup_by_name(
>>>> +                sbrec_port_binding_by_name, binding->parent_port);
>>>>          }
>>>>      } else {
>>>>          ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
>>>> @@ -523,7 +546,8 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>           */
>>>>
>>>>          struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
>>>> -        put_local_common_flows(dp_key, port_key, nested_container,
>>>> &zone_ids,
>>>> +        uint32_t parent_port_key = parent_port ?
>>>> parent_port->tunnel_key : 0;
>>>> +        put_local_common_flows(dp_key, port_key, parent_port_key,
>>>> &zone_ids,
>>>>                                 ofpacts_p, flow_table);
>>>>
>>>>          /* Table 0, Priority 150 and 100.
>>>> @@ -553,8 +577,10 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>              if (nested_container) {
>>>>                  /* When a packet comes from a container sitting behind
>>>> a
>>>>                   * parent_port, we should let it loopback to other
>>>> containers
>>>> -                 * or the parent_port itself. */
>>>> -                put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1,
>>>> ofpacts_p);
>>>> +                 * or the parent_port itself. Indicate this by setting
>>>> the
>>>> +                 * MLF_NESTED_CONTAINER_BIT in MFF_LOG_FLAGS.*/
>>>> +                put_load(1, MFF_LOG_FLAGS, MLF_NESTED_CONTAINER_BIT, 1,
>>>> +                         ofpacts_p);
>>>>              }
>>>>              ofpact_put_STRIP_VLAN(ofpacts_p);
>>>>          }
>>>> diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
>>>> index b1dbb035a..ab31327e9 100644
>>>> --- a/ovn/lib/logical-fields.h
>>>> +++ b/ovn/lib/logical-fields.h
>>>> @@ -50,6 +50,7 @@ enum mff_log_flags_bits {
>>>>      MLF_FORCE_SNAT_FOR_DNAT_BIT = 2,
>>>>      MLF_FORCE_SNAT_FOR_LB_BIT = 3,
>>>>      MLF_LOCAL_ONLY_BIT = 4,
>>>> +    MLF_NESTED_CONTAINER_BIT = 5,
>>>>  };
>>>>
>>>>  /* MFF_LOG_FLAGS_REG flag assignments */
>>>> @@ -75,6 +76,9 @@ enum mff_log_flags {
>>>>       * hypervisors should instead only be output to local targets
>>>>       */
>>>>      MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT),
>>>> +
>>>> +    /* Indicate that a packet has received from a nested container. */
>>>> +    MLF_NESTED_CONTAINER = (1 << MLF_NESTED_CONTAINER_BIT),
>>>>  };
>>>>
>>>>  #endif /* ovn/lib/logical-fields.h */
>>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>>> index 769e09f81..a7aba5ccd 100644
>>>> --- a/tests/ovn.at
>>>> +++ b/tests/ovn.at
>>>> @@ -7038,6 +7038,17 @@
>>>> packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip
>>>>  echo  $packet >> expected1
>>>>  OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>>>
>>>> +# Send broadcast packet from foo1. foo1 should not receive the same
>>>> packet.
>>>> +src_mac="f00000010205"
>>>> +dst_mac="ffffffffffff"
>>>> +src_ip=`ip_to_hex 192 168 1 2`
>>>> +dst_ip=`ip_to_hex 255 255 255 255`
>>>>
>>>> +packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>>>> +as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
>>>> +
>>>> +# expected packet at VM1
>>>> +OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>>> +
>>>>  OVN_CLEANUP([hv1],[hv2])
>>>>
>>>>  AT_CLEANUP
>>>> --
>>>> 2.17.1
>>>>
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev at openvswitch.org
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>
>>>


More information about the dev mailing list