[ovs-discuss] ip6gre issue: a way to recirculate encapsulated packets back into a switch for forwarding is needed

Lazarov, Lyubomir lyubomir.lazarov at student.kit.edu
Mon Dec 9 10:05:15 UTC 2019

To whom it may concern,

I am an undergraduate student currently working on a thesis involving an OVS setup. The setup consists of multiple Docker containers each running an OVS instance. The containers are interconnected by Containernet. For brevity, a small one-hop example is described below:

In essence, node 1 GRE-encapsulates packets with ipv6_dst fc00::1 and fc00::2 into packets with outer ipv6_dst fcaa::1 and fcaa::2, respectively, and forwards them to node 2. Pinging fc00::1 or fc00::2 thus generates traffic for node 2. On node 1 we have:

# needed to get ipv6 working in containers
sysctl net.ipv6.conf.all.disable_ipv6=0

ovs-ctl start
ovs-vsctl add-br switch
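# ofport_request asks OVS for a specific OpenFlow port number; the ofport column itself is maintained by ovs-vswitchd
ovs-vsctl add-port switch in -- set interface in ofport_request=1 type=internal
ovs-vsctl add-port switch allgre -- set interface allgre ofport_request=2 type=ip6gre options:packet_type=legacy_l2 options:remote_ip=flow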

ip l set in up
IFACE_IN_MAC=$(cat /sys/class/net/in/address)

ip -6 r add fc00::1 dev in
ip -6 r add fc00::2 dev in
ip n add fc00::1 dev in lladdr $IFACE_IN_MAC
ip n add fc00::2 dev in lladdr $IFACE_IN_MAC

ip -6 r add fcaa::1 dev n1-eth0
ip -6 r add fcaa::2 dev n1-eth0

# next hop MAC address, i.e. the MAC address of n2-eth0 on the second docker node
# n1-eth0 and n2-eth0 are connected to each other via containernet
ip n add fcaa::1 dev n1-eth0 lladdr 00:00:00:00:00:02
ip n add fcaa::2 dev n1-eth0 lladdr 00:00:00:00:00:02

ovs-ofctl add-flow switch 'in_port=1,ipv6,ipv6_dst=fc00::1,actions=set_field:fcaa::1->tun_ipv6_dst,output:2'
ovs-ofctl add-flow switch 'in_port=1,ipv6,ipv6_dst=fc00::2,actions=set_field:fcaa::2->tun_ipv6_dst,output:2'

#ip l show n1-eth0
#    n1-eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
#    link/ether 00:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff link-netnsid 1
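
The two encapsulation flows on node 1 follow a single pattern: match an inner destination, set the tunnel destination, output to the tunnel port. A minimal sketch of that pattern is given here, assuming a hypothetical helper named encap_flow (it is not part of OVS) and the port numbers 1 and 2 used above. It only prints the ovs-ofctl commands as a dry run and does not touch the switch:

```shell
#!/bin/sh
# Hypothetical helper (not an OVS tool): build the flow spec that maps an
# inner IPv6 destination ($1) to a tunnel destination ($2), mirroring the
# two node 1 flows. Ports 1 (in) and 2 (allgre) are taken from the setup.
encap_flow() {
    inner=$1
    outer=$2
    printf 'in_port=1,ipv6,ipv6_dst=%s,actions=set_field:%s->tun_ipv6_dst,output:2\n' \
        "$inner" "$outer"
}

# Dry run: print the ovs-ofctl commands instead of executing them.
for pair in 'fc00::1 fcaa::1' 'fc00::2 fcaa::2'; do
    # Intentional word splitting: $1 = inner address, $2 = outer address.
    set -- $pair
    echo "ovs-ofctl add-flow switch '$(encap_flow "$1" "$2")'"
done
```

Piping the printed lines to sh on the node would install the flows; keeping generation and execution separate makes it easy to review the mapping first.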

In docker node 2:

sysctl net.ipv6.conf.all.disable_ipv6=0

ovs-vsctl add-br switch
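# ofport_request asks OVS for a specific OpenFlow port number; the ofport column itself is maintained by ovs-vswitchd
ovs-vsctl add-port switch n2-eth0 -- set interface n2-eth0 ofport_request=1
ovs-vsctl add-port switch allgre -- set interface allgre ofport_request=2 type=ip6gre options:packet_type=legacy_l2 options:remote_ip=flow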

ip l set switch up
ip -6 r add local fcaa::/16 dev switch    # AnyIP kernel feature

# Intended use case: no decapsulation needed. Rewrite the outermost destination IP and forward to the next hop, possibly a third Docker node (hence the output to a "potential" port with OpenFlow number 3).
ovs-ofctl add-flow switch in_port=1,ipv6,ipv6_dst=fcaa::1,actions="set_field:fcaa::ff->ipv6_dst,output:3"

# Intended use case: decapsulate. Send the packet to the Linux networking stack; it loops back and appears on the "allgre" port, where further matching takes place.
ovs-ofctl add-flow switch in_port=1,ipv6,ipv6_dst=fcaa::2,actions="output:LOCAL"

# Intended use case: do something useful with a decapsulated packet.
ovs-ofctl add-flow switch in_port=2,tun_ipv6_dst=fcaa::2,actions="..."

#ip l show n2-eth0
#    n2-eth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
#    link/ether 00:00:00:00:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 1

The problem is that the second Docker node must also be able to GRE-encapsulate packets received on Containernet ports and set their tun_ipv6_dst to an address in the fcaa::/16 range. Thus, we have two conflicting use cases. Some packets in the fcaa::/16 prefix need to be decapsulated and looped back to the switch for further processing/forwarding. Some packets of another prefix, e.g. fc00::/16, must be encapsulated with GRE, their outermost destination addresses must be set in the fcaa::/16 range based on flows, and then the (now encapsulated) packets must also loop back to the switch for forwarding.

I am currently unable to implement the latter use case. Because the whole fcaa::/16 prefix is bound to the switch's LOCAL port, I can decapsulate any packet from this prefix and then forward it via OVS, since I simply "receive" it on the "allgre" port. However, my understanding is that tunnel ports in OVS rely on kernel routing, so my prefix binding would also mean that all packets which I encapsulate through "allgre" bounce back and get decapsulated right away: OVS sees a GRE packet and a matching generic tunnel port for it.

I need a way to get the encapsulated packets back into the switch to perform forwarding, but they must not land on the "allgre" port.
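
The suspected bounce-back could be checked on node 2 with a few standard diagnostic commands. The following is only a sketch: the run wrapper and the DRY_RUN variable are hypothetical conveniences, and the script prints the commands by default so that it is harmless on a machine without OVS. I have not confirmed what these commands report for this setup.

```shell
#!/bin/sh
# Dry run by default; set DRY_RUN=0 on a node with OVS installed to execute.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Trace how the OpenFlow tables would treat a packet for fc00::1 on port 1.
run ovs-appctl ofproto/trace switch in_port=1,ipv6,ipv6_dst=fc00::1
# List the installed datapath flows to see where tunnelled packets end up.
run ovs-appctl dpctl/dump-flows
# Ask the kernel which route a tunnel destination in fcaa::/16 would take.
run ip -6 route get fcaa::1
```

If the last command reports a local route via the "switch" device, that would be consistent with encapsulated packets being delivered locally instead of leaving the node.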

How could I solve the described problem? Can it perhaps be achieved through some kind of datapath configuration (dpctl)? Any help would be greatly appreciated.

Best regards,
