[ovs-dev] [PATCH v4 1/4] Add support for 802.1ad (QinQ tunneling)
Xiao Liang
shaw.leon at gmail.com
Sun Jul 31 00:22:47 UTC 2016
Thanks! I'm working on code changes according to your comments. I
think we need more discussion about the ethertype matching. Please see
inline.
Thanks,
Xiao
On Thu, Jul 28, 2016 at 2:40 AM, Ben Pfaff <blp at ovn.org> wrote:
> On Tue, Jul 12, 2016 at 11:38:54PM +0800, Xiao Liang wrote:
>> Flow key handleing changes:
>> - Add VLAN header array in struct flow, to record multiple 802.1q VLAN
>> headers.
>> - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN,
>> increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
>>
>> Refacter VLAN handling in dpif-xlate:
>> - Introduce 'xvlan' to track VLAN stack during flow processing.
>> - Input and output VLAN translation according to the xbundle type.
>>
>> Push VLAN action support:
>> - Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
>> - Support push_vlan on dot1q packets.
>>
>> Add new port VLAN mode "dot1q-tunnel":
>> - Example:
>> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100
>> Pushes another VLAN 100 header on packets (tagged and untagged) on ingress,
>> and pops it on egress.
>> - Customer VLAN check:
>> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20
>> Only customer VLAN of 10 and 20 are allowed.
>>
>> Signed-off-by: Xiao Liang <shaw.leon at gmail.com>
>
> Thanks for working on this! I can see that it was a great deal of
> work. In addition to the "sparse" related fixes from a few days ago,
> I have some more detailed comments.
>
> It looks like this patch causes some tests to fail and then later
> patches fix them. This is not our practice: a patch should never make
> tests fails (at least not intentionally). Instead, if a patch
> requires changes to tests, then the same patch should update the
> tests. If the later patches are required to fix tests, they should be
> folded into this one.
I will squash the commits.
>
> Our practice is to give arrays names that are plural, like 'vlans'.
>
> Let's talk about VLANs in the flow struct. The use of an array of
> TPID+TCI seems fine. I think, however, that we need to define and
> enforce an invariant: if vlan[i] matches on anything, for i > 0, then
> every vlan[j], for j < i, needs to match on (tci & VLAN_CFI) != 0.
> Otherwise we can end up with nonsensical matches like one that matches
> on vlan[0] being not present (i.e. match on (vlan[0] & VLAN_CFI) == 0)
> but vlan[1] being present (i.e. match on (vlan[1] & VLAN_CFI) != 0).
> We enforce a similar invariant for MPLS.
>
You are right, thanks.
> I'm concerned about backward compatibility. Consider some application
> built on Open vSwitch using OpenFlow. Today, it can distinguish
> single-tagged and double-tagged packets (that use outer Ethertype
> 0x8100), as follows:
>
> - A single-tagged packet has vlan_tci != 0 and some non-VLAN
> dl_type.
>
> - A double-tagged packet has vlan_tci != 0 and a VLAN dl_type.
>
> With this patch, this won't work, because neither kind of packet has a
> VLAN dl_type. Instead, applications need to first match on the outer
> VLAN, then pop it off, then match on the inner VLAN. This difference
> could lead to security problems in applications. An application
> might, for example, want to pop an outer VLAN and forward the packet,
> but only if there is no inner VLAN. If it is implemented according to
> the previous rules, then it will not notice the inner VLAN.
Maybe some applications are implemented this way, but they are
probably wrong. OpenFlow says eth_type is "ethernet type of the
OpenFlow packet payload, after VLAN tags", so it is the payload
ethtype for a double-tagged packet. It's the same for single-tagged
packet: application must explicitly match vlan_tci to decide whether
it has VLAN tag.
The problem is that there is currently no way to peek inner VLAN
without popping the outer one. I heard from Tom that you have plan to
support TPID matching. Is it possible to add extension match fields
like TPID1, TPID2 to match both TPIDs? This may also provide a way to
differentiate 0x8100 and 0x88a8.
>
> There are probably multiple ways to deal with this problem. Let me
> propose one. It is somewhat awkward, so there might be a more
> graceful way. Basically the idea is to retain the current view from
> an OpenFlow perspective:
>
> - Packet without tags: vlan_tci == 0, dl_type is non-VLAN
> - Packet with 1 tag: vlan_tci != 0, dl_type is non-VLAN
> - Packet with 2+ tags: vlan_tci != 0, dl_type is 2nd VLAN
>
> So, when a packet with 2 tags pops off the outermost one, dl_type
> becomes the inner Ethertype (such as IPv4) and vlan_tci becomes the
> single remaining VLAN.
>
> I think there's another backward compatibility risk. This patch adds
> support for TPID 0x88a8 without adding any way for OpenFlow
> applications to distinguish 88a8 from 8100. This means that
> applications that previously handled 0x8100 VLANs will now handle
> 0x88a8 VLANs whereas previously they were able to filter these out. I
> don't have a wonderful idea on how to handle this but I think we need
> some way. (The OpenFlow spec is totally unhelpful here.)
>
> The tests seem incomplete in that they do not seem to add much in the
> way of tests for OpenFlow handling of multiple VLANs. I'd feel more
> confident if it added some.
>
I found some tests added in Tom's patches. I'm trying to include them
and other tests.
> In a few places it would be more convenient to make struct
> flow_vlan_hdr into a union, like this:
>
> union flow_vlan_hdr {
> ovs_be32 qtag;
> struct {
> ovs_be16 tpid; /* ETH_TYPE_VLAN_DOT1Q or ETH_TYPE_DOT1AD */
> ovs_be16 tci;
> };
> };
That's a good idea.
>
> We could take advantage of it like this, for example:
>
> @@ -326,16 +326,12 @@ parse_mpls(const void **datap, size_t *sizep)
> }
>
> static inline bool
> -parse_vlan(const void **datap, size_t *sizep, struct flow_vlan_hdr *vlan_hdrs)
> +parse_vlan(const void **datap, size_t *sizep, union flow_vlan_hdr *vlan_hdrs)
> {
> int encaps;
> const ovs_be16 *eth_type;
> - struct qtag_prefix {
> - ovs_be16 eth_type;
> - ovs_be16 tci;
> - };
>
> - memset(vlan_hdrs, 0, sizeof(struct flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS);
> + memset(vlan_hdrs, 0, sizeof(union flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS);
> data_pull(datap, sizep, ETH_ADDR_LEN * 2);
>
> eth_type = *datap;
> @@ -343,11 +339,11 @@ parse_vlan(const void **datap, size_t *sizep, struct flow_vlan_hdr *vlan_hdrs)
> for (encaps = 0;
> eth_type_vlan(*eth_type) && encaps < FLOW_MAX_VLAN_HEADERS;
> encaps++) {
> - if (OVS_LIKELY(*sizep
> - >= sizeof(struct qtag_prefix) + sizeof(ovs_be16))) {
> - const struct qtag_prefix *qp = data_pull(datap, sizep, sizeof *qp);
> - vlan_hdrs[encaps].tpid = qp->eth_type;
> - vlan_hdrs[encaps].tci = qp->tci | htons(VLAN_CFI);
> + if (OVS_LIKELY(*sizep >= sizeof(ovs_be32) + sizeof(ovs_be16))) {
> + const ovs_16aligned_be32 *qp = data_pull(datap, sizep, sizeof *qp);
> + vlan_hdrs[encaps].qtag = get_16aligned_be32(qp);
> + vlan_hdrs[encaps].tci |= htons(VLAN_CFI);
> +
> eth_type = *datap;
> } else {
> return false;
> diff --git a/lib/flow.h b/lib/flow.h
> index 4882e26..4851a32 100644
> --- a/lib/flow.h
> +++ b/lib/flow.h
> @@ -735,8 +735,8 @@ static inline uint16_t
> miniflow_get_vid(const struct miniflow *flow, size_t n)
> {
> if (n < FLOW_MAX_VLAN_HEADERS) {
> - uint32_t u32 = MINIFLOW_GET_U32(flow, vlan[n]);
> - return vlan_tci_to_vid(((struct flow_vlan_hdr *)&u32)->tci);
> + union flow_vlan_hdr hdr = { .qtag = MINIFLOW_GET_BE32(flow, vlan[n]) };
> + return vlan_tci_to_vid(hdr.tci);
> }
> return 0;
> }
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 56a6145..6838d7d 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -5560,8 +5560,7 @@ commit_vlan_action(const struct flow* flow, struct flow *base,
> for (base_n--, flow_n--;
> base_n >= 0 && flow_n >= 0;
> base_n--, flow_n--) {
> - if (memcmp(&base->vlan[base_n], &flow->vlan[flow_n],
> - sizeof(struct flow_vlan_hdr))) {
> + if (base->vlan[base_n].qtag != flow->vlan[flow_n].qtag) {
> break;
> }
> }
> @@ -5569,7 +5568,7 @@ commit_vlan_action(const struct flow* flow, struct flow *base,
> /* Pop all mismatching vlan of base, push thoses of flow */
> for (; base_n >= 0; base_n--) {
> nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_POP_VLAN);
> - memset(&wc->masks.vlan[base_n], 0xff, sizeof(wc->masks.vlan[base_n]));
> + wc->masks.vlan[base_n] = OVS_BE32_MAX;
> }
>
> for (; flow_n >= 0; flow_n--) {
>
> It seems risky to fail to add a way for match_format() to print out
> matches past the first VLAN. It is likely to cause great confusion at
> some point.
>
Yes, I need to think carefully about this.
> This code uses the term "shift" for what is usually termed "push". A
> "shift" can go in either direction. I'd use "push".
>
Yes, "push" looks symmetric. I used "shift" because it leaves room for
a header rather than push data.
> I suggest that flow_{shift,push}_vlan() should take an optional 'wc'
> argument like so many other functions that modify flows. This would
> make it more like the MPLS code and thus easier to understand.
>
> I'm not confident about the change to handle_packet_upcall().
>
> struct xvlan xvlan might be easier to use as just a flow_vlan_hdr.
flow_vlan_hdr is in big-endian and in which vlan-id/pcp are encoded. I
defined struct xvlan to avoid conversion each time.
>
> The commit_vlan_action() function would be better if it was more like
> commit_mpls_action(), to make it easier to verify correctness. The
> main difference is that at the datapath level VLAN only has push and
> pop, not "set", but that's a straightforward change. I'd introduce
> functions to count and find common VLAN tags, like we have for MPLS.
Agree, extracting a function makes it more clear.
>
> I'll probably have further comments on later versions of this patch.
>
> Thanks again for working on this!
More information about the dev
mailing list