[ovs-dev] [PATCH 1/1] netdev-tc-offloads: Support match on priority tags

Louis Peens louis.peens at netronome.com
Thu Jun 27 12:55:37 UTC 2019


On Fri, Jun 14, 2019 at 6:01 PM Eli Britstein <elibr at mellanox.com> wrote:

>
> On 6/14/2019 5:40 PM, Louis Peens wrote:
>
>
>
> On Wed, Jun 12, 2019 at 11:53 AM Eli Britstein <elibr at mellanox.com> wrote:
>
>>
>> On 6/11/2019 2:21 PM, Ilya Maximets wrote:
>> >> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at
>> netronome.com
>> <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fnetronome.com&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227223260&sdata=nO14Y2PABBRpM4JGuW26l5Fg5oP47m0VDz10xGCcheI%3D&reserved=0>
>> >
>> >> wrote:
>> >>
>> >>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
>> >>>> Sorry Eli,
>> >>>>
>> >>>> I had missed this but I have it now.
>> >>>>
>> >>>> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com>
>> wrote:
>> >>>>
>> >>>>> Ping
>> >>>>> ------------------------------
>> >>>>> *From:* Eli Britstein <elibr at mellanox.com>
>> >>>>> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
>> >>>>> *To:* dev at openvswitch.org
>> <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fopenvswitch.org&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227223260&sdata=t3dIZiyDY7ZUmrjZQd%2FwCQ3QSCxsQGhVa8bevoD1zwQ%3D&reserved=0>;
>> Simon Horman
>> >>>>> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
>> >>>>> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
>> >>> tags
>> >>>>> The logic by which a TC rule has a VLAN match is by the VLAN TCI
>> field,
>> >>>>> either the VID, PCP or CFI are non-zero. For priority-tag packets
>> >>>>> there is a VLAN tag header with a zero VLAN TCI. Match on existence
>> of
>> >>>>> VLAN header (TPID) regardless of TCI matching.
>> >>>>>
>> >>>>> Signed-off-by: Eli Britstein <elibr at mellanox.com>
>> >>>>> Reviewed-by: Roi Dayan <roid at mellanox.com>
>> >>> Thanks this seems to be a nice enhancement, applied to master.
>> >>>
>> >> Hey
>> >>
>> >> During some internal testing we found that this patch broke some
>> >> set-L4 cases when using tc-offloads. Setting up a simple bridge and
>> >> flow rule looking something like this:
>> >>
>> >>    ovs-vsctl show:
>> >>      Bridge "br0"
>> >>          Port "br0"
>> >>              Interface "br0"
>> >>                  type: internal
>> >>          Port "veth0r"
>> >>              Interface "veth0r"
>> >>          Port "veth1r"
>> >>              Interface "veth1r"
>> >>
>> >>    ovs-ofctl add-flow br0
>> cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
>> >>      actions=set_field:4002-\>tcp_dst,output:veth1r
>> >>
>> >> and then sending VLAN tagged traffic leads to packets not egressing the
>> >> port.
>> >>
>> >> The TC rule that gets installed looks like this:
>> >>    vlan_ethtype ip
>> > There must be no vlan_ethertype match just because there was no such
>> field match
>> > in the original flow.
>> >
>> > Looking at the code I see that it checks for key.tpid which makes no
>> sense,
>> > because it must check for the mask first. At least there must be
>> something like
>> > this:
>> >
>> > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> > index 2af0f10d9..7e7160426 100644
>> > --- a/lib/netdev-offload-tc.c
>> > +++ b/lib/netdev-offload-tc.c
>> > @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct
>> match *match,
>> >       }
>> >       mask->mpls_lse[0] = 0;
>> >
>> > -    if (eth_type_vlan(key->vlans[0].tpid)) {
>> > +    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
>> >           flower.key.encap_eth_type[0] = flower.key.eth_type;
>> > +        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
>> >           flower.key.eth_type = key->vlans[0].tpid;
>> > +        flower.mask.eth_type = mask->vlans[0].tpid;
>> >       }
>> >       if (mask->vlans[0].tci) {
>> >           ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
>> > @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct
>> match *match,
>> >           }
>> >       }
>> >
>> > -    if (eth_type_vlan(key->vlans[1].tpid)) {
>> > +    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
>> >           flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
>> > +        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
>> >           flower.key.encap_eth_type[0] = key->vlans[1].tpid;
>> > +        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
>> >       }
>> >       if (mask->vlans[1].tci) {
>> >           ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
>> > ---
>> >
>> > I'm not an expert here, so it needs checking and review form someone
>> more
>> > familiar with flower.
>> This seems correct. Need to test it.
>> >
>> >
>> >>    eth_type ipv4
>> >>    ip_proto tcp
>> >>    dst_port 4000
>> >>    ip_flags nofrag
>> >>    not_in_hw
>> >>          action order 1:  pedit action pipe keys 1
>> >>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>> >>          Action statistics:
>> >>          Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 2: csum (tcp) action pipe
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 3: mirred (Egress Redirect to device ens260np1)
>> stolen
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>          cookie 8eb331199f4d41dc739e769a2f79f1ba
>> >>
>> >> Note the drop counters in the csum(tcp) section. Before the patch the
>> rule
>> >> looks like this:
>> >>    eth_type ipv4
>> >>    ip_proto tcp
>> >>    dst_port 4000
>> >>    ip_flags nofrag
>> >>    not_in_hw
>> >>          action order 1:  pedit action pipe keys 1
>> >>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 2: csum (tcp) action pipe
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 3: mirred (Egress Redirect to device ens260np1)
>> stolen
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>          cookie 83b3464d0c47e3747956bca7e8857a53
>> >>
>> >> Here I suspect there may be a different issue as none of the counters
>> >> increase
>> >> so the TC rule is probably never hit (confirmed by the datapath rule
>> timing
>> >> out even with continuous traffic), but at least packets are correctly
>> >> egressing via the userspace fallback path. It looks like this patch is
>> >> installing a rule in TC that does get hit, but is now mysteriously
>> >> dropping packets in the middle of the action processing pipeline.
>> >>
>> >> Another strange observation is that this only seems to happen when
>> setting
>> >> one of the layer4 fields, setting layer3 seems to work as expected.
>>
>> I tried setting such tc rule (using tc tool), and it works OK for me.
>>
>> Make sure you have commit 2ecba2d1e45b ("net: sched: act_csum: Fix csum
>> calc for tagged packets") in your kernel.
>>
> Hey, thanks for the input so far. Currently I'm still seeing this issue,
> even with the above mentioned
> csum patch included, but I suspect the issue I'm seeing is probably
> related. I can reproduce the issue using
> TC rules only, so this moves the blame from OVS. In case somebody
> wants do dig deeper, here is my TC setup:
>
> echo "Create veth interfaces"
> ip link add $V0r type veth peer name $V0n
> ip link add $V1r type veth peer name $V1n
> echo "Recreate qdiscs"
> tc qdisc del dev $V0r handle ffff: ingress
> tc qdisc del dev $V1r handle ffff: ingress
> tc qdisc add dev $V0r handle ffff: ingress
> tc qdisc add dev $V1r handle ffff: ingress
> echo "Add flow rules"
> match="802.1Q flower vlan_ethtype ipv4 ip_proto tcp dst_port 4000 ip_flags
> nofrag"
> action="pedit ex munge tcp dport set 4002 pipe csum tcp "
> action+="pipe mirred egress redirect dev $V1r"
> tc filter add dev $V0r parent ffff: protocol ${match} action ${action}
>
> The pcap I use looks like this:
> reading from file tcp_vlan.pcap, link-type EN10MB (Ethernet)
> 15:16:55.293780 52:12:ef:32:45:ab > 54:a2:32:ef:54:1b, ethertype 802.1Q
> (0x8100), length 64: vlan 100, p 2, ethertype IPv4, (tos 0xc, ttl 64, id 1,
> offset 0, flags [none], proto TCP (6), length 46)
>     1.2.3.4.3000 > 5.6.7.8.4000: Flags [S], cksum 0x2529 (correct), seq
> 0:6, win 8192, length 6
>
> With the above config I see the same behaviour where the packets are
> dropped at the csum action part. Further instrumentation in the
> act_csum module shows that there is skb corruption happening - the 4002
> value that is suppose to end up in the TCP destination
> field ends up in the IP-header length field. This makes me think that
> there is possibly another issue similar to the act_csum one that needs to
> be fixed in
> act_pedit but I've not found that yet.
>
> This was on the linux stable tree at tag v5.2-rc4. Hope somebody here with
> more knowledge on the kernel tc code can provide some input.
> Thanks
>
> Can you try this? it is still not final (and not yet accepted), but maybe
> it fixes your issue
>
>
> https://lore.kernel.org/netdev/830aa8f07b528b50b212c01d53de6ec651500535.1559322531.git.dcaratti@redhat.com/
>
Hey
Sorry about the delay, I only got back to this now. Thanks for the pointer
to the patches above, they do seem to be in the
correct direction. Unfortunately they don't make a difference in my case
since it only makes changes for L3 and not L4 (tcp src/dst in my test).
I tried implementing something similar for L4 but could not do so with any
success yet.
I did find that both: "skb_network_offset" and "skb_transport_offset" are 0
in the
"TCA_PEDIT_KEY_EX_HDR_TYPE_TCP/UDP" case statement in
"pedit_skb_hdr_offset". This is inside the
"skb_transport_header_was_set" check, so I would expect the
transport_offset to be non-zero or at least different to the network_offset.
That explains why pedit is changing the value at the incorrect offset,
however I could not figure out what is causing this yet.

That is the update from my side, maybe this triggers another idea from
someone on what is going wrong.
Thanks

>
> Louis Peens
>
>>
>> >>
>> >> Regards
>> >> Louis Peens
>> >>
>> >>
>> >>> _______________________________________________
>> >>> dev mailing list
>> >>> dev at openvswitch.org
>> <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fopenvswitch.org&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227233250&sdata=IX1K1vVoW%2B1EH8WJzpZ8sY0jHbpH%2F5fkSj5HE3rAfvU%3D&reserved=0>
>> >>>
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&amp;data=02%7C01%7Celibr%40mellanox.com%7Cc6de245c204d41f12c4408d6ee5ef971%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636958489017862941&amp;sdata=Z4b89HrFIv%2F%2BMnGVuyUqACYLBag%2Fmxi8YYt8i0omc3A%3D&amp;reserved=0
>> <https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227233250&sdata=7THf8MLACqYwOrnsT1xIOiBS1mfMJyBPOM5ijl%2Fwr44%3D&reserved=0>
>> >>>
>>
>
>
>


More information about the dev mailing list