[ovs-dev] [PATCH 1/1] daemon-unix: Support OVS-DPDK HW offloads for non-root user

Ilya Maximets i.maximets at ovn.org
Thu Mar 11 21:07:44 UTC 2021

On 3/11/21 9:44 PM, David Marchand wrote:
> On Wed, Sep 16, 2020 at 10:06 PM Aaron Conole <aconole at redhat.com> wrote:
>> David Marchand <david.marchand at redhat.com> writes:
>>> On Tue, Sep 15, 2020 at 12:52 PM Ameer Mahagneh <ameerm at nvidia.com> wrote:
>>>> For security reasons, only root or a privileged user can allocate
>>>> Interconnect Context Memory (ICM). Add this capability for vendors that
>>>> require ICM allocation when applying DPDK rte flows.
>>>> Signed-off-by: Ameer Mahagneh <ameerm at nvidia.com>
>>>> Acked-by: Eli Britstein <elibr at nvidia.com>
>>>> ---
>> Why is this needed?  SYS_RAWIO is extremely privileged and means that
>> there is no point even in dropping privs or changing UID - the process
>> with these caps is allowed to alter anything, map /dev/mem and
>> /dev/kmem, etc.
>> Is there really no other way of doing this?  This feels somewhat like a
>> security regression rather than an improvement.  NOTE that we cannot
>> even use an LSM to protect against this - sys_rawio is able to perform
>> operations that can subvert LSMs.
> I had forgotten about this patch... I was expecting someone from
> Nvidia to reply but I see nothing on the ml.
> I do not have the full story, but I hit an issue just yesterday and
> spent today figuring this out.
> For me, the impact is simple: without this capability, full
> hw-offloads with mlx5 devices are unavailable when ovs runs as a
> non-root user.
> The logs are not helping btw, example:
> 2021-03-11T17:48:01.407Z|00062|netdev_offload_dpdk(dp_netdev_flow_5)|WARN|dpdk0:
> rte_flow creation failed: 1 ((null)).

At the very least, I think the mlx driver should report a better error
for this case instead of '(null)', e.g. EPERM with a meaningful message,
so that users know they need to raise their privilege level.
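
To illustrate the kind of hint I mean, here is a rough, Linux-only sketch.
It is not OVS, DPDK or mlx5 code, and I am not claiming the driver does
anything like this today. All helper names are hypothetical. It assumes
the process can read /proc/self/status and relies on CAP_SYS_RAWIO being
capability number 17 in linux/capability.h:

```c
#include <stdio.h>

/* Capability number of CAP_SYS_RAWIO as defined in linux/capability.h. */
#define CAP_SYS_RAWIO_BIT 17

/* Hypothetical helper: parses a "CapEff:" line from /proc/self/status.
 * Returns 1 and stores the hex mask on success, 0 for any other line. */
static int
parse_capeff(const char *line, unsigned long long *mask)
{
    return sscanf(line, "CapEff: %llx", mask) == 1;
}

/* Hypothetical helper: returns 1 if 'bit' is set in 'mask', else 0. */
static int
cap_bit_set(unsigned long long mask, int bit)
{
    return (int) ((mask >> bit) & 1);
}

/* Hypothetical helper: returns 1 if the calling process holds the
 * capability in its effective set, 0 if it does not, and -1 if the
 * effective set could not be determined. */
static int
has_effective_cap(int bit)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    unsigned long long mask;
    int result = -1;

    if (!f) {
        return -1;
    }
    while (fgets(line, sizeof line, f)) {
        if (parse_capeff(line, &mask)) {
            result = cap_bit_set(mask, bit);
            break;
        }
    }
    fclose(f);
    return result;
}

/* Hypothetical helper: maps the check result to a message that could
 * accompany a failed flow creation instead of "(null)". */
static const char *
rawio_hint(int has_cap)
{
    if (has_cap == 1) {
        return "CAP_SYS_RAWIO is present; the failure has another cause";
    } else if (has_cap == 0) {
        return "CAP_SYS_RAWIO is missing; full offload may need it";
    }
    return "could not determine effective capabilities";
}
```

A caller would log rawio_hint(has_effective_cap(CAP_SYS_RAWIO_BIT)) next
to the failure. This only tells the user which capability is missing; it
says nothing about whether granting it is a good idea.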

> 2021-03-11T17:48:01.407Z|00063|netdev_offload_dpdk(dp_netdev_flow_5)|WARN|dpdk0:
> Failed flow:   flow create 2 ingress priority 0 group 0 transfer
> pattern eth src is 0c:42:a1:00:a8:7c dst is 6a:20:8f:82:52:49 type is
> 0x0800 / ipv4 / end actions count / port_id original 0 id 5 / end
> And OVS automatically falls back to partial offloading.
> Can nvidia people explain the need for this capability and if other
> options have been considered?
> Thanks.
