[ovs-dev] [PATCH 1/1] daemon-unix: Support OVS-DPDK HW offloads for non-root user

Gaetan Rivet gaetanr at nvidia.com
Fri Mar 19 16:59:30 UTC 2021


On Mon, Mar 15, 2021, at 08:53, Maxime Coquelin wrote: 
> 
> 
> On 3/11/21 10:07 PM, Ilya Maximets wrote:
> > On 3/11/21 9:44 PM, David Marchand wrote:
> >> On Wed, Sep 16, 2020 at 10:06 PM Aaron Conole <aconole at redhat.com> wrote:
> >>>
> >>> David Marchand <david.marchand at redhat.com> writes:
> >>>
> >>>> On Tue, Sep 15, 2020 at 12:52 PM Ameer Mahagneh <ameerm at nvidia.com> wrote:
> >>>>>
> >>>>> For security reasons only root or privileged user can allocate Interconnect
> >>>>> Context Memory (ICM). Add this capability for vendors that require ICM
> >>>>> allocation when applying DPDK rte flows.
> >>>>>
> >>>>> Signed-off-by: Ameer Mahagneh <ameerm at nvidia.com>
> >>>>> Acked-by: Eli Britstein <elibr at nvidia.com>
> >>>>> ---
> >>>
> >>> Why is this needed?  SYS_RAWIO is extremely privileged and means that
> >>> there is no point even in dropping privs or changing UID - the process
> >>> with these caps is allowed to alter anything, map /dev/mem and
> >>> /dev/kmem, etc.
> >>>
> >>> Is there really no other way of doing this?  This feels somewhat like a
> >>> security regression rather than an improvement.  NOTE that we cannot
> >>> even use an LSM to protect against this - sys_rawio is able to perform
> >>> operations that can subvert LSMs.
> >>
> >> I had forgotten about this patch... I was expecting someone from
> >> Nvidia to reply but I see nothing on the ml.
> >>
> >> I do not have the full story, but I hit an issue just yesterday and
> >> spent today figuring this out.
> >>
> >> For me, the impact is simple: without this capability, full
> >> hw-offloads with mlx5 devices are unavailable with ovs running as non
> >> root.
> >> The logs are not helping btw, example:
> >> 2021-03-11T17:48:01.407Z|00062|netdev_offload_dpdk(dp_netdev_flow_5)|WARN|dpdk0:
> >> rte_flow creation failed: 1 ((null)).
> > 
> > At least, I think mlx driver should provide better error for this case
> > instead of 'null', e.g. EPERM with some meaningful message, so users
> > will know that they need to bump their privilege level.
> 
> +1
> 
> Please note that on my side with ConnectX-6 Dx, without this capability,
> flows are marked as partially offloaded but no packet is received on the
> VFs. Adding the proper capability fixes it (except that sometimes, for
> unknown reasons, half the flows are partially offloaded).
> 
> Regards,
> Maxime
> 
> >> 2021-03-11T17:48:01.407Z|00063|netdev_offload_dpdk(dp_netdev_flow_5)|WARN|dpdk0:
> >> Failed flow:   flow create 2 ingress priority 0 group 0 transfer
> >> pattern eth src is 0c:42:a1:00:a8:7c dst is 6a:20:8f:82:52:49 type is
> >> 0x0800 / ipv4 / end actions count / port_id original 0 id 5 / end
> >> And OVS automatically falls back to partial offloading.
> >>
> >> Can nvidia people explain the need for this capability and if other
> >> options have been considered?
> >>
> >>
> >> Thanks.
> >>
>

Hello everyone,

We should have addressed this issue earlier, sorry about that.

Our rte_flow implementation uses ICM mappings to program our hardware,
which requires highly privileged access. We are looking into ways to avoid it.

In the meantime, we failed to properly communicate this requirement through
the rte_flow API. We will improve the documentation and the error path in DPDK.

I can also update the OVS documentation if anyone thinks it would help, but
the requirement is vendor-specific. I would expect it to be more relevant at
the DPDK level.
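
Until the error path is improved, a process can at least check for itself
whether it holds the SYS_RAWIO capability discussed earlier in this thread.
The sketch below is an illustration only: it is not code from the patch, from
OVS, or from DPDK, and every example_* name is hypothetical. It assumes a
Linux system where /proc/self/status exposes the effective capability set as
a hexadecimal "CapEff:" line, and it relies on CAP_SYS_RAWIO being capability
number 17 in <linux/capability.h>.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* CAP_SYS_RAWIO is capability number 17 in <linux/capability.h>.
 * Defined locally (hypothetical name) to avoid a kernel-header dependency. */
#define EXAMPLE_CAP_SYS_RAWIO 17

/* Returns true if bit 'cap' is set in the capability bitmask 'mask'. */
bool
example_cap_mask_has(uint64_t mask, int cap)
{
    return cap >= 0 && cap < 64 && ((mask >> cap) & 1);
}

/* Finds the "CapEff:" line in text laid out like /proc/<pid>/status and
 * stores its hexadecimal value in '*mask'.  Returns true on success. */
bool
example_parse_cap_eff(const char *status, uint64_t *mask)
{
    const char *p = status;

    while (p && *p) {
        if (!strncmp(p, "CapEff:", 7)) {
            /* The scan conversion skips the tab that follows the label. */
            return sscanf(p + 7, "%" SCNx64, mask) == 1;
        }
        p = strchr(p, '\n');   /* Advance to the start of the next line. */
        if (p) {
            p++;
        }
    }
    return false;
}

/* Reads /proc/self/status and reports whether CAP_SYS_RAWIO is in the
 * effective set.  Returns false if the file cannot be read or parsed. */
bool
example_have_sys_rawio(void)
{
    char buf[8192];
    uint64_t mask = 0;
    size_t n;
    FILE *f = fopen("/proc/self/status", "r");

    if (!f) {
        return false;
    }
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    return example_parse_cap_eff(buf, &mask)
           && example_cap_mask_has(mask, EXAMPLE_CAP_SYS_RAWIO);
}
```

A caller could log a hint when example_have_sys_rawio() returns false before
attempting full offload. Whether that is the right check for a given driver
is an assumption; it only reports what the kernel shows for the process.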

Best regards,
-- 
Gaetan Rivet 

