[ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested MTU size validation.

Kavanagh, Mark B mark.b.kavanagh at intel.com
Mon Jan 22 10:58:07 UTC 2018


Hey Ian,

Thanks for this patch.

Apart from the issues that Flavio pointed out, I have a few additional comments inline.

Cheers,
Mark

>From: Stokes, Ian
>Sent: Friday, January 19, 2018 2:47 PM
>To: Flavio Leitner <fbl at sysclose.org>
>Cc: dev at openvswitch.org; Kavanagh, Mark B <mark.b.kavanagh at intel.com>
>Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested MTU
>size validation.
>
>> Hi Ian,
>>
>> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
>> > This commit replaces MTU_TO_FRAME_LEN(mtu) with
>> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>> >
>> > When setting an MTU we first check if the requested MTU frame size
>> > will exceed the maximum packet frame size supported in
>> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
>> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>>
>> Perhaps?
>> dpdk_eth_dev_init()
>> rte_eth_dev_set_mtu()
>>
>>
>> > However when using rte_ethdev_set_mtu(mtu) the calculation used to
>> > check that the MTU does not exceed the max frame length for that
>> > device varies between DPDK device drivers. For example ixgbe driver
>> > calculates MTU frame length  as
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>> >
>> > i40e driver calculates it as
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
>> >
>> > em driver calculates it as
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>> >
>> > Currently it is possible to set an MTU for a netdev_dpdk device that
>> > exceeds the upper limit MTU for that devices DPDK driver. This leads
>> > to a segfault. This is because the MTU frame length comparison as is,
>> > does not take into account the addition of the vlan tag overhead
>> > expected in the drivers. The netdev_dpdk_set_mtu() call will
>> > incorrectly succeed but the subsequent dpdk_ethdev_init() will fail
>> > before the queues have been created for the DPDK device. This coupled
>> > with assumptions regarding reconfiguration requirements for the netdev
>> > will lead to a segfault when the rxq is polled for this device.
>> >
>> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
>> > validating a requested MTU in netdev_dpdk_set_mtu().
>> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
>> >
>> > It's use accounts for the potential of up to two vlan headers being

Since you're spinning a V3 anyway... ;)
<micronit>Its, not "It's"</micronit>

>> > included in the overhead required by some devices in MTU frame length
>> > at the netdev_dpdk_set_mtu() stage. This allows OVS to flag an error
>> > rather than the DPDK driver if the MTU exceeds the max DPDK frame
>> > length. OVS can fail gracefully at this point by falling back to a

I think 'falling back' here is a bit misleading, as it suggest some kind of rollback, or previous state restoration.
In truth, the MTU in this case never changes; it was initialized to 1500, as stays as such, since the new value was rejected.

>> > default MTU of 1500 and continue to configure the port.
>> >
>> > Note: this fix is a work around, a better approach would be if DPDK
>> > devices could report the maximum MTU value that can be requested on a
>> > per device basis. This capability however is not currently available.
>> > A downside of this patch is that the MTU upper limit will be reduced
>> > by 8 bytes for DPDK devices that do not need to account for vlan tags
>> > in MTU frame length driver calculations e.g. ixgbe devices upper MTU
>> > limit is reduced from the OVS point of view from 9710 to 9702.
>> >
>> > CC: Mark Kavanagh <mark.b.kavanagh at intel.com>
>> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>> > Signed-off-by: Ian Stokes <ian.stokes at intel.com>
>> >
>> > ---
>> > v1->v2
>> > * Add note to limitations in DPDK documentation regarding MTU.
>> > * Correct MTU calculation in commit message.
>> > * Flag that we provision 8 bytes (2 x vlan header) by using
>> >   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
>> > ---
>> >  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
>> >  lib/netdev-dpdk.c                    |   13 ++++++++++++-
>> >  2 files changed, 24 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/Documentation/intro/install/dpdk.rst
>> > b/Documentation/intro/install/dpdk.rst
>> > index 3fecb5c..5800096 100644
>> > --- a/Documentation/intro/install/dpdk.rst
>> > +++ b/Documentation/intro/install/dpdk.rst
>> > @@ -585,6 +585,18 @@ Limitations
>> >
>> >  .. _DPDK release notes:
>> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html
>> >
>> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead is
>> > +  calculated e.g. i40e driver includes 2 x vlan headers in MTU
>> > +overhead,

As a general point, this is technically frame (i.e L2) overhead, whereas MTU is an L3 concept. I'd rephrase it as 'how the L2 overhead for a given MTU value is calculated'.

>> > +  em driver includes 1 x vlan header, ixgbe driver does not include a
>> > +vlan
>> > +  header in overhead. Currently it is not possible for OVS DPDK to
>> > +know what
>> > +  upper bound MTU value is supported for a given device. As such OVS
>> > +DPDK
>> > +  must provision for the case where the  maximum MTU value includes 2
>> > +x vlan
>> > +  headers. This reduces the upper bound MTU value for devices that do
>> > +not
>> > +  include vlan headers in their overhead by 8 bytes e.g. ixgbe
>> > +devices  upper
>>
>> two spaces
>>
>> > +  bound MTU is reduced from 9710 to 9702. This work around is
>> > + temporary and  is expected to be removed once a method is provided
>> > + by DPDK to query the  maximum MTU for a given device.

Has it been confirmed that DPDK will provide such functionality? If not, you may want to rephrase the previous sentence, so as not to set unrealistic expectations.

>> > +
>> >  Reporting Bugs
>> >  --------------
>> >
>> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>> > e32c7f6..2ac76e3 100644
>> > --- a/lib/netdev-dpdk.c
>> > +++ b/lib/netdev-dpdk.c
>> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int
>> > mtu)  {
>> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>> >
>> > -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> > +    /* XXX: We need to ensure the requested MTU frame length does not

I'd probably rephrase to something along these lines: "ensure that the overall frame length of the requested MTU...".

>> > +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
>> > +     * in how the MTU frame length is calculated when
>> rte_ethdev_set_mtu(mtu)
>> > +     * is called e.g. i40e driver includes 2 x vlan headers in MTU
>> overhead
>> > +     * the em driver includes 1 x vlan tag in MTU overhead, the ixgbe
>> driver
>> > +     * does not include vlan tags in MTU overhead. As such we should
>> use
>> > +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x vlan
>> headers
>> > +     * (8 bytes) for comparison. This avoids a failure later with
>> > +     * rte_ethdev_set_mtu(). This approach should be used until DPDK
>> > + provides
>>
>> rte_eth_dev_set_mtu
>>
>> > +     * a method to retrieve maximum MTU frame length for a given
>> device.
>> > +     */
>> > +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >          || mtu < ETHER_MIN_MTU) {
>> >          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>> >          return EINVAL;
>>
>> We got a bug report for another related MTU issue because in older
>> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead, so
>> if the packet length is the maximum allowed, it can't forward packets with
>> VLAN headers included. So, I agree that this needs improvements on the
>> DPDK side to report not only the maximum MTU but also to standardize what
>> means MTU. Today we don't know the overhead each PMD will account.
>>
>> In the hope that the committer can fix the small typos above or if there
>> is another spin:
>>
>> Acked-by: Flavio Leitner <fbl at sysclose.org>
>
>Thanks Flavio,
>
>I agree, we can add this to requirements for DPDK MTU capabilities.
>
>I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark I'll add
>this to the DPDK_MERGE over the weekend.
>
>Thanks
>Ian
>>
>> Thanks Ian!
>>
>>
>> > --
>> > 1.7.0.7
>> >
>> > _______________________________________________
>> > dev mailing list
>> > dev at openvswitch.org
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>> --
>> Flavio



More information about the dev mailing list