[ovs-discuss] OVS-DPDK fails after clearing buffer

Burakov, Anatoly anatoly.burakov at intel.com
Mon Apr 8 08:56:12 UTC 2019


> -----Original Message-----
> From: Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)
> [mailto:tohofman at cisco.com]
> Sent: Friday, April 5, 2019 9:39 PM
> To: bugs at openvswitch.org; Burakov, Anatoly <anatoly.burakov at intel.com>
> Cc: Shriroop Joshi (shrirjos) <shrirjos at cisco.com>; Stokes, Ian
> <ian.stokes at intel.com>
> Subject: Re: [ovs-discuss] OVS-DPDK fails after clearing buffer
> 
> Hi Anatoly,
> 
> I just wanted to follow up on the issue reported below. (It has already been
> two weeks.)
> 
> I don’t really understand the first solution you suggested: use IOVA as VA
> mode. Does that mean I should load the vfio-pci driver before I set dpdk-init
> to true, i.e. do a 'modprobe vfio-pci'? I do use vfio-pci, but I don't load
> it until I actually bind an interface to it.

Hi Tobias,

As far as I can remember, in 18.08, IOVA as VA mode will be enabled if:

0) the prerequisites are in place (modprobe vfio-pci, IOMMU enabled in the BIOS, etc.),
1) you have *at least one physical device* (otherwise EAL defaults to IOVA as PA mode), and
2) *all* of your *physical* devices are bound to vfio-pci.

Provided all of this is true, DPDK should run in IOVA as VA mode.
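For example, a minimal sequence might look like this (the PCI address below is
just a placeholder for your own NIC; dpdk-devbind.py ships in DPDK's usertools
directory):

  modprobe vfio-pci
  dpdk-devbind.py --status                      # check current bindings
  dpdk-devbind.py --bind=vfio-pci 0000:01:00.0  # repeat for every physical NIC DPDK will use

If all of your physical devices are bound to vfio-pci before you set dpdk-init
to true, EAL should pick IOVA as VA mode on its own.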

Alternatively, DPDK 17.11 and 18.11 will have an --iova-mode command-line switch which will allow forcing IOVA as VA mode where possible, but I'm not sure if 18.08 has it.
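If your DPDK does have the switch, OVS lets you pass extra EAL arguments
through the dpdk-extra option, e.g. (a sketch, assuming a DPDK build that
supports --iova-mode):

  ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="--iova-mode va"
  ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true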

> 
> Also, to answer your last question: Transparent HugePages are enabled. I've
> just disabled them and was still able to reproduce the issue.

Unfortunately, I can't be of much help here, as I have not looked into how the Linux VM caches work, let alone what happens when hugepages end up in said cache. I obviously don't know the specifics of your use case and whether it's really necessary to drop caches, but a cursory Google search indicates that the general sentiment is that you shouldn't be dropping caches in the first place, and that it is not good practice in general.

> 
> Regards
> Toby
> 
> 
> On 3/21/19, 12:19 PM, "Ian Stokes" <ian.stokes at intel.com> wrote:
> 
>     On 3/20/2019 10:37 PM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)
>     via discuss wrote:
>     > Hello,
>     >
> 
>     Hi,
> 
>     I wasn't sure at first glance what was happening, so I discussed it with
>     Anatoly (Cc'd), who has worked a considerable amount with DPDK memory
>     models. Please see the response below on what the suspected issue is.
>     Anatoly, thanks for your input on this.
> 
>     > I want to use Open vSwitch with DPDK enabled. For this purpose, I first
>     > allocate 512 HugePages of size 2MB to have a total of 1GB of HugePage
>     > memory available for OVS-DPDK. (I don't set any value for
>     > dpdk-socket-mem, so the default value of 1GB is taken.) Then I set
>     > dpdk-init=true. This normally works fine.
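>     > For reference, a rough sketch of that sequence (the hugetlbfs mount
>     > point below is just an example; on most systems it is already mounted):
>     >
>     >   echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>     >   mount -t hugetlbfs nodev /dev/hugepages
>     >   ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true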
>     >
>     > However, I have realized that I can't allocate HugePages from memory
>     > that is inside the buff/cache (visible through free -h). To solve
>     > this issue, I decided to clear the cache/buffer in Linux before
>     > allocating HugePages by running echo 1 > /proc/sys/vm/drop_caches.
>     >
>     > After that, allocation of the HugePages still works fine. However, when
>     > I then run ovs-vsctl set open_vswitch other_config:dpdk-init=true,
>     > the process crashes, and inside ovs-vswitchd.log I observe the
>     > following:
>     >
>     > ovs-vswitchd log output:
>     >
>     > 2019-03-18T13:32:41.112Z|00015|dpdk|ERR|EAL: Can only reserve 270 pages from 512 requested
>     >
>     > Current CONFIG_RTE_MAX_MEMSEG=256 is not enough
> 
>     From the above log it is clear that, after you drop the cache, the
>     hugepages' physical addresses get fragmented: DPDK can no longer
>     concatenate pages into segments, which results in a 1-page-per-segment
>     situation and causes you to run out of memseg structures, of which
>     there are only 256. We have no control over what addresses we get from
>     the OS, so there's really no way to "unfragment" the pages.
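>     (As a stopgap, the memseg limit in DPDK 17.11 is a build-time constant,
>     so you could rebuild DPDK with a larger value; a sketch, assuming the
>     classic make-based build:
> 
>       sed -i 's/CONFIG_RTE_MAX_MEMSEG=256/CONFIG_RTE_MAX_MEMSEG=1024/' config/common_base
>       make config T=x86_64-native-linuxapp-gcc && make
> 
>     This only papers over the fragmentation, though.)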
> 
>     So, the above only happens when
> 
>     1) you’re running in IOVA as PA mode (so, using real physical addresses).
>     2) your hugepages are heavily fragmented.
> 
>     Possible solutions for this are:
> 
>     1. Use IOVA as VA mode (so, use VFIO, not igb_uio). This way, the pages
>     will still be fragmented, but the IOMMU will remap them to be contiguous.
>     This is the recommended option: where VFIO is available, it is a better
>     choice than igb_uio.
> 
>     2. Use bigger page sizes. Strictly speaking, this isn't a solution, as
>     memory would be fragmented too, but a standalone 1GB-long segment is way
>     more useful than a standalone 2MB-long segment (boot-time example below).
> 
>     3. Reboot (as you have done), or maybe try re-reserving all pages? E.g.
>     (see the example commands after this list):
>     i. Clean your hugetlbfs contents to free any leftover pages
>     ii. echo 0 > /sys/kernel/mm/hugepages/hugepages-<size>/nr_hugepages
>     iii. echo 512 > /sys/kernel/mm/hugepages/hugepages-<size>/nr_hugepages
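>     For example (a sketch; the 2MB page size, the page count, and the
>     hugetlbfs mount point are assumptions to adapt to your system):
> 
>       rm -f /dev/hugepages/*   # i. free leftover pages; nothing must be using them
>       echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>       echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> 
>     For option 2, 1GB pages are reserved at boot via the kernel command line,
>     e.g. default_hugepagesz=1G hugepagesz=1G hugepages=1.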
> 
>     Alternatively, if you upgrade to OVS 2.11 it will use DPDK 18.11. This
>     would make a difference, as since DPDK 18.05 we don't require
>     PA-contiguous segments any more.
> 
>     I would also question why these pages are in the regular page cache in
>     the first place. Are transparent hugepages enabled?
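>     You can check (and, if needed, disable) THP like so; the sysfs path
>     below is the usual location, though some distros place it elsewhere:
> 
>       cat /sys/kernel/mm/transparent_hugepage/enabled
>       echo never > /sys/kernel/mm/transparent_hugepage/enabled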
> 
>     HTH
>     Ian
> 
>     >
>     > Please either increase it or request less amount of memory.
>     >
>     > 2019-03-18T13:32:41.112Z|00016|dpdk|ERR|EAL: Cannot init memory
>     >
>     > 2019-03-18T13:32:41.128Z|00002|daemon_unix|ERR|fork child died before
>     > signaling startup (killed (Aborted))
>     >
>     > 2019-03-18T13:32:41.128Z|00003|daemon_unix|EMER|could not detach from
>     > foreground session
>     >
>     > Tech Details:
>     >
>     >   * Open vSwitch version: 2.9.2
>     >   * DPDK version: 17.11
>     >   * System has only a single NUMA node.
>     >
>     > This problem is consistently reproducible when there is a relatively
>     > large amount of memory in the buffer/cache (usually around 5GB) and the
>     > buffer is then cleared with the command outlined above.
>     >
>     > On the Internet, I found some posts saying that this is due to memory
>     > fragmentation, but normally I'm not even able to allocate HugePages in
>     > the first place when my memory is already fragmented. In this scenario,
>     > however, the allocation of HugePages works totally fine after clearing
>     > the buffer, so why would they be fragmented?
>     >
>     > A workaround that I know of is a reboot.
>     >
>     > I’d be very grateful about any opinion on that.
> 
>     >
>     > Thank you
>     >
>     > Tobias
>     >
>     >
>     > _______________________________________________
>     > discuss mailing list
>     > discuss at openvswitch.org
>     > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>     >
> 
> 


