[ovs-discuss] OVS-DPDK fails after clearing buffer
Burakov, Anatoly
anatoly.burakov at intel.com
Mon Apr 8 08:56:12 UTC 2019
> -----Original Message-----
> From: Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)
> [mailto:tohofman at cisco.com]
> Sent: Friday, April 5, 2019 9:39 PM
> To: bugs at openvswitch.org; Burakov, Anatoly <anatoly.burakov at intel.com>
> Cc: Shriroop Joshi (shrirjos) <shrirjos at cisco.com>; Stokes, Ian
> <ian.stokes at intel.com>
> Subject: Re: [ovs-discuss] OVS-DPDK fails after clearing buffer
>
> Hi Anatoly,
>
> I just wanted to follow up on the issue reported below. (It's already
> been two weeks.)
>
> I don’t really understand the first solution you suggested: use IOVA as VA
> mode. Does that mean I should load the vfio-pci driver before I set dpdk-init
> to true, i.e. do a 'modprobe vfio-pci'? I do actually use vfio-pci, but I
> wait to load it until I actually bind an interface to it.
Hi Tobias,
As far as I can remember, in 18.08, IOVA as VA mode will be enabled if:
0) VFIO is set up (modprobe vfio-pci, IOMMU enabled in the BIOS, etc.)
1) you have *at least one physical device* (otherwise EAL defaults to IOVA as PA mode)
2) *all* of your *physical* devices are bound to vfio-pci
Provided all of this is true, DPDK should run in IOVA as VA mode.
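
For illustration, a minimal sketch of that setup, assuming an Intel platform and using a placeholder PCI address for your NIC (dpdk-devbind.py ships with DPDK):

    # the IOMMU must be enabled, e.g. intel_iommu=on on the kernel command line
    modprobe vfio-pci
    # bind *every* physical device you intend to use to vfio-pci
    dpdk-devbind.py --bind=vfio-pci 0000:02:00.0
    # verify that no DPDK device is left on igb_uio / uio_pci_generic
    dpdk-devbind.py --status
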
Alternatively, newer DPDK releases have an --iova-mode command-line switch which allows forcing IOVA as VA mode where possible, but I'm not sure if 18.08 has it.
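
If your DPDK build does have that switch, a sketch of passing it through OVS via the dpdk-extra option (this only takes effect if the underlying DPDK understands the flag):

    ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="--iova-mode=va"
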
>
> Also, to answer your last question: Transparent HugePages are enabled. I've
> just disabled them and was still able to reproduce the issue.
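>
> For reference, a sketch of how THP can be checked and disabled via the
> standard sysfs interface:
>
>     cat /sys/kernel/mm/transparent_hugepage/enabled
>     echo never > /sys/kernel/mm/transparent_hugepage/enabled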
Unfortunately, I can't be of much help here, as I have not looked into how the Linux VM caches work, let alone what happens when hugepages end up in said cache. I obviously don't know the specifics of your use case, or whether dropping caches is really necessary for it, but a cursory Google search indicates the general sentiment is that you shouldn't drop caches in the first place; it is not considered good practice.
>
> Regards
> Toby
>
>
> On 3/21/19, 12:19 PM, "Ian Stokes" <ian.stokes at intel.com> wrote:
>
> On 3/20/2019 10:37 PM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)
> via discuss wrote:
> > Hello,
> >
>
> Hi,
>
> I wasn't sure at first glance what was happening, so I discussed it with
> Anatoly (Cc'd), who has worked a considerable amount with DPDK memory
> models. Please see the response below on what the suspected issue is.
> Anatoly, thanks for your input on this.
>
> > I want to use Open vSwitch with DPDK enabled. For this purpose, I first
> > allocate 512 HugePages of size 2MB to have a total of 1GB of HugePage
> > memory available for OVS-DPDK. (I don’t set any value for
> > dpdk-socket-mem, so the default value of 1GB is taken.) Then I set
> > dpdk-init=true. This normally works fine.
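> >
> > For reference, a sketch of that sequence, assuming 2MB pages and the
> > standard sysfs path:
> >
> >     echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> >     ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true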
> >
> > However, I have realized that I can’t allocate HugePages from memory
> > that is inside the buff/cache (visible through free -h). To solve
> > this issue, I decided to clear the cache/buffer in Linux before
> > allocating HugePages by running echo 1 > /proc/sys/vm/drop_caches.
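> >
> > For reference, the kernel documentation recommends running sync first,
> > so that dirty pages are written back before the drop:
> >
> >     sync
> >     echo 1 > /proc/sys/vm/drop_caches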
> >
> > After that, allocation of the HugePages still works fine. However, when
> > I then run ovs-vsctl set open_vswitch other_config:dpdk-init=true,
> > the process crashes and inside the ovs-vswitchd.log I observe the
> > following:
> >
> > ovs-vswitchd log output:
> >
> > 2019-03-18T13:32:41.112Z|00015|dpdk|ERR|EAL: Can only reserve 270 pages from 512 requested
> >
> > Current CONFIG_RTE_MAX_MEMSEG=256 is not enough
>
> From the above log it is clear that, after you drop the cache, the
> hugepages’ physical addresses become fragmented, and as a result DPDK
> cannot concatenate pages into segments any more. This leads to a
> one-page-per-segment situation, which makes you run out of memseg
> structures (of which there are only 256). We have no control over what
> addresses we get from the OS, so there’s really no way to “unfragment”
> the pages.
>
> So, the above only happens when
>
> 1) you’re running in IOVA as PA mode (so, using real physical addresses).
> 2) your hugepages are heavily fragmented.
>
> Possible solutions for this are:
>
> 1. Use IOVA as VA mode (so, use VFIO, not igb_uio). This way the pages
> will still be fragmented, but the IOMMU will remap them to be contiguous.
> This is the recommended option: where VFIO is available, it is a better
> choice than igb_uio.
>
> 2. Use bigger page sizes. Strictly speaking, this isn’t a solution, as
> memory would be fragmented too, but a standalone 1GB-long segment is way
> more useful than a standalone 2MB-long segment (see the sketch after
> this list).
>
> 3. Reboot (as you have done), or maybe try re-reserving all pages (a
> concrete sketch follows below), e.g.:
> i. Clean your hugetlbfs contents to free any leftover pages
> ii. echo 0 > /sys/kernel/mm/hugepages/hugepages-<size>/nr_hugepages
> iii. echo 512 > /sys/kernel/mm/hugepages/hugepages-<size>/nr_hugepages
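>
> A concrete sketch of options 2 and 3, assuming 2MB pages and a hugetlbfs
> mount at /dev/hugepages (adjust both to your system):
>
>     # option 3: re-reserve all 2MB pages
>     rm -f /dev/hugepages/*        # free any leftover hugetlbfs files
>     echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>     echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>
>     # option 2: use 1GB pages instead; these are typically reserved at
>     # boot via the kernel command line:
>     #   default_hugepagesz=1G hugepagesz=1G hugepages=1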
>
> Alternatively, if you upgrade to OVS 2.11 it will use DPDK 18.11. This
> would make a difference, since from DPDK 18.05 onwards we no longer
> require PA-contiguous segments.
>
> I would also question why these pages are in the regular page cache in
> the first place. Are transparent hugepages enabled?
>
> HTH,
> Ian
>
> >
> > Please either increase it or request less amount of memory.
> >
> > 2019-03-18T13:32:41.112Z|00016|dpdk|ERR|EAL: Cannot init memory
> >
> > 2019-03-18T13:32:41.128Z|00002|daemon_unix|ERR|fork child died before signaling startup (killed (Aborted))
> >
> > 2019-03-18T13:32:41.128Z|00003|daemon_unix|EMER|could not detach from foreground session
> >
> > Tech Details:
> >
> > * Open vSwitch version: 2.9.2
> > * DPDK version: 17.11
> > * System has only a single NUMA node.
> >
> > This problem is consistently reproducible whenever there is a relatively
> > high amount of memory in the buffer/cache (usually around 5GB) and the
> > buffer is then cleared with the command outlined above.
> >
> > On the Internet, I found some posts saying that this is due to memory
> > fragmentation, but normally I’m not even able to allocate HugePages in
> > the first place when my memory is already fragmented. In this scenario,
> > however, the allocation of HugePages works totally fine after clearing
> > the buffer, so why would the pages be fragmented?
> >
> > A workaround that I know of is a reboot.
> >
> > I’d be very grateful for any opinions on this.
>
> >
> > Thank you
> >
> > Tobias
> >
> >
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
>
>