[ovs-discuss] OVS-DPDK fails after clearing buffer

Ian Stokes ian.stokes at intel.com
Thu Mar 21 19:18:54 UTC 2019


On 3/20/2019 10:37 PM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) 
via discuss wrote:
> Hello,
> 

Hi,

I wasn't sure at first glance what was happening, so I discussed it 
with Anatoly (Cc'd), who has worked a considerable amount with DPDK 
memory models. Please see the response below for what the suspected 
issue is. Anatoly, thanks for your input on this.

> I want to use Open vSwitch with DPDK enabled. For this purpose, I first 
> allocate 512 HugePages of size 2MB to have a total of 1GB of HugePage 
> memory available for OVS-DPDK. (I don’t set any value for 
> */dpdk-socket-mem/ *so the default value of 1GB is taken). Then I set 
> */dpdk-init=true/*. This normally works fine.
> 
> However, I have realized that I can’t allocate HugePages from memory 
> that is inside the buff/cache (visible through */free -h/*). To solve 
> this issue, I decided to clear the cache/buffer in Linux before 
> allocating HugePages by running */echo 1 > /proc/sys/vm/drop_caches/*.
> 
> After that, allocation of the HugePages still works fine. However, when 
> I then run */ovs-vsctl set open_vswitch other_config:dpdk-init=true/* 
> the process crashes and inside the ovs-vswitchd.log I observe the following:
> 
> *ovs-vswitchd log output:*
> 
> 2019-03-18T13:32:41.112Z|00015|dpdk|ERR|EAL: Can only reserve 270 pages 
> from 512 requested
> 
> Current CONFIG_RTE_MAX_MEMSEG=256 is not enough

From the above log it is clear that, after you drop the cache, the 
hugepages’ physical addresses become fragmented: DPDK can no longer 
concatenate adjacent pages into larger segments, so you end up in a 
one-page-per-segment situation and run out of memseg structures, of 
which there are only 256 by default. We have no control over which 
physical addresses we get from the OS, so there’s really no way to 
“unfragment” the pages.
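
For context, that 256 is the compile-time constant 
CONFIG_RTE_MAX_MEMSEG. As a last resort you could raise it and 
rebuild (a sketch, assuming a stock DPDK 17.11 source tree; OVS then 
has to be rebuilt against the new DPDK):

   # check the current limit in the DPDK build configuration
   grep CONFIG_RTE_MAX_MEMSEG config/common_base
   # raise it, e.g. to 512, so every page can sit in its own segment
   sed -i 's/CONFIG_RTE_MAX_MEMSEG=256/CONFIG_RTE_MAX_MEMSEG=512/' config/common_base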

So, the above only happens when:

1) you’re running in IOVA-as-PA mode (using real physical addresses), and
2) your hugepages are heavily fragmented.
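
If you want to double-check which of the two applies on your system, 
something along these lines should tell you (a sketch; the devbind 
path assumes a DPDK 17.11 source tree, and roughly: igb_uio implies 
PA mode, vfio-pci with the IOMMU enabled implies VA mode):

   # which kernel driver each NIC is currently bound to
   ./usertools/dpdk-devbind.py --status
   # whether an IOMMU is present and enabled at all
   dmesg | grep -i -e DMAR -e IOMMU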

Possible solutions for this are:

1. Use IOVA-as-VA mode (i.e. use VFIO rather than igb_uio). The pages 
will still be physically fragmented, but the IOMMU will remap them so 
that they appear contiguous. This is the recommended option; wherever 
VFIO is available it is a better choice than igb_uio (rough commands 
are sketched after this list).

2. Use bigger page sizes. Strictly speaking this isn't a solution, as 
the memory would still be fragmented, but a standalone 1GB-long 
segment is way more useful than a standalone 2MB-long segment (see 
the second sketch below).

3. Reboot (as you have done), or try re-reserving all pages, e.g. 
(a fuller version is sketched below):
i. Clean your hugetlbfs contents to free any leftover pages
ii. echo 0 > /sys/kernel/mm/hugepages/hugepages-<size>kB/nr_hugepages
iii. echo 512 > /sys/kernel/mm/hugepages/hugepages-<size>kB/nr_hugepages
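
For option 1, the switch to VFIO looks roughly like this (a sketch: 
the PCI address is only an example, and the kernel parameters assume 
an Intel IOMMU):

   # 1) enable the IOMMU on the kernel command line, then reboot:
   #      intel_iommu=on iommu=pt
   # 2) load vfio-pci and rebind the NIC to it
   modprobe vfio-pci
   ./usertools/dpdk-devbind.py --bind=vfio-pci 0000:05:00.0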
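
For option 2, a sketch (assuming the CPU supports 1GB pages; these 
have to be reserved on the kernel command line at boot):

   # kernel command line, then reboot:
   #   default_hugepagesz=1G hugepagesz=1G hugepages=1
   # keep OVS at 1GB of socket memory (the default you rely on today)
   ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="1024"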
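
For option 3, with 2MB pages and hugetlbfs mounted at /dev/hugepages 
(and the default EAL file prefix), that would be something like:

   # i. remove DPDK's leftover hugepage files
   rm -f /dev/hugepages/rtemap_*
   # ii./iii. release and re-reserve the pages
   echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
   echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages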
	
Alternatively, if you upgrade to OVS 2.11 it will use DPDK 18.11. This 
would make a difference because, since DPDK 18.05, we no longer 
require PA-contiguous segments.
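
(You can confirm which DPDK a given OVS build is linked against; on 
recent releases the Open_vSwitch table records it, e.g.

   ovs-vsctl get Open_vSwitch . dpdk_version

though the exact output format varies between releases.)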

I would also question why these pages are in the regular page cache in 
the first place. Are transparent hugepages enabled?
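
(A quick way to check, on most distros:

   cat /sys/kernel/mm/transparent_hugepage/enabled

where [always] or [madvise] means THP is in use and [never] means it 
is off.)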

HTH
Ian

> 
> Please either increase it or request less amount of memory.
> 
> 2019-03-18T13:32:41.112Z|00016|dpdk|ERR|EAL: Cannot init memory
> 
> 2019-03-18T13:32:41.128Z|00002|daemon_unix|ERR|fork child died before 
> signaling startup (killed (Aborted))
> 
> 2019-03-18T13:32:41.128Z|00003|daemon_unix|EMER|could not detach from 
> foreground session
> 
> *Tech Details:*
> 
>   * Open vSwitch version: 2.9.2
>   * DPDK version: 17.11
>   * System has only a single NUMA node.
> 
> This problem is consistently reproducible when having a relatively high 
> amount of memory in the buffer/cache (usually around 5GB) and clearing 
> the buffer afterwards with the command outlined above.
> 
> On the Internet, I found some posts saying that this is due to memory 
> fragmentation but normally I’m not even able to allocate HugePages in 
> the first place when my memory is already fragmented. In this scenario 
> however the allocation of HugePages works totally fine after clearing 
> the buffer so why would they be fragmented?
> 
> A workaround that I know of is a reboot.
> 
> I’d be very grateful for any opinion on that.
> 
> Thank you
> 
> Tobias
> 
> 
> _______________________________________________
> discuss mailing list
> discuss at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> 


