Re: munmap() failure due to sloppy handling of hugepage size

From: Merlin Moncure
Subject: Re: munmap() failure due to sloppy handling of hugepage size
Msg-id: CAHyXU0wB7oT58jSzYniy7df7bwQauyt=TVNKvsHvu1eRPSaMDQ@mail.gmail.com
In reply to: Re: munmap() failure due to sloppy handling of hugepage size  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: munmap() failure due to sloppy handling of hugepage size  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Wed, Oct 12, 2016 at 5:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> According to
>>> https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
>>> looking into /proc/meminfo is the longer-standing API and thus is
>>> likely to work on more kernel versions.  Also, if you look into
>>> /sys then you are going to see multiple possible values and it's
>>> not clear how to choose the right one.
>
>> I'm not sure that this is the best rationale.  In my system there are
>> 2MB and 1GB huge page sizes; in systems with lots of memory (let's say 8
>> GB of shared memory is requested) it seems a clear winner to allocate 8
>> 1GB hugepages than 4096 2MB hugepages because the page table is so much
>> smaller.  The /proc interface only shows the 2MB page size, so if we go
>> that route we'd not be getting the full benefit of the feature.
>
> And you'll tell mmap() which one to do how exactly?  I haven't found
> anything explaining how applications get to choose which page size applies
> to their request.  The kernel document says that /proc/meminfo reflects
> the "default" size, and I'd assume that that's what we'll get from mmap.

hm. for (recent) linux, I see:
      MAP_HUGE_2MB, MAP_HUGE_1GB (since Linux 3.8)
             Used in conjunction with MAP_HUGETLB to select alternative
             hugetlb page sizes (respectively, 2 MB and 1 GB) on systems
             that support multiple hugetlb page sizes.

             More generally, the desired huge page size can be configured
             by encoding the base-2 logarithm of the desired page size in
             the six bits at the offset MAP_HUGE_SHIFT.  (A value of zero
             in this bitfield provides the default huge page size; the
             default huge page size can be discovered via the Hugepagesize
             field exposed by /proc/meminfo.)  Thus, the above two
             constants are defined as:

                 #define MAP_HUGE_2MB    (21 << MAP_HUGE_SHIFT)
                 #define MAP_HUGE_1GB    (30 << MAP_HUGE_SHIFT)

             The range of huge page sizes that are supported by the system
             can be discovered by listing the subdirectories in
             /sys/kernel/mm/hugepages.
 


via: http://man7.org/linux/man-pages/man2/mmap.2.html#NOTES
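
To make the encoding in the quoted man page text concrete, here is a minimal sketch of computing the size bits for an arbitrary power-of-two page size. The helper name huge_size_flag is hypothetical, and the fallback value 26 for MAP_HUGE_SHIFT is an assumption taken from the Linux uapi headers for the case where the libc headers do not define it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallback for older headers; Linux defines MAP_HUGE_SHIFT as 26. */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif

/*
 * Hypothetical helper: return the mmap() flag bits that select a huge
 * page of the given size, i.e. log2(page_size) << MAP_HUGE_SHIFT.
 * The result is meant to be OR'ed together with MAP_HUGETLB.
 */
static int
huge_size_flag(size_t page_size)
{
    int log2_size = 0;

    /* Only a non-zero power of two can be encoded this way. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Count how many times the size can be halved: that is log2. */
    while ((page_size >>= 1) != 0)
        log2_size++;

    return (int) ((unsigned int) log2_size << MAP_HUGE_SHIFT);
}
```

For 2 MB and 1 GB this yields the same bits as MAP_HUGE_2MB and MAP_HUGE_1GB above. A caller would pass something like MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | huge_size_flag(size) to mmap(), with a length that is a multiple of that size.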

ISTM all this silliness is pretty much unique to Linux anyways.
Instead of reading the filesystem, what about doing a test map and test
unmap?  I think we could zero in on the default page size by probing
known possible values.
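
A rough sketch of what one such probe might look like, as a variation that tests an explicit size rather than the default. The function name probe_huge_page is hypothetical, the fallback value for MAP_HUGE_SHIFT is assumed from the Linux uapi headers, and this is not code from PostgreSQL. It maps exactly one huge page of the candidate size and then unmaps it again.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallback for older headers; Linux defines MAP_HUGE_SHIFT as 26. */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif

/*
 * Hypothetical probe: try to map and unmap exactly one huge page of
 * 2^log2_size bytes.  Returns 1 if both calls succeed, 0 otherwise.
 */
static int
probe_huge_page(int log2_size)
{
#ifdef MAP_HUGETLB
    size_t  size = (size_t) 1 << log2_size;
    int     flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB |
                    (int) ((unsigned int) log2_size << MAP_HUGE_SHIFT);
    void   *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);

    if (ptr == MAP_FAILED)
        return 0;       /* size not supported, or no such pages available */

    /* The length is exactly one page, so it is aligned to the page size. */
    return munmap(ptr, size) == 0;
#else
    (void) log2_size;
    return 0;           /* no MAP_HUGETLB on this platform */
#endif
}
```

Calling probe_huge_page(21) and probe_huge_page(30) would test 2 MB and 1 GB pages. A failure does not distinguish "size not supported" from "no pages of that size reserved", and a success only shows that the size is usable, not that it is the default. Zero size bits still mean the default size.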

merlin


