Re: Report: Linux huge pages with Postgres

From Kenneth Marshall
Subject Re: Report: Linux huge pages with Postgres
Date
Msg-id 20101128223038.GA13313@aart.is.rice.edu
In response to Report: Linux huge pages with Postgres  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Report: Linux huge pages with Postgres  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Nov 27, 2010 at 02:27:12PM -0500, Tom Lane wrote:
> We've gotten a few inquiries about whether Postgres can use "huge pages"
> under Linux.  In principle that should be more efficient for large shmem
> regions, since fewer TLB entries are needed to support the address
> space.  I spent a bit of time today looking into what that would take.
> My testing was done with current Fedora 13, kernel version
> 2.6.34.7-61.fc13.x86_64 --- it's possible some of these details vary
> across other kernel versions.
> 
> You can test this with fairly minimal code changes, as illustrated in
> the attached not-production-grade patch.  To select huge pages we have
> to include SHM_HUGETLB in the flags for shmget(), and we have to be
> prepared for failure (due to permissions or lack of allocated
> hugepages).  I made the code just fall back to a normal shmget on
> failure.  A bigger problem is that the shmem request size must be a
> multiple of the system's hugepage size, which is *not* a constant
> even though the test patch just uses 2MB as the assumed value.  For a
> production-grade patch we'd have to scrounge the active value out of
> someplace in the /proc filesystem (ick).
> 

I would expect that you can iterate through the possible sizes pretty
quickly and use the first one that works -- no /proc groveling.
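
To make the idea concrete, here is a rough sketch and not the attached
patch. It rounds the request up to each candidate huge page size in turn,
tries shmget() with SHM_HUGETLB, and falls back to a normal shmget() if
nothing works. The candidate list and the names round_up_to_multiple and
shmget_prefer_huge are assumptions made up for illustration; whether the
kernel rejects a size that is not a multiple of the huge page size may
depend on the kernel version.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000   /* Linux flag value, used only if the headers lack it */
#endif

/* Assumed candidate huge page sizes, smallest first (illustrative list). */
static const size_t hugepage_candidates[] = {
    (size_t) 2 << 20,       /* 2MB */
    (size_t) 4 << 20,       /* 4MB */
    (size_t) 16 << 20       /* 16MB */
};

/* Round size up to the next multiple of unit. */
size_t
round_up_to_multiple(size_t size, size_t unit)
{
    assert(unit > 0);
    return ((size + unit - 1) / unit) * unit;
}

/*
 * Try a huge-page segment at each candidate size, then fall back to a
 * normal segment.  Returns the shmid, or -1 if even the fallback fails.
 * *granted_size receives the size actually requested from the kernel;
 * *is_huge is 1 if SHM_HUGETLB succeeded and 0 otherwise.
 */
int
shmget_prefer_huge(key_t key, size_t size, int flags,
                   size_t *granted_size, int *is_huge)
{
    size_t n = sizeof(hugepage_candidates) / sizeof(hugepage_candidates[0]);
    size_t i;

    for (i = 0; i < n; i++)
    {
        size_t rounded = round_up_to_multiple(size, hugepage_candidates[i]);
        int    shmid = shmget(key, rounded, flags | SHM_HUGETLB);

        if (shmid >= 0)
        {
            *granted_size = rounded;
            *is_huge = 1;
            return shmid;
        }
    }

    /* No candidate worked (permissions, no reserved pages, wrong size). */
    *granted_size = size;
    *is_huge = 0;
    return shmget(key, size, flags);
}
```

A caller would pass something like IPC_CREAT | IPC_EXCL | 0600 as flags and
remove the segment with shmctl(shmid, IPC_RMID, NULL) when finished. Trying
the smallest candidate first keeps the rounding waste low.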

> In addition to the code changes there are a couple of sysadmin
> requirements to make huge pages available to Postgres:
> 
> 1. You have to configure the Postgres user as a member of the group
> that's permitted to allocate hugepage shared memory.  I did this:
> sudo sh -c "id -g postgres >/proc/sys/vm/hugetlb_shm_group"
> For production use you'd need to put this in the PG initscript,
> probably, to ensure it gets re-set after every reboot and before PG
> is started.
> 
Since Postgres would take advantage of huge pages automatically when
they are available, this would be just a normal DBA/admin task.

> 2. You have to manually allocate some huge pages --- there doesn't
> seem to be any setting that says "just give them out on demand".
> I did this:
> sudo sh -c "echo 600 >/proc/sys/vm/nr_hugepages"
> which gave me a bit over 1GB of space reserved as huge pages.
> Again, this'd have to be done over again at each system boot.
> 
Same.
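
For sizing nr_hugepages, the arithmetic is simple enough to sketch. This
assumes the 2MB huge page size used in the test patch; the function names
are made up for illustration. With 2MB pages, 600 pages reserve 1200MB,
which matches the "bit over 1GB" in the quoted text.

```c
#include <assert.h>
#include <stddef.h>

/* Number of huge pages needed to hold request_bytes (rounds up). */
size_t
hugepages_needed(size_t request_bytes, size_t hugepage_bytes)
{
    assert(hugepage_bytes > 0);
    return (request_bytes + hugepage_bytes - 1) / hugepage_bytes;
}

/* Bytes reserved by setting nr_hugepages to nr_pages. */
size_t
hugepages_reserved_bytes(size_t nr_pages, size_t hugepage_bytes)
{
    return nr_pages * hugepage_bytes;
}
```

The shared memory request is larger than shared_buffers alone, so the
admin would want some headroom above the bare minimum page count.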

> For testing purposes, I figured that what I wanted to stress was
> postgres process swapping and shmem access.  I built current git HEAD
> with --enable-debug and no other options, and tested with these
> non-default settings:
>  shared_buffers        1GB
>  checkpoint_segments    50
>  fsync            off
> (fsync intentionally off since I'm not trying to measure disk speed).
> The test machine has two dual-core Nehalem CPUs.  Test case is pgbench
> at -s 25; I ran several iterations of "pgbench -c 10 -T 60 bench"
> in each configuration.
> 
> And the bottom line is: if there's any performance benefit at all,
> it's on the order of 1%.  The best result I got was about 3200 TPS
> with hugepages, and about 3160 without.  The noise in these numbers
> is more than 1% though.
> 
> This is discouraging; it certainly doesn't make me want to expend the
> effort to develop a production patch.  However, perhaps someone else
> can try to show a greater benefit under some other test conditions.
> 
>             regards, tom lane
> 
I would not really expect to see much benefit while the working set
stays within the region that the TLB can cover at the normal page size
with the typical number of TLB entries. 1GB of shared buffers would not
be enough to cause TLB thrashing with most processors. Bump it to 8-32GB
or more, and if the queries also use up TLB entries with local work_mem,
you should see some more value in the patch.
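
One way to put numbers on the sizes involved is to compute how much
address space a given number of TLB entries covers, and how many pages a
region needs, at each page size. The sketch below is only arithmetic; the
entry count a caller passes in is an assumption, since real TLB sizes
differ between processors and TLB levels, and the function names are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Address space covered by `entries` TLB entries at a given page size. */
uint64_t
tlb_reach_bytes(uint64_t entries, uint64_t page_bytes)
{
    return entries * page_bytes;
}

/* Number of pages (and so distinct translations) needed to map a region. */
uint64_t
pages_to_map(uint64_t region_bytes, uint64_t page_bytes)
{
    assert(page_bytes > 0);
    return (region_bytes + page_bytes - 1) / page_bytes;
}
```

For example, a 1GB region takes 262144 pages at 4kB but only 512 pages at
2MB, and the gap grows in proportion as shared_buffers goes to 8-32GB.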

Regards,
Ken

