Discussion: Re: Doc tweak for huge_pages?
On 11/30/17 23:35, Thomas Munro wrote:
> On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>> On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
>>> Hi hackers,
>>>
>>> The manual implies that only Linux can use huge pages. That is not
>>> true: FreeBSD, Illumos and probably others support larger page sizes
>>> using transparent page coalescing algorithms. On my FreeBSD box
>>> procstat -v often shows PostgreSQL shared buffers in "S"-flagged
>>> memory. I think we should adjust the manual to make clear that it's
>>> the *explicit request for huge pages* that is supported only on Linux
>>> (and hopefully soon Windows). Am I being too pedantic?
>>
>> I suggest to remove "other" and include Linux in the enumeration, since it also
>> supports "transparent" hugepages.
>
> Hmm. Yeah, it does, but apparently it's not so transparent. So if we
> mention that we'd better indicate in the same paragraph that you
> probably don't actually want to use it. How about the attached?

Part of the confusion is that the huge_pages setting is only for shared
memory, whereas the kernel settings affect all memory. Is the same true
for the proposed Windows patch?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Dec 2, 2017 at 4:08 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 11/30/17 23:35, Thomas Munro wrote:
>> On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>>> On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
>>>> Hi hackers,
>>>>
>>>> The manual implies that only Linux can use huge pages. That is not
>>>> true: FreeBSD, Illumos and probably others support larger page sizes
>>>> using transparent page coalescing algorithms. On my FreeBSD box
>>>> procstat -v often shows PostgreSQL shared buffers in "S"-flagged
>>>> memory. I think we should adjust the manual to make clear that it's
>>>> the *explicit request for huge pages* that is supported only on Linux
>>>> (and hopefully soon Windows). Am I being too pedantic?
>>>
>>> I suggest to remove "other" and include Linux in the enumeration, since it also
>>> supports "transparent" hugepages.
>>
>> Hmm. Yeah, it does, but apparently it's not so transparent. So if we
>> mention that we'd better indicate in the same paragraph that you
>> probably don't actually want to use it. How about the attached?
>
> Part of the confusion is that the huge_pages setting is only for shared
> memory, whereas the kernel settings affect all memory.

Right. And more specifically, just the main shared memory area, not
DSM segments. Updated to make this point.

(I have wondered whether DSM segments should respect this GUC: it
seems plausible that they should when the size is a multiple of the
huge page size, so that very large DSA areas finish up mostly backed
by huge pages, so that very large shared hash tables would benefit
from lower TLB miss rates. I have only read in an academic paper that
this is a good idea; I haven't investigated whether that would really
help us in practice, and the first problem is that Linux shm_open
doesn't support huge pages anyway, so you'd need one of the other DSM
implementation options, which are currently non-default.)

> Is the same true
> for the proposed Windows patch?

Yes. It adds a flag to the request for the main shared memory area
(after jumping through various permissions hoops).

--
Thomas Munro
http://www.enterprisedb.com
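The parenthetical above hinges on a segment size being "a multiple of the huge page size". As a rough illustration only (this is not PostgreSQL code, and the helper names are made up for this sketch), the check could look like the following, assuming the default huge page size is taken from the `Hugepagesize:` line that Linux reports in /proc/meminfo:

```python
import re


def huge_page_size_bytes(meminfo_text):
    """Return the default huge page size in bytes, or None if not reported.

    Expects text in the format of Linux /proc/meminfo, which contains a
    line such as "Hugepagesize:       2048 kB".
    """
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else None


def is_huge_page_multiple(segment_bytes, huge_page_bytes):
    """True if a segment of this size could be backed entirely by huge pages."""
    return segment_bytes > 0 and segment_bytes % huge_page_bytes == 0


def round_up_to_huge_page(segment_bytes, huge_page_bytes):
    """Round a requested size up to the next huge page boundary."""
    return -(-segment_bytes // huge_page_bytes) * huge_page_bytes
```

On a real Linux system one would pass the contents of /proc/meminfo to `huge_page_size_bytes`; whether DSM segments should actually be rounded or aligned this way is exactly the open question raised in the message.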
On Fri, Dec 1, 2017 at 10:09 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
>> On 11/30/17 23:35, Thomas Munro wrote:
>>> Hmm. Yeah, it does, but apparently it's not so transparent. So if we
>>> mention that we'd better indicate in the same paragraph that you
>>> probably don't actually want to use it. How about the attached?

Here's a review for v3.

I find that the first paragraph is an improvement as it's more precise.

What I didn't like about the second paragraph is that it pointed out
Linux transparent huge pages too favorably while they are actually
known to cause big (huge?, pardon the pun) issues (as witnessed in
this thread as well). v3 basically says "in Linux it can be
transparent or explicit and explicit is faster than transparent".
Reading that, and seeing that explicit needs tweaking of kernel
parameters and so on, one might very well conclude "I'll use the
slightly-slower-but-still-better-than-nothing transparent version".

So I tried to redo the second paragraph and ended up with the
attached. Rationale for the changes:
* changed "this feature" to "explicitly requesting huge pages" to
contrast with the automatic one described below
* made the wording of Linux THP more negative (but still with some
wiggle room for future kernel versions which might improve THP),
contrasting with the positive explicit request from this GUC
* integrated your mention of other OSes with automatic huge pages
* moved the new text to the last paragraph to lower its importance

What do you think?
On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> On Fri, Dec 1, 2017 at 10:09 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>>> On 11/30/17 23:35, Thomas Munro wrote:
>>>> Hmm. Yeah, it does, but apparently it's not so transparent. So if we
>>>> mention that we'd better indicate in the same paragraph that you
>>>> probably don't actually want to use it. How about the attached?
>
> Here's a review for v3.

Thanks!

> I find that the first paragraph is an improvement as it's more precise.
>
> What I didn't like about the second paragraph is that it pointed out
> Linux transparent huge pages too favorably while they are actually
> known to cause big (huge?, pardon the pun) issues (as witnessed in
> this thread as well). v3 basically says "in Linux it can be
> transparent or explicit and explicit is faster than transparent".
> Reading that, and seeing that explicit needs tweaking of kernel
> parameters and so on, one might very well conclude "I'll use the
> slightly-slower-but-still-better-than-nothing transparent version".
>
> So I tried to redo the second paragraph and ended up with the
> attached. Rationale for the changes:
> * changed "this feature" to "explicitly requesting huge pages" to
> contrast with the automatic one described below
> * made the wording of Linux THP more negative (but still with some
> wiggle room for future kernel versions which might improve THP),
> contrasting with the positive explicit request from this GUC
> * integrated your mention of other OSes with automatic huge pages
> * moved the new text to the last paragraph to lower its importance
>
> What do you think?

I don't know enough about this to make such a strong recommendation
myself, which is why I was only trying to report that bad performance
had been observed on some version, not that you shouldn't do it. Any
other views on this stronger statement?

--
Thomas Munro
http://www.enterprisedb.com
On 12/1/17 10:08, Peter Eisentraut wrote:
> Part of the confusion is that the huge_pages setting is only for shared
> memory, whereas the kernel settings affect all memory. Is the same true
> for the proposed Windows patch?

Btw., I'm kind of hoping that the Windows patch would be committed
first, so that we don't have to rephrase this whole thing again after that.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
>> So I tried to redo the second paragraph and ended up with the
>> attached. Rationale for the changes:
>> * changed "this feature" to "explicitly requesting huge pages" to
>> contrast with the automatic one described below
>> * made the wording of Linux THP more negative (but still with some
>> wiggle room for future kernel versions which might improve THP),
>> contrasting with the positive explicit request from this GUC
>> * integrated your mention of other OSes with automatic huge pages
>> * moved the new text to the last paragraph to lower its importance
>>
>> What do you think?
>
> I don't know enough about this to make such a strong recommendation
> myself, which is why I was only trying to report that bad performance
> had been observed on some version, not that you shouldn't do it. Any
> other views on this stronger statement?

Now that the Windows huge pages patch has landed, here is a rebase. I
took your alternative and tweaked it a tiny bit more. Thoughts?

--
Thomas Munro
http://www.enterprisedb.com
On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> > I don't know enough about this to make such a strong recommendation
> > myself, which is why I was only trying to report that bad performance
> > had been observed on some version, not that you shouldn't do it. Any
> > other views on this stronger statement?
>
> Now that the Windows huge pages patch has landed, here is a rebase. I
> took your alternative and tweaked it a tiny bit more. Thoughts?

+     <para>
+      Note that, besides explicitly requesting huge pages via
+      <varname>huge_pages</varname>,
=> I would just say:
"Note that, besides huge pages requested explicitly, ..."

+      In Linux this automatic use is
=> ON Linux comma?

+      called "transparent huge pages" and is not enabled by default in
+      popular distributions as of the time of writing, but since transparent

=> really? I don't know if I've ever seen it not enabled. In any case,
that's a strong statement to make (to be disabled in ALL popular distributions).

I checked all our servers, including centos6 and ubuntu t-LTS and x-LTS. On a
limited few where it was disabled, I'd explicitly done so.

On a server on which I just installed ubuntu-x LTS, with 4.13.0-26-generic:
pryzbyj@gta-ubuntu:~$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never

https://github.com/torvalds/linux/commit/13ece886d99cd668483113f7238e419d5331af26
=> the compile time default is to disable, but (if enabled at compile time),
the runtime default is "always".

On centos7
Linux template0 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
$ grep TRANS /boot/config-3.10.0-693.11.6.el7.x86_64
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set

https://blog.nelhage.com/post/transparent-hugepages/
=> It is enabled ("enabled=always") by default in most Linux distributions.

Justin
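The sysfs output shown above marks the active THP mode with square brackets ("always [madvise] never" means madvise is in effect). A minimal sketch of reading that value programmatically follows; the function names are illustrative, and the only thing assumed about the system is the sysfs path and output format quoted in the message:

```python
import re
from pathlib import Path

# Path quoted in the message above; it exists only on Linux kernels built with THP.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(sysfs_text):
    """Return the bracketed (active) mode, e.g. 'madvise', or None if absent."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    return match.group(1) if match else None


def read_thp_mode(path=THP_ENABLED_PATH):
    """Read the active THP mode from sysfs; None if the file cannot be read."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        # Non-Linux system, or a kernel compiled without THP support.
        return None
```

Calling `read_thp_mode()` on each server gives the same answer as the `cat` commands above, which makes it easier to survey a fleet before making claims about distribution defaults.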
On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
>> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
>> > I don't know enough about this to make such a strong recommendation
>> > myself, which is why I was only trying to report that bad performance
>> > had been observed on some version, not that you shouldn't do it. Any
>> > other views on this stronger statement?
>>
>> Now that the Windows huge pages patch has landed, here is a rebase. I
>> took your alternative and tweaked it a tiny bit more. Thoughts?
>
> +     <para>
> +      Note that, besides explicitly requesting huge pages via
> +      <varname>huge_pages</varname>,
> => I would just say:
> "Note that, besides huge pages requested explicitly, ..."

+1

> +      In Linux this automatic use is
> => ON Linux comma?

+1

> +      called "transparent huge pages" and is not enabled by default in
> +      popular distributions as of the time of writing, but since transparent
>
> => really? I don't know if I've ever seen it not enabled. In any case,
> that's a strong statement to make (to be disabled in ALL popular distributions).

Argh.

> https://blog.nelhage.com/post/transparent-hugepages/
> => It is enabled ("enabled=always") by default in most Linux distributions.

Sorry, right, that was 100% wrong. It would probably be correct to
remove the "not", but let's just remove that bit. New version
attached.

Thanks.

--
Thomas Munro
http://www.enterprisedb.com
On Mon, Jan 22, 2018 at 07:10:33PM +1300, Thomas Munro wrote:
> On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
> >> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
> >> <thomas.munro@enterprisedb.com> wrote:
> >> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> >> > I don't know enough about this to make such a strong recommendation
> >> > myself, which is why I was only trying to report that bad performance
> >> > had been observed on some version, not that you shouldn't do it. Any
> >> > other views on this stronger statement?
> >>
> >> Now that the Windows huge pages patch has landed, here is a rebase. I
> >> took your alternative and tweaked it a tiny bit more. Thoughts?
>
> Sorry, right, that was 100% wrong. It would probably be correct to
> remove the "not", but let's just remove that bit. New version
> attached.

+      <productname>PostgreSQL</productname>. On Linux, this is called
+      "transparent huge pages", but since that feature is known to cause
+      performance degradation with
+      <productname>PostgreSQL</productname> on current Linux versions
+      (unlike explicit use of <varname>huge_pages</varname>), its use is
+      discouraged.

Consider this shorter, less-severe sounding alternative:
"... (but note that this feature can degrade performance of some
<productname>PostgreSQL</productname> workloads)."

Justin
On Mon, Jan 22, 2018 at 7:23 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> Consider this shorter, less-severe sounding alternative:
> "... (but note that this feature can degrade performance of some
> <productname>PostgreSQL</productname> workloads)."

I think the patch looks good now.

As Justin mentions, as far as I see the only arguable piece is how
strong the language should be against Linux THP.

On one hand it can be argued that warning about THP issues is not the
job of this patch. On the other hand this patch does say more about
THP, and Googling does bring up a lot of trouble and advice to disable
THP, including:

https://www.postgresql.org/message-id/CANQNgOrD02f8mR3Y8Pi=zFsoL14RqNQA8hwz1r4rSnDLr1b2Cw@mail.gmail.com
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-transhuge

The Red Hat article above says "However, THP is not recommended for
database workloads."

I'll leave this to the committer and switch this patch to Ready for Committer.

By the way, Fedora 27 does disable THP by default; they deviate from
upstream in this regard:

[catalin@fedie scripts]$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
[catalin@fedie scripts]$ grep TRANSPARENT /boot/config-4.14.13-300.fc27.x86_64
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y

When I have some time I'll try to do some digging into the history of
the Fedora kernel package to see if they provide a rationale for changing
the default. That might hint whether it's likely that future RHEL will
change as well.
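The `grep` output above shows how a distribution's compile-time THP default can be read from the kernel config file. The following is a small sketch of the same check, assuming only the three option names that appear in the config excerpts quoted in this thread; the function name and its return labels are illustrative:

```python
def thp_compile_time_default(config_text):
    """Classify a kernel config as 'always', 'madvise', 'unavailable' or 'unknown'.

    config_text is the content of a file such as /boot/config-<release>.
    """
    enabled = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # "# CONFIG_FOO is not set" lines are comments, i.e. the option is off.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value == "y":
            enabled.add(name)

    if "CONFIG_TRANSPARENT_HUGEPAGE" not in enabled:
        return "unavailable"  # THP not compiled in at all
    if "CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS" in enabled:
        return "always"
    if "CONFIG_TRANSPARENT_HUGEPAGE_MADVISE" in enabled:
        return "madvise"
    return "unknown"
```

This reports only the built-in default. The runtime value in /sys/kernel/mm/transparent_hugepage/enabled can still be changed by an administrator, so the two may legitimately differ.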
On Tue, Jan 23, 2018 at 7:13 PM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> By the way, Fedora 27 does disable THP by default, they deviate from
> upstream in this regard:

> When I have some time I'll try to do some digging into history of the
> Fedora kernel package to see if they provide a rationale for changing
> the default. That might hint whether it's likely that future RHEL will
> change as well.

I see Peter assigned himself as committer, so here is some more
information for him to decide on the strength of the anti-THP message.

commit 9a031d5070d9f8f5916c48637bd0c237cd52eaf9
Author: Josh Boyer <jwboyer@redhat.com>
Date:   Thu Mar 27 18:31:16 2014 -0400

    Switch to CONFIG_TRANSPARENT_HUGEPAGE_MADVISE instead of always on

    The benefit of THP has been somewhat questionable overall for a while,
    and it's been known to cause performance issues with some workloads.
    Upstream also considers it to be overly complicated and really not worth
    it on machines with memory in the amounts found on typical desktops/SMB
    servers.

    Switch to using it via madvise, which most applications that care about
    it should likely already be doing.

Debian 9 also seems to default to madvise instead of always.

Digging more into it, there were changes in the 4.6 kernel (released
May 2016) that should improve THP, more precisely:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=444eb2a449ef36fe115431ed7b71467c4563c7f1

This also led Debian to change their default in September 2017 (so
for the future Debian release) back to always, referencing the 444eb2a
improvements:
https://anonscm.debian.org/cgit/kernel/linux.git/commit/debian/changelog?id=611a8e67260e8b8190ab991206a3867681d6df91

Ben Hutchings <ben@decadent.org.uk>  2017-09-29 14:32:09 (GMT)

thp: Enable TRANSPARENT_HUGEPAGE_ALWAYS instead of TRANSPARENT_HUGEPAGE_MADVISE

As advised by Andrea Arcangeli - since commit 444eb2a449ef "mm: thp: set THP
defrag by default to madvise and add a stall-free defrag option" this will
generally be best for performance.

So maybe we should weaken the language against THP. Maybe present the
known facts so far, even if the post-4.6 situation is vague/unknown:
before Linux 4.6 there were repeated reports of THP problems with
Postgres; Linux >= 4.6 might improve things but this isn't confirmed.

And it would be good if somebody could run benchmarks on pre-4.6 and
post-4.6 kernels. I would love to but have no access to big (or
medium) hardware.
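To split a set of hosts into the "pre 4.6" and "post 4.6" groups mentioned above, the kernel release string (as printed by `uname -r`) can be compared against 4.6. This is a heuristic sketch with made-up helper names; it looks only at the upstream version number, so it cannot tell whether a vendor kernel with an older number has had the THP defrag change backported:

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '4.13.0-26-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least_4_6(release):
    """True if the upstream version number is 4.6 or later.

    Heuristic only: vendor kernels may carry backported patches.
    """
    return kernel_version(release) >= (4, 6)
```

Applied to the release strings quoted in this thread, the CentOS 7 kernel (3.10.0-693.11.6.el7.x86_64) falls in the older group, while the Ubuntu 4.13 and Fedora 4.14 kernels fall in the newer one.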
On Wed, Jan 24, 2018 at 07:46:41AM +0100, Catalin Iacob wrote:
> I see Peter assigned himself as committer, some more information below
> for him to decide on the strength of the anti THP message.

Thanks for digging this up!

> And it would be good if somebody could run benchmarks on pre 4.6 and
> post 4.6 kernels. I would love to but have no access to big (or
> medium) hardware.

I should be able to do this, since I have a handful of kernel upgrades on my
todo list. Can you recommend a test? Otherwise I'll come up with something for
pgbench.

But I think any test should be independent of and not influence the doc change
(I don't know anywhere else in the docs which talks about behaviors of specific
kernel versions, which often have vendor patches backpatched anyway).

> So maybe we should weaken the language against THP. Maybe present the
> known facts so far, even if the post 4.6 situation is vague/unknown:
> before Linux 4.6 there were repeated reports of THP problems with
> Postgres, Linux >= 4.6 might improve things but this isn't confirmed.
> And it would be good if somebody could run benchmarks on pre 4.6 and
> post 4.6 kernels. I would love to but have no access to big (or
> medium) hardware.

I think all the details should go elsewhere in the docs; config.sgml already
references this:
https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-HUGE-PAGES
...but it doesn't currently mention "transparent" hugepages.

Justin
On 1/22/18 01:10, Thomas Munro wrote:
> Sorry, right, that was 100% wrong. It would probably be correct to
> remove the "not", but let's just remove that bit. New version
> attached.

Committed that.

I reordered some of the existing material because it seemed to have
gotten a bit out of order with repeated patching.

I also softened the advice against THP just a bit, since that is
apparently still changing all the time.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services