Discussion: Has anybody thought about changing BLCKSZ to an option of initdb?
After all, re-running initdb is much easier than rebuilding the whole package. And there seems to be nothing difficult about implementing this. Is that true?
"Jacky Leng" <lengjianquan@163.com> writes: > And there seems nothing diffcult to implement this. Is that true? No. regards, tom lane
On Wed, Mar 11, 2009 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> And there seems nothing difficult to implement this. Is that true?
>
> No.

Eh? There's nothing difficult in implementing it.

But there are a lot of other constants dependent on this value which are currently compile-time constants. The only downside I'm aware of is that with this change they become dynamically calculated values, which might have a CPU cost since they'll be recalculated quite often.

-- greg
Greg Stark <stark@enterprisedb.com> writes:
> On Wed, Mar 11, 2009 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> And there seems nothing difficult to implement this. Is that true?
>>
>> No.

> Eh? There's nothing difficult in implementing it.

> But there are a lot of other constants dependent on this value which
> are currently compile-time constants.

Exactly, and we rely on them being constants, e.g. to size arrays.

There's no free lunch, and in this particular case there is no evidence whatsoever that it'd be worth the trouble to support run-time-variable BLCKSZ.

regards, tom lane
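[Editorial sketch] The point about constants that size arrays can be illustrated with a small, simplified C example. It is not PostgreSQL source: the header size, the slot size, and every struct and function name below are assumptions made up for illustration. It shows a derived limit that the compiler can fold into a literal and use as an array bound, next to a run-time variant where the same limit has to be computed and the array heap-allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified page layout; these values are assumptions. */
#define BLCKSZ            8192  /* fixed when the binary is built */
#define PAGE_HEADER_SIZE  24    /* assumed header size for this sketch */
#define ITEM_SLOT_SIZE    4     /* assumed size of one item slot */

/* Derived constant: the compiler folds this into a literal. */
#define MAX_ITEMS_PER_PAGE ((BLCKSZ - PAGE_HEADER_SIZE) / ITEM_SLOT_SIZE)

/* Compile-time case: the derived constant can size a plain array. */
typedef struct PageScanState
{
    int            nitems;
    unsigned short offsets[MAX_ITEMS_PER_PAGE];
} PageScanState;

/* Run-time case: the limit is a stored value and the array is allocated. */
typedef struct RuntimePageScanState
{
    int             nitems;
    size_t          max_items;
    unsigned short *offsets;
} RuntimePageScanState;

/* Same formula as MAX_ITEMS_PER_PAGE, but evaluated on every call. */
static size_t max_items_for(size_t block_size)
{
    assert(block_size > PAGE_HEADER_SIZE);
    return (block_size - PAGE_HEADER_SIZE) / ITEM_SLOT_SIZE;
}

/* Allocate scan state for a block size known only at run time. */
static RuntimePageScanState *runtime_state_create(size_t block_size)
{
    RuntimePageScanState *st = malloc(sizeof *st);

    if (st == NULL)
        return NULL;
    st->nitems = 0;
    st->max_items = max_items_for(block_size);
    st->offsets = calloc(st->max_items, sizeof *st->offsets);
    if (st->offsets == NULL)
    {
        free(st);
        return NULL;
    }
    return st;
}

/* Release state created by runtime_state_create(). */
static void runtime_state_free(RuntimePageScanState *st)
{
    if (st != NULL)
    {
        free(st->offsets);
        free(st);
    }
}
```

In the compile-time version a `PageScanState` can live on the stack or inside another struct at no extra cost. The run-time version needs an allocation, a stored limit, and a cleanup path, and that change would have to be repeated wherever a derived constant is used as an array bound.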
On Wed, Mar 11, 2009 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@enterprisedb.com> writes:
>> On Wed, Mar 11, 2009 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> And there seems nothing difficult to implement this. Is that true?
>>>
>>> No.
>
>> Eh? There's nothing difficult in implementing it.
>
>> But there are a lot of other constants dependent on this value which
>> are currently compile-time constants.
>
> Exactly, and we rely on them being constants, e.g. to size arrays.
>
> There's no free lunch, and in this particular case there is no evidence
> whatsoever that it'd be worth the trouble to support run-time-variable
> BLCKSZ.

The main advantage would be for circumstances such as the Windows installer, where users are installing precompiled binaries. They don't get an opportunity to choose the block size at all. (Similarly for users of binary-only commercial products such as EDB's, but the Windows installer makes a pretty good argument on its own.)

I think the question hinges on whether there's any real benefit to changing the block size at all. The current situation is that the facility is available for people to test and demonstrate that it's helpful. But there are so many variables -- filesystem type, filesystem block size, RAID array stripe size, OS readahead, database workload -- that nobody's done that kind of testing extensively enough to separate the effects of block size from other effects.

If we had a solid use case for adjusting block size at all, I think we would also need to make it adjustable at initdb time for those binary-only installs. Until we do, leaving the compile-time configuration in for people to experiment with is sufficient.

-- greg
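[Editorial sketch] With a compile-time block size, a precompiled binary can only work with clusters created at that same block size, so the two have to be checked against each other at startup. PostgreSQL records the block size in the cluster's control file for this purpose. The sketch below is only an illustration of such a check: the struct, the function name, and the message wording are hypothetical, not the server's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BLCKSZ 8192 /* block size this binary was built with */

/* Hypothetical stand-in for the per-cluster control data written by initdb. */
typedef struct ClusterControlData
{
    uint32_t blcksz; /* block size the cluster was created with */
} ClusterControlData;

/*
 * Return true if the cluster's recorded block size matches the compiled-in
 * one. On mismatch, write an explanatory message into errbuf and return false.
 */
static bool block_size_matches(const ClusterControlData *ctl,
                               char *errbuf, size_t errlen)
{
    if (ctl->blcksz == BLCKSZ)
        return true;

    snprintf(errbuf, errlen,
             "cluster was initialized with block size %u, "
             "but this binary was built with block size %d",
             (unsigned) ctl->blcksz, BLCKSZ);
    return false;
}
```

An initdb-time option would turn this rejection into configuration: instead of refusing a cluster whose recorded size differs, the server would read the value and derive its limits from it.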
On Wed, Mar 11, 2009 at 01:29:43PM +0000, Greg Stark wrote:
> The main advantage would be for circumstances such as the Windows
> installer where users are installing precompiled binaries. They don't
> get an opportunity to choose the block size at all. (Similarly for
> users of binary-only commercial products such as EDB's but the Windows
> installer makes a pretty good argument on its own).

And all the Linux distributions which ship precompiled binaries. I'm sure there are people who compile postgres themselves, but I think there are more who don't.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
On Sat, 2009-03-14 at 13:53 +0100, Martijn van Oosterhout wrote:
> On Wed, Mar 11, 2009 at 01:29:43PM +0000, Greg Stark wrote:
> > The main advantage would be for circumstances such as the Windows
> > installer where users are installing precompiled binaries. They don't
> > get an opportunity to choose the block size at all. (Similarly for
> > users of binary-only commercial products such as EDB's but the Windows
> > installer makes a pretty good argument on its own).
>
> And all the Linux distributions which ship precompiled binaries. I'm
> sure there are people who compile postgres themselves but I think there
> are more who don't.

I think that is an understatement. I would say 99% of PostgreSQL users do NOT compile from source. Heck, the only time I compile from source is when I need to fix misconfigured defaults in RH packages (which is why we now have RPMs that fix those defaults) or when we have back-patched something for a customer.

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
"Joshua D. Drake" <jd@commandprompt.com> writes: > On Sat, 2009-03-14 at 13:53 +0100, Martijn van Oosterhout wrote: >> On Wed, Mar 11, 2009 at 01:29:43PM +0000, Greg Stark wrote: >> > The main advantage would be for circumstances such as the Windows >> > installer where users are installing precompiled binaries. They don't >> > get an opportunity to choose the block size at all. (Similarly for >> > users of binary-only commercial products such as EDB's but the Windows >> > installer makes a pretty good argument on its own). >> >> And all the linux distributions which ship precompiled binaries. I'm >> sure there are people who compile postgres themselves but I think there >> are more who don't. > > I think that is an understatement. I would say 99% of postgresql users > do NOT compile from source. Heck the only time I compile from source is > when I need to fix mis-configured defaults in RH packages (which is why > we now have rpms that fix those defaults) or when we have back patched > something for a customer. So has anyone here done any experiments with live systems with different block sizes? What were your experiences? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
On Sat, 2009-03-14 at 15:29 +0000, Gregory Stark wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
> > I think that is an understatement. I would say 99% of PostgreSQL users
> > do NOT compile from source. Heck, the only time I compile from source is
> > when I need to fix misconfigured defaults in RH packages (which is why
> > we now have RPMs that fix those defaults) or when we have back-patched
> > something for a customer.
>
> So has anyone here done any experiments with live systems with different block
> sizes? What were your experiences?

I tested with 4k once. The system tanked. This might be a good one for the performance lab.

Joshua D. Drake
Gregory Stark <stark@enterprisedb.com> writes:
> So has anyone here done any experiments with live systems with different block
> sizes? What were your experiences?

That should really have been the *first* question. We are not going to make this a tunable unless there is some pretty strong evidence that it's worth twiddling. Aside from the implementation costs of making it variable, there is the oft-repeated refrain that Postgres has too many configuration knobs already.

regards, tom lane
On Sat, 2009-03-14 at 11:47 -0400, Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
> > So has anyone here done any experiments with live systems with different block
> > sizes? What were your experiences?
>
> That should really have been the *first* question. We are not going to
> make this a tunable unless there is some pretty strong evidence that
> it's worth twiddling. Aside from the implementation costs of making
> it variable, there is the oft-repeated refrain that Postgres has too
> many configuration knobs already.

Well, that "too many knobs" argument doesn't apply to this scenario. Anyone who is making use of these needs those knobs. It is the other 98%, who really just need to crank up half a dozen parameters for PostgreSQL to be blazing fast for them, that make that argument (which is why we should rip everything out of the postgresql.conf).

Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> writes: > On Sat, 2009-03-14 at 11:47 -0400, Tom Lane wrote: >> ... Aside from the implementation costs of making >> it variable, there is the oft repeated refrain that Postgres has too >> many configuration knobs already. > Well that "too many knobs" argument doesn't apply to this scenario etc. > Anyone who is making use of these need those knobs. That's nonsense --- on that argument, any variable no matter how obscure should be exposed as a tunable because there might be somebody somewhere who could benefit from it. You are ignoring the costs to everybody else who don't need it, but still have to study a GUC variable definition and try to figure out whether it needs changing for their usage. Not to mention the people who set it to a bad value and suffer lost performance as a result (cf vacuum_cost_delay). Note that I am not saying "no", I am saying "give us some evidence *first*". The costs in implementation effort and user confusion are certain, the benefits are not. regards, tom lane
Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> So has anyone here done any experiments with live systems with different block
>> sizes? What were your experiences?

Mark tested this back in the OSDL days. His finding on DBT2 was that the right *combination* of OS and PG block sizes gave up to a 5% performance increase, I think. Hardly enough to make it worth the headache of running with non-default PG and non-default Linux block sizes, especially since the wrong combination resulted in a decrease in performance, sometimes dramatically so.

However, at Greenplum I remember determining that larger PG block sizes, if matched with larger filesystem block sizes, did significantly help the performance of data warehouses which do a lot of seq scans -- but that our ceiling of 32K was still too small to really make this work. I don't have the figures for that, though; Luke, are you reading this?

--Josh
Josh Berkus wrote:
> However, at Greenplum I remember determining that larger PG block sizes,
> if matched with larger filesystem block sizes, did significantly help the
> performance of data warehouses which do a lot of seq scans -- but that
> our ceiling of 32K was still too small to really make this work. I
> don't have the figures for that, though; Luke, are you reading this?

And did they study the effect of tuning the kernel's readahead?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Sat, 2009-03-14 at 12:25 -0400, Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
> > On Sat, 2009-03-14 at 11:47 -0400, Tom Lane wrote:
> >> ... Aside from the implementation costs of making
> >> it variable, there is the oft-repeated refrain that Postgres has too
> >> many configuration knobs already.
>
> > Well, that "too many knobs" argument doesn't apply to this scenario.
> > Anyone who is making use of these needs those knobs.
>
> That's nonsense --- on that argument, any variable no matter how obscure
> should be exposed as a tunable because there might be somebody somewhere
> who could benefit from it. You are ignoring the costs to everybody else
> who doesn't need it, but still has to study a GUC variable definition and
> try to figure out whether it needs changing for their usage. Not to
> mention the people who set it to a bad value and suffer lost performance
> as a result (cf. vacuum_cost_delay).

I think you misunderstood me. I wasn't actually arguing for the variable. I was arguing that if the variable were required, those are the people who would need it. I frankly don't see a need for this variable, but again, I think that the performance lab would provide the information we need to make such a determination.

> Note that I am not saying "no", I am saying "give us some evidence
> *first*". The costs in implementation effort and user confusion are
> certain, the benefits are not.

I do not disagree with this.

Sincerely,

Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> wrote: > > So has anyone here done any experiments with live systems with different block > > sizes? What were your experiences? > > I tested with 4k once. The system tanked. This might be a good one for > the performance lab. I'm using 16k blocks for one system. There are tables with 5kB+/row. The perfomance was worst if 8kB blocks because of many TOASTed fields and unusable spaces. There are some users who don't want to recompile postgres because they think recompiled version of postgres are not tested well and not supported by companies and 3rd party tools. Their database designs are bad, of course, but they want to resolve their problem using knobs of databases. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center