Large block size problems and notes...

Поиск
Список
Период
Сортировка
От Sean Chittenden
Тема Large block size problems and notes...
Дата
Msg-id 20031002185815.GE46581@perrin.nxad.com
обсуждение исходный текст
Ответы Re: Large block size problems and notes...  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Sorry, no benchmark results in this post, but I do have a few notes to
pass along for folks:

1) FreeBSD -devel port now has configurable block sizes
2) 65K blocks fail, I think erroneously
3) The size of the postmaster proc and friends explodes to 45MB
4) effective_cache_size is a bad name for a GUC
5) Larger databases with lots of seqscans should use 32K pages, IMHO

1) For the FreeBSD folks in the crowd, it is possible to test non 8K
block sizes with the databases/postgresql-devel port.
       WITH_PGBLOCKSIZE        It's possible to change the blocksize to "16K"                               or "32K"
(defaultsto 8K and different                               block sizes require a dump, initdb, reload!)
 

# cd /usr/ports/databases/postgresql-devel
# make WITH_PGBLOCKSIZE=32K

Should be easy enough for folks to do.  Values of 16K and 32K will
change the block size values.  For those with the paranoid hat on,
I've also included the following message as a warning:

======================================================================       * DEFAULT BLOCK SIZE CONFIGURABLE *

Starting with 7.4, PostgreSQL's block size for the -devel port can be
changed from the default 8K blocks to either 16K or 32K blocks by
setting WITH_PGBLOCKSIZE to either "16K" or "32K".  Changing block
sizes has uncertain performance implications and can result in faster
operation, in other cases slower.  Benchmarking and testing your
installation is *highly recommended* before using these values in any
form of production!  When changing block sizes, it is necessary to
dump, initdb, and reload data when changing block sizes.  Be careful
when switching from databases/postgresql-devel and
databases/postgresql7 or from databases/postgresql-devel -> src
tarball builds that have different block sizes!
======================================================================



2) While testing the above, I noted that 65K blocks fail and 32K is as
big as PostgreSQL can handle, for one reason or another.  When you try
to initdb (postmaster dies in RelationGetBufferForTuple() in
src/backend/access/heap/hio.c, on line 274):

### BEGIN ###
Running in noclean mode.  Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "sean".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory
/usr/ports/databases/postgresql-devel/work/postgresql-7.4.b3.2003.09.30/src/test/regress/./tmp_check/data...ok
 
creating directory
/usr/ports/databases/postgresql-devel/work/postgresql-7.4.b3.2003.09.30/src/test/regress/./tmp_check/data/base...ok
 
creating directory
/usr/ports/databases/postgresql-devel/work/postgresql-7.4.b3.2003.09.30/src/test/regress/./tmp_check/data/global...ok
 
creating directory
/usr/ports/databases/postgresql-devel/work/postgresql-7.4.b3.2003.09.30/src/test/regress/./tmp_check/data/pg_xlog...ok
 
creating directory
/usr/ports/databases/postgresql-devel/work/postgresql-7.4.b3.2003.09.30/src/test/regress/./tmp_check/data/pg_clog...ok
 
selecting default shared_buffers... 1000
selecting default max_connections... 100
creating configuration files... ok
creating template1 database in
/usr/ports/databases/postgresql-devel/work/postgresql-7.4.b3.2003.09.30/src/test/regress/./tmp_check/data/base/1...
PANIC: tuple is too big: size 268
 
Abort trap (core dumped)
### END ###

I'm not sure if this is a case of over-testing or not, however.
Someone with more PG heap foo than me will have to comment on this,
but, it seems as though the test at the bottom of
RelationGetBufferForTuple():
       if (len > PageGetFreeSpace(pageHeader))       {               /* We should not get here given the test at the
top*/               elog(PANIC, "tuple is too big: size %lu", (unsigned long) len);       }
 

should be changed to:
       if (len > MaxTupleSize)

based off of the comments.  *shrug* I've spent a few min looking at
the code, and I'm not sure where the problem is.  Jan or Tom?


3) Using 32K blocks, the postmaster takes a whopping 45MB of RAM!  From  top(1):
 539 pgsql           96    0 44312K    12K select   0:00  0.00%  0.00% postgre 548 pgsql           96    0 44340K
12Kselect   0:00  0.00%  0.00% postgre 547 pgsql           96    0 45284K    12K select   0:00  0.00%  0.00% postgre
 

Nifty, huh?  I haven't been able to figure out where the bloat is
coming from other than to mention that in production, with 8K blocks,
postmaster processes only take up 8MB in RAM.  Anyone know where the
extra is coming from? I can only assume it's from some kind of page
cache, but I don't see where that's being set.  Even then, I'd expect
only a 4X increase in proportion from going from 8K to 32K, not a 5x
increase.


4) effective_cache_size needs to be tuned accordingly because it is  based off of pages, not bytes, which I think isn't
portableacross  different installations with differing block sizes.  effective_cache_size should be
effective_cache_pagesand  effective_cache_size should be done in bytes and dynamically  calculate the
effective_cache_pagesthat way the same config  applies across multiple, different block-size backends.
 


5) On the plus side, the 32K blocks have completely saturated my HDD  IO on the machines that I have tested, which is
niceeven though  the backend is larger.  I'd think for large data warehouse  operations, 32K pages would be the way to
go,esp when it comes  time to run reports.  With 8K pages, I couldn't max out the IO, but  with 32K, it's very possible
andmuch easier to do.  Beyond that, I  don't have much else I want to comment on at the moment.
 

-sc

-- 
Sean Chittenden


В списке pgsql-hackers по дате отправления:

Предыдущее
От: James Rogers
Дата:
Сообщение: Re: Index/Function organized table layout
Следующее
От: Bruno Wolff III
Дата:
Сообщение: Re: Thoughts on maintaining 7.3