Discussion: Moving postgresql.conf tunables into 2003...

From:
Sean Chittenden
Date:

What are the odds of going through and revamping some of the tunables
in postgresql.conf for the 7.4 release?  I was just working with
someone on IRC and on their 7800 RPM IDE drives, their
random_page_cost was ideally suited to be 0.32: a far cry from 4.
Doing so has been a win across the board and the problem query went
from about 40sec (seq scan) down to 0.25ms (using idx, higher than
0.32 resulted in a query time jump to 2sec, and at 0.4 it went back up
to a full seq scan at 40sec).

I know Josh is working on revamping the postgresql.conf file, but
would it be possible to include suggested values for various bits of
hardware and then solicit contributions from admins on this list who
have tuned their DB correctly?

## random_page_cost -- units are one sequential page fetch cost
#random_page_cost = 4           # default - very conservative
#random_page_cost = 0.9         # IDE 5200 RPM, 8MB disk cache
#random_page_cost = 0.3         # IDE 7800 RPM, 4MB disk cache
#random_page_cost = 0.1         # SCSI RAID 5, 10,000RPM, 64MB cache
#random_page_cost = 0.05        # SCSI RAID 1+0, 15,000RPM, 128MB cache
#...

## next_hardware_dependent_tunable....
#hardware_dependent_tunable

I know these tables could get somewhat lengthy or organized
differently, but given the file is read _once_ at _startup_, seen by
thousands of DBAs, is visited at least once for every installation (at
the least to turn on TCP connections), is often the only file other
than pg_hba.conf that gets modified or looked at, this could be a very
nice way of introducing DBAs to tuning PostgreSQL and reducing the
number of people crying "PostgreSQL's slow."  Making postgresql.conf a
clearinghouse for tunable values for various bits of hardware would
be a huge win for the community and would hopefully radically change
this database's perception.  At the top of the file, it would be
useful to include a blurb to the effect of:

# The default values for PostgreSQL are extremely conservative and are
# likely far from ideal for a site's needs.  Included in this
# configuration, however, are _suggested_ values to aid in
# tuning.  The values below are not authoritative, merely contributed
# suggestions from PostgreSQL DBAs and committers who have
# successfully tuned their databases.  Please take these values as
# advisory only and remember that they will very likely have to be
# adjusted according to your site's specific needs.  If you have a
# piece of hardware that isn't mentioned below and have tuned your
# configuration aptly and have found a suggested value that the
# PostgreSQL community would benefit from, please send a description
# of the hardware, the name of the tunable, and the tuned value to
#  to be considered for inclusion in future
# releases.
#
# It should also go without saying that the PostgreSQL Global
# Development Group and its community of committers, contributors,
# administrators, and commercial supporters are absolved from any
# responsibility or liability with regards to the use of its software
# (see this software's license for details).  Any data loss,
# corruption, or performance degradation is the responsibility of the
# individual or group of individuals using/managing this installation.
#
# Hints to DBAs:
#
# *) Set up a regular backup schedule (hint: pg_dump(1)/pg_dumpall(1) +
#    cron(8))
#
# *) Tuning: Use psql(1) to test out values before changing values for
#    the entire database.  In psql(1), type:
#
#    1) SHOW [tunable_name];
#    2) SET [tunable_name] = [value];
#    3) [run query]
#    4) [repeat adjustments as necessary before setting a value here in
#       the postgresql.conf].
#    5) [Send a SIGHUP signal to the backend to have the config values
#       re-read]
#
# *) Never use kill -9 on the backend to shut it down.
#
# *) VACUUM ANALYZE your databases regularly.
#
# *) Use EXPLAIN ANALYZE [query] to tune queries.
#
# *) Read the online documentation at:
#    http://www.postgresql.org/docs/
#
#     -- PostgreSQL Global Development Group

Just a thought.  A bit lengthy, but given that out of the box nearly
every value is set to be extremely conservative (detrimentally so,
especially since the majority of users aren't running PostgreSQL on
embedded devices and are on reasonably new hardware, less than 3 years
old), and the config is only read in once and is generally the only file
viewed by DBAs, it'd make PostgreSQL more competitive in the performance
department if there were some kind of suggested values for various
tunables.  Hearing someone whine, "my PostgreSQL database is slow," is
really getting old when it's really not slow and a lack of tuning is at
fault; lowering the bar to a successful and speedy PostgreSQL
installation would be a win for everyone.  The person I was helping also
had the same data, schema, and query running on MySQL, and the fastest
it could go was 2.7s (about 40M rows in the table).

<gets_off_of_soap_box_to_watch_and_listen/> -sc

--
Sean Chittenden

From:
Rod Taylor
Date:

On Thu, 2003-07-03 at 19:05, Sean Chittenden wrote:
> What are the odds of going through and revamping some of the tunables
> in postgresql.conf for the 7.4 release?  I was just working with
> someone on IRC and on their 7800 RPM IDE drives, their
> random_page_cost was ideally suited to be 0.32: a far cry from 4.

I find it very very hard to believe a random read was cheaper than a
sequential read.  Something is shifty in your testing.

From:
"scott.marlowe"
Date:

On Thu, 3 Jul 2003, Sean Chittenden wrote:

> What are the odds of going through and revamping some of the tunables
> in postgresql.conf for the 7.4 release?  I was just working with
> someone on IRC and on their 7800 RPM IDE drives, their
> random_page_cost was ideally suited to be 0.32: a far cry from 4.
> Doing so has been a win across the board and the problem query went
> from about 40sec (seq scan) down to 0.25ms (using idx, higher than
> 0.32 resulted in a query time jump to 2sec, and at 0.4 it went back up
> to a full seq scan at 40sec).

I'm the guy who advocates settings of 1 to 2, and that still sounds low to
me. :-)  I'm wondering if the effective_cache_size was set properly, as
well as whether enough buffers were allocated.

I generally set effective_cache_size to 100,000 pages (800 megs or so) on
my box, which is where it sits most days.  With this setting I've found
that settings of under 1 are not usually necessary to force the planner to
take the path of righteousness (i.e. the fastest one :-); 1.2 to 1.4 are
optimal to me.

Since theoretically a random_page_cost of 1 means no penalty to move the
heads around, and there's ALWAYS a penalty for moving the heads around, we
have to assume:

1: That either the planner is making poor decisions on some
other variable, and we can whack the planner in the head with a really low
random_page_cost.

OR

2: The other settings are suboptimal (buffers, sort_mem,
effective_cache_size, etc...) and lowering random page costs helps there.

I've always wondered if most performance issues aren't a bit of both.

The answer, of course, is fixing the planner so that a random_page_cost of
anything less than 1 would never be needed, since by design, anything
under 1 represents a computer that likely doesn't exist (in theory of
course.)  A 1 would be a machine that was using solid state hard drives
and had the same cost in terms of OS paths to do random accesses as
sequential.

What constants in the planner, and / or formulas would be the likely
culprits I wonder?  I've wandered through that page and wasn't sure what
to play with.

From:
"Michael Mattox"
Date:

I don't have much to add because I'm pretty new to Postgres and have been
soliciting advice here recently, but I totally agree with everything you
said.  I don't mind if it's in the postgresql.conf file or in a FAQ that is
easy to find, I just would like it to be in one place.  A good example of
the need for this is when I was tuning "effective_cache": I thought it was
creating a cache for Postgres when in fact, as was pointed out to me, it's
just hinting to Postgres about the size of the OS cache.  Lots of ways for
people to get really confused here.
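
Since effective_cache_size is only a hint about how much the OS is
caching, one way to pick a value on Linux is to read the kernel's own
"Cached" figure and convert it to 8 KB PostgreSQL pages.  A minimal
sketch (the parsing helper and the sample text are illustrative, not
part of PostgreSQL):

```python
import re

BLOCK_SIZE_KB = 8  # PostgreSQL's default page size, in kB

def cached_kb(meminfo_text: str) -> int:
    """Pull the 'Cached:' figure (in kB) out of /proc/meminfo-style text."""
    match = re.search(r"^Cached:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1))

def suggest_effective_cache_size(meminfo_text: str) -> int:
    """Express the OS cache as a count of 8 KB PostgreSQL pages."""
    return cached_kb(meminfo_text) // BLOCK_SIZE_KB

# In practice you'd read open("/proc/meminfo").read(); a fixed sample
# keeps the sketch self-contained.
sample = "MemTotal: 1048576 kB\nCached:   819200 kB\nSwapTotal: 0 kB\n"
print(suggest_effective_cache_size(sample))  # 102400 pages, ~800 MB
```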

Also some people have said I should have used MySQL and to be honest I did
consider trying it out.  Because Postgres is hard to tune a lot of people
think it's slower than MySQL.  So anything that improves the quality of the
documentation and makes it easier to tune will improve Postgres' reputation
which will in turn encourage more people to use it!

Michael


From:
Ron
Date:

Hear, hear!
Well said Sean. I know that there has been disagreement about this in
the past (recommending settings, with some very good reasons), however
as a programmer/part-time DBA, something like this would be extremely
useful. Our company recently developed a web-based app for a client who
is very happy with their product (we used postgresql db) and we are just
beginning to revive a standalone app using postgresql instead of db2 as
the database. As I'm the DBA only on a part-time basis it is really time
consuming to have to 1) find all relevant documentation and 2) learn it
sufficiently to try to tune the db properly and 3) forget about most of
it until we set up a new project in another year. I like postgresql and
have convinced two of our clients to use it, but if I could fine tune it
so it could 'fly', it would be easier for me (and others) to get more
people to use it.

Ron St.Pierre

BTW I'm looking forward to Josh's configuration doc.


From:
Tom Lane
Date:

Sean Chittenden <> writes:
> What are the odds of going through and revamping some of the tunables
> in postgresql.conf for the 7.4 release?

I was arguing awhile back for bumping the default shared_buffers up,
but the discussion trailed off with no real resolution.

> I was just working with
> someone on IRC and on their 7800 RPM IDE drives, their
> random_page_cost was ideally suited to be 0.32: a far cry from 4.

It is not physically sensible for random_page_cost to be less than one.
The system only lets you set it there for experimental purposes; there
is no way that postgresql.conf.sample will recommend it.  If you needed
to push it below one to force indexscans, there is some other problem
that needs to be solved.  (I'd wonder about index correlation myself;
we know that that equation is pretty bogus.)
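
Tom's point shows up even in a toy version of the cost arithmetic (the
two-term model, the constants, and the page counts below are invented
for illustration; the real planner also accounts for CPU cost, index
correlation, and caching):

```python
# Drastically simplified planner I/O costing: a sequential scan pays the
# sequential cost for every page in the table, while an index scan pays
# the random cost for each heap page it actually visits.
def seq_scan_cost(table_pages: int, seq_page_cost: float = 1.0) -> float:
    return table_pages * seq_page_cost

def index_scan_cost(pages_fetched: int, random_page_cost: float) -> float:
    return pages_fetched * random_page_cost

table_pages = 100_000
pages_fetched = 30_000  # heap pages touched via the index

# With the default random_page_cost = 4, the seq scan looks cheaper...
print(seq_scan_cost(table_pages) < index_scan_cost(pages_fetched, 4.0))   # True
# ...while a sub-1 setting asserts that a random fetch costs *less* than
# a sequential one, which no spinning disk actually delivers.
print(index_scan_cost(pages_fetched, 0.32) < seq_scan_cost(table_pages))  # True
```

This is why a sub-1 value that "fixes" a plan usually signals an error
elsewhere in the estimates rather than a genuinely cheap disk.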

> I know Josh is working on revamping the postgresql.conf file, but
> would it be possible to include suggested values for various bits of
> hardware and then solicit contributions from admins on this list who
> have tuned their DB correctly?

I think such material belongs in the SGML docs, not hidden away in a
config file that people may not look at...

            regards, tom lane

From:
Brian Hirt
Date:

I'm curious how many of the configuration values can be determined
automatically, or with the help of some script.  It seems like there
could be some perl script in contrib that could help figure this out.
Possibly you are asked a bunch of questions and then the values are
computed based on that.  Something like:

How many tables will the system have?
How much memory will be available to the postmaster?
How many backends will there typically be?
What is the avg seek time of the drive?
What's the transfer rate of the drive?

Seems to me that a lot of reasonable default values can be figured out
from these basic questions.  FSM settings, Sort Mem, Random Page Cost,
Effective Cache Size, Shared Memory, etc, etc.
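
A rough sketch of the questionnaire-driven tuner described above (in
Python rather than perl for brevity; every formula here is an invented
placeholder to show the shape of the idea, not an agreed-upon community
recommendation):

```python
# Map a few interview answers to candidate postgresql.conf values.
# All the ratios below are made-up starting points for discussion.
BLOCK_SIZE = 8192  # bytes per PostgreSQL page

def suggest_settings(mem_bytes: int, backends: int,
                     seek_ms: float, transfer_mb_s: float) -> dict:
    # Time to sequentially read one 8 KB page at the stated transfer rate.
    page_read_ms = BLOCK_SIZE / (transfer_mb_s * 1024 * 1024) * 1000
    return {
        # Give a quarter of available memory to shared buffers (in pages).
        "shared_buffers": mem_bytes // 4 // BLOCK_SIZE,
        # Assume half of memory ends up in the OS cache (in pages).
        "effective_cache_size": mem_bytes // 2 // BLOCK_SIZE,
        # Split a slice of memory across the expected backends (in kB).
        "sort_mem": mem_bytes // 8 // backends // 1024,
        # Random fetch = seek + read, relative to a sequential read.
        "random_page_cost": round((seek_ms + page_read_ms) / page_read_ms, 1),
    }

# 512 MB of RAM, 32 backends, 8.5 ms average seek, 40 MB/s transfer.
print(suggest_settings(512 * 1024 * 1024, 32, 8.5, 40.0))
```

The interesting part is less the formulas than the interface: a handful
of questions any DBA can answer, turned into a consistent set of
starting values.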


On Thursday, July 3, 2003, at 02:14 PM, scott.marlowe wrote:

> On Thu, 3 Jul 2003, Sean Chittenden wrote:
>
>> What are the odds of going through and revamping some of the tunables
>> in postgresql.conf for the 7.4 release?  I was just working with
>> someone on IRC and on their 7800 RPM IDE drives, their
>> random_page_cost was ideally suited to be 0.32: a far cry from 4.
>> Doing so has been a win across the board and the problem query went
>> from about 40sec (seq scan) down to 0.25ms (using idx, higher than
>> 0.32 resulted in a query time jump to 2sec, and at 0.4 it went back up
>> to a full seq scan at 40sec).
>
> I'm the guy who advocates settings of 1 to 2, and that still sounds
> low to
> me. :-)  I'm wondering if the effective_cache_size was set properly, as
> well as there be enough buffers allocated.
>
> I generally set effective cache size to 100,000 pages (800 megs or so)
> on
> my box, which is where it sits most days.  with this setting I've found
> that settings of under 1 are not usually necessary to force the
> planner to
> take the path of righteousness (i.e. the fastest one :-) 1.2 to 1.4 are
> optimal to me.
>
> Since theoretically a random page of of 1 means no penalty to move the
> heads around, and there's ALWAYS a penalty for moving the heads
> around, we
> have to assume:
>
> 1: That either the planner is making poor decisions on some
> other variable, and we can whack the planner in the head with a really
> low
> random page count.
>
> OR
>
> 2: The other settings are suboptimal (buffers, sort_mem,
> effective_cache_size, etc...) and lowering random page costs helps
> there.
>
> I've always wondered if most performance issues aren't a bit of both.
>
> The answer, of course, is fixing the planner so that a
> random_page_cost of
> anything less than 1 would never be needed, since by design, anything
> under 1 represents a computer that likely doesn't exist (in theory of
> course.)  A 1 would be a machine that was using solid state hard drives
> and had the same cost in terms of OS paths to do random accesses as
> sequential.
>
> What constants in the planner, and / or formulas would be the likely
> culprits I wonder?  I've wandered through that page and wasn't sure
> what
> to play with.
>
>> I know Josh is working on revamping the postgresql.conf file, but
>> would it be possible to include suggested values for various bits of
>> hardware and then solicit contributions from admins on this list who
>> have tuned their DB correctly?
>>
>> ## random_page_cost -- units are one sequential page fetch cost
>> #random_page_cost = 4           # default - very conservative
>> #random_page_cost = 0.9         # IDE 5200 RPM, 8MB disk cache
>> #random_page_cost = 0.3         # IDE 7800 RPM, 4MB disk cache
>> #random_page_cost = 0.1         # SCSI RAID 5, 10,000RPM, 64MB cache
>> #random_page_cost = 0.05        # SCSI RAID 1+0, 15,000RPM, 128MB
>> cache
>> #...
>>
>> ## next_hardware_dependent_tunable....
>> #hardware_dependent_tunable
>>
>> I know these tables could get somewhat lengthy or organized
>> differently, but given the file is read _once_ at _startup_, seen by
>> thousands of DBAs, is visited at least once for every installation (at
>> the least to turn on TCP connections), is often the only file other
>> than pg_hba.conf that gets modified or looked at, this could be a very
>> nice way of introducing DBAs to tuning PostgreSQL and reducing the
>> number of people crying "PostgreSQL's slow."  Having postgresql.conf a
>> clearing house for tunable values for various bits of hardware would
>> be a huge win for the community and would hopefully radically change
>> this database's perception.  At the top of the file, it would be
>> useful to include a blurb to the effect of:
>>
>> # The default values for PostgreSQL are extremely conservative and are
>> # likely far from ideal for a site's needs.  Included in this
>> # configuration, however, are _suggested_ values to help aid in
>> # tuning.  The values below are not authoritative, merely contributed
>> # suggestions from PostgreSQL DBAs and committers who have
>> # successfully tuned their databases.  Please take these values as
>> # advisory only and remember that they will very likely have to be
>> # adjusted according to your site's specific needs.  If you have a
>> # piece of hardware that isn't mentioned below and have tuned your
>> # configuration aptly and have found a suggested value that the
>> # PostgreSQL community would benefit from, please send a description
>> # of the hardware, the name of the tunable, and the tuned value to
>> #  to be considered for inclusion in future
>> # releases.
>> #
>> # It should also go without saying that the PostgreSQL Global
>> # Development Group and its community of committers, contributors,
>> # administrators, and commercial supporters are absolved from any
>> # responsibility or liability with regards to the use of its software
>> # (see this software's license for details).  Any data loss,
>> # corruption, or performance degradation is the responsibility of the
>> # individual or group of individuals using/managing this installation.
>> #
>> # Hints to DBAs:
>> #
>> # *) Set up a regular backup schedule (hint: pg_dump(1)/pg_dumpall(1) +
>> #    cron(8))
>> #
>> # *) Tuning: Use psql(1) to test out values before changing values for
>> #    the entire database.  In psql(1), type:
>> #
>> #    1) SHOW [tunable_name];
>> #    2) SET [tunable_name] = [value];
>> #    3) [run query]
>> #    4) [repeat adjustments as necessary before setting a value here
>> #       in the postgresql.conf].
>> #    5) [Send a SIGHUP signal to the backend to have the config values
>> #       re-read]
>> #
>> # *) Never use kill -9 on the backend to shut it down.
>> #
>> # *) VACUUM ANALYZE your databases regularly.
>> #
>> # *) Use EXPLAIN ANALYZE [query] to tune queries.
>> #
>> # *) Read the online documentation at:
>> #    http://www.postgresql.org/docs/
>> #
>> #     -- PostgreSQL Global Development Group
>>
>> Just a thought.  A bit lengthy, but given that out of the box nearly
>> every value is set to be extremely conservative (detrimentally so,
>> especially since the majority of users aren't running PostgreSQL in
>> embedded devices and are on reasonably new hardware, less than 3
>> years old), and given that the config is only read in once and is
>> generally the only file viewed by DBAs, it'd make PostgreSQL more
>> competitive in the performance dept if there were some kind of
>> suggested values for various tunables.  Having someone whine, "my
>> PostgreSQL database is slow" is really getting old when it's really
>> not slow, just untuned; lowering the bar to a successful and speedy
>> PostgreSQL installation would be a win for everyone.  The person who
>> I was helping also had the same data, schema, and query running on
>> MySQL and the fastest it could go was 2.7s (about 40M rows in the
>> table).
>>
>> <gets_off_of_soap_box_to_watch_and_listen/> -sc
>>
>>


From:
Sean Chittenden
Date:

> > What are the odds of going through and revamping some of the
> > tunables in postgresql.conf for the 7.4 release?  I was just
> > working with someone on IRC and on their 7800 RPM IDE drives,
> > their random_page_cost was ideally suited to be 0.32: a far cry
> > from 4.  Doing so has been a win across the board and the problem
> > query went from about 40sec (seq scan) down to 0.25ms (using idx,
> > higher than 0.32 resulted in a query time jump to 2sec, and at 0.4
> > it went back up to a full seq scan at 40sec).
>
> I'm the guy who advocates settings of 1 to 2, and that still sounds
> low to me. :-) I'm wondering if the effective_cache_size was set
> properly, as well as there be enough buffers allocated.
>
> I generally set effective cache size to 100,000 pages (800 megs or
> so) on my box, which is where it sits most days.  With this setting
> I've found that settings of under 1 are not usually necessary to
> force the planner to take the path of righteousness (i.e. the
> fastest one :-).  1.2 to 1.4 are optimal to me.
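
The pages-to-megabytes arithmetic in the quote (PostgreSQL's default 8 KB page size) can be sketched as follows; 100,000 pages comes out just under the "800 megs or so" quoted:

```python
# Convert between effective_cache_size (counted in pages) and megabytes.
# PAGE_SIZE is the default PostgreSQL block size; custom builds may differ.
PAGE_SIZE = 8192  # bytes

def pages_to_mb(pages: int) -> float:
    """Megabytes of cache represented by a page count."""
    return pages * PAGE_SIZE / (1024 * 1024)

def mb_to_pages(mb: float) -> int:
    """Page count to set effective_cache_size for a given cache size."""
    return int(mb * 1024 * 1024 // PAGE_SIZE)

print(pages_to_mb(100_000))  # 781.25 -- "800 megs or so"
print(mb_to_pages(800))      # 102400
```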

This is a nightly report that's run; cache sizes won't impact
performance of the query at all.  The planner was consistently
choosing a sequential scan over using the index until the
random_page_cost was set to 0.32.  After adjustment, the query just
flies (0.25ms @ 0.32 vs. 40s @ >0.4).  Since it's a nightly
report that only gets performed once a day and data is COPY'ed in once
every few minutes, there's a huge amount of data that's not cached, nor
should it be.

> Since theoretically a random_page_cost of 1 means no penalty to move
> the heads around, and there's ALWAYS a penalty for moving the heads
> around, we have to assume:
>
> 1: That either the planner is making poor decisions on some other
> variable, and we can whack the planner in the head with a really low
> random_page_cost.

By all accounts of having played with this query+data, this is the
correct assumption from what I can tell.

> OR
>
> 2: The other settings are suboptimal (buffers, sort_mem,
> effective_cache_size, etc...) and lowering random page costs helps
> there.

None of those other than possibly sort_mem had any impact on the
query, but even then, lower sort_mem doesn't help until the data's
been picked out of the table.  Sorting ~16k of rows is quicker with
more sort_mem.  Higher sort_mem has zero impact on fetching ~16K rows
out of a table with 40M rows of data.  Getting the planner to pick
using the index to filter out data inserted in the last 3 days over
doing a seq scan...  well, I don't know how you could do that without
changing the random_page_cost.  A good thump to the side of the head
would be welcome too if I'm wrong, just make sure it's a good thump
with the appropriate clue-bat.

> I've always wondered if most performance issues aren't a bit of both.

Eh, in my experience, it's generally that random_page_cost needs to be
adjusted to match the hardware, and this value seems to get lower
every year as new hardware arrives.

> The answer, of course, is fixing the planner so that a
> random_page_cost of anything less than 1 would never be needed,
> since by design, anything under 1 represents a computer that likely
> doesn't exist (in theory of course.)  A 1 would be a machine that
> was using solid state hard drives and had the same cost in terms of
> OS paths to do random accesses as sequential.

Well, this could be a bug then, but I'm skeptical.  What's odd to me
is that changing the value between 0.32, 0.33, and 0.4 radically
changes the performance of the query.

> What constants in the planner, and / or formulas would be the likely
> culprits I wonder?  I've wandered through that page and wasn't sure
> what to play with.

random_page_cost should be proportional to the seek time necessary for
the disk to find a page of data on its platters.  It makes sense that
this value, as time progresses, gets smaller as hardware gets faster.
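
One rough way to see that proportionality: compare the time to fetch a page at a random position (seek + rotational latency + transfer) against a page read mid-sequential-scan (transfer only).  The drive figures below are illustrative assumptions, not measurements, and the model ignores all caching, which is exactly why effective tuned values run far below the raw hardware ratio:

```python
def random_page_cost_estimate(seek_ms, rpm, xfer_mb_s, page_kb=8):
    """Naive random/sequential cost ratio for a single spinning disk,
    ignoring OS and controller caching entirely."""
    avg_rotational_ms = 0.5 * 60_000 / rpm          # half a revolution on average
    transfer_ms = page_kb / 1024 / xfer_mb_s * 1000  # time to move one page
    random_ms = seek_ms + avg_rotational_ms + transfer_ms
    return random_ms / transfer_ms                   # sequential page ~= transfer only

# Hypothetical 7200 RPM IDE drive: 9 ms avg seek, 40 MB/s sustained transfer.
print(round(random_page_cost_estimate(9.0, 7200, 40.0), 1))  # ~68
```

The raw ratio is nowhere near 4, let alone 0.32; caching is what drags the effective value down in practice.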

-sc

--
Sean Chittenden

From:
Sean Chittenden
Date:

> I'm curious how many of the configuration values can be determined
> automatically, or with the help of some script.  It seems like there
> could be some perl script in contrib that could help figure this out.
> Possibly you are asked a bunch of questions and then the values are
> computed based on that.  Something like:
>
> How many tables will the system have?
> How much memory will be available to the postmaster?
> How many backends will there typically be?
> What is the avg seek time of the drive?
> What's the transfer rate of the drive?
>
> Seems to me that a lot of reasonable default values can be figured out
> from these basic questions.  FSM settings, Sort Mem, Random Page Cost,
> Effective Cache Size, Shared Memory, etc, etc.
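
A sketch of what such a contrib script might compute from the answers.  Every ratio, cap, and floor below is an invented placeholder for illustration, not a vetted recommendation:

```python
def suggest_settings(ram_mb, max_backends, avg_seek_ms):
    """Toy autotune: derive a few postgresql.conf values from answers
    to the questionnaire above.  All formulas are made-up placeholders."""
    ram_kb = ram_mb * 1024
    return {
        # ~25% of RAM in 8 KB pages, capped so a default SHMMAX still fits
        "shared_buffers": min(ram_kb // 8 // 4, 4096),
        # assume roughly half of RAM ends up as OS disk cache
        "effective_cache_size": ram_kb // 8 // 2,
        # split a slice of RAM across the backends, floor of 512 KB
        "sort_mem": max(ram_kb // (max_backends * 16), 512),
        # crude scaling of the random/sequential ratio off seek time
        "random_page_cost": max(1.0, avg_seek_ms / 2.5),
    }

print(suggest_settings(ram_mb=1024, max_backends=32, avg_seek_ms=9.0))
```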

Someone was working on a thing called pg_autotune or some such program
that'd do exactly what you're thinking of.

http://archives.postgresql.org/pgsql-performance/2002-10/msg00101.php
http://gborg.postgresql.org/project/pgautotune/projdisplay.php


--
Sean Chittenden

From:
Sean Chittenden
Date:

> > What are the odds of going through and revamping some of the
> > tunables in postgresql.conf for the 7.4 release?  I was just
> > working with someone on IRC and on their 7800 RPM IDE drives,
> > their random_page_cost was ideally suited to be 0.32: a far cry
> > from 4.
>
> I find it very very hard to believe a random read was cheaper than a
> sequential read.  Something is shifty in your testing.

This is the procedure used to zero in on the number:

SET random_page_cost = 3;
[run query three times]
SET random_page_cost = 2;
[run query three times]
SET random_page_cost = 1;
[run query three times]
SET random_page_cost = 0.01;  -- verify that this tunable would make
                  -- a difference eventually
[run query three times]
SET random_page_cost = 0.5;
[run query three times]
SET random_page_cost = 0.2;   -- this was the 1st query that didn't
                  -- do a seq scan
[run query three times]
SET random_page_cost = 0.4;   -- back to a seq scan
[run query three times]
SET random_page_cost = 0.3;   -- idx scan, how high can I push the rpc?
[run query three times]
SET random_page_cost = 0.35;  -- interesting, the query time jumped to
                  -- about 0.2s... better than 40s, but not as
                  -- nice as the 0.25ms when the rpc was at 0.3
[run query three times]
SET random_page_cost = 0.32;  -- Sweet, 0.25ms for the query
[run query three times]
SET random_page_cost = 0.33;  -- Bah, back up to 0.2s
[run query three times]
SET random_page_cost = 0.31;  -- Down to 0.25ms, too low
[run query three times]
SET random_page_cost = 0.33;  -- Double check that it wasn't an errant
                  -- performance at 0.33
[run query three times]
SET random_page_cost = 0.32;  -- Double check that 0.32 is the magic number
[run query three times]

[edit postgresql.conf && killall -SIGHUP postmaster]
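
The manual narrowing above is essentially a bisection search over one knob.  Mechanized, it might look like the following, where run_query_ms is a hypothetical stand-in for timing the real query after SET random_page_cost:

```python
def find_threshold(run_query_ms, lo=0.01, hi=4.0, tol=0.01):
    """Bisect for the highest random_page_cost that still gives the fast plan.

    run_query_ms(rpc) stands in for timing the real query after
    SET random_page_cost = rpc; assumed fast below some cutoff, slow above.
    """
    fast = run_query_ms(lo)  # baseline: the index-scan timing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if run_query_ms(mid) <= fast * 10:  # still on the index plan
            lo = mid
        else:
            hi = mid
    return lo

# Toy stand-in: pretend the planner flips to a 40s seq scan above 0.32.
toy = lambda rpc: 0.25 if rpc <= 0.32 else 40_000.0
print(find_threshold(toy))  # converges just below the 0.32 flip point
```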

-sc

--
Sean Chittenden

From:
Sean Chittenden
Date:

> > What are the odds of going through and revamping some of the tunables
> > in postgresql.conf for the 7.4 release?
>
> I was arguing awhile back for bumping the default shared_buffers up,
> but the discussion trailed off with no real resolution.
>
> > I was just working with someone on IRC and on their 7800 RPM IDE
> > drives, their random_page_cost was ideally suited to be 0.32: a
> > far cry from 4.
>
> It is not physically sensible for random_page_cost to be less than
> one.  The system only lets you set it there for experimental
> purposes; there is no way that postgresql.conf.sample will recommend
> it.  If you needed to push it below one to force indexscans, there
> is some other problem that needs to be solved.  (I'd wonder about
> index correlation myself; we know that that equation is pretty
> bogus.)

Could be.  I had him create a multi-column index on the date and a
non-unique, highly redundant id.  This is a production machine, so the
load times are heavier now than they were earlier.  The stats sample
was increased to 1000, too, to see if that made any difference in the
planner's estimations.

mss_masterlog=> SHOW random_page_cost;
 random_page_cost
------------------
 4
(1 row)

mss_masterlog=> EXPLAIN ANALYZE SELECT srca, COUNT(srca) FROM mss_fwevent WHERE
mss_masterlog-> sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL)
mss_masterlog-> AND NOT action GROUP BY srca ORDER BY COUNT DESC LIMIT 20;
                                                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=240384.69..240384.74 rows=20 width=12) (actual time=24340.04..24340.39 rows=20 loops=1)
   ->  Sort  (cost=240384.69..240426.80 rows=16848 width=12) (actual time=24340.02..24340.14 rows=21 loops=1)
         Sort Key: count(srca)
         ->  Aggregate  (cost=237938.36..239201.95 rows=16848 width=12) (actual time=24322.84..24330.73 rows=23
loops=1)
               ->  Group  (cost=237938.36..238780.75 rows=168478 width=12) (actual time=24322.57..24328.45 rows=320
loops=1)
                     ->  Sort  (cost=237938.36..238359.55 rows=168478 width=12) (actual time=24322.55..24324.34
rows=320 loops=1)
                           Sort Key: srca
                           ->  Seq Scan on mss_fwevent  (cost=0.00..223312.60 rows=168478 width=12) (actual
time=24253.66..24319.87 rows=320 loops=1)
                                 Filter: ((sensorid = 7) AND (evtime > (now() - '06:00'::interval)) AND (NOT "action"))
 Total runtime: 24353.67 msec
(10 rows)

mss_masterlog=> SET enable_seqscan = false;
SET
mss_masterlog=> EXPLAIN ANALYZE SELECT srca, COUNT(srca) FROM mss_fwevent WHERE
mss_masterlog-> sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL)
mss_masterlog-> AND NOT action GROUP BY srca ORDER BY COUNT DESC LIMIT 20;
                                                                                        QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2459596.79..2459596.84 rows=20 width=12) (actual time=162.92..163.25 rows=20 loops=1)
   ->  Sort  (cost=2459596.79..2459638.91 rows=16848 width=12) (actual time=162.90..163.01 rows=21 loops=1)
         Sort Key: count(srca)
         ->  Aggregate  (cost=2457150.46..2458414.05 rows=16848 width=12) (actual time=135.62..143.46 rows=23 loops=1)
               ->  Group  (cost=2457150.46..2457992.85 rows=168478 width=12) (actual time=135.35..141.22 rows=320
loops=1)
                     ->  Sort  (cost=2457150.46..2457571.66 rows=168478 width=12) (actual time=135.33..137.14 rows=320
loops=1)
                           Sort Key: srca
                           ->  Index Scan using mss_fwevent_evtime_sensorid_idx on mss_fwevent  (cost=0.00..2442524.70
rows=168478 width=12) (actual time=68.36..132.84 rows=320 loops=1)
                                 Index Cond: ((evtime > (now() - '06:00'::interval)) AND (sensorid = 7))
                                 Filter: (NOT "action")
 Total runtime: 163.60 msec
(11 rows)
mss_masterlog=> SET enable_seqscan = true;
SET
mss_masterlog=> SET random_page_cost = 0.32;
SET
mss_masterlog=> EXPLAIN ANALYZE SELECT srca, COUNT(srca) FROM mss_fwevent WHERE
mss_masterlog-> sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL)
mss_masterlog-> AND NOT action GROUP BY srca ORDER BY COUNT DESC LIMIT 20;
                                                                                      QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=227274.85..227274.90 rows=20 width=12) (actual time=28.42..28.75 rows=20 loops=1)
   ->  Sort  (cost=227274.85..227316.97 rows=16848 width=12) (actual time=28.41..28.52 rows=21 loops=1)
         Sort Key: count(srca)
         ->  Aggregate  (cost=224828.52..226092.11 rows=16848 width=12) (actual time=20.26..28.13 rows=23 loops=1)
               ->  Group  (cost=224828.52..225670.91 rows=168478 width=12) (actual time=19.99..25.86 rows=320 loops=1)
                     ->  Sort  (cost=224828.52..225249.72 rows=168478 width=12) (actual time=19.98..21.76 rows=320
loops=1)
                           Sort Key: srca
                           ->  Index Scan using mss_fwevent_evtime_sensorid_idx on mss_fwevent  (cost=0.00..210202.76
rows=168478 width=12) (actual time=0.35..17.61 rows=320 loops=1)
                                 Index Cond: ((evtime > (now() - '06:00'::interval)) AND (sensorid = 7))
                                 Filter: (NOT "action")
 Total runtime: 29.09 msec
(11 rows)
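
As a rough cross-check on the two index-scan estimates above: the planner's index cost is approximately linear in random_page_cost, so the rpc=4 and rpc=0.32 data points let us back out the slope (the number of page fetches charged at the random rate).  A sketch, assuming simple linearity:

```python
# Two (random_page_cost, estimated index-scan cost) points from the plans above.
rpc_hi, cost_hi = 4.0, 2442524.70   # default random_page_cost
rpc_lo, cost_lo = 0.32, 210202.76   # tuned random_page_cost

# Model: index cost ~= fixed_overhead + rpc * random_page_fetches.
fetches = (cost_hi - cost_lo) / (rpc_hi - rpc_lo)
fixed = cost_hi - rpc_hi * fetches

print(round(fetches))  # pages the planner expects to fetch at the random rate
print(round(fixed))    # rpc-independent portion of the estimate
```

That the implied fetch count dwarfs the 320 rows actually returned is the same estimation gap Rod points out below.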

And there 'ya have it.  The times are different from when I had him
send me the queries this morning, but they're within an order of
magnitude of each other and show the point.  Oh, today they
did a bunch of pruning of old data (nuked June's data)... the runtime
differences are basically the same though.

> > I know Josh is working on revamping the postgresql.conf file, but
> > would it be possible to include suggested values for various bits of
> > hardware and then solicit contributions from admins on this list who
> > have tuned their DB correctly?
>
> I think such material belongs in the SGML docs, not hidden away in a
> config file that people may not look at...

The config file isn't hidden, though, and is very visible in the tuning
process and to DBAs.  I don't know if any PostgreSQL distributions ship
with TCP connections enabled by default (FreeBSD doesn't), so the
config is always seen and viewed by DBAs.  If it's not the TCP
connections setting, it's the max connections setting or sort_mem,
etc.  Having the values dup'ed in the SGML, however, would be good
too, but it's of most practical relevance in the actual config: as an
admin setting up a DB, I'd rather not have to fish around on
postgresql.org to find a recommended setting.  Having it inline and
just having to uncomment it is by far and away the most DBA-friendly
approach and the most likely to be used in the wild by admins.

-sc

--
Sean Chittenden

From:
Tom Lane
Date:

Sean Chittenden <> writes:
> Getting the planner to pick
> using the index to filter out data inserted in the last 3 days over
> doing a seq scan...  well, I don't know how you could do that without
> changing the random_page_cost.

This sounds a *whole* lot like a correlation issue.  If the data in
question were scattered randomly in the table, it's likely that an
indexscan would be a loser.  The recently-inserted data is probably
clustered near the end of the table (especially if they're doing VACUUM
FULL after data purges; are they?).  But the planner's correlation stats
are much too crude to recognize that situation, if the rest of the table
is not well-ordered.

If their typical process involves a periodic data purge and then a
VACUUM FULL, it might be worth experimenting with doing a CLUSTER on the
timestamp index instead of the VACUUM FULL.  The CLUSTER would reclaim
space as effectively as VACUUM FULL + REINDEX, and it would leave the
table with an unmistakable 1.0 correlation ... which should tilt the
planner towards an indexscan without needing a physically impossible
random_page_cost to do it.  I think CLUSTER would probably be a little
slower than VACUUM FULL but it's hard to be sure without trying.
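
The correlation statistic Tom mentions measures how well a column's values track the table's physical row order.  A rough Pearson-style sketch of the idea (not the planner's exact computation):

```python
def correlation(values):
    """Rough analogue of the planner's correlation stat: how well column
    order tracks physical (insertion) order.  1.0 = perfectly ascending."""
    n = len(values)
    positions = range(n)
    mean_v = sum(values) / n
    mean_p = (n - 1) / 2
    cov = sum((v - mean_v) * (p - mean_p) for p, v in zip(positions, values))
    sd_v = sum((v - mean_v) ** 2 for v in values) ** 0.5
    sd_p = sum((p - mean_p) ** 2 for p in positions) ** 0.5
    return cov / (sd_v * sd_p)

print(correlation([1, 2, 3, 4, 5]))           # ~1.0: perfectly clustered
print(round(correlation([2, 5, 1, 4, 3]), 2)) # near 0: scattered
```

A timestamp column that only ever grows sits near 1.0 at the tail of the table even when the table as a whole looks disordered, which is exactly the situation the crude whole-table statistic misses.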

            regards, tom lane

From:
Josh Berkus
Date:

Sean, Tom, Rod, Michael, Brian, Ron:

I'm going to paste everything into one monumental response.  So be prepared to
scroll.

Sean Asks:
> What are the odds of going through and revamping some of the tunables
> in postgresql.conf for the 7.4 release?

Poor.   The time to do this would have been 3 weeks ago, when I announced that
I was re-organizing them and that Bruce was changing many names.  We're past
Feature Freeze now, and we have a *lot* of bug-checking to do with the
back-end changes.

> I know Josh is working on revamping the postgresql.conf file, but
> would it be possible to include suggested values for various bits of
> hardware and then solicit contributions from admins on this list who
> have tuned their DB correctly?

Sure, but this is not a short-term project.   I started this list, and have
100% of list e-mails archived, and I can tell you that there is little
agreement on many of the parameters ... plus I think we'd need about 15-25
e-mails about the best way to implement it, as my ideas are different from
yours and Tom's are different from both of us.

I'd also suggest that this is a good thing to do *after* we have created a
comprehensive benchmarking package that allows us to definitively test the
argued values for various parameters.  Right now, the "conventional wisdom"
we have is strictly anecdotal; for example, all of the discussions on this
list about the value of shared_buffers encompasses only about 14 servers and
3 operating systems.

> # The default values for PostgreSQL are extremely conservative and are
> # likely far from ideal for a site's needs.  Included in this
> # configuration, however, are _suggested_ values to help aid in
<snip>

This sort of narrative belongs in the SGML docs, not in a CONF file.  In fact,
one could argue that we should take *all* commentary out of the CONF file in
order to force people to read the docs.

Michael Says:
> I don't have much to add because I'm pretty new to Postgres and have been
> soliciting advice here recently, but I totally agree with everything you
> said.  I don't mind if it's in the postgres.conf file or in a faq that is
> easy to find, I just would like it to be in one place.

I spent a bunch of hours this last period re-organizing the official docs so
that they are easier to read.   Check them out in the 7.4 dev docs.   To
further enhance this, Shridhar and I will be putting together a broad
commentary and putting it on one of the postgresql web sites.  Eventually
when recommendations are tested a lot of this commentary will make its way
into the official docs.

Ron Says:
> the database. As I'm the DBA only on a part-time basis it is really time
> consuming to have to 1) find all relevant documentation and 2) learn it
> sufficiently to try to tune the db properly and 3) forget about most of
> it until we set up a new project in another year. I like postgresql and
> have convinced two of our clients to use it, but if I could fine tune it
> so it could 'fly', it would be easier for me (and others) to get more
> people to use it.

Database performance tuning will always be a "black art," as it necessitates a
broad knowledge of PostgreSQL, OS architecture, and computer hardware.  So I
doubt that we can post docs that would allow any 10% time DBA to make
PostgreSQL "fly", but hopefully over the next year we can make enough
knowledge public to allow anyone to make PostgreSQL "sprint".

Tom Comments:
> I was arguing awhile back for bumping the default shared_buffers up,
> but the discussion trailed off with no real resolution.

I think we ran up against the still far-too-low SHMMAX settings in most
*nixes.   We could raise this default once we can supply a script which will
help the user bump up the OS's memory settings at, say, initDB time.

Brian Suggests:
> I'm curious how many of the configuration values can be determined
> automatically, or with the help of some script.  It seem like there
> could be some perl script in contrib that could help figure this out.
> Possibly you are asked a bunch of questions and then the values are
> computed based on that.   Something like:

This would be great!  Wanna be in charge of it?

Sean Replies:
> Someone was working on a thing called pg_autotune or some such program
> that'd do exactly what you're thinking of.

Justin.  Unfortunately, pg_autotune didn't get very far, plus its design is
not very friendly to collaborative programming.  So it's the right idea, but
needs to be reworked from the whiteboard, probably in Perl.

Kevin Brown and I followed that up by trying to build a downloadable public
domain database that could be used for benchmarking.  However, he got an FT
job and I got distracted by prep for 7.4.  So, a little help?

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


From:
Tom Lane
Date:

Josh Berkus <> writes:
> Kevin Brown and I followed that up by trying to build a downloadable public
> domain database that could be used for benchmarking.  However, he got an FT
> job and I got distracted by prep for 7.4.  So, a little help?

BTW, OSDL (Linus' new home ;-)) is starting to release Postgres-friendly
versions of some of the open-source benchmarks they've been working on.
I wouldn't put all my faith in any one benchmark, but with a few to
choose from we might start to get someplace.

Also, for any of you who will be at O'Reilly next week, OSDL is just up
the road and I'm expecting to meet with them Tuesday afternoon/evening.
Anyone else interested in going?

            regards, tom lane

From:
Tom Lane
Date:

Josh Berkus <> writes:
> Sean Asks:
>> What are the odds of going through and revamping some of the tunables
>> in postgresql.conf for the 7.4 release?

> Poor.  The time to do this would have been 3 weeks ago, when I
> announced that I was re-organizing them and that Bruce was changing
> many names.  We're past Feature Freeze now, and we have a *lot* of
> bug-checking to do with the back-end changes.

FWIW, I think what Sean is suggesting would amount purely to a
documentation change, and as such would be perfectly legitimate during
beta.  However, I quite agree with your point that we are far from
having a consensus on good numbers to put in.  To get to consensus will
take a lot more work on benchmarking than we've done to date.

            regards, tom lane

From:
Rod Taylor
Date:

That's one heck of a poor estimate for the number of rows returned.

> ->  Seq Scan on mss_fwevent  (cost=0.00..223312.60 rows=168478 width=12) (actual time=24253.66..24319.87 rows=320 loops=1)




From:
Tom Lane
Date:

Josh Berkus <> writes:
> Tom Comments:
>> I was arguing awhile back for bumping the default shared_buffers up,
>> but the discussion trailed off with no real resolution.

> I think we ran up against the still far-too-low SHMMAX settings in most
> *nixes.   We could raise this default once we can supply a script which will
> help the user bump up the OS's memory settings at, say, initDB time.

Actually, I think it would not be hard to get initdb to test whether
larger shared-memory settings would work.  We could do something like
try -B of 64, 256, 512, 1024, and insert into postgresql.conf the
largest value that works.  I would want it to top out at a few thousand
at most, because I don't think a default installation should try to
commandeer the whole machine, but if we could get the typical
installation to be running with even 1000 buffers rather than 64,
we'd be *way* better off.  (See "Postgres vs MySQL" thread nearby.)

We could possibly also have initdb print some kind of message if it's
forced to use an unreasonably small value for shared_buffers, so that
people might have a clue that they need to do kernel reconfiguration.
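
Tom's probe-and-fall-back idea is simple to sketch.  Here try_shared_alloc is a hypothetical stand-in for the shared-memory allocation initdb would actually attempt:

```python
def pick_shared_buffers(try_shared_alloc, candidates=(1024, 512, 256, 64)):
    """Return the largest -B value (in 8 KB buffers) the kernel will grant,
    falling back toward the conservative default of 64."""
    for nbuffers in candidates:  # largest first
        if try_shared_alloc(nbuffers * 8192):
            return nbuffers
    return 64  # last-ditch default; caller could warn about SHMMAX here

# Toy kernel with a 4 MB SHMMAX: 1024 buffers (8 MB) fails, 512 (4 MB) fits.
shmmax = 4 * 1024 * 1024
print(pick_shared_buffers(lambda nbytes: nbytes <= shmmax))  # 512
```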

Comments?

            regards, tom lane

From:
Sean Chittenden
Date:

> Sean Chittenden <> writes:
> > Getting the planner to pick
> > using the index to filter out data inserted in the last 3 days over
> > doing a seq scan...  well, I don't know how you could do that without
> > changing the random_page_cost.
>
> This sounds a *whole* lot like a correlation issue.  If the data in
> question were scattered randomly in the table, it's likely that an
> indexscan would be a loser.  The recently-inserted data is probably
> clustered near the end of the table (especially if they're doing
> VACUUM FULL after data purges; are they?).  But the planner's
> correlation stats are much too crude to recognize that situation, if
> the rest of the table is not well-ordered.

Data isn't scattered randomly from what I can tell and is basically
already clustered just because the data is inserted linearly and
based off of time.  I don't think they're doing a VACUUM FULL after a
purge, but I'll double check on that on Monday when they get in.  Is
there an easy way of determining or setting a planner stat to suggest
that data is ordered around a column in a permanent way?  CLUSTER has
always been a one shot deal and its effects wear off quickly depending
on the way that data is inserted.  It seems as though that this would
be a circumstance in which preallocated disk space would be a win
(that way data wouldn't always be appended to the heap and could be
inserted in order, of most use for non-time related data: ex, some
non-unique ID).

> If their typical process involves a periodic data purge and then a
> VACUUM FULL, it might be worth experimenting with doing a CLUSTER on
> the timestamp index instead of the VACUUM FULL.  The CLUSTER would
> reclaim space as effectively as VACUUM FULL + REINDEX, and it would
> leave the table with an unmistakable 1.0 correlation ... which
> should tilt the planner towards an indexscan without needing a
> physically impossible random_page_cost to do it.  I think CLUSTER
> would probably be a little slower than VACUUM FULL but it's hard to
> be sure without trying.

Hrm, I understand what clustering does, I'm just not convinced that
it'll "fix" this performance problem unless CLUSTER sets some kind of
hint that ANALYZE uses to modify the way in which it collects
statistics.  Like I said, I'll let you know on Monday when they're
back in the shop, but I'm not holding my breath.  I know
random_page_cost is set to something physically impossible, but in
terms of performance, it's always been the biggest win for me to set
this puppy quite low.  Bug in the planner or in the documentation
surrounding what this knob does, I'm not sure, but setting this to a
low value consistently yields good results for me.  The faster the
drive, the lower the random_page_cost value.  *shrug*

> That's one heck of a poor estimate for the number of rows returned.
>
> > ->  Seq Scan on mss_fwevent  (cost=0.00..223312.60 rows=168478 width=12) (actual time=24253.66..24319.87 rows=320 loops=1)

The stats for the columns are already set to 1000 to aid with
this... don't know what else I can do here.  Having the planner off by
as much as even half the actual size isn't uncommon in my experience.

-sc

--
Sean Chittenden

From:
Sean Chittenden
Date:

> > # The default values for PostgreSQL are extremely conservative and
> > # are likely far from ideal for a site's needs.  Included in this
> > # configuration, however, are _suggested_ values to help aid in
> > # tuning.  <snip>
>
> This sort of narrative belongs in the SGML docs, not in a CONF file.
> In fact, one could argue that we should take *all* commentary out of
> the CONF file in order to force people to read the docs.

The SGML docs aren't in the DBA's face and are way out of the way for
DBAs rolling out a new system or who are tuning the system.  SGML ==
Developer, conf == DBA.

> Database performance tuning will always be a "black art," as it
> necessitates a broad knowledge of PostgreSQL, OS architecture, and
> computer hardware.  So I doubt that we can post docs that would
> allow any 10% time DBA to make PostgreSQL "fly", but hopefully over
> the next year we can make enough knowledge public to allow anyone to
> make PostgreSQL "sprint".

I'm highly resistant to/disappointed in this attitude and firmly
believe that there are well understood algorithms that DBAs use to
diagnose and solve performance problems.  It's only a black art
because it hasn't been documented.  Performance tuning isn't voodoo,
it's adjusting constraints to align with the execution of applications
and we know what the applications do, therefore the database can mold
to the applications' needs.  Some of those parameters are based on
hardware constraints and should be pooled and organized as such.

random_page_cost ==
    avg cost of a random disk seek/read (eg: disk seek time) ==
    constant integer for a given piece of hardware

There are other settings that are RAM-based as well, which should be
formulaic and derived, though a formula hasn't been defined to date.

-sc

--
Sean Chittenden

From:
Josh Berkus
Date:

Sean,

> The SGML docs aren't in the DBA's face and are way out of the way for
> DBAs rolling out a new system or who are tuning the system.  SGML ==
> Developer, conf == DBA.

That's exactly my point.  We cannot provide enough documentation in the CONF
file without septupling its length.  If we remove all commentary, and instead
provide a pointer to the documentation, more DBAs will read it.

>   Some of those parameters are based on
> hardware constraints and should be pooled and organized as such.
>
> random_page_cost ==
>     avg cost of a random disk seek/read (eg: disk seek time) ==
>     constant integer for a given piece of hardware

But, you see, this is exactly what I'm talking about.   random_page_cost isn't
static to a specific piece of hardware ... it depends as well on what else is
on the disk/array, concurrent disk activity, disk controller settings,
filesystem, OS, distribution of records and tables, and arrangement of the
partitions on disk.   One can certainly get a "good enough" value by
benchmarking the disk's random seek and calculating based on that ... but to
get an "ideal" value requires a long interactive session by someone with
experience and in-depth knowledge of the machine and database.

> There are other settings that are RAM based as well, which should be
> formulaic and derived though a formula hasn't been defined to date.

You seem pretty passionate about this ... how about you help me and Kevin
define a benchmarking suite when I get back into the country (July 17)?   If
we're going to define formulas, it requires that we have a near-comprehensive
and consistent test database and test battery that we can run on a variety of
machines and platforms.

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
                                        Josh Berkus
   Complete information technology     
    and data management solutions     (415) 565-7293
   for law firms, small businesses      fax 621-2533
    and non-profit organizations.     San Francisco


From:
Sean Chittenden
Date:

> > The SGML docs aren't in the DBA's face and are way out of the way
> > for DBAs rolling out a new system or who are tuning the system.
> > SGML == Developer, conf == DBA.
>
> That's exactly my point.  We cannot provide enough documentation in
> the CONF file without septupling its length.  If we remove all
> commentary, and instead provide a pointer to the documentation, more
> DBAs will read it.

Which I don't think would happen, and why I think the terse bits that
are included are worthwhile.  :)

> > Some of those parameters are based on hardware constraints and
> > should be pooled and organized as such.
> >
> > random_page_cost ==
> >     avg cost of a random disk seek/read (eg: disk seek time) ==
> >     constant integer for a given piece of hardware
>
> But, you see, this is exactly what I'm talking about.
> random_page_cost isn't static to a specific piece of hardware ... it
> depends as well on what else is on:

*) the disk/array

translation: how fast data is accessed and over how many drives.

*) concurrent disk activity

A disk/database activity metric is different from the cost of a seek
on the platters.  :) That PostgreSQL doesn't currently support such
a disk concurrency metric doesn't mean that its definition should get
rolled into a different number in an attempt to compensate for the lack
thereof.

*) disk controller settings

This class of settings falls into the same settings that affect random
seeks on the platters/disk array(s).

*) filesystem

Again, this influences avg seek time.

*) OS

Again, avg seek time.

*) distribution of records and tables

This has nothing to do with PostgreSQL's random_page_cost setting
other than that if data is fragmented on the platter, the disk is
going to have to do a lot of seeking.  This is a stat that should get
set by ANALYZE, not by a human.

*) arrangement of the partitions on disk

Again, avg seek time.

> One can certainly get a "good enough" value by benchmarking the
> disk's random seek and calculating based on that ... but to get an
> "ideal" value requires a long interactive session by someone with
> experience and in-depth knowledge of the machine and database.

An "ideal" value isn't obtained via guess and check.  Checking is only
the verification of some calculable set of settings....though right now
those calculated settings are guessed, unfortunately.

> > There are other settings that are RAM-based as well, which should
> > be formulaic and derived, though a formula hasn't been defined to
> > date.
>
> You seem pretty passionate about this ... how about you help me and
> Kevin define a benchmarking suite when I get back into the country
> (July 17)?  If we're going to define formulas, it requires that we
> have a near-comprehensive and consistent test database and test
> battery that we can run on a variety of machines and platforms.

Works for me, though a benchmark will be less valuable than adding a
disk concurrency stat, improving data trend/distribution analysis, and
using numbers that are concrete and obtainable through the OS kernel
API or an admin manually plunking numbers in.  I'm still recovering
from my move from Cali to WA so with any luck, I'll be settled in by
then.

-sc

--
Sean Chittenden

From:
Josh Berkus
Date:

Sean,

> > That's exactly my point.  We cannot provide enough documentation in
> > the CONF file without septupling its length.  If we remove all
> > commentary, and instead provide a pointer to the documentation, more
> > DBAs will read it.
>
> Which I don't think would happen, and why I think the terse bits that
> are included are worthwhile.  :)

Depressingly enough, you are probably correct, unless we assemble a more
user-friendly "getting started" guide.

> *) concurrent disk activity
>
> A disk/database activity metric is different from the cost of a seek
> on the platters.  :) That PostgreSQL doesn't currently support such
> a disk concurrency metric doesn't mean that its definition should get
> rolled into a different number in an attempt to compensate for the lack
> thereof.

I was talking about concurrent activity by *other* applications.  For example,
if a DBA has a java app that is accessing XML on the same array as postgres
500 times/minute, then you'd need to adjust random_page_cost upwards to allow
for the resource contention.

> An "ideal" value isn't obtained via guess and check.  Checking is only
> the verification of some calculable set of settings....though right now
> those calculated settings are guessed, unfortunately.

> Works for me, though a benchmark will be less valuable than adding a
> disk concurrency stat, improving data trend/distribution analysis, and
> using numbers that are concrete and obtainable through the OS kernel
> API or an admin manually plunking numbers in.  I'm still recovering
> from my move from Cali to WA so with any luck, I'll be settled in by
> then.

The idea is that for a lot of statistics, we're only going to be able to
obtain valid numbers if you have something constant to check them against.

Talk to you later this month!

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:
"Matthew Nuzum"
Date:

>
> Brian Suggests:
> > I'm curious how many of the configuration values can be determined
> > automatically, or with the help of some script.  It seem like there
> > could be some perl script in contrib that could help figure this out.
> > Possibly you are asked a bunch of questions and then the values are
> > computed based on that.   Something like:
>
> This would be great!  Wanna be in charge of it?
>

Is there a to-do list for this kind of stuff?  Maybe there could be a "help
wanted" sign on the website.  Seems like there are lot's of good ideas that
fly around here but never get followed up on.

Additionally, I have an increasingly large production database that I would
be willing to do some test-cases on.  I don't really know how to do it
though... If someone were able to give instructions, I could run tests on
three different platforms.

Matthew Nuzum        | Makers of "Elite Content Management System"
www.followers.net        | View samples of Elite CMS in action
    | http://www.followers.net/portfolio/


From:
"Matthew Nuzum"
Date:

> > This sort of narrative belongs in the SGML docs, not in a CONF file.
> > In fact, one could argue that we should take *all* commentary out of
> > the CONF file in order to force people to read the docs.
>
> The SGML docs aren't in the DBA's face and are way out of the way for
> DBAs rolling out a new system or who are tuning the system.  SGML ==
> Developer, conf == DBA.
>
> > Database performance tuning will always be a "black art," as it
> > necessitates a broad knowledge of PostgreSQL, OS architecture, and
> > computer hardware.  So I doubt that we can post docs that would
> > allow any 10% time DBA to make PostgreSQL "fly", but hopefully over
> > the next year we can make enough knowledge public to allow anyone to
> > make PostgreSQL "sprint".
>
> I'm highly resistant to/disappointed in this attitude and firmly
> believe that there are well understood algorithms that DBAs use to
> diagnose and solve performance problems.  It's only a black art
> because it hasn't been documented.  Performance tuning isn't voodoo,
> it's adjusting constraints to align with the execution of applications
> and we know what the applications do, therefore the database can mold
> to the applications' needs.

I agree.

We often seem to forget simple lessons in human nature.  Expecting someone
to spend 20 extra seconds to do something is often too much.  In many cases,
the only "manual" that a person will see is the .conf files.

At the very least, if there is good documentation for these parameters,
maybe the conf file should provide a link to this info.

About the documentation...  The few times I've tried reading these sections
of the docs it was like reading a dictionary.

Bruce's book is a much better writing style because it starts out with a
basic concept and then expands on it, sometimes several times until a
thorough (but not exhaustive) example has been given.

The exhaustive material in the docs is good when you know what you're
looking for, and therefore is a critical piece of reference work.  I don't
want to belittle the authors of that material in any way.  An illustration
of this would be to compare the O'Reilly "... Nutshell" book series to
something like the [fictitious] book "Learn PostgreSQL in 24 hours".

To close this message, I would just like to add that one of the most
successful open source projects of all time could be used as an example.
The Apache httpd project is one of the few open source projects in wide
spread use that holds more market share than all competing products
combined.

It uses a three phase (if not more) documentation level.  The .conf file
contains detailed instructions in an easy to read and not-too-jargon-ish
structure.  The docs provide detailed tutorials and papers that expand on
configuration params in an easy to read format.  Both of these refer to the
thorough reference manual that breaks each possible option down into its
nitty-gritty details so that a user can get more information if they so
desire.

Matthew Nuzum        | Makers of "Elite Content Management System"
www.followers.net        | View samples of Elite CMS in action
    | http://www.followers.net/portfolio/




From:
Michael Pohl
Date:

On Sun, 6 Jul 2003, Matthew Nuzum wrote:

> At the very least, if there is good documentation for these parameters,
> maybe the conf file should provide a link to this info.

I believe that is what Josh is proposing:

http://archives.postgresql.org/pgsql-performance/2003-07/msg00102.php

> [Apache httpd] uses a three phase (if not more) documentation level.
> The .conf file contains detailed instructions in an easy to read and
> not-too-jargon-ish structure.  The docs provide detailed tutorials and
> papers that expand on configuration params in an easy to read format.
> Both of these refer to the thorough reference manual that breaks each
> possible option down into its nitty-gritty details so that a user can
> get more information if they so desire.

I agree that Apache's approach is primo.  Often the .conf comments are
enough to jog my memory about a directive I haven't used for a while.  Or
the comments are enough to let me know I don't need a directive, or that I
need to go to the manual and read more.  I appreciate that.

michael


From:
Martin Foster
Date:

Michael Pohl wrote:
> On Sun, 6 Jul 2003, Matthew Nuzum wrote:
>
>
>>At the very least, if there is good documentation for these parameters,
>>maybe the conf file should provide a link to this info.
>
>
> I believe that is what Josh is proposing:
>
> http://archives.postgresql.org/pgsql-performance/2003-07/msg00102.php
>
>
>>[Apache httpd] uses a three phase (if not more) documentation level.
>>The .conf file contains detailed instructions in an easy to read and
>>not-too-jargon-ish structure.  The docs provide detailed tutorials and
>>papers that expand on configuration params in an easy to read format.
>>Both of these refer to the thorough reference manual that breaks each
>>possible option down into its nitty-gritty details so that a user can
>>get more information if they so desire.
>
>
> I agree that Apache's approach is primo.  Often the .conf comments are
> enough to jog my memory about a directive I haven't used for a while.  Or
> the comments are enough to let me know I don't need a directive, or that I
> need to go to the manual and read more.  I appreciate that.
>
> michael
>


One thing that may also help is to include more sample .conf files.
For example, you could include settings that would be commonly seen for
dedicated databases with generic specs, and another for machines with
fewer resources that are not dedicated to Postgres.

This would allow users to see how certain setting changes will work.
The default .conf is great if you want to set up a small test bed, but
for a real-life example, chances are it won't be exactly what you're
looking for.

    Martin Foster
    Creator/Designer Ethereal Realms
    


From:
Matthew Hixson
Date:

Why couldn't Postgres learn for itself what the optimal performance
settings would be? The easy one seems to be the effective_cache_size.
Top shows us this information.  Couldn't Postgres read that value from
the same place top reads it instead of relying on a config file value?
Seems like it could even adjust to changing conditions as the cache
size changes.
   Wouldn't it be great to set a single parameter in postgresql.conf
like:

learn = on

This would make Postgres run the same queries multiple times with
different settings, trying to find the ones that made the query run the
fastest.  Obviously you wouldn't want this on all the time because
Postgres would be doing more work than it needs to satisfy the
applications that are asking it for data.  You'd leave it running like
this for as long as you think it would need to get a sampling of real
world use for your specific application.
   Something like this could automagically adapt to load, hardware,
schema, and operating system.  If you drop another 1GB of RAM into the
machine, just turn the learning option on and let Postgres tune itself
again.
     -M@


On Thursday, July 3, 2003, at 04:25 PM, Sean Chittenden wrote:

>> I'm curious how many of the configuration values can be determined
>> automatically, or with the help of some script.  It seem like there
>> could be some perl script in contrib that could help figure this out.
>> Possibly you are asked a bunch of questions and then the values are
>> computed based on that.   Something like:
>>
>> How many tables will the system have?
>> How much memory will be available to the postmaster?
>> How many backends will there typically be?
>> What is the avg seek time of the drive?
>> What's the transfer rate of the drive?
>>
>> Seems to me that a lot of reasonable default values can be figure out
>> from these basic questions.  FSM settings, Sort Mem, Random Page Cost,
>> Effective Cache Size, Shared Memor, etc, etc.
>
> Someone was working on a thing called pg_autotune or some such program
> that'd do exactly what you're thinking of.
>
> http://archives.postgresql.org/pgsql-performance/2002-10/msg00101.php
> http://gborg.postgresql.org/project/pgautotune/projdisplay.php
>
>
> --
> Sean Chittenden
>


From:
Andrew Sullivan
Date:

On Sat, Jul 05, 2003 at 02:12:56PM -0700, Sean Chittenden wrote:
> The SGML docs aren't in the DBA's face and are way out of the way for
> DBAs rolling out a new system or who are tuning the system.  SGML ==
> Developer, conf == DBA.

I could not disagree more.  I'd say more like, if the dba won't read
the manual, get yourself a real dba.  Sorry, but so-called
professionals who won't learn their tools have no home in my shop.

Recently, someone pointed out to me that there _was_ a deficiency
in the docs -- one I thought was not there.  And in that case, I was
more than willing to be chastised.  But claiming that inadequate
comments in the config file are the only things dbas will look at,
well, frankly, I'm offended.

A
--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<>                              M2P 2A8
                                         +1 416 646 3304 x110


From:
"Matthew Nuzum"
Date:

> -----Original Message-----
> From:  [mailto:pgsql-performance-
> ] On Behalf Of Andrew Sullivan
> Sent: Monday, July 07, 2003 5:23 AM
> To: 
> Subject: Re: [PERFORM] Moving postgresql.conf tunables into 2003...
>
> On Sat, Jul 05, 2003 at 02:12:56PM -0700, Sean Chittenden wrote:
> > The SGML docs aren't in the DBA's face and are way out of the way for
> > DBAs rolling out a new system or who are tuning the system.  SGML ==
> > Developer, conf == DBA.
>
> I could not disagree more.  I'd say more like, if the dba won't read
> the manual, get yourself a real dba.  Sorry, but so-called
> professionals who won't learn their tools have no home in my shop.
>

I don't want to come off as confrontational, so please don't take this as an
attack.

Are you willing to say that the PostgreSQL database system should only be
used by DBAs?  I believe that Postgres is such a good and useful tool that
anyone should be able to start using it with little or no barrier to entry.

I don't believe I'm alone in this opinion either.  As a matter of fact, this
philosophy is being adopted by many in the software industry.  Note that
Linux and many other OSs that act as servers are being made more secure and
easier to use __out of the box__ so that a person can simply install from CD
and start using the tool without too much difficulty.

Maybe your definition of "dba" is broader than mine and what you mean is,
"someone who installs a postgres database".  Also, by manual, are you
referring to the 213 page Administration guide, or are you talking about the
340 page Reference Manual?  Let us rephrase your statement like this: "If
the [person who installs a postgres database] won't read the [340 page
reference] manual, then that person should go find a different database to
use."

I think that the postgres installation procedure, .conf files and
documentation can be modified in such a way that a newbie (we were all
newbies once) can have a good "out of box experience" with little effort.
That means they can __quickly__ get a __good performing__ database up and
running with __little effort__ and without needing to subscribe to a mailing
list or read a book.

I have seen software projects that have what I call an "elitist" attitude;
meaning they expect you to be an expert or dedicated to their software in
order to use it.  Invariably this mentality stifles the usefulness of the
product.  It seems that there is a relative minority of people on this list
who feel that you have to be "elite" in order to have a good working
postgres installation.  I don't feel that should be a requirement or even a
consideration.

Matthew Nuzum        | Makers of "Elite Content Management System"
www.followers.net        | View samples of Elite CMS in action
    | http://www.followers.net/portfolio/



From:
"Michael Mattox"
Date:

> Are you willing to say that the PostgreSQL database system should only be
> used by DBAs?  I believe that Postgres is such a good and useful tool that
> anyone should be able to start using it with little or no barrier
> to entry.

This is a good point.  After reading previous responses I was starting to
feel like the only non-DBA Postgres user on this list.  I'm a java
architect/developer and until recently I knew very little about databases.
I just learned what an index was while trying to tune Postgres.  I imagine
some of you are even laughing reading this but it's true.  In Java land, the
O/R mapping tools are getting so good that you don't have to be a database
expert to use the database.  I'm using JDO which generates my database
tables and indexes automatically.  But you do need to learn about the
database in order to increase performance and optimize the settings.  I'm
sure I'm not the only developer who is overwhelmed by the Postgres
documentation and configuration files.

Regards,
Michael



From:
Tom Lane
Date:

"Matthew Nuzum" <> writes:
> Are you willing to say that the PostgreSQL database system should only be
> used by DBAs?  I believe that Postgres is such a good and useful tool that
> anyone should be able to start using it with little or no barrier to entry.

I quite agree.  But there is a difference between saying "you should get
decent performance with no effort" and "you should get optimal
performance with no effort".  I think we can get to the first with
relatively little trouble (like boosting the default shared_buffers to
1000), but the second is an impractical goal.
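As a concrete illustration of the "decent performance with little effort" baseline, the change being suggested is a one-line conf edit (the 1000 figure is from the message above; treat it as a starting point, not a tuned value):

```ini
# postgresql.conf: minimal "decent performance" tweak
shared_buffers = 1000      # 1000 x 8KB pages (~8MB), up from the much smaller 7.x default
```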

            regards, tom lane

From:
Chris Travers
Date:

Matthew Nuzum wrote:

>>I'm highly resistant to/disappointed in this attitude and firmly
>>believe that there are well understood algorithms that DBAs use to
>>diagnose and solve performance problems.  It's only a black art
>>because it hasn't been documented.  Performance tuning isn't voodoo,
>>it's adjusting constraints to align with the execution of applications
>>and we know what the applications do, therefore the database can mold
>>to the applications' needs.
>>
>>
>
>I agree.
>
>We often seem to forget simple lessons in human nature.  Expecting someone
>to spend 20 extra seconds to do something is often too much.  In many cases,
>the only "manual" that a person will see is the .conf files.
>
>
>
In my opinion, a serious RDBMS system will *always* require the admin to
be doing research in order to learn how to use it effectively.  We are
not talking about a word processor here.

That being said, I think that a good part of the problem is that admins
don't know where to look for the appropriate documentation and what is
needed.  Expecting someone to spend 20 seconds looking for a piece of
info is not too bad, but expecting them to spend hours trying to figure
out what info is relevant is not going to get us anywhere.

For those who have been following the discussion relating to MySQL vs
PostgreSQL, I think this is relevant here.  MySQL does much of its
tuning at compile time, and the MySQL team very carefully controls the
build process for the various binary distributions they offer.  If you
want to see a real mess, try compiling MySQL from source.  Talk about
having to read documentation on items which *should* be handled by the
configure script.

OTOH, PostgreSQL is optimized using configuration files and is tunable
on the fly. This is, I think, a better approach but it needs to be
better documented.  Maybe a "Beginner's guide to database server tuning"
or something like that.

Secondly, documenting the tuning algorithms well may allow PostgreSQL to
automatically tune itself to some extent, or allow the development of
performance tuning tools for the server.  This would be a big win for
the project.  Unfortunately I am not knowledgeable enough on this topic to
really do this subject justice.

Best Wishes,
Chris Travers


From:
"Jim C. Nasby"
Date:

On Mon, Jul 07, 2003 at 10:08:50AM -0700, Chris Travers wrote:
> In my opinion, a serious RDBMS system will *always* require the admin to
> be doing research in order to learn how to use it effectively.  We are
> not talking about a word processor here.
>
> That being said, I think that a good part of the problem is that admins
> don't know where to look for the appropriate documentation and what is
> needed.  Expecting someone to spend 20 seconds looking for a piece of
> info is not too bad, but expecting them to spend hours trying to figure
> out what info is relevant is not going to get us anywhere.

Something else to consider is that this is made worse because tuning for
pgsql is quite different than tuning for something like Oracle or DB2,
which don't deal as much with metrics such as random access cost v.
sequential access. They also take the approach of 'give me as much
memory as you can; I'll take it from there, thankyouverymuch', which
makes effective_cache_size a bit of a mystery.
--
Jim C. Nasby, Database Consultant                  
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

From:
Kaarel
Date:

>>Are you willing to say that the PostgreSQL database system should only be
>>used by DBAs?  I believe that Postgres is such a good and useful tool that
>>anyone should be able to start using it with little or no barrier to entry.
>
>
> I quite agree.  But there is a difference between saying "you should get
> decent performance with no effort" and "you should get optimal
> performance with no effort".  I think we can get to the first with
> relatively little trouble (like boosting the default shared_buffers to
> 1000), but the second is an impractical goal.


Just wanted to repeat some of the thoughts that have already been expressed.

There is no reason why PostgreSQL shouldn't be reasonably well
configured for a particular platform out of the box. Not for maximum
performance, but for good enough performance. The many complaints by new
users about PostgreSQL being surprisingly slow, and all the standard
answers (vacuum, pump up memory settings), IMHO prove that the
default installation can be improved. Already mentioned on the mailing
lists: using multiple standard conf files, querying system info and
dynamically generating all or some parts of the conf file, automating
the vacuum process...

Kaarel


From:
"scott.marlowe"
Date:

On Wed, 9 Jul 2003, Kaarel wrote:

> >>Are you willing to say that the PostgreSQL database system should only be
> >>used by DBAs?  I believe that Postgres is such a good and useful tool that
> >>anyone should be able to start using it with little or no barrier to entry.
> >
> >
> > I quite agree.  But there is a difference between saying "you should get
> > decent performance with no effort" and "you should get optimal
> > performance with no effort".  I think we can get to the first with
> > relatively little trouble (like boosting the default shared_buffers to
> > 1000), but the second is an impractical goal.
>
>
> Just wanted to repeat some of the thoughts that have already been expressed.
>
> There is no reason why PostgreSQL shouldn't be reasonably well
> configured for a particular platform out of the box. Not for maximum
> performance, but for good enough performance. The many complaints by new
> users about PostgreSQL being surprisingly slow, and all the standard
> answers (vacuum, pump up memory settings), IMHO prove that the
> default installation can be improved. Already mentioned on the mailing
> lists: using multiple standard conf files, querying system info and
> dynamically generating all or some parts of the conf file, automating
> the vacuum process...

It would be nice to have a program that could run on any OS postgresql
runs on and could report on the current limits of the kernel, and make
recommendations for changes the admin might want to make.

One could probably make a good stab at effective cache size during
install.  Anything reasonably close would probably help.

Report what % of said resources could be consumed by postgresql under
various circumstances...
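A minimal sketch of the kind of report being suggested, assuming Linux /proc paths (the BSDs would read sysctl values such as kern.ipc.shmmax instead):

```shell
# Hypothetical sketch: report the kernel's shared memory ceiling against
# physical RAM, the sort of sanity check an installer script could print.
shmmax=$(cat /proc/sys/kernel/shmmax)                    # bytes
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)    # kB
echo "kernel SHMMAX:  $shmmax bytes"
echo "physical RAM:   $(( mem_kb / 1024 )) MB"
echo "SHMMAX covers   $(( shmmax * 100 / (mem_kb * 1024) ))% of RAM"
```

From there the script could warn when SHMMAX is too small for a reasonable shared_buffers setting.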


From:
Martin Foster
Date:

Scott Marlowe  wrote:
>
> It would be nice to have a program that could run on any OS postgresql
> runs on and could report on the current limits of the kernel, and make
> recommendations for changes the admin might want to make.
>
> One could probably make a good stab at effective cache size during
> install.  Anything reasonably close would probably help.
>
> Report what % of said resources could be consumed by postgresql under
> various circumstances...
>

One of the issues that automating the process would encounter is limits
in the kernel that are too low for PostgreSQL to handle.  The BSDs come
to mind, where values need to be manually increased in the kernel before
you can reach a reasonable maximum connection count.

Another example is how OpenBSD will outright crash when trying to test
the database during install time.   It seems that most of the tests fail
because the maximum number of processes allowed is too low for the test
to succeed, while FreeBSD passes those same tests just fine.

If PostgreSQL automated the configuration, that would be a plus.   But
it should also detect the platform and inform the person what changes
should be made to the kernel, sysctl or whatever, in order to have that
configuration run.

Perl may be useful in this for a few reasons.   It's portable enough to
run on multiple Unix variants and the tools would be fairly standard, so
the code would require fewer considerations for exotic implementations.

    Martin Foster
    Creator/Designer Ethereal Realms
    



From:
Sean Chittenden
Date:

> I don't have much to add because I'm pretty new to Postgres and have
> been soliciting advice here recently, but I totally agree with
> everything you said.  I don't mind if it's in the postgres.conf file
> or in a faq that is easy to find, I just would like it to be in one
> place.  A good example of the need for this is when I was tuning
> "effective_cache" I thought that was creating a cache for Postgres
> when in fact as it was pointed out to me, it's just hinting to
> postgres the size of the OS cache.  Lots of ways for people to get
> really confused here.

I looked through the src/doc/runtime.sgml for a good place to stick
this and couldn't find a place that this seemed appropriate, but on
FreeBSD, this can be determined with a great deal of precision in a
programmatic manner:

echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"

The same OID is available via C too.  It'd be slick if PostgreSQL
could tune itself (on FreeBSD) at initdb time with the above code.  If
Linux exports this info via /proc and can whip out the appropriate
magic, even better.  A good guess, left uncommented, that shows up in
postgresql.conf would be stellar and is quite possible with the use of
sed.

Maybe an initdb switch could be added to have initdb tune the config
it generates?  If a -n is added, have it generate a config and toss it
to stdout?


case `uname` in
"FreeBSD")
        echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
        ;;
*)
        echo "Unable to automatically determine the effective cache size" >> /dev/stderr
        ;;
esac
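For the Linux branch wondered about above, a hedged sketch: /proc/meminfo does export the kernel's cache usage, though which fields to count is a judgment call, so treat the result as a starting guess rather than FreeBSD-grade precision:

```shell
# Hypothetical Linux analogue of the FreeBSD branch above: derive a
# starting effective_cache_size (in 8KB pages) from the kernel's current
# buffer + page cache usage as reported by /proc/meminfo.
kb=$(awk '/^(Cached|Buffers):/ {sum += $2} END {print sum}' /proc/meminfo)
echo "effective_cache_size = $(( kb / 8 ))"   # kB -> 8KB pages
```

The same caveat applies as for the FreeBSD version: this only captures the cache at one instant, so an initdb-time value would be a rough default, not a tuned one.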


-sc

--
Sean Chittenden

From:
Martin Foster
Date:

Sean Chittenden wrote:
>
> I looked through the src/doc/runtime.sgml for a good place to stick
> this and couldn't find a place that this seemed appropriate, but on
> FreeBSD, this can be determined with a great deal of precision in a
> programmatic manner:
>
> echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
>
> The same OID is available via C too.  It'd be slick if PostgreSQL
> could tune itself (on FreeBSD) at initdb time with the above code.  If
> Linux exports this info via /proc and can whip out the appropriate
> magic, even better.  A good guess, left uncommented, that shows up in
> postgresql.conf would be stellar and is quite possible with the use of
> sed.
>
> Maybe an initdb switch could be added to have initdb tune the config
> it generates?  If a -n is added, have it generate a config and toss it
> to stdout?
>
>
> case `uname` in
> "FreeBSD")
>         echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
>         ;;
> *)
>         echo "Unable to automatically determine the effective cache size" >> /dev/stderr
>         ;;
> esac
>
>
> -sc
>

The simplest way may be to create an 'auto-tune' directory with scripts for
configured platforms.   When postgres installs the databases, it checks
for 'tune.xxx' and, if found, uses that to generate the configuration itself?

This would allow for defaults on platforms that do not have them and
optimization for those that do.

    Martin Foster
    Creator/Designer Ethereal Realms
    



From:
Bruce Momjian
Date:

Michael Pohl wrote:
> On Sun, 6 Jul 2003, Matthew Nuzum wrote:
>
> > At the very least, if there is good documentation for these parameters,
> > maybe the conf file should provide a link to this info.
>
> I believe that is what Josh is proposing:
>
> http://archives.postgresql.org/pgsql-performance/2003-07/msg00102.php
>
> > [Apache httpd] uses a three phase (if not more) documentation level.
> > The .conf file contains detailed instructions in an easy to read and
> > not-to-jargon-ish structure.  The docs provide detailed tutorials and
> > papers that expand on configuration params in an easy to read format.
> Both of these refer to the thorough reference manual that breaks each
> possible option down into its nitty-gritty details so that a user can
> get more information if they so desire.
>
> I agree that Apache's approach is primo.  Often the .conf comments are
> enough to jog my memory about a directive I haven't used for a while.  Or
> the comments are enough to let me know I don't need a directive, or that I
> need to go to the manual and read more.  I appreciate that.

Isn't that what we have now --- isn't postgresql.conf clear enough to
jog people's memory?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
                 |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

From:
Bruce Momjian
Date:

Keep in mind that if we auto-tune, we will only be able to do it for
some platforms, so we will need a table that shows which settings are
autotuned for each platform.

---------------------------------------------------------------------------

Sean Chittenden wrote:
> > I don't have much to add because I'm pretty new to Postgres and have
> > been soliciting advice here recently, but I totally agree with
> > everything you said.  I don't mind if it's in the postgres.conf file
> > or in a faq that is easy to find, I just would like it to be in one
> > place.  A good example of the need for this is when I was tuning
> > "effective_cache" I thought that was creating a cache for Postgres
> > when in fact as it was pointed out to me, it's just hinting to
> > postgres the size of the OS cache.  Lots of ways for people to get
> > really confused here.
>
> I looked through the src/doc/runtime.sgml for a good place to stick
> this and couldn't find a place that this seemed appropriate, but on
> FreeBSD, this can be determined with a great deal of precision in a
> programmatic manner:
>
> echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
>
> The same OID is available via C too.  It'd be slick if PostgreSQL
> could tune itself (on FreeBSD) at initdb time with the above code.  If
> Linux exports this info via /proc and can whip out the appropriate
> magic, even better.  A good guess, left uncommented, that shows up in
> postgresql.conf would be stellar and is quite possible with the use of
> sed.
>
> Maybe an initdb switch could be added to have initdb tune the config
> it generates?  If a -n is added, have it generate a config and toss it
> to stdout?
>
>
> case `uname` in
> "FreeBSD")
>         echo "effective_cache_size = $((`sysctl -n vfs.hibufspace` / 8192))"
>         ;;
> *)
>         echo "Unable to automatically determine the effective cache size" >> /dev/stderr
>         ;;
> esac
>
>
> -sc
>
> --
> Sean Chittenden
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
                 |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

From:
Manfred Koizar
Date:

[jumping in late due to vacation]

On Thu, 3 Jul 2003 17:06:46 -0700, Sean Chittenden
<> wrote:
>> is some other problem that needs to be solved.  (I'd wonder about
>> index correlation myself; we know that that equation is pretty
>> bogus.)
>
>Could be.  I had him create a multi-column index on the date and a
>non-unique highly redundant id.

Tom has already suspected index correlation to be a possible source of
the problem and recommended to CLUSTER on the index.  A weakness of
the current planner implementation is that a multi column index is
always thought to have low correlation.  In your case even after
CLUSTER the 2-column index on (date, sensorid) is treated like a
single column index with correlation 0.5.

I have an experimental patch lying around somewhere that tries to work
around these problems by offering different estimation methods for
index scans.  If you are interested, I'll dig it out.

In the meantime have him try with a single column index on date.

On 04 Jul 2003 08:29:04 -0400, Rod Taylor <> wrote:
|That's one heck of a poor estimate for the number of rows returned.
|
|> ->  Seq Scan on mss_fwevent  (cost=0.00..223312.60 rows=168478 width=12)
|                        (actual time=24253.66..24319.87 rows=320 loops=1)

>  ->  Index Scan using mss_fwevent_evtime_sensorid_idx on mss_fwevent
>             (cost=0.00..2442524.70 rows=168478 width=12)
>            (actual time=68.36..132.84 rows=320 loops=1)
>      Index Cond: ((evtime > (now() - '06:00'::interval)) AND (sensorid = 7))
>      Filter: (NOT "action")

Estimated number of rows being wrong by a factor of 500 seems to be the
main problem hiding everything else.  With statistics already set to
1000, does this mean that sensorid, evtime, and action are not
independent?  It'd be interesting to know whether the estimation error
comes from "Index Cond" or from "Filter".

Servus
 Manfred

From:
Sean Chittenden
Date:

> >> is some other problem that needs to be solved.  (I'd wonder about
> >> index correlation myself; we know that that equation is pretty
> >> bogus.)
> >
> >Could be.  I had him create a multi-column index on the date and a
> >non-unique highly redundant id.
>
> Tom has already suspected index correlation to be a possible source
> of the problem and recommended to CLUSTER on the index.  A weakness
> of the current planner implementation is that a multi column index
> is always thought to have low correlation.  In your case even after
> CLUSTER the 2-column index on (date, sensorid) is treated like a
> single column index with correlation 0.5.

Howdy.  Well, I got far enough with the guy in the testing to figure
out that it wasn't a single- vs. multi-column index problem; however, I
haven't heard back from him regarding the use of CLUSTER.  C'est la
IRC.  :-p

> I have an experimental patch lying around somewhere that tries to
> work around these problems by offering different estimation methods
> for index scans.  If you are interested, I'll dig it out.

Sure, I'll take a gander... had my head in enough Knuth recently to
even hopefully have some kind of a useful response to the patch.

> In the meantime have him try with a single column index on date.

Been there, done that: no change.

> |That's one heck of a poor estimate for the number of rows returned.
> |
> |> ->  Seq Scan on mss_fwevent  (cost=0.00..223312.60 rows=168478 width=12)
> |                        (actual time=24253.66..24319.87 rows=320 loops=1)
>
> >  ->  Index Scan using mss_fwevent_evtime_sensorid_idx on mss_fwevent
> >             (cost=0.00..2442524.70 rows=168478 width=12)
> >            (actual time=68.36..132.84 rows=320 loops=1)
> >      Index Cond: ((evtime > (now() - '06:00'::interval)) AND (sensorid = 7))
> >      Filter: (NOT "action")
>
> Estimated number of rows being wrong by a factor of 500 seems to be the
> main problem hiding everything else.  With statistics already set to
> 1000, does this mean that sensorid, evtime, and action are not
> independent?  It'd be interesting to know whether the estimation
> error comes from "Index Cond" or from "Filter".

Hrm...  sensorid is a sequence and grows proportionally with evtime,
obviously.  Action is a char(1) or something like that (i.e., not
unique).  See the EXPLAIN ANALYZEs that I posted in msgid:
.. or at the bottom of this
msg.

Having spent a fair amount of time looking at the two following plans,
it seems as though an additional statistic is needed to change the
cost of doing an index lookup when the index is linearly ordered.
Whether CLUSTER does this or not, I don't know, I never heard back
from him after getting the runtime down to a few ms.  :-/ Are indexes
on linearly ordered data rebalanced somehow?  I thought CLUSTER only
reordered data on disk.  -sc



Plan for normal random_page_cost:

mss_masterlog=> SHOW random_page_cost;
 random_page_cost
------------------
 4
(1 row)

mss_masterlog=> EXPLAIN ANALYZE SELECT srca, COUNT(srca) FROM mss_fwevent WHERE
mss_masterlog-> sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL)
mss_masterlog-> AND NOT action GROUP BY srca ORDER BY COUNT DESC LIMIT 20;
                                                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=240384.69..240384.74 rows=20 width=12) (actual time=24340.04..24340.39 rows=20 loops=1)
   ->  Sort  (cost=240384.69..240426.80 rows=16848 width=12) (actual time=24340.02..24340.14 rows=21 loops=1)
         Sort Key: count(srca)
         ->  Aggregate  (cost=237938.36..239201.95 rows=16848 width=12) (actual time=24322.84..24330.73 rows=23 loops=1)
               ->  Group  (cost=237938.36..238780.75 rows=168478 width=12) (actual time=24322.57..24328.45 rows=320 loops=1)
                     ->  Sort  (cost=237938.36..238359.55 rows=168478 width=12) (actual time=24322.55..24324.34 rows=320 loops=1)
                           Sort Key: srca
                           ->  Seq Scan on mss_fwevent  (cost=0.00..223312.60 rows=168478 width=12) (actual time=24253.66..24319.87 rows=320 loops=1)
                                 Filter: ((sensorid = 7) AND (evtime > (now() - '06:00'::interval)) AND (NOT "action"))
 Total runtime: 24353.67 msec
(10 rows)


Plan for altered random_page_cost:

mss_masterlog=> SET random_page_cost = 0.32;
SET
mss_masterlog=> EXPLAIN ANALYZE SELECT srca, COUNT(srca) FROM mss_fwevent WHERE
mss_masterlog-> sensorid = 7 AND evtime > (now() - '6 hours'::INTERVAL)
mss_masterlog-> AND NOT action GROUP BY srca ORDER BY COUNT DESC LIMIT 20;
                                                                                      QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=227274.85..227274.90 rows=20 width=12) (actual time=28.42..28.75 rows=20 loops=1)
   ->  Sort  (cost=227274.85..227316.97 rows=16848 width=12) (actual time=28.41..28.52 rows=21 loops=1)
         Sort Key: count(srca)
         ->  Aggregate  (cost=224828.52..226092.11 rows=16848 width=12) (actual time=20.26..28.13 rows=23 loops=1)
               ->  Group  (cost=224828.52..225670.91 rows=168478 width=12) (actual time=19.99..25.86 rows=320 loops=1)
                     ->  Sort  (cost=224828.52..225249.72 rows=168478 width=12) (actual time=19.98..21.76 rows=320 loops=1)
                           Sort Key: srca
                           ->  Index Scan using mss_fwevent_evtime_sensorid_idx on mss_fwevent  (cost=0.00..210202.76 rows=168478 width=12) (actual time=0.35..17.61 rows=320 loops=1)
                                 Index Cond: ((evtime > (now() - '06:00'::interval)) AND (sensorid = 7))
                                 Filter: (NOT "action")
 Total runtime: 29.09 msec
(11 rows)

--
Sean Chittenden

From:
Manfred Koizar
Date:

On Tue, 5 Aug 2003 15:26:09 -0700, Sean Chittenden
<> wrote:
>> I have an experimental patch lying around somewhere that tries to
>> work around these problems by offering different estimation methods
>> for index scans.  If you are interested, I'll dig it out.
>
>Sure, I'll take a gander... had my head in enough Knuth recently to
>even hopefully have some kind of a useful response to the patch.

Sean, the patch is at http://www.pivot.at/pg/16-correlation-732.diff.
A short description of its usage can be found at
http://archives.postgresql.org/pgsql-performance/2002-11/msg00256.php.
If you are interested how the different interpolation methods work,
read the source - it shouldn't be too hard to find.

You might also want to read the thread starting at
http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php.

>>  does this mean that sensorid, evtime, and action are not
>> independent?
>
>Hrm...  sensorid is sequence and grows proportional with evtime,
>obviously.

So a *low* sensorid (7) is quite uncommon for a *late* evtime?  This
would help understand the problem.  Unfortunately I have no clue what
to do about it.  :-(

>Having spent a fair amount of time looking at the two following plans,
>it seems as though an additional statistic is needed to change the
>cost of doing an index lookup when the index is linearly ordered.

I'm not sure I understand what you mean by "index is linearly
ordered",  but I guess correlation is that statistic you are talking
about.  However, it is calculated per column, not per index.

>Whether CLUSTER does this or not, I don't know,

If you CLUSTER on an index and then ANALYSE, you get a correlation of
1.0 (== optimum) for the first column of the index.

> I never heard back
>from him after getting the runtime down to a few ms.  :-/

Pity!  I'd have liked to see EXPLAIN ANALYSE for

    SELECT *
      FROM mss_fwevent
     WHERE sensorid = 7
       AND evtime > (now() - '6 hours'::INTERVAL)
       AND NOT action;

    SELECT *
      FROM mss_fwevent
     WHERE sensorid = 7
       AND evtime > (now() - '6 hours'::INTERVAL);

    SELECT *
      FROM mss_fwevent
     WHERE evtime > (now() - '6 hours'::INTERVAL);

    SELECT *
      FROM mss_fwevent
     WHERE sensorid = 7;


> Are indexes
>on linearly ordered data rebalanced somehow?  I thought CLUSTER only
>reordered data on disk.  -sc

AFAIK CLUSTER re-creates all indices belonging to the table.

Servus
 Manfred

From:
Sean Chittenden
Date:

> >> I have an experimental patch lying around somewhere that tries to
> >> work around these problems by offering different estimation methods
> >> for index scans.  If you are interested, I'll dig it out.
> >
> >Sure, I'll take a gander... had my head in enough Knuth recently to
> >even hopefully have some kind of a useful response to the patch.
>
> Sean, the patch is at http://www.pivot.at/pg/16-correlation-732.diff.
> A short description of its usage can be found at
> http://archives.postgresql.org/pgsql-performance/2002-11/msg00256.php.
> If you are interested how the different interpolation methods work,
> read the source - it shouldn't be too hard to find.
>
> You might also want to read the thread starting at
> http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php.

Hrm... let me bop back in my archives and reply there...  very
interesting work though.  I hope a reasonable algorithm can be found
in time for 7.5, or even 7.4, as this seems to be biting many people
and the current algorithm is clearly not right.

> >>  does this mean that sensorid, evtime, and action are not
> >> independent?
> >
> >Hrm...  sensorid is sequence and grows proportional with evtime,
> >obviously.
>
> So a *low* sensorid (7) is quite uncommon for a *late* evtime?  This
> would help understand the problem.  Unfortunately I have no clue what
> to do about it.  :-(

Correct.

> >Having spent a fair amount of time looking at the two following plans,
> >it seems as though an additional statistic is needed to change the
> >cost of doing an index lookup when the index is linearly ordered.
>
> I'm not sure I understand what you mean by "index is linearly
> ordered",  but I guess correlation is that statistic you are talking
> about.  However, it is calculated per column, not per index.

If two rows have ids 123456 and 123457, what are the odds that the
tuples are going to be on the same page?  I.e., if 123456 is read, is
123457 already in the OS's or PostgreSQL's disk cache?

> >Whether CLUSTER does this or not, I don't know,
>
> If you CLUSTER on an index and then ANALYSE, you get a correlation of
> 1.0 (== optimum) for the first column of the index.

Correlating of what to what?  Of data to nearby data?  Of data to
related data (ie, multi-column index?)? Of related data to pages on
disk?  Not 100% sure in what context you're using the word
correlation...

But that value will degrade over time, and at what rate?  Does ANALYZE
maintain that value so that it's kept accurate?  The ANALYZE page was
lacking in implementation details regarding how many rows ANALYZE
actually scans on big tables, which could dramatically affect the
correlation of a table over time if ANALYZE is maintaining the
correlation for a column.

> > I never heard back from him after getting the runtime down to a
> > few ms.  :-/
>
> Pity!  I'd have liked to see EXPLAIN ANALYSE for
>
>     SELECT *
>       FROM mss_fwevent
>      WHERE sensorid = 7
>        AND evtime > (now() - '6 hours'::INTERVAL)
>        AND NOT action;
>
>     SELECT *
>       FROM mss_fwevent
>      WHERE sensorid = 7
>        AND evtime > (now() - '6 hours'::INTERVAL);
>
>     SELECT *
>       FROM mss_fwevent
>      WHERE evtime > (now() - '6 hours'::INTERVAL);
>
>     SELECT *
>       FROM mss_fwevent
>      WHERE sensorid = 7;

ditto

> > Are indexes
> >on linearly ordered data rebalanced somehow?  I thought CLUSTER only
> >reordered data on disk.  -sc
>
> AFAIK CLUSTER re-creates all indices belonging to the table.

As of 7.3 or 7.4, yes.  -sc

--
Sean Chittenden

From:
Tom Lane
Date:

Sean Chittenden <> writes:
>> If you CLUSTER on an index and then ANALYSE, you get a correlation of
>> 1.0 (== optimum) for the first column of the index.

> Correlating of what to what?  Of data to nearby data?  Of data to
> related data (ie, multi-column index?)? Of related data to pages on
> disk?  Not 100% sure in what context you're using the word
> correlation...

The correlation is between index order and heap order --- that is, are
the tuples in the table physically in the same order as the index?
The better the correlation, the fewer heap-page reads it will take to do
an index scan.
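That definition can be illustrated with a toy computation (a sketch only; the function name is made up, and this is a plain Pearson coefficient over rank positions, not ANALYZE's actual sampling implementation):

```python
"""Toy illustration of the planner statistic described above: the
correlation between index (sorted) order and heap (physical) order."""


def heap_index_correlation(values):
    """Pearson correlation of each tuple's heap position against its
    position in index (sorted) order."""
    n = len(values)
    # order[k] = heap position of the k-th tuple in index order
    order = sorted(range(n), key=lambda i: values[i])
    # rank[h] = index-order position of the tuple at heap position h
    rank = [0] * n
    for idx_pos, heap_pos in enumerate(order):
        rank[heap_pos] = idx_pos
    xs = list(range(n))  # heap positions
    ys = rank            # index-order positions
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# A freshly CLUSTERed table: heap order == index order -> correlation 1.0
print(heap_index_correlation([10, 20, 30, 40, 50]))   # 1.0
# Rows inserted in descending order -> correlation -1.0
print(heap_index_correlation([50, 40, 30, 20, 10]))   # -1.0
```

The closer the result is to 1.0, the fewer distinct heap pages an index scan has to touch, which is exactly why CLUSTER followed by ANALYZE makes index scans look cheap to the planner.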

Note it is possible to measure correlation without regard to whether
there actually is any index; ANALYZE is simply looking to see whether
the values appear in increasing order according to the datatype's
default sort operator.

One problem we have is extrapolating from the single-column correlation
stats computed by ANALYZE to appropriate info for multi-column indexes.
It might be that the only reasonable fix for this is for ANALYZE to
compute multi-column stats too when multi-column indexes are present.
People are used to the assumption that you don't need to re-ANALYZE
after creating a new index, but maybe we'll have to give that up.

> But that value will degrade over time, and at what rate?  Does ANALYZE
> maintain that value so that it's kept accurate?

You keep it up to date by ANALYZE-ing at suitable intervals.  It's no
different from any other statistic.

            regards, tom lane

From:
Manfred Koizar
Date:

On Thu, 07 Aug 2003 19:31:52 -0400, Tom Lane <>
wrote:
>The correlation is between index order and heap order --- that is, are
>the tuples in the table physically in the same order as the index?
>The better the correlation, the fewer heap-page reads it will take to do
>an index scan.

This is true for a column that is the first column of a btree index.
Correlation doesn't help with additional index columns and with
functional indices.

>Note it is possible to measure correlation without regard to whether
>there actually is any index;

But there is no need to, because the correlation is only used for
index access cost estimation.

>One problem we have is extrapolating from the single-column correlation
>stats computed by ANALYZE to appropriate info for multi-column indexes.
>It might be that the only reasonable fix for this is for ANALYZE to
>compute multi-column stats too when multi-column indexes are present.

I wonder whether it would be better to drop column correlation and
calculate index correlation instead, i.e. correlation of index tuples
to heap tuple positions.  This would solve both the multi-column index
and the functional index cost estimation problem.

>People are used to the assumption that you don't need to re-ANALYZE
>after creating a new index, but maybe we'll have to give that up.

Index correlation would be computed on CREATE INDEX and whenever the
heap relation is analysed ...

Servus
 Manfred

From:
Rob Messer
Date:

I have a reporting system that does regular queries on a table with a
multipart index.  I am running version 7.3.4.  Here is the table
definition:

               Table "public.ds_rec_fld"
    Column     |          Type           | Modifiers
---------------+-------------------------+-----------
 dsid          | character varying(20)   | not null
 recid         | integer                 | not null
 field_name    | character varying(20)   | not null
 option_tag    | character varying(10)   | not null
 option_value  | integer                 |
 field_text    | character varying(2000) |
 field_type_cd | character varying(8)    |
Indexes: ds_rf_ndx1 btree (recid, field_name, option_value)

Normally queries are done using recid and field_name, so Postgresql
returns rows very quickly as expected.  Here is a sample explain
analyze output for a typical query:

db=> explain analyze
db-> select field_name, option_tag from ds_rec_fld where recid = 3001
db-> and field_name = 'Q3A1';
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Index Scan using ds_rf_ndx1 on ds_rec_fld  (cost=0.00..163.09 rows=40 width=38) (actual time=0.06..0.07 rows=1 loops=1)
   Index Cond: ((recid = 3001) AND (field_name = 'Q3A1'::character varying))
 Total runtime: 0.12 msec
(3 rows)

The problem comes in when we are selecting multiple field_name values
in one query.  The normal SQL syntax we have been using is like this:

select field_name, option_tag from ds_rec_fld where recid = 3001 and
field_name in ('Q3A1', 'Q3A9');

This is just a simplified example; at times there can be many
field_name values in the "in" clause of a single query.  Here postgresql
refuses to use the full index, instead filtering on only the leading
recid column of the index.  Here is the explain analyze output:

 Index Scan using ds_rf_ndx1 on ds_rec_fld  (cost=0.00..30425.51 rows=80 width=38) (actual time=0.18..1.08 rows=2 loops=1)
   Index Cond: (recid = 3001)
   Filter: ((field_name = 'Q3A1'::character varying) OR (field_name = 'Q3A9'::character varying))
 Total runtime: 1.12 msec
(4 rows)

So, 10 times longer.  This is an issue because at times we are
iterating through thousands of recid values.

I did a vacuum analyze, adjusted random_page_cost, etc. all to no
avail.

I also noticed that the problem goes away when I reformat the query
like this:

select field_name, option_tag from ds_rec_fld where
(recid = 3001 and field_name = 'Q3A1') or
(recid = 3001 and field_name = 'Q3A9')

Here is the explain analyze output for this:

  Index Scan using ds_rf_ndx1, ds_rf_ndx1 on ds_rec_fld  (cost=0.00..326.57 rows=80 width=38) (actual time=0.07..0.10 rows=2 loops=1)
   Index Cond: (((recid = 3001) AND (field_name = 'Q3A1'::character varying)) OR ((recid = 3001) AND (field_name = 'Q3A9'::character varying)))
 Total runtime: 0.16 msec
(3 rows)

Much better.  So I have partially solved my own problem, but there are
other places where the fix is not this simple.
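The manual rewrite above could even be automated on the client side. A hypothetical helper (the function name and string handling are illustrative only; real code should use a driver's parameter quoting rather than string pasting) that expands an IN list into the OR form the 7.3 planner handles better:

```python
"""Sketch of automating the IN-to-OR rewrite shown above, so the 7.3
planner can use both columns of ds_rf_ndx1.  Illustrative only: values
are pasted into the SQL string, which is unsafe with untrusted input."""


def expand_in_clause(recid, field_names):
    """Build the OR-of-ANDs form of the query for one recid and a list
    of field_name values."""
    disjuncts = " or\n".join(
        "(recid = %d and field_name = '%s')" % (recid, name)
        for name in field_names
    )
    return ("select field_name, option_tag from ds_rec_fld where\n"
            + disjuncts)


print(expand_in_clause(3001, ["Q3A1", "Q3A9"]))
```

For the two-value case this reproduces the reformatted query above; with thousands of recid values the generated statement gets long, which is one reason a planner-level fix would be preferable.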

Therefore, my question is, is there some way to force postgresql to use
the full index and still stick with the shorter "field_name in ('...',
'...')" syntax?

If anyone has any thoughts please let me know.  Also it strikes me that
perhaps the optimizer could be tweaked to treat the first case like the
second one.  Thanks in advance,

Rob


From:
Tom Lane
Date:

Rob Messer <> writes:
> The problem comes in when we are selecting multiple field_name values
> in one query.  The normal SQL syntax we have been using is like this:

> select field_name, option_tag from ds_rec_fld where recid = 3001 and
> field_name in ('Q3A1', 'Q3A9');

You'd have better luck if field_name were the first column of the
two-column index.  See the archives.

Improving this situation is on the to-do list, but it seems nontrivial
to fix.

            regards, tom lane