Discussion: bgwriter, checkpoints, curious (seeing delays)
Okay ladies and gents and the rest of you :)

It's time I dig into another issue, and that's a curious 5 second delay on connect, on occasion. Now, I believe the network to be sound and there are zero errors on any servers, no retrans, no runts, nada nada nada. However, I will continue to run tests with/without DNS, tcpdumps, tracking communications, handshakes, and so on.

I've been doing some reading, and what did I find? I found: "Checkpoints are very disrupting to your database performance and can cause connections to stall for up to a few seconds while they occur."

Now I'm reaching here, but I wondered. My issue is what appears to be a random 5 second connection delay. Connection times over a 24 hour period from my application servers, which connect to my query servers (postgres 8.3.4, slony 1.2.15), big 8 way boxes with 16-32gb ram. Conn time in seconds and count (50, kind of a round even number eh!):

'0.0-0.5' => 155,632
'0.5-1.0' => 6
'1.0-2.0' => 4
'5.0-'    => 50

So the 5 seconds makes us instantly think network, but I want to hit this from multiple angles, so I figured I would reach out to the brain trust here for some ideas. Here is what may be interesting, and may point me in a direction of further tuning and adjusting:

postgres=# select * from pg_stat_bgwriter;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
             34820 |             207 |          444406118 |     214634102 |          1274873 |       333375850 |    3370607455
(1 row)

Now I'll be honest, I have nothing special in my configs for bgwriter; in fact, other than enabling log_checkpoints today, I have nothing but defaults for bgwriter_*. But the fact that checkpoints_timed is so high seems to maybe point to some required tuning?

Things of interest:
Memory: 24gb, 8 way processor, postgres 8.3.4

shared_buffers = 1500MB         # min 128kB or max_connections*16kB
max_prepared_transactions = 0   # can be 0 or more
work_mem = 100MB                # min 64kB
maintenance_work_mem = 128MB    # min 1MB
max_fsm_pages = 500000          # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 225         # min 100, ~70 bytes each
fsync = off                     # turns forced synchronization on or off
checkpoint_segments = 100       # in logfile segments, min 1, 16MB each
checkpoint_warning = 3600s      # 0 is off

So nothing exciting in my configs (300 connections). So my guess is, "IF" (a really big if) the delay is actually due to checkpoints (perfect lil storm, timing), then there should be ways for me to tune this, ya?

Thanks "again" for your assistance

Tory
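As a quick sanity check on the pg_stat_bgwriter snapshot quoted above, the three buffers_* counters can be turned into shares of total buffer writes. A minimal sketch, using the counter values from the query output (the interpretation of the columns follows the 8.3 documentation; nothing here is measured beyond those numbers):

```python
# Share of buffer writes done by each writer, from the
# pg_stat_bgwriter snapshot in the message above (PostgreSQL 8.3).
buffers_checkpoint = 444_406_118  # written during checkpoints
buffers_clean      = 214_634_102  # written by the background writer
buffers_backend    = 333_375_850  # written directly by backends

total = buffers_checkpoint + buffers_clean + buffers_backend

for name, n in [("checkpoint", buffers_checkpoint),
                ("bgwriter",   buffers_clean),
                ("backend",    buffers_backend)]:
    print(f"{name:10s} {100.0 * n / total:5.1f}%")
```

This shows roughly 45% of writes coming from checkpoints and a third coming from backends themselves, which is the kind of split that usually suggests the background writer could be made more aggressive.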
On Thu, Feb 25, 2010 at 10:12 PM, Tory M Blue <tmblue@gmail.com> wrote:
> Okay ladies and gents and the rest of you :)
>
> It's time I dig into another issue, and that's a curious 5 second
> delay on connect, on occasion. Now, I believe the network to be sound
> and there are zero errors on any servers, no retrans, no runts, nada
> nada nada. However I will continue to run tests with/without dns,
> tcpdumps, tracking communications, handshakes, however.
>
> I've been doing some reading and what did I find. I found "Checkpoints
> are very disrupting to your database performance and can cause
> connections to stall for up to a few seconds while they occur."

Quick added note, sorry.

"checkpoint_completion_target (floating point)

Specifies the target length of checkpoints, as a fraction of the checkpoint interval. The default is 0.5. This parameter can only be set in the postgresql.conf file or on the server command line."

Interesting that it's a .5 second default setting and I'm seeing exactly that .5 second delay. Again, could be reaching here!

Thanks for putting up with me

Tory
Friday, February 26, 2010, 7:20:38 AM you wrote:

> "checkpoint_completion_target (floating point)

> interesting that it's a .5 second default setting and I'm seeing
> exactly that .5 second delay.

It's not an exact time, but a multiplier to 'checkpoint_timeout'. So a setting of 0.5 with a timeout of 300 seconds means a checkpoint should be completed after 300*0.5 = 150 seconds.

--
Jochen Erwied     | home: jochen@erwied.eu            +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 | work: joe@mbs-software.de        +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile: jochen.erwied@vodafone.de         +49-173-5404164
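The arithmetic above can be sketched directly, using the thread's values (the 8.3 default checkpoint_timeout of 300 s and checkpoint_completion_target of 0.5):

```python
checkpoint_timeout = 300            # seconds (PostgreSQL default, 5 min)
checkpoint_completion_target = 0.5  # a fraction of the interval, not seconds

# Target window over which the checkpoint's writes are spread out:
target_write_window = checkpoint_timeout * checkpoint_completion_target
print(target_write_window)  # 150.0
```

This matches the ~148-150 s write times that show up later in the thread's log_checkpoints output.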
On Thu, 2010-02-25 at 22:12 -0800, Tory M Blue wrote:
> shared_buffers = 1500MB

Some people tend to increase this to 2.2GB (32-bit) or 4-6GB (64-bit), if needed. Please note that more shared_buffers will lead to more pressure on bgwriter, but it also has lots of benefits, too.

> work_mem = 100MB

This is too much. Since you have 300 connections, you will probably swap because of this setting, since each connection may use this much work_mem. The rule of thumb is to set this to a lower general value (say, 1-2 MB), and set it per-query when needed.

> checkpoint_segments = 100
> checkpoint_warning = 3600s

What about checkpoint_timeout? Please note that even if checkpoint_segments = 100, if the timeout value is low (say 5 mins), PostgreSQL will probably checkpoint every checkpoint_timeout minutes (unless PostgreSQL creates $checkpoint_segments xlogs before checkpoint_timeout is reached). Depending on your workload, this may not be intended, and it may cause spikes -- which will lead to the issues you complain about.

I'll stop here, and suggest you read this:

http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm

for details about this subject. As noted there, if you are running 8.3+, pg_stat_bgwriter will help you to tune checkpoint & bgwriter settings.

-HTH.
--
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
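The segments-vs-timeout race described above can be sketched: a checkpoint fires at whichever limit is hit first. A rough model using the thread's settings (checkpoint_segments = 100, checkpoint_timeout = 300 s); the WAL rates passed in are illustrative assumptions, not measurements:

```python
SEGMENT_BYTES = 16 * 1024 * 1024  # one xlog segment is 16MB in 8.3

def checkpoint_trigger(wal_bytes_per_sec, checkpoint_segments=100,
                       checkpoint_timeout=300):
    """Return which limit fires first, given a steady WAL write rate."""
    secs_to_fill = checkpoint_segments * SEGMENT_BYTES / wal_bytes_per_sec
    return "segments" if secs_to_fill < checkpoint_timeout else "timeout"

# ~20 segments recycled per 5-minute checkpoint (as in the logs later in
# this thread) is roughly 1.1 MB/s of WAL -- far below the segments limit:
print(checkpoint_trigger(1.1 * 1024 * 1024))    # "timeout"
print(checkpoint_trigger(100 * 1024 * 1024))    # "segments"
```

This is consistent with the reported counters (checkpoints_timed = 34820 vs. checkpoints_req = 207): nearly every checkpoint is time-driven.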
2010/2/25 Devrim GÜNDÜZ <devrim@gunduz.org>:
> On Thu, 2010-02-25 at 22:12 -0800, Tory M Blue wrote:
>> shared_buffers = 1500MB
>
> Some people tend to increase this to 2.2GB(32-bit) or 4-6 GB (64 bit),
> if needed. Please note that more shared_buffers will lead to more
> pressure on bgwriter, but it also has lots of benefits, too.
>
>> work_mem = 100MB
>
> This is too much. Since you have 300 connections, you will probably swap
> because of this setting, since each connection may use this much
> work_mem. The rule of the thumb is to set this to a lower general value
> (say, 1-2 MB), and set it per-query when needed.
>
>> checkpoint_segments = 100
>> checkpoint_warning = 3600s
>
> What about checkpoint_timeout? Please note that even if
> checkpoint_segments = 100, if timeout value is low (say 5 mins),
> PostgreSQL will probably checkpoint every checkpoint_timeout minutes
> (unless PostgreSQL creates $checkpoint_segments xlogs before
> checkpoint_timeout value). Depending on your workload, this may not be
> intended, and it may cause spikes -- which will lead to the issues you
> complain.
>
> I'll stop here, and suggest you read this:
>
> http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
>
> for details about this subject. As noted there, if you are running 8.3+,
> pg_stat_bgwriter will help you to tune checkpoint & bgwriter settings.
>
> -HTH.

Checkpoint_timeout is the default, and that looks like 5 mins (300 seconds). That is obviously why I have such a discrepancy between timed and requested.

Thank you sir, that's actually the page that I've spent much of my time on this eve :) I'll continue to read and check my configuration settings.

Tory
2010/2/25 Tory M Blue <tmblue@gmail.com>:
> 2010/2/25 Devrim GÜNDÜZ <devrim@gunduz.org>:
>> On Thu, 2010-02-25 at 22:12 -0800, Tory M Blue wrote:
>>> shared_buffers = 1500MB
>>
>> Some people tend to increase this to 2.2GB(32-bit) or 4-6 GB (64 bit),
>> if needed. Please note that more shared_buffers will lead to more
>> pressure on bgwriter, but it also has lots of benefits, too.
>>
>>> work_mem = 100MB
>>
>> This is too much. Since you have 300 connections, you will probably swap
>> because of this setting, since each connection may use this much
>> work_mem. The rule of the thumb is to set this to a lower general value
>> (say, 1-2 MB), and set it per-query when needed.
>>
>>> checkpoint_segments = 100
>>> checkpoint_warning = 3600s
>>
>> What about checkpoint_timeout? Please note that even if
>> checkpoint_segments = 100, if timeout value is low (say 5 mins),
>> PostgreSQL will probably checkpoint every checkpoint_timeout minutes
>> (unless PostgreSQL creates $checkpoint_segments xlogs before
>> checkpoint_timeout value). Depending on your workload, this may not be
>> intended, and it may cause spikes -- which will lead to the issues you
>> complain.
>>
>> I'll stop here, and suggest you read this:
>>
>> http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
>>
>> for details about this subject. As noted there, if you are running 8.3+,
>> pg_stat_bgwriter will help you to tune checkpoint & bgwriter settings.
>>
>> -HTH.
>
> Checkpoint_timeout is the default and that looks like 5 mins (300
> seconds). And is obviously why I have such a discrepancy between time
> reached and requested.
>
> Thank you sir, that's actually the page that I've spent much of my
> time on this eve :) I'll continue to read and check my configuration
> settings.
>
> Tory

Also, since I set the log on today, I have some log information regarding the checkpoints:

2010-02-25 22:08:13 PST LOG:  checkpoint starting: time
2010-02-25 22:10:41 PST LOG:  checkpoint complete: wrote 44503 buffers (23.2%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=148.539 s, sync=0.000 s, total=148.540 s
2010-02-25 22:13:13 PST LOG:  checkpoint starting: time
2010-02-25 22:15:37 PST LOG:  checkpoint complete: wrote 38091 buffers (19.8%); 0 transaction log file(s) added, 0 removed, 20 recycled; write=144.713 s, sync=0.000 s, total=144.714 s
2010-02-25 22:18:13 PST LOG:  checkpoint starting: time
2010-02-25 22:20:42 PST LOG:  checkpoint complete: wrote 38613 buffers (20.1%); 0 transaction log file(s) added, 0 removed, 19 recycled; write=149.870 s, sync=0.000 s, total=149.871 s
2010-02-25 22:23:13 PST LOG:  checkpoint starting: time
2010-02-25 22:25:42 PST LOG:  checkpoint complete: wrote 39009 buffers (20.3%); 0 transaction log file(s) added, 0 removed, 19 recycled; write=149.876 s, sync=0.000 s, total=149.877 s
2010-02-25 22:28:13 PST LOG:  checkpoint starting: time
2010-02-25 22:30:43 PST LOG:  checkpoint complete: wrote 30847 buffers (16.1%); 0 transaction log file(s) added, 0 removed, 19 recycled; write=150.000 s, sync=0.000 s, total=150.001 s
2010-02-25 22:33:13 PST LOG:  checkpoint starting: time
2010-02-25 22:35:43 PST LOG:  checkpoint complete: wrote 11917 buffers (6.2%); 0 transaction log file(s) added, 0 removed, 14 recycled; write=150.064 s, sync=0.000 s, total=150.065 s
2010-02-25 22:38:13 PST LOG:  checkpoint starting: time
2010-02-25 22:40:43 PST LOG:  checkpoint complete: wrote 10869 buffers (5.7%); 0 transaction log file(s) added, 0 removed, 5 recycled; write=149.995 s, sync=0.000 s, total=149.996 s
2010-02-25 22:43:13 PST LOG:  checkpoint starting: time
2010-02-25 22:45:41 PST LOG:  checkpoint complete: wrote 31424 buffers (16.4%); 0 transaction log file(s) added, 0 removed, 4 recycled; write=148.597 s, sync=0.000 s, total=148.598 s
2010-02-25 22:48:13 PST LOG:  checkpoint starting: time
2010-02-25 22:50:42 PST LOG:  checkpoint complete: wrote 33895 buffers (17.7%); 0 transaction log file(s) added, 0 removed, 17 recycled; write=149.825 s, sync=0.000 s, total=149.826 s
2010-02-25 22:53:13 PST LOG:  checkpoint starting: time
2010-02-25 22:53:17 PST postgres postgres [local] LOG:  unexpected EOF on client connection
2010-02-25 22:55:43 PST LOG:  checkpoint complete: wrote 34155 buffers (17.8%); 0 transaction log file(s) added, 0 removed, 15 recycled; write=150.045 s, sync=0.000 s, total=150.046 s
2010-02-25 22:58:13 PST LOG:  checkpoint starting: time
2010-02-25 23:00:41 PST LOG:  checkpoint complete: wrote 33873 buffers (17.6%); 0 transaction log file(s) added, 0 removed, 15 recycled; write=148.223 s, sync=0.000 s, total=148.224 s
2010-02-25 23:03:13 PST LOG:  checkpoint starting: time
On Thu, 2010-02-25 at 23:01 -0800, Tory M Blue wrote:
> Checkpoint_timeout is the default and that looks like 5 mins (300
> seconds). And is obviously why I have such a discrepancy between time
> reached and requested.

If you have a high load, you may want to start tuning with 15 minutes, and bump it to 30 mins if needed. Also, you may want to decrease the segments value based on your findings, since increasing only one of them won't help you a lot.

As I wrote before, pg_stat_bgwriter is your friend here.

--
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
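The suggestion above might look something like this as a postgresql.conf fragment. The specific numbers here are illustrative assumptions only, to be validated against pg_stat_bgwriter and the checkpoint log, not a recommendation for this particular workload:

```
# Spread checkpoints further apart and smooth their I/O (8.3 syntax).
checkpoint_timeout = 15min          # up from the 5min default
checkpoint_segments = 64            # revisit if checkpoints_req starts climbing
checkpoint_completion_target = 0.7  # spread writes over more of the interval
log_checkpoints = on                # keep verifying the effect
```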
Tory M Blue wrote:
> 2010-02-25 22:10:41 PST LOG: checkpoint complete: wrote 44503
> buffers (23.2%); 0 transaction log file(s) added, 0 removed, 20
> recycled; write=148.539 s, sync=0.000 s, total=148.540 s

This one is typical for your list, so I'll only comment on it. This is writing out 350MB spread over 148 seconds, which means your background checkpoint I/O is about a 2.4MB/s stream. That's a moderate amount that could be tough for some systems, but note that your "sync" time is nothing. Generally, when someone sees a long pause that's caused by a checkpoint, the sync number is really high. Your disks seem to be keeping up with the checkpoint overhead moderately well.

(Mind you, the zero sync time is because you have 'fsync = off', which will eventually result in your database being corrupted after a badly timed outage one day; you really don't want to do that.)

My guess is your connections are doing some sort of DNS operation that periodically stalls waiting for a 5-second timeout. There's nothing in your checkpoint data suggesting it's a likely cause of a delay that long--and it would be a lot more random if that were the case, too. Bad checkpoint spikes will be seconds sometimes, no time at all others; a heavy grouping at 5 seconds doesn't match the pattern they have at all.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
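The 350MB / 2.4MB/s figures above can be reproduced from the quoted log line, given PostgreSQL's default 8kB block size. A quick sketch of the arithmetic:

```python
BLOCK_SIZE = 8192        # default PostgreSQL block size, in bytes

buffers_written = 44503  # from the checkpoint log line quoted above
write_seconds = 148.539

total_mb = buffers_written * BLOCK_SIZE / (1024 * 1024)
rate_mb_s = total_mb / write_seconds

# ~348 MB written at ~2.3 MB/s, matching the round numbers above.
print(f"{total_mb:.0f} MB written, {rate_mb_s:.1f} MB/s")
```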
Tory M Blue wrote:

> 2010-02-25 22:53:13 PST LOG: checkpoint starting: time
> 2010-02-25 22:53:17 PST postgres postgres [local] LOG: unexpected
> EOF on client connection
> 2010-02-25 22:55:43 PST LOG: checkpoint complete: wrote 34155
> buffers (17.8%); 0 transaction log file(s) added, 0 removed, 15
> recycled; write=150.045 s, sync=0.000 s, total=150.046 s

Did that unexpected EOF correspond to a connection attempt that gave up based on time?

-Kevin
On Fri, Feb 26, 2010 at 5:09 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Tory M Blue wrote:
>
>> 2010-02-25 22:53:13 PST LOG: checkpoint starting: time
>> 2010-02-25 22:53:17 PST postgres postgres [local] LOG: unexpected
>> EOF on client connection
>> 2010-02-25 22:55:43 PST LOG: checkpoint complete: wrote 34155
>> buffers (17.8%); 0 transaction log file(s) added, 0 removed, 15
>> recycled; write=150.045 s, sync=0.000 s, total=150.046 s
>
> Did that unexpected EOF correspond to a connection attempt that gave
> up based on time?
>
> -Kevin

Kevin,

Good question; I'm unclear on what that was. Since it's marked [local], it doesn't look like one of my application's client connections, which is what confused me. I don't normally see an EOF on a client connection, and an EOF from a local connection is really, really weird.

Tory
On Feb 26, 2010, at 11:23 AM, Tory M Blue wrote:
> Good question, I'm unclear what that was. I mean it's a LOG, so not a
> client connection, that really kind of confused me. I don't normally
> see EOF of client and an EOF on client from local, that's really
> really weird

We see that from our monitoring software testing port 5432.
2010/2/25 Devrim GÜNDÜZ <devrim@gunduz.org>:
> On Thu, 2010-02-25 at 22:12 -0800, Tory M Blue wrote:
>> shared_buffers = 1500MB
>
> Some people tend to increase this to 2.2GB(32-bit) or 4-6 GB (64 bit),
> if needed. Please note that more shared_buffers will lead to more
> pressure on bgwriter, but it also has lots of benefits, too.
>
>> work_mem = 100MB
>
> This is too much. Since you have 300 connections, you will probably swap
> because of this setting, since each connection may use this much
> work_mem. The rule of the thumb is to set this to a lower general value
> (say, 1-2 MB), and set it per-query when needed.

I'm slightly confused. Most things I've read, including running pgtune for grins, put this around 100MB (98MB for pgtune). 1-2MB just seems really low to me. And ignore the 300 connections; that's an upper limit. I usually run a max of 40-45, but usually around 20 connections per second.

Also, is there a way to log whether any deadlocks are happening? (I'm not seeing them in the logs.) deadlock_timeout = 5s

Tory
2010/2/25 Devrim GÜNDÜZ <devrim@gunduz.org>:
> On Thu, 2010-02-25 at 23:01 -0800, Tory M Blue wrote:
>
>> Checkpoint_timeout is the default and that looks like 5 mins (300
>> seconds). And is obviously why I have such a discrepancy between time
>> reached and requested.
>
> If you have a high load, you may want to start tuning with 15 minutes,
> and bump it to 30 mins if needed. Also you may want to decrease segments
> value based on your findings, since increasing only one of them won't
> help you a lot.
>
> As I wrote before, pg_stat_bgwriter is your friend here.

Actually these servers have almost no load. Really, they run very cool:

Load Average (for the day):   Cur: 0.16   Avg: 0.22

So I don't think it's load. No network issues (that I've found), and while the server will eventually eat all the memory, it's currently sitting with 4GB free:

Mem Total (for the day):      Cur: 25.55 GBytes   Avg: 25.55 GBytes   Max: 25.55 GBytes
Mem Available (for the day):  Cur:  4.72 GBytes   Avg:  5.15 GBytes   Max:  5.71 GBytes
Bytes Used (for the day):     Cur: 20.83 GBytes   Avg: 20.40 GBytes   Max: 21.20 GBytes

Thanks for your pointers; I'm continuing to look and will do some tests today. I also hear you about fsync and will do some testing here to see why this was set. These systems have been running for 4-6 years, and that setting was probably made way back in the day and has survived each upgrade/hardware/storage update.

Tory
>>> Tory M Blue <tmblue@gmail.com> 02/26/10 12:52 PM >>>
>>
>> This is too much. Since you have 300 connections, you will probably swap
>> because of this setting, since each connection may use this much
>> work_mem. The rule of the thumb is to set this to a lower general value
>> (say, 1-2 MB), and set it per-query when needed.
>
> I'm slightly confused. Most things I've read, including running
> pg_tune for grins puts this around 100MB, 98MB for pgtune. 1-2MB just
> seems really low to me. And Ignore the 300 connections, thats an upper
> limit, I usually run a max of 40-45 but usually around 20 connections
> per sec.

It has been said on this list before that pgtune is extremely aggressive when it comes to work_mem. 100MB is just a whole lot of memory for something that is dedicated mostly to sorting. Some of my relatively heavy-duty queries, which end up manipulating hundreds of thousands of rows in subqueries, do just fine with quite a bit less.

1-2MB is good enough for many families of queries, but it's hard to say what the right default should be for you. The right number can be estimated by running EXPLAIN ANALYZE on your most common queries, with parameters that are representative of regular use, and seeing how much memory they actually claim to use. In my case, for example, most of my queries do just fine with 10MB, while the reporting queries that accumulate quite a bit of data request up to 60MB.

If your average query needs 100MB, it'd still mean that 40 connections take 4 gigs' worth of work memory, which might be better spent caching the database. Now, if your system is so over-specced that wasting a few gigs of RAM doesn't impact your performance one bit, then you might not have to worry about this at all.
> -----Original Message-----
> From: Tory M Blue
>
> I'm slightly confused. Most things I've read, including
> running pg_tune for grins puts this around 100MB, 98MB for
> pgtune. 1-2MB just seems really low to me. And Ignore the
> 300 connections, thats an upper limit, I usually run a max of
> 40-45 but usually around 20 connections per sec.

If you have a maximum of 45 users simultaneously running a rather complex query that requires, say, 3 sorts, that's 45 x 100MB x 3 = 13.5 GB of RAM used up because of this particular work_mem setting. That doesn't mean it will happen, just that your settings make this scenario possible.

So, to avoid this scenario, the suggestion is to keep work_mem low and adjust it per query if required. I find 1-2 MB too low for my particular requirements, so I have it at 8 MB. Anyway, given that your server has a lot of RAM, your setting might make sense. But all that memory would probably be better used if it were available for caching.

> Also is there a way to log if there are any deadlocks
> happening (I'm not seeing them in logs) deadlock_timeout = 5s

In postgresql.conf:

log_lock_waits = on    # log lock waits >= deadlock_timeout

Regards,
Fernando.
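The worst-case arithmetic above can be written out with the thread's numbers (45 concurrent queries, work_mem = 100MB, 3 sort operations per query; the 3-sorts figure is Fernando's illustrative assumption):

```python
# Upper bound on memory claimable through work_mem alone.
connections = 45       # observed peak concurrency from the thread
work_mem_mb = 100      # the work_mem setting under discussion
sorts_per_query = 3    # hypothetical complex query with 3 sorts

worst_case_mb = connections * work_mem_mb * sorts_per_query
print(f"{worst_case_mb} MB, about {worst_case_mb / 1024:.1f} GiB")
```

This is a ceiling, not a prediction: each sort or hash node may use up to work_mem, so the bound only binds when every connection hits that many memory-hungry nodes at once.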
On Fri, Feb 26, 2010 at 11:49 AM, Jorge Montero
<jorge_montero@homedecorators.com> wrote:
> 1-2MB is good enough for many families of queries, but it's hard to say
> what the right default should be for you. The right number can be
> estimated by running EXPLAIN ANALYZE on your most common queries, with
> parameters that are representative of regular use, and seeing how much
> memory they actually claim to use.
>
> If your average query needs 100 MB, it'd still mean that 40 connections
> take 4 gigs worth of work memory, which might be better spent caching
> the database.

Ya, my boxes are pretty well stacked, but a question: how does one get the memory usage of a query? You state to look at EXPLAIN ANALYZE, but this gives timing and costs. Is one of the numbers memory, or do I have to take the values and do some math?

 Function Scan on listings_search  (cost=0.00..260.00 rows=1000 width=108) (actual time=904.374..904.383 rows=10 loops=1)
 Total runtime: 904.411 ms

Thanks
Tory

Also, I don't think this 5 second thing is the DB. It's sure not checkpoints.
On Fri, Feb 26, 2010 at 6:24 PM, Tory M Blue <tmblue@gmail.com> wrote:
> Ya my boxes are pretty well stacked, but a question. How does one get
> the memory usage of a query. You state to look at explain analyze but
> this gives timing and costs, but is one of the numbers memory or do I
> have to take values and do some math?

There's really not good instrumentation on memory usage, except for sorts. 9.0 will add a bit of instrumentation for hashes. I might try 'top'.

...Robert