Thread: Reducing stats collection overhead

Reducing stats collection overhead

From

Tom Lane

Date:

29 April 2007, 04:44:36

Arjen van der Meijden told me that according to the tweakers.net
benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
here that for small SELECT queries issued as separate transactions,
there's a significant difference.  I think much of the difference stems
from the fact that we now have stats_row_level ON by default, and so
every transaction sends a stats message that wasn't there by default
in 8.2.  When you're doing a few thousand transactions per second
(not hard for small read-only queries) that adds up.

It seems to me that this could be fixed fairly easily by allowing the
stats to accumulate across multiple small transactions before sending
a message.  There's surely not much point in kicking stats out quickly
when the stats collector only reports them to the world every half
second anyway.

The first design that comes to mind is that at transaction end
(pgstat_report_tabstat() time) we send a stats message only if at least
X milliseconds have elapsed since we last sent one, where X is
PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
flush stats out before process exit.  This approach ensures that in a
lots-of-short-transactions scenario, we only need to send one stats
message every X msec, not one per query.  The cost is possible delay of
stats reports.  I claim that any transaction that makes a really sizable
change in the stats will run longer than X msec and therefore will send
its stats immediately.  Cases where a client does a small transaction
after sleeping for awhile (more than X msec) will also send immediately.
You might get a delay in reporting the last few transactions of a burst
of short transactions, but how much does it matter?  So I think that
complicating the design with, say, a timeout counter to force out the
stats after a sleep interval is not necessary.  Doing so would add a
couple of kernel calls to every client interaction so I'd really rather
avoid that.

Any thoughts, better ideas?
        regards, tom lane
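
Editorial sketch of the design described above: accumulate counts at transaction end and send a message only when at least X milliseconds have passed since the last send, plus a flush at process exit. This is a minimal illustration, not the actual pgstat code. The names `StatsThrottle`, `report_at_xact_end`, `flush_at_exit` and the constant `STAT_SEND_INTERVAL_MS` are hypothetical, and the 500 ms value is only an assumption drawn from the "every half second" remark. The caller supplies the current time, so the sketch takes no position on where the timestamp comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for X; assumed to be about half a second. */
#define STAT_SEND_INTERVAL_MS 500

/* Per-backend accumulator: counts pile up here between sends. */
typedef struct StatsThrottle
{
    int64_t last_send_ms;   /* time of the last message actually sent */
    long    pending_xacts;  /* transactions accumulated since then */
    long    messages_sent;  /* number of messages that went out */
} StatsThrottle;

/* Stand-in for the real send path: count the message, reset the batch. */
static void
send_pending(StatsThrottle *st, int64_t now_ms)
{
    if (st->pending_xacts == 0)
        return;             /* nothing accumulated, nothing to send */
    st->messages_sent++;
    st->pending_xacts = 0;
    st->last_send_ms = now_ms;
}

/* Call at transaction end. Returns true if a message was sent. */
bool
report_at_xact_end(StatsThrottle *st, int64_t now_ms)
{
    assert(st != NULL);
    st->pending_xacts++;
    /* Too soon since the last send: keep accumulating. */
    if (now_ms - st->last_send_ms < STAT_SEND_INTERVAL_MS)
        return false;
    send_pending(st, now_ms);
    return true;
}

/* Call before process exit so the tail of a burst is not lost. */
void
flush_at_exit(StatsThrottle *st, int64_t now_ms)
{
    assert(st != NULL);
    send_pending(st, now_ms);
}
```

With this shape, a transaction that runs longer than the interval always sends at its end, because the last send necessarily happened before it started. A burst of short transactions produces one message per interval, and only the last few transactions of the burst wait for the next send or for the exit-time flush.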

Re: Reducing stats collection overhead

From

Bruce Momjian

Date:

29 April 2007, 07:17:38

Yes, it seems we will have to do something for 8.3.  I assume the method
below would reduce frequent updates of the stats_command_string too.

---------------------------------------------------------------------------

Tom Lane wrote:
> Arjen van der Meijden told me that according to the tweakers.net
> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
> here that for small SELECT queries issued as separate transactions,
> there's a significant difference.  I think much of the difference stems
> from the fact that we now have stats_row_level ON by default, and so
> every transaction sends a stats message that wasn't there by default
> in 8.2.  When you're doing a few thousand transactions per second
> (not hard for small read-only queries) that adds up.
> 
> It seems to me that this could be fixed fairly easily by allowing the
> stats to accumulate across multiple small transactions before sending
> a message.  There's surely not much point in kicking stats out quickly
> when the stats collector only reports them to the world every half
> second anyway.
> 
> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
> flush stats out before process exit.  This approach ensures that in a
> lots-of-short-transactions scenario, we only need to send one stats
> message every X msec, not one per query.  The cost is possible delay of
> stats reports.  I claim that any transaction that makes a really sizable
> change in the stats will run longer than X msec and therefore will send
> its stats immediately.  Cases where a client does a small transaction
> after sleeping for awhile (more than X msec) will also send immediately.
> You might get a delay in reporting the last few transactions of a burst
> of short transactions, but how much does it matter?  So I think that
> complicating the design with, say, a timeout counter to force out the
> stats after a sleep interval is not necessary.  Doing so would add a
> couple of kernel calls to every client interaction so I'd really rather
> avoid that.
> 
> Any thoughts, better ideas?
> 
>             regards, tom lane

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

Re: Reducing stats collection overhead

From

Tom Lane

Date:

29 April 2007, 07:27:25

Bruce Momjian <bruce@momjian.us> writes:
> Yes, it seems we will have to do something for 8.3.

Yeah, we've kind of ignored any possible overhead of the stats mechanism
for awhile, but I think we've got to face up to this if we're gonna have
it on by default.

> I assume the method
> below would reduce frequent updates of the stats_command_string too.

No, stats_command_string is entirely independent now.
        regards, tom lane

Re: Reducing stats collection overhead

From

Bruce Momjian

Date:

29 April 2007, 10:08:29

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Yes, it seems we will have to do something for 8.3.
> 
> Yeah, we've kind of ignored any possible overhead of the stats mechanism
> for awhile, but I think we've got to face up to this if we're gonna have
> it on by default.
> 
> > I assume the method
> > below would reduce frequent updates of the stats_command_string too.
> 
> No, stats_command_string is entirely independent now.

Oh, right, we used shared memory.  No wonder it isn't on the TODO list
anymore.  ;-)

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

Re: Reducing stats collection overhead

From

"Simon Riggs"

Date:

29 April 2007, 11:18:44

On Sun, 2007-04-29 at 00:44 -0400, Tom Lane wrote:

> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
> flush stats out before process exit.

Sounds like a good general, long term solution.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: Reducing stats collection overhead

From

Gregory Stark

Date:

29 April 2007, 11:36:24

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> So I think that complicating the design with, say, a timeout counter to
> force out the stats after a sleep interval is not necessary. Doing so would
> add a couple of kernel calls to every client interaction so I'd really
> rather avoid that.
>
> Any thoughts, better ideas?

If we want to have an idle_in_statement_timeout then we'll need to introduce a
select loop instead of just directly blocking on recv anyways. Does that mean
we may as well bite the bullet now?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

Re: Reducing stats collection overhead

From

Lukas Kahwe Smith

Date:

29 April 2007, 14:34:06

Tom Lane wrote:

> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
> flush stats out before process exit.  This approach ensures that in a
> lots-of-short-transactions scenario, we only need to send one stats
> message every X msec, not one per query.  The cost is possible delay of
> stats reports.  I claim that any transaction that makes a really sizable
> change in the stats will run longer than X msec and therefore will send
> its stats immediately.  Cases where a client does a small transaction
> after sleeping for awhile (more than X msec) will also send immediately.
> You might get a delay in reporting the last few transactions of a burst
> of short transactions, but how much does it matter?  So I think that
> complicating the design with, say, a timeout counter to force out the
> stats after a sleep interval is not necessary.  Doing so would add a
> couple of kernel calls to every client interaction so I'd really rather
> avoid that.

Well and if this delaying of updating the stats has an effect on query 
time, then it also increases the likelihood of going past the X msec 
limit of that previously "small" query. So it's sort of "self fixing" 
with the only risk of one query getting overly long due to lack of stats 
updating.

regards,
Lukas

Re: Reducing stats collection overhead

From

Alvaro Herrera

Date:

29 April 2007, 16:00:47

Tom Lane wrote:

> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
> flush stats out before process exit.  This approach ensures that in a
> lots-of-short-transactions scenario, we only need to send one stats
> message every X msec, not one per query.

If you're going to make it depend on the timestamp set by transaction
start, I'm all for it.

> The cost is possible delay of stats reports.  I claim that any
> transaction that makes a really sizable change in the stats will run
> longer than X msec and therefore will send its stats immediately.

I agree with this, particularly if it means we don't get to add another
gettimeofday().

FWIW, am I reading the code wrong or do we send the number of xact
commit and rollback multiple times in pgstat_report_one_tabstat, with
only the first one having non-zero counts?  Maybe we could put these
counters in a separate message to reduce the size of the tabstat
messages themselves.  (It may be that the total impact in bytes is
minimal, and the added overhead of an additional message is greater?)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Reducing stats collection overhead

From

Tom Lane

Date:

29 April 2007, 16:21:24

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> The first design that comes to mind is that at transaction end
>> (pgstat_report_tabstat() time) we send a stats message only if at least
>> X milliseconds have elapsed since we last sent one, where X is
>> PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
>> flush stats out before process exit.  This approach ensures that in a
>> lots-of-short-transactions scenario, we only need to send one stats
>> message every X msec, not one per query.

> If you're going to make it depend on the timestamp set by transaction
> start, I'm all for it.

Ah, you're worried about not adding an extra gettimeofday() call.
Actually I was going to make it use the transaction-commit timestamp,
which xact.c already does a kernel call for so it can put a timestamp in
the xlog commit or abort record.  We don't save that aside at the moment
but easily could.

Doing this would probably mean wanting to convert the timestamps
stored in xlog commit/abort records from time_t to timestamptz;
anyone have a problem with that?

> FWIW, am I reading the code wrong or do we send the number of xact
> commit and rollback multiple times in pgstat_report_one_tabstat, with
> only the first one having non-zero counts?  Maybe we could put these
> counters in a separate message to reduce the size of the tabstat
> messages themselves.  (It may be that the total impact in bytes is
> minimal, and the added overhead of an additional message is greater?)

Yeah, that design seems fine to me as-is.  We'd only be sending multiple
messages if we have more than about 1K of tabstat records, so the
overhead is only 16 bytes out of each additional 1K ... not a lot.
        regards, tom lane
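
Editorial sketch of the timestamp reuse discussed above: the commit path already reads the clock once for the commit record, so saving that value lets the stats code decide whether to send without a second kernel call. This is only an illustration under assumptions. `record_commit`, `stats_report_time_ms`, `saved_commit_ms` and the `clock_calls` counter are hypothetical names, not functions from xact.c or pgstat.c; only `gettimeofday()` is a real call.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical: the timestamp taken at commit, saved for later reuse. */
static int64_t saved_commit_ms;

/* Illustration only: counts how often the clock is actually read. */
static long clock_calls;

/* The one place that asks the kernel for the time. */
static int64_t
read_clock_ms(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    clock_calls++;
    return (int64_t) tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Commit path: read the clock once for the commit record and keep it. */
int64_t
record_commit(void)
{
    saved_commit_ms = read_clock_ms();
    return saved_commit_ms;
}

/* Stats path: reuse the saved commit time instead of reading the clock. */
int64_t
stats_report_time_ms(void)
{
    return saved_commit_ms;
}
```

The value returned by `stats_report_time_ms()` is what a send-or-accumulate check would compare against the time of the last stats message, so the throttle adds no clock reads beyond the one commit already makes.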

Re: Reducing stats collection overhead

From

Tom Lane

Date:

29 April 2007, 16:30:09

Gregory Stark <stark@enterprisedb.com> writes:
> If we want to have an idle_in_statement_timeout then we'll need to introduce a
> select loop instead of just directly blocking on recv anyways. Does that mean
> we may as well bite the bullet now?

If we wanted such a timeout (which I personally don't) we wouldn't
implement it with select because OpenSSL wouldn't cooperate.  AFAICS
this'd require setting a timer interrupt ... and then unsetting it when
the client response comes back.
        regards, tom lane
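
Editorial sketch of the cost mentioned above: a timer-based timeout means arming a timer before waiting for the client and disarming it when the response arrives, which is two system calls per client interaction. The sketch uses the POSIX `setitimer()` call; the wrapper names `arm_idle_timer` and `disarm_idle_timer` are hypothetical, and this is not a claim about how PostgreSQL implements its timeouts. A real program would also need a SIGALRM handler installed, since the default action for that signal terminates the process.

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Arm a one-shot real-time timer that expires after timeout_ms.
 * This is one system call. Returns 0 on success, -1 on error.
 * Note that a timeout of 0 would disarm the timer instead.
 */
int
arm_idle_timer(long timeout_ms)
{
    struct itimerval tv;

    tv.it_interval.tv_sec = 0;      /* no repeat: fire once */
    tv.it_interval.tv_usec = 0;
    tv.it_value.tv_sec = timeout_ms / 1000;
    tv.it_value.tv_usec = (timeout_ms % 1000) * 1000;
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/*
 * Disarm the timer once the client response has arrived.
 * This is a second system call. Returns 0 on success, -1 on error.
 */
int
disarm_idle_timer(void)
{
    struct itimerval off;

    off.it_interval.tv_sec = 0;
    off.it_interval.tv_usec = 0;
    off.it_value.tv_sec = 0;        /* zero value means "timer off" */
    off.it_value.tv_usec = 0;
    return setitimer(ITIMER_REAL, &off, NULL);
}
```

Wrapping every blocking read from the client in this arm/disarm pair is the per-interaction overhead that the proposed design avoids by checking elapsed time only at transaction end.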

Re: Reducing stats collection overhead

From

Alvaro Herrera

Date:

18 May 2007, 13:12:31

Tom Lane wrote:
> Arjen van der Meijden told me that according to the tweakers.net
> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
> here that for small SELECT queries issued as separate transactions,
> there's a significant difference.  I think much of the difference stems
> from the fact that we now have stats_row_level ON by default, and so
> every transaction sends a stats message that wasn't there by default
> in 8.2.  When you're doing a few thousand transactions per second
> (not hard for small read-only queries) that adds up.

So, did this patch make the performance problem go away?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Reducing stats collection overhead

From

Arjen van der Meijden

Date:

19 May 2007, 14:31:59

Afaik Tom hadn't finished his patch when I was testing things, so I 
don't know. But we're in the process of benchmarking a new system (dual 
quad-core Xeon) and we'll have a look at how it performs in the postgres 
8.2dev we used before, the stable 8.2.4 and a fresh HEAD-checkout (which 
we'll call 8.3dev). I'll let you guys (or at least Tom) know how they 
compare in our benchmark.

Best regards,

Arjen

On 18-5-2007 15:12 Alvaro Herrera wrote:
> Tom Lane wrote:
>> Arjen van der Meijden told me that according to the tweakers.net
>> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
>> here that for small SELECT queries issued as separate transactions,
>> there's a significant difference.  I think much of the difference stems
>> from the fact that we now have stats_row_level ON by default, and so
>> every transaction sends a stats message that wasn't there by default
>> in 8.2.  When you're doing a few thousand transactions per second
>> (not hard for small read-only queries) that adds up.
> 
> So, did this patch make the performance problem go away?
>

Re: Reducing stats collection overhead

From

Alvaro Herrera

Date:

31 July 2007, 03:07:17

Arjen van der Meijden wrote:
> Afaik Tom hadn't finished his patch when I was testing things, so I don't 
> know. But we're in the process of benchmarking a new system (dual quad-core 
> Xeon) and we'll have a look at how it performs in the postgres 8.2dev we 
> used before, the stable 8.2.4 and a fresh HEAD-checkout (which we'll call 
> 8.3dev). I'll let you guys (or at least Tom) know how they compare in our 
> benchmark.

So, ahem, did it work? :-)


> On 18-5-2007 15:12 Alvaro Herrera wrote:
>> Tom Lane wrote:
>>> Arjen van der Meijden told me that according to the tweakers.net
>>> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
>>> here that for small SELECT queries issued as separate transactions,
>>> there's a significant difference.  I think much of the difference stems
>>> from the fact that we now have stats_row_level ON by default, and so
>>> every transaction sends a stats message that wasn't there by default
>>> in 8.2.  When you're doing a few thousand transactions per second
>>> (not hard for small read-only queries) that adds up.
>> So, did this patch make the performance problem go away?


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Reducing stats collection overhead

From

Arjen van der Meijden

Date:

31 July 2007, 06:08:31

On 31-7-2007 5:07 Alvaro Herrera wrote:
> Arjen van der Meijden wrote:
>> Afaik Tom hadn't finished his patch when I was testing things, so I don't 
>> know. But we're in the process of benchmarking a new system (dual quad-core 
>> Xeon) and we'll have a look at how it performs in the postgres 8.2dev we 
>> used before, the stable 8.2.4 and a fresh HEAD-checkout (which we'll call 
>> 8.3dev). I'll let you guys (or at least Tom) know how they compare in our 
>> benchmark.
> 
> So, ahem, did it work? :-)

The machine turned out to have a faulty mainboard, so we had to 
concentrate on first figuring out why it was unstable and then whether 
the replacement mainboard did make it stable in a long durability 
test.... Of course that behaviour only appeared with mysql and not with 
postgresql, so we had to run our mysql-version of the benchmark a few 
hundred times, rather than testing various versions, until the machine 
had to go in production.

So we haven't tested postgresql 8.3dev on that machine, sorry.

Best regards,

Arjen

> 
> 
>> On 18-5-2007 15:12 Alvaro Herrera wrote:
>>> Tom Lane wrote:
>>>> Arjen van der Meijden told me that according to the tweakers.net
>>>> benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
>>>> here that for small SELECT queries issued as separate transactions,
>>>> there's a significant difference.  I think much of the difference stems
>>>> from the fact that we now have stats_row_level ON by default, and so
>>>> every transaction sends a stats message that wasn't there by default
>>>> in 8.2.  When you're doing a few thousand transactions per second
>>>> (not hard for small read-only queries) that adds up.
>>> So, did this patch make the performance problem go away?
> 
>
