Обсуждение: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

Поиск
Список
Период
Сортировка

stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Tom Lane
Дата:
"Matthew T. O'Connor" <matthew@zeut.net> writes:
>> Also, somebody made a real good point about rolled-back insertions.
>> Even if the only command you ever apply to the table is INSERT, you
>> could still have dead rows in the table if some of those transactions
>> occasionally roll back.

> hmm... That's true.  I don't think autovacuum doesn't anything to account
> for the concept of rolledback inserts.

I think this is the fault of the stats system design.  AFAICT from a
quick look at the code, inserted/updated/deleted tuples are reported
to the collector in the same way regardless of whether the sending
transaction committed or rolled back.  I think this is unquestionably
a bug, at least for autovacuum's purposes --- though it might be OK
for the original intent of the stats system, which was simply to track
activity levels.

Any thoughts about how it ought to work?
        regards, tom lane


Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> I think this is the fault of the stats system design.  AFAICT from a
> quick look at the code, inserted/updated/deleted tuples are reported
> to the collector in the same way regardless of whether the sending
> transaction committed or rolled back.  I think this is unquestionably
> a bug, at least for autovacuum's purposes --- though it might be OK
> for the original intent of the stats system, which was simply to track
> activity levels.
> 
> Any thoughts about how it ought to work?

I don't remember exactly how it works -- I think the activity (insert,
update, delete) counters are kept separately from commit/rollback
status, right?  Maybe we should keep three separate counters: "current
transaction counters" and "counters for transactions that were
aborted/committed".  We only send the latter counts, and the former are
added to them when the transaction ends.

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Aprende a avergonzarte más ante ti que ante los demás" (Demócrito)


Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> I think this is unquestionably
>> a bug, at least for autovacuum's purposes --- though it might be OK
>> for the original intent of the stats system, which was simply to track
>> activity levels.
>> 
>> Any thoughts about how it ought to work?

> I don't remember exactly how it works -- I think the activity (insert,
> update, delete) counters are kept separately from commit/rollback
> status, right?  Maybe we should keep three separate counters: "current
> transaction counters" and "counters for transactions that were
> aborted/committed".  We only send the latter counts, and the former are
> added to them when the transaction ends.

My question was at a higher level, actually: *what* should we be
counting?

I think doubling the number of counters in the stats system, which is
what you seem to be proposing, is probably not acceptable --- we've
already got a problem with the stats file becoming unreasonably bulky.
We need to figure out exactly which counts there is adequate reason
to be tracking.

I don't, for instance, see any percentage in tracking block-level I/O
operations separately for committed and rolled-back transactions.
Those numbers are certainly things you watch only for total activity,
and a failed xact is just as much system load as a committed one.
        regards, tom lane


Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> I think this is unquestionably
> >> a bug, at least for autovacuum's purposes --- though it might be OK
> >> for the original intent of the stats system, which was simply to track
> >> activity levels.
> >> 
> >> Any thoughts about how it ought to work?
> 
> > I don't remember exactly how it works -- I think the activity (insert,
> > update, delete) counters are kept separately from commit/rollback
> > status, right?  Maybe we should keep three separate counters: "current
> > transaction counters" and "counters for transactions that were
> > aborted/committed".  We only send the latter counts, and the former are
> > added to them when the transaction ends.
> 
> My question was at a higher level, actually: *what* should we be
> counting?

Oh, I see.  Do you think small incremental improvements to the stat
system will buy us much?  I think we should be thinking big here, i.e.
rewrite most stuff instead.  In the meantime, we should fix the minor
issues but not spend too much time on it; IMHO anyway.

I can devote some time to it starting from, say, mid february, which is
when I think I'm going to have more time to spend on community stuff.
(I've been spending the last couple of months on PL/php and internal
Command Prompt stuff.)

-- 
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"XML!" Exclaimed C++.  "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.


Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> My question was at a higher level, actually: *what* should we be
>> counting?

> Oh, I see.  Do you think small incremental improvements to the stat
> system will buy us much?  I think we should be thinking big here, i.e.
> rewrite most stuff instead.

Uh, I wasn't aware of any proposals for a major rewrite.  What did you
have in mind?
        regards, tom lane


Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> My question was at a higher level, actually: *what* should we be
> >> counting?
> 
> > Oh, I see.  Do you think small incremental improvements to the stat
> > system will buy us much?  I think we should be thinking big here, i.e.
> > rewrite most stuff instead.
> 
> Uh, I wasn't aware of any proposals for a major rewrite.  What did you
> have in mind?

Nothing yet really.  But we mentioned it in the recent past.  (About not
using UDP, and also about the property of not having each backend load
the whole stat file every time.)

-- 
Alvaro Herrera                                http://www.PlanetPostgreSQL.org
"This is a foot just waiting to be shot"                (Andrew Dunstan)


Re: stats for failed transactions (was Re: [GENERAL] VACUUM

От
"Matthew T. O'Connor"
Дата:
Tom Lane wrote:
>> hmm... That's true.  I don't think autovacuum doesn't anything to account
>> for the concept of rolledback inserts.
>>     
>
> I think this is the fault of the stats system design.  AFAICT from a
> quick look at the code, inserted/updated/deleted tuples are reported
> to the collector in the same way regardless of whether the sending
> transaction committed or rolled back.  I think this is unquestionably
> a bug, at least for autovacuum's purposes --- though it might be OK
> for the original intent of the stats system, which was simply to track
> activity levels.
>
> Any thoughts about how it ought to work?

The fact that autovacuum bases it's decisions on info from the stats 
system is arguably an abuse of the original design.  However I don't 
know of a better source of information at the moment.  It has always 
been my vision that autovacuum will eventually incorporate additional 
information sources to make better informed decisions.  There has always 
been discussion of using the FSM to help clue us in as to when we need 
another vacuum.  Perhaps the addition of the vacuum space map that 
people are talking about will also help. 

None of this directly addresses the question of what the stats system 
*should* track, but perhaps it is wrongheaded to totally redesign the 
stats system for the purposes of autovacuum.  As a quick semi-fix, 
perhaps autovacuum should look at the number of rollbacks vs. commits in 
an attempt to determine the accuracy of the stats.  For example if 50% 
of the transactions are getting rolled back, then autovacuum might 
include 50% of the inserts in the count towards the vacuum threshold.  
Obviously this isn't perfect, but it probably gets us closer to reality 
with the information already available.

Thoughts?

Matt



Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Tom Lane
Дата:
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> None of this directly addresses the question of what the stats system 
> *should* track, but perhaps it is wrongheaded to totally redesign the 
> stats system for the purposes of autovacuum.

I'd argue it's fine: there are tons of people using row-level stats
via autovacuum, and (AFAICT) just about nobody using 'em for any other
purpose.  Certainly you never see anyone suggesting them as a tool for
investigating problems on pgsql-performance.  Sure, it's a repurposing
of the stats subsystem, but we should be willing to do that when field
experience teaches us what's really needed.

> As a quick semi-fix, perhaps autovacuum should look at the number of
> rollbacks vs. commits in an attempt to determine the accuracy of the
> stats.  For example if 50% of the transactions are getting rolled
> back, then autovacuum might include 50% of the inserts in the count
> towards the vacuum threshold.  Obviously this isn't perfect, but it
> probably gets us closer to reality with the information already
> available.

But all that we have is *global* counts.  Inferring ratios applicable to
particular tables seems a big stretch to me.  Any given application is
likely to have some types of transactions that roll back much more often
than others.

One thing we could do is tie the stats message sending more tightly to
top-level transaction commit/abort.  (It's already effectively true that
we send stats only after commit/abort, but we send 'em from the wrong
place, ie PostgresMain.)  Then the semantics of the message could be
read as "here's what I did before committing" or "here's what I did
before aborting" and the collector could interpret the counts
accordingly.  However, this still fails in the case where a committed
top-level transaction includes some failed subtransactions.  I think
the only full solution will involve backends doing some extra work at
subtransaction commit/abort so that they can report properly classified
update counts.
        regards, tom lane


Re: stats for failed transactions (was Re: [GENERAL] VACUUM

От
"Matthew T. O'Connor"
Дата:
Tom Lane wrote:
> I'd argue it's fine: there are tons of people using row-level stats
> via autovacuum, and (AFAICT) just about nobody using 'em for any other
> purpose.  Certainly you never see anyone suggesting them as a tool for
> investigating problems on pgsql-performance.  Sure, it's a repurposing
> of the stats subsystem, but we should be willing to do that when field
> experience teaches us what's really needed.
>   

OK.

> But all that we have is *global* counts.  Inferring ratios applicable to
> particular tables seems a big stretch to me.  Any given application is
> likely to have some types of transactions that roll back much more often
> than others.
>   

Certainly a stretch, but it's far from fatal, the worst case scenario is 
we occasionally vacuum a table that might not need it, whereas the way 
things stand right now, the worst case is that we never vacuum that 
might have a lot of slack space.  BTW, I'm only arguing this based on 
what is in the stats system now and pondering how we might improve 
things if the stats system isn't changed to directly address this problem.

> One thing we could do is tie the stats message sending more tightly to
> top-level transaction commit/abort.  (It's already effectively true that
> we send stats only after commit/abort, but we send 'em from the wrong
> place, ie PostgresMain.)  Then the semantics of the message could be
> read as "here's what I did before committing" or "here's what I did
> before aborting" and the collector could interpret the counts
> accordingly.  However, this still fails in the case where a committed
> top-level transaction includes some failed subtransactions.  I think
> the only full solution will involve backends doing some extra work at
> subtransaction commit/abort so that they can report properly classified
> update counts.

Any guess as to the performance implications?



Re: stats for failed transactions (was Re: [GENERAL] VACUUM Question)

От
Tom Lane
Дата:
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> Tom Lane wrote:
>> the only full solution will involve backends doing some extra work at
>> subtransaction commit/abort so that they can report properly classified
>> update counts.

> Any guess as to the performance implications?

Pushing some counts from one place to another doesn't seem that
expensive, but it'd be nice to avoid scanning a lot of unrelated
table-stats entries to find the ones that have to be adjusted.
Not sure what it'll take exactly.

Or we could blow it off for the time being.  Certainly, getting
things right at the top-transaction level would already be a big
leg up in accuracy from where we are, and I don't think that would
be hard at all.
        regards, tom lane


Re: stats for failed transactions (was Re: [GENERAL] VACUUM

От
Josh Berkus
Дата:
Tom,

> I'd argue it's fine: there are tons of people using row-level stats
> via autovacuum, and (AFAICT) just about nobody using 'em for any other
> purpose.  Certainly you never see anyone suggesting them as a tool for
> investigating problems on pgsql-performance.  

Actually, I use the stats for performance tuning.  However, I can't say 
that I care about the exact numbers; I'm just looking for columns which 
get "lots" of seq scans or indexes that don't get used.

--Josh


Re: stats for failed transactions (was Re: [GENERAL] VACUUM

От
Jan Wieck
Дата:
On 1/27/2006 10:53 AM, Alvaro Herrera wrote:
> Tom Lane wrote:
> 
>> I think this is the fault of the stats system design.  AFAICT from a
>> quick look at the code, inserted/updated/deleted tuples are reported
>> to the collector in the same way regardless of whether the sending
>> transaction committed or rolled back.  I think this is unquestionably
>> a bug, at least for autovacuum's purposes --- though it might be OK
>> for the original intent of the stats system, which was simply to track
>> activity levels.
>> 
>> Any thoughts about how it ought to work?
> 
> I don't remember exactly how it works -- I think the activity (insert,
> update, delete) counters are kept separately from commit/rollback
> status, right?  Maybe we should keep three separate counters: "current
> transaction counters" and "counters for transactions that were
> aborted/committed".  We only send the latter counts, and the former are
> added to them when the transaction ends.

It's not a bug. More some "don't bother" attitude. The stats were not 
intended to be precise. Their sole purpose at the time of design and 
development was to give clues for missing or useless indexes, identify 
candidates to move onto different spindles and things like autovacuum. 
If the number of aborts you're riding significantly disturbs your 
statistic collection, you should look at some design issues with your 
own application instead.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: stats for failed transactions (was Re: [GENERAL] VACUUM

От
Jan Wieck
Дата:
On 1/27/2006 10:56 AM, Tom Lane wrote:

> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Tom Lane wrote:
>>> I think this is unquestionably
>>> a bug, at least for autovacuum's purposes --- though it might be OK
>>> for the original intent of the stats system, which was simply to track
>>> activity levels.
>>> 
>>> Any thoughts about how it ought to work?
> 
>> I don't remember exactly how it works -- I think the activity (insert,
>> update, delete) counters are kept separately from commit/rollback
>> status, right?  Maybe we should keep three separate counters: "current
>> transaction counters" and "counters for transactions that were
>> aborted/committed".  We only send the latter counts, and the former are
>> added to them when the transaction ends.
> 
> My question was at a higher level, actually: *what* should we be
> counting?
> 
> I think doubling the number of counters in the stats system, which is
> what you seem to be proposing, is probably not acceptable --- we've
> already got a problem with the stats file becoming unreasonably bulky.
> We need to figure out exactly which counts there is adequate reason
> to be tracking.
> 
> I don't, for instance, see any percentage in tracking block-level I/O
> operations separately for committed and rolled-back transactions.
> Those numbers are certainly things you watch only for total activity,
> and a failed xact is just as much system load as a committed one.

Right.

The problem is that different questions need different counting 
semantics. If one looks at an application profile, he wants to know how 
many inserts, updates and deletes the app has issued, probably 
regardless of the final transaction outcome. But for autovac he wants to 
be able to figure out how many dead tuples there would be, so for that 
purpose inserts in a rolled back xact would count as ins+del where 
deletes would not count at all.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #