Обсуждение: Re: [ADMIN] Vacuum error on database postgres

Поиск
Список
Период
Сортировка

Re: [ADMIN] Vacuum error on database postgres

От
Tom Lane
Дата:
andy <andy@squeakycode.net> writes:
> Tom Lane wrote:
>> andy <andy@squeakycode.net> writes:
>>> So I'm ok, but I tried it again, by dropping the database and re-running 
>>> both scripts and got the same error again.  So thought I'd offer a test 
>>> case if there was interest.
>> 
>> Absolutely.  I've seen just enough of these reports to make me think
>> there's an underlying bug.

> Here are some urls: ...

Doh ... I think the critical bit is here:

autovacuum = on                 # enable autovacuum subprocess?

The problem is that ANALYZE takes only AccessShareLock on a table,
so it's entirely possible for two backends to try to ANALYZE the
same table concurrently, and in particular for autovacuum to try to
do so while your foreground VACUUM ANALYZE is running.  That leads
to concurrent insertion attempts into pg_statistic for the same key.

This behavior dates from a time when there was no good alternative.
One possible fix today would be to make ANALYZE take
ShareUpdateExclusive lock instead, thus ensuring there is only one
ANALYZE at a time on a table.  However I'm a bit concerned by the
possibility that ANALYZE-inside-a-transaction could accumulate a
whole bunch of such locks in a random order, leading at least to
a risk of deadlocks against other ANALYZEs.  (We have to hold the
lock till commit, else we aren't fixing the problem.)  Do we need a
specialized lock type just for ANALYZE?  Would sorting the target
list of rel OIDs be enough?  Perhaps it's not worth worrying about?
        regards, tom lane


Re: [ADMIN] Vacuum error on database postgres

От
"Joshua D. Drake"
Дата:
> This behavior dates from a time when there was no good alternative.
> One possible fix today would be to make ANALYZE take
> ShareUpdateExclusive lock instead, thus ensuring there is only one
> ANALYZE at a time on a table.  However I'm a bit concerned by the
> possibility that ANALYZE-inside-a-transaction could accumulate a
> whole bunch of such locks in a random order, leading at least to
> a risk of deadlocks against other ANALYZEs.  (We have to hold the
> lock till commit, else we aren't fixing the problem.)  Do we need a
> specialized lock type just for ANALYZE?  Would sorting the target
> list of rel OIDs be enough?  Perhaps it's not worth worrying about?
> 

Why not an internal lock that people don't see? The behavior would the 
following:

conn1: analyze foo;

conn2: analyze foo;

ERROR: analyze already running on foo

conn1: analyze foo;
conn2: analyze;

NOTICE: analyze full started, analyze running on foo, skipping foo

Sincerely,

Joshua D. Drake



>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
> 


-- 
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240   Providing the most comprehensive  PostgreSQL
solutionssince 1997             http://www.commandprompt.com/
 




Re: [ADMIN] Vacuum error on database postgres

От
Tom Lane
Дата:
"Joshua D. Drake" <jd@commandprompt.com> writes:
>> One possible fix today would be to make ANALYZE take
>> ShareUpdateExclusive lock instead, thus ensuring there is only one
>> ANALYZE at a time on a table.

> Why not an internal lock that people don't see?

We could add another LockTagType just for ANALYZE, but that seems like
rather a lot of infrastructure to support an extremely narrow corner
case, namely two people doing database-wide ANALYZE at the same time
inside transaction blocks.  (If they do it outside a transaction block
then the ANALYZE is divided into multiple xacts and so doesn't try to
hold locks on multiple tables concurrently.  autovacuum won't try to do
that either.)  There's no such animal as "an internal lock people don't
see" --- if we went this way it'd propagate into user-visible entries in
pg_locks, for example.

ISTM it should be sufficient to use ShareUpdateExclusiveLock.  The only
real argument I can see against it is you couldn't ANALYZE and VACUUM
a table at the same time ... but that's probably a bad idea anyway,
especially if we extend ANALYZE to estimate dead-tuple statistics.
        regards, tom lane


Re: [ADMIN] Vacuum error on database postgres

От
Jeff Davis
Дата:
On Thu, 2006-09-14 at 11:20 -0400, Tom Lane wrote:
> andy <andy@squeakycode.net> writes:
> > Tom Lane wrote:
> >> andy <andy@squeakycode.net> writes:
> This behavior dates from a time when there was no good alternative.
> One possible fix today would be to make ANALYZE take
> ShareUpdateExclusive lock instead, thus ensuring there is only one
> ANALYZE at a time on a table.  However I'm a bit concerned by the
> possibility that ANALYZE-inside-a-transaction could accumulate a
> whole bunch of such locks in a random order, leading at least to
> a risk of deadlocks against other ANALYZEs.  (We have to hold the
> lock till commit, else we aren't fixing the problem.)  Do we need a
> specialized lock type just for ANALYZE?  Would sorting the target
> list of rel OIDs be enough?  Perhaps it's not worth worrying about?
> 

How would creating a new lock type avoid deadlocks when an ANALYZE is
accumulating the locks in random order?

Regards,Jeff Davis





Re: [ADMIN] Vacuum error on database postgres

От
Tom Lane
Дата:
Jeff Davis <pgsql@j-davis.com> writes:
> How would creating a new lock type avoid deadlocks when an ANALYZE is
> accumulating the locks in random order?

In itself it wouldn't.  Josh Drake sketched the idea in more detail
later: if there is a lock type used *only* for ANALYZE, then you can do
ConditionalLockAcquire on it, and if you fail, skip the table on the
assumption that someone else is already doing what you came to do.

The whole thing seems a bit too cute/complicated though; it'd open
various corner cases such as: ANALYZE, run complex query, query takes a
week because it's using out-of-date stats because previous ANALYZE-r
hadn't committed yet.  I'd rather ANALYZE always analyzed than sometimes
fell through without doing anything.
        regards, tom lane


Re: [ADMIN] Vacuum error on database postgres

От
Jeff Davis
Дата:
On Thu, 2006-09-14 at 18:25 -0400, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > How would creating a new lock type avoid deadlocks when an ANALYZE is
> > accumulating the locks in random order?
> 
> In itself it wouldn't.  Josh Drake sketched the idea in more detail
> later: if there is a lock type used *only* for ANALYZE, then you can do
> ConditionalLockAcquire on it, and if you fail, skip the table on the
> assumption that someone else is already doing what you came to do.
> 
> The whole thing seems a bit too cute/complicated though; it'd open
> various corner cases such as: ANALYZE, run complex query, query takes a
> week because it's using out-of-date stats because previous ANALYZE-r
> hadn't committed yet.  I'd rather ANALYZE always analyzed than sometimes
> fell through without doing anything.
> 

Couldn't you just sort by the table names, and ANALYZE the tables in
that order? Would that effectively prevent the deadlocks?

Regards,Jeff Davis




Re: [ADMIN] Vacuum error on database postgres

От
Tom Lane
Дата:
Jeff Davis <pgsql@j-davis.com> writes:
> Couldn't you just sort by the table names, and ANALYZE the tables in
> that order? Would that effectively prevent the deadlocks?

That'd work too, I think (I suggested the variant of ordering by OID,
which is simpler and more reliable).  Not sure if it's really worth the
trouble though --- how many people do you think are doing concurrent
whole-database ANALYZEs inside transaction blocks?

As-is the code will do the analyzes in pg_class physical row order,
which is almost good enough --- only if someone did a schema change that
forced a pg_class row update between the starts of the two ANALYZE runs
would it possibly fail.  So the use-case for a fix is really kinda narrow.
        regards, tom lane


Re: [ADMIN] Vacuum error on database postgres

От
Martijn van Oosterhout
Дата:
On Thu, Sep 14, 2006 at 06:25:42PM -0400, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > How would creating a new lock type avoid deadlocks when an ANALYZE is
> > accumulating the locks in random order?
>
> In itself it wouldn't.  Josh Drake sketched the idea in more detail
> later: if there is a lock type used *only* for ANALYZE, then you can do
> ConditionalLockAcquire on it, and if you fail, skip the table on the
> assumption that someone else is already doing what you came to do.

Wouldn't it be useful for ANALYZE to do a conditional lock anyway and
skip if it can't acquire. Especially for the analyse-from-autovacuum
case, perhaps an ANALYSE NOLOCK or whatever.

For stuff run from autovacuum, would it be reasonable for the
automatically run version to just abort if it sees someone doing the
same thing?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: [ADMIN] Vacuum error on database postgres

От
Markus Schaber
Дата:
Hi, Tom,

Tom Lane wrote:
> We could add another LockTagType just for ANALYZE, but that seems like
> rather a lot of infrastructure to support an extremely narrow corner
> case, namely two people doing database-wide ANALYZE at the same time
> inside transaction blocks.  (If they do it outside a transaction block
> then the ANALYZE is divided into multiple xacts and so doesn't try to
> hold locks on multiple tables concurrently.  autovacuum won't try to do
> that either.)

Is there any reason to allow ANALYZE run insinde a transaction at all?


Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org


Re: [ADMIN] Vacuum error on database postgres

От
Andrew - Supernews
Дата:
On 2006-09-15, Markus Schaber <schabi@logix-tt.com> wrote:
> Is there any reason to allow ANALYZE run insinde a transaction at all?

Absolutely. In a large transaction that radically changes the content of
the database, it is often necessary to analyze in order to avoid getting
extremely bad query plans for later commands in the transaction.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: [ADMIN] Vacuum error on database postgres

От
Markus Schaber
Дата:
Hi, Andrew,

Andrew - Supernews wrote:

>> Is there any reason to allow ANALYZE run insinde a transaction at all?
>
> Absolutely. In a large transaction that radically changes the content of
> the database, it is often necessary to analyze in order to avoid getting
> extremely bad query plans for later commands in the transaction.

OK, I see.

But this leads to the danger that, should the transaction abort
afterwards, we're left with borked stats, or are those rolled back
accordingly?

Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org


Re: [ADMIN] Vacuum error on database postgres

От
Gregory Stark
Дата:
Markus Schaber <schabi@logix-tt.com> writes:

> Is there any reason to allow ANALYZE run insinde a transaction at all?

I had a script to run explain over a set of queries, then run analyze, then
run explain again and check the plans for unexpected changes. It would roll
back the analyze if any production queries had changed plans and email the
diff to the DBA to review.

Actually I never finished the script but that was the plan :)

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: [ADMIN] Vacuum error on database postgres

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> For stuff run from autovacuum, would it be reasonable for the
> automatically run version to just abort if it sees someone doing the
> same thing?

Not especially --- there's no guarantee that the other guy is going to
commit at all.  And autovac is only holding one lock at a time so it's
not a factor in the deadlock issue anyway.
        regards, tom lane


Re: [ADMIN] Vacuum error on database postgres

От
andy
Дата:
Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> Couldn't you just sort by the table names, and ANALYZE the tables in
>> that order? Would that effectively prevent the deadlocks?
> 
> That'd work too, I think (I suggested the variant of ordering by OID,
> which is simpler and more reliable).  Not sure if it's really worth the
> trouble though --- how many people do you think are doing concurrent
> whole-database ANALYZEs inside transaction blocks?
> 
> As-is the code will do the analyzes in pg_class physical row order,
> which is almost good enough --- only if someone did a schema change that
> forced a pg_class row update between the starts of the two ANALYZE runs
> would it possibly fail.  So the use-case for a fix is really kinda narrow.
> 
>             regards, tom lane

Honestly, its not that big a problem, and if there were some doc's, 
faq's, etc (and people on the newsgroups) I dont think you should even 
worry about it.

It makes sense to me, and if Tom had come back and said, yeah, here is 
why, cuz you run autovacuum and at then end of the script you did a 
vacuum... they are conflicting... dont do that.  I'd be cool with that.  As soon as its common knowledge I think it
couldbe avoided.
 

Really, isn't it just bulk loads anyway where a person might do this?

-Andy