Thread: vacuum analyze

vacuum analyze

From
Michael Simms
Date:
Hi, I've spent the last 4 days working my butt off trying to find the cause of
the seemingly random vacuum analyze crash. Actually I've just been
trying to reproduce it, because as soon as I added -ggdb to the
compile rules it stopped happening *grrr* (not that I'm surprised. It
was random at best before, and things like this always hide when you
try to look for them).

But after 4 days of frustration, I just want to be sure - nobody else
has found the problem and solved it, have they? I just don't want to
waste my time on this if someone else has found the cause...

Thanx
                M Simms


Re: [HACKERS] vacuum analyze

From
Tom Lane
Date:
Michael Simms <grim@argh.demon.co.uk> writes:
> But after 4 days of frustration, I just want to be sure - nobody else
> has found the problem and solved it, have they? I just don't want to
> waste my time on this if someone else has found the cause...

Let's see ... I know that removing pg_vlock while vacuum is running
will lead to a coredump after vacuum finishes (it doesn't recover
cleanly after its attempt to unlink pg_vlock fails).  I think I know
how to fix that but it's not done yet.  The same problem could affect
any error that is detected between vacuum's internal transactions.
Do you get any error reports in the postmaster log when there is a
crash?

Beyond that, I don't recall having heard of any recent fixes that affect
vacuum.

If you can create a reproducible example then more people could poke
at it, so that seems like the avenue to focus on.
        regards, tom lane


Re: [HACKERS] vacuum analyze

From
Michael Simms
Date:
> 
> Michael Simms <grim@argh.demon.co.uk> writes:
> > But after 4 days of frustration, I just want to be sure - nobody else
> > has found the problem and solved it, have they? I just don't want to
> > waste my time on this if someone else has found the cause...
> 
> Let's see ... I know that removing pg_vlock while vacuum is running
> will lead to a coredump after vacuum finishes (it doesn't recover
> cleanly after its attempt to unlink pg_vlock fails).  I think I know
> how to fix that but it's not done yet.  The same problem could affect
> any error that is detected between vacuum's internal transactions.
> Do you get any error reports in the postmaster log when there is a
> crash?

Ahem, well, to be honest, I've never found any documentation on how to
read the logs *embarrassed smile*.

template1=> select * from pg_log;
ERROR:  pg_log cannot be accessed by users

That happens with any account.

It COULD be a problem with that, as I have a crontab process that vacuums
everything every 24 hours, but I also perform some minor vacuums in the
meantime, some of which may occur while the main vacuum is running. I didn't
notice that as a pattern, but it certainly COULD be that. I'll check into it.

> Beyond that, I don't recall having heard of any recent fixes that affect
> vacuum.
> 
> If you can create a reproducible example then more people could poke
> at it, so that seems like the avenue to focus on.

Yup, well, if I could get it to happen *at all* any more, I could poke around,
as I am running the backend that is handling the vacuum under gdb. If I
find a reproducible way I will certainly report it here.

Thanx
                    M Simms


Re: [HACKERS] vacuum analyze

From
Tom Lane
Date:
Michael Simms <grim@argh.demon.co.uk> writes:
> Ahem, well, to be honest, I've never found any documentation on how to
> read the logs *embarrassed smile*.

> template1=> select * from pg_log;
> ERROR:  pg_log cannot be accessed by users

No, no, not pg_log.  I'm talking about the text file that you've
directed the postmaster's stdout and stderr into.  (You are doing that
and not dropping it on the floor, I trust.)
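
The idea of sending a server's stdout and stderr to one text file can be sketched as follows. This is only an illustration, not part of the original message: the helper name `start_with_log` and the log path are hypothetical, and the command you pass would be whatever you normally use to start the postmaster.

```python
import subprocess
import sys


def start_with_log(command, log_path):
    """Start `command` with stdout and stderr both appended to `log_path`.

    `command` is an argument list. For a database server it would be the
    usual start command (an assumption; adjust it to your installation).
    """
    # Append mode keeps the output of earlier runs, so messages written
    # just before a crash are still there after a restart.
    log_file = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdout=log_file,
        stderr=subprocess.STDOUT,  # merge stderr into the same file
    )
    # The child holds its own copy of the descriptor, so the parent can close.
    log_file.close()
    return process
```

After a crash, the error reports are read from the file given as `log_path` with any text viewer; no SQL query is involved.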

> It COULD be a problem with that, as I have a crontab process that vacuums
> everything every 24 hours, but also I perform some minor vacuums in the
> meantime, some of which may occur when the main vacuum is happening.

pg_vlock exists specifically to prevent two concurrent vacuums.  The
scenario I was talking about involved removing it by hand, which you
wouldn't do unless you were trying to provoke a vacuum error (or,
perhaps, cleaning up after a previous vacuum run coredumped).
        regards, tom lane
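
For readers unfamiliar with lock files, the general pattern behind something like pg_vlock can be sketched as below. This is a generic illustration under stated assumptions, not PostgreSQL's actual code: the function names are hypothetical, and exclusive file creation (`O_EXCL`) is assumed as the locking mechanism. It also shows a cleanup step that tolerates the lock file having been removed by hand instead of failing on it.

```python
import os
import sys


class LockHeldError(Exception):
    """Raised when another run already holds the lock file."""


def acquire_lock(path):
    """Create the lock file, failing if it already exists."""
    try:
        # O_EXCL makes creation fail when the file is already present,
        # so only one process can succeed.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockHeldError(f"{path} exists; another run may be active")
    os.close(fd)


def release_lock(path):
    """Remove the lock file. Return False if it was already gone."""
    try:
        os.unlink(path)
        return True
    except FileNotFoundError:
        return False


def run_exclusive(path, work):
    """Run work() while holding the lock, and always clean up."""
    acquire_lock(path)
    try:
        return work()
    finally:
        if not release_lock(path):
            # Someone removed the lock by hand: warn, but do not crash.
            print(f"warning: {path} vanished during the run", file=sys.stderr)
```

A stale lock left behind by a crashed run makes `acquire_lock` raise `LockHeldError`, which is the situation in which someone would delete the file by hand.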


Re: [HACKERS] vacuum analyze

From
Christof Petig
Date:
Tom Lane wrote:

> Let's see ... I know that removing pg_vlock while vacuum is running
> will lead to a coredump after vacuum finishes (it doesn't recover
> cleanly after its attempt to unlink pg_vlock fails).  I think I know
> how to fix that but it's not done yet.  The same problem could affect
> any error that is detected between vacuum's internal transactions.
> Do you get any error reports in the postmaster log when there is a
> crash?
>
> Beyond that, I don't recall having heard of any recent fixes that affect
> vacuum.
>
> If you can create a reproducible example then more people could poke
> at it, so that seems like the avenue to focus on.
>
>                         regards, tom lane

Perhaps the bug I reported on pgsql-bugs about a week ago has some relation
to this problem:
I had been able to reproducibly (?) crash postmaster with my example
program (a loop of update table) combined with several vacuum commands
in a separate task.
As the size of the table's index grows, a failure becomes almost certain.

If you think the program might help you, contact me or look into the
pgsql-bugs archives.

Regards     Christof
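
The shape of such a reproduction (an update loop in one task, vacuum commands in another) might look like the sketch below. It is not Christof's program: `run_sql` is a hypothetical callable that sends one statement to the server, and the table name `test_table` and column `counter` are made up for illustration.

```python
import threading


def stress(run_sql, updates=1000, vacuums=5, table="test_table"):
    """Run an UPDATE loop and a VACUUM loop concurrently.

    `run_sql` must be safe to call from two threads, for example by
    using a separate database connection per thread (an assumption).
    Returns the list of exceptions raised by either loop.
    """
    errors = []

    def update_loop():
        for i in range(updates):
            try:
                run_sql(f"UPDATE {table} SET counter = {i}")
            except Exception as exc:  # record the failure and stop this loop
                errors.append(exc)
                return

    def vacuum_loop():
        for _ in range(vacuums):
            try:
                run_sql(f"VACUUM ANALYZE {table}")
            except Exception as exc:
                errors.append(exc)
                return

    threads = [
        threading.Thread(target=update_loop),
        threading.Thread(target=vacuum_loop),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

A non-empty return value means one of the loops lost its statement or connection, which is the point at which to check the postmaster log.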