Thread: 8.3.0 Core with concurrent vacuum fulls

8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:
This morning I had a postgres 8.3 install core while multiple vacuum fulls were taking place. I saved the core file; would anyone be interested in dissecting it?  I've otherwise had no issues with this machine or pgsql install.

Gavin

Re: 8.3.0 Core with concurrent vacuum fulls

From:
Alvaro Herrera
Date:
Gavin M. Roy wrote:
> This morning I had a postgres 8.3 install core this morning while multiple
> vacuum fulls were taking place. I saved the core file, would anyone be
> interested in dissecting it?  I've otherwise had no issues with this machine
> or pgsql install.

Of course.  Please post the backtrace.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:
[gmr@db04 backup]$ cat /etc/redhat-release 
CentOS release 4.4 (Final)

BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib
PKGLIBDIR = /usr/local/pgsql/lib
LOCALEDIR = 
MANDIR = /usr/local/pgsql/man
SHAREDIR = /usr/local/pgsql/share
SYSCONFDIR = /usr/local/pgsql/etc
PGXS = /usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-ldap' '--with-perl' '--enable-integer-datetimes'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/usr/local/pgsql/lib'
LDFLAGS_SL = 
LIBS = -lpgport -lz -lreadline -ltermcap -lcrypt -ldl -lm 
VERSION = PostgreSQL 8.3.0

(gdb) where
#0  0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
#1  0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
#2  0x000000000063a2e3 in errfinish ()
#3  0x00000000005974c4 in DeadLockReport ()
#4  0x000000000059381f in LockAcquire ()
#5  0x0000000000592357 in LockRelationOid ()
#6  0x0000000000457255 in relation_open ()
#7  0x00000000004574c3 in heap_open ()
#8  0x000000000062cf41 in CatalogCacheInitializeCache ()
#9  0x000000000062dfad in PrepareToInvalidateCacheTuple ()
#10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
#11 0x000000000045c803 in heap_page_prune ()
#12 0x00000000005086cd in vacuum_rel ()
#13 0x00000000005096bb in vacuum ()
#14 0x00000000005a163b in PortalRunUtility ()
#15 0x00000000005a1714 in PortalRunMulti ()
#16 0x00000000005a1d30 in PortalRun ()
#17 0x000000000059f4b6 in PostgresMain ()
#18 0x00000000005760c0 in ServerLoop ()
#19 0x0000000000577770 in PostmasterMain ()
#20 0x000000000052fd3e in main ()

On Tue, Mar 4, 2008 at 11:35 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
Gavin M. Roy wrote:
> This morning I had a postgres 8.3 install core this morning while multiple
> vacuum fulls were taking place. I saved the core file, would anyone be
> interested in dissecting it?  I've otherwise had no issues with this machine
> or pgsql install.

Of course.  Please post the backtrace.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> (gdb) where
> #0  0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
> #1  0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
> #2  0x000000000063a2e3 in errfinish ()
> #3  0x00000000005974c4 in DeadLockReport ()
> #4  0x000000000059381f in LockAcquire ()
> #5  0x0000000000592357 in LockRelationOid ()
> #6  0x0000000000457255 in relation_open ()
> #7  0x00000000004574c3 in heap_open ()
> #8  0x000000000062cf41 in CatalogCacheInitializeCache ()
> #9  0x000000000062dfad in PrepareToInvalidateCacheTuple ()
> #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
> #11 0x000000000045c803 in heap_page_prune ()
> #12 0x00000000005086cd in vacuum_rel ()
> #13 0x00000000005096bb in vacuum ()
> #14 0x00000000005a163b in PortalRunUtility ()
> #15 0x00000000005a1714 in PortalRunMulti ()
> #16 0x00000000005a1d30 in PortalRun ()
> #17 0x000000000059f4b6 in PostgresMain ()
> #18 0x00000000005760c0 in ServerLoop ()
> #19 0x0000000000577770 in PostmasterMain ()
> #20 0x000000000052fd3e in main ()

So what did DeadLockReport put in the server log before panic'ing?

I'm wondering exactly why CatalogCacheInitializeCache is being called
here --- seems like that should have been done long before we got to
VACUUM.  But it would be useful to know just what deadlock it saw.
        regards, tom lane


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Pavan Deolasee"
Date:
On Wed, Mar 5, 2008 at 8:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gavin M. Roy" <gmr@myyearbook.com> writes:
>  > (gdb) where
>  > #0  0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
>  > #1  0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
>  > #2  0x000000000063a2e3 in errfinish ()
>  > #3  0x00000000005974c4 in DeadLockReport ()
>  > #4  0x000000000059381f in LockAcquire ()
>  > #5  0x0000000000592357 in LockRelationOid ()
>  > #6  0x0000000000457255 in relation_open ()
>  > #7  0x00000000004574c3 in heap_open ()
>  > #8  0x000000000062cf41 in CatalogCacheInitializeCache ()
>  > #9  0x000000000062dfad in PrepareToInvalidateCacheTuple ()
>  > #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
>  > #11 0x000000000045c803 in heap_page_prune ()
>  > #12 0x00000000005086cd in vacuum_rel ()
>  > #13 0x00000000005096bb in vacuum ()
>  > #14 0x00000000005a163b in PortalRunUtility ()
>  > #15 0x00000000005a1714 in PortalRunMulti ()
>  > #16 0x00000000005a1d30 in PortalRun ()
>  > #17 0x000000000059f4b6 in PostgresMain ()
>  > #18 0x00000000005760c0 in ServerLoop ()
>  > #19 0x0000000000577770 in PostmasterMain ()
>  > #20 0x000000000052fd3e in main ()
>
>  So what did DeadLockReport put in the server log before panic'ing?
>
>  I'm wondering exactly why CatalogCacheInitializeCache is being called
>  here --- seems like that should have been done long before we got to
>  VACUUM.  But it would be useful to know just what deadlock it saw.
>

Seems like it's possible that the second phase of catalog cache initialization
is not done when VACUUM FULL is called, especially if VACUUM FULL is the first
command/query in the backend.

InitCatalogCachePhase2() is called only if a new cache file needs to be
written.

   /*
    * Lastly, write out a new relcache cache file if one is needed.
    */
   if (needNewCacheFile)
   {
       /*
        * Force all the catcaches to finish initializing and thereby open the
        * catalogs and indexes they use.  This will preload the relcache with
        * entries for all the most important system catalogs and indexes, so
        * that the init file will be most useful for future backends.
        */
       InitCatalogCachePhase2();

       /* now write the file */
       write_relcache_init_file();
   }


We haven't yet seen the deadlock message, but here is my theory of a possible
deadlock scenario:

Two backends try to vacuum full two different catalog tables. Each acquires an
exclusive lock on its respective catalog relation. Then each tries to initialize
its own catalog cache. But to do that, each needs an AccessShareLock on the
other's table, leading to a deadlock.
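
The pattern here is the classic lock-ordering (ABBA) deadlock. A purely
illustrative, standalone sketch of that ordering, using plain pthread mutexes
rather than PostgreSQL's lock manager (which detects the cycle and reports a
deadlock error instead of hanging forever):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-ins for the locks on the two catalogs in the scenario above. */
static pthread_mutex_t lock_pg_type      = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_pg_statistic = PTHREAD_MUTEX_INITIALIZER;

static void *backend_a(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&lock_pg_type);        /* "VACUUM FULL" takes its own catalog */
    sleep(1);                                 /* give the other backend time to take its lock */
    pthread_mutex_lock(&lock_pg_statistic);   /* cache initialization needs the other catalog */
    pthread_mutex_unlock(&lock_pg_statistic);
    pthread_mutex_unlock(&lock_pg_type);
    return NULL;
}

static void *backend_b(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&lock_pg_statistic);   /* the other backend's own catalog */
    sleep(1);
    pthread_mutex_lock(&lock_pg_type);        /* blocks forever: backend_a already holds this */
    pthread_mutex_unlock(&lock_pg_type);
    pthread_mutex_unlock(&lock_pg_statistic);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, backend_a, NULL);
    pthread_create(&b, NULL, backend_b, NULL);
    pthread_join(a, NULL);                    /* never returns; the two threads deadlock */
    pthread_join(b, NULL);
    printf("unreachable\n");
    return 0;
}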

Why not just unconditionally finish phase 2 as part of InitPostgres? I understand
that we may end up initializing caches that we don't need in that session, but
there might be other places where this deadlock is waiting to happen.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Pavan Deolasee"
Date:
On Wed, Mar 5, 2008 at 3:41 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
>
>
>  Two backends try to vacuum full two different catalog tables. Each acquires an
>  exclusive lock on the respective catalog relation. Then each try to
>  initialize its
>  own catalog cache. But to do that they need AccessShareLock on each other's
>  table leading to a deadlock.
>


Well, I could reproduce the above-mentioned deadlock scenario with two system
relations, so that validates the theory.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> Why not just unconditionally finish the phase 2 as part of InitPostgres ?

You're jumping to a patch before we even understand what's happening.
In particular, if that's the problem, why has this not been seen before?
The fact that it's going through heap_page_prune doesn't seem very
relevant --- VACUUM FULL has certainly always had to invoke
CacheInvalidateHeapTuple someplace or other.  So I still want to see
the deadlock report ... we at least need to know which tables are
involved in the deadlock.

A separate line of thought is whether it's a good idea that
heap_page_prune calls the inval code inside a critical section.
That's what's turning an ordinary deadlock failure into a PANIC.
Even without the possibility of having to do cache initialization,
that's a lot of code to be invoking, and it has obvious failure
modes (eg, out of memory for the new inval list item).
        regards, tom lane


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
I wrote:
> In particular, if that's the problem, why has this not been seen before?
> The fact that it's going through heap_page_prune doesn't seem very
> relevant --- VACUUM FULL has certainly always had to invoke
> CacheInvalidateHeapTuple someplace or other.  So I still want to see
> the deadlock report ... we at least need to know which tables are
> involved in the deadlock.

Actually, maybe it *has* been seen before.  Gavin, are you in the habit
of running concurrent VACUUM FULLs on system catalogs, and if so have
you noted that they occasionally get deadlock failures?

> A separate line of thought is whether it's a good idea that
> heap_page_prune calls the inval code inside a critical section.
> That's what's turning an ordinary deadlock failure into a PANIC.
> Even without the possibility of having to do cache initialization,
> that's a lot of code to be invoking, and it has obvious failure
> modes (eg, out of memory for the new inval list item).

The more I think about this the more I don't like it.  The critical
section in heap_page_prune is *way* too big.  Aside from the inval
call, there are HeapTupleSatisfiesVacuum() calls, which could have
failures during attempted clog references.

The reason the critical section is so large is that we're manipulating
the contents of a shared buffer, and we don't want a failure to leave a
partially-modified page in the buffer.  We could fix that if we were to
memcpy the page into local storage and do all the pruning work there.
Then the critical section would only surround copying the page back to
the buffer and writing the WAL record.  Copying the page is a tad
annoying but heap_page_prune is an expensive operation anyway, and
I think we really are at too much risk of PANIC the way it's being done
now.  Has anyone got a better idea?
        regards, tom lane
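
A minimal standalone sketch of that copy-based structure (stand-in types and
placeholder phases only, not the real heap_page_prune), just to show where the
critical section would shrink to:

#include <string.h>

#define BLOCK_SIZE 8192                    /* assumption: default 8K heap pages */

/* Not PostgreSQL code: a stand-in showing pruning done on a private copy so
 * that only the copy-back and WAL write need to sit in the critical section. */
static void prune_page_via_copy(char *shared_buffer_page)
{
    char copy[BLOCK_SIZE];

    memcpy(copy, shared_buffer_page, BLOCK_SIZE);

    /*
     * Phase 1, no critical section: visibility checks (which may consult
     * clog), cache-invalidation queueing, and the line pointer and tuple
     * edits all operate on `copy`.  An ERROR here aborts the command but
     * leaves the shared buffer exactly as it was.
     */

    /*
     * Phase 2, critical section: install the finished page and emit the WAL
     * record.  These are the only steps that can still escalate an ERROR
     * into a PANIC, and they are deliberately trivial.
     */
    /* START_CRIT_SECTION(); */
    memcpy(shared_buffer_page, copy, BLOCK_SIZE);
    /* XLogInsert(...); MarkBufferDirty(...); END_CRIT_SECTION(); */
}

int main(void)
{
    char page[BLOCK_SIZE] = {0};           /* dummy page, just to exercise the sketch */
    prune_page_via_copy(page);
    return 0;
}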


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:
2008-03-04 05:45:47 EST [6698]: [1-1] LOG:  process 6698 still waiting for AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT:  VACUUM FULL autograph.autograph_creators
2008-03-04 05:46:28 EST [6730]: [1-1] LOG:  process 6730 still waiting for AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT:  VACUUM FULL lunchmoney.totals
2008-03-04 05:47:55 EST [3809]: [18-1] LOG:  server process (PID 6742) was terminated by signal 6: Aborted
2008-03-04 05:47:55 EST [3809]: [19-1] LOG:  terminating any other active server processes
2008-03-04 05:47:55 EST [6741]: [12-1] WARNING:  terminating connection because of crash of another server process


On Tue, Mar 4, 2008 at 9:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> (gdb) where
> #0  0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
> #1  0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
> #2  0x000000000063a2e3 in errfinish ()
> #3  0x00000000005974c4 in DeadLockReport ()
> #4  0x000000000059381f in LockAcquire ()
> #5  0x0000000000592357 in LockRelationOid ()
> #6  0x0000000000457255 in relation_open ()
> #7  0x00000000004574c3 in heap_open ()
> #8  0x000000000062cf41 in CatalogCacheInitializeCache ()
> #9  0x000000000062dfad in PrepareToInvalidateCacheTuple ()
> #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
> #11 0x000000000045c803 in heap_page_prune ()
> #12 0x00000000005086cd in vacuum_rel ()
> #13 0x00000000005096bb in vacuum ()
> #14 0x00000000005a163b in PortalRunUtility ()
> #15 0x00000000005a1714 in PortalRunMulti ()
> #16 0x00000000005a1d30 in PortalRun ()
> #17 0x000000000059f4b6 in PostgresMain ()
> #18 0x00000000005760c0 in ServerLoop ()
> #19 0x0000000000577770 in PostmasterMain ()
> #20 0x000000000052fd3e in main ()

So what did DeadLockReport put in the server log before panic'ing?

I'm wondering exactly why CatalogCacheInitializeCache is being called
here --- seems like that should have been done long before we got to
VACUUM.  But it would be useful to know just what deadlock it saw.

                       regards, tom lane

Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> 2008-03-04 05:45:47 EST [6698]: [1-1] LOG:  process 6698 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT:  VACUUM FULL
> autograph.autograph_creators
> 2008-03-04 05:46:28 EST [6730]: [1-1] LOG:  process 6730 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
> 2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT:  VACUUM FULL
> lunchmoney.totals
> 2008-03-04 05:47:55 EST [3809]: [18-1] LOG:  server process (PID 6742) was
> terminated by signal 6: Aborted
> 2008-03-04 05:47:55 EST [3809]: [19-1] LOG:  terminating any other active
> server processes
> 2008-03-04 05:47:55 EST [6741]: [12-1] WARNING:  terminating connection
> because of crash of another server process

How annoying ... the PANIC message doesn't seem to have reached the log.
elog.c is careful to fflush(stderr) before abort(), so that isn't
supposed to happen.  But it looks like you are using syslog for logging
(correct?) so maybe this is a problem with the syslog implementation
you're using.  What's the platform exactly?

I wonder if it'd be reasonable to put a closelog() call just before
the abort() ...
        regards, tom lane


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:
On Wed, Mar 5, 2008 at 10:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> In particular, if that's the problem, why has this not been seen before?
> The fact that it's going through heap_page_prune doesn't seem very
> relevant --- VACUUM FULL has certainly always had to invoke
> CacheInvalidateHeapTuple someplace or other.  So I still want to see
> the deadlock report ... we at least need to know which tables are
> involved in the deadlock.

Actually, maybe it *has* been seen before.  Gavin, are you in the habit
of running concurrent VACUUM FULLs on system catalogs, and if so have
you noted that they occasionally get deadlock failures?

Generally no, I've never noticed deadlocks before, but I'll go back and look at some of the other machines.

> A separate line of thought is whether it's a good idea that
> heap_page_prune calls the inval code inside a critical section.
> That's what's turning an ordinary deadlock failure into a PANIC.
> Even without the possibility of having to do cache initialization,
> that's a lot of code to be invoking, and it has obvious failure
> modes (eg, out of memory for the new inval list item).

The more I think about this the more I don't like it.  The critical
section in heap_page_prune is *way* too big.  Aside from the inval
call, there are HeapTupleSatisfiesVacuum() calls, which could have
failures during attempted clog references.

The reason the critical section is so large is that we're manipulating
the contents of a shared buffer, and we don't want a failure to leave a
partially-modified page in the buffer.  We could fix that if we were to
memcpy the page into local storage and do all the pruning work there.
Then the critical section would only surround copying the page back to
the buffer and writing the WAL record.  Copying the page is a tad
annoying but heap_page_prune is an expensive operation anyway, and
I think we really are at too much risk of PANIC the way it's being done
now.  Has anyone got a better idea?

                       regards, tom lane

Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:


On Wed, Mar 5, 2008 at 10:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> 2008-03-04 05:45:47 EST [6698]: [1-1] LOG:  process 6698 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT:  VACUUM FULL
> autograph.autograph_creators
> 2008-03-04 05:46:28 EST [6730]: [1-1] LOG:  process 6730 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
> 2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT:  VACUUM FULL
> lunchmoney.totals
> 2008-03-04 05:47:55 EST [3809]: [18-1] LOG:  server process (PID 6742) was
> terminated by signal 6: Aborted
> 2008-03-04 05:47:55 EST [3809]: [19-1] LOG:  terminating any other active
> server processes
> 2008-03-04 05:47:55 EST [6741]: [12-1] WARNING:  terminating connection
> because of crash of another server process

How annoying ... the PANIC message doesn't seem to have reached the log.
elog.c is careful to fflush(stderr) before abort(), so that isn't
supposed to happen.  But it looks like you are using syslog for logging
(correct?) so maybe this is a problem with the syslog implementation
you're using.  What's the platform exactly?

I wonder if it'd be reasonable to put a closelog() call just before
the abort() ...

                       regards, tom lane

I'm using stderr, emulated to look like syslog for pgfouine.

log_destination = 'stderr'
logging_collector = on                  # Enable capturing of stderr and csvlog
log_directory = '/var/log/postgres/'         # Directory where log files are written
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
 

Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:
On Wed, Mar 5, 2008 at 10:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> 2008-03-04 05:45:47 EST [6698]: [1-1] LOG:  process 6698 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT:  VACUUM FULL
> autograph.autograph_creators
> 2008-03-04 05:46:28 EST [6730]: [1-1] LOG:  process 6730 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
> 2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT:  VACUUM FULL
> lunchmoney.totals
> 2008-03-04 05:47:55 EST [3809]: [18-1] LOG:  server process (PID 6742) was
> terminated by signal 6: Aborted
> 2008-03-04 05:47:55 EST [3809]: [19-1] LOG:  terminating any other active
> server processes
> 2008-03-04 05:47:55 EST [6741]: [12-1] WARNING:  terminating connection
> because of crash of another server process

How annoying ... the PANIC message doesn't seem to have reached the log.
elog.c is careful to fflush(stderr) before abort(), so that isn't
supposed to happen.  But it looks like you are using syslog for logging
(correct?) so maybe this is a problem with the syslog implementation
you're using.  What's the platform exactly?

I wonder if it'd be reasonable to put a closelog() call just before
the abort() ...

                       regards, tom lane

The panic may have made it into the log, if this is what you were looking for:

2008-03-04 05:45:20 EST [6742]: [7-1] PANIC:  deadlock detected
2008-03-04 05:58:33 EST [8751]: [3-1] PANIC:  deadlock detected

(it cored twice before I lowered the concurrency of vacuums)

Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Heikki Linnakangas"
Date:
Tom Lane wrote:
> The reason the critical section is so large is that we're manipulating
> the contents of a shared buffer, and we don't want a failure to leave a
> partially-modified page in the buffer.  We could fix that if we were to
> memcpy the page into local storage and do all the pruning work there.
> Then the critical section would only surround copying the page back to
> the buffer and writing the WAL record.  Copying the page is a tad
> annoying but heap_page_prune is an expensive operation anyway, and
> I think we really are at too much risk of PANIC the way it's being done
> now.  Has anyone got a better idea?

We could do the pruning in two phases: first figure out what to do without
modifying anything, outside the critical section, and then actually do it,
inside the critical section.

Looking at heap_page_prune, we already collect information about what we did
in the redirected/nowdead/nowunused arrays for WAL logging purposes. We could
use that, but we would also have to teach heap_prune_chain not to step into
tuples that we've already decided to remove.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> The panic may have made it if this is what you were looking for:

> 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC:  deadlock detected
> 2008-03-04 05:58:33 EST [8751]: [3-1] PANIC:  deadlock detected

That's what I expected to find, but where's the DETAIL for these?
        regards, tom lane


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Gavin M. Roy"
Date:
2008-03-04 05:45:20 EST [6742]: [7-1] PANIC:  deadlock detected
2008-03-04 05:45:20 EST [6742]: [8-1] DETAIL:  Process 6742 waits for AccessShareLock on relation 2619 of database 16385; blocked by process 6740.
Process 6740 waits for AccessShareLock on relation 1247 of database 16385; blocked by process 6742.
2008-03-04 05:45:20 EST [6742]: [9-1] STATEMENT:  VACUUM FULL pg_catalog.pg_type

Sorry, been juggling too many things this morning!

On Wed, Mar 5, 2008 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> The panic may have made it if this is what you were looking for:

> 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC:  deadlock detected
> 2008-03-04 05:58:33 EST [8751]: [3-1] PANIC:  deadlock detected

That's what I expected to find, but where's the DETAIL for these?

                       regards, tom lane

Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Tom Lane wrote:
>> I think we really are at too much risk of PANIC the way it's being done
>> now.  Has anyone got a better idea?

> We could do the pruning in two phases: first figure out what to do 
> without modifyng anything, outside critical-section, and then actually 
> do it, inside critical section.

> Looking at heap_page_prune, we already collect information of what we 
> did in the redirected/nowdead/nowunused arrays for WAL logging purposes. 

That's a thought, but ...

> We could use that, but we would also have to teach heap_prune_chain to 
> not step into tuples that we've already decided to remove.

... seems like this would require searching the aforementioned arrays
for each tuple examined, which could turn into an O(N^2) problem.
If there are many removable tuples it could easily end up slower than
copying.

[ thinks some more... ]  I guess we could use a flag array dimensioned
MaxHeapTuplesPerPage to mark already-processed tuples, so that you
wouldn't need to search the existing arrays but just index into the flag
array with the tuple's offsetnumber.

I wonder if the logic could be restructured to avoid this by taking
advantage of it being a two-pass process, instead of fighting it?
But that'd probably be a bigger change than we'd want to risk
back-patching.

Since I'm the one complaining about the PANIC risk, I guess I should
do the legwork here.
        regards, tom lane
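
A standalone sketch of the flag-array idea (simplified stand-in types and a
fake visibility test, not the eventual patch): decisions are collected in a
first pass, and an array indexed by offset number keeps already-processed
tuples out of later chain walks with an O(1) lookup:

#include <stdbool.h>
#include <string.h>

#define MAX_TUPLES_PER_PAGE 291            /* assumption: roughly MaxHeapTuplesPerPage for 8K pages */
typedef unsigned short OffsetNumber;       /* 1-based line pointer numbers, as in PostgreSQL */

typedef struct PruneDecisions
{
    bool         marked[MAX_TUPLES_PER_PAGE + 1];   /* indexed by offset number */
    OffsetNumber nowunused[MAX_TUPLES_PER_PAGE];
    int          nunused;
} PruneDecisions;

/* Hypothetical stand-in for the visibility test (HeapTupleSatisfiesVacuum). */
static bool tuple_is_dead(OffsetNumber offnum) { return (offnum % 2) == 0; }

static void
collect_prune_decisions(PruneDecisions *d, OffsetNumber maxoff)
{
    memset(d, 0, sizeof(*d));

    for (OffsetNumber offnum = 1; offnum <= maxoff; offnum++)
    {
        if (d->marked[offnum])
            continue;                       /* already decided as part of an earlier chain */

        if (tuple_is_dead(offnum))
        {
            d->marked[offnum] = true;       /* no scan of the arrays below is ever needed */
            d->nowunused[d->nunused++] = offnum;
        }
    }
    /* A later pass, inside the critical section, would apply d->nowunused
     * (plus the redirected/nowdead arrays in the real code) to the page. */
}

int main(void)
{
    PruneDecisions d;
    collect_prune_decisions(&d, 20);
    return d.nunused == 10 ? 0 : 1;         /* even offsets 2..20 were "dead" */
}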


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC:  deadlock detected
> 2008-03-04 05:45:20 EST [6742]: [8-1] DETAIL:  Process 6742 waits for
> AccessShareLock on relation 2619 of database 16385; blocked by process 6740.
> Process 6740 waits for AccessShareLock on relation 1247 of database 16385;
> blocked by process 6742.
> 2008-03-04 05:45:20 EST [6742]: [9-1] STATEMENT:  VACUUM FULL
> pg_catalog.pg_type

Hmm ...

regression=# select 2619::regclass, 1247::regclass;
   regclass   | regclass
--------------+----------
 pg_statistic | pg_type
(1 row)

regression=# 

So presumably 6740 is doing a vac full on pg_statistic.  There isn't
really any need for these to conflict in cache initialization.
It strikes me that we could fix most of the deadlock risk if we just
switched the order of the two tests in the loop in
PrepareToInvalidateCacheTuple; that is, don't force initialization
of a catcache unless it's one we need to access right now.  Then a
VAC FULL is only going to be interested in initializing caches for
the catalog it's vacuuming, which should be safe enough.

We still need to deal with the excessive PANIC risk, but I think we
have a plan for that now.
        regards, tom lane
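
A standalone paraphrase of that reordering (stub types, not the real
catcache.c): the reloid test moves ahead of the forced cache initialization,
so only caches over the catalog actually being touched ever get initialized
from this path:

#include <stddef.h>

typedef unsigned int Oid;

typedef struct CatCacheStub
{
    struct CatCacheStub *cc_next;
    Oid                  cc_reloid;     /* catalog this cache is built over */
    int                  initialized;   /* 0 until the cache has been initialized */
} CatCacheStub;

static void init_cache_stub(CatCacheStub *cache) { cache->initialized = 1; }

static void
prepare_to_invalidate_stub(CatCacheStub *caches, Oid reloid)
{
    for (CatCacheStub *ccp = caches; ccp != NULL; ccp = ccp->cc_next)
    {
        /* Proposed order: skip caches on other catalogs before doing anything
         * that could open and lock another catalog (the old order initialized
         * the cache first, then tested reloid). */
        if (ccp->cc_reloid != reloid)
            continue;

        if (!ccp->initialized)
            init_cache_stub(ccp);           /* stands in for CatalogCacheInitializeCache */

        /* ... register the invalidation entry for this cache ... */
    }
}

int main(void)
{
    CatCacheStub stat_cache = { NULL, 2619, 0 };         /* pg_statistic */
    CatCacheStub type_cache = { &stat_cache, 1247, 0 };   /* pg_type */

    prepare_to_invalidate_stub(&type_cache, 1247);
    /* Only the pg_type cache was initialized; pg_statistic was never touched. */
    return (type_cache.initialized && !stat_cache.initialized) ? 0 : 1;
}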


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Gavin M. Roy" <gmr@myyearbook.com> writes:
> On Wed, Mar 5, 2008 at 10:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Actually, maybe it *has* been seen before.  Gavin, are you in the habit
>> of running concurrent VACUUM FULLs on system catalogs, and if so have
>> you noted that they occasionally get deadlock failures?

> Generally no,  I've never noticed deadlocks before, but I'll go back and
> look at some of the other the machines.

After digging a bit deeper, it seems that pre-8.2 releases wouldn't have
been at risk for a deadlock here anyway, because
CatalogCacheInitializeCache didn't lock the system catalog it was
initializing a cache for.  (That had *other* risks, but not this one.)
So possibly the lack of prior reports is just because not too many
people are in the habit of using concurrent VACUUM FULLs with late-model
Postgres.  I can reproduce the deadlock (though not the ensuing PANIC)
in 8.2, so it's definitely not heap_page_prune's fault.
        regards, tom lane


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Pavan Deolasee"
Date:
On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>
>  [ thinks some more... ]  I guess we could use a flag array dimensioned
>  MaxHeapTuplesPerPage to mark already-processed tuples, so that you
>  wouldn't need to search the existing arrays but just index into the flag
>  array with the tuple's offsetnumber.
>

We can actually combine this and the page-copying ideas. Instead of copying
the entire page, we can just copy the line pointer array and work on the copy.
ISTM that the only place where we change the tuple contents is when we collapse
the redirected line pointers, and we can do that at the end, on the original
page.

The last step, which we run inside a critical section, would then be just like
invoking heap_xlog_clean with the information collected in the first pass.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> [ thinks some more... ]  I guess we could use a flag array dimensioned
>> MaxHeapTuplesPerPage to mark already-processed tuples, so that you
>> wouldn't need to search the existing arrays but just index into the flag
>> array with the tuple's offsetnumber.

> We can actually combine this and the page copying ideas. Instead of copying
> the entire page, we can just copy the line pointers array and work on the copy.

I think that just makes things more complex and fragile.  I like
Heikki's idea, in part because it makes the normal path and the WAL
recovery path guaranteed to work alike.  I'll attach my work-in-progress
patch for this --- it doesn't do anything about the invalidation
semantics problem but it does fix the critical-section-too-big problem.

            regards, tom lane


Attachments

Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Heikki Linnakangas"
Date:
Tom Lane wrote:
> "Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
>> On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> [ thinks some more... ]  I guess we could use a flag array dimensioned
>>> MaxHeapTuplesPerPage to mark already-processed tuples, so that you
>>> wouldn't need to search the existing arrays but just index into the flag
>>> array with the tuple's offsetnumber.
> 
>> We can actually combine this and the page copying ideas. Instead of copying
>> the entire page, we can just copy the line pointers array and work on the copy.
> 
> I think that just makes things more complex and fragile.  I like
> Heikki's idea, in part because it makes the normal path and the WAL
> recovery path guaranteed to work alike.  I'll attach my work-in-progress
> patch for this --- it doesn't do anything about the invalidation
> semantics problem but it does fix the critical-section-too-big problem.

FWIW, the patch looks fine to me. By inspection; I didn't test it.

I'm glad we got away with a single "marked" array. I was afraid we would 
need to consult the unused/redirected/dead arrays separately.

Do you have a plan for the invalidation problem? I think we could just 
not remove the redirection line pointers in catalog tables.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> I'm glad we got away with a single "marked" array. I was afraid we would 
> need to consult the unused/redirected/dead arrays separately.

Yeah, I was worried about that too.  The fundamental reason why it's
okay seems to be that redirects can only be the heads of HOT chains.
Therefore, a newly marked tuple cannot be a member of any chain we'll
need to scan later, no matter whether it is marked as newly redirected,
dead, or unused.  And so we can just ignore marked tuples in all cases.

I set up the code to mark the redirection target too, but that is just a
minor optimization AFAICS --- if we reach the redirection target later
in the outer scan loop, we'd decide not to process it anyway.

> Do you have a plan for the invalidation problem? I think we could just 
> not remove the redirection line pointers in catalog tables.

Still thinking about it, but I agree that I'd rather make a small
modification to VACUUM FULL than try to redesign cache invalidation.

The trick with not removing redirect pointers would be to ensure we
don't remove the redirection target.  While the redirection target was
presumably not DEAD when heap_page_prune looked at it, it seems possible
that it is DEAD by the time vacuum.c looks.  Another risk factor is
trying to move the redirection target down to a lower page.
        regards, tom lane


Re: 8.3.0 Core with concurrent vacuum fulls

From:
"Pavan Deolasee"
Date:
On Thu, Mar 6, 2008 at 11:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
>  I think that just makes things more complex and fragile.  I like
>  Heikki's idea, in part because it makes the normal path and the WAL
>  recovery path guaranteed to work alike.  I'll attach my work-in-progress
>  patch for this --- it doesn't do anything about the invalidation
>  semantics problem but it does fix the critical-section-too-big problem.
>

The WIP patch looks good to me. I haven't yet tested it (will wait for the
final version). The following pointer arithmetic caught my eye though.

!   nunused = (end - nowunused);

Shouldn't we typecast them to (char *) first ?

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


Re: 8.3.0 Core with concurrent vacuum fulls

From:
Tom Lane
Date:
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> The WIP patch looks good to me. I haven't yet tested it (will wait for the
> final version). The following pointer arithmetic caught my eye though.
> !   nunused = (end - nowunused);
> Shouldn't we typecast them to (char *) first ?

No ... we want the number of OffsetNumbers, not the number of bytes.
        regards, tom lane
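
A standalone illustration of the distinction: subtracting two OffsetNumber
pointers already yields a count of entries, while casting to (char *) first
would yield a byte count instead:

#include <stdio.h>
#include <stddef.h>

typedef unsigned short OffsetNumber;    /* 2 bytes, as in PostgreSQL */

int main(void)
{
    OffsetNumber nowunused[8];
    OffsetNumber *end = nowunused + 5;                      /* pretend 5 entries were filled */

    ptrdiff_t nunused = end - nowunused;                    /* 5 entries */
    ptrdiff_t nbytes  = (char *) end - (char *) nowunused;  /* 10 bytes  */

    printf("entries = %td, bytes = %td\n", nunused, nbytes);
    return 0;
}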