Re: [HACKERS] Vacuum analyze bug CAUGHT

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Vacuum analyze bug CAUGHT
Дата
Msg-id 13452.937004740@sss.pgh.pa.us
обсуждение исходный текст
Ответ на RE: [HACKERS] Vacuum analyze bug CAUGHT  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
Ответы Re: [HACKERS] Vacuum analyze bug CAUGHT  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>>>> Moreover CREATE TABLE doesn't acquire any lock for pg_attribute
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> while tuples are inserted into pg_attribute.
>>>> Concurrent vacuum may corrupt pg_attribute.
>> 
>> Should be fixed!
>> 

> Seems CREATE TABLE don't acquire any lock for pg_relcheck and
> pg_attrdef as well as pg_attribute. There may be other pg_.......

> Here is a patch.

Hmm, do we really need to grab AccessExclusiveLock on the pg_ tables
while creating or deleting tables?  That will mean no concurrency at
all for these operations.  Seems to me we want AccessExclusiveLock on
the table being created or deleted, but something less strong on the
system tables.  RowExclusiveLock might be appropriate --- Vadim, what
do you think?

Also, rather than running around and adding locks to every single
place that calls heap_open or heap_close, I wonder whether we shouldn't
have heap_open/heap_close themselves automatically grab or release
at least a minimal lock (AccessShareLock, I suppose).

Or maybe better: change heap_open/heap_openr/heap_close to take an
additional parameter specifying the kind of lock to grab.  That'd still
mean having to visit all the call sites, but it would force people to
think about the issue in future rather than forgetting to lock a table
they're accessing.

Comments?

BTW, while I still haven't been able to reproduce Michael Simms' crash
reliably, I did see one coredump caused by an assert failure in
heap_delete():

#6  0x16181c in ExceptionalCondition (   conditionName=0x283d4 "!(( lp)->lp_flags & 0x01)", exceptionP=0x40009a80,
detail=0x0,fileName=0x7ae4 "\003", lineNumber=1121) at assert.c:72
 
#7  0x7cc18 in heap_delete (relation=0x400891c0, tid=0x402d962c, ctid=0x0)   at heapam.c:1121
#8  0x9c208 in DeleteAttributeTuples (rel=0x40535260) at heap.c:1118
#9  0x9c4dc in heap_destroy_with_catalog (   relname=0x4e4ed7 <Address 0x4e4ed7 out of bounds>) at heap.c:1310
#10 0xa5168 in RemoveRelation (   name=0x80db9380 <Address 0x80db9380 out of bounds>) at creatinh.c:157
#11 0x129760 in ProcessUtility (parsetree=0x402d8d28, dest=Remote)   at utility.c:215

This was in the process doing table creates/drops, and I surmise that
the problem was a tuple move executed concurrently by the process doing
VACUUM.  In other words, it looks like this problem of missing lock
operations might be the cause, or one cause, of Michael's symptoms.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dmitry Samersoff
Дата:
Сообщение: RE: [HACKERS] Re: Query about postgres medical database
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Vacuum analyze bug CAUGHT