Re: Errors on missing pg_subtrans/ files with 9.3

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Errors on missing pg_subtrans/ files with 9.3
Дата
Msg-id 20131125114712.GI28145@alap2.anarazel.de
обсуждение исходный текст
Ответ на Re: Errors on missing pg_subtrans/ files with 9.3  (J Smith <dark.panda+lists@gmail.com>)
Ответы Re: Errors on missing pg_subtrans/ files with 9.3  (J Smith <dark.panda+lists@gmail.com>)
Re: Errors on missing pg_subtrans/ files with 9.3  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
Hi,

On 2013-11-24 16:56:26 -0500, J Smith wrote:
> coredumper worked like a charm. Useful tool, that is... although as a
> bit of advice, I'd try not to run it on Postgres if your various
> memory settings are tweaked towards production use -- the core dump
> that was captured on my server weighed in at 16 GB.

> Nov 23 14:38:32 dev postgres[23810]: [4-1] user=dev,db=dev ERROR:  could not access status of transaction 13514992
> Nov 23 14:38:32 dev postgres[23810]: [4-2] user=dev,db=dev DETAIL:  Could not open file "pg_subtrans/00CE": Success.
> Nov 23 14:38:32 dev postgres[23810]: [4-3] user=dev,db=dev CONTEXT:  SQL statement "SELECT 1 FROM ONLY
"dev"."collection_batches"x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
 


Ok, this is helpful. Do you rather longrunning transactions? The
transaction that does foreign key checks has an xid of 10260613, while
the row that's getting checked has 13514992.

#4  0x0000000000635dc7 in XactLockTableWait (xid=13514992) at lmgr.c:501       tag = {locktag_field1 = 13514992,
locktag_field2= 0, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 4 '\004', locktag_lockmethodid = 1 '\001'}
 
#5  0x0000000000482223 in heap_lock_updated_tuple_rec (rel=0x2b20f050a8d0, tuple=<value optimized out>, ctid=<value
optimizedout>, xid=10260613, mode=LockTupleKeyShare) at heapam.c:4847
 

I am not sure whether that's the origin of the problem but at the very
least it seems to me that heap_lock_updated_tuple_rec() is missing
several important pieces:
a) do the priorXmax==xmin dance to check we're still following the same  ctid chain. Currently we could easily stumble
acrosscompletely  unrelated tuples if a chain element aborted and got vacuumed.
 
b) Check whether a chain element actually aborted - currently we're  only doing that in the HEAP_KEYS_UPDATED updated
case,but that seems  wrong (we can't check for committed tho!).
 
c) (reported separately as well) cope with failure of heap_fetch() to  return anything.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Sequence Access Method WIP
Следующее
От: Sawada Masahiko
Дата:
Сообщение: Re: Logging WAL when updating hintbit