Re: XLogInsert scaling, revisited

From: Heikki Linnakangas
Subject: Re: XLogInsert scaling, revisited
Msg-id: 51C58B5E.7030102@vmware.com
In reply to: Re: XLogInsert scaling, revisited  (Jeff Janes <jeff.janes@gmail.com>)
Responses: Re: XLogInsert scaling, revisited  (Andres Freund <andres@2ndquadrant.com>)
           Re: XLogInsert scaling, revisited  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers

On 21.06.2013 21:55, Jeff Janes wrote:
> I think I'm getting an undetected deadlock between the checkpointer and a
> user process running a TRUNCATE command.
>
> This is the checkpointer:
>
> #0  0x0000003a73eeaf37 in semop () from /lib64/libc.so.6
> #1  0x00000000005ff847 in PGSemaphoreLock (sema=0x7f8c0a4eb730,
> interruptOK=0 '\000') at pg_sema.c:415
> #2  0x00000000004b0abf in WaitOnSlot (upto=416178159648) at xlog.c:1775
> #3  WaitXLogInsertionsToFinish (upto=416178159648) at xlog.c:2086
> #4  0x00000000004b657a in CopyXLogRecordToWAL (write_len=32, isLogSwitch=1
> '\001', rdata=0x0, StartPos=<value optimized out>, EndPos=416192397312)
>      at xlog.c:1389
> #5  0x00000000004b6fb2 in XLogInsert (rmid=0 '\000', info=<value optimized
> out>, rdata=0x7fff00000020) at xlog.c:1209
> #6  0x00000000004b7644 in RequestXLogSwitch () at xlog.c:8748

Hmm, it looks like the xlog-switch is trying to wait for itself to
finish. The concurrent TRUNCATE is just being blocked behind the
xlog-switch, which is stuck on itself.

I wasn't able to reproduce exactly that, but I got a PANIC by running
pgbench and concurrently doing "select pg_switch_xlog()" many times in psql.

Attached is a new version that fixes at least the problem I saw. Not
sure if it fixes what you saw, but it's worth a try. How easily can you
reproduce that?

> This is using the same testing harness as in the last round of this patch.

This one?
http://www.postgresql.org/message-id/CAMkU=1xoA6Fdyoj_4fMLqpicZR1V9GP7cLnXJdHU+iGgqb6WUw@mail.gmail.com

> Is there a way for me to dump the list of held/waiting lwlocks from gdb?

You can print out the held_lwlocks array. Or to make it more friendly,
write a function that prints it out and call that from gdb. There's no
easy way to print out who's waiting for what that I know of.
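
Something along these lines would do as a rough sketch of such a helper.
It is written as a standalone file so it compiles on its own: the
LWLockId typedef, MAX_SIMUL_LWLOCKS, num_held_lwlocks and the
held_lwlocks array below are stand-ins that only approximate the private
state in lwlock.c, and format_held_lwlocks() and print_held_lwlocks() are
hypothetical names, not existing backend functions. In a real build the
helper would have to live in lwlock.c to see the static variables.

```c
#include <stdio.h>
#include <string.h>

/* Stand-ins for the backend-private state in lwlock.c (assumed shape). */
#define MAX_SIMUL_LWLOCKS 100
typedef int LWLockId;

static int      num_held_lwlocks = 0;
static LWLockId held_lwlocks[MAX_SIMUL_LWLOCKS];

/*
 * Write the ids of the currently held lwlocks into buf, separated by
 * spaces. Output is cut off at the last id that fits completely.
 * Returns the number of locks held (not the number written).
 */
int
format_held_lwlocks(char *buf, size_t buflen)
{
    size_t      used = 0;
    int         i;

    if (buflen == 0)
        return num_held_lwlocks;
    buf[0] = '\0';

    for (i = 0; i < num_held_lwlocks; i++)
    {
        int         n = snprintf(buf + used, buflen - used, "%s%d",
                                 i > 0 ? " " : "", (int) held_lwlocks[i]);

        if (n < 0 || (size_t) n >= buflen - used)
        {
            buf[used] = '\0';   /* drop the partially written id */
            break;
        }
        used += (size_t) n;
    }
    return num_held_lwlocks;
}

/* Hypothetical entry point meant to be invoked from the debugger. */
void
print_held_lwlocks(void)
{
    char        buf[1024];
    int         n = format_held_lwlocks(buf, sizeof(buf));

    fprintf(stderr, "%d lwlocks held: %s\n", n, buf);
}
```

With that compiled into the backend, "call print_held_lwlocks()" at the
gdb prompt after attaching to the process prints the list to the
backend's stderr. It still only shows what that one process holds, not
what it is waiting for.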

Thanks for the testing!

- Heikki

