Re: Adding REPACK [concurrently]
| From | Antonin Houska |
|---|---|
| Subject | Re: Adding REPACK [concurrently] |
| Date | |
| Msg-id | 46846.1774267234@localhost |
| In response to | Re: Adding REPACK [concurrently] (Antonin Houska <ah@cybertec.at>) |
| Responses | Re: Adding REPACK [concurrently]; Re: Adding REPACK [concurrently] |
| List | pgsql-hackers |
Antonin Houska <ah@cybertec.at> wrote:

> Antonin Houska <ah@cybertec.at> wrote:
>
> > Antonin Houska <ah@cybertec.at> wrote:
> >
> > > Srinath Reddy Sadipiralla <srinath2133@gmail.com> wrote:
> > >
> > > > The concurrency test failed once. I tried to reproduce the below scenario
> > > > but no luck. I think the assert failure happened because
> > > > after a speculative insert there might be no spec CONFIRM or ABORT, thoughts?
> > >
> > > Perhaps, I'll try. I'm not sure the REPACK decoding worker does anything
> > > special regarding decoding. If you happen to see the problem again, please try
> > > to preserve the related WAL segments - if this is a bug in the PG executor,
> > > pg_waldump might reveal that.
> >
> > I could not reproduce the failure, and have no idea how a speculative insert can
> > stay w/o a CONFIRM / ABORT record. The only problem I could imagine is that
> > change_useless_for_repack() filters out the CONFIRM / ABORT record
> > accidentally, but neither code review nor the debugger proves that
> > theory. (Actually if this was the problem, the test failure probably wouldn't
> > be that rare.)
>
> I confirm that I was able to reproduce the crash using a debugger and your more
> recent diagnosis [1]. Indeed, filtering was the problem.
>
> Unfortunately, I wasn't able to make the crash easily reproducible using the
> isolation tester. The problem is that the logical decoding is performed by a
> background worker, and when the backend executing REPACK waits for the
> background worker, which in turn waits on an injection point, the isolation
> tester does not recognize that it's effectively the backend that is waiting on
> the injection point. Therefore the isolation tester does not proceed to the
> next step.

I could not resist digging into it deeper :-) Attached is a test that
reproduces the crash - it includes the isolation tester enhancement that I
posted separately [1]. It crashes reliably with v43 [2] if your fix v43-0005
is omitted.
[1] https://www.postgresql.org/message-id/4703.1774250534%40localhost

[2] https://www.postgresql.org/message-id/202603191855.fzsgsnyzfvpt%40alvherre.pgsql

--
Antonin Houska

Web: https://www.cybertec-postgresql.com