Re: adding wait_start column to pg_locks
От | Fujii Masao |
---|---|
Тема | Re: adding wait_start column to pg_locks |
Дата | |
Msg-id | 655f1a5b-4fa8-53ba-c3e2-8cee31587d37@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: adding wait_start column to pg_locks (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Ответы |
Re: adding wait_start column to pg_locks
|
Список | pgsql-hackers |
On 2021/02/10 10:43, Fujii Masao wrote: > > > On 2021/02/09 23:31, torikoshia wrote: >> On 2021-02-09 22:54, Fujii Masao wrote: >>> On 2021/02/09 19:11, Fujii Masao wrote: >>>> >>>> >>>> On 2021/02/09 18:13, Fujii Masao wrote: >>>>> >>>>> >>>>> On 2021/02/09 17:48, torikoshia wrote: >>>>>> On 2021-02-05 18:49, Fujii Masao wrote: >>>>>>> On 2021/02/05 0:03, torikoshia wrote: >>>>>>>> On 2021-02-03 11:23, Fujii Masao wrote: >>>>>>>>>> 64-bit fetches are not atomic on some platforms. So spinlock is necessary when updating "waitStart" without holdingthe partition lock? Also GetLockStatusData() needs spinlock when reading "waitStart"? >>>>>>>>> >>>>>>>>> Also it might be worth thinking to use 64-bit atomic operations like >>>>>>>>> pg_atomic_read_u64(), for that. >>>>>>>> >>>>>>>> Thanks for your suggestion and advice! >>>>>>>> >>>>>>>> In the attached patch I used pg_atomic_read_u64() and pg_atomic_write_u64(). >>>>>>>> >>>>>>>> waitStart is TimestampTz i.e., int64, but it seems pg_atomic_read_xxx and pg_atomic_write_xxx only supports unsignedint, so I cast the type. >>>>>>>> >>>>>>>> I may be using these functions not correctly, so if something is wrong, I would appreciate any comments. >>>>>>>> >>>>>>>> >>>>>>>> About the documentation, since your suggestion seems better than v6, I used it as is. >>>>>>> >>>>>>> Thanks for updating the patch! >>>>>>> >>>>>>> + if (pg_atomic_read_u64(&MyProc->waitStart) == 0) >>>>>>> + pg_atomic_write_u64(&MyProc->waitStart, >>>>>>> + pg_atomic_read_u64((pg_atomic_uint64 *) &now)); >>>>>>> >>>>>>> pg_atomic_read_u64() is really necessary? I think that >>>>>>> "pg_atomic_write_u64(&MyProc->waitStart, now)" is enough. >>>>>>> >>>>>>> + deadlockStart = get_timeout_start_time(DEADLOCK_TIMEOUT); >>>>>>> + pg_atomic_write_u64(&MyProc->waitStart, >>>>>>> + pg_atomic_read_u64((pg_atomic_uint64 *) &deadlockStart)); >>>>>>> >>>>>>> Same as above. >>>>>>> >>>>>>> + /* >>>>>>> + * Record waitStart reusing the deadlock timeout timer. >>>>>>> + * >>>>>>> + * It would be ideal this can be synchronously done with updating >>>>>>> + * lock information. Howerver, since it gives performance impacts >>>>>>> + * to hold partitionLock longer time, we do it here asynchronously. >>>>>>> + */ >>>>>>> >>>>>>> IMO it's better to comment why we reuse the deadlock timeout timer. >>>>>>> >>>>>>> proc->waitStatus = waitStatus; >>>>>>> + pg_atomic_init_u64(&MyProc->waitStart, 0); >>>>>>> >>>>>>> pg_atomic_write_u64() should be used instead? Because waitStart can be >>>>>>> accessed concurrently there. >>>>>>> >>>>>>> I updated the patch and addressed the above review comments. Patch attached. >>>>>>> Barring any objection, I will commit this version. >>>>>> >>>>>> Thanks for modifying the patch! >>>>>> I agree with your comments. >>>>>> >>>>>> BTW, I ran pgbench several times before and after applying >>>>>> this patch. >>>>>> >>>>>> The environment is virtual machine(CentOS 8), so this is >>>>>> just for reference, but there were no significant difference >>>>>> in latency or tps(both are below 1%). >>>>> >>>>> Thanks for the test! I pushed the patch. >>>> >>>> But I reverted the patch because buildfarm members rorqual and >>>> prion don't like the patch. I'm trying to investigate the cause >>>> of this failures. >>>> >>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rorqual&dt=2021-02-09%2009%3A20%3A10 >>> >>> - relation | locktype | mode >>> ------------------+----------+--------------------- >>> - test_prepared_1 | relation | RowExclusiveLock >>> - test_prepared_1 | relation | AccessExclusiveLock >>> -(2 rows) >>> - >>> +ERROR: invalid spinlock number: 0 >>> >>> "rorqual" reported that the above error happened in the server built with >>> --disable-atomics --disable-spinlocks when reading pg_locks after >>> the transaction was prepared. The cause of this issue is that "waitStart" >>> atomic variable in the dummy proc created at the end of prepare transaction >>> was not initialized. I updated the patch so that pg_atomic_init_u64() is >>> called for the "waitStart" in the dummy proc for prepared transaction. >>> Patch attached. I confirmed that the patched server built with >>> --disable-atomics --disable-spinlocks passed all the regression tests. >> >> Thanks for fixing the bug, I also tested v9.patch configured with >> --disable-atomics --disable-spinlocks on my environment and confirmed >> that all tests have passed. > > Thanks for the test! > > I found another bug in the patch. InitProcess() initializes "waitStart", > but previously InitAuxiliaryProcess() did not. This could cause "invalid > spinlock number" error when reading pg_locks in the standby server. > I fixed that. Attached is the updated version of the patch. I pushed this version. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
В списке pgsql-hackers по дате отправления: