Re: POC: make mxidoff 64 bits

From Heikki Linnakangas
Subject Re: POC: make mxidoff 64 bits
Msg-id 669fa18c-82f9-4f56-86a9-75ba7bc4e7dc@iki.fi
In response to Re: POC: make mxidoff 64 bits  (Maxim Orlov <orlovmg@gmail.com>)
List pgsql-hackers
On 07/11/2025 18:03, Maxim Orlov wrote:
> I tried finding out how long it would take to convert a big number of
> segments. Unfortunately, I only have access to a very old machine right
> now. It took me 7 hours to generate this much data on my old
> Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 16 Gb of RAM.
> 
> Here are my rough measurements:
> 
> HDD
> $ sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
> $ time pg_upgrade
> ...
> real    4m59.459s
> user    0m19.974s
> sys     0m13.640s
> 
> SSD
> $ sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
> $ time pg_upgrade
> ...
> real    4m52.958s
> user    0m19.826s
> sys     0m13.624s
> 
> I aim to get access to more modern stuff and check it all out there.

Thanks, I also did some perf testing on my laptop. I wrote a little 
helper function to consume multixids, and used it to create a v17 
cluster with 100 million multixids. See attached 
consume-mxids.patch.txt. I then ran pg_upgrade on that, and measured how 
long the pg_multixact conversion part of pg_upgrade took. It took about 
1.2 s on my laptop. Extrapolating from that, converting 1 billion 
multixids would take 12 s. These were very simple multixacts with just 
one member each, though; realistic multixacts with more members would 
presumably take a little longer.
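For reference, the extrapolation above is a plain linear scaling of the measured time by the ratio of multixid counts. A minimal sketch follows; the function name is made up for illustration, and it assumes the cost per multixid stays constant, which the single-member test case does not prove for multixacts with more members:

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only: scale a measured duration
 * linearly with the number of multixids converted.  Assumes a constant
 * per-multixid cost.
 */
static double
sketch_extrapolate_seconds(double measured_s, double measured_n,
                           double target_n)
{
    return measured_s * (target_n / measured_n);
}
```

Calling sketch_extrapolate_seconds(1.2, 100e6, 1e9) gives roughly 12 seconds, matching the estimate above.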

In any case, I think we're in an acceptable ballpark here.

There's some very low-hanging fruit though: Profiling with 'linux-perf' 
suggested that a lot of CPU time was spent simply on the function call 
overhead of GetOldMultiXactIdSingleMember, SlruReadSwitchPage, 
RecordNewMultiXact, SlruWriteSwitchPage for each multixact. I added an 
inlined fast path to SlruReadSwitchPage and SlruWriteSwitchPage to 
eliminate the function call overhead of those in the common case that no 
page switch is needed. With that, the 100 million mxid test case I used 
went from 1.2 s to 0.9 s. We could optimize this further but I think 
this is good enough.
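To illustrate the kind of fast path described above, here is a minimal, standalone sketch. It is not the code from the patch: the struct, the function names, the page size and the 4-byte entry size are all assumptions made for the example. The idea is only that the inline wrapper returns the already-loaded page without a function call, and falls back to an out-of-line function when the page actually changes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192      /* assumed page size, for illustration only */

/* Hypothetical reader state; not the struct used in the patch. */
typedef struct SketchSlruReader
{
    int64_t     cur_pageno;     /* page currently held in buf, or -1 */
    long        slow_calls;     /* how many times the slow path ran */
    char        buf[SKETCH_BLCKSZ];
} SketchSlruReader;

/* Out-of-line slow path: stands in for reading 'pageno' from disk. */
static char *
sketch_switch_page_slow(SketchSlruReader *r, int64_t pageno)
{
    memset(r->buf, 0, sizeof(r->buf));  /* placeholder for the real read */
    r->cur_pageno = pageno;
    r->slow_calls++;
    return r->buf;
}

/* Inlined fast path: no call overhead when the page is already loaded. */
static inline char *
sketch_switch_page(SketchSlruReader *r, int64_t pageno)
{
    if (r->cur_pageno == pageno)
        return r->buf;
    return sketch_switch_page_slow(r, pageno);
}

/* Walk 'n' consecutive 4-byte entries; return the number of slow-path calls. */
static long
sketch_count_slow_calls(int64_t n)
{
    SketchSlruReader r;
    const int64_t entries_per_page = SKETCH_BLCKSZ / 4;

    r.cur_pageno = -1;
    r.slow_calls = 0;
    for (int64_t i = 0; i < n; i++)
        (void) sketch_switch_page(&r, i / entries_per_page);
    return r.slow_calls;
}
```

With these assumed sizes, 2048 consecutive entries share one page, so a sequential scan takes the slow path only once per 2048 lookups and everything else is handled inline.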

Some other changes since patch set v23:

- Rebased. I committed the wraparound bug fixes.

- I added an SlruFileName() helper function to slru_io.c, and support 
for reading SLRUs with long_segment_names==true. It's not needed 
currently, but it seemed like a weird omission. AllocSlruRead() actually 
left 'long_segment_names' uninitialized which is error-prone. We 
could've just documented it, but it seems just as easy to support it.

- I split out the multixact_internal.h header into a separate commit, to 
make it clearer which changes are related to 64-bit offsets.
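As a rough illustration of what a file name helper that handles both naming schemes can look like, here is a standalone sketch. It is modeled on my understanding of how the server's slru.c names segments (4 hex digits for short names, 15 hex digits for long names); the function name and signature are made up for the example and are not those of the helper in the patch:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the path of SLRU segment 'segno' under 'dir'.
 * The flag is passed explicitly, so the caller cannot leave it unset.
 * Returns the length snprintf() reports.
 */
static int
sketch_slru_file_name(char *path, size_t pathlen, const char *dir,
                      int64_t segno, bool long_segment_names)
{
    if (long_segment_names)
        return snprintf(path, pathlen, "%s/%015" PRIX64,
                        dir, (uint64_t) segno);

    return snprintf(path, pathlen, "%s/%04X", dir, (unsigned int) segno);
}
```

For example, segment 0x1A under "pg_multixact/offsets" comes out as "pg_multixact/offsets/001A" with short names, and as the same directory followed by a 15-digit zero-padded hex number with long names.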

I kept all the new test cases for now. We need to decide which ones are 
worth keeping, and polish and speed up the ones we decide to keep.


I'm getting one failure from the pg_upgrade/008_mxoff test:

> [14:43:38.422](0.530s) not ok 26 - dump outputs from original and restored regression databases match
> [14:43:38.422](0.000s) #   Failed test 'dump outputs from original and restored regression databases match'
> #   at /home/heikki/git-sandbox/postgresql/src/test/perl/PostgreSQL/Test/Utils.pm line 801.
> [14:43:38.422](0.000s) #          got: '1'
> #     expected: '0'
> === diff of /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/oldnode_6_dump.sql_adjusted and /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/newnode_6_dump.sql_adjusted
> === stdout ===
> --- /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/oldnode_6_dump.sql_adjusted  2025-11-12 14:43:38.030399957 +0200
> +++ /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/newnode_6_dump.sql_adjusted  2025-11-12 14:43:38.314399819 +0200
> @@ -2,8 +2,8 @@
>  -- PostgreSQL database dump
>  --
>  \restrict test
> --- Dumped from database version 17.6
> --- Dumped by pg_dump version 17.6
> +-- Dumped from database version 19devel
> +-- Dumped by pg_dump version 19devel
>  SET statement_timeout = 0;
>  SET lock_timeout = 0;
>  SET idle_in_transaction_session_timeout = 0;
> === stderr ===
> === EOF ===
> [14:43:38.425](0.004s) # >>> case #6

I ran the test with:

(rm -rf build/testrun/ build/tmp_install/; \
 olddump=/tmp/olddump-regress.sql oldinstall=/home/heikki/pgsql.17stable/ \
 meson test -C build --suite setup --suite pg_upgrade)

- Heikki
