clang's sanitizer makes stringToNode() extremely slow

Поиск
Список
Период
Сортировка
От Alexander Lakhin
Тема clang's sanitizer makes stringToNode() extremely slow
Дата
Msg-id eb0a5bbf-1dc8-ea26-e9c6-0bf8f190dd54@gmail.com
обсуждение исходный текст
Список pgsql-hackers
Hello hackers,

When using a server built with clang (15, 18) with sanitizers enabled,
the last query in the following script:
SET parallel_setup_cost = 0;
SET min_parallel_table_scan_size = 0;

SELECT a::text INTO t FROM generate_series(1, 1000) a;
\timing on
SELECT string_agg(a, ',') FROM t WHERE a = REPEAT('0', 400000);

runs for more than a minute on my workstation (a larger repeat count gives
much longer duration):
Time: 66017.594 ms (01:06.018)

The typical stack trace for a running parallel worker is:
#0  0x0000559a23671885 in __sanitizer::internal_strlen(char const*) ()
#1  0x0000559a236568ed in StrtolFixAndCheck(void*, char const*, char**, char*, int) ()
#2  0x0000559a236423dc in __interceptor_strtol ()
#3  0x0000559a240027d9 in atoi (...) at readfuncs.c:629
#5  0x0000559a23fb03f0 in _readConst () at readfuncs.c:275
#6  parseNodeString () at ./readfuncs.switch.c:29
#7  0x0000559a23faa421 in nodeRead (
     token=0x7fee75cf3bd2 "{CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false 
:constisnull false :location -1 :constvalue 400004 [ 16 106 24 0 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48
48"...,
     tok_len=1) at read.c:338
#8  0x0000559a23faa916 in nodeRead (...) at read.c:452
#9  0x0000559a23fb3f34 in _readOpExpr () at ./readfuncs.funcs.c:279
#10 0x0000559a23fb0842 in parseNodeString () at ./readfuncs.switch.c:47
#11 0x0000559a23faa421 in nodeRead (...) at read.c:338
#12 0x0000559a23faa916 in nodeRead (...) at read.c:452
#13 0x0000559a23fefb74 in _readSeqScan () at ./readfuncs.funcs.c:3954
#14 0x0000559a23fb2b97 in parseNodeString () at ./readfuncs.switch.c:559
#15 0x0000559a23faa421 in nodeRead (...) at read.c:338
#16 0x0000559a23ffc033 in _readAgg () at ./readfuncs.funcs.c:4685
#17 0x0000559a23fb2dd3 in parseNodeString () at ./readfuncs.switch.c:611
#18 0x0000559a23faa421 in nodeRead (...) at read.c:338
#19 0x0000559a23feb340 in _readPlannedStmt () at ./readfuncs.funcs.c:3685
#20 0x0000559a23fb2ad1 in parseNodeString () at ./readfuncs.switch.c:541
#21 0x0000559a23faa421 in nodeRead (...) at read.c:338
#22 0x0000559a23fa99d8 in stringToNodeInternal (...) at read.c:92
#24 0x0000559a23d66609 in ExecParallelGetQueryDesc (...) at execParallel.c:1250
#25 ParallelQueryMain (...) at execParallel.c:1424
#26 0x0000559a238cfe13 in ParallelWorkerMain (...) at parallel.c:1516
#27 0x0000559a241e5b6a in BackgroundWorkerMain (...) at bgworker.c:848
#28 0x0000559a241ec254 in postmaster_child_launch (...) at launch_backend.c:265
#29 0x0000559a241f1c15 in do_start_bgworker (...) at postmaster.c:4270
#30 maybe_start_bgworkers () at postmaster.c:4501
#31 0x0000559a241f486e in process_pm_pmsignal () at postmaster.c:3774
#32 ServerLoop () at postmaster.c:1667
#33 0x0000559a241f0ed6 in PostmasterMain (...) at postmaster.c:1372
#34 0x0000559a23ebe16d in main (...) at main.c:197

The flamegraph (which shows that readDatum() -> __interceptor_strtol() ->
StrtolFixAndCheck() -> __sanitizer::internal_strlen () takes >99% of time)
is attached.
(I could not reproduce this behaviour with the gcc's sanitizer.)

Moreover, this query cannot be canceled, you can only kill -SIGKILL
parallel workers to interrupt it.

Best regards,
Alexander
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nathan Bossart
Дата:
Сообщение: Re: improve performance of pg_dump --binary-upgrade
Следующее
От: Justin Pryzby
Дата:
Сообщение: Re: pg17 issues with not-null contraints