Faster str to int conversion (was Table with large number of intcolumns, very slow COPY FROM)

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Faster str to int conversion (was Table with large number of intcolumns, very slow COPY FROM)
Дата
Msg-id 20171208214437.qgn6zdltyq5hmjpk@alap3.anarazel.de
обсуждение исходный текст
Ответы Re: Faster str to int conversion (was Table with large number of intcolumns, very slow COPY FROM)  (Andres Freund <andres@anarazel.de>)
Re: Faster str to int conversion (was Table with large number of intcolumns, very slow COPY FROM)  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi,


On 2017-12-08 10:17:34 -0800, Andres Freund wrote:
> the strtoll is libc functionality triggered by pg_atoi(), something I've
> seen show up in numerous profiles. I think it's probably time to have
> our own optimized version of it rather than relying on libcs.

Attached is a hand-rolled version. After quickly hacking up one from
scratch, I noticed we already kind of have one for int64 (scanint8), so
I changed the structure of this one to be relatively similar.

It's currently using the overflow logic from [1], but that's not
fundamentally required, we could rely on fwrapv for this one too.

This one improves performance of the submitted workload from 1223.950ms
to 1020.640ms (best of three). The profile's shape changes quite
noticeably:

master:
+   15.38%  postgres  libc-2.25.so      [.] __GI_____strtoll_l_internal
+   11.79%  postgres  postgres          [.] heap_fill_tuple
+    8.00%  postgres  postgres          [.] CopyFrom
+    7.40%  postgres  postgres          [.] CopyReadLine
+    6.79%  postgres  postgres          [.] ExecConstraints
+    6.68%  postgres  postgres          [.] NextCopyFromRawFields
+    6.36%  postgres  postgres          [.] heap_compute_data_size
+    6.02%  postgres  postgres          [.] pg_atoi
patch:
+   13.70%  postgres  postgres          [.] heap_fill_tuple
+   10.46%  postgres  postgres          [.] CopyFrom
+    9.31%  postgres  postgres          [.] pg_strto32
+    8.39%  postgres  postgres          [.] CopyReadLine
+    7.88%  postgres  postgres          [.] ExecConstraints
+    7.63%  postgres  postgres          [.] InputFunctionCall
+    7.41%  postgres  postgres          [.] heap_compute_data_size
+    7.21%  postgres  postgres          [.] pg_verify_mbstr
+    5.49%  postgres  postgres          [.] NextCopyFromRawFields


This probably isn't going to resolve Alex's performance concerns
meaningfully, but seems quite worthwhile to do anyway.

We probably should have int8/16/64 version coded just as use the 32bit
version, but I decided to leave that out for now. Primarily interested
in comments.  Wonder a bit whether it's worth providing an 'errorOk'
mode like scanint8 does, but surveying its callers suggests we should
rather change them to not need it...

Greetings,

Andres Freund

[1] http://archives.postgresql.org/message-id/20171030112751.mukkriz2rur2qkxc%40alap3.anarazel.de

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] Re: Is anything preventing us from allowing write toforeign tables from standby?
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] Re: proposal - psql: possibility to specify sort fordescribe commands, when size is printed