Re: Parallel copy

From	Tomas Vondra
Subject	Re: Parallel copy
Date	22 February 2020, 00:28:02
Msg-id	20200222002802.yew5buvrd2yrjkm6@development
In reply to	Re: Parallel copy (Ants Aasma <ants@cybertec.at>)
List	pgsql-hackers

On Fri, Feb 21, 2020 at 02:54:31PM +0200, Ants Aasma wrote:
>On Thu, 20 Feb 2020 at 18:43, David Fetter <david@fetter.org> wrote:
>> On Thu, Feb 20, 2020 at 02:36:02PM +0100, Tomas Vondra wrote:
>> > I think the wc2 is showing that maybe instead of parallelizing the
>> > parsing, we might instead try using a different tokenizer/parser and
>> > make the implementation more efficient instead of just throwing more
>> > CPUs on it.
>>
>> That was what I had in mind.
>>
>> > I don't know if our code is similar to what wc does, maybe parsing
>> > csv is more complicated than what wc does.
>>
>> CSV parsing differs from wc in that there are more states in the state
>> machine, but I don't see anything fundamentally different.
>
>The trouble with a state machine based approach is that the state
>transitions form a dependency chain, which means that at best the
>processing rate will be 4-5 cycles per byte (L1 latency to fetch the
>next state).
>
>I whipped together a quick prototype that uses SIMD and bitmap
>manipulations to do the equivalent of CopyReadLineText() in csv mode
>including quotes and escape handling. It runs at 0.25-0.5 cycles per
>byte.
>

Interesting. How does that compare to what we currently have?
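
To make the comparison concrete, here is a rough sketch of the two approaches
being contrasted above. This is not the prototype (which I have not seen) and
not the actual CopyReadLineText() code; the function names are made up for
illustration. It assumes the quote character is '"', that quotes inside a
quoted field are escaped by doubling, and it ignores custom escape characters
and \r handling. The bitmap variant builds the masks with a plain loop where a
real implementation would presumably use SIMD compares, and turns the quote
mask into an "inside quotes" mask with a prefix XOR, so the only dependency
between blocks is a single carry bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar baseline: one state transition per byte. Every iteration depends
 * on the in_quote value produced by the previous one. */
static size_t
find_line_ends_scalar(const char *buf, size_t len, size_t *out, size_t max_out)
{
    int     in_quote = 0;
    size_t  n = 0;

    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] == '"')
            in_quote = !in_quote;
        else if (buf[i] == '\n' && !in_quote && n < max_out)
            out[n++] = i;
    }
    return n;
}

/* Prefix XOR: bit i of the result is the XOR of bits 0..i of x. */
static uint64_t
prefix_xor(uint64_t x)
{
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    return x;
}

/* Bitmap variant: process 64 bytes per step using one bit per byte. */
static size_t
find_line_ends_bitmap(const char *buf, size_t len, size_t *out, size_t max_out)
{
    uint64_t carry = 0;     /* all ones if the previous block ended in quotes */
    size_t   n = 0;

    for (size_t base = 0; base < len; base += 64)
    {
        size_t   chunk = (len - base < 64) ? len - base : 64;
        uint64_t quotes = 0;
        uint64_t newlines = 0;

        /* Independent per-byte compares; SIMD would do this part. */
        for (size_t j = 0; j < chunk; j++)
        {
            quotes |= (uint64_t) (buf[base + j] == '"') << j;
            newlines |= (uint64_t) (buf[base + j] == '\n') << j;
        }

        /* Bit j is set when byte j lies inside a quoted field. */
        uint64_t in_quote = prefix_xor(quotes) ^ carry;
        uint64_t ends = newlines & ~in_quote;

        /* Carry the quote state of the last byte into the next block. */
        carry = (uint64_t) 0 - (in_quote >> 63);

        /* Emit the offset of each set bit, lowest first. */
        while (ends != 0 && n < max_out)
        {
            size_t bit = 0;

            while (((ends >> bit) & 1) == 0)
                bit++;
            out[n++] = base + bit;
            ends &= ends - 1;   /* clear the lowest set bit */
        }
    }
    return n;
}
```

Both functions return the number of unquoted newlines found and store their
byte offsets in out. A doubled quote toggles the state twice, so it needs no
special case in either version.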


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
