Re: Parallel copy

From Andres Freund
Subject Re: Parallel copy
Date
Msg-id 20200224010951.bxecdyaduyjktg6q@alap3.anarazel.de
In response to Re: Parallel copy  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Parallel copy  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2020-02-19 11:38:45 +0100, Tomas Vondra wrote:
> I generally agree with the impression that parsing CSV is tricky and
> unlikely to benefit from parallelism in general. There may be cases with
> restrictions making it easier (e.g. restrictions on the format) but that
> might be a bit too complex to start with.
> 
> For example, I had an idea to parallelise the parsing by splitting it
> into two phases:

FWIW, I think we ought to rewrite our COPY parsers before we go for
complex schemes. They're way slower than a decent green-field
CSV/... parser.


> The one piece of information I'm missing here is at least a very rough
> quantification of the individual steps of CSV processing - for example
> if parsing takes only 10% of the time, it's pretty pointless to start by
> parallelising this part and we should focus on the rest. If it's 50% it
> might be a different story. Has anyone done any measurements?

Not recently, but I'm pretty sure that I've observed CSV parsing to be
way more than 10%.
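
To put rough numbers on why that fraction matters, here is a minimal
sketch of Amdahl's law. It is an illustration only, not a measurement:
the function name `max_speedup` and the worker count of 8 are
assumptions made for the example, and the 10% and 50% figures are just
the hypothetical values from the quoted question.

```python
def max_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: overall speedup when only `parallel_fraction` of the
    total runtime is spread across `workers`, and the rest stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical parsing shares taken from the question above.
    for share in (0.10, 0.50):
        bounded = max_speedup(share, workers=8)
        # With unlimited workers the speedup approaches 1 / (1 - share).
        ceiling = 1.0 / (1.0 - share)
        print(f"parsing = {share:.0%}: {bounded:.2f}x with 8 workers, "
              f"{ceiling:.2f}x at best")
```

If parsing were only 10% of the runtime, parallelising it alone could
never give more than about 1.11x overall; at 50% the ceiling is 2x.
That is why a measured breakdown should come before choosing what to
parallelise.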

Greetings,

Andres Freund


