Re: Benchmark Data requested --- pgloader CE design ideas

From Mark Lewis
Subject Re: Benchmark Data requested --- pgloader CE design ideas
Date
Msg-id 1202404489.24047.3.camel@archimedes
In reply to Re: Benchmark Data requested --- pgloader CE design ideas  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-performance
> > I was thinking of not even reading the file content from the controller
> > thread, just decide splitting points in bytes (0..ST_SIZE/4 -
> > ST_SIZE/4+1..2*ST_SIZE/4 etc) and let the reading thread fine-tune by
> > beginning to process input after having read first newline, etc.
>
> The problem I was pointing out is that if chunk#2 moved forward a few bytes
> before it started reading in search of a newline, how will chunk#1 know
> that it's supposed to read up to that further point?  You have to stop #1
> from reading further when it catches up with where #2 started.  Since the
> start of #2 is fuzzy until some reading is done, what you're describing
> will need #2 to send some feedback to #1 after they've both started, and
> that sounds bad to me.  I like designs where the boundaries between
> threads are clearly defined before any of them start and none of them ever
> talk to the others.

I don't think that any communication is needed beyond the beginning of
the threads.  Each thread knows that it should start at byte offset X
and end at byte offset Y, but if Y happens to be in the middle of a
record then just keep going until the end of the record.  As long as the
algorithm for reading past the end marker is the same as the algorithm
for skipping past the beginning marker then all is well.
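
To illustrate the idea, here is a minimal Python sketch, not pgloader's
actual code. It assumes records are newline-terminated and never contain
embedded newlines (quoted multi-line CSV fields would break it). The names
chunk_bounds, first_record_at_or_after and read_chunk are hypothetical. The
rule used is "a record belongs to the chunk in which its first byte falls",
and the same helper decides both where a reader starts and where its
neighbour stops.

```python
def chunk_bounds(size, n):
    """Split [0, size) into n byte ranges, fixed before any reader starts."""
    return [(size * i // n, size * (i + 1) // n) for i in range(n)]


def first_record_at_or_after(f, offset):
    """Return the offset of the first record that starts at or after `offset`."""
    if offset == 0:
        return 0
    # Back up one byte: if a record starts exactly at `offset`, the byte
    # before it is a newline and readline() consumes only that newline.
    f.seek(offset - 1)
    f.readline()
    return f.tell()


def read_chunk(path, start, end):
    """Read every record whose first byte lies in [start, end)."""
    records = []
    with open(path, "rb") as f:
        pos = first_record_at_or_after(f, start)
        f.seek(pos)
        # A record that begins before `end` is read in full, even if it
        # runs past `end`; the next reader skips it by the same rule.
        while pos < end:
            line = f.readline()
            if not line:  # end of file
                break
            records.append(line)
            pos = f.tell()
    return records
```

Each worker would call read_chunk(path, start, end) with one pair from
chunk_bounds(os.path.getsize(path), n). No worker needs to know where its
neighbour actually began reading, because both sides derive the boundary
from the same byte offset.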

-- Mark Lewis
