Re: Benchmark Data requested

From Heikki Linnakangas
Subject Re: Benchmark Data requested
Date
Msg-id 47A8C1B9.10200@enterprisedb.com
In response to Re: Benchmark Data requested  (Dimitri Fontaine <dfontaine@hi-media.com>)
Responses Re: Benchmark Data requested  ("Jignesh K. Shah" <J.K.Shah@Sun.COM>)
List pgsql-performance
Dimitri Fontaine wrote:
> On Tuesday 05 February 2008, Simon Riggs wrote:
>> I'll look at COPY FROM internals to make this faster. I'm looking at
>> this now to refresh my memory; I already had some plans on the shelf.
>
> Maybe stealing some ideas from pg_bulkload could somewhat help here?
>   http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf
>
> IIRC it's mainly about how to optimize index updating while loading data, and
> I've heard complaints on the line "this external tool has to know too much
> about PostgreSQL internals to be trustworthy as non-core code"... so...

I've been thinking of looking into that as well. The basic trick
pg_bulkload is using is to populate the index as the data is being
loaded. There's no fundamental reason why we couldn't do that internally
in COPY. Triggers or constraints that access the table being loaded
would make it impossible, but we should be able to detect that and fall
back to what we have now.

What I'm basically thinking about is to modify the indexam API for
building a new index, so that COPY would feed the tuples to the indexam,
instead of the indexam opening and scanning the heap. The b-tree indexam
would spool the tuples into a tuplesort as the COPY progresses, and
build the index from that at the end as usual.
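
As a rough illustration of the difference, here is a toy model in Python. It is not PostgreSQL code and does not use the real indexam or tuplesort APIs: every name in it (copy_with_spooled_index, copy_with_per_row_index, lookup) is made up for the sketch, a list stands in for the heap, and a sorted list of (key, tid) pairs stands in for the b-tree leaf level. The first function spools index entries during the load and sorts once at the end; the second maintains the index row by row, which is roughly what a fallback path would do.

```python
import bisect


def copy_with_spooled_index(rows, key_of):
    """Load rows into a heap and build a sorted index from a spool at the end."""
    heap = []    # stands in for the table; list position acts as the tuple id
    spool = []   # unsorted (key, tid) pairs collected while loading
    for row in rows:
        tid = len(heap)
        heap.append(row)
        spool.append((key_of(row), tid))  # the loader feeds the entry to the index side
    spool.sort()  # a single sort at the end, playing the role of the tuplesort
    return heap, spool


def copy_with_per_row_index(rows, key_of):
    """Fallback path: keep the index sorted after every inserted row."""
    heap = []
    index = []
    for row in rows:
        tid = len(heap)
        heap.append(row)
        bisect.insort(index, (key_of(row), tid))  # ordered insert per row
    return heap, index


def lookup(heap, index, key):
    """Return all heap rows whose index key equals `key`."""
    pos = bisect.bisect_left(index, (key, -1))  # tids are >= 0, so this finds the first match
    found = []
    while pos < len(index) and index[pos][0] == key:
        found.append(heap[index[pos][1]])
        pos += 1
    return found
```

Both paths end with the same index contents; the spooled variant just defers all ordering work to one sort after the last row has been loaded, and never has to read the heap back.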

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
