Re: Performace Optimization for Dummies

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: Performace Optimization for Dummies
Дата
Msg-id b42b73150609281306r4806ab8fy7ddbb2fac31276d2@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Performace Optimization for Dummies  ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
Список pgsql-performance
On 9/28/06, Carlo Stonebanks <stonec.register@sympatico.ca> wrote:
> > are you using the 'copy' interface?
>
> Straightforward inserts - the import data has to transformed, normalised and
> de-duped by the import program. I imagine the copy interface is for more
> straightforward data importing. These are - buy necessity - single row
> inserts.

right. see comments below.

> > thats a tough question.  my gut says that windows will not scale as
> > well as recent linux kernels in high load environments.
>
> But not in the case of a single import program trying to seed a massive
> database?

probably not.

> > hearing good things about the woodcrest. pre-woodcrest xeon (dempsey
> > down) is outclassed by the opteron.
>
> Need to find a way to deterimine the Xeon type. The server was bought in
> early 2006, and it looks like woodcrest was form July.

ok, there are better chips out there but again this is not something
you would really notice outside of high load environements.

> > 1. can probably run fsync=off during the import
> > 2. if import is single proecess, consider temporary bump to memory for
> > index creation. or, since you have four cores consider having four
> > processes import the data somehow.
> > 3. turn off stats collector, stats_command_string, stats_row_level,
> > and autovacuum during import.

by the way, stats_command_string is a known performance killer that
iirc was improved in 8.2. just fyi.

I would suggest at least consideration of retooling your import as
follows...it might be a fun project to learn some postgresql
internals.  I'm assuming you are doing some script preprocessing in a
language like perl:

bulk load denomalized tables into scratch tables into the postgresql
database. create indexes appropriate to the nomalization process
remembering you can index on virtually any expression in postgresql
(including regex substitution).

use sql to process the data. if tables are too large to handle with
monolithic queries, use cursors and/or functions to handle the
conversion.  now you can keep track of progress using pl/pgsql raise
notice command for example.

merlin

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Carlo Stonebanks"
Дата:
Сообщение: Re: Performace Optimization for Dummies
Следующее
От: Cedric Boudin
Дата:
Сообщение: archive wal's failure and load increase.