Re: Performace Optimization for Dummies

Поиск

Список

Период

Сортировка

От	Carlo Stonebanks
Тема	Re: Performace Optimization for Dummies
Дата	28 сентября 2006 г. 17:15:16
Msg-id	efhag4$1n9d$1@news.hub.org обсуждение
Ответ на	Performace Optimization for Dummies ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
Ответы	Re: Performace Optimization for Dummies
Список	pgsql-performance

Дерево обсуждения

The deduplication process requires so many programmed procedures that it
runs on the client. Most of the de-dupe lookups are not "straight" lookups,
but calculated ones emplying fuzzy logic. This is because we cannot dictate
the format of our input data and must deduplicate with what we get.

This was one of the reasons why I went with PostgreSQL in the first place,
because of the server-side programming options. However, I saw incredible
performance hits when running processes on the server and I partially
abandoned the idea (some custom-buiilt name-comparison functions still run
on the server).

I am using Tcl on both the server and the client. I'm not a fan of Tcl, but
it appears to be quite well implemented and feature-rich in PostgreSQL. I
find PL/pgsql awkward - even compared to Tcl. (After all, I'm just a
programmer...  we do tend to be a little limited.)

The import program actually runs on the server box as a db client and
involves about 3000 lines of code (and it will certainly grow steadily as we
add compatability with more import formats). Could a process involving that
much logic run on the db server, and would there really be a benefit?

Carlo

""Jim C. Nasby"" <jim@nasby.net> wrote in message
news:20060928184538.GV34238@nasby.net...
> On Thu, Sep 28, 2006 at 01:53:22PM -0400, Carlo Stonebanks wrote:
>> > are you using the 'copy' interface?
>>
>> Straightforward inserts - the import data has to transformed, normalised
>> and
>> de-duped by the import program. I imagine the copy interface is for more
>> straightforward data importing. These are - buy necessity - single row
>> inserts.
>
> BTW, stuff like de-duping is something you really want the database -
> not an external program - to be doing. Think about loading the data into
> a temporary table and then working on it from there.
> --
> Jim Nasby                                            jim@nasby.net
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Performace Optimization for Dummies