Re: PG 9.3 materialized view VS Views, indexes, shared memory

Поиск

Список

Период

Сортировка

От	Nicolas Paris
Тема	Re: PG 9.3 materialized view VS Views, indexes, shared memory
Дата	10 марта 2015 г. 11:53:29
Msg-id	CA+ssMOQL1bEzG+uJWQVgbtOUm9d2OVk_x5TpMHf9xEKj1Zj5DA@mail.gmail.com обсуждение исходный текст
Ответ на	Re: PG 9.3 materialized view VS Views, indexes, shared memory (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список	pgsql-performance

Дерево обсуждения

Why are you dropping and re-loading? You mentioned it before and it sounded like it had something to do with adding columns, but you

don't have to drop and reload to add a column.

Adding a NULL column is fast. Dropping one too. I need to set some row as TRUE. I can do it with an update, but in postgresql update is done by delete then insert with copy of the row. This is really slow. A drop cascade, then bulk load is better.

This is not the only reason. Drop & load simplify all the ETL process. No question of delta changes and no "fuck brain" when a problem occurs or a modification of the table. I've tested, it loads 20milion rows in 5 min (without time for reindexing and time to retrieve datas)

2015-03-10 9:31 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 3/9/15 8:17 AM, Nicolas Paris wrote:
(sorry for top-posting, gmail does not help.)

*shakes fist at gmail*

Thanks to your advice Jim, I have done an other test :
No partial indexes, just a partial index on boolean columns does the
job. (I get same perfs as MV)
CREATE INDEX ..ON (BoolColumnX) WHERE BoolColumnX IS TRUE

Then VIEW =
SELECT colA....colZ
FROM huge_table
WHERE BoolColumnX IS TRUE

Then this only index is used 800times (for each bool col) and saves
place as it does'nt indexes NULL values, and does no replicate. subsets.
Moreover the huge indexes are allways loaded in cache memory.

Cool. :)

According to this link
http://postgresql.nabble.com/NULL-saves-disk-space-td4344106.html
NULL values do not take place if only one other column are null for
that row.
Boolean takes 1 byte wheras smallint 2bytes.
Then the space problem is not anymore a problem with boolean columns
95% empty

One thing that is really great with postgresql is transaction for
drop table cascade, that allow te restore all stuf index, views on a
rollback if problem in loading appears.
I hope using one transaction to drop/load many table is not a
performance issue ?

Why are you dropping and re-loading? You mentioned it before and it sounded like it had something to do with adding columns, but you don't have to drop and reload to add a column.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

В списке pgsql-performance по дате отправления:

Предыдущее

От: Jim Nasby
Дата: 10 марта 2015 г., 11:31:48
Сообщение: Re: PG 9.3 materialized view VS Views, indexes, shared memory

Следующее

От: Gunnlaugur Thor Briem
Дата: 11 марта 2015 г., 18:55:06
Сообщение: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: PG 9.3 materialized view VS Views, indexes, shared memory

Предыдущее

Следующее