Re: PG 9.3 materialized view VS Views, indexes, shared memory

Поиск
Список
Период
Сортировка
От Nicolas Paris
Тема Re: PG 9.3 materialized view VS Views, indexes, shared memory
Дата
Msg-id CA+ssMOQL1bEzG+uJWQVgbtOUm9d2OVk_x5TpMHf9xEKj1Zj5DA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: PG 9.3 materialized view VS Views, indexes, shared memory  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список pgsql-performance
Why are you dropping and re-loading? You mentioned it before and it sounded like it had something to do with adding columns, but you 
don't have to drop and reload to add a column.

​Adding a NULL column is fast. Dropping one too. I need to set some row as TRUE. I can do it with an update, but in postgresql update is done by delete then insert with copy of the row. This is really slow. A drop cascade, then bulk load is better.

This is not the only reason. Drop & load simplify all the ETL process. No question of delta changes and no "fuck brain" when a problem occurs or a modification of the table. I've tested, it loads 20milion rows in 5 min (without time for reindexing and time to retrieve datas)

2015-03-10 9:31 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 3/9/15 8:17 AM, Nicolas Paris wrote:
(sorry for top-posting, gmail does not help.)

*shakes fist at gmail*

Thanks to your advice Jim, I have done an other test :
No partial indexes, just a partial index on boolean columns does the
job.  (I get same perfs as MV)
CREATE INDEX ..ON (BoolColumnX) WHERE BoolColumnX IS TRUE

Then VIEW =
SELECT colA....colZ
FROM huge_table
WHERE BoolColumnX IS TRUE

Then this only index is used 800times (for each bool col)  and saves
place as it does'nt indexes NULL values, and does no replicate. subsets.
Moreover the huge indexes are allways loaded in cache memory.

Cool. :)

    According to this link
    http://postgresql.nabble.com/NULL-saves-disk-space-td4344106.html
    NULL values do not take place if only one other column are null for
    that row.
    Boolean takes 1 byte wheras smallint 2bytes.
    Then the space problem is not anymore a problem with boolean columns
    95% empty

    One thing that is really great with postgresql is transaction for
    drop table cascade, that allow te restore all stuf index, views on a
    rollback if problem in loading appears.
    I hope using one transaction to drop/load many table is not a
    performance issue ?

Why are you dropping and re-loading? You mentioned it before and it sounded like it had something to do with adding columns, but you don't have to drop and reload to add a column.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: PG 9.3 materialized view VS Views, indexes, shared memory
Следующее
От: Gunnlaugur Thor Briem
Дата:
Сообщение: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT