Re: [PERFORM] performance problem on big tables

From: Mariel Cherkassky
Subject: Re: [PERFORM] performance problem on big tables
Date:
Msg-id: CA+t6e1nG8bF4-hrvjijhJ_nC5OXmw32eXNYdxuYMkPSk2QLrag@mail.gmail.com
In reply to: [PERFORM] performance problem on big tables  (Mariel Cherkassky <mariel.cherkassky@gmail.com>)
Responses: Re: [PERFORM] performance problem on big tables  (Jeff Janes <jeff.janes@gmail.com>)
Re: [PERFORM] performance problem on big tables  (Scott Marlowe <scott.marlowe@gmail.com>)
List: pgsql-performance
Hi,
So I ran the checks that Jeff mentioned:
\copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour and 35 minutes
\copy local_postgresql_table from /tmp/tmp with binary - didn't run, because the remote Oracle database is currently under maintenance.

So I decided to follow MichaelDBA's tips: I set the RAM on my machine to 16G and configured effective_cache_size to 14G, shared_buffers to 2G, and maintenance_work_mem to 4G.
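
For reference, those settings look like this in postgresql.conf (a minimal sketch with the values above; changing shared_buffers requires a server restart):

# postgresql.conf -- values as described above, on a 16G machine
effective_cache_size = 14GB    # planner's estimate of OS cache plus shared buffers
shared_buffers = 2GB           # server buffer cache; needs a restart to change
maintenance_work_mem = 4GB     # memory available to CREATE INDEX, VACUUM, etc.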

I started running the copy checks again, and so far it has copied 5G in 10 minutes. I have some questions:
1) When I run insert into local_postgresql_table select * from remote_oracle_table, is the data inserted into the local table in bulk or row by row? If it is in bulk, why is copy the better option for this case? (See the sketch after this list.)
2) Should the copy from the dump into the postgresql database take less time than the copy to the dump?
3) What do you think about the new memory parameters I configured?
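
To make question 1 concrete, these are the two load paths being compared (a sketch using the table names from this thread; how many rows the foreign data wrapper fetches per round trip depends on its fetch/prefetch settings):

-- Path A: one SQL statement; rows are pulled through the foreign
-- table and inserted locally in a single transaction.
INSERT INTO local_postgresql_table SELECT * FROM remote_oracle_table;

-- Path B: two steps via a flat file, as in the checks above; COPY's
-- per-row processing on the local side is cheaper than INSERT's.
\copy (SELECT * FROM remote_oracle_table) TO '/tmp/tmp' WITH BINARY
\copy local_postgresql_table FROM '/tmp/tmp' WITH BINARY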






2017-08-14 16:24 GMT+03:00 Mariel Cherkassky <mariel.cherkassky@gmail.com>:

I have performance issues with two big tables. Those tables are located on a remote Oracle database. I'm running the query: insert into local_postgresql_table select * from oracle_remote_table.
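
For context, reaching an Oracle table from PostgreSQL like this normally goes through a foreign data wrapper; here is a minimal sketch assuming oracle_fdw, with hypothetical connection details, credentials, and columns:

-- Hypothetical setup; the connection string, user, schema, and
-- columns are placeholders, not values from this thread.
CREATE EXTENSION oracle_fdw;

CREATE SERVER oracle_srv FOREIGN DATA WRAPPER oracle_fdw
  OPTIONS (dbserver '//oracle-host:1521/ORCL');

CREATE USER MAPPING FOR CURRENT_USER SERVER oracle_srv
  OPTIONS (user 'app_user', password 'secret');

CREATE FOREIGN TABLE oracle_remote_table (
  id   integer,
  data text
) SERVER oracle_srv OPTIONS (schema 'APP', table 'BIG_TABLE');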

The first table has 45M records and its size is 23G. Importing the data from the remote Oracle database takes 1 hour and 38 minutes. After that I create 13 regular indexes on the table, which takes 10 minutes per index -> 2 hours and 10 minutes in total.

The second table has 29M records and its size is 26G. Importing the data from the remote Oracle database takes 2 hours and 30 minutes. Creating the indexes takes 1 hour and 30 minutes (some are single-column indexes that take 5 min each, and some are multi-column indexes that take 11 min each).

These operations are very problematic for me, and I'm searching for a way to improve the performance. The parameters I assigned:

min_parallel_relation_size = 200MB
max_parallel_workers_per_gather = 5
max_worker_processes = 8
effective_cache_size = 2500MB
work_mem = 16MB
maintenance_work_mem = 1500MB
shared_buffers = 2000MB
RAM : 5G
CPU CORES : 8

- I tried running select count(*) from the table in Oracle and in PostgreSQL; the running times are almost equal.

- Before importing the data, I drop the indexes and the constraints (see the sketch after this list).

- I tried copying a 23G file from the Oracle server to the PostgreSQL server, and it took 12 minutes.
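
A minimal sketch of that drop-and-recreate pattern, with hypothetical index and constraint names (the real table has 13 indexes):

-- Hypothetical names, for illustration only.
ALTER TABLE local_postgresql_table DROP CONSTRAINT local_postgresql_table_pkey;
DROP INDEX idx_col1;

-- ... bulk load the data here ...

-- Recreate afterwards; CREATE INDEX sorts within maintenance_work_mem.
CREATE INDEX idx_col1 ON local_postgresql_table (col1);
ALTER TABLE local_postgresql_table ADD PRIMARY KEY (id);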

Please advise: how can I continue? What can I improve in this operation?
