Re: Batch copying of databases
From: Shridhar Daithankar
Subject: Re: Batch copying of databases
Date:
Msg-id: 3E673DCE.24575.352CED2@localhost
In reply to: Batch copying of databases ("Tim Mohler" <tamohler@nyc.rr.com>)
List: pgsql-performance
On 5 Mar 2003 at 15:26, Tim Mohler wrote:

> Hi all,
>
> I'm new to Postgres, and am not even the DBA for the system. I'm just a
> sysadmin trying to make things run faster. Every month, we copy over a 25
> million row table from the production server to the reporting server. Total
> size is something like 40 gigabytes.
>
> The copy in takes close to 24 hours, and I see the disks being hammered by
> hundreds of small writes every second. The system is mostly waiting on I/O.
> Is there any facility in Postgres to force batching of the I/O transactions
> to something more reasonable than 8K?

Well, 8K has nothing to do with transactions in PostgreSQL. You need to check at least two things.

1. Make sure you are using COPY. By default PostgreSQL runs each INSERT in its own transaction, which is seriously slow for a bulk load. COPY bunches the rows into a single transaction and is quite fast. If you need to preprocess the data, batch something like 1K-10K records per transaction (see the first sketch at the end of this message).

2. PostgreSQL bulk load is not as fast as many of us would like, especially when compared to Oracle. So if you know you are going to bulk load, say with COPY, don't load the data from a single connection. Split the data file into, say, 5-10 parts and start loading all of them simultaneously (see the second sketch below). It does speed things up; at the very least it saturates the disk bandwidth, which a single load often does not.

On a side note, for such a bulk load consider dropping any indexes and foreign key constraints beforehand.

HTH

Bye
 Shridhar

--
Turnaucka's Law: The attention span of a computer is only as long as its electrical cord.
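As a rough illustration of point 1, here is a minimal sketch of the two loading styles: streaming the dump through COPY in one transaction, versus committing in batches of a few thousand rows when the data has to be preprocessed row by row. It assumes Python with the psycopg2 driver; the database name, table, columns and file names are hypothetical placeholders, not anything from the original post.

    # Point 1 sketch: prefer COPY; if rows must be preprocessed, commit in
    # batches of a few thousand rather than one transaction per INSERT.
    # Assumes Python + psycopg2; dbname, table and file names are made up.
    import psycopg2

    conn = psycopg2.connect("dbname=reporting")   # hypothetical DSN
    cur = conn.cursor()

    # Fast path: stream the whole dump through COPY in a single transaction.
    with open("big_table.tsv") as f:
        cur.copy_expert("COPY big_table FROM STDIN", f)
    conn.commit()

    # Alternative path when preprocessing is needed: batch 1K-10K rows per
    # transaction instead of committing every single INSERT.
    BATCH = 5000
    with open("big_table_raw.tsv") as f:
        for i, line in enumerate(f, start=1):
            col_a, col_b = line.rstrip("\n").split("\t")   # stand-in preprocessing
            cur.execute("INSERT INTO big_table (col_a, col_b) VALUES (%s, %s)",
                        (col_a, col_b))
            if i % BATCH == 0:
                conn.commit()
    conn.commit()
    conn.close()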
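And for point 2, a sketch of loading pre-split chunks of the dump over several connections at once, again with Python/psycopg2 and made-up names; the chunks would be produced beforehand with something like the Unix split utility.

    # Point 2 sketch: split the dump first (e.g. "split -l 5000000
    # big_table.tsv part_") and COPY the pieces over parallel connections
    # so the disk bandwidth is saturated.
    # Assumes Python + psycopg2; dbname, table and file names are made up.
    import multiprocessing
    import psycopg2

    def load_part(path):
        # Each worker opens its own connection and COPYies one chunk.
        conn = psycopg2.connect("dbname=reporting")   # hypothetical DSN
        with conn, conn.cursor() as cur:
            with open(path) as f:
                cur.copy_expert("COPY big_table FROM STDIN", f)
        conn.close()

    if __name__ == "__main__":
        parts = ["part_aa", "part_ab", "part_ac", "part_ad", "part_ae"]
        with multiprocessing.Pool(len(parts)) as pool:
            pool.map(load_part, parts)

Any indexes or foreign keys dropped before the load, as the side note suggests, would be recreated only after all the chunks have finished.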