Обсуждение: Blocking every 20 sec while mass copying.

Поиск
Список
Период
Сортировка

Blocking every 20 sec while mass copying.

От
Benjamin Dugast
Дата:
Hello,

I'm working on Postgres 9.3.4 for a project.

We are using Scala, Akka and JDBC to insert data in the database, we have around 25M insert to do which are basically lines from 5000 files. We issue a DELETE according to the file (mandatory) and then a COPY each 1000 lines of that file.

DELETE request : DELETE FROM table WHERE field1 = ? AND field2 = ?;
COPY request : COPY table FROM STDIN WITH CSV

We have indexes on our database that we can't delete to insert our data.

When we insert the data there is some kind of freezes on the databases between requests. Freezes occur about every 20 seconds.

Here is a screenshot from yourkit.

We tried different solutions:
  • 1 table to 5 tables to reduces lock contention
  • fillfactor on indexes
  • commit delay
  • fsync to off (that helped but we can't do this)

We mainly want to know why this is happening because it slowing the insert too much for us.

Re: Blocking every 20 sec while mass copying.

От
Albe Laurenz
Дата:
Benjamin Dugast wrote:
> I'm working on Postgres 9.3.4 for a project.
> 
> 
> We are using Scala, Akka and JDBC to insert data in the database, we have around 25M insert to do
> which are basically lines from 5000 files. We issue a DELETE according to the file (mandatory) and
> then a COPY each 1000 lines of that file.
> 
> DELETE request : DELETE FROM table WHERE field1 = ? AND field2 = ?;
> 
> COPY request : COPY table FROM STDIN WITH CSV
> 
> 
> We have indexes on our database that we can't delete to insert our data.
> 
> 
> When we insert the data there is some kind of freezes on the databases between requests. Freezes occur
> about every 20 seconds.
> 
> 
> Here is a screenshot <http://tof.canardpc.com/view/c42e69c0-d776-4f93-a8a3-8713794a1a07.jpg> from
> yourkit.
> 
> 
> We tried different solutions:
> 
> 
> *    1 table to 5 tables to reduces lock contention
> *    fillfactor on indexes
> *    commit delay
> *    fsync to off (that helped but we can't do this)
> 
> We mainly want to know why this is happening because it slowing the insert too much for us.

This sounds a lot like checkpoint I/O spikes.

Check with the database server log if the freezes coincide with checkpoints.

You can increase checkpoint_segments when you load data to have them occur less often.

If you are on Linux and you have a lot of memory, you might hit spikes because too
much dirty data are cached; check /proc/sys/vm/dirty_ratio and /proc/sys/dirty_background_ratio.

Yours,
Laurenz Albe

Re: Blocking every 20 sec while mass copying.

От
Guillaume Cottenceau
Дата:
Benjamin Dugast <bdugast 'at' excilys.com> writes:

>   • fsync to off (that helped but we can't do this)

not exactly your question, but maybe synchronous_commit=off is a
nice enough intermediary solution for you (it may give better
performances at other places too for only an affordable cost)

--
Guillaume Cottenceau


Re: Blocking every 20 sec while mass copying.

От
Jeff Janes
Дата:
On Fri, Jul 18, 2014 at 3:52 AM, Benjamin Dugast <bdugast@excilys.com> wrote:
Hello,

I'm working on Postgres 9.3.4 for a project.

We are using Scala, Akka and JDBC to insert data in the database, we have around 25M insert to do which are basically lines from 5000 files. We issue a DELETE according to the file (mandatory) and then a COPY each 1000 lines of that file.

DELETE request : DELETE FROM table WHERE field1 = ? AND field2 = ?;
COPY request : COPY table FROM STDIN WITH CSV

We have indexes on our database that we can't delete to insert our data.

Inserting data into large indexed tables will usually dirty a prodigious amount of data in a random manner, to maintain those indexes.  It will take a very long time to clear that data down to spinning disks, because the writes cannot be effectively combined into long sequences (sometimes they theoretically could be combined, but the kernel just fails to do a good job of doing so).  

Buy a good IO system, RAID with lots of disks, or maybe SSD, for your indexes.

If the freezes occur mostly at checkpoint sync time, then you can try making the checkpoint interval much longer. Checkpoints will still suck when they do happen, but that happens less often.   Depending on the details of your system of your data and your loading processes, they might freeze for N times longer if you make them N times less frequent, such that the total amount of freezing time is conserved.  Or they might freeze for just the same period, so that total freezing time is reduced by a factor of N.  It is hard to know without trying it.  You could also try lowering the /proc/sys/vm/dirty_background_bytes setting, so that the kernel starts writing things out *before* the end-of-checkpoint sync calls start landing.

If the freezes aren't correlated with checkpoints, you could try increasing the shared_buffers to take up most of your RAM. This is unconventional advice, but I've seen it do wonders for such loads when the indexes that need maintenance are about the same size as RAM.

If you can partition your tables so that only one partition is being actively loaded at a time, that could be very effective if the indexes for each partition would then be small enough to fit in memory.

Cheers,

Jeff

Re: Blocking every 20 sec while mass copying.

От
Albe Laurenz
Дата:
Please keep the list on CC: in your responses.

Benjamin Dugast wrote:
> 2014-07-18 13:11 GMT+02:00 Albe Laurenz <laurenz.albe@wien.gv.at>:
>> This sounds a lot like checkpoint I/O spikes.
>>
>> Check with the database server log if the freezes coincide with checkpoints.
>>
>> You can increase checkpoint_segments when you load data to have them occur less often.
>>
>> If you are on Linux and you have a lot of memory, you might hit spikes because too
>> much dirty data are cached; check /proc/sys/vm/dirty_ratio and /proc/sys/dirty_background_ratio.

> The checkpoint_segments is set to 64 already
> 
> the dirty_ration was set by default to 10 i put it down to 5
> the dirty_background_ratio was set to 5 and I changed it to 2
> 
> There is less freezes but the insert is so slower than before.

That seems to indicate that my suspicion was right.

I would say that your I/O system is saturated.
Have you checked with "iostat -mNx 1"?

If you really cannot drop the indexes during loading, there's probably not much more
you can do to speed up the load.
You can try to increase checkpoint_segments beyond 64 and see if that buys you anything.

Tuning the file system write cache will not reduce the amount of I/O necessary, but it
should reduce the spikes (which is what I thought was your problem).

Yours,
Laurenz Albe

Re: Blocking every 20 sec while mass copying.

От
Benjamin Dugast
Дата:
Finally we solved our problem by using a kind of trick

We have 2 kind of table : online table for read and temp table to mass insert our data

We work on the temp tables (5 different tables) to insert every data without any index that goes really fast compared to the previous method
then we create index on these tables simultanously,
then we drop online tables(also 5 tables) and rename the temp tables to online (takes less than 1 sec)

This is the faster way to insert our data that we found.
On our config it goes pretty fast, we reduce our execution time to 50% and there is no more need of many maintenance on the database.

Thanks for all answer that you give us.



2014-07-21 10:02 GMT+02:00 Albe Laurenz <laurenz.albe@wien.gv.at>:
Please keep the list on CC: in your responses.

Benjamin Dugast wrote:
> 2014-07-18 13:11 GMT+02:00 Albe Laurenz <laurenz.albe@wien.gv.at>:
>> This sounds a lot like checkpoint I/O spikes.
>>
>> Check with the database server log if the freezes coincide with checkpoints.
>>
>> You can increase checkpoint_segments when you load data to have them occur less often.
>>
>> If you are on Linux and you have a lot of memory, you might hit spikes because too
>> much dirty data are cached; check /proc/sys/vm/dirty_ratio and /proc/sys/dirty_background_ratio.

> The checkpoint_segments is set to 64 already
>
> the dirty_ration was set by default to 10 i put it down to 5
> the dirty_background_ratio was set to 5 and I changed it to 2
>
> There is less freezes but the insert is so slower than before.

That seems to indicate that my suspicion was right.

I would say that your I/O system is saturated.
Have you checked with "iostat -mNx 1"?

If you really cannot drop the indexes during loading, there's probably not much more
you can do to speed up the load.
You can try to increase checkpoint_segments beyond 64 and see if that buys you anything.

Tuning the file system write cache will not reduce the amount of I/O necessary, but it
should reduce the spikes (which is what I thought was your problem).

Yours,
Laurenz Albe