Re: Partitioned checkpointing
| From | Fabien COELHO |
|---|---|
| Subject | Re: Partitioned checkpointing |
| Date | |
| Msg-id | alpine.DEB.2.10.1509160850240.14201@sto |
| In reply to | Re: Partitioned checkpointing (Takashi Horikawa <t-horikawa@aj.jp.nec.com>) |
| List | pgsql-hackers |
Hello Takashi-san,
> I've noticed that the behavior in 'checkpoint_partitions = 1' is not the 
> same as that of original 9.5alpha2. Attached 
> 'partitioned-checkpointing-v3.patch' fixed the bug, thus please use it.
I've done two sets of runs on an old box (16 GB, 8 cores, RAID1 HDD)
with "pgbench -M prepared -N -j 2 -c 4 ..." and analysed the per-second traces
(-P 1) for 4 versions: sort+flush patch fully on, sort+flush patch fully
off (should be equivalent to head), partition patch v3 with 1 partition
(should also be equivalent to head), and partition patch v3 with 16
partitions.
I ran two configurations :
small:
  shared_buffers = 2GB
  checkpoint_timeout = 300s
  checkpoint_completion_target = 0.8
  pgbench's scale = 120, time = 4000

medium:
  shared_buffers = 4GB
  max_wal_size = 5GB
  checkpoint_timeout = 30min
  checkpoint_completion_target = 0.8
  pgbench's scale = 300, time = 7500
 
* full speed run performance
  average tps +- standard deviation (percent of under 10 tps seconds)
                         small               medium
  1. flush+sort    : 751 +- 415 ( 0.2)   984 +- 500 ( 0.3)
  2. no flush/sort : 188 +- 508 (83.6)   252 +- 551 (77.0)
  3. 1 partition   : 183 +- 518 (85.6)   232 +- 535 (78.3)
  4. 16 partitions : 179 +- 462 (81.1)   196 +- 492 (80.9)
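For reference, figures of this kind can be derived from the per-second trace roughly as follows. This is only a sketch: the regular expression assumes progress lines of the form "progress: 1.0 s, 800.0 tps, ...", the `summarize` helper and the sample lines are made up for illustration, and the use of a population standard deviation is an assumption, not necessarily what was used for the table above.

```python
import re
import statistics

# Assumed shape of a pgbench "-P 1" progress line; other fields may follow.
PROGRESS_RE = re.compile(r"progress: [\d.]+ s, ([\d.]+) tps")

def summarize(lines, low_tps=10.0):
    """Return (mean tps, stddev, percent of seconds under low_tps)."""
    tps = [float(m.group(1)) for m in map(PROGRESS_RE.search, lines) if m]
    if not tps:
        raise ValueError("no progress lines found")
    mean = statistics.fmean(tps)
    stddev = statistics.pstdev(tps)  # population stddev over the whole run
    pct_low = 100.0 * sum(t < low_tps for t in tps) / len(tps)
    return mean, stddev, pct_low

# Made-up sample trace: two good seconds and two stalled seconds.
sample = [
    "progress: 1.0 s, 800.0 tps, lat 5.0 ms stddev 1.0",
    "progress: 2.0 s, 0.0 tps, lat 0.0 ms stddev 0.0",
    "progress: 3.0 s, 400.0 tps, lat 10.0 ms stddev 2.0",
    "progress: 4.0 s, 0.0 tps, lat 0.0 ms stddev 0.0",
]
mean, stddev, pct_low = summarize(sample)
print(f"{mean:.0f} +- {stddev:.0f} ({pct_low:.1f})")
```

A stddev larger than the average, as in rows 2 to 4, is what a trace alternating between bursts and near-zero seconds produces.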
 
Although 2 & 3 should be equivalent, performance seems to be slightly lower
with 1 partition, but it is pretty close and may not be significant.
The 16 partitions case seems to show significantly lower tps performance,
especially for the medium case. The stddev is a little bit better
for the small case, as suggested by the lower off-line figure, but it is
relatively higher for the medium case (stddev = 2.5 * average).
None of this comes close to the results with flush & sort activated.
* throttled performance (-R 100/200 -L 100)
  percent of late transactions - above 100 ms or not even started as the
  system is much too behind schedule.
                      small-100 small-200  medium-100
  1. flush+sort    :     1.0       1.9        1.9
  2. no flush/sort :    31.5      49.8       27.1
  3. 1 partition   :    32.3      49.0       27.0
  4. 16 partitions :    32.9      48.0       31.5
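As a rough illustration of the metric, the percentage counts both transactions that ran but exceeded the limit and transactions that were never started. The `percent_late` helper and its inputs below are hypothetical; it only assumes that per-transaction latencies and a count of skipped transactions are available.

```python
def percent_late(latencies_ms, skipped, limit_ms=100.0):
    """Percent of scheduled transactions that were late or never started."""
    total = len(latencies_ms) + skipped
    if total == 0:
        raise ValueError("no transactions scheduled")
    # A transaction is late when its latency exceeds the limit.
    late = sum(lat > limit_ms for lat in latencies_ms)
    return 100.0 * (late + skipped) / total

# Made-up data: 8 executed transactions, 2 skipped as too far behind schedule.
pct = percent_late([12.0, 250.0, 40.0, 99.0, 101.0, 8.0, 30.0, 15.0], skipped=2)
print(f"{pct:.1f}")
```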
 
2 & 3 seem pretty equivalent, as expected. The 16 partitions seem to 
slightly degrade availability on average. Yet again, no comparison with 
flush & sort activated.
From these runs, I would advise against applying the checkpoint
partitioning patch: there is no consistent benefit on the basic hardware
I'm using for this test. I think that it makes sense, because fsyncing
random I/O several times instead of once has little impact.
Now, once I/Os are not random, that is with some kind of combined patch,
this is another question. I would rather go with Andres' suggestion to
fsync once per file, when writing to a file is completed, because
partitioning as such would reduce the effectiveness of sorting buffers.
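A toy model of that last point, not taken from either patch: dirty buffers are (file id, block number) pairs, and the assignment of buffer i to partition i % partitions is purely hypothetical. With one global sort each file is written in a single pass and fsynced once; when each partition is sorted on its own, every file tends to be revisited, and fsynced, in each partition.

```python
import random

def fsyncs_needed(dirty, partitions):
    """Count fsync calls if each partition is sorted and written on its own,
    with one fsync per file once that partition is done writing to it."""
    # Hypothetical partitioning: buffer i belongs to partition i % partitions.
    parts = [dirty[i::partitions] for i in range(partitions)]
    total = 0
    for part in parts:
        ordered = sorted(part)  # write order inside this partition
        files_touched = {file_id for file_id, _block in ordered}
        total += len(files_touched)
    return total

random.seed(0)
# Made-up workload: 4000 dirty buffers spread over 8 files.
dirty = [(random.randrange(8), random.randrange(100_000)) for _ in range(4000)]
one = fsyncs_needed(dirty, 1)
sixteen = fsyncs_needed(dirty, 16)
print(one, sixteen)
```

In this model the number of fsyncs grows with the number of partitions, and the sorted stream for each file is cut into as many separate passes.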
I think that it would be interesting if you could test the sort/flush 
patch on the same high-end system that you used for testing your partition 
patch.
-- 
Fabien.
		