Re: Enable data checksums by default
From | Andres Freund |
---|---|
Subject | Re: Enable data checksums by default |
Date | |
Msg-id | 23278C72-65BD-4EAE-85D4-15CC51E77DC9@anarazel.de |
In reply to | Re: Enable data checksums by default (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
On March 30, 2019 3:25:43 PM EDT, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>On Fri, Mar 29, 2019 at 08:35:26PM +0100, Christoph Berg wrote:
>>Re: Bernd Helmle 2019-03-29 <3586bb9345a59bfc8d13a50a7c729be1ee6759fd.camel@oopsware.de>
>>> On Friday, 29.03.2019 at 23:10 +0900, Michael Paquier wrote:
>>> >
>>> > I can't really believe that many people set up shared_buffers at
>>> > 128kB, which would cause such a large number of page evictions,
>>> > but I can believe that many users have shared_buffers set to its
>>> > default value and that we are going to get complaints about
>>> > "performance drop after upgrade to v12" if we switch data
>>> > checksums to on by default.
>>>
>>> Yeah, I think Christoph's benchmark is based on this thinking. I
>>> assume this very unrealistic scenario is meant to emulate the worst
>>> case (many buffer reads, high checksum calculation load).
>>
>>It's not unrealistic to have large seqscans that are all buffer
>>misses; the table just has to be big enough. The idea in my benchmark
>>was that if I make shared buffers really small, and the table still
>>fits into RAM, I should be seeing only buffer misses, but without any
>>delay for actually reading from disk.
>>
>>Christoph
>
>FWIW, I think it's a mistake to focus solely on CPU utilization, which
>all the benchmarks performed on this thread do, because they look at
>tps of in-memory read-only workloads. Checksums have other costs too,
>not just the additional CPU time. Most importantly, they require
>wal_log_hints to be set (which people may or may not want anyway).
>
>I've done a simple benchmark that does read-only (-S) and read-write
>(-N) pgbench runs with different scales, but also measures the
>duration of the pgbench init and the amount of WAL produced during
>the tests.
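The run shape Tomas describes (timed init, then -S and -N runs) could look roughly like this. The scale, duration, and database name are assumptions, not the exact script used in the thread, and the helper only echoes each command so the sketch is safe to execute without a PostgreSQL server:

```shell
#!/bin/sh
# Rough sketch of the benchmark shape described above, NOT the exact
# script used in the thread. SCALE, DURATION, and the "bench" database
# name are assumptions. run() echoes each command instead of executing
# it; remove the echo to actually drive pgbench.
run() { echo "+ $*"; }

SCALE=200
DURATION=60

# Timed initialization (data load + CREATE INDEX) - the phase where
# checksums/wal_log_hints showed the biggest slowdown in the results.
run time pgbench -i -s "$SCALE" bench

# Read-only run: -S is pgbench's built-in SELECT-only script.
run pgbench -S -T "$DURATION" bench

# Read-write run: -N skips the updates to pgbench_tellers and
# pgbench_branches to reduce lock contention.
run pgbench -N -T "$DURATION" bench
```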
>
>On a small machine (i5, 8GB RAM, SSD RAID) the results are these:
>
>     scale  config    |  init     tps     wal
>  ====================|==================================
>  ro    10  no-hints  |     2  117038     130
>            hints     |     2  116378     146
>            checksums |     2  115619     147
>  --------------------|----------------------------------
>       200  no-hints  |    32   88340    2407
>            hints     |    37   86154    2628
>            checksums |    36   83336    2624
>  --------------------|----------------------------------
>      2000  no-hints  |   365   38680    1967
>            hints     |   423   38670    2123
>            checksums |   504   37510    2046
>  --------------------|----------------------------------
>  rw    10  no-hints  |     2   19691     437
>            hints     |     2   19712     437
>            checksums |     2   19654     437
>  --------------------|----------------------------------
>       200  no-hints  |    32   15839    2745
>            hints     |    37   15735    2783
>            checksums |    36   15646    2775
>  --------------------|----------------------------------
>      2000  no-hints  |   365    5371    3721
>            hints     |   423    5270    3671
>            checksums |   504    5094    3574
>
>The no-hints config is the default (wal_log_hints=off,
>data_checksums=off), hints sets wal_log_hints=on, and checksums
>enables data checksums. All the configs were somewhat tuned (1GB
>shared buffers, max_wal_size high enough not to hit checkpoints very
>often, etc.).
>
>I've also done the tests on a larger machine (2x E5-2620v4, 32GB of
>RAM, NVMe SSD), and the general pattern is about the same - while the
>tps and amount of WAL (not covering the init) do not change, the time
>for initialization increases significantly (by 20-40%).
>
>This effect is even clearer when using slower storage (SATA-based
>RAID).
>The results then look like this:
>
>     scale  config    |  init     tps     wal
>  ====================|==================================
>  ro   100  no-hints  |    49  229459     122
>            hints     |   101  167983     190
>            checksums |   103  156307     190
>  --------------------|----------------------------------
>      1000  no-hints  |   580  152167     109
>            hints     |  1047  122814     142
>            checksums |  1080  118586     141
>  --------------------|----------------------------------
>      6000  no-hints  |  4035     508       1
>            hints     | 11193     502       1
>            checksums | 11376     506       1
>  --------------------|----------------------------------
>  rw   100  no-hints  |    49     279     192
>            hints     |   101     275     190
>            checksums |   103     275     190
>  --------------------|----------------------------------
>      1000  no-hints  |   580     237     210
>            hints     |  1047     225     201
>            checksums |  1080     224     200
>  --------------------|----------------------------------
>      6000  no-hints  |  4035     135     123
>            hints     | 11193     133     122
>            checksums | 11376     132     121
>
>and when expressed as relative to no-hints:
>
>     scale  config    |  init    tps    wal
>  ====================|===============================
>  ro   100  hints     |  206%    73%   155%
>            checksums |  210%    68%   155%
>  --------------------|-------------------------------
>      1000  hints     |  181%    81%   131%
>            checksums |  186%    78%   129%
>  --------------------|-------------------------------
>      6000  hints     |  277%    99%   100%
>            checksums |  282%   100%   104%
>  --------------------|-------------------------------
>  rw   100  hints     |  206%    99%    99%
>            checksums |  210%    99%    99%
>  --------------------|-------------------------------
>      1000  hints     |  181%    95%    96%
>            checksums |  186%    95%    95%
>  --------------------|-------------------------------
>      6000  hints     |  277%    99%    99%
>            checksums |  282%    98%    98%
>
>I have not investigated the exact reasons, but my hypothesis is that
>it's about the amount of WAL generated during the initial CREATE INDEX
>(because it probably ends up setting the hint bits), which puts
>additional pressure on the storage.
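The relative table is just each configuration's value divided by the no-hints baseline. A quick sanity check, using the absolute numbers from the slow-storage read-only results at scale 100:

```python
# Recompute two of the relative figures (no-hints = 100%) from the
# absolute numbers quoted above: ro, scale 100, slow-storage machine.
no_hints = {"init": 49, "tps": 229459}
checksums = {"init": 103, "tps": 156307}

rel = {k: round(100 * checksums[k] / no_hints[k]) for k in no_hints}
print(rel)  # {'init': 210, 'tps': 68} - matches the 210% / 68% row
```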
>
>Unfortunately, this additional cost is unlikely to go away :-(
>
>Now, maybe we want to enable checksums by default anyway, but we
>should not pretend the only cost related to checksums is CPU usage.

Thanks for running these, very helpful.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
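For reference, the three configurations compared in the thread map to these knobs (a sketch; the paths are placeholders, and the offline conversion tool is only available from PostgreSQL 12 on):

```shell
#!/bin/sh
# How the three configurations above are obtained (sketch; paths are
# placeholders, shown as comments since they need a real cluster):
#
#   initdb --data-checksums -D "$PGDATA"   # the "checksums" config
#   pg_checksums --enable -D "$PGDATA"     # offline conversion, PG 12+
#
# The "hints" config only needs the GUC below; enabling data checksums
# implies the same hint-bit WAL logging, which is where the extra WAL
# in the tables above comes from.
conf=$(mktemp)
echo "wal_log_hints = on" >> "$conf"
grep "wal_log_hints" "$conf"
```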