Re: Enable data checksums by default

From Burd, Greg
Subject Re: Enable data checksums by default
Msg-id BBFB3992-6985-4523-8530-4F2BCC1DA12C@burd.me
In reply to Re: Enable data checksums by default  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers

> On Jul 31, 2025, at 6:10 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Thu, 2025-07-31 at 17:21 +0200, Tomas Vondra wrote:
>> On 7/31/25 15:39, Greg Burd wrote:
>>> I recall a conversation at the last PGConf.dev (2025) with a
>>> representative
>>> from Intel and Jeff Davis (CC’ed) that had to do with checksums and
>>> a vast
>>> performance difference between Intel and AMD, the latter winning by
>>> a mile.
>>
>> I don't know the Intel vs. AMD situation exactly, but e.g. [1] does
>> not
>> suggest AMD wins by a mile. In fact, it suggests Intel does much
>> better
>> in this particular benchmark (with AVX-512 improvements). Of course,
>> this is a fairly recent *kernel* improvement, maybe it wouldn't work
>> for
>> our data checksums that well.
>>
>> However, I don't think the cost of the checksum calculation itself is
>> the main concern. It's probably negligible compared to all the other
>> costs, triggered by checksums - having to WAL-log hint bits, doing
>> more
>> expensive checks (that's what the btree regression was about), etc.
>
> The issue Greg and I discussed, explained to me earlier by Andres, was
> a memory bandwidth issue.
>
> IIRC (Andres please correct me): The new IO infrastructure enables us
> to bypass a memory copy (from userspace to kernel space) when writing
> out a page. Unfortunately, checksums require reading the data to
> calculate the checksum, which effectively defeats that optimization.
>
> Those memory copies mostly happen in the bgwriter, where the page isn't
> generally in the cache, which means that memory bandwidth can become
> the bottleneck. Intel seems to have poor per-core memory bandwidth
> compared with AMD:
>
> https://sites.utexas.edu/jdm4372/2023/04/25/the-evolution-of-single-core-bandwidth-in-multicore-processors/
>
> so it's more likely to become the bottleneck on Intel.
>
> That led to an interesting discussion about calculating the checksum
> on a page in the backend eagerly when it dirties a page, while it's
> still in cache. As you point out, that's quite cheap.
>
> Regards,
> Jeff Davis

Thanks Jeff for filling in the gaps in my memory. :)

-greg
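
For anyone following along, the eager-versus-lazy idea Jeff describes can be sketched roughly as below. This is a toy illustration, not PostgreSQL code: `ToyPage`, `toy_checksum`, and the `toy_dirty_page_*` helpers are hypothetical names, and a plain FNV-1a hash stands in for the real page checksum. The point is only that the checksum has to read every byte of the page, so what matters is whether that read happens while the page is still warm in the backend's cache or later in the writer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192      /* PostgreSQL's default block size */

/* Hypothetical page; the checksum slot lives outside the hashed bytes. */
typedef struct ToyPage {
    uint8_t  data[TOY_PAGE_SIZE];
    uint32_t checksum;
    int      checksum_valid;
} ToyPage;

/* Plain 32-bit FNV-1a (a stand-in): every byte of the page is read. */
static uint32_t toy_checksum(const uint8_t *buf, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;
    }
    return h;
}

/* Eager: checksum right after the modification, while the page is cache-warm. */
static void toy_dirty_page_eager(ToyPage *page, size_t off,
                                 const void *src, size_t len)
{
    assert(off + len <= TOY_PAGE_SIZE);
    memcpy(page->data + off, src, len);
    page->checksum = toy_checksum(page->data, TOY_PAGE_SIZE);
    page->checksum_valid = 1;
}

/* Lazy: only modify the page; the checksum is left for the writer. */
static void toy_dirty_page_lazy(ToyPage *page, size_t off,
                                const void *src, size_t len)
{
    assert(off + len <= TOY_PAGE_SIZE);
    memcpy(page->data + off, src, len);
    page->checksum_valid = 0;
}

/* Writer side: re-reads the whole page only if no valid checksum exists. */
static uint32_t toy_checksum_for_write(ToyPage *page)
{
    if (!page->checksum_valid) {
        page->checksum = toy_checksum(page->data, TOY_PAGE_SIZE);
        page->checksum_valid = 1;
    }
    return page->checksum;
}

/* Both paths yield the same checksum; only the timing of the read differs. */
int toy_eager_matches_lazy(void)
{
    static ToyPage a, b;        /* static storage: zero-initialized */
    const char tuple[] = "hello";

    toy_dirty_page_eager(&a, 100, tuple, sizeof tuple);
    toy_dirty_page_lazy(&b, 100, tuple, sizeof tuple);
    return toy_checksum_for_write(&a) == toy_checksum_for_write(&b);
}
```

In the eager path the writer never touches the page contents for checksumming, which is what would keep the copy-avoiding write path useful. The sketch ignores the obvious costs, such as recomputing the checksum on every modification of a hot page.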



