Re: [HACKERS] Checksums by default?

From Tomas Vondra
Subject Re: [HACKERS] Checksums by default?
Date
Msg-id 00de5d7f-dabb-ae70-c78e-9d2690d94270@2ndquadrant.com
In response to Re: [HACKERS] Checksums by default?  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Checksums by default?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 01/23/2017 01:40 PM, Amit Kapila wrote:
> On Mon, Jan 23, 2017 at 3:56 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> On 01/23/2017 09:57 AM, Amit Kapila wrote:
>>>
>>> On Mon, Jan 23, 2017 at 1:18 PM, Tomas Vondra
>>> <tomas.vondra@2ndquadrant.com> wrote:
>>>>
>>>> On 01/23/2017 08:30 AM, Amit Kapila wrote:
>>>>>
>>>>>
>>>>>
>>>>> I think if we can get data for pgbench read-write workload when data
>>>>> doesn't fit in shared buffers but fit in RAM, that can give us some
>>>>> indication.  We can try by varying the ratio of shared buffers w.r.t
>>>>> data.  This should exercise the checksum code both when buffers are
>>>>> evicted and at next read.  I think it also makes sense to check the
>>>>> WAL data size for each of those runs.
>>>>>
>>>>
>>>> Yes, I'm thinking that's pretty much the worst case for OLTP-like
>>>> workload, because it has to evict buffers from shared buffers,
>>>> generating a continuous stream of writes. Doing that on good storage
>>>> (e.g. PCI-e SSD or possibly tmpfs) will further limit the storage
>>>> overhead, making the time spent computing checksums much more
>>>> significant. Makes sense?
>>>>
>>>
>>> Yeah, I think that can be helpful with respect to WAL, but for data,
>>> if we are considering the case where everything fits in RAM, then
>>> faster storage might or might not help.
>>>
>>
>> I'm not sure I understand. Why wouldn't faster storage help? It's only a
>> matter of generating enough dirty buffers (that get evicted from shared
>> buffers) to saturate the storage.
>>
>
> When the page gets evicted from shared buffer, it is just pushed to
> kernel; the real write to disk won't happen until the kernel feels
> like it. They are written to storage later when a checkpoint occurs.
> So, now if we have fast storage subsystem then it can improve the
> writes from kernel to disk, but not sure how much that can help in
> improving TPS.
>

I don't think that's quite true. If the pages are evicted by bgwriter,
since 9.6 there's a flush every 512kB. This will also flush data written
by backends, of course. But even without the flushing, the OS does not
delay the flush until the very last moment - that'd be a huge I/O
spike. Instead, the OS will write the dirty data to disk after 30
seconds, or after accumulating some predefined amount of dirty data.
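
As a rough illustration, assuming the OS is Linux: the "30 seconds" and the
"predefined amount" correspond to the kernel's vm.dirty_expire_centisecs
and vm.dirty_background_ratio / vm.dirty_background_bytes sysctls. The
sketch below only reads those knobs and prints what they mean; the helper
names (read_writeback_knobs, describe) are made up for this example, and on
other systems the function simply returns nothing.

```python
from pathlib import Path

# Linux writeback sysctls (assumption: a Linux host exposing /proc/sys/vm).
KNOBS = (
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
    "dirty_background_ratio",
    "dirty_background_bytes",
)


def read_writeback_knobs(root="/proc/sys/vm"):
    """Return {knob: int} for every knob that can be read under root."""
    values = {}
    for name in KNOBS:
        try:
            values[name] = int((Path(root) / name).read_text().strip())
        except (OSError, ValueError):
            continue  # not Linux, or the knob is unavailable
    return values


def describe(values):
    """Turn the raw knob values into short human-readable statements."""
    lines = []
    if "dirty_expire_centisecs" in values:
        secs = values["dirty_expire_centisecs"] / 100
        lines.append(f"dirty pages older than {secs:.0f} s are eligible for writeback")
    if "dirty_writeback_centisecs" in values:
        secs = values["dirty_writeback_centisecs"] / 100
        lines.append(f"flusher threads wake up every {secs:.0f} s")
    # A non-zero *_bytes setting takes precedence over the *_ratio setting.
    if values.get("dirty_background_bytes", 0) > 0:
        lines.append(
            f"background writeback starts at {values['dirty_background_bytes']} dirty bytes"
        )
    elif "dirty_background_ratio" in values:
        lines.append(
            f"background writeback starts at {values['dirty_background_ratio']}% dirty memory"
        )
    return lines


if __name__ == "__main__":
    for line in describe(read_writeback_knobs()):
        print(line)
```

Whatever the exact values are on a given machine, the point is that
writeback is triggered by age or by volume, not only by a checkpoint.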

So the system will generally get into a "stable state" where it writes 
about the same amount of data to disk on average.
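
The "stable state" can be seen in a toy model. This is only a sketch under
simplifying assumptions, not the kernel's actual algorithm: the workload
dirties a constant amount of data per second, and the oldest dirty data is
written back once it is older than expire_s seconds or once the total
exceeds background_limit_mb. The function name and both parameters are
invented for the illustration.

```python
from collections import deque


def simulate(dirty_mb_per_s, seconds, expire_s=30, background_limit_mb=1024):
    """Toy writeback model: return the MB written to disk in each 1 s tick."""
    queue = deque()      # (time dirtied, MB), oldest first
    dirty_total = 0.0    # MB currently dirty in the page cache
    written = []
    for now in range(seconds):
        # The workload dirties a fixed amount of data every second.
        queue.append((now, dirty_mb_per_s))
        dirty_total += dirty_mb_per_s
        flushed = 0.0
        # Write back the oldest data while it is too old or there is too much.
        while queue and (
            now - queue[0][0] >= expire_s or dirty_total > background_limit_mb
        ):
            _, mb = queue.popleft()
            dirty_total -= mb
            flushed += mb
        written.append(flushed)
    return written


if __name__ == "__main__":
    per_tick = simulate(dirty_mb_per_s=10, seconds=100)
    print(per_tick[:3], per_tick[-3:])
```

After a warm-up period (here, the first 30 ticks), the amount written per
tick equals the amount dirtied per tick. With a higher dirtying rate the
volume limit kicks in before the age limit, but the result is the same:
steady writes rather than one large spike.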

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


