Re: [HACKERS] Checksums by default?

From: Amit Kapila
Subject: Re: [HACKERS] Checksums by default?
Msg-id: CAA4eK1LF9U5x0qvG_cEWcxKqV1LwN8oVdDkjDouCWVV-TcWkkA@mail.gmail.com
In response to: Re: [HACKERS] Checksums by default?  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On Mon, Jan 23, 2017 at 6:57 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 01/23/2017 01:40 PM, Amit Kapila wrote:
>>
>> On Mon, Jan 23, 2017 at 3:56 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>> On 01/23/2017 09:57 AM, Amit Kapila wrote:
>>>>
>>>>
>>>> On Mon, Jan 23, 2017 at 1:18 PM, Tomas Vondra
>>>> <tomas.vondra@2ndquadrant.com> wrote:
>>>>>
>>>>>
>>>>> On 01/23/2017 08:30 AM, Amit Kapila wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I think if we can get data for pgbench read-write workload when data
>>>>>> doesn't fit in shared buffers but fit in RAM, that can give us some
>>>>>> indication.  We can try by varying the ratio of shared buffers w.r.t
>>>>>> data.  This should exercise the checksum code both when buffers are
>>>>>> evicted and at next read.  I think it also makes sense to check the
>>>>>> WAL data size for each of those runs.
>>>>>>
>>>>>
>>>>> Yes, I'm thinking that's pretty much the worst case for an OLTP-like
>>>>> workload, because it has to evict buffers from shared buffers,
>>>>> generating a continuous stream of writes. Doing that on good storage
>>>>> (e.g. PCI-e SSD or possibly tmpfs) will further limit the storage
>>>>> overhead, making the time spent computing checksums much more
>>>>> significant. Makes sense?
>>>>>
>>>>
>>>> Yeah, I think that can be helpful with respect to WAL, but for data,
>>>> if we are considering the case where everything fits in RAM, then
>>>> faster storage might or might not help.
>>>>
>>>
>>> I'm not sure I understand. Why wouldn't faster storage help? It's only a
>>> matter of generating enough dirty buffers (that get evicted from shared
>>> buffers) to saturate the storage.
>>>
>>
>> When the page gets evicted from shared buffers, it is just pushed to
>> the kernel; the real write to disk won't happen until the kernel feels
>> like it. They are written to storage later when a checkpoint occurs.
>> So, if we have a fast storage subsystem, it can speed up the writes
>> from kernel to disk, but I am not sure how much that can help in
>> improving TPS.
>>
>
> I don't think that's quite true. If the pages are evicted by bgwriter, since
> 9.6 there's a flush every 512kB.
>

Right, but backend_flush_after is zero by default.
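
For reference, the flush settings being discussed here default as follows
in 9.6 (a postgresql.conf fragment showing the documented Linux defaults):

```
bgwriter_flush_after = 512kB    # bgwriter writes are flushed in 512kB batches
backend_flush_after = 0         # flushing of backend writes is disabled
checkpoint_flush_after = 256kB  # checkpointer flush batching
```

So only bgwriter and checkpointer writes are force-flushed by default;
writes done directly by backends are left to the kernel's writeback.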

> This will also flush data written by
> backends, of course. But even without the flushing, the OS does not wait
> with the flush until the very last moment - that'd be a huge I/O spike.
> Instead, the OS will write the dirty data to disk after 30 seconds, or after
> accumulating some predefined amount of dirty data.
>

This is the reason I said it might or might not help.  I don't think
there is much point in discussing this further: if you have access to a
fast storage system, go ahead and run the tests on it; if not, we can
still try without it.
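
The benchmark sketched upthread could be run along these lines (a sketch,
not a definitive procedure; the scale, shared_buffers values, and run
length are assumptions, using pgbench's rough 15MB-per-scale-unit sizing):

```shell
# Pick a scale so the dataset (~scale * 15MB) fits in RAM but not in
# shared_buffers (assumption: a 16GB-RAM box).
SCALE=500
DATA_MB=$((SCALE * 15))
echo "approx data size: ${DATA_MB} MB"

# Initialize two clusters, one with checksums and one without:
#   initdb -k -D data-checksums    # -k / --data-checksums enables checksums
#   initdb -D data-plain
#
# For each cluster, run the read-write workload while varying
# shared_buffers (e.g. 1GB, 2GB, 4GB), so both eviction and the next
# read exercise the checksum code:
#   pgbench -i -s $SCALE
#   pgbench -c 16 -j 8 -T 600 -M prepared
#
# Record WAL volume for each run by sampling before and after (9.6 naming):
#   psql -c "SELECT pg_current_xlog_location()"
```

Comparing TPS and the WAL delta between the checksummed and plain
clusters across the shared_buffers ratios should show where the
checksum overhead becomes visible.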

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
