Re: Block-level CRC checks

From Richard Huxton
Subject Re: Block-level CRC checks
Msg-id 4B159CC8.9030201@archonet.com
In response to Re: Block-level CRC checks  (Greg Stark <gsstark@mit.edu>)
Responses Re: Block-level CRC checks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Greg Stark wrote:
> On Tue, Dec 1, 2009 at 9:57 PM, Richard Huxton <dev@archonet.com> wrote:
>> Why are we writing out the hint bits to disk anyway? Is it really so
>> slow to calculate them on read + cache them that it's worth all this
>> trouble? Are they not also to blame for the "write my import data twice"
>> feature?
> 
> It would be interesting to experiment with different strategies. But
> the results would depend a lot on workloads and I doubt one strategy
> is best for everyone.
> 
> It has often been suggested that we could set the hint bits but not
> dirty the page, so they would never be written out unless some other
> update hit the page. In most use cases that would probably result in
> the right thing happening where we avoid half the writes but still
> stop doing transaction status lookups relatively promptly. The scary
> thing is that there might be use cases such as static data loaded
> where the hint bits never get set and every scan of the page has to
> recheck those statuses until the tuples are frozen.

And how scary is that? Assuming we cache the hints...
1. With the page itself, so same lifespan
2. Separately, perhaps with a different (longer) lifespan.

Caching them separately would then let you trade complexity for
compactness - "all of block B is deleted", "all of table T is visible".
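
To make option 2 a bit more concrete, here is a minimal sketch of a hint
cache kept apart from the pages. Everything in it (HintCache, the Hint
values, the table/block/offset keys) is hypothetical and is not how
PostgreSQL stores anything; it only shows how one coarse entry such as
"all of block B is deleted" can stand in for many per-tuple hints.
Invalidation when tuples change is deliberately left out.

```python
from enum import Enum


class Hint(Enum):
    VISIBLE = "visible"   # inserting transaction is known to have committed
    DELETED = "deleted"   # deleting transaction is known to have committed


class HintCache:
    """Hypothetical side cache of hints, kept separately from the pages."""

    def __init__(self):
        self.tables = {}   # table -> Hint
        self.blocks = {}   # (table, block) -> Hint
        self.tuples = {}   # (table, block, offset) -> Hint

    def set_tuple(self, table, block, offset, hint):
        self.tuples[(table, block, offset)] = hint

    def set_block(self, table, block, hint):
        # One block-level entry replaces every per-tuple entry for that block.
        self.blocks[(table, block)] = hint
        self.tuples = {k: v for k, v in self.tuples.items()
                       if k[:2] != (table, block)}

    def set_table(self, table, hint):
        # One table-level entry replaces all finer-grained entries for it.
        self.tables[table] = hint
        self.blocks = {k: v for k, v in self.blocks.items() if k[0] != table}
        self.tuples = {k: v for k, v in self.tuples.items() if k[0] != table}

    def lookup(self, table, block, offset):
        """Return the cached hint, or None if the caller must check the
        transaction status itself. Most specific entry wins."""
        hint = self.tuples.get((table, block, offset))
        if hint is None:
            hint = self.blocks.get((table, block))
        if hint is None:
            hint = self.tables.get(table)
        return hint
```

The compactness comes from set_block and set_table dropping the finer
entries they cover; the complexity is everything needed to keep those
coarse entries correct, which this sketch does not attempt.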

So what is the cost of calculating the hint-bits for a whole block of
tuples in one go vs reading that block from actual spinning disk?
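
For the calculation half of that comparison, one way to do a whole block
in one go is to look up each distinct transaction ID only once. The
sketch below assumes tuples reduced to their inserting transaction ID and
a hypothetical lookup_status callable standing in for the transaction
status lookup; neither is a real PostgreSQL interface. It says nothing
about the actual cost relative to a disk read, which would have to be
measured.

```python
def hint_block(tuple_xids, lookup_status):
    """Return one status per tuple, calling lookup_status once per distinct xid.

    tuple_xids    -- inserting transaction ID of each tuple in the block
    lookup_status -- hypothetical stand-in for the transaction status lookup
    """
    status_by_xid = {}
    for xid in tuple_xids:
        if xid not in status_by_xid:
            status_by_xid[xid] = lookup_status(xid)
    return [status_by_xid[xid] for xid in tuple_xids]
```

For a bulk-loaded block where every tuple shares one inserting
transaction, this is a single status lookup for the whole block.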

> There does need to be something like the hint bits which does
> eventually have to be set because we can't keep transaction
> information around forever. Even if you keep the transaction
> information all the way back to the last freeze date (up to about 1GB
> and change I think) then the data has to be written twice, the second
> time is to freeze the transactions. In the worst case then reading a
> page requires a random page access (or two) from anywhere in that 1GB+
> file for each tuple on the page (whether visible to us or not).

While on that topic - I'm assuming freezing requires substantially more
effort than updating hint bits?

--
  Richard Huxton
  Archonet Ltd

