Re: mdnblocks() sabotages error checking in _mdfd_getseg()

From: Simon Riggs
Subject: Re: mdnblocks() sabotages error checking in _mdfd_getseg()
Date:
Msg-id CANP8+jLyMafNWJEOsq0g34ZPr-=Acg914GL9PEH3ukGxoNjeGQ@mail.gmail.com
In reply to: Re: mdnblocks() sabotages error checking in _mdfd_getseg()  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: mdnblocks() sabotages error checking in _mdfd_getseg()  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 10 December 2015 at 16:47, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 10, 2015 at 11:36 AM, Andres Freund <andres@anarazel.de> wrote:
> >> In fact, having no way to get the relation length other than scanning
> >> 1000 files doesn't seem like an especially good choice even if we used
> >> a better data structure.  Putting a header page in the heap would make
> >> getting the length of a relation O(1) instead of O(segments), and for
> >> a bonus, we'd be able to reliably detect it if a relation file
> >> disappeared out from under us.  That's a difficult project and
> >> definitely not my top priority, but this code is old and crufty all
> >> the same.)
> >
> > The md layer doesn't really know whether it's dealing with a heap, or
> > with an index, or ... So handling this via a metapage doesn't seem
> > particularly straightforward.
>
> It's not straightforward, but I don't think that's the reason.  What
> we could do is look at the call sites that use
> RelationGetNumberOfBlocks() and change some of them to get the
> information some other way instead.  I believe get_relation_info() and
> initscan() are the primary culprits, accounting for some enormous
> percentage of the system calls we do on a read-only pgbench workload.
> Those functions certainly know enough to consult a metapage if we had
> such a thing.
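
To make the O(segments) versus O(1) contrast in the quoted text concrete, here is a minimal sketch in plain C. It is not PostgreSQL code: `FakeRelation`, `FakeMetapage`, `probe_segment()`, `nblocks_by_scan()` and `nblocks_from_metapage()` are hypothetical names, and the per-segment probe only simulates the system call that a real size check would issue. The segment size assumes the default build of 8 kB pages and 1 GB segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blocks per segment: 1 GB / 8 kB, assuming default build settings. */
#define SEG_BLOCKS 131072u

/* Hypothetical stand-in for a relation stored as several segment files. */
typedef struct {
    const uint32_t *seg_nblocks; /* block count of each segment file */
    size_t          nsegs;       /* number of segment files present */
    unsigned long   probes;      /* simulated system calls issued so far */
} FakeRelation;

/* Simulates asking the OS for one segment's size; absent segments report 0. */
static uint32_t probe_segment(FakeRelation *rel, size_t segno)
{
    rel->probes++;
    return segno < rel->nsegs ? rel->seg_nblocks[segno] : 0;
}

/* O(segments): walk the segments until one is not completely full. */
static uint64_t nblocks_by_scan(FakeRelation *rel)
{
    size_t segno = 0;

    for (;;) {
        uint32_t n = probe_segment(rel, segno);

        assert(n <= SEG_BLOCKS);
        if (n < SEG_BLOCKS)
            return (uint64_t) segno * SEG_BLOCKS + n;
        segno++;
    }
}

/* Hypothetical metapage that stores the relation length directly. */
typedef struct {
    uint64_t nblocks;
} FakeMetapage;

/* O(1): a single read, no matter how many segments exist. */
static uint64_t nblocks_from_metapage(const FakeMetapage *meta)
{
    return meta->nblocks;
}
```

With two full segments and a third holding 100 blocks, the scan issues three probes to arrive at 262244 blocks, while the metapage variant returns a stored value in one step.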

It looks pretty straightforward to me... 

The number of relations with more than one segment file is likely to be fairly small, so we can just keep an in-memory array to record them. At 8 bytes per relation larger than 1 GB, that isn't going to take much shared memory, and we can extend it using dynamic shared memory as needed. We can seq scan the array at relcache build time and invalidate the relcache when we extend. WAL-log any extension to a new segment and write the array to disk at checkpoint.
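
A rough sketch of what such an array could look like, in plain C. All names here (`SegCountEntry`, `SegCountTable`, `segcount_lookup()`, `segcount_extend()`) are hypothetical, the fixed capacity stands in for growth via dynamic shared memory, and the relcache invalidation, WAL logging and checkpoint write are left out. The 8-byte entry layout is just one assumption that matches the size estimate above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative fixed capacity; a real version would grow as needed. */
#define MAX_TRACKED 1024

/* One entry per relation that has more than one segment file. */
typedef struct {
    uint32_t relid; /* hypothetical relation identifier */
    uint32_t nsegs; /* number of segment files, always >= 2 */
} SegCountEntry;

static_assert(sizeof(SegCountEntry) == 8, "entry should be 8 bytes");

typedef struct {
    SegCountEntry entries[MAX_TRACKED];
    size_t        used;
} SegCountTable;

/*
 * Sequential scan, as would happen at relcache build time.
 * A relation that is not listed has exactly one segment.
 */
static uint32_t segcount_lookup(const SegCountTable *t, uint32_t relid)
{
    for (size_t i = 0; i < t->used; i++)
        if (t->entries[i].relid == relid)
            return t->entries[i].nsegs;
    return 1;
}

/*
 * Record that a relation was extended into a new segment.
 * Returns 0 on success, -1 if the array is full.
 * Omitted here: WAL-logging the change and invalidating the relcache.
 */
static int segcount_extend(SegCountTable *t, uint32_t relid)
{
    for (size_t i = 0; i < t->used; i++) {
        if (t->entries[i].relid == relid) {
            t->entries[i].nsegs++;
            return 0;
        }
    }
    if (t->used == MAX_TRACKED)
        return -1;
    t->entries[t->used].relid = relid;
    t->entries[t->used].nsegs = 2; /* first extension past one segment */
    t->used++;
    return 0;
}
```

Because only relations past their first segment get an entry, the common case of a small relation costs one short scan that finds nothing and falls back to a count of 1.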

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
