Re: [HACKERS] Pluggable storage

From Alexander Korotkov
Subject Re: [HACKERS] Pluggable storage
Date
Msg-id CAPpHfdseKNWmqVodbQ43vgh6LbEEKAWTJGgsuXZ+d3SiJi_CpQ@mail.gmail.com
In reply to Re: [HACKERS] Pluggable storage  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: [HACKERS] Pluggable storage  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Thu, Sep 14, 2017 at 8:17 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Instead of modifying the Bitmap Heap and Sample scans to avoid referring to
the internal members of the HeapScanDesc, I divided the HeapScanDesc
into two parts.

1. StorageScanDesc
2. HeapPageScanDesc

The StorageScanDesc contains the minimal information that is required outside
the storage routine, and it must be provided by all storage routines. This
structure contains only basics such as the relation, snapshot, buffer,
etc.

The HeapPageScanDesc contains the extra information that is required for
Bitmap Heap and Sample scans to work. This structure contains information
about blocks, visible offsets, etc. Currently this structure is used only in
Bitmap Heap and Sample scans and the related contrib modules, except
the pgstattuple module. pgstattuple needs some additional changes.
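
The split described above might be sketched roughly as follows. This is only an illustration, not the contents of the patch: the field names are borrowed from the existing HeapScanDescData, but which field lands in which part, the stand-in typedefs, MAX_TUPLES_PER_PAGE, and the helpers heap_get_pagescan() and pagescan_add_visible() are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL types so the sketch compiles alone. */
typedef uint32_t BlockNumber;
typedef uint16_t OffsetNumber;
typedef int Buffer;
typedef struct RelationData *Relation;
typedef struct SnapshotData *Snapshot;

#define MAX_TUPLES_PER_PAGE 291	/* illustrative bound, not taken from the patch */

/* Part 1: minimal state needed outside the storage routine (every AM). */
typedef struct StorageScanDescData
{
	Relation	rs_rd;			/* relation being scanned */
	Snapshot	rs_snapshot;	/* snapshot to check visibility against */
	Buffer		rs_cbuf;		/* current buffer, if any */
} StorageScanDescData;

/* Part 2: page-level state used only by Bitmap Heap and Sample scans. */
typedef struct HeapPageScanDescData
{
	BlockNumber rs_nblocks;		/* total number of blocks in the relation */
	BlockNumber rs_cblock;		/* current block number */
	int			rs_ntuples;		/* number of visible tuples on the page */
	int			rs_cindex;		/* current index into rs_vistuples */
	OffsetNumber rs_vistuples[MAX_TUPLES_PER_PAGE];	/* visible offsets */
} HeapPageScanDescData;

/* The heap keeps both parts inside its own private descriptor. */
typedef struct HeapScanDescData
{
	StorageScanDescData rs_scan;		/* AM-independent part */
	HeapPageScanDescData rs_pagescan;	/* heap-page part */
	bool		rs_inited;				/* heap-private state stays here */
} HeapScanDescData;

/* Hypothetical accessor: expose only the page-level part to the executor. */
static HeapPageScanDescData *
heap_get_pagescan(HeapScanDescData *scan)
{
	return &scan->rs_pagescan;
}

/* Hypothetical helper: record one visible tuple offset for the current page. */
static bool
pagescan_add_visible(HeapPageScanDescData *ps, OffsetNumber off)
{
	assert(ps != NULL);
	if (ps->rs_ntuples >= MAX_TUPLES_PER_PAGE)
		return false;			/* page already full */
	ps->rs_vistuples[ps->rs_ntuples++] = off;
	return true;
}
```

With such a layout, code outside the heap would touch only StorageScanDescData or HeapPageScanDescData and never HeapScanDescData itself.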

I added an additional storage API that returns the HeapPageScanDesc, as it is
required by the Bitmap Heap and Sample scans; this API is called only in these
two scans. These scan methods are chosen by the planner only
when the storage routine supports the API that returns the HeapPageScanDesc.
Currently I have implemented the planner support only for Bitmap scans; it is
yet to be done for Sample scans.
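
One way to picture the planner check is an optional callback in the storage routine, where a missing callback means the scan method is never considered. The sketch below assumes exactly that; StorageAmRoutine, scan_get_heappagescandesc, planner_can_use_bitmap_heap(), and the "iot" AM are hypothetical names, and the struct bodies are reduced to one field each.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical versions of the two descriptor parts. */
typedef struct StorageScanDescData { int rs_cbuf; } StorageScanDescData;
typedef struct HeapPageScanDescData { unsigned rs_nblocks; } HeapPageScanDescData;

typedef struct HeapScanDescData
{
	StorageScanDescData rs_scan;		/* must stay the first member */
	HeapPageScanDescData rs_pagescan;
} HeapScanDescData;

/* Optional callback: returns the page-level descriptor for a scan. */
typedef HeapPageScanDescData *(*scan_get_heappagescandesc_fn) (StorageScanDescData *scan);

typedef struct StorageAmRoutine
{
	const char *name;
	scan_get_heappagescandesc_fn scan_get_heappagescandesc; /* NULL if unsupported */
} StorageAmRoutine;

/* Heap implementation: valid because rs_scan is the first member. */
static HeapPageScanDescData *
heapam_get_heappagescandesc(StorageScanDescData *scan)
{
	assert(scan != NULL);
	return &((HeapScanDescData *) scan)->rs_pagescan;
}

/* The heap supports the API; a hypothetical index-organized AM does not. */
static const StorageAmRoutine heap_am = {"heap", heapam_get_heappagescandesc};
static const StorageAmRoutine iot_am = {"iot", NULL};

/* Planner side: consider a Bitmap Heap path only if the AM offers the API. */
static bool
planner_can_use_bitmap_heap(const StorageAmRoutine *am)
{
	return am != NULL && am->scan_get_heappagescandesc != NULL;
}
```

The same test could gate Sample scans once planner support for them exists.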

With the above approach, I removed all the references to HeapScanDesc
outside the heap. The changes for this approach are available in
0008-Remove-HeapScanDesc-usage-outside-heap.patch

Any suggestions or comments on the above approach?

For me, that's an interesting idea.  Naturally, the way BitmapHeapScan and SampleScan work, even at a very high level, is applicable only to some storage AMs (i.e. heap-like storage AMs).  For example, an index-organized table would never support BitmapHeapScan, because it refers to tuples by PK values, not TIDs.  However, in this case, the storage AM might have some alternative to our BitmapHeapScan.  So, an index-organized table might have some compressed representation of an ordered set of PK values and use it for bulk fetches from the PK index.

Therefore, I think it would be nice to make BitmapHeapScan a heap-storage-AM-specific scan method, while other storage AMs could provide their own storage-AM-specific scan methods.  Probably it would be too much for this patchset and should be done during one of the next work cycles on storage AMs (I'm sure that such a huge project as pluggable storage AMs will have multiple iterations).

Similarly, SampleScans contain storage-AM-specific logic.  For instance, our SYSTEM sampling method fetches random blocks from the heap, providing a high-performance way to sample the heap.  Coming back to the example of an index-organized table, it could provide its own storage-AM-specific table sampling methods, including sophisticated PK tree traversal that fetches random small ranges of PKs.  Given that tablesample methods are already pluggable, making them storage-AM-specific would lead to user-visible changes, i.e. a tablesample method would have to be created for a particular storage AM or set of storage AMs.  However, I haven't yet figured out what exactly the API should look like...
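
To illustrate why SYSTEM-style sampling is tied to a block-addressed heap, here is a minimal sketch of block-level selection: each block number is hashed together with a seed and kept if the hash falls below a cutoff derived from the requested percentage. It is only loosely modeled on that idea; mix32() is an arbitrary stand-in hash, and all function names are assumptions rather than PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Arbitrary 32-bit mixing function; any decent integer hash would do here. */
static uint32_t
mix32(uint32_t x)
{
	x ^= x >> 16;
	x *= 0x85ebca6bU;
	x ^= x >> 13;
	x *= 0xc2b2ae35U;
	x ^= x >> 16;
	return x;
}

/* Map a percentage in [0, 100] to a cutoff in [0, 2^32]. */
static uint64_t
sample_cutoff(double percent)
{
	const uint64_t full = (uint64_t) UINT32_MAX + 1;

	if (percent <= 0.0)
		return 0;
	if (percent >= 100.0)
		return full;
	return (uint64_t) ((double) full * percent / 100.0);
}

/* A block is sampled when its seeded hash falls below the cutoff. */
static bool
block_is_sampled(BlockNumber blkno, uint32_t seed, uint64_t cutoff)
{
	assert(cutoff <= (uint64_t) UINT32_MAX + 1);
	return (uint64_t) mix32(blkno ^ seed) < cutoff;
}

/* Count how many of the first nblocks blocks would be read. */
static BlockNumber
count_sampled_blocks(BlockNumber nblocks, uint32_t seed, double percent)
{
	uint64_t	cutoff = sample_cutoff(percent);
	BlockNumber n = 0;

	for (BlockNumber b = 0; b < nblocks; b++)
	{
		if (block_is_sampled(b, seed, cutoff))
			n++;
	}
	return n;
}
```

Nothing here makes sense for a storage AM without numbered blocks, which is why an index-organized table would need a different sampling method (for example, one that picks random PK ranges).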

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
