Re: Gsoc2012 idea, tablesample

Поиск
Список
Период
Сортировка
От Ants Aasma
Тема Re: Gsoc2012 idea, tablesample
Дата
Msg-id CA+CSw_twJkJ0P45oGNXRJ4RFfknEuuR4nVd=GqXUGw4gGm-HUA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Gsoc2012 idea, tablesample  (Sandro Santilli <strk@keybit.net>)
Ответы Re: Gsoc2012 idea, tablesample  (Sandro Santilli <strk@keybit.net>)
Список pgsql-hackers
On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli <strk@keybit.net> wrote:
> I'd love to see enhanced CTID operators, to fetch all visible tuples in a page
> using a tidscan.  Something like: WHERE ctid =~ '(501,*)' or a ctidrange.

Among other things, this would enable user-space implementation of
tablesample. Given the operator =~(tid, int) that matches the page
number and planner/executor integration so that it results in a TID
scan, you would need the following functions:

random_pages(tbl regclass, samples int) returns int[]
aggregate function:
reservoir_sample(item anyelement, samples int) returns anyarray

Implementations for both of the functions could be adapted from analyze.c.

Then tablesample could be implemented with the following query:
SELECT (SELECT reservoir_sample(some_table, 50) AS samples  FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
FROM random_pages('some_table', 50) AS rnd_pgtids;

Actually, now that I think about it, it could actually be implemented
without any modifications to core at some cost to efficiency.
random_pages would have to return tid[] that contains for each
generated pagenumber all possible tids on that page.

By making the building blocks available users get more flexibility.
The downside would be that we can't automatically make better sampling
methods available.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: psql omits row count under "\x auto"
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Patch: add timing of buffer I/O requests