Re: TABLESAMPLE patch

Поиск
Список
Период
Сортировка
От Petr Jelinek
Тема Re: TABLESAMPLE patch
Дата
Msg-id 5488D761.3090506@2ndquadrant.com
обсуждение исходный текст
Ответ на TABLESAMPLE patch  (Petr Jelinek <petr@2ndquadrant.com>)
Список pgsql-hackers
On 11/12/14 00:24, Petr Jelinek wrote:
> Hello,
>
> Attached is a basic implementation of TABLESAMPLE clause. It's SQL
> standard clause and couple of people tried to submit it before so I
> think I don't need to explain in length what it does - basically returns
> "random" sample of a table using a specified sampling method.
>
> I implemented both SYSTEM and BERNOULLI sampling as specified by SQL
> standard. The SYSTEM sampling does block level sampling using same
> algorithm as ANALYZE, BERNOULLI scans whole table and picks tuple randomly.
>
> There is API for sampling methods which consists of 4 functions at the
> moment - init, end, nextblock and nexttuple. I added catalog which maps
> the sampling method to the functions implementing this API. The grammar
> creates new TableSampleRange struct that I added for sampling. Parser
> then uses the catalog to load information about the sampling method into
> TableSampleClause which is then attached to RTE. Planner checks for if
> this parameter is present in the RTE and if it finds it it will create
> plan with just one path - SampleScan. SampleScan implements standard
> executor API and calls the sampling method API as needed.
>
> It is possible to write custom sampling methods. The sampling method
> parameters are not limited to just percent number as in standard but
> dynamic list of expressions which is checked against the definition of
> the init function in a similar fashion (although much simplified) as
> function calls are.
>
> Notable lacking parts are:
> - proper costing and returned row count estimation - given the dynamic
> nature of parameters I think for we'll need to let the sampling method
> do this, so there will have to be fifth function in the API.
> - ruleutils support (it needs a bit of code in get_from_clause_item
> function)
> - docs are sparse at the moment
>

Forgot the obligatory:

The research leading to these results has received funding from the 
European Union's Seventh Framework Programme (FP7/2007-2013) under grant 
agreement n° 318633.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: TABLESAMPLE patch
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Casting issues with domains