Re: TABLESAMPLE patch is really in pretty sad shape

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: TABLESAMPLE patch is really in pretty sad shape
Дата
Msg-id 15899.1437609699@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: TABLESAMPLE patch is really in pretty sad shape  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: TABLESAMPLE patch is really in pretty sad shape  (Petr Jelinek <petr@2ndquadrant.com>)
Список pgsql-hackers
I wrote:
> We could alternatively provide two scan-initialization callbacks,
> one that analyzes the parameters before we do heap_beginscan,
> and another that can do additional setup afterwards.  Actually,
> that second call would really not be meaningfully different from
> the ReScan call, so another solution would be to just automatically
> invoke ReScan after we've created the HeapScanDesc.  TSMs could work
> around not having this by complicating their NextBlock function a bit
> (so that it would do some initialization on first call) but perhaps
> it would be easier to have the additional init call.

Actually, there's another reason why this has to be redesigned: evaluating
the parameter expressions of TABLESAMPLE during ExecInitSampleScan is
not OK.  Consider this:

regression=# select * from (values (0::float4),(100)) v(pct), lateral (select count(*) from tenk1 tablesample bernoulli
(pct))ss;pct | count 
 
-----+-------  0 |     0100 |     0
(2 rows)

Surely the second execution of the subquery should've given me 10000.
It doesn't because the TABLESAMPLE parameters aren't recomputed during a
rescan.  Even if you're minded to claim that that's all right, this is
definitely not acceptable:

regression=# select * from (values (0::float4),(100)) v(pct), lateral (select * from tenk1 tablesample bernoulli (pct))
ss;
server closed the connection unexpectedly       This probably means the server terminated abnormally       before or
whileprocessing the request.
 
The connection to the server was lost. Attempting reset: Failed.

That one's crashing because "pct" is a Var that refers to an outer scan
tuple that doesn't exist yet.  I'm not real sure how come the case with
an aggregate manages not to crash, actually.

This needs to work more like LIMIT, which doesn't try to compute the
limit parameters until the first fetch.  So what we need is an Init
function that does very darn little indeed (maybe we don't even need
it at all), and then a ParamInspect function that is called at first fetch
or during a ReScan, and that one is the one that gets to look at the
evaluated parameter values.

If we wanted to let the method inspect the HeapScanDesc during setup
it would need still a third callback.  I'm not exactly sure if that's
worth the trouble.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: could not truncate directory "pg_subtrans": apparent wraparound
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: ALTER TABLE .. ADD PRIMARY KEY .. USING INDEX has dump-restore hazard