Re: TABLESAMPLE patch

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: TABLESAMPLE patch
Дата
Msg-id CAB7nPqS-t2pM6aCxurtdrvb0PJ3KYRpxCh20xKoYvGu+_9QrVw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: TABLESAMPLE patch  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: TABLESAMPLE patch  (Peter Eisentraut <peter_e@gmx.net>)
Список pgsql-hackers
On Thu, Apr 9, 2015 at 5:12 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Apr 9, 2015 at 4:30 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Mon, Apr 6, 2015 at 11:02 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
>>>
>>> On 06/04/15 14:30, Petr Jelinek wrote:
>>>>
>>>> On 06/04/15 11:02, Simon Riggs wrote:
>>>>
>>>>> Are we ready for a final detailed review and commit?
>>>>>
>>>>
>>>> I plan to send v12 in the evening with some additional changes that came
>>>> up from Amit's comments + some improvements to error reporting. I think
>>>> it will be ready then.
>>>>
>>>
>>> Ok so here it is.
>>>
>>> Changes vs v11:
>>> - changed input parameter list to expr_list
>>> - improved error reporting, particularly when input parameters are wrong
>>> - fixed SELECT docs to show correct syntax and mention that there can be
>>> more sampling methods
>>> - added name of the sampling method to the explain output - I don't like
>>> the code much there as it has to look into RTE but on the other hand I don't
>>> want to create new scan node just so it can hold the name of the sampling
>>> method for explain
>>> - made views containing TABLESAMPLE clause not autoupdatable
>>> - added PageIsAllVisible() check before trying to check for tuple
>>> visibility
>>> - some typo/white space fixes
>>
>>
>>
>> Compiler warnings:
>> explain.c: In function 'ExplainNode':
>> explain.c:861: warning: 'sname' may be used uninitialized in this function
>>
>>
>> Docs spellings:
>>
>> "PostgreSQL distrribution"  extra r.
>>
>> "The optional parameter REPEATABLE acceps"   accepts.  But I don't know that
>> 'accepts' is the right word.  It makes the seed value sound optional to
>> REPEATABLE.
>>
>> "each block having same chance"  should have "the" before "same".
>>
>> "Both of those sampling methods currently...".  I think it should be "these"
>> not "those", as this sentence is immediately after their introduction, not
>> at a distance.
>>
>> "...tuple contents and decides if to return in, or zero if none"  Something
>> here is confusing. "return it", not "return in"?
>>
>> Other comments:
>>
>> Do we want tab completions for psql?  (I will never remember how to spell
>> BERNOULLI).
>
> Yes. I think so.
>
>> Needs a Cat version bump.
>
> The committer who will pick up this patch will normally do it.
>
> Patch 1 is simple enough and looks fine to me.
>
> Regarding patch 2...
> I found for now some typos:
> +  <title><structname>pg_tabesample_method</structname></title>
> +        <productname>PostgreSQL</productname> distrribution:
>
> Also, I am wondering if the sampling logic based on block analysis is
> actually correct, for example for now this fails and I think that we
> should support it:
> =# with query_select as (select generate_series(1, 10) as a) select
> query_select.a from query_select tablesample system (100.0/11)
> REPEATABLE (9999);
> ERROR:  42P01: relation "query_select" does not exist
>
> How does the SQL spec define exactly TABLESAMPLE? Shouldn't we get a
> sample from a result set?
> Thoughts?

Just to be clear, the example above being misleading... Doing table
sampling using SYSTEM at physical level makes sense. In this case I
think that we should properly error out when trying to use this method
on something not present at physical level. But I am not sure that
this restriction applies to BERNOUILLI: you may want to apply it on
other things than physical relations, like views or results of WITH
clauses. Also, based on the fact that we support custom sampling
methods, I think that it should be up to the sampling method to define
on what kind of objects it supports sampling, and where it supports
sampling fetching, be it page-level fetching or analysis from an
existing set of tuples. Looking at the patch, TABLESAMPLE is just
allowed on tables and matviews, this limitation is too restrictive
IMO.
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: TABLESAMPLE patch
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: TABLESAMPLE patch