Re: Yet another abort-early plan disaster on 9.3

From: Tomas Vondra
Subject: Re: Yet another abort-early plan disaster on 9.3
Date:
Msg-id: 543EB6B6.4080005@fuzzy.cz
In reply to: Re: Yet another abort-early plan disaster on 9.3  (Josh Berkus <josh@agliodbs.com>)
Responses: Re: Yet another abort-early plan disaster on 9.3  (Greg Stark <stark@mit.edu>)
List: pgsql-performance
On 15.10.2014 19:20, Josh Berkus wrote:
> On 10/10/2014 04:16 AM, Greg Stark wrote:
>> On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> Yes, it's only intractable if you're wedded to the idea of a tiny,
>>> fixed-size sample.  If we're allowed to sample, say, 1% of the table, we
>>> can get a MUCH more accurate n_distinct estimate using multiple
>>> algorithms, of which HLL is one.  While n_distinct will still have some
>>> variance, it'll be over a much smaller range.
>>
>> I've gone looking for papers on this topic but from what I read this
>> isn't so. To get any noticeable improvement you need to read 10-50% of
>> the table and that's effectively the same as reading the entire table
>> -- and it still had pretty poor results. All the research I could find
>> went into how to analyze the whole table while using a reasonable
>> amount of scratch space and how to do it incrementally.
>
> So, right now our estimation is off on large tables by -10X to
> -10000X. First, the fact that it's *always* low is an indication
> we're using the wrong algorithm. Second, we can most certainly do
> better than a median of -1000X.

A few days ago I posted an experimental patch with the adaptive
estimator described in [1]. It's not perfect, but based on the testing I
did I believe it's superior to the algorithm we use now. It would be
nice to identify a few datasets where the current estimate is way off.

[1]
http://ftp.cse.buffalo.edu/users/azhang/disc/disc01/cd1/out/papers/pods/towardsestimatimosur.pdf
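For illustration, here is a minimal sketch of the simpler "guaranteed-error"
estimator (GEE) from the same line of research: scale up the count of values
seen exactly once in the sample by sqrt(N/r). The adaptive estimator in the
patch is a refinement of this idea; the function name and toy data below are
made up for the example.

```python
from collections import Counter
from math import sqrt
import random

def gee_estimate(sample, table_rows):
    """Guaranteed-Error Estimator: D ~ sqrt(N/r) * f1 + sum(f_i, i >= 2),
    where f_i is the number of values seen exactly i times in the sample."""
    r = len(sample)
    freq = Counter(Counter(sample).values())   # frequency of frequencies
    f1 = freq.get(1, 0)                        # values seen exactly once
    d_rest = sum(c for i, c in freq.items() if i >= 2)
    return sqrt(table_rows / r) * f1 + d_rest

# toy example: a 10,000-row "table" with 1,000 distinct values,
# estimated from a 1% sample
random.seed(42)
table = [random.randrange(1000) for _ in range(10000)]
sample = random.sample(table, 100)
print(round(gee_estimate(sample, len(table))))
```

The intuition is that singletons in the sample are evidence of many more
unseen values, while values already seen twice or more are probably common
enough that the sample has captured them.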


> One interesting set of algorithms is block-based sampling. That is,
> you read 5% of the physical table in random blocks, reading every row
> in the block. The block size is determined by your storage block
> size, so you're not actually reading any more physically than you are
> logically; it really is just 5% of the table, especially on SSD.
>
> Then you apply algorithms which first estimate the correlation of
> common values in the block (i.e. how likely is it that the table is
> completely sorted?), and then estimates of how many values there
> might be total based on the correlation estimate.

I think we might also use a different approach - instead of sampling the
data when ANALYZE kicks in, we might collect a requested sample of rows
on the fly. Say we want 1% sample - whenever you insert a new row, you
do [random() < 0.01] and if it happens to be true you keep a copy of the
row aside. Then, when you need the sample, you simply read the sample
and you're done - no random access to the main table, no problems with
estimates being off due to block-level sampling, etc.

Not sure how to track deletions/updates, though. Maybe rebuilding the
sample if the number of deletions exceeds some threshold, but that
contradicts the whole idea a bit.
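A rough sketch of that insert-time sampling, including a rebuild threshold
for deletions - all names and the threshold policy here are hypothetical,
just to make the bookkeeping concrete:

```python
import random

class InsertTimeSample:
    """Maintain a ~1% Bernoulli sample at INSERT time; flag a rebuild
    once too large a fraction of the sampled rows has been deleted."""

    def __init__(self, rate=0.01, max_stale_fraction=0.5):
        self.rate = rate
        self.max_stale = max_stale_fraction
        self.sample = {}      # row_id -> copy of the row
        self.deleted = 0      # sampled rows lost to DELETEs

    def on_insert(self, row_id, row):
        # coin flip per inserted row, as in [random() < 0.01]
        if random.random() < self.rate:
            self.sample[row_id] = row

    def on_delete(self, row_id):
        # only deletions of *sampled* rows degrade the sample
        if self.sample.pop(row_id, None) is not None:
            self.deleted += 1

    def needs_rebuild(self):
        total = len(self.sample) + self.deleted
        return total > 0 and self.deleted / total > self.max_stale
```

Updates could be handled as a delete plus an insert, at the cost of the
updated row getting a fresh coin flip.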

> I no longer have my ACM membership, so I can't link this, but
> researchers were able to get +/- 3X accuracy for a TPCH workload
> using this approach. A real database would be more variable, of
> course, but even so we should be able to achieve +/- 50X, which
> would be an order of magnitude better than we're doing now.

If you know the title of the paper, it's usually available elsewhere on
the web - often at an author's university site. I found these two papers
about block-based sampling:


http://ranger.uta.edu/~gdas/websitepages/preprints-papers/p287-chaudhuri.pdf

https://www.stat.washington.edu/research/reports/1999/tr355.pdf

Maybe there are more, but most of the other links were about how Oracle
does this in 11g.
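The block-based scheme quoted earlier - pick random blocks and read every
row in each chosen block - can be sketched as follows. The function and its
parameters are hypothetical, and the correlation-based correction step is
omitted; this only shows the I/O pattern.

```python
import random

def block_sample(table, block_size=8, fraction=0.05, seed=0):
    """Return roughly `fraction` of the rows by reading whole random
    blocks of `block_size` consecutive rows, rather than random rows."""
    rng = random.Random(seed)
    n_blocks = (len(table) + block_size - 1) // block_size
    k = max(1, int(n_blocks * fraction))       # blocks to read
    chosen = rng.sample(range(n_blocks), k)    # random block selection
    rows = []
    for b in chosen:
        # every row in the block - one sequential read per block
        rows.extend(table[b * block_size:(b + 1) * block_size])
    return rows
```

The point of the correction step the quoted message describes is that rows
within a block are often correlated (e.g. a sorted or clustered table), so
the within-block duplicate structure has to inform the scale-up to the
whole table.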

Tomas

