Re: [PERFORM] Number of characters in column preventing index usage

Поиск
Список
Период
Сортировка
От Hustler DBA
Тема Re: [PERFORM] Number of characters in column preventing index usage
Дата
Msg-id CAM00CHFeSv-01S31WKNTkeyNJFf7buZMgrFqnU-4mbmdBKZbjQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PERFORM] Number of characters in column preventing index usage  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Thanks you guys are correct... the size of the table caused the optimizer to do a seq scan instead of using the index. I tried it on a  24 MB and 1 GB table and the expected index was used. 



On Fri, Feb 17, 2017 at 7:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Fri, Feb 17, 2017 at 3:49 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
> wrote:
>> That may seem a bit strange, but I'd bet it finds the short value in some
>> statistic (MCV, histogram) ans so can provide very accurate estimate.

> ​I'm not seeing how any of the statistic columns would capture a value that
> doesn't actually appear in the table...(actual ... row=0)​

I think it's the other way around.  It found
'01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea' in the stats, concluded
(accurately) that there would be five matches, and on the strength of that
decided that a seqscan over this very tiny table would be faster than an
indexscan.  In the other case, the short string exists neither in the
table nor the stats, and the default estimate is turning out to be that
there's a single match, for which it likes the indexscan solution.  This
is all pretty unsurprising if '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea'
is in the most-common-values list.  Anything that's *not* in that list
is going to get a smaller rowcount estimate.  (I don't think that the
string length, per se, has anything to do with it.)

I'm not sure what performance problem the OP was looking to solve,
but expecting experiments on toy-sized tables to give the same plans
as you get on large tables is a standard mistake when learning to work
with the PG planner.

Also, if toy-sized tables are all you've got, meaning the whole database
can be expected to stay RAM-resident at all times, it'd be a good idea
to reduce random_page_cost to reflect that.  The default planner cost
settings are meant for data that's mostly on spinning rust.

                        regards, tom lane


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [PERFORM] Number of characters in column preventing index usage
Следующее
От: Mike Beaton
Дата:
Сообщение: Re: [PERFORM] Correct use of cursors for very large result sets in Postgres