Re: BUG #7510: Very bad costing estimation on gin vs gist with FTS

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #7510: Very bad costing estimation on gin vs gist with FTS
Дата
Msg-id 20426.1346299806@sss.pgh.pa.us
обсуждение исходный текст
Ответ на BUG #7510: Very bad costing estimation on gin vs gist with FTS  (daniel@heroku.com)
Список pgsql-bugs
daniel@heroku.com writes:
> If one adds a GIN index and does a fresh analyze, the planner still produce
> a plan for the GiST index.  Because there is no way to disable particular
> indexes in a session, it's impossible to quickly experiment with a new
> hypothetical situation with only the GIN index without possibly painting
> yourself into a corner where you've dropped a needed index.

FWIW, there is a pretty standard workaround for that:

    begin;
    drop index unwanted_index;
    explain ...;
    rollback;

This isn't ideal in a production database because it requires exclusive
lock on the table for long enough to run the EXPLAIN.  But it's not true
that there's no way to handle this at all.  If you want something more
flexible, I'd suggest working on improving the "index advisor" plugin
that was getting batted around a couple years ago.  There are sufficient
hooks in the planner to let it be given an arbitrary hypothetical set of
indexes.  But I digress...

Anyway, the meat of your complaint is that the planner is overestimating
the cost of a GIN scan relative to a GIST scan.  I believe the reason
for this is that gincostestimate() is trying to make a fairly honest
estimate of the work involved, whereas gistcostestimate() is just a stub
around genericcostestimate(), which computes an estimate that's more or
less suitable for btree-equivalent index operations.  There's certainly
not any intelligence in the latter that would be capable of dealing with
issues like how much a tsvector "*" operator is going to hurt.  We could
stand to have less bogus estimates for GIST (not to mention SPGIST), but
I'm really not familiar enough with either to write better code for that.

It also seems possible that you've tripped over a plain old performance
bug in the GIST code, ie, the fault is not with the estimate but the
reality.  It's hard to tell about that though.  Do you want to try to
make up a smaller self-contained test case?

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Dave Johansen
Дата:
Сообщение: Re: BUG #3668: type error in serial
Следующее
От: i@avdd.co
Дата:
Сообщение: BUG #7511: 9.2: pg_stat_activity.procpid renamed unnecessarily