Re: PATCH: index-only scans with partial indexes

From: Tomas Vondra
Subject: Re: PATCH: index-only scans with partial indexes
Date:
Msg-id: 55EB23B5.8030805@2ndquadrant.com
In reply to: Re: PATCH: index-only scans with partial indexes  (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: PATCH: index-only scans with partial indexes  (Kevin Grittner <kgrittn@ymail.com>)
List: pgsql-hackers
Hi,

On 09/05/2015 10:53 AM, Simon Riggs wrote:
> On 4 September 2015 at 22:03, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
>     A summary of 100 EXPLAIN timings looks like this:
>
>
>     master       A          B          C          D          E          F
>     -------------------------------------------------------------------------
>     min        0.10       0.10       0.30       0.29       0.66       0.23
>     max        1.07       1.00       2.13       1.98       4.52       1.59
>     median     0.49       0.52       0.31       0.33       0.68       1.12
>     average    0.43       0.35       0.62       0.49       1.01       0.89
>
>
> What are these? Times? in ms?

Yes, those are planning times in milliseconds. I've been thinking about
possible issues in the benchmark, and I ended up with two main suspects:

  (a) environment - a VM running on a laptop, thus quite noisy and
      subject to various sources of overhead, power management, etc.

  (b) time measured using \timing in psql (by running EXPLAIN), so
      probably influenced by formatting/transfer

So I reran the benchmark on a different machine (bare metal, pretty much 
no noise in the results), and measured the planning time using EXPLAIN 
ANALYZE (Planning Time). And I got this (milliseconds):

         A         B         C         D         E         F
---------------------------------------------------------------
min      0.04      0.04      0.11      0.10      0.37      0.12
max      0.10      0.10      0.92      0.92      1.62      1.23
median   0.04      0.04      0.11      0.11      0.37      0.13
average  0.04      0.04      0.11      0.11      0.38      0.14

         A         B         C         D         E         F
---------------------------------------------------------------
min      0.04      0.04      0.11      0.11      0.38      0.13
max      0.10      0.10      0.92      0.94      1.64      1.21
median   0.04      0.04      0.11      0.11      0.39      0.13
average  0.04      0.04      0.11      0.12      0.40      0.14

So the numbers are much lower (better CPU, no virtualization, etc.), but
the conclusion is exactly the same - no measurable overhead compared to
master.
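
For completeness, this is roughly how the two timings were collected.
The first run relied on \timing in psql around a plain EXPLAIN, which
includes formatting/transfer; the second run used the planner's own
number reported by EXPLAIN ANALYZE (the query is the simple one from
the benchmark, the output shown is just a placeholder):

    -- (1) client-side timing, noisier
    \timing on
    explain select a from t where b >= 100 and b <= 200;

    -- (2) the planner's own measurement, as used in the tables above
    explain analyze select a from t where b >= 100 and b <= 200;
    --   ...
    --   Planning time: ... ms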

I can think of three ways to make the checks more expensive:

   (a) using more indexes

       The current benchmark already uses 40 indexes (and I've tried
       with 100), and we've seen no impact at all. Adding more indexes
       will eventually show some overhead, but the number of indexes
       would have to be very high - I doubt anyone has a table with
       hundreds of partial indexes on it (a sketch of this kind of
       setup follows after this list).
 
   (b) using more complex index predicates

       I expect the predicate_implied_by() call to get more expensive
       for more complex predicates. However, I believe that's quite an
       uncommon case - the vast majority of index predicates I've seen
       use just a single equality clause.
 
   (c) using more complex queries (more WHERE conditions)

       Having more complex WHERE clauses seems quite plausible, though,
       so I've decided to try it. Instead of the simple query used
       before:

           select a from t where b >= 100 and b <= 200;

       I've used a query with a bunch of other conditions:

           select a from t where b >= 100 and b <= 200
                             and c >= 100 and c <= 200
                             and d >= 100 and d <= 200
                             and a >= 100 and a <= 100;

       And indeed, this made a (tiny) difference - on master the
       planning took 0.50 ms on average, while with the patch it was
       0.55 ms. But 0.05 ms is just barely above noise, even on this
       hardware.

       Of course, this only impacts the case with partial indexes; all
       the other cases were exactly the same with and without the
       patch.
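
To give an idea of what these checks actually look like, here is a
sketch of the kind of setup the benchmark exercises - a single table
with a number of partial indexes, each with a simple predicate. (The
DDL below is illustrative only, not the actual benchmark schema.)

    -- a table like the one used in the queries above
    create table t (a int, b int, c int, d int);

    -- many partial indexes with simple predicates; the benchmark
    -- used 40 of them (and I also tried 100)
    create index t_b_100 on t (a) where b >= 100 and b <= 200;
    create index t_b_300 on t (a) where b >= 300 and b <= 400;
    -- ... and so on

    -- while planning this query, predicate_implied_by() is used to
    -- check which index predicates are implied by the WHERE clause;
    -- with the patch, the planner can then consider an index-only
    -- scan on the matching partial index
    explain select a from t where b >= 100 and b <= 200;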
 
>
>     However that was the idea of the benchmark, and I got no difference.
>
>
> Please explain what this means and your conclusion, so its clear. That
> way we can either reject the patch or commit it. Thanks

It means I've been unable to measure any significant overhead from the
patch. There certainly are extreme cases where the patch might make
planning noticeably slower, but I believe those are rather artificial,
and I certainly wouldn't expect them in databases where a tiny increase
in planning time would be a problem.

This benchmark, however, only looked at the planning overhead, and we
should weigh that against the possible gains. And IOS is a great
optimization - it's not uncommon to see 2-3x improvements on databases
that fit into RAM, and order-of-magnitude improvements on large
databases (thanks to eliminating the random I/O of accessing the heap).
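
The heap-access savings are also easy to see directly: with the patch
applied (and after a VACUUM, so the visibility map is set), EXPLAIN
ANALYZE shows an Index Only Scan and reports how often the heap still
had to be visited. A sketch, reusing the illustrative schema from
above:

    vacuum analyze t;

    explain (analyze, buffers)
        select a from t where b >= 100 and b <= 200;
    --   Index Only Scan using t_b_100 on t ...
    --     Heap Fetches: 0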

So my opinion is that we should commit this patch.

regards

-- 
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


