Обсуждение: Force another plan.

Поиск
Список
Период
Сортировка

Force another plan.

От
Fredrik Olsson
Дата:
I have some quite huge queries, inside functions, so debugging is kind
of hard. But I have located the query that for some reason gets 4 times
as slow after an analyze.

Before analyze the plan for the query is this:
Nested Loop  (cost=16.80..34.33 rows=1 width=28)
  Join Filter: (ischildof(2, "outer".calendar) OR (hashed subplan))
  ->  Nested Loop  (cost=0.00..11.66 rows=1 width=32)
        ->  Index Scan using t_events_eventype on t_events e
(cost=0.00..5.82 rows=1 width=28)
              Index Cond: (eventtype = 1)
              Filter: (rrfreq IS NOT NULL)
        ->  Index Scan using t_entities_pkey on t_entities te
(cost=0.00..5.83 rows=1 width=4)
              Index Cond: (te."ID" = "outer".entity)
              Filter: (partof = 'events'::name)
  ->  Index Scan using t_entities_pkey on t_entities  (cost=0.00..5.85
rows=1 width=4)
        Index Cond: (t_entities."ID" = "outer".entity)
        Filter: ((haveaccess(createdby, responsible, "class", false) OR
CASE WHEN (partof = 'contacts'::name) THEN ischildof(ancestorof(me()),
"ID") ELSE false END) AND (subplan))
        SubPlan
          ->  Function Scan on alleventoccurances  (cost=0.00..12.50
rows=1000 width=8)
  SubPlan
    ->  Seq Scan on t_attendees  (cost=0.00..16.38 rows=170 width=4)
          Filter: ischildof(2, contact)

In reality this takes approximately 1.0s in the general case. After an
analyze the plan becomes:

Nested Loop  (cost=2.09..4.82 rows=1 width=28)
  Join Filter: ("inner"."ID" = "outer"."ID")
  ->  Hash Join  (cost=2.09..3.59 rows=1 width=32)
        Hash Cond: ("outer"."ID" = "inner".entity)
        Join Filter: (ischildof(2, "inner".calendar) OR (hashed subplan))
        ->  Seq Scan on t_entities  (cost=0.00..1.46 rows=6 width=4)
              Filter: ((haveaccess(createdby, responsible, "class",
false) OR CASE WHEN (partof = 'contacts'::name) THEN
ischildof(ancestorof(me()), "ID") ELSE false END) AND (subplan))
              SubPlan
                ->  Function Scan on alleventoccurances
(cost=0.00..12.50 rows=1000 width=8)
        ->  Hash  (cost=1.06..1.06 rows=2 width=28)
              ->  Seq Scan on t_events e  (cost=0.00..1.06 rows=2 width=28)
                    Filter: ((rrfreq IS NOT NULL) AND (eventtype = 1))
        SubPlan
          ->  Seq Scan on t_attendees  (cost=0.00..1.02 rows=1 width=4)
                Filter: ischildof(2, contact)
  ->  Seq Scan on t_entities te  (cost=0.00..1.16 rows=5 width=4)
        Filter: (partof = 'events'::name)

This takes on approximately 4.5s. So obviously it has degraded.

I count myself as a newbie here, so any hints on what goes on, why a
plan might be chosen, and how I can make is better is appreciated.
Naturally the I can provide scripts to set up all or parts of the
database if anyone like.

regards

--
//Fredrik Olsson
  Treyst AB
  +46-(0)19-362182
  fredrik.olsson@treyst.se


Re: Force another plan.

От
Tom Lane
Дата:
Fredrik Olsson <fredrik.olsson@treyst.se> writes:
> I have some quite huge queries, inside functions, so debugging is kind
> of hard. But I have located the query that for some reason gets 4 times
> as slow after an analyze.

Could we see EXPLAIN ANALYZE output for these cases, not just EXPLAIN?
It seems a bit premature to be discussing ways to "force" a plan choice
when you don't even have a clear idea what's going wrong.

            regards, tom lane

Re: Force another plan.

От
Fredrik Olsson
Дата:
Tom Lane skrev:
> Fredrik Olsson <fredrik.olsson@treyst.se> writes:
>
>> I have some quite huge queries, inside functions, so debugging is kind
>> of hard. But I have located the query that for some reason gets 4 times
>> as slow after an analyze.
>>
>
> Could we see EXPLAIN ANALYZE output for these cases, not just EXPLAIN?
> It seems a bit premature to be discussing ways to "force" a plan choice
> when you don't even have a clear idea what's going wrong.
>
Sorry about that, my fault. Here comes EXPLAIN ANALYZE:

Before VACUUM ANALYZE:
===
Nested Loop  (cost=16.80..34.33 rows=1 width=28) (actual
time=54.197..98.598 rows=1 loops=1)
  Join Filter: (ischildof(2, "outer".calendar) OR (hashed subplan))
  ->  Nested Loop  (cost=0.00..11.66 rows=1 width=32) (actual
time=0.307..0.458 rows=3 loops=1)
        ->  Index Scan using t_events_eventype on t_events e
(cost=0.00..5.82 rows=1 width=28) (actual time=0.241..0.307 rows=3 loops=1)
              Index Cond: (eventtype = 1)
              Filter: (rrfreq IS NOT NULL)
        ->  Index Scan using t_entities_pkey on t_entities te
(cost=0.00..5.83 rows=1 width=4) (actual time=0.035..0.039 rows=1 loops=3)
              Index Cond: (te."ID" = "outer".entity)
              Filter: (partof = 'events'::name)
  ->  Index Scan using t_entities_pkey on t_entities  (cost=0.00..5.85
rows=1 width=4) (actual time=28.445..28.447 rows=0 loops=3)
        Index Cond: (t_entities."ID" = "outer".entity)
        Filter: ((haveaccess(createdby, responsible, "class", false) OR
CASE WHEN (partof = 'contacts'::name) THEN ischildof(ancestorof(me()),
"ID") ELSE false END) AND (subplan))
        SubPlan
          ->  Function Scan on alleventoccurances  (cost=0.00..12.50
rows=1000 width=8) (actual time=19.745..19.745 rows=0 loops=3)
  SubPlan
    ->  Seq Scan on t_attendees  (cost=0.00..16.38 rows=170 width=4)
(actual time=0.422..0.447 rows=2 loops=1)
          Filter: ischildof(2, contact)
Total runtime: 99.814 ms

After VACUUM ANALYZE:
===
Nested Loop  (cost=2.11..4.92 rows=1 width=28) (actual
time=434.321..439.102 rows=1 loops=1)
  Join Filter: ("inner"."ID" = "outer"."ID")
  ->  Hash Join  (cost=2.11..3.67 rows=1 width=32) (actual
time=434.001..438.775 rows=1 loops=1)
        Hash Cond: ("outer"."ID" = "inner".entity)
        Join Filter: (ischildof(2, "inner".calendar) OR (hashed subplan))
        ->  Seq Scan on t_entities  (cost=0.00..1.49 rows=7 width=4)
(actual time=404.539..409.302 rows=2 loops=1)
              Filter: ((haveaccess(createdby, responsible, "class",
false) OR CASE WHEN (partof = 'contacts'::name) THEN
ischildof(ancestorof(me()), "ID") ELSE false END) AND (subplan))
              SubPlan
                ->  Function Scan on alleventoccurances
(cost=0.00..12.50 rows=1000 width=8) (actual time=27.871..27.871 rows=0
loops=14)
        ->  Hash  (cost=1.07..1.07 rows=3 width=28) (actual
time=0.063..0.063 rows=3 loops=1)
              ->  Seq Scan on t_events e  (cost=0.00..1.07 rows=3
width=28) (actual time=0.023..0.034 rows=3 loops=1)
                    Filter: ((rrfreq IS NOT NULL) AND (eventtype = 1))
        SubPlan
          ->  Seq Scan on t_attendees  (cost=0.00..1.02 rows=1 width=4)
(actual time=0.205..0.228 rows=2 loops=1)
                Filter: ischildof(2, contact)
  ->  Seq Scan on t_entities te  (cost=0.00..1.18 rows=6 width=4)
(actual time=0.029..0.045 rows=6 loops=1)
        Filter: (partof = 'events'::name)
Total runtime: 440.385 ms

As I read it, the villain is the sequential sqan on t_entities with the
huge filter; haveacces() ...
And that is understandable, doing haveaccess() on all rows is not good.
A  much better solution in this case would be to first get the set that
conforms to (partof = 'events'::name), that would reduce the set to a
third. Secondly applying (eventtype=1) would reduce that to half.  Then
it is time to do the  (haveaccess() ...).

Perhaps a small explanation of the tables, and their intent is in order.

What I have is one "master table" with entities (t_entitites), and two
child tables t_contacts and t_events. In a perfect world rt_contacts and
t_events would have inherited from t_entities as they share muuch data,
but then I would not be able to have foreign key referencing just
events, or just contacts. So instead every row in events and contacts
have a corresponding one to one row in entities. The fourth table used
in this query is t_attendees, that links well attendees for an event to
contacts. Here goes a simplified example (SQL says more then 1000 words):

CREATE TABLE t_entities (
  "ID" integer PRIMARY KEY,
  "createdby" integer NOT NULL,
  "class" integer NOT NULL,  -- Defines visibility for entity and is
used by haveaccess()
  "partof" name NOT NULL
);

CREATE TABLE t_contacts (
  "entity" integer PRIMARY KEY REFERENCES t_entities ("ID"),
  "undercontact" integer REFERENCES t_contacts ("entity"), -- Tree
structure, used by haveaccess()
  "name" varchar(48)
);

ALTER TABLE t_entities ADD FOREIGN KEY ("createdby")
  REFERENCES t_contacts ("entity");

CREATE TABLE t_events (
  "entity" integer PRIMARY KEY REFERENCES t_entities ("ID"),
  "calendar" integer NOT NULL REFERENCES t_entities ("ID"),
  "eventtype" integer NOT NULL,
  "start" timestamptz,
  "end" timestamptz
);

CREATE TABLE t_attendees (
  "event" integer NOT NULL REFERENCES t_events ("entity"),
  "contact" integer NOT NULL REFERENCES t_contacts ("entity"),
  PRIMARY KEY ("event", "contact")
);

No user have privileges to select, update or delete in any of these
tables. They are instead accessed with views that are made updatable
using rules. These rules uses the function haveaccess() to sort out the
rows that each user is allowed to see. Users are also contacts in the
t_contacts table.

Well the complete database, with a shell script for setting it up can as
I said be provided if wanted.

--
//Fredrik Olsson
  Treyst AB
  +46-(0)19-362182
  fredrik.olsson@treyst.se



Re: Force another plan.

От
Tom Lane
Дата:
Fredrik Olsson <fredrik.olsson@treyst.se> writes:
>         ->  Seq Scan on t_entities  (cost=0.00..1.49 rows=7 width=4)
> (actual time=404.539..409.302 rows=2 loops=1)
>               Filter: ((haveaccess(createdby, responsible, "class",
> false) OR CASE WHEN (partof = 'contacts'::name) THEN
> ischildof(ancestorof(me()), "ID") ELSE false END) AND (subplan))
>               SubPlan
>                 ->  Function Scan on alleventoccurances
> (cost=0.00..12.50 rows=1000 width=8) (actual time=27.871..27.871 rows=0
> loops=14)

This seems to be your problem right here: evaluating that subplan for
each row of t_entities is pretty expensive, and yet the planner's
estimating a total cost of only 1.49 to run the scan.  What PG version
is this?  AFAICT we've accounted for subplan costs in scan quals for
a long time, certainly since 7.4.  Can you put together a self-contained
test case for this?

            regards, tom lane

Re: Force another plan.

От
Fredrik Olsson
Дата:
Tom Lane skrev:
> Fredrik Olsson <fredrik.olsson@treyst.se> writes:
>
>>         ->  Seq Scan on t_entities  (cost=0.00..1.49 rows=7 width=4)
>> (actual time=404.539..409.302 rows=2 loops=1)
>>               Filter: ((haveaccess(createdby, responsible, "class",
>> false) OR CASE WHEN (partof = 'contacts'::name) THEN
>> ischildof(ancestorof(me()), "ID") ELSE false END) AND (subplan))
>>               SubPlan
>>                 ->  Function Scan on alleventoccurances
>> (cost=0.00..12.50 rows=1000 width=8) (actual time=27.871..27.871 rows=0
>> loops=14)
>>
>
> This seems to be your problem right here: evaluating that subplan for
> each row of t_entities is pretty expensive, and yet the planner's
> estimating a total cost of only 1.49 to run the scan.  What PG version
> is this?  AFAICT we've accounted for subplan costs in scan quals for
> a long time, certainly since 7.4.  Can you put together a self-contained
> test case for this?
>
>
I have found the real bottle-neck, looks like I trusted PG to do a bit
too much magic. Since all table accesses go through views, and the first
11 columns always look the same, I did a "proxy-view" to simplify my
"real-views" to:
SELECT pv.*, ...
The "proxy-view" fetches from t_entities, and so does the "real-views"
to do a filter on "partof". This resulted in two scans on  t_entities
when only one is needed. Skipping the "proxy-view" allows PG to chose
the best plan for every case. I guess I stressed the optimizer a bit too
far :).

Is a self contained test-case for the old way with the "proxy-view" is
still wanted?

--
//Fredrik Olsson
  Treyst AB
  +46-(0)19-362182
  fredrik.olsson@treyst.se


Re: Force another plan.

От
Tom Lane
Дата:
Fredrik Olsson <fredrik.olsson@treyst.se> writes:
> Is a self contained test-case for the old way with the "proxy-view" is
> still wanted?

Yes, something still seems funny there.  And for that matter it wasn't
clear what was wrong with your proxy view, either.

            regards, tom lane