Bad Estimate for complex query with JOINS on subselects and OR in where conditions

Поиск
Список
Период
Сортировка
От Peter Grman
Тема Bad Estimate for complex query with JOINS on subselects and OR in where conditions
Дата
Msg-id CACF7Wx0twq1scHAuc3Q7itp727nfYZsj6EWYvFf1zYYjG1nJNQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Bad Estimate for complex query with JOINS on subselects and OR in where conditions  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Hello,

our ORM with tenant separation enabled is creating the following query:

    SELECT *
        FROM "Reservation"."Reservation" AS q
                 LEFT JOIN (
            SELECT q0.*
            FROM "Reservation"."PaymentAccount" AS q0
            WHERE q0."AccountCode" = 'A1'
        ) AS t ON q."PaymentAccountId" = t."Id"
                 INNER JOIN (
            SELECT q1.*
            FROM "Reservation"."NoShowFee" AS q1
            WHERE q1."AccountCode" = 'A1'
        ) AS t0 ON q."NoShowFeeId" = t0."Id"
                 INNER JOIN (
            SELECT q2.*
            FROM "Policy_Fees"."Configuration" AS q2
            WHERE q2."AccountCode" = 'A1'
        ) AS t1 ON t0."OriginalFeeConfigurationId" = t1."Id"
                 INNER JOIN (
            SELECT q3.*
            FROM "Reservation"."CancellationFee" AS q3
            WHERE q3."AccountCode" = 'A1'
        ) AS t2 ON q."CancellationFeeId" = t2."Id"
                 INNER JOIN (
            SELECT q4.*
            FROM "Policy"."CancellationPolicy" AS q4
            WHERE q4."AccountCode" = 'A1'
        ) AS t3 ON t2."OriginalCancellationPolicyId" = t3."Id"
                 INNER JOIN (
            SELECT q5.*
            FROM "Reservation"."Booking" AS q5
            WHERE q5."AccountCode" = 'A1'
        ) AS t4 ON q."BookingId" = t4."Id"
                 INNER JOIN (
            SELECT q6.*
            FROM "Reservation"."Person" AS q6
            WHERE q6."AccountCode" = 'A1'
        ) AS t5 ON t4."BookerId" = t5."Id"
                 INNER JOIN (
            SELECT q7.*
            FROM "Reservation"."Person" AS q7
            WHERE q7."AccountCode" = 'A1'
        ) AS t6 ON q."PrimaryGuestId" = t6."Id"
        WHERE (q."AccountCode" = 'A1') AND (((((t6."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE) OR ((t5."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE)) OR ((q."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE)) OR EXISTS (
                SELECT 1
                FROM "Reservation"."TimeSlice" AS q8
                         LEFT JOIN (
                    SELECT q9.*
                    FROM "Inventory"."Unit" AS q9
                    WHERE q9."AccountCode" = 'A1'
                ) AS t7 ON q8."UnitId" = t7."Id"
                WHERE ((q8."AccountCode" = 'A1') AND ((t7."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE)) AND (q."Id" = q8."ReservationId")))

This is already optimized to really only include the tables that we need (I left out the projection as we fetch almost every column from those tables).

When I tried to analyze the execution plan in http://tatiyants.com/pev/ I saw a label "bad estimate" in one path of the execution plan:

Screenshot_20190816_131944.png

I looked around to see what it might mean, and found https://www.postgresql.org/message-id/CAFh58O_Myr6G3tcH3gcGrF-=OExB08PJdWZcSBcEcovaiPsrHA@mail.gmail.com but there the issues seemed to have been a long running transaction, which we don't have (https://www.postgresql.org/message-id/CAFh58O9ttZpyCM%2BnDb7DMQsi1ca%2BZ1c%3DB50eP2Vf3AQiNZsF9w%40mail.gmail.com)

The is the execution plan created:

Nested Loop  (cost=7.37..6041.66 rows=1 width=2792) (actual time=1.930..528.372 rows=252 loops=1)
  Join Filter: ((q7."SearchVector" @@ to_tsquery('foobar:*'::text)) OR (q6."SearchVector" @@ to_tsquery('foobar:*'::text)) OR (q."SearchVector" @@ to_tsquery('foobar:*'::text)) OR (alternatives: SubPlan 1 or hashed SubPlan 2))
  Rows Removed by Join Filter: 18800
  Buffers: shared hit=622893
  ->  Nested Loop  (cost=6.95..5995.95 rows=1 width=2463) (actual time=0.104..297.051 rows=19052 loops=1)
        Buffers: shared hit=383698
        ->  Nested Loop  (cost=6.53..5992.42 rows=1 width=2134) (actual time=0.091..239.030 rows=19052 loops=1)
              Buffers: shared hit=307469
              ->  Nested Loop Left Join  (cost=6.23..5990.48 rows=1 width=1759) (actual time=0.081..192.608 rows=19052 loops=1)
                    Buffers: shared hit=250274
                    ->  Nested Loop  (cost=5.94..5989.22 rows=1 width=1582) (actual time=0.072..161.734 rows=19052 loops=1)
                          Buffers: shared hit=223445
                          ->  Nested Loop  (cost=5.66..5988.56 rows=2 width=1281) (actual time=0.062..130.338 rows=19052 loops=1)
                                Buffers: shared hit=166289
                                ->  Nested Loop  (cost=5.24..5960.88 rows=15 width=1137) (actual time=0.051..81.762 rows=19052 loops=1)
                                      Buffers: shared hit=90036
                                      ->  Nested Loop  (cost=4.82..5651.40 rows=110 width=620) (actual time=0.035..23.878 rows=19052 loops=1)
                                            Buffers: shared hit=13824
                                            ->  Bitmap Heap Scan on "CancellationPolicy" q4  (cost=4.40..53.38 rows=16 width=245) (actual time=0.018..0.052 rows=16 loops=1)
                                                  Recheck Cond: ("AccountCode" = 'A1'::text)
                                                  Heap Blocks: exact=10
                                                  Buffers: shared hit=12
                                                  ->  Bitmap Index Scan on "IX_CancellationPolicy_AccountCode_EntityId"  (cost=0.00..4.40 rows=16 width=0) (actual time=0.014..0.014 rows=16 loops=1)
                                                        Index Cond: ("AccountCode" = 'A1'::text)
                                                        Buffers: shared hit=2
                                            ->  Index Scan using "IX_CancellationFee_OriginalCancellationPolicyId" on "CancellationFee" q3  (cost=0.42..349.69 rows=19 width=375) (actual time=0.008..1.236 rows=1191 loops=16)
                                                  Index Cond: ("OriginalCancellationPolicyId" = q4."Id")
                                                  Filter: ("AccountCode" = 'A1'::text)
                                                  Buffers: shared hit=13812
                                      ->  Index Scan using "IX_Reservation_CancellationFeeId" on "Reservation" q  (cost=0.42..2.81 rows=1 width=517) (actual time=0.002..0.002 rows=1 loops=19052)
                                            Index Cond: ("CancellationFeeId" = q3."Id")
                                            Filter: ("AccountCode" = 'A1'::text)
                                            Buffers: shared hit=76212
                                ->  Index Scan using "PK_NoShowFee" on "NoShowFee" q1  (cost=0.42..1.85 rows=1 width=144) (actual time=0.002..0.002 rows=1 loops=19052)
                                      Index Cond: ("Id" = q."NoShowFeeId")
                                      Filter: ("AccountCode" = 'A1'::text)
                                      Buffers: shared hit=76253
                          ->  Index Scan using "PK_Configuration" on "Configuration" q2  (cost=0.28..0.33 rows=1 width=301) (actual time=0.001..0.001 rows=1 loops=19052)
                                Index Cond: ("Id" = q1."OriginalFeeConfigurationId")
                                Filter: ("AccountCode" = 'A1'::text)
                                Buffers: shared hit=57156
                    ->  Index Scan using "PK_PaymentAccount" on "PaymentAccount" q0  (cost=0.29..1.26 rows=1 width=177) (actual time=0.001..0.001 rows=0 loops=19052)
                          Index Cond: (q."PaymentAccountId" = "Id")
                          Filter: ("AccountCode" = 'A1'::text)
                          Buffers: shared hit=26829
              ->  Index Scan using "PK_Booking" on "Booking" q5  (cost=0.29..1.94 rows=1 width=375) (actual time=0.002..0.002 rows=1 loops=19052)
                    Index Cond: ("Id" = q."BookingId")
                    Filter: ("AccountCode" = 'A1'::text)
                    Buffers: shared hit=57195
        ->  Index Scan using "PK_Person" on "Person" q6  (cost=0.42..3.53 rows=1 width=329) (actual time=0.002..0.002 rows=1 loops=19052)
              Index Cond: ("Id" = q5."BookerId")
              Filter: ("AccountCode" = 'A1'::text)
              Buffers: shared hit=76229
  ->  Index Scan using "PK_Person" on "Person" q7  (cost=0.42..3.33 rows=1 width=329) (actual time=0.002..0.002 rows=1 loops=19052)
        Index Cond: ("Id" = q."PrimaryGuestId")
        Filter: ("AccountCode" = 'A1'::text)
        Buffers: shared hit=76340
  SubPlan 1
    ->  Nested Loop  (cost=0.71..41.61 rows=1 width=0) (actual time=0.007..0.007 rows=0 loops=18800)
          Buffers: shared hit=162855
          ->  Index Scan using "IX_TimeSlice_ReservationId_RatePlanId_UnitGroupId_FromUtc_T~" on "TimeSlice" q8  (cost=0.42..24.78 rows=1 width=8) (actual time=0.003..0.003 rows=2 loops=18800)
                Index Cond: (q."Id" = "ReservationId")
                Filter: ("AccountCode" = 'A1'::text)
                Buffers: shared hit=81574
          ->  Index Scan using "PK_Unit" on "Unit" q9  (cost=0.29..8.56 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=33918)
                Index Cond: ("Id" = q8."UnitId")
                Filter: (("AccountCode" = 'A1'::text) AND ("SearchVector" @@ to_tsquery('foobar:*'::text)))
                Rows Removed by Filter: 1
                Buffers: shared hit=81281
  SubPlan 2
    ->  Nested Loop  (cost=32.69..187.36 rows=1 width=8) (never executed)
          ->  Bitmap Heap Scan on "Unit" q9_1  (cost=32.26..40.63 rows=1 width=8) (never executed)
                Recheck Cond: ("SearchVector" @@ to_tsquery('foobar:*'::text))
                Filter: ("AccountCode" = 'A1'::text)
                ->  Bitmap Index Scan on "IX_Unit_SearchVector"  (cost=0.00..32.26 rows=2 width=0) (never executed)
                      Index Cond: ("SearchVector" @@ to_tsquery('foobar:*'::text))
          ->  Index Scan using "IX_TimeSlice_UnitId" on "TimeSlice" q8_1  (cost=0.42..146.70 rows=4 width=16) (never executed)
                Index Cond: ("UnitId" = q9_1."Id")
                Filter: ("AccountCode" = 'A1'::text)
Planning Time: 19.838 ms
Execution Time: 528.758 ms

I'm aware that those OR conditions make it worse than it has to be, using AND instead for the where condition like this:

        WHERE (q."AccountCode" = 'A1') AND (((((t6."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE) AND ((t5."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE)) AND ((q."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE)) AND EXISTS (
                SELECT 1
                FROM "Reservation"."TimeSlice" AS q8
                         LEFT JOIN (
                    SELECT q9.*
                    FROM "Inventory"."Unit" AS q9
                    WHERE q9."AccountCode" = 'A1'
                ) AS t7 ON q8."UnitId" = t7."Id"
                WHERE ((q8."AccountCode" = 'A1') AND ((t7."SearchVector" @@ to_tsquery(('foobar' || ':*'))) = TRUE)) AND (q."Id" = q8."ReservationId")))

I get a seemingly similar plan, with all those nested loops, but without the "bad estimate" label and it's supposed to be much faster in execution:

Nested Loop  (cost=263.89..395.88 rows=1 width=2792) (actual time=0.835..0.835 rows=0 loops=1)
  Buffers: shared hit=137
  ->  Nested Loop Semi Join  (cost=263.47..387.09 rows=1 width=2463) (actual time=0.834..0.834 rows=0 loops=1)
        Buffers: shared hit=137
        ->  Nested Loop  (cost=262.75..360.82 rows=1 width=2463) (actual time=0.834..0.834 rows=0 loops=1)
              Buffers: shared hit=137
              ->  Nested Loop  (cost=262.33..357.00 rows=1 width=2134) (actual time=0.763..0.808 rows=3 loops=1)
                    Buffers: shared hit=125
                    ->  Nested Loop  (cost=262.05..356.66 rows=1 width=1889) (actual time=0.751..0.791 rows=3 loops=1)
                          Buffers: shared hit=116
                          ->  Nested Loop  (cost=261.63..348.22 rows=1 width=1514) (actual time=0.739..0.771 rows=3 loops=1)
                                Buffers: shared hit=104
                                ->  Nested Loop  (cost=261.35..347.89 rows=1 width=1213) (actual time=0.721..0.749 rows=3 loops=1)
                                      Buffers: shared hit=95
                                      ->  Nested Loop Left Join  (cost=260.93..339.45 rows=1 width=1069) (actual time=0.709..0.729 rows=3 loops=1)
                                            Buffers: shared hit=83
                                            ->  Nested Loop  (cost=260.64..331.13 rows=1 width=892) (actual time=0.702..0.717 rows=3 loops=1)
                                                  Buffers: shared hit=80
                                                  ->  Bitmap Heap Scan on "Reservation" q  (cost=260.35..314.51 rows=2 width=517) (actual time=0.689..0.695 rows=3 loops=1)
                                                        Recheck Cond: ("SearchVector" @@ to_tsquery('foobar:*'::text))
                                                        Filter: ("AccountCode" = 'A1'::text)
                                                        Rows Removed by Filter: 6
                                                        Heap Blocks: exact=8
                                                        Buffers: shared hit=71
                                                        ->  Bitmap Index Scan on "IX_Reservation_SearchVector"  (cost=0.00..260.35 rows=13 width=0) (actual time=0.677..0.677 rows=13 loops=1)
                                                              Index Cond: ("SearchVector" @@ to_tsquery('foobar:*'::text))
                                                              Buffers: shared hit=63
                                                  ->  Index Scan using "PK_Booking" on "Booking" q5  (cost=0.29..8.31 rows=1 width=375) (actual time=0.006..0.006 rows=1 loops=3)
                                                        Index Cond: ("Id" = q."BookingId")
                                                        Filter: ("AccountCode" = 'A1'::text)
                                                        Buffers: shared hit=9
                                            ->  Index Scan using "PK_PaymentAccount" on "PaymentAccount" q0  (cost=0.29..8.31 rows=1 width=177) (actual time=0.003..0.003 rows=0 loops=3)
                                                  Index Cond: (q."PaymentAccountId" = "Id")
                                                  Filter: ("AccountCode" = 'A1'::text)
                                                  Buffers: shared hit=3
                                      ->  Index Scan using "PK_NoShowFee" on "NoShowFee" q1  (cost=0.42..8.44 rows=1 width=144) (actual time=0.005..0.005 rows=1 loops=3)
                                            Index Cond: ("Id" = q."NoShowFeeId")
                                            Filter: ("AccountCode" = 'A1'::text)
                                            Buffers: shared hit=12
                                ->  Index Scan using "PK_Configuration" on "Configuration" q2  (cost=0.28..0.33 rows=1 width=301) (actual time=0.006..0.006 rows=1 loops=3)
                                      Index Cond: ("Id" = q1."OriginalFeeConfigurationId")
                                      Filter: ("AccountCode" = 'A1'::text)
                                      Buffers: shared hit=9
                          ->  Index Scan using "PK_CancellationFee" on "CancellationFee" q3  (cost=0.42..8.44 rows=1 width=375) (actual time=0.005..0.005 rows=1 loops=3)
                                Index Cond: ("Id" = q."CancellationFeeId")
                                Filter: ("AccountCode" = 'A1'::text)
                                Buffers: shared hit=12
                    ->  Index Scan using "PK_CancellationPolicy" on "CancellationPolicy" q4  (cost=0.28..0.33 rows=1 width=245) (actual time=0.004..0.004 rows=1 loops=3)
                          Index Cond: ("Id" = q3."OriginalCancellationPolicyId")
                          Filter: ("AccountCode" = 'A1'::text)
                          Buffers: shared hit=9
              ->  Index Scan using "PK_Person" on "Person" q6  (cost=0.42..3.78 rows=1 width=329) (actual time=0.008..0.008 rows=0 loops=3)
                    Index Cond: ("Id" = q5."BookerId")
                    Filter: (("AccountCode" = 'A1'::text) AND ("SearchVector" @@ to_tsquery('foobar:*'::text)))
                    Rows Removed by Filter: 1
                    Buffers: shared hit=12
        ->  Nested Loop  (cost=0.71..26.26 rows=1 width=8) (never executed)
              ->  Index Scan using "IX_TimeSlice_ReservationId_RatePlanId_UnitGroupId_FromUtc_T~" on "TimeSlice" q8  (cost=0.42..24.78 rows=1 width=16) (never executed)
                    Index Cond: ("ReservationId" = q."Id")
                    Filter: ("AccountCode" = 'A1'::text)
              ->  Index Scan using "PK_Unit" on "Unit" q9  (cost=0.29..0.89 rows=1 width=8) (never executed)
                    Index Cond: ("Id" = q8."UnitId")
                    Filter: (("AccountCode" = 'A1'::text) AND ("SearchVector" @@ to_tsquery('foobar:*'::text)))
  ->  Index Scan using "PK_Person" on "Person" q7  (cost=0.42..8.69 rows=1 width=329) (never executed)
        Index Cond: ("Id" = q."PrimaryGuestId")
        Filter: (("AccountCode" = 'A1'::text) AND ("SearchVector" @@ to_tsquery('foobar:*'::text)))
Planning Time: 31.910 ms
Execution Time: 1.184 ms

We're working on being able to have only one "SearchVector" column, but until then, I'd like to know how I could help postgres estimating the execution better.

We're using Postgres 11 on RDS with the default settings

Thank you very much!
Peter
Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Charles Clavadetscher
Дата:
Сообщение: Re: Variable constants ?
Следующее
От: stan
Дата:
Сообщение: A 3 table join question