Query plan choice issue

Поиск
Список
Период
Сортировка
От Yaroslav Tykhiy
Тема Query plan choice issue
Дата
Msg-id 2630508E-BB4C-4B1F-9FD8-40E87684A403@barnet.com.au
обсуждение исходный текст
Ответы Re: Query plan choice issue  (Martin Gainty <mgainty@hotmail.com>)
Список pgsql-general
Hi all,

I'm seeing a funny behaviour in Postgresql 8.4.4.  Namely, a query can
be executed using either of two different query plans, one taking a
few milliseconds and the other, tens of seconds.  The work_mem setting
doesn't seem to affect it -- tried to increase or decrease it by 2 or
4 times, but it didn't seem to favour the fast plan choice.  Honestly,
I have no idea what affects the plan choice, but I saw Postgresql
change it at random.

The query in question looks like this -- sorry, it's rather complex:

SELECT message_idnr
FROM dbmail_messages m
JOIN dbmail_physmessage p ON m.physmessage_id=p.id
JOIN dbmail_headervalue v ON v.physmessage_id=p.id
JOIN dbmail_headername n ON v.headername_id=n.id
WHERE mailbox_idnr = 12345 AND status IN (0,1) AND
headername ILIKE 'MESSAGE-ID' AND SUBSTRING(headervalue,0,255)
ILIKE '%<...@mail.gmail.com>%'
ORDER BY message_idnr;

It comes from DBMail.  That said, I don't think DBMail can be blamed
here because at least sometimes Postgresql is able to do the right
thing with respect to that query.

Here is the slow plan:
                                                                               QUERY
  PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort (cost=1234610.11..1234610.11 rows=1 width=8) (actual
time=20933.166..20933.168 rows=1 loops=1)
    Sort Key: m.message_idnr
    Sort Method: quicksort Memory: 25kB
    -> Nested Loop (cost=0.00..1234610.10 rows=1 width=8) (actual
time=3327.658..20933.122 rows=1 loops=1)
          -> Nested Loop (cost=0.00..1234601.59 rows=1 width=24)
(actual time=3327.599..20933.056 rows=1 loops=1)
                -> Nested Loop (cost=0.00..1234549.74 rows=6 width=8)
(actual time=3327.503..20932.944 rows=1 loops=1)
                      Join Filter: (v.headername_id = n.id)
                      -> Seq Scan on dbmail_headername n
(cost=0.00..108.46 rows=1 width=8) (actual time=0.027..0.985 rows=1
loops=1)
                            Filter: ((headername)::text ~~* 'MESSAGE-
ID'::text)
                      -> Seq Scan on dbmail_headervalue v
(cost=0.00..1234407.96 rows=2666 width=16) (actual
time=3327.465..20931.942 rows=1 loops=1)
                            Filter: ("substring"(v.headervalue, 0,
255) ~~* '%<...@mail.gmail.com>%'::text)
                -> Index Scan using dbmail_messages_physmessage_idx on
dbmail_messages m (cost=0.00..8.63 rows=1 width=16) (actual
time=0.088..0.100 rows=1 loops=1)
                      Index Cond: (m.physmessage_id = v.physmessage_id)
                      Filter: ((m.status = ANY ('{0,1}'::integer[]))
AND (m.mailbox_idnr = 12345))
          -> Index Scan using dbmail_physmessage_pkey on
dbmail_physmessage p (cost=0.00..8.50 rows=1 width=8) (actual
time=0.048..0.050 rows=1 loops=1)
                Index Cond: (p.id = m.physmessage_id)
  Total runtime: 20933.241 ms
(17 rows)

And here is the fast plan:
                                                                                   QUERY
  PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort (cost=1175284.93..1175284.94 rows=1 width=8) (actual
time=14.163..14.165 rows=1 loops=1)
    Sort Key: m.message_idnr
    Sort Method: quicksort Memory: 25kB
    -> Nested Loop (cost=0.00..1175284.92 rows=1 width=8) (actual
time=4.272..14.152 rows=1 loops=1)
          -> Nested Loop (cost=0.00..1175276.41 rows=1 width=24)
(actual time=4.261..14.135 rows=1 loops=1)
                -> Nested Loop (cost=0.00..1175268.46 rows=1 width=32)
(actual time=4.249..14.117 rows=1 loops=1)
                      -> Index Scan using dbmail_messages_mailbox_idx
on dbmail_messages m (cost=0.00..4153.35 rows=786 width=16) (actual
time=0.043..2.810 rows=358 loops=1)
                            Index Cond: (mailbox_idnr = 12345)
                            Filter: (status = ANY ('{0,1}'::integer[]))
                      -> Index Scan using dbmail_headervalue_2 on
dbmail_headervalue v (cost=0.00..1489.96 rows=1 width=16) (actual
time=0.028..0.029 rows=0 loops=358)
                            Index Cond: (v.physmessage_id =
m.physmessage_id)
                            Filter: ("substring"(v.headervalue, 0,
255) ~~* '%<...@mail.gmail.com>%'::text)
                -> Index Scan using dbmail_headername_pkey on
dbmail_headername n (cost=0.00..7.94 rows=1 width=8) (actual
time=0.007..0.008 rows=1 loops=1)
                      Index Cond: (n.id = v.headername_id)
                      Filter: ((n.headername)::text ~~* 'MESSAGE-
ID'::text)
          -> Index Scan using dbmail_physmessage_pkey on
dbmail_physmessage p (cost=0.00..8.50 rows=1 width=8) (actual
time=0.006..0.007 rows=1 loops=1)
                Index Cond: (p.id = m.physmessage_id)
  Total runtime: 14.231 ms
(18 rows)

Do you think this query plan choice can be affected by postgresql.conf
settings?  Or shall I file a bug report?

I'll be happy to provide the table and index structure details if
needed.

Thanks!

Yar

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Huge amount of memory errors with libpq
Следующее
От: Martin Gainty
Дата:
Сообщение: Re: Query plan choice issue