Re: Processing long AND/OR lists

Поиск
Список
Период
Сортировка
От Gurjeet Singh
Тема Re: Processing long AND/OR lists
Дата
Msg-id CABwTF4X0oemqP7VN_is_ReYsC6h_RsVxf4Pexb=-yRgk6Hj+HQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Processing long AND/OR lists  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Sun, May 26, 2013 at 11:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Josh Berkus <josh@agliodbs.com> writes:
> ***15,000***?  I'd say that someone has an application design issue.
> Fixing the stack overflow is a good thing, but that query is never going
> to return ...

Just for the record, it does finish in 5 sec on my laptop. But yes, point taken.
 

Yeah, the parser's stack consumption seems like only the tip of the
iceberg here.

True. After getting past the parser, the bigger lists consume exponentially longer times around process_equivalence(). And BTW, a backend stuck in that stack does not respond to pg_cancel/terminate_backend()! Should there be a CHECK_FOR_INTERRUPT() in process_equivalence(), perhaps invoked only every N calls of that function.
 

I find it hard to visualize a use-case for so many AND'ed conditions
anyway.  I could believe a lot of OR'd conditions, but very likely it
would be a list that could and should be converted to an IN condition.

As noted earlier, this was seen in real-world, at a customer setup, generated by Slony, a highly regarded community project. Also, the patch addresses the stack limit for long OR clauses as well.

Anytime such conditions are generated programmatically, there's always a possibility that the programmer underestimated the amount of data they are generating the query for; just like it happened in Slony's case I quoted.

Slony has compress_actionseq() function to scan a list of numbers, and generate a smallest possible WHERE clause. If it sees consecutive numbers that form a range, it turns that range into a BETWEEN clause, and the numbers that don't seem to be in a range are added to the where clause as 'AND col <> n1 AND col <> n2 ...'. It's quite likely that it generates such long lists because it wrongly assumes that the input list is ordered, or at least mostly ordered.

Agreed that that function can/should be fixed to do a better job of sorting these numbers before scanning, and also using the NOT IN construct. But the point remains that Postgres should be capable of handling this simple construct efficiently.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Christopher Browne
Дата:
Сообщение: Re: Planning incompatibilities for Postgres 10.0
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: Planning incompatibilities for Postgres 10.0