Re: Planner reluctant to start from subquery

From: Kevin Grittner
Subject: Re: Planner reluctant to start from subquery
Msg-id: 43E0C8F5.EE98.0025.0@wicourts.gov
In reply to: Re: Planner reluctant to start from subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Planner reluctant to start from subquery  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Planner reluctant to start from subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance
>>> On Wed, Feb 1, 2006 at  2:36 pm, in message
<4359.1138826175@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm interested to poke at this ... are you in a position to provide a
>>> test case?
>
>> I can't supply the original data, since many of the tables have
>> millions of rows, with some of the data (related to juvenile, paternity,
>> sealed, and expunged cases) protected by law.  I could try to put
>> together a self-contained example, but I'm not sure the best way to do
>> that, since the table sizes and value distributions may be significant
>> here.  Any thoughts on that?
>
> I think that the only aspect of the data that really matters here is the
> number of distinct values, which would affect decisions about whether
> HashAggregate is appropriate or not.  And you could probably get the
> same thing to happen with at most a few tens of thousands of rows.
>
> Also, all we need to worry about is the columns used in the WHERE/JOIN
> conditions, which looks to be mostly case numbers, dates, and county
> identification ... how much confidential info is there in that?  At
> worst you could translate the case numbers to some randomly generated
> identifiers.

OK, I could probably obliterate names, addresses, etc. in a copy of the
data (those aren't significant to the query anyway) and provide a test
case.  However, I just found another clue.
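
For reference, the scrubbing discussed above (dropping names and addresses,
and translating case numbers to randomly generated identifiers) could be
sketched roughly as follows.  This is only an illustration under stated
assumptions: the column names (case_no, county_no, name, address,
event_date) and the helper names are hypothetical, not taken from the real
schema.  The point is that each distinct case number maps to exactly one
random token, so the number of distinct values and the join relationships
between tables are preserved.

```python
import secrets


def make_pseudonymizer(token_bytes=8):
    """Return a function that maps each distinct value to one stable random token."""
    mapping = {}   # real value -> token
    used = set()   # tokens already handed out, to keep the mapping one-to-one

    def pseudonymize(value):
        if value is None:
            return None  # keep NULLs as they are; they affect joins and filters
        if value not in mapping:
            token = secrets.token_hex(token_bytes)
            while token in used:  # a collision is very unlikely, but stay one-to-one
                token = secrets.token_hex(token_bytes)
            used.add(token)
            mapping[value] = token
        return mapping[value]

    return pseudonymize


def scrub_rows(rows, case_col, drop_cols, pseudonymize):
    """Copy rows, dropping confidential columns and replacing the case number."""
    scrubbed = []
    for row in rows:
        clean = {k: v for k, v in row.items() if k not in drop_cols}
        clean[case_col] = pseudonymize(row[case_col])
        scrubbed.append(clean)
    return scrubbed


if __name__ == "__main__":
    # Hypothetical sample rows; the real tables and columns are not shown here.
    cases = [
        {"case_no": "A1", "county_no": 13, "name": "x", "address": "y"},
        {"case_no": "A2", "county_no": 13, "name": "z", "address": "w"},
    ]
    events = [{"case_no": "A1", "event_date": "2006-02-01"}]

    pseudo = make_pseudonymizer()  # share one mapping across all tables
    print(scrub_rows(cases, "case_no", {"name", "address"}, pseudo))
    print(scrub_rows(events, "case_no", set(), pseudo))
```

Using one shared mapping for every table is what keeps the WHERE/JOIN
columns consistent after the translation; dates and county identifiers
are left untouched in this sketch.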

Since you were so confident it couldn't be the outer join, I went
looking for what else I changed at the same time.  I eliminated the code
referencing that table, which contained an OR.  I've seen ORs cause
nasty problems with optimizers in the past.  I took out the OR in the
WHERE clause, without eliminating that last outer join, and it optimized
fine.

I'll hold off a bit to see if you still need the test case.   ;-)

-Kevin

