Re: Performance penalty when using WITH

Поиск
Список
Период
Сортировка
От Robert Klemme
Тема Re: Performance penalty when using WITH
Дата
Msg-id CAM9pMnNpZ4Sq5okZYGJdojki16a1nsLXXr-UuFDWJT5UfAoAHQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Performance penalty when using WITH  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-performance
On Wed, Aug 3, 2011 at 6:24 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Aug 3, 2011 at 2:18 AM, Robert Klemme
> <shortcutter@googlemail.com> wrote:
>>>> Another observation: That criterion looks suspicious to me. I would
>>>> expect any RDBMS to be better able to optimize this:
>>>>
>>>> WHERE firstname = 'Eddie' AND lastname like 'T%'
>>>>
>>>> I know it's semantically not the same but I would assume this is good
>>>> enough for the common usecase.  Plus, if there is an index on
>>>> (firstname, lastname) then that could be used.
>>>
>>> disagree. just one of the ways that could be stymied would to change
>>> the function behind the '||' operator.
>>
>> I don't understand what you mean.  Can you please elaborate?
>>
>> To explain my point a bit: I meant that by querying individual fields
>> separately instead of applying a criterion on a function of the two
>> the RDBMS has a better chance to use indexes and come up with a better
>> plan for this part of the query.
>
> Yes, but your assuming that it is safe and generally advantageous to
> do that.  Both assumptions I think are false.

I am not sure why you say I assume this is _safe_.  I said it is "good
enough for the common usecase".  And it is certainly good enough for
this particular query.

As for the "generally advantageous" I'd say that an index on "raw"
column values is usually useful for more queries than an index on a
specific function.  That's why I'd say generally an index on column
values is more versatile and I would prefer it.  Of course you might
achieve orders of magnitude of speedup for individual queries with an
index on a function tailored to that particular query but if you need
to do that for multiple queries you pay a higher penalty for updates.

> The || operator is trivially hacked:
> create or replace function funky_concat(l text, r text) returns text as
> $$
>  select textcat(textcat($1, 'abc'), $2);
> $$ language sql immutable ;
>
> update pg_operator set oprcode = 'funky_concat' where oid = 654;
>
> postgres=# select 'a' || 'b';
> ?column?
> ----------
>  aabcb
> (1 row)
>
> Also even ignoring the above it's not free to have the database try
> and analyze every instance of the || operator to see if it can be
> decomposed to boolean field operations.

Even with your hacked operator you would need an index on the
expression to make it efficient.  That could be done with the original
|| as well.  But my point was to query

WHERE a = 'foo' and b like 'b%'
instead of WHERE a || ' ' || substring(b, 1, 1) = 'foo b'

to use an index on (a,b).  That index would also be useful for queries like

WHERE a = 'foo'
WHERE a like 'fo%'
WHERE a = 'foo' and b = 'bar'

and probably also

WHERE a > 'foo'
WHERE a > 'foo' and b like 'b%'
WHERE a > 'foo' and b = 'bar'

Kind regards

robert


PS: Sorry for the earlier duplicate.  Gmail had a hickup.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: Postgres performance on Linux and Windows
Следующее
От: Jayadevan M
Дата:
Сообщение: Re: Parameters for PostgreSQL