Обсуждение: View Index and UNION

Поиск
Список
Период
Сортировка

View Index and UNION

От
Stefan Keller
Дата:
Hi

I've encountered a fundamental problem which - to me - can only be
solved with an (future/possible) real index on views in PostgreSQL
(like the exist already in MS SQL Server and Ora):

Given following schema:

1. TABLE a and TABLE b, each with INDEX on attribute geom.

2. A VIEW with union:

CREATE VIEW myview AS SELECT * FROM a UNION SELECT * FROM b;

3. And a simple query with KNN index and a coordinate "mypos" :

SELECT * FROM myview
ORDER BY ST_Geomfromtext(mypos) <-> myview.geom

Now, the problem is, that for the "order by" it is not enough that
each on the two tables calculate the ordering separately: We want a
total ordering over all involved tables!

In fact, the planner realizes that and chooses a seq scan over all
tuples of table a and b - which is slow and suboptimal!

To me, that's a use case where we would wish to have a distinct index on views.

Any opinions on this?

Yours, Stefan



Re: View Index and UNION

От
William King
Дата:
Could this scenario not be handled by a step that orders the two tables
independently, then for the view interleaves the presorted results?
Merging two sorted sets into a single sorted set is usually a trivial
task, and it could still take advantage of the existing indexes.

William King
Senior Engineer
Quentus Technologies, INC
1037 NE 65th St Suite 273
Seattle, WA 98115
Main:   (877) 211-9337
Office: (206) 388-4772
Cell:   (253) 686-5518
william.king@quentustech.com

On 05/25/2013 05:35 PM, Stefan Keller wrote:
> Hi
> 
> I've encountered a fundamental problem which - to me - can only be
> solved with an (future/possible) real index on views in PostgreSQL
> (like the exist already in MS SQL Server and Ora):
> 
> Given following schema:
> 
> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
> 
> 2. A VIEW with union:
> 
> CREATE VIEW myview AS
>   SELECT * FROM a
>   UNION
>   SELECT * FROM b;
> 
> 3. And a simple query with KNN index and a coordinate "mypos" :
> 
> SELECT * FROM myview
> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
> 
> Now, the problem is, that for the "order by" it is not enough that
> each on the two tables calculate the ordering separately: We want a
> total ordering over all involved tables!
> 
> In fact, the planner realizes that and chooses a seq scan over all
> tuples of table a and b - which is slow and suboptimal!
> 
> To me, that's a use case where we would wish to have a distinct index on views.
> 
> Any opinions on this?
> 
> Yours, Stefan
> 
> 



Re: View Index and UNION

От
Stefan Keller
Дата:
Yes, it actually does, but the planner chooses a seq scan to prepare for that.

-S.

2013/5/26 William King <william.king@quentustech.com>:
> Could this scenario not be handled by a step that orders the two tables
> independently, then for the view interleaves the presorted results?
> Merging two sorted sets into a single sorted set is usually a trivial
> task, and it could still take advantage of the existing indexes.
>
> William King
> Senior Engineer
> Quentus Technologies, INC
> 1037 NE 65th St Suite 273
> Seattle, WA 98115
> Main:   (877) 211-9337
> Office: (206) 388-4772
> Cell:   (253) 686-5518
> william.king@quentustech.com
>
> On 05/25/2013 05:35 PM, Stefan Keller wrote:
>> Hi
>>
>> I've encountered a fundamental problem which - to me - can only be
>> solved with an (future/possible) real index on views in PostgreSQL
>> (like the exist already in MS SQL Server and Ora):
>>
>> Given following schema:
>>
>> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
>>
>> 2. A VIEW with union:
>>
>> CREATE VIEW myview AS
>>   SELECT * FROM a
>>   UNION
>>   SELECT * FROM b;
>>
>> 3. And a simple query with KNN index and a coordinate "mypos" :
>>
>> SELECT * FROM myview
>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
>>
>> Now, the problem is, that for the "order by" it is not enough that
>> each on the two tables calculate the ordering separately: We want a
>> total ordering over all involved tables!
>>
>> In fact, the planner realizes that and chooses a seq scan over all
>> tuples of table a and b - which is slow and suboptimal!
>>
>> To me, that's a use case where we would wish to have a distinct index on views.
>>
>> Any opinions on this?
>>
>> Yours, Stefan
>>
>>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: View Index and UNION

От
William King
Дата:
I appear to have been able to replicate what you are talking about, but
it required explicitly binding the order by in different ways. See
attached files.

William King
Senior Engineer
Quentus Technologies, INC
1037 NE 65th St Suite 273
Seattle, WA 98115
Main:   (877) 211-9337
Office: (206) 388-4772
Cell:   (253) 686-5518
william.king@quentustech.com

On 05/26/2013 02:22 AM, Stefan Keller wrote:
> Yes, it actually does, but the planner chooses a seq scan to prepare for that.
>
> -S.
>
> 2013/5/26 William King <william.king@quentustech.com>:
>> Could this scenario not be handled by a step that orders the two tables
>> independently, then for the view interleaves the presorted results?
>> Merging two sorted sets into a single sorted set is usually a trivial
>> task, and it could still take advantage of the existing indexes.
>>
>> William King
>> Senior Engineer
>> Quentus Technologies, INC
>> 1037 NE 65th St Suite 273
>> Seattle, WA 98115
>> Main:   (877) 211-9337
>> Office: (206) 388-4772
>> Cell:   (253) 686-5518
>> william.king@quentustech.com
>>
>> On 05/25/2013 05:35 PM, Stefan Keller wrote:
>>> Hi
>>>
>>> I've encountered a fundamental problem which - to me - can only be
>>> solved with an (future/possible) real index on views in PostgreSQL
>>> (like the exist already in MS SQL Server and Ora):
>>>
>>> Given following schema:
>>>
>>> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
>>>
>>> 2. A VIEW with union:
>>>
>>> CREATE VIEW myview AS
>>>   SELECT * FROM a
>>>   UNION
>>>   SELECT * FROM b;
>>>
>>> 3. And a simple query with KNN index and a coordinate "mypos" :
>>>
>>> SELECT * FROM myview
>>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
>>>
>>> Now, the problem is, that for the "order by" it is not enough that
>>> each on the two tables calculate the ordering separately: We want a
>>> total ordering over all involved tables!
>>>
>>> In fact, the planner realizes that and chooses a seq scan over all
>>> tuples of table a and b - which is slow and suboptimal!
>>>
>>> To me, that's a use case where we would wish to have a distinct index on views.
>>>
>>> Any opinions on this?
>>>
>>> Yours, Stefan
>>>
>>>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: View Index and UNION

От
Tom Lane
Дата:
Stefan Keller <sfkeller@gmail.com> writes:
> Given following schema:

> 1. TABLE a and TABLE b, each with INDEX on attribute geom.

> 2. A VIEW with union:

> CREATE VIEW myview AS
>   SELECT * FROM a
>   UNION
>   SELECT * FROM b;

> 3. And a simple query with KNN index and a coordinate "mypos" :

> SELECT * FROM myview
> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom

I think this would work out-of-the-box in 9.1 or later, if you
made the view use UNION ALL instead of UNION.
        regards, tom lane



Re: View Index and UNION

От
Stefan Keller
Дата:
Hi Tom

You are right: UNION ALL is correct in terms of contents (tables
contents are disjunct) and of performance (no separate sort required
theoretically).
In my specific case even with UNION ALL the planner still chose a "Seq Scan".
Note that there is a KNN index with "ORDER BY ... <-> ..." involved.
I have to dig into my tests in order to give you the EXPLAIN ANALYZE.

Yours, Stefan


2013/5/26 Tom Lane <tgl@sss.pgh.pa.us>:
> Stefan Keller <sfkeller@gmail.com> writes:
>> Given following schema:
>
>> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
>
>> 2. A VIEW with union:
>
>> CREATE VIEW myview AS
>>   SELECT * FROM a
>>   UNION
>>   SELECT * FROM b;
>
>> 3. And a simple query with KNN index and a coordinate "mypos" :
>
>> SELECT * FROM myview
>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
>
> I think this would work out-of-the-box in 9.1 or later, if you
> made the view use UNION ALL instead of UNION.
>
>                         regards, tom lane