Обсуждение: one column from huge view

От:
Marcin Stępnicki
Дата:

Hello.

I've googled a bit but I think I can't match the keywords, so I thought
I'll ask here:

Let's say I've got a view with 100 columns and 1mln rows; some of them are
calculated "on the fly". For some reason I want only one column from
this view:

select col1 from huge_view;

Now, does PostgreSQL skip all the calculations from other columns and
executes this query faster then select * from huge_view?

--
| And Do What You Will be the challenge | http://apcoln.linuxpl.org
|    So be it in love that harms none   | http://biznes.linux.pl
|   For this is the only commandment.   | http://www.juanperon.info
`---*  JID:  *---' http://www.naszedzieci.org

От:
Heikki Linnakangas
Дата:

Marcin Stępnicki wrote:
> Hello.
>
> I've googled a bit but I think I can't match the keywords, so I thought
> I'll ask here:
>
> Let's say I've got a view with 100 columns and 1mln rows; some of them are
> calculated "on the fly". For some reason I want only one column from
> this view:
>
> select col1 from huge_view;
>
> Now, does PostgreSQL skip all the calculations from other columns and
> executes this query faster then select * from huge_view?

In simple cases, yes. But for example, if you have a LEFT OUTER JOIN in
the view, the join is performed even if your query doesn't return any
columns from the outer relation. Also, if the calculation contains
immutable functions, it's not skipped.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

От:
Michael Fuhr
Дата:

On Thu, Jul 12, 2007 at 09:50:42AM +0100, Heikki Linnakangas wrote:
> Marcin Stępnicki wrote:
> >Let's say I've got a view with 100 columns and 1mln rows; some of them are
> >calculated "on the fly". For some reason I want only one column from
> >this view:
> >
> >select col1 from huge_view;
> >
> >Now, does PostgreSQL skip all the calculations from other columns and
> >executes this query faster then select * from huge_view?
>
> In simple cases, yes. But for example, if you have a LEFT OUTER JOIN in
> the view, the join is performed even if your query doesn't return any
> columns from the outer relation. Also, if the calculation contains
> immutable functions, it's not skipped.

Don't you mean "if the calculation contains VOLATILE functions,
it's not skipped"?

--
Michael Fuhr

От:
Heikki Linnakangas
Дата:

Michael Fuhr wrote:
> On Thu, Jul 12, 2007 at 09:50:42AM +0100, Heikki Linnakangas wrote:
>> Marcin Stępnicki wrote:
>>> Let's say I've got a view with 100 columns and 1mln rows; some of them are
>>> calculated "on the fly". For some reason I want only one column from
>>> this view:
>>>
>>> select col1 from huge_view;
>>>
>>> Now, does PostgreSQL skip all the calculations from other columns and
>>> executes this query faster then select * from huge_view?
>> In simple cases, yes. But for example, if you have a LEFT OUTER JOIN in
>> the view, the join is performed even if your query doesn't return any
>> columns from the outer relation. Also, if the calculation contains
>> immutable functions, it's not skipped.
>
> Don't you mean "if the calculation contains VOLATILE functions,
> it's not skipped"?

Yes, thanks for the correction.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

От:
Tom Lane
Дата:

Heikki Linnakangas <> writes:
> Marcin Stępnicki wrote:
>> Now, does PostgreSQL skip all the calculations from other columns and
>> executes this query faster then select * from huge_view?

> In simple cases, yes.

A rule of thumb is that it's been optimized if you don't see a "Subquery
Scan" node in the plan.  As an example:

regression=# create view v1 as select * from tenk1;
CREATE VIEW
regression=# create view v2 as select *,random() from tenk1;
CREATE VIEW
regression=# explain select unique1 from v1;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=4)
(1 row)

regression=# explain select unique1 from v2;
                            QUERY PLAN
-------------------------------------------------------------------
 Subquery Scan v2  (cost=0.00..583.00 rows=10000 width=4)
   ->  Seq Scan on tenk1  (cost=0.00..483.00 rows=10000 width=244)
(2 rows)

If you want to look closer you can use EXPLAIN VERBOSE and count the
TARGETENTRY nodes in the targetlist for each plan node.  In the above
example, it's possible to see in the EXPLAIN VERBOSE output that the
Seq Scan node in the first plan is computing only the single variable
requested, whereas in the second plan the Seq Scan node is computing
all the outputs of the view (including the random() function call)
and then the Subquery Scan is projecting only a single column from
that result.

            regards, tom lane