Обсуждение: How to refer to computed columns from other computed columns?

Поиск
Список
Период
Сортировка

How to refer to computed columns from other computed columns?

От
Matthew Wilson
Дата:
I'm converting some procedural code to SQL as an experiment.  Here's the
pseudocode:

    c = a - b
    if c < 0 then d = 'no'
    else d = 'yes'

In SQL, I've got this:

    select a, b, a - b as c,
    case when a - b < 0 then 'no'
    else 'yes'
    end as d

    from foo;

This is a trivial example, but you can see how I calculate a - b two
separate times.

In reality, I have much nastier calculations and they happen more than
just twice.

I'm looking for an elegant solution for this puzzle.  I don't want to
repeat that a - b part over and over because I likely will need to
change how c gets defined and I don't want to have to change more than
one place in the code.

All I can come up with so far is to use a view and then another view on
top of that one:

    create view v1 as
    select a, b, a - b as c
    from foo;

    create view v2 as
    select a, b, c,
    case when c < 0 then 'no'
    else 'yes'
    end as d
    from v1;

This is better than the first solution because c is only defined in a
single place.  Is this the best possible solution?

Thanks for the help.

Matt

Re: How to refer to computed columns from other computed columns?

От
Tom Lane
Дата:
Matthew Wilson <matt@tplus1.com> writes:
> All I can come up with so far is to use a view and then another view on
> top of that one:

Note that you don't actually need a view, as you can just write the
subselect in-line:

     select a, b, c,
     case when c < 0 then 'no'
     else 'yes'
     end as d
     from (select a, b, a - b as c from foo) as v1;

This is the standard method for avoiding repeat calculations in SQL.

One thing to keep in mind is that the planner will usually try to
"flatten" a nested sub-select (and whether it was written out manually
or pulled from a view does not matter here).  This will result in the
sub-select's expressions getting inlined into the parent, so that the
calculations will actually get done more than once.  If you're trying
to reduce execution time not just manual labor, you may want to put an
"offset 0" into the sub-select to create an optimization fence.  But
test whether that really saves anything --- if there are bigger joins
or additional WHERE conditions involved, you can easily lose more than
you gain by preventing flattening.

            regards, tom lane

Re: How to refer to computed columns from other computed columns?

От
Eric Ndengang
Дата:
Am 16.08.2010 14:45, schrieb Matthew Wilson:
> I'm converting some procedural code to SQL as an experiment.  Here's the
> pseudocode:
>
>      c = a - b
>      if c<  0 then d = 'no'
>      else d = 'yes'
>
> In SQL, I've got this:
>
>      select a, b, a - b as c,
>      case when a - b<  0 then 'no'
>      else 'yes'
>      end as d
>
>      from foo;
>
> This is a trivial example, but you can see how I calculate a - b two
> separate times.
>
> In reality, I have much nastier calculations and they happen more than
> just twice.
>
> I'm looking for an elegant solution for this puzzle.  I don't want to
> repeat that a - b part over and over because I likely will need to
> change how c gets defined and I don't want to have to change more than
> one place in the code.
>
> All I can come up with so far is to use a view and then another view on
> top of that one:
>
>      create view v1 as
>      select a, b, a - b as c
>      from foo;
>
>      create view v2 as
>      select a, b, c,
>      case when c<  0 then 'no'
>      else 'yes'
>      end as d
>      from v1;
>
> This is better than the first solution because c is only defined in a
> single place.  Is this the best possible solution?
>
> Thanks for the help.
>
> Matt
>
>
>
You can also use the ' with Queries ' option to solve this Problem like
this:

with table_1 as (select a,b, a-b as c from foo)
Select a,b, c,
case when c<0 then 'no'
          else 'yes' end as d
          from table_1;
I hope , it will help you

--
Eric Ndengang
Junior Datenbankentwickler

Affinitas GmbH  |  Kohlfurter Straße 41/43  |  10999 Berlin  |  Germany
email: eric.ndengang_foyet@affinitas.de  | tel: +49.(0)30. 991 949 5 0  |  www.edarling.de

Geschäftsführer: Lukas Brosseder, David Khalil, Kai Rieke, Christian Vollmann
Eingetragen beim Amtsgericht Berlin, HRB 115958


Re: How to refer to computed columns from other computed columns?

От
Matthew Wilson
Дата:
On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
> Matthew Wilson <matt@tplus1.com> writes:
>> All I can come up with so far is to use a view and then another view on
>> top of that one:
>
> Note that you don't actually need a view, as you can just write the
> subselect in-line:
>
>      select a, b, c,
>      case when c < 0 then 'no'
>      else 'yes'
>      end as d
>      from (select a, b, a - b as c from foo) as v1;
>
> This is the standard method for avoiding repeat calculations in SQL.
>
> One thing to keep in mind is that the planner will usually try to
> "flatten" a nested sub-select (and whether it was written out manually
> or pulled from a view does not matter here).  This will result in the
> sub-select's expressions getting inlined into the parent, so that the
> calculations will actually get done more than once.  If you're trying
> to reduce execution time not just manual labor, you may want to put an
> "offset 0" into the sub-select to create an optimization fence.  But
> test whether that really saves anything --- if there are bigger joins
> or additional WHERE conditions involved, you can easily lose more than
> you gain by preventing flattening.
>
>             regards, tom lane
>

Thanks so much for the help!

I don't care if the code is rearranged so that c is replaced with an
inline definition during compilation.  I'm not concerned about
efficiency here.  I just don't want to have to redefine it manually over
and over again, because I know that as I update how c is defined, I'll
forget to update it everywhere.

Maybe sql needs a preprocessing macro language like C.

<ducks>

Matt

Re: How to refer to computed columns from other computed columns?

От
Boszormenyi Zoltan
Дата:
Matthew Wilson írta:
> On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
>
>> Matthew Wilson <matt@tplus1.com> writes:
>>
>>> All I can come up with so far is to use a view and then another view on
>>> top of that one:
>>>
>> Note that you don't actually need a view, as you can just write the
>> subselect in-line:
>>
>>      select a, b, c,
>>      case when c < 0 then 'no'
>>      else 'yes'
>>      end as d
>>      from (select a, b, a - b as c from foo) as v1;
>>
>> This is the standard method for avoiding repeat calculations in SQL.
>>
>> One thing to keep in mind is that the planner will usually try to
>> "flatten" a nested sub-select (and whether it was written out manually
>> or pulled from a view does not matter here).  This will result in the
>> sub-select's expressions getting inlined into the parent, so that the
>> calculations will actually get done more than once.  If you're trying
>> to reduce execution time not just manual labor, you may want to put an
>> "offset 0" into the sub-select to create an optimization fence.  But
>> test whether that really saves anything --- if there are bigger joins
>> or additional WHERE conditions involved, you can easily lose more than
>> you gain by preventing flattening.
>>
>>             regards, tom lane
>>
>>
>
> Thanks so much for the help!
>
> I don't care if the code is rearranged so that c is replaced with an
> inline definition during compilation.  I'm not concerned about
> efficiency here.  I just don't want to have to redefine it manually over
> and over again, because I know that as I update how c is defined, I'll
> forget to update it everywhere.
>
> Maybe sql needs a preprocessing macro language like C.
>

Or maybe we can dust off my GENERATED column patch
I posted here in 2006. :-)

Best regards,
Zoltán Böszörményi


Re: How to refer to computed columns from other computed columns?

От
Thom Brown
Дата:
2010/8/16 Boszormenyi Zoltan <zb@cybertec.at>:
> Matthew Wilson írta:
>> On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
>>
>>> Matthew Wilson <matt@tplus1.com> writes:
>>>
>>>> All I can come up with so far is to use a view and then another view on
>>>> top of that one:
>>>>
>>> Note that you don't actually need a view, as you can just write the
>>> subselect in-line:
>>>
>>>      select a, b, c,
>>>      case when c < 0 then 'no'
>>>      else 'yes'
>>>      end as d
>>>      from (select a, b, a - b as c from foo) as v1;
>>>
>>> This is the standard method for avoiding repeat calculations in SQL.
>>>
>>> One thing to keep in mind is that the planner will usually try to
>>> "flatten" a nested sub-select (and whether it was written out manually
>>> or pulled from a view does not matter here).  This will result in the
>>> sub-select's expressions getting inlined into the parent, so that the
>>> calculations will actually get done more than once.  If you're trying
>>> to reduce execution time not just manual labor, you may want to put an
>>> "offset 0" into the sub-select to create an optimization fence.  But
>>> test whether that really saves anything --- if there are bigger joins
>>> or additional WHERE conditions involved, you can easily lose more than
>>> you gain by preventing flattening.
>>>
>>>                      regards, tom lane
>>>
>>>
>>
>> Thanks so much for the help!
>>
>> I don't care if the code is rearranged so that c is replaced with an
>> inline definition during compilation.  I'm not concerned about
>> efficiency here.  I just don't want to have to redefine it manually over
>> and over again, because I know that as I update how c is defined, I'll
>> forget to update it everywhere.
>>
>> Maybe sql needs a preprocessing macro language like C.
>>
>
> Or maybe we can dust off my GENERATED column patch
> I posted here in 2006. :-)
>
> Best regards,
> Zoltán Böszörményi
>

You mean this?:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php
And this?: http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php
And this?: http://archives.postgresql.org/pgsql-patches/2007-04/msg00107.php

--
Thom Brown
Registered Linux user: #516935

Re: How to refer to computed columns from other computed columns?

От
Alvaro Herrera
Дата:
Excerpts from Boszormenyi Zoltan's message of lun ago 16 14:45:07 -0400 2010:
> Matthew Wilson írta:

> > I don't care if the code is rearranged so that c is replaced with an
> > inline definition during compilation.  I'm not concerned about
> > efficiency here.  I just don't want to have to redefine it manually over
> > and over again, because I know that as I update how c is defined, I'll
> > forget to update it everywhere.
> >
> > Maybe sql needs a preprocessing macro language like C.
> >
>
> Or maybe we can dust off my GENERATED column patch
> I posted here in 2006. :-)

Hmm, that seems entirely unrelated ...

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: How to refer to computed columns from other computed columns?

От
Boszormenyi Zoltan
Дата:
Alvaro Herrera írta:
> Excerpts from Boszormenyi Zoltan's message of lun ago 16 14:45:07 -0400 2010:
>
>> Matthew Wilson írta:
>>
>
>
>>> I don't care if the code is rearranged so that c is replaced with an
>>> inline definition during compilation.  I'm not concerned about
>>> efficiency here.  I just don't want to have to redefine it manually over
>>> and over again, because I know that as I update how c is defined, I'll
>>> forget to update it everywhere.
>>>
>>> Maybe sql needs a preprocessing macro language like C.
>>>
>>>
>> Or maybe we can dust off my GENERATED column patch
>> I posted here in 2006. :-)
>>
>
> Hmm, that seems entirely unrelated ...
>

What makes you think so? A generated column would put
the work into INSERT and UPDATE statements, SELECTs
would be faster and this way re-typing the same expression
would be avoided. The generated column's definition is defined
at one central place, with the type modifier on such a column in
CREATE or ALTER TABLE , so the problem of the OP
would be also solved.

There was only one drawback, as Tom Lane pointed out a while back,
but this was explicitely covered by the SQL standard at the time,
it said that before triggers cannot look at the content of the generated
columns.

And with HOT and no indexes on the generated column, most
of the bloat would also be avoided that comes from the extra
internal UPDATE that such a column would introduce.