Обсуждение: How to write a function that manipulates a set of results
Hi I'm passing this on second hand so forgive me if I'm a bit vague... My boss is trying to write a function in PL/pgsql that manipulates a small, temporary set of data. It fetches about 20 rows from a table of approx 4 million, then it needs to do further calculations to narrow them down further. Previously we have only written database code for SQL Server, and there we would use a table variable. Apparently temporary tables are not equivalent and not suitable for this. The closest I've come from reading the docs is functions that return SETOF. However I can't see any way to query the result set like a table. Can anyone offer any hints/links? Thanks Ashley
Ashley Moran <work@ashleymoran.me.uk> schrieb: > The closest I've come from reading the docs is functions that return > SETOF. However I can't see any way to query the result set like a table. Right. > Can anyone offer any hints/links? select * from <insert_your_srf-function_here>. For instance, if your SRF-function is called foo(), do: select * from foo(); Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Mar 14, 2007, at 6:02 pm, Andreas Kretschmer wrote: > select * from <insert_your_srf-function_here>. > > For instance, if your SRF-function is called foo(), do: > > select * from foo(); Andreas, It's more complicated than that. What we need to do is something along the lines of: results = SELECT * FROM foo(); DELETE FROM results WHERE (some condition involving results); some_value = SELECT value FROM results WHERE (etc); and so on... All of which is easy with table variable, but I can't see how to translate it to PL/pgsql. Is there any way to manipulate result sets in a set-based manner like this? Ashley
Ashley Moran <work@ashleymoran.me.uk> schrieb: > > On Mar 14, 2007, at 6:02 pm, Andreas Kretschmer wrote: > >select * from <insert_your_srf-function_here>. > >For instance, if your SRF-function is called foo(), do: > >select * from foo(); > > > Andreas, > > It's more complicated than that. What we need to do is something along > the lines of: > > results = SELECT * FROM foo(); Try something like: create results as SELECT * FROM foo(); Than you can do: > DELETE FROM results WHERE (some condition involving results); > some_value = SELECT value FROM results WHERE (etc); Try: create table some_value as SELECT value FROM results WHERE (etc); > > and so on... Yes, and so on... > > All of which is easy with table variable, but I can't see how to translate > it to PL/pgsql. Is there any way to manipulate result sets in a set-based > manner like this? You can do a lot in pl/pgsql. For some things, for instance dynamically created SQLs, you need EXECUTE. But you can do a lot with pl/pgsql. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
Ashley Moran <work@ashleymoran.me.uk> writes:
> ... All of which is easy with table variable, but I can't see how to
> translate it to PL/pgsql.  Is there any way to manipulate result sets
> in a set-based manner like this?
Sure: use a table.  What was the objection to temp tables exactly?
            regards, tom lane
			
		Ashley Moran wrote: > It fetches about 20 rows from a table of approx 4 million, then it > needs to do further calculations to narrow them down further. > Previously we have only written database code for SQL Server, and > there we would use a table variable. Apparently temporary tables are > not equivalent and not suitable for this. > > Seems like a temp table with a insert into would work for you. insert into mytemptable (field1,field2) select field1,field2 from sometable where field1 = 5; Then you can query the new temp table anyway you would like while you are still in the function. -- Tony Caduto AM Software Design http://www.amsoftwaredesign.com Home of PG Lightning Admin for Postgresql Your best bet for Postgresql Administration
On Mar 14, 2007, at 7:25 pm, Tom Lane wrote: > Sure: use a table. What was the objection to temp tables exactly? My boss is under the impression that calling the procedure twice on the same connection creates a name conflict for the temporary table. Also I'm unsure how the use of temporary tables will affect the running of the query. This function will be called up to 400 times by a wrapping function that needs to complete quickly (relatively - the calculations need to return in a few seconds). I haven't had chance to read much about PG temp tables yet, so I don't know how the implementation works. Are small tables stored in RAM, or does creating a temporary table always force a disk write? Sorry my sheer ignorance here. I have already written a working, fully tested - albeit unoptimised - implementation of this algorithm in Ruby, accessible as a basic (HTTP GET) web service. However, a developer here believes that the overhead of an HTTP connection over our gigabit LAN will add too much to the (> 1 second) running time of the calculations. This means that for political reasons we are forced to rewrite it as a stored procedure that we can call directly from application code. The project is urgent, the aforementioned developer is now on holiday, and so my non-developer boss is scrabbling to learn PL/pgsql and I am forced to bombard the list with inane questions. Thanks for your patience :) Ashley
> Sorry my sheer ignorance here.  I have already written a working, fully
> tested - albeit unoptimised - implementation of this algorithm in Ruby,
> accessible as a basic (HTTP GET) web service.  However, a developer here
> believes that the overhead of an HTTP connection over our gigabit LAN
> will add too much to the (> 1 second) running time of the calculations.
> This means that for political reasons we are forced to rewrite it as a
> stored procedure that we can call directly from application code.  The
> project is urgent, the aforementioned developer is now on holiday, and
> so my non-developer boss is scrabbling to learn PL/pgsql and I am forced
> to bombard the list with inane questions.  Thanks for your patience :)
Use plperl :) (seriously). Or plruby, there is one.
Joshua D. Drake
>
> Ashley
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>
--
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
			
		Ashley Moran wrote: > > On Mar 14, 2007, at 7:25 pm, Tom Lane wrote: > >Sure: use a table. What was the objection to temp tables exactly? > > My boss is under the impression that calling the procedure twice on > the same connection creates a name conflict for the temporary table. > > Also I'm unsure how the use of temporary tables will affect the > running of the query. This function will be called up to 400 times > by a wrapping function that needs to complete quickly (relatively - > the calculations need to return in a few seconds). I haven't had > chance to read much about PG temp tables yet, so I don't know how the > implementation works. Are small tables stored in RAM, or does > creating a temporary table always force a disk write? Temp tables stay in RAM until they are bigger than temp_buffers. If you need them to be big and quick, maybe it would be appropriate to use indexes (note these count towards temp_buffers), ANALYZE, etc. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Ashley Moran wrote:
>> Also I'm unsure how the use of temporary tables will affect the
>> running of the query.  This function will be called up to 400 times
>> by a wrapping function that needs to complete quickly (relatively -
>> the calculations need to return in a few seconds).  I haven't had
>> chance to read much about PG temp tables yet, so I don't know how the
>> implementation works.  Are small tables stored in RAM, or does
>> creating a temporary table always force a disk write?
> Temp tables stay in RAM until they are bigger than temp_buffers.  If you
> need them to be big and quick, maybe it would be appropriate to use
> indexes (note these count towards temp_buffers), ANALYZE, etc.
You do need to realize that creation of a temp table involves making
entries in the system catalogs.  If you can set it up so that you reuse
the same temp table(s) for the life of a connection, you'll save a lot
of thrashing and need for catalog vacuuming (the ON COMMIT DELETE ROWS
option for temp tables might help here).  Other than that gotcha, they
should be pretty efficient.
            regards, tom lane
			
		On 15 Mar 2007, at 00:21, Tom Lane wrote: >> Temp tables stay in RAM until they are bigger than temp_buffers. >> If you >> need them to be big and quick, maybe it would be appropriate to use >> indexes (note these count towards temp_buffers), ANALYZE, etc. > > You do need to realize that creation of a temp table involves making > entries in the system catalogs. If you can set it up so that you > reuse > the same temp table(s) for the life of a connection, you'll save a lot > of thrashing and need for catalog vacuuming (the ON COMMIT DELETE ROWS > option for temp tables might help here). Other than that gotcha, they > should be pretty efficient. Thanks for these tips, they are extremely useful
On Wed, 14 Mar 2007 18:50:27 +0000, work@ashleymoran.me.uk (Ashley Moran) wrote: in <B5B55C8F-8C52-48A5-B8D9-8B071681299D@ashleymoran.me.uk> >It's more complicated than that. What we need to do is something >along the lines of: > >results = SELECT * FROM foo(); >DELETE FROM results WHERE (some condition involving results); >some_value = SELECT value FROM results WHERE (etc); > >and so on... > >All of which is easy with table variable, but I can't see how to >translate it to PL/pgsql. Is there any way to manipulate result sets >in a set-based manner like this? A table returning function or SRF can be used in joins with other tables or subqueries. In fact, you can use it in either of two formats: If the SRF returns a native data type then you can use just the function name. Consider the function foo() which returns INTEGER. CREATE OR REPLACE FUNCTION foo () RETURNS SETOF INTEGER AS SELECT * FROM foo() F INNER JOIN some_table T ON F=T.id; If the SRF returns a composite type then you can use the function name qualified by any of the members of the list of types. SELECT * FROM foo() F INNER JOIN some_table T ON F.num=T.id; --- Stefan Berglund