I want to change libpq and libpgtcl for better handling of large query results
From | Constantin Teodorescu |
---|---|
Subject | I want to change libpq and libpgtcl for better handling of large query results |
Date | |
Msg-id | 34B129E1.538A95D3@flex.ro |
Responses |
Re: [HACKERS] I want to change libpq and libpgtcl for better handling of large query results
(Bruce Momjian <maillist@candle.pha.pa.us>)
Re: [HACKERS] I want to change libpq and libpgtcl for better handling of large query results (The Hermit Hacker <scrappy@hub.org>) Re: [HACKERS] I want to change libpq and libpgtcl for better handling of large query results (Peter T Mount <psqlhack@maidast.demon.co.uk>) |
List | pgsql-hackers |
Hello all,

WARNING: it's a long mail, but please have patience and read it *all*.

I have reached a point in developing PgAccess where I discovered that some functions in libpgtcl are badly implemented, and even libpq does not help me much!

What's the problem? It is about working with large query results: big tables with thousands of records that have to be processed in order to produce, for example, a full report. Getting a query result from Tcl/Tk (the pg_select function) uses PQexec. But PQexec FETCHES ALL THE RECORDS INTO MEMORY, and only after that can the user handle the query results. What if the table has thousands of records? I would probably need more than 512 MB of RAM in order to get a report finished.

Viorel Mitache from RENEL Braila (mviorel@flex.ro; please cc him and me, because we aren't on the hackers list) proposed another solution. With some small changes in libpq-fe.h ( void (* callback)(PGresult *,void *ptr,int stat); void *usr_ptr; ) and also in libpq, a newly defined function in libpgtcl (pg_loop) could initiate a query and then call back a user-defined function after every record fetched from the connection. To do this, the connection is 'cloned', and the query is issued on this new connection. For every record fetched, the C callback function is called; there the Tcl interpreter is invoked on the source inside the loop, then the memory used by the record is released and the next record is ready to come. More than that, after processing some records, the user can choose to break the loop (using the break command in Tcl), which actually breaks the connection.

What do we achieve with these patches? First of all, the ability to process large tables sequentially. Second, increased performance due to overlapping the receiving of data on the network with local processing: the backend process on the server keeps filling the communication channel with data while the local task processes it as it comes.
In the old version, the local task has to wait until *all* the data has arrived (buffered in memory, if there was enough room) and only then process it.

What I would ask from you:

1) First of all, whether my needs could be satisfied some other way with the current functions in libpq or libpgtcl. I can assure you that with the current libpgtcl it is practically impossible; but I am not sure whether there is another mechanism using some subtle functions that I don't know about.

2) Then, if you agree with the idea, to whom should we send the exact changes we would like to make, so that they can be analysed and checked for further development of Pg?

3) Is there any other normal way to tell the backend not to send any more tuples, instead of breaking the connection?

4) Even working in C, using PQexec, it's impossible to handle large query results. Am I right?

Please cc to: mviorel@flex.ro and also teo@flex.ro

--
Constantin Teodorescu
FLEX Consulting Braila, ROMANIA