I want to change libpq and libpgtcl for better handling of large query results

From: Constantin Teodorescu
Subject: I want to change libpq and libpgtcl for better handling of large query results
Date:
Msg-id: 34B129E1.538A95D3@flex.ro
Replies: Re: [HACKERS] I want to change libpq and libpgtcl for better handling of large query results  (Bruce Momjian <maillist@candle.pha.pa.us>)
Re: [HACKERS] I want to change libpq and libpgtcl for better handling of large query results  (The Hermit Hacker <scrappy@hub.org>)
Re: [HACKERS] I want to change libpq and libpgtcl for better handling of large query results  (Peter T Mount <psqlhack@maidast.demon.co.uk>)
List: pgsql-hackers
Hello all,

WARNING: It's a long mail, but please have patience and read it *all*.

I have reached a point in developing PgAccess where I discovered that
some functions in libpgtcl are badly implemented, and even libpq does
not help much.

What is the problem? It is about working with large query results: big
tables with thousands of records that have to be processed in order to
produce a full report, for example.

Getting a query result from Tcl/Tk (the pg_select function) uses PQexec.
But PQexec FETCHES ALL THE RECORDS INTO MEMORY, and only after that can
the user handle the query result.
But what if the table has thousands of records? I might need more than
512 MB of RAM just to finish a report.
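
To illustrate, here is a minimal sketch (in C, with made-up table and
column names) of the usual PQexec pattern; every tuple is buffered in
the PGresult before the first row can be processed:

    #include <stdio.h>
    #include <stdlib.h>
    #include "libpq-fe.h"

    /* Minimal sketch of the usual PQexec pattern: the whole result set
     * is buffered in the PGresult before the first row can be touched.
     * 'phonebook' and its columns are made-up example names. */
    int
    main(void)
    {
        PGconn   *conn = PQconnectdb("dbname=test");
        PGresult *res;
        int       i;

        if (PQstatus(conn) == CONNECTION_BAD)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            exit(1);
        }

        /* every tuple of the (possibly huge) table ends up in memory here */
        res = PQexec(conn, "SELECT name, phone FROM phonebook");

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            for (i = 0; i < PQntuples(res); i++)
                printf("%s\t%s\n",
                       PQgetvalue(res, i, 0),
                       PQgetvalue(res, i, 1));

        PQclear(res);
        PQfinish(conn);
        return 0;
    }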

Viorel Mitache from RENEL Braila (mviorel@flex.ro; please cc him and me
because we aren't on the hackers list) proposed another solution.

With some small changes in libpq-fe.h,

    void (*callback)(PGresult *, void *ptr, int stat);
    void *usr_ptr;

and also in libpq, a newly defined function in libpgtcl
(pg_loop) could initiate a query and then call back a user-defined
function after every record fetched from the connection.
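
To make the proposal concrete, here is a rough sketch of how the C side
might look. PQexecCallback, the 'phonebook' table, and the behaviour of
the 'stat' argument are hypothetical illustrations of the idea, not
existing libpq functions or agreed semantics:

    #include <stdio.h>
    #include "libpq-fe.h"

    /* HYPOTHETICAL: a driver that issues the query on a cloned connection
     * and invokes the callback once per fetched record.  This declaration
     * illustrates the proposal; it is not an existing libpq function. */
    extern int PQexecCallback(PGconn *conn, const char *query,
                              void (*callback)(PGresult *, void *, int),
                              void *usr_ptr);

    /* User-defined callback, matching the proposed signature: 'res' holds
     * the single tuple just read, 'usr_ptr' is caller-supplied state and
     * 'stat' reports the fetch status (semantics still to be defined). */
    static void
    print_row(PGresult *res, void *usr_ptr, int stat)
    {
        long *rows_seen = (long *) usr_ptr;

        (void) stat;
        (*rows_seen)++;
        printf("%s\t%s\n", PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
        /* after the callback returns, the memory of this tuple can be
         * released before the next record is read, so only one row is
         * ever resident at a time */
    }

    int
    main(void)
    {
        PGconn *conn = PQconnectdb("dbname=test");
        long    rows_seen = 0;

        /* 'phonebook' and its columns are made-up example names */
        PQexecCallback(conn, "SELECT name, phone FROM phonebook",
                       print_row, &rows_seen);

        printf("processed %ld rows\n", rows_seen);
        PQfinish(conn);
        return 0;
    }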

In order to do this, the connection is 'cloned' and the query is
issued on this new connection. For every record fetched, the C
callback function is called; there the Tcl interpreter is invoked for
the source inside the loop, then the memory used by the record is
released and the next record is ready to come.
More than that, after processing some records the user can choose to
break out of the loop (using the break command in Tcl), which
actually breaks the connection.
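
The break could also be driven from C. This is only a sketch under the
assumption that the pg_loop driver checks a user flag after each
callback; the flag, the struct, and the driver behaviour are all
hypothetical:

    #include "libpq-fe.h"

    /* HYPOTHETICAL state shared with the callback through usr_ptr. */
    struct report_state
    {
        long rows_seen;
        long rows_wanted;
        int  stop;              /* set to ask the driver to stop */
    };

    static void
    first_rows_only(PGresult *res, void *usr_ptr, int stat)
    {
        struct report_state *state = (struct report_state *) usr_ptr;

        (void) stat;
        state->rows_seen++;
        /* ... process the single tuple in 'res' ... */

        if (state->rows_seen >= state->rows_wanted)
            state->stop = 1;    /* assumption: the pg_loop driver checks
                                 * this flag, closes the cloned connection
                                 * and stops reading tuples -- the C
                                 * equivalent of 'break' in the Tcl loop */
    }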

What do we achieve by making these patches?

First of all, the ability to process large tables sequentially.
Then, increased performance due to overlapping the reception of data
from the network with local processing: the backend process on the
server keeps filling the communication channel with data while the
local task processes it as it comes.
In the old version, the local task has to wait until *all* the data
has arrived (buffered in memory, if there is enough room) and only
then process it.

What would I ask from you?
1) First of all, whether my needs could be satisfied in some other way
with the current functions in libpq or libpgtcl. I can assure you that
with the current libpgtcl it is practically impossible. I am not sure
whether there is another mechanism using some subtle functions that I
don't know about.
2) Then, if you agree with the idea, to whom should we send a more
precise description of the changes we would like to make, so they can
be analysed and checked for further development of Pg.
3) Is there any cleaner way to tell the backend not to send any more
tuples, other than breaking the connection?
4) Even working in C, using PQexec, it is impossible to handle large
query results, am I right? (See the cursor sketch after this list.)
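
To make question 4 concrete, here is the kind of batching one can
already attempt from C with a cursor inside a transaction; table,
column and cursor names are made-up examples. Whether this counts as a
satisfactory mechanism is exactly what questions 1 and 3 ask: it bounds
memory use, but it is not the per-record callback mechanism described
above.

    #include <stdio.h>
    #include <stdlib.h>
    #include "libpq-fe.h"

    /* Sketch: fetch a large result in batches through a cursor, so only
     * one batch of rows is ever held in memory.  'phonebook' and
     * 'big_cursor' are made-up example names. */
    static PGresult *
    exec_or_die(PGconn *conn, const char *query)
    {
        PGresult *res = PQexec(conn, query);

        if (PQresultStatus(res) != PGRES_COMMAND_OK &&
            PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "%s failed: %s", query, PQerrorMessage(conn));
            exit(1);
        }
        return res;
    }

    int
    main(void)
    {
        PGconn   *conn = PQconnectdb("dbname=test");
        PGresult *res;
        int       i, ntuples;

        PQclear(exec_or_die(conn, "BEGIN"));
        PQclear(exec_or_die(conn,
            "DECLARE big_cursor CURSOR FOR SELECT name, phone FROM phonebook"));

        do
        {
            /* only 1000 rows at a time are buffered in the PGresult */
            res = exec_or_die(conn, "FETCH 1000 IN big_cursor");
            ntuples = PQntuples(res);
            for (i = 0; i < ntuples; i++)
                printf("%s\t%s\n",
                       PQgetvalue(res, i, 0),
                       PQgetvalue(res, i, 1));
            PQclear(res);
        } while (ntuples > 0);

        PQclear(exec_or_die(conn, "CLOSE big_cursor"));
        PQclear(exec_or_die(conn, "END"));
        PQfinish(conn);
        return 0;
    }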

Please cc to: mviorel@flex.ro and also teo@flex.ro

--
Constantin Teodorescu
FLEX Consulting Braila, ROMANIA
