Re: [PERFORM] Correct use of cursors for very large result sets in Postgres

Поиск
Список
Период
Сортировка
От Mike Beaton
Тема Re: [PERFORM] Correct use of cursors for very large result sets in Postgres
Дата
Msg-id CAHzAAWTRCdbNkQqfS74jAtoT8ThBnQAuDm+=Uw-JOfgADQ3jtQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PERFORM] Correct use of cursors for very large result sets in Postgres  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [PERFORM] Correct use of cursors for very large result sets in Postgres  (Mike Beaton <mjsbeaton@gmail.com>)
Re: [PERFORM] Correct use of cursors for very large result sets in Postgres  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
> Seems odd.  Is your cursor just on "SELECT * FROM table", or is there
> some processing in there you're not mentioning?  Maybe it's a cursor
> WITH HOLD and you're exiting the source transaction?

Hi Tom,

I've deleted my own Stack Overflow answer in favour of Laurenz Albe's one.

New TL;DR (I'm afraid): PostgreSQL is always generating a huge buffer file on `FETCH ALL FROM CursorToHuge`.

The test data is created by:

`SELECT * INTO large FROM generate_series(1, 10000000) id;`

The test function to generate the cursor is:

````
CREATE OR REPLACE FUNCTION lump() RETURNS refcursor
   LANGUAGE plpgsql AS
$$DECLARE
   c CURSOR FOR SELECT id FROM large;
BEGIN
   c := 'c';
   OPEN c;
   RETURN c;
END;$$;
````

The two tests are:

`SELECT * FROM large;`

Result: no buffer file.

And:

````
BEGIN;
SELECT lump();
FETCH ALL FROM c;
COMMIT;
````

Result: 14MB buffer, every time.

The buffer file appears in `base\pgsql_tmp` while the data is streaming but only appears in the Postgres log file at the point when it is released (which makes sense, as its final size is part of the log row).

This has the additionally confusing result that the buffer file is reported in the Postgres logs just before the user sees the first row of data on `psql` (and on anything using `libpq`), but just after the user sees the last row of data, on any client program which is streaming the data via a streaming data access layer (such as `Npgsql`, or `JDBC` with the right configuration).

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [PERFORM] Correct use of cursors for very large result sets in Postgres
Следующее
От: Diego Vargas
Дата:
Сообщение: [PERFORM] Query Performance