Re: Disk buffering of resultsets

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: Disk buffering of resultsets
Дата
Msg-id 5432019C.7000401@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: Disk buffering of resultsets  (Vitalii Tymchyshyn <vit@tym.im>)
Список pgsql-jdbc
On 10/06/2014 10:22 AM, Vitalii Tymchyshyn wrote:
> Hello.
>
> As of routing I meant exactly attaching to original statement result
> belongs to and throwing an error as soon as someone get to correct point
> calling this original statement.
> As of threading I tend to agree with you. It can be revised in Java 8
> where there is a default executor that can be used.
>
> The primary problem as I can see it are OOMs on large result sets.
> Currently the workaround is to use fetchSize, but it works only in very
> limited scenarios.

Specifically, autocommit=off and only one active Statement. Right?

The PostgreSQL backend its self supports multiple open portals, but only
within the same open transaction, and if any statement causes an error
all portals are invalidated. So we can't really use that if you want
multiple *independent* statements.

The usual answer would be WITH HOLD cursors (or a portal-level
equivalent). There's a server-side impact to them, but it's probably the
first thing to try.

Failing that, as you say, we'd have to fetch the whole result set to the
client and store it off-heap, which is where the abstraction I've talked
about on this mailing list thread comes in.

An issue here is that PgJDBC currently has some restrictions around
multiple application threads using a single Connection at the same time.
If you're working with multiple statements you're quite possibly also
working with multiple threads, right?

> I can see two ways here: improve fetchSize handling (potentially setting
> default fetchSize to some value like 10000) or storing result sets out
> of heap.
> One more thing to remember is time to get first row. It would be great
> to have first fast enough  without reading, parsing and storing
> somewhere all the 1000000 rows.

... which is where the issues with threading come in, because you want
to fetch some results, return them to the client, and continue
processing the rest of the results in a helper thread in the driver.

I think we'd have to get the client app involved more directly in that,
by exposing partial fetches more directly to the client and allowing it
to ask us (possibly via a separate thread) to consume more results and
append them to the result set. We'd have to deal with the resulting
locking issues carefully.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-jdbc по дате отправления:

Предыдущее
От: Vitalii Tymchyshyn
Дата:
Сообщение: Re: Disk buffering of resultsets
Следующее
От: Swapna Shetty
Дата:
Сообщение: Network error