Обсуждение: where does postgres keep the query result until it is returned?

Поиск
Список
Период
Сортировка

where does postgres keep the query result until it is returned?

От
Willy-Bas Loos
Дата:
Hi,

Where does postgres keep the query result until it is returned?
In the shared_buffers?
Or in extra memory that was not previously allocated, or something else?

What if the query result becomes very large, so that it won't fit into memory?

cheers,

WBL

--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw

Re: where does postgres keep the query result until it is returned?

От
Sam Mason
Дата:
On Fri, Sep 17, 2010 at 03:00:36PM +0200, Willy-Bas Loos wrote:
> Where does postgres keep the query result until it is returned?
> In the shared_buffers?
> Or in extra memory that was not previously allocated, or something else?

Postgres, the server software, will spill large results (and any
intermediate working sets) to disk automatically as needed.  I believe
any memory allocated for this task will be up to work_mem in size.

The client side however isn't as intelligent, libpq will, by default,
try and read all records into memory and will crash if the results are
too large.  There are various ways of dealing with this, but haven't
tried myself.

--
  Sam  http://samason.me.uk/

Re: where does postgres keep the query result until it is returned?

От
Sam Mason
Дата:
On Fri, Sep 17, 2010 at 02:14:57PM +0100, Sam Mason wrote:
> Postgres, the server software, will spill large results (and any
> intermediate working sets) to disk automatically as needed.  I believe
> any memory allocated for this task will be up to work_mem in size.

That wasn't very clear was it; when I said "this task" I meant the task
of accumulating results before things spill.

--
  Sam  http://samason.me.uk/

Re: where does postgres keep the query result until it is returned?

От
Tom Lane
Дата:
Sam Mason <sam@samason.me.uk> writes:
> On Fri, Sep 17, 2010 at 02:14:57PM +0100, Sam Mason wrote:
>> Postgres, the server software, will spill large results (and any
>> intermediate working sets) to disk automatically as needed.  I believe
>> any memory allocated for this task will be up to work_mem in size.

> That wasn't very clear was it; when I said "this task" I meant the task
> of accumulating results before things spill.

If the question was about the final query result sent to the client:
we don't spill that to disk, nor "hold" it anywhere.  The backend
sends it to the client on-the-fly as each row is generated.  It's
the client's responsibility to cope if the result set is huge.
(As you noted, libpq is none too bright about that.)

            regards, tom lane

Re: where does postgres keep the query result until it is returned?

От
Willy-Bas Loos
Дата:
> If the question was about the final query result sent to the client:
yes

> we don't spill that to disk, nor "hold" it anywhere.  The backend
> sends it to the client on-the-fly as each row is generated.
thanks, i didn't know that.
I asked because i have a function that produces a result in xml.
that is one row, one value even, but it grows pretty large.
how is that handled?

--
"Patriotism is the conviction that your country is superior to all
others because you were born in it." -- George Bernard Shaw

Re: where does postgres keep the query result until it is returned?

От
Sam Mason
Дата:
On Fri, Sep 17, 2010 at 04:51:46PM +0200, Willy-Bas Loos wrote:
> i have a function that produces a result in xml.
> that is one row, one value even, but it grows pretty large.
> how is that handled?

Rows are sent back in the entireity, so the PG instance would need
enough memory to work with that row.  When you're running a 32bit
version of PG, values whose size is beyond ~100MB are a bit touch and go
whether it will work.

--
  Sam  http://samason.me.uk/

Re: where does postgres keep the query result until it is returned?

От
Merlin Moncure
Дата:
On Fri, Sep 17, 2010 at 3:59 PM, Sam Mason <sam@samason.me.uk> wrote:
> On Fri, Sep 17, 2010 at 04:51:46PM +0200, Willy-Bas Loos wrote:
>> i have a function that produces a result in xml.
>> that is one row, one value even, but it grows pretty large.
>> how is that handled?
>
> Rows are sent back in the entireity, so the PG instance would need
> enough memory to work with that row.  When you're running a 32bit
> version of PG, values whose size is beyond ~100MB are a bit touch and go
> whether it will work.

well, the entire result has to be stored in memory on the client side,
not just the row.  This is why certain operations are so much more
convenient on the server (like insert into foo select * from bar).
After working with pg for a while you just get used to paging through
results on the client (I try to keep the result size under 10mb or so)
using any one of a number of methods.  This is probably 'good sql
style' anyways.   One group of people that bump into this are xbase
refugees who could browse massive results without fear due to the
serverless architecture.

There is a probably a good case to be made for more flexible result
architecture in libpq that supports streaming, but I'd personally much
rather see features like wCTE and stored procedures that remove cases
where you have to send lots of data through the protocol to the
client.

merlin

Re: where does postgres keep the query result until it is returned?

От
Sandeep Srinivasa
Дата:


On Mon, Sep 20, 2010 at 5:27 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Fri, Sep 17, 2010 at 3:59 PM, Sam Mason <sam@samason.me.uk> wrote:
> On Fri, Sep 17, 2010 at 04:51:46PM +0200, Willy-Bas Loos wrote:
>> i have a function that produces a result in xml.
>> that is one row, one value even, but it grows pretty large.
>> how is that handled?
>
> Rows are sent back in the entireity, so the PG instance would need
> enough memory to work with that row.  When you're running a 32bit
> version of PG, values whose size is beyond ~100MB are a bit touch and go
> whether it will work.

Is'nt "work_mem" used for this on the server side (when sorts and hash-tables are required) and "temp_buffers" (when temporary tables are created) ?

Re: where does postgres keep the query result until it is returned?

От
Willy-Bas Loos
Дата:
> Rows are sent back in the entireity, so the PG instance would need
> enough memory to work with that row.  When you're running a 32bit
> version of PG, values whose size is beyond ~100MB are a bit touch and go
> whether it will work.

?? I hear that it's posible to store up to a GiB as a value for a "text" field.
How would i possibly retrieve that data from the database?

It would be logical to use work_mem for this.. (IMHO)

Cheers,

Willy-Bas


--
"Patriotism is the conviction that your country is superior to all
others because you were born in it." -- George Bernard Shaw

Re: where does postgres keep the query result until it is returned?

От
Tom Lane
Дата:
Willy-Bas Loos <willybas@gmail.com> writes:
>> Rows are sent back in the entireity, so the PG instance would need
>> enough memory to work with that row. �When you're running a 32bit
>> version of PG, values whose size is beyond ~100MB are a bit touch and go
>> whether it will work.

> ?? I hear that it's posible to store up to a GiB as a value for a "text" field.
> How would i possibly retrieve that data from the database?

That's the *theoretical* limit.  In a 32-bit build, the practical limit
is going to be quite a bit less.

            regards, tom lane

Re: where does postgres keep the query result until it is returned?

От
Willy-Bas Loos
Дата:
>>> Rows are sent back in the entireity, so the PG instance would need
>>> enough memory to work with that row.  When you're running a 32bit
>>> version of PG, values whose size is beyond ~100MB are a bit touch and go
>>> whether it will work.
>
>> ?? I hear that it's posible to store up to a GiB as a value for a "text" field.
>> How would i possibly retrieve that data from the database?
>
> That's the *theoretical* limit.  In a 32-bit build, the practical limit
> is going to be quite a bit less.
Sheesh..

- why?
- how about 64 bit?
- should this be documented?



--
"Patriotism is the conviction that your country is superior to all
others because you were born in it." -- George Bernard Shaw

Re: where does postgres keep the query result until it is returned?

От
Karsten Hilbert
Дата:
On Mon, Sep 20, 2010 at 03:37:58PM +0200, Willy-Bas Loos wrote:

> > Rows are sent back in the entireity, so the PG instance would need
> > enough memory to work with that row.  When you're running a 32bit
> > version of PG, values whose size is beyond ~100MB are a bit touch and go
> > whether it will work.
>
> ?? I hear that it's posible to store up to a GiB as a value for a "text" field.
> How would i possibly retrieve that data from the database?

    http://www.postgresql.org/docs/9.0/static/functions-string.html

    │ substring(string [from int] [for int]) │ text   │ Extract substring


@all: Regarding the HTML docs:

Could index entries be made HTML anchors such that I
could have turned the above link something like this:

    http://www.postgresql.org/docs/9.0/static/bookindex.html#substring

?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: where does postgres keep the query result until it is returned?

От
Scott Ribe
Дата:
On Sep 20, 2010, at 8:06 AM, Willy-Bas Loos wrote:

> Sheesh..
>
> - why?

Because you start with 4GB of address space, immediately give up maybe 0.5GB for mapping executable code (all the
systemlibraries that get linked), plus a good chunk for the main thread stack, plus library data structures. End result
isyou likely have about 2GB (or less) address space available for heap allocations. And this of course will get used
andat least somewhat fragmented as your program runs. So it is difficult to find a 1GB contiguous block for allocation. 

> - how about 64 bit?

Well, you start with 17,179,869,184GB of address space, so of course it's different. It tends to be pretty easy to find
acontiguous 1GB block in an address space that size. 

> - should this be documented?

Doubtful. It has to do with very well known limits of 32-bit programs, applies to any use for a 1GB block regardless of
whereit comes from (network or disk), really has nothing to do with postgres, and the actual size at which you'll start
tohave problems with vary a good bit depending on: OS, libraries you link to, memory management library you use, memory
managementtechniques you use, and to a very great extent your specific app and exactly what it does. 

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





Re: where does postgres keep the query result until it is returned?

От
Craig Ringer
Дата:
On 20/09/2010 10:56 PM, Scott Ribe wrote:

>> - should this be documented?
>
> Doubtful. It has to do with very well known limits of 32-bit programs, applies to any use for a 1GB block regardless
ofwhere it comes from (network or disk), really has nothing to do with postgres, and the actual size at which you'll
startto have problems with vary a good bit depending on: OS, libraries you link to, memory management library you use,
memorymanagement techniques you use, and to a very great extent your specific app and exactly what it does. 

If nothing else, it'd be helpful to mention in the docs that storage of
individual values requires a contiguous memory block for each value.
It's not unusual to have data structures that can be used to store large
amounts of data structured as linked lists of extent blocks, so this is
really a perfectly reasonable question to ask.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: where does postgres keep the query result until it is returned?

От
Merlin Moncure
Дата:
On Mon, Sep 20, 2010 at 10:06 AM, Willy-Bas Loos <willybas@gmail.com> wrote:
>>>> Rows are sent back in the entireity, so the PG instance would need
>>>> enough memory to work with that row.  When you're running a 32bit
>>>> version of PG, values whose size is beyond ~100MB are a bit touch and go
>>>> whether it will work.
>>
>>> ?? I hear that it's posible to store up to a GiB as a value for a "text" field.
>>> How would i possibly retrieve that data from the database?
>>
>> That's the *theoretical* limit.  In a 32-bit build, the practical limit
>> is going to be quite a bit less.
> Sheesh..
>
> - why?

A database is not a filesystem, and a filesystem is not a database.
Postgres like many databases is designed to organize small pieces of
data for flexible retrieval.  Filesystems are designed to store pieces
of any size in a relatively inflexible way.   You can adapt either
technology into purposes they are not really designed for, but there
going to be practical constraints.  I expect these differences will
become especially glaring when the coming revolution in storage
arrives and most systems will become cpu (or even network) bound
again.

There are at least three specific features in postgres you can look at
to make dealing with large datums easier: toast (which is implicitly
used), large objects, and automatic column data compression features.
 You can also search the archives for a wealth of information on the
topic.  If I ever wrote a book on things not to do as a dba though,
storing 1gb+ xml files in a database would be in the first chapter
:-).  How about breaking the file down?

> - how about 64 bit?

The various 32 bit based limits in postgres are coming from
architecture choices and the desire to give consistent behavior across
all platforms, not necessarily hardware constraints.

merlin

Re: where does postgres keep the query result until it is returned?

От
Willy-Bas Loos
Дата:
One thing that i think works is to create a table from your results.
That is, if your result is not one big chunk in one row that won't fit
in memory, but lots of rows.

like this:
create table foo as
select <humongous query>

If you use "screen" (on linux) you can log out while your humongous
query runs on the server.

Cheers,

WBL
--
"Patriotism is the conviction that your country is superior to all
others because you were born in it." -- George Bernard Shaw

Re: where does postgres keep the query result until it is returned?

От
Willy-Bas Loos
Дата:
On Thu, Sep 30, 2010 at 7:29 PM, Willy-Bas Loos <willybas@gmail.com> wrote:
> One thing that i think works is to create a table from your results.
> That is, if your result is not one big chunk in one row that won't fit
> in memory, but lots of rows.

On second thought, that helps for the scenario where your client
looses connection or otherwise stops functioning before it has
received the whole query result, it isn't really related to the topic.
--
"Patriotism is the conviction that your country is superior to all
others because you were born in it." -- George Bernard Shaw