Обсуждение: sort mem: size in RAM vs size on Disk

Поиск
Список
Период
Сортировка

sort mem: size in RAM vs size on Disk

От
"mark"
Дата:
Hi all,


I am wondering if anyone has any estimates on how much larger a working set
for a sort is when the query execution puts it in memory vs when it spills
out to disk. It seems like sorts in memory are larger than they are if they
still out to disk. (which I could understand), I am just looking for a
general 'rule' if I see 20M in an exernal disk merge that it means I would
have needed 2 x that for work_mem before it would not have spilled out. (2x
seems to be about right thus far)


Also I am seeing COPY statements (to stdout) have temp files a lot. These
copies have a select in them so usually them temp file is only a few meg,
yet no matter how large my work_mem is they always seem to use a temp file.
Is this normal or should I keep looking into this? E.g. my work mem is 32MB
currently and I see some copies to stdout use a 12MB temp file.

3rd question:

If I see (eg.) pgsql_tmp25049.0 and then pgsql_tmp25049.1, pgsql_tmp25049.2,
pgsql_tmp25049.3 should I assume the max file size for a temp file is 1024MB
and then it spills to the next one ?(it would not surprise me if this query
needed Gigs of temp files...). is it normal for .0 to be sub 1024MB while 1
and 2 are ? they all list the same same statement as the cause and I don't
think we ran it 3 times.



Thank you,

-Mark


Re: sort mem: size in RAM vs size on Disk

От
Peter Geoghegan
Дата:
What version of PostgreSQL are you using? sort_mem is now called
work_mem (to better reflect the reality that it isn't just used in
sorting, I think), and has been for some time.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Re: sort mem: size in RAM vs size on Disk

От
mark
Дата:
On Thu, Mar 10, 2011 at 8:03 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> What version of PostgreSQL are you using? sort_mem is now called
> work_mem (to better reflect the reality that it isn't just used in
> sorting, I think), and has been for some time.
>
> --
> Peter Geoghegan       http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training and Services
>

I hang my head somewhat down in shame as I type this...- 8.3.7

I just used the term "sort mem" as it's the sorting operations that I
am typically seeing spilling over to disk when I check the query with
an explain analyze.

I understand that work_mem is a more accurate description.

in summary it seems that if I see a temp file logged of say 20MB I
need about 40MB of work_mem before it doesn't spill to disk. just
wondering if I am at all accurate with this or if I am way off base.

thanks

-Mark

Re: sort mem: size in RAM vs size on Disk

От
"Igor Neyman"
Дата:
> -----Original Message-----
> From: mark [mailto:dvlhntr@gmail.com]
> Sent: Thursday, March 10, 2011 9:37 AM
> To: pgsql-general@postgresql.org
> Subject: sort mem: size in RAM vs size on Disk
>
> Hi all,
>
>
> I am wondering if anyone has any estimates on how much larger
> a working set for a sort is when the query execution puts it
> in memory vs when it spills out to disk. It seems like sorts
> in memory are larger than they are if they still out to disk.
> (which I could understand), I am just looking for a general
> 'rule' if I see 20M in an exernal disk merge that it means I
> would have needed 2 x that for work_mem before it would not
> have spilled out. (2x seems to be about right thus far)
>
>
> Also I am seeing COPY statements (to stdout) have temp files
> a lot. These copies have a select in them so usually them
> temp file is only a few meg, yet no matter how large my
> work_mem is they always seem to use a temp file.
> Is this normal or should I keep looking into this? E.g. my
> work mem is 32MB currently and I see some copies to stdout
> use a 12MB temp file.
>
> 3rd question:
>
> If I see (eg.) pgsql_tmp25049.0 and then pgsql_tmp25049.1,
> pgsql_tmp25049.2,
> pgsql_tmp25049.3 should I assume the max file size for a temp
> file is 1024MB and then it spills to the next one ?(it would
> not surprise me if this query needed Gigs of temp files...).
> is it normal for .0 to be sub 1024MB while 1 and 2 are ? they
> all list the same same statement as the cause and I don't
> think we ran it 3 times.
>
> Thank you,
>
> -Mark
>

Mark, you are about right in regards to ratio between required work_mem
size and the size that sort operation occupies on disk.  work_mem needs
to be 2-3 times greater than sort occupies on disk.

As for "temp" files, PG has separate from work_mem memory area called
temp_buffers, try to play with this configuration parameter.

And, yes there is a 1GB files size limit (not only for temp files),
that's why you see .1, .2, ... In the file names.

Regards,
Igor Neyman