using a lot of maintenance_work_mem

Поиск
Список
Период
Сортировка
От Frederik Ramm
Тема using a lot of maintenance_work_mem
Дата
Msg-id 4D593814.7030908@remote.org
обсуждение исходный текст
Ответы Re: using a lot of maintenance_work_mem  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: using a lot of maintenance_work_mem  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Список pgsql-hackers
Hi,
    I am (ab)using a PostgreSQL database (with PostGIS extension) in a 
large data processing job - each day, I load several GB of data, run a 
lot of analyses on it, and then throw everything away again. Loading, 
running, and dumping the results takes about 18 hours every day.

The job involves a lot of index building and sorting, and is run on a 
64-bit machine with 96 GB of RAM.

Naturally I would like the system to use as much RAM as possible before 
resorting to disk-based operations, but no amount of 
maintenance_work_mem setting seems to make it do my bidding.

I'm using PostgreSQL 8.3 but would be willing and able to upgrade to any 
later version.

Some googling has unearthed the issue - which is likely known to all of 
you, just repeating it to prove I've done my homework - that tuplesort.c 
always tries to double its memory allocation, and will refuse to do so 
if that results in an allocation greater than MaxAllocSize:
 if ((Size) (state->memtupsize * 2) >= MaxAllocSize / sizeof(SortTuple))     return false;

And MaxAllocSize is hardcoded to 1 GB in memutils.h.

(All this based on Postgres 9.1alpha source - I didn't want to bring 
something up that has been fixed already.)

Now I assume that there are reasons that you're doing this. memutils.h 
has the (for me) cryptic comment about MaxAllocSize: "XXX This is 
deliberately chosen to correspond to the limiting size of varlena 
objects under TOAST. See VARATT_MASK_SIZE in postgres.h.", but 
VARATT_MASK_SIZE has zero other occurences in the source code.

If I were to either (a) increase MaxAllocSize to, say, 48 GB instead of 
1 GB, or (b) hack tuplesort.c to ignore MaxAllocSize, just for my local 
setup - would that likely be viable in my situation, or would I break 
countless things?

I can afford some experimentation; as I said, I'm throwing away the 
database every day anyway. I just thought I'd solicit your advice before 
I do anything super stupid. - If I can use my setup to somehow 
contribute to further PostgreSQL development by trying out some things, 
I'll be more than happy to do so. I do C/C++ but apart from building 
packages for several platforms, I haven't worked with the PostgreSQL 
source code.

Of course the cop-out solution would be to just create a huge RAM disk 
and instruct PostgreSQL to use that for disk-based sorting. I'll do that 
if all of you say "OMG don't touch MaxAllocSize" ;)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik@remote.org  ##  N49°00'09" E008°23'33"


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Add support for logging the current role
Следующее
От: Tom Lane
Дата:
Сообщение: Re: using a lot of maintenance_work_mem