Re: [HACKERS] \dt and disk access

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: [HACKERS] \dt and disk access
Дата
Msg-id 640f2198ab5800447460bf75cbfc2d09
обсуждение исходный текст
Ответ на [HACKERS] \dt and disk access  (Bruce Momjian <maillist@candle.pha.pa.us>)
Список pgsql-hackers
> I don't fully understand what do you mean...
> btree-build uses qsort to sort items on a page and has own
> priority queue methods (instead of lselect.c - leftist tree
> selection algorithm).
> Do you want use qsort for in-memory sorts ?

When I said btree-build, I meant the leftist tree sort, so I am
considering replacing the leftist-tree selection algorithm with a qsort
of an array of HeapTuple pointers.  Any comments on this?  The only
performance problem I see is re-allocing the array as the number of
entries grows, but if I double the size each time I exceed the current
size, I don't believe it will be much of a problem.  The leftist tree
looked very funny to me, with recursion and stuff.  I saw the new
FASTBUILD sort used qsort, so I thought it may be a good idea.

>
> BTW, I may suggest additional trick:
> if size of sort-keys is (much?) smaller then size of tuple
> (select _int4_field_, _big_text_field_ from ... order by _int4_field_)
> and you are going to switch to tape-methods (i.e. you've pallocted
> all sort-memory) then you may try to create temp-file to store
> (unsorted) tuples there and to deal with sort-keys only. After
> you'll got sort-keys (or better say - records which keep sort-keys
> and pointers - temp-file offsets - to original tuples) sorted, you
> may use these pointers to get requested tuple from temp-file.

I am a little lost here.  The Mariposa sort code allows ExecSort to read
directly from final sorted tape file, preventing the copy to a temp
table.  So, if I don't store the full tuple in the tape file, I can't
pass a REAL tuple back to ExecSort.

I assume you are suggesting putting all the tuples in one tape file,
then doing the sort with tape file ONLY with the indexed fields, then
using those indexed fields to retrieve the data from the tape.

The problem I see is that the final read of sorted rows from the full
tape file will be seeking all over the file, causing poor performance.
Now, not having to move around extra data during the sort is going to be
good for performance, but I am not sure what the result will be.

I think you are correct that it will be a win to only sort the needed
columns, but considering that any sort that uses tape files will be big,
and hence will not fit in the file system cache, and considering the
extra code to do this, I am uncertain about its usefulness.

>
> Even for select int4_field from ... order by int4_field
> current psort palloc 48(header) + 4 (field) + 4 (aligning) = 56 bytes
> for each tuple - why don't work with 4(field) + 4 (offset) = 8 bytes
> of sort-records (tm-:)) ?

Again, can I do this if I need to pass back the actual row?

Also, another confusing issue is that when I start the sort, I don't
know if it is going to fit into memory or not, so I must store the
entire row into memory because I might be returning a pointer and not
using tape files.  Can I then switch in the middle of the sort to tape
files and start doing my comparisons differently.  Seems hard to me.

I am just guessing on these issues.  Any comments?

- --
Bruce Momjian
maillist@candle.pha.pa.us

------------------------------

В списке pgsql-hackers по дате отправления: