Is tuplesort_heap_siftup() a misnomer?

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Is tuplesort_heap_siftup() a misnomer?
Дата
Msg-id CAM3SWZQ+2gJMNV7ChxwEXqXopLfb_FEW2RfEXHJ+GsYF39f6MQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Is tuplesort_heap_siftup() a misnomer?  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-hackers
Doesn't tuplesort_heap_siftup() actually shift-down?

The Wikipedia article on heaps [1] lists "shift-down" (never "sift
down", FWIW) as a common operation on a heap:

"shift-down: move a node down in the tree, similar to shift-up; used
to restore heap condition after deletion or replacement."

This seems to be what tuplesort_heap_siftup() actually does (among
other things; its job is to compact the heap following caller
returning the root, where caller leaves a "hole" at the root position
0). As it happens, the node that this tuplesort.c function "moves down
in the tree" is the memtuple that was previously positioned physically
last in the heap portion of the memtuples array. This node is then
considered as a "candidate root" (i) for each subheap as we actually
shift down, starting from i = 0. Conceptually, we immediately "fill
the hole" that caller left in the root with this other tuple (the
physically last tuple), with this new tuple/node, then shifted down as
needed. (I say "conceptually" because we don't actually bother with an
eager swap into the "hole" position, but that's just to avoid a
potentially useless swap.)

Now, tuplesort_heap_insert() actually does sift (or shift up, if you
prefer), which is necessary because it doesn't assume that there is an
existing "hole" anywhere in the heap (it just assumes the availability
of space to insert the caller's tuple, at the end of the heap's
memtuple portion).

Or, as Wikipedia describes it:

"shift-up: move a node up in the tree, as long as needed; used to
restore heap condition after insertion. Called "sift" because node
moves up the tree until it reaches the correct level, as in a sieve."

I think that tuplesort_heap_insert() is correct, because it doesn't
claim to shift at all (or sift, or whatever). I'm just contrasting it
with tuplesort_heap_siftup() to make a point.

I think that tuplesort_heap_siftup() should probably be renamed to
describe its purpose at a higher level, rather than how it shifts,
which in any case is an implementation detail. Specifically, I suggest
we rename tuplesort_heap_siftup() to tuplesort_heap_compact(), which
suggests that it's roughly the opposite of tuplesort_heap_insert(). I
think that the contrast between these two functions add clarity.
tuplesort_heap_siftup() comments will also need to be updated, since
"sift up" is mentioned there.

Should I write a patch?

[1] https://en.wikipedia.org/wiki/Heap_(data_structure)
-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Pluggable storage
Следующее
От: Greg Stark
Дата:
Сообщение: Re: Add hint for function named "is"