Re: Insertion Sort Improvements

Поиск
Список
Период
Сортировка
От Benjamin Coutu
Тема Re: Insertion Sort Improvements
Дата
Msg-id 042b841e324e22c79e23@zeyos.com
обсуждение исходный текст
Ответ на Re: Insertion Sort Improvements  (John Naylor <john.naylor@enterprisedb.com>)
Ответы Re: Insertion Sort Improvements  (John Naylor <john.naylor@enterprisedb.com>)
Список pgsql-hackers
Greetings,

I would like to revisit the discussion and concur with John's perspective that incremental progress through small,
successivemodifications is the appropriate approach to move forward. Therefore, I would like to propose two distinct
ideasaimed at enhancing the functionality of insertion sort: 

1. Implementation of a more efficient variant (as described in the introductory part of this thread):

------------ OLD ------------

for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP;
     pm += ST_POINTER_STEP)
    for (pl = pm; pl > a && DO_COMPARE(pl - ST_POINTER_STEP, pl) > 0;
         pl -= ST_POINTER_STEP)
        DO_SWAP(pl, pl - ST_POINTER_STEP);

------------ NEW ------------

for (
    pm = a + ST_POINTER_STEP;
    pm < a + n * ST_POINTER_STEP;
    pm += ST_POINTER_STEP
) {
    ST_POINTER_TYPE tmp = *pm;

    for (
        pl = pm - ST_POINTER_STEP;
        pl >= a && DO_COMPARE(pl, &tmp) > 0;
        pl -= ST_POINTER_STEP
    )
        *(pl + ST_POINTER_STEP) = *pl;

    *(pl + ST_POINTER_STEP) = tmp;
}

------------

2. It appears that there is a consensus against disregarding the presorted check, despite its questionable value. In
lightof this, an alternative suggestion is to integrate the presorted check into the insertion sort path by utilizing
anunbounded insertion sort. Only after a maximum number of swaps have occurred should we abandon the sorting process.
Ifinsertion sort is executed on the entire array without any swaps, we can simply return. If not, and we continue with
quicksortafter the swap limit has been reached, we at least have left the array in a more sorted state, which may
likelybe advantageous for subsequent iterations. 

------------ OLD ------------

if (n < 7)
{
    for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP;
         pm += ST_POINTER_STEP)
        for (pl = pm; pl > a && DO_COMPARE(pl - ST_POINTER_STEP, pl) > 0;
             pl -= ST_POINTER_STEP)
            DO_SWAP(pl, pl - ST_POINTER_STEP);
    return;
}
presorted = 1;
for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP;
     pm += ST_POINTER_STEP)
{
    DO_CHECK_FOR_INTERRUPTS();
    if (DO_COMPARE(pm - ST_POINTER_STEP, pm) > 0)
    {
        presorted = 0;
        break;
    }
}
if (presorted)
    return;

------------ NEW ------------

#define LIMIT_SWAPS 30 /* to be determined empirically */

int swaps = 0;

for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP;
     pm += ST_POINTER_STEP)
    for (pl = pm; pl > a && DO_COMPARE(pl - ST_POINTER_STEP, pl) > 0;
         pl -= ST_POINTER_STEP) {
        DO_SWAP(pl, pl - ST_POINTER_STEP);

        if (++swaps == LIMIT_SWAPS)
            goto quicksort;
    }

if (swaps == 0)
    return;

quicksort:

------------

Naturally, both modifications (with point 2 being highly speculative) are currently independent of each other, and it
isalso crucial to benchmark the combined variant as well as different values for LIMIT_SWAPS. 
I would greatly appreciate assistance in benchmarking these proposed changes. Your collaboration in this matter would
beinvaluable. 

Cheers, Ben



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Richard Guo
Дата:
Сообщение: ERROR: no relation entry for relid 6
Следующее
От: Aleksander Alekseev
Дата:
Сообщение: Re: "38.10.10. Shared Memory and LWLocks" may require a clarification