Re: Shrinking tuplesort.c's SortTuple struct (Was: More ideas forspeeding up sorting)

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Shrinking tuplesort.c's SortTuple struct (Was: More ideas forspeeding up sorting)
Дата
Msg-id c25e9517-c631-0e24-bcf0-d73a50c2ce3a@iki.fi
обсуждение исходный текст
Ответ на Shrinking tuplesort.c's SortTuple struct (Was: More ideas forspeeding up sorting)  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: Shrinking tuplesort.c's SortTuple struct (Was: More ideas forspeeding up sorting)  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On 10/08/2019 02:14, Peter Geoghegan wrote:
> The easy part was removing SortTuple.tupindex itself -- it was fairly
> natural to stash that in the slab allocation for each tape. I used the
> aset.c trick of having a metadata "chunk" immediately prior to address
> that represents the allocation proper -- we can just back up by a few
> bytes from stup.tuple to find the place to stash the tape number
> during merging. The worst thing about this change was that it makes a
> tape slab allocation mandatory in cases that previously didn't have
> any need for a stup.tuple allocation (e.g. datum tuplesorts of
> pass-by-value types), though only during merging. Since we must always
> store the tapenum when merging, we always need a slab buffer for each
> tape when merging. This aspect wasn't so bad.

Hmm. Wouldn't it be more straightforward to have the extra tupindex 
field at the end of the struct? Something like:

typedef struct
{
    void       *tuple;            /* the tuple itself */
    Datum        datum1;            /* value of first key column */
    bool        isnull1;        /* is first key column NULL? */
} SortTuple;

typedef struct
{
    SortTuple stuple;
    int            tupindex;        /* see notes above */
} MergeTuple;

The initial sorting phase would deal with SortTuples, and the merge 
phase would deal with MergeTuples. The same comparison routines work 
with both.

> The hard/ugly part was getting rid of the remaining "unnecessary"
> SortTuple field, isnull1. This involved squeezing an extra bit out of
> the stup.tuple pointer, by stealing the least-significant bit. This
> was invasive in about the way you'd expect it to be. It wasn't awful,
> but it also wasn't something I'd countenance pursuing without getting
> a fairly noticeable benefit for users. (Actually, the code that I
> wrote so far *is* pretty awful, but I could certainly clean it up some
> more if I felt like it.)
> 
> I think that the rough patch that I came up with gives us an accurate
> picture of what the benefits of having SortTuples that are only 16
> bytes wide are. The benefits seem kind of underwhelming at this point.
> For something like a "SELECT COUNT(distinct my_int4_col) FROM tab"
> query, which uses the qsort_ssup() qsort specialization, we can easily
> go from getting an external sort to getting an internal sort. We can
> maybe end up sorting about 20% faster if things really work out for
> the patch.

If you separate the NULLs from non-NULLs in a separate array, as was 
discussed back in 2016, instead of stealing a bit, you can squeeze some 
instructions out of the comparison routines, which might give some extra 
speedup.

> But in cases that users really care about, such as REINDEX,
> the difference is in the noise. ISTM that this is simple not worth the
> trouble at this time. These days, external sorts are often slightly
> faster than internal sorts in practice, due to the fact that we can do
> an on-the-fly merge with external sorts, so we could easily hurt
> performance by making more memory available!

Yeah, that's a bit sad.

That makes me think: even when everything fits in memory, it might make 
sense to divide the input into a few batches, qsort them individually, 
and do an on-the-fly merge of the batches. I guess I'm essentially 
suggesting that we should use merge instead of quicksort for the 
in-memory case, too.

If we had the concept of in-memory batches, you could merge together 
in-memory and external batches. That might be handy. For example, when 
doing an external sort, instead of flushing the last run to disk before 
you start merging, you could keep it in memory. That might be 
significant in the cases where the input is only slightly too big to fit 
in memory.

- Heikki



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: [survey] New "Stable" QueryId based on normalized query text
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)