Re: Using quicksort for every external sort run

From: Robert Haas
Subject: Re: Using quicksort for every external sort run
Date:
Msg-id: CA+TgmoZDgbmPXu5N-yHYHJOrzvnOLZdCbaxpi_g9X4G9k2va1g@mail.gmail.com
In reply to: Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Responses: Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
On Tue, Dec 22, 2015 at 8:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Sure, there are arbitrary numbers all over the code, driven by
>> empirical observations about what factors are important to model.  But
>> this is not that.  You don't have a thing called seq_page_cost and a
>> thing called cpu_tuple_cost and then say, well, empirically the ratio
>> is about 100:1, so let's make the former 1 and the latter 0.01.  You
>> just have some numbers, and it's not clear what, if anything, they
>> actually represent.
>
> What I find difficult to accept about what you say here is that at
> *this* level, something like cost_sort() has little to recommend it.
> It costs a sort of a text attribute at the same level as the cost of
> sorting the same tuples using an int4 attribute (based on the default
> cpu_operator_cost for C functions -- without any attempt to
> differentiate text and int4).
>
> Prior to 9.5, sorting text took about 5 - 10 times longer than this
> similar int4 sort. That's a pretty big difference, and yet I recall no
> complaints. The cost of a comparison in a sort can hardly be
> considered in isolation, anyway -- cache efficiency is at least as
> important.
>
> Of course, the point is that the goal of a cost model is not to
> simulate reality as closely as possible -- it's to produce a good
> outcome for performance purposes under realistic assumptions.
> Realistic assumptions include that you can't hope to account for
> certain differences in cost. Avoiding a terrible outcome is very
> important, but the worst case for useselection() is no worse than
> today's behavior (or a lost opportunity to do better than today's
> behavior).

I agree with that.  So, the question for any given cost model is: does
it model the effects that matter?

If you think that the cost of sorting integers vs. sorting text
matters to the crossover point, then that should be modeled here.  If
it doesn't matter, then don't include it.

The point is, nobody can tell WHAT effects this is modeling.
Increasing the tuple size makes the crossover go up.  Or down.

> Recently, the paper that was posted to the list about the Postgres
> optimizer stated formally what I know I had a good intuitive sense of
> for a long time: that better selectivity estimates are much more
> important than better cost models in practice. The "empirical
> observations" driving something like DEFAULT_EQ_SEL are very weak --
> but what are you gonna do?

This analogy is faulty.  It's true that when we run across a qual
whose selectivity we cannot estimate in any meaningful way, we have to
just take a stab in the dark and hope for the best.  Similarly, if we
have no information about what the crossover point for a given sort
is, we'd have to take some arbitrary estimate, like 75%, and hope for
the best.  But in this case, we DO have information.  We have an
estimated row count and an estimated row width.  And those values are
not being ignored, they are getting used.  The problem is that they
are being used in an arbitrary way that is not justified by any chain
of reasoning.
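For illustration only (this is not the patch's actual formula), the kind
of heuristic being described boils down to multiplying the estimated row
count by the estimated width into an input size and comparing that
against sort memory scaled by some arbitrary factor, like the 75% above.
A minimal sketch, with every name and number made up for the example:

    #include <stdbool.h>
    #include <stdio.h>

    /*
     * Hypothetical illustration only: none of these names or numbers come
     * from the actual patch.  It just shows the shape of a heuristic that
     * turns the planner's row-count and width estimates into an input size
     * and compares it against sort memory scaled by an arbitrary crossover
     * factor (the "75%" mentioned above).
     */
    static bool
    use_replacement_selection(double est_rows, int est_width,
                              size_t sort_mem_bytes)
    {
        const double crossover = 0.75;   /* arbitrary, not derived from a model */
        double       est_input_bytes = est_rows * (double) est_width;

        return est_input_bytes < crossover * (double) sort_mem_bytes;
    }

    int
    main(void)
    {
        /* e.g. 10 million rows of ~100 bytes against 4MB of sort memory */
        printf("%s\n",
               use_replacement_selection(10e6, 100, 4UL * 1024 * 1024)
               ? "replacement selection" : "quicksort runs");
        return 0;
    }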

> But yes, let me concede more clearly: the cost model is based on
> frobbing. But at least it's relatively honest about that, and is
> relatively simple. I think it might be possible to make it simpler,
> but I have a feeling that anything we can come up with will basically
> have the same quality that you so dislike. I don't know how to do
> better. Frankly, I'd rather be roughly correct than exactly wrong.

Sure, but the fact that the model has huge discontinuities - perhaps
most notably a case where adding a single tuple to the estimated
cardinality changes the crossover point by a factor of two - suggests
that you are probably wrong.  The actual behavior does not change
sharply when the size of the SortTuple array crosses 1GB, but the
estimates do.  That means that either the estimates are wrong for
44,739,242 tuples or they are wrong for 44,739,243 tuples.  The
behavior cannot be right in both cases unless that one extra tuple
changes the behavior radically, or unless the estimate doesn't matter
in the first place.
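For concreteness, the 44,739,242 figure is simply the 1GB allocation cap
divided by the size of a SortTuple.  A minimal sketch of that arithmetic,
assuming a cap of 1GB - 1 bytes and a 24-byte SortTuple on a typical
64-bit build:

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /*
         * Assumptions, not taken from any particular PostgreSQL version:
         * the palloc cap (MaxAllocSize) is 1GB - 1 bytes, and a SortTuple
         * is 24 bytes on a typical 64-bit build.
         */
        const uint64_t max_alloc = 0x3fffffff;        /* 1GB - 1 */
        const uint64_t tuple_sz = 24;

        uint64_t max_tuples = max_alloc / tuple_sz;   /* 44,739,242 */

        printf("SortTuple array holds at most %llu tuples\n",
               (unsigned long long) max_tuples);
        printf("one more tuple would need %llu bytes, past the cap\n",
               (unsigned long long) ((max_tuples + 1) * tuple_sz));
        return 0;
    }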

> By the way, I think that there needs to be a little work done to
> cost_sort() too, which so far I've avoided.

Yeah, I agree, but that can be a separate topic.

>> I agree, but that's not what I proposed.  You don't want to keep
>> re-sorting to incorporate new tuples into the run, but if you've got
>> 1010 tuples and you can fit 1000 tuples in, you can (a) quicksort the
>> first 1000 tuples, (b) read in 10 more tuples, dumping the first 10
>> tuples from run 0 to disk, (c) quicksort the last 10 tuples to create
>> run 1, and then (d) merge run 0 [which is mostly in memory] with run 1
>> [which is entirely in memory].  In other words, yes, quicksorting
>> doesn't let you add things to the sort incrementally, but you can
>> still write out the run incrementally, writing only as many tuples as
>> you need to dump to get the rest of the input data into memory.
>
> Merging is still sorting. The 10 tuples are not very cheap to merge
> against the 1000 tuples, because you'll probably still end up reading
> most of the 1000 tuples to do so.

You're going to read all of the 1000 tuples no matter what, because
you need to return them, but you will also need to make comparisons on
most of them, unless the data distribution is favorable.   Assuming no
special good luck, it'll take something close to X + Y - 1 comparisons
to do the merge, so something around 1009 comparisons here.
Maintaining the heap property is not free either, but it might be
cheaper.
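To put a number on the X + Y - 1 bound, here is a standalone sketch (not
tuplesort code) that merges two sorted integer arrays and counts
comparisons; with 1000 and 10 elements spread so that neither input runs
out early, it reports exactly 1009:

    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Merge two sorted int arrays, counting comparisons.  The worst case
     * is x_len + y_len - 1 comparisons; it is cheaper only when one input
     * is exhausted early (a favorable data distribution).
     */
    static long
    merge_count(const int *x, size_t x_len, const int *y, size_t y_len,
                int *out)
    {
        long   comparisons = 0;
        size_t i = 0, j = 0, k = 0;

        while (i < x_len && j < y_len)
        {
            comparisons++;
            out[k++] = (x[i] <= y[j]) ? x[i++] : y[j++];
        }
        while (i < x_len)
            out[k++] = x[i++];
        while (j < y_len)
            out[k++] = y[j++];

        return comparisons;
    }

    int
    main(void)
    {
        enum { X_LEN = 1000, Y_LEN = 10 };
        int x[X_LEN], y[Y_LEN], out[X_LEN + Y_LEN];

        /*
         * Spread the 10 "run 1" tuples across the full key range so the
         * merge cannot terminate early; this forces the worst case of
         * X_LEN + Y_LEN - 1 = 1009 comparisons.
         */
        for (int i = 0; i < X_LEN; i++)
            x[i] = 2 * i;                   /* 0, 2, ..., 1998 */
        for (int i = 0; i < Y_LEN; i++)
            y[i] = 200 * (i + 1) - 1;       /* 199, 399, ..., 1999 */

        printf("comparisons: %ld\n",
               merge_count(x, X_LEN, y, Y_LEN, out));
        return 0;
    }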

> The cost of heapification of 1.01 million tuples to spill 0.01 million
> tuples is pretty low (relative to the cost of sorting them in
> particular). The only difference between what you say here and what I
> actually do is that the remaining tuples are heapified rather than
> sorted, and I quicksort everything together to "merge run 1 and run 0"
> rather than doing two quicksorts and a merge. I believe that this can
> be demonstrated to be cheaper.
>
> Another factor is that the heap could be useful for other stuff in the
> future. As Simon Riggs pointed out, for deduplicating values as
> they're read in by tuplesort. (Okay, that's really the only other
> thing, but it's a good one).

Not sure how that would work?
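On the heapification cost quoted above, a rough standalone sketch (again,
not tuplesort code): Floyd-style bottom-up heap construction over 1.01
million random integers costs only a small constant number of comparisons
per element, versus roughly n*log2(n) total comparisons for a full sort,
which is the gap being relied on.  The sizes are picked to match the
example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    static long comparisons = 0;

    /* Sift the element at position i down in a binary min-heap of size n. */
    static void
    sift_down(int *a, size_t n, size_t i)
    {
        for (;;)
        {
            size_t left = 2 * i + 1;
            size_t right = left + 1;
            size_t smallest = i;

            if (left < n)
            {
                comparisons++;
                if (a[left] < a[smallest])
                    smallest = left;
            }
            if (right < n)
            {
                comparisons++;
                if (a[right] < a[smallest])
                    smallest = right;
            }
            if (smallest == i)
                break;

            int tmp = a[i];
            a[i] = a[smallest];
            a[smallest] = tmp;
            i = smallest;
        }
    }

    int
    main(void)
    {
        size_t n = 1010000;                 /* "1.01 million tuples" */
        int   *a = malloc(n * sizeof(int));

        if (a == NULL)
            return 1;
        for (size_t i = 0; i < n; i++)
            a[i] = rand();

        /* Bottom-up (Floyd) heap construction: O(n) comparisons total. */
        for (size_t i = n / 2; i-- > 0; )
            sift_down(a, n, i);

        printf("heapify comparisons: %ld (~%.1f per tuple)\n",
               comparisons, (double) comparisons / n);
        printf("n*log2(n) for a full sort: ~%.0f\n", n * log2((double) n));

        free(a);
        return 0;
    }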

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


