Re: [HACKERS] Hash Functions

From Robert Haas
Subject Re: [HACKERS] Hash Functions
Date
Msg-id CA+TgmoZrz6gX7qfQKEnjwwg55crirXm2o21fwihWAj7SvdxXeQ@mail.gmail.com
In reply to Re: [HACKERS] Hash Functions  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: [HACKERS] Hash Functions  (Joe Conway <mail@joeconway.com>)
List pgsql-hackers
On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> 1. For range partitioning, I think it's "yes, a little". As you point
> out, there are already some weird edge cases -- the main way range
> partitioning would make the problem worse is simply by having more
> users.

I agree.

> But for hash partitioning I think the problems will become more
> substantial. Different encodings, endian issues, etc. will be a
> headache for someone, and potentially a bad day if they are urgently
> trying to restore on a new machine. We should remember that not
> everyone is a full-time postgres DBA, and users might reasonably think
> that the default options to pg_dump[all] will give them a portable
> dump.

I agree to an extent.  I think the problem will be worse for hash
partitioning but I might disagree with you on how much worse.  I think
that most people don't do encoding conversions very often, and that
those who do know (or should know) enough to expect trouble.  I think
most people do endian-ness conversions almost never, but since that's
a matter of hardware not configuration I'd like to paper over that
case if we can.
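
To make the endianness concern concrete, here is a minimal sketch of how a hash computed over the in-memory bytes of a key can send the same value to different partitions on different hardware. Everything in it is an assumption made for illustration: CRC32 stands in for the real hash function, and raw_hash, partition_for and the 4-partition layout are hypothetical names, not anything PostgreSQL provides.

```python
import struct
import zlib


def raw_hash(key: int, byteorder: str) -> int:
    """Hash the 4-byte in-memory image of key for the given byte order.

    CRC32 is only a stand-in for a hash that reads raw bytes.
    """
    fmt = "<i" if byteorder == "little" else ">i"
    return zlib.crc32(struct.pack(fmt, key))


def partition_for(key: int, byteorder: str, nparts: int = 4) -> int:
    """Pick a partition by taking the hash modulo the partition count."""
    return raw_hash(key, byteorder) % nparts


if __name__ == "__main__":
    # The same key can land in different partitions depending on byte order.
    for key in (1, 2, 42):
        print(key, partition_for(key, "little"), partition_for(key, "big"))
```

A dump that loads rows directly into the partitions chosen on the source machine would then violate the partition constraints on a machine with the other byte order.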

> 2. I basically see two approaches to solve the problem:
>   (a) Tom suggested at PGCon that we could have a GUC that
> automatically causes inserts to the partition to be re-routed through
> the parent. We could discuss whether to always route through the
> parent, or do a recheck on the partition constraints and only reroute
> tuples that will fail it. If the user gets into trouble, the worst
> that would happen is a helpful error message telling them to set the
> GUC. I like this idea.

Yeah, that's not crazy.  I find it a bit surprising in terms of the
semantics, though.  SET
when_i_try_to_insert_into_a_specific_partition_i_dont_really_mean_it =
true?
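
The "recheck and reroute" variant of option (a) can be sketched as a toy model. This is only an illustration under stated assumptions: HashPartitionedTable, its reroute flag (standing in for the proposed GUC) and the CRC32-based routing are all hypothetical and do not reflect any actual PostgreSQL code or setting.

```python
import zlib


class HashPartitionedTable:
    """Toy hash-partitioned table; `reroute` stands in for the proposed GUC."""

    def __init__(self, nparts: int, reroute: bool = False):
        self.parts = [[] for _ in range(nparts)]
        self.reroute = reroute

    def target(self, key: str) -> int:
        # Partition constraint: a row belongs where its hash says it belongs.
        return zlib.crc32(key.encode("utf-8")) % len(self.parts)

    def insert(self, key: str) -> int:
        """Insert through the parent: always lands in the right partition."""
        idx = self.target(key)
        self.parts[idx].append(key)
        return idx

    def insert_into_partition(self, idx: int, key: str) -> int:
        """Insert into a named partition, rechecking its constraint."""
        if self.target(key) == idx:
            self.parts[idx].append(key)
            return idx
        if self.reroute:
            # Only rows that fail the recheck go back through the parent.
            return self.insert(key)
        raise ValueError(
            f"row {key!r} violates the constraint of partition {idx}; "
            "enable rerouting to insert through the parent instead"
        )
```

With reroute left off, a misdirected row produces an error that points at the setting; with it on, the row quietly ends up in a different partition from the one named in the statement, which is the semantic surprise raised above.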

>   (b) I had suggested before that we could make the default text dump
> (and the default output from pg_restore, for consistency) route
> through the parent. Advanced users would dump with -Fc, and pg_restore
> would support an option to do partition-wise loading. To me, this is
> simpler, but users might forget to use (or not know about) the
> pg_restore option and then it would load more slowly. Also, the ship
> is sailing on range partitioning, so we might prefer option (a) just
> to avoid making any changes.

I think this is a non-starter.  The contents of the dump shouldn't
depend on the format chosen; that is bound to confuse somebody.  I
also do not wish to inflict a speed penalty on the users of
plain-format dumps.

>> 2. Add an option like --dump-partition-data-with-parent.  I'm not sure
>> who originally proposed this, but it seems that everybody likes it.
>> What we disagree about is the degree to which it's sufficient.  Jeff
>> Davis thinks it doesn't go far enough: what if you have an old
>> plain-format dump that you don't want to hand-edit, and the source
>> database is no longer available?  Most people involved in the
>> unconference discussion of partitioning at PGCon seemed to feel that
>> wasn't really something we should worry about too much.  I had been
>> taking that position also, more or less because I don't see that there
>> are better alternatives.
>
> If the suggestions above are unacceptable, and we don't come up with
> anything better, then of course we have to move on. I am worrying now
> primarily because now is the best time to worry; I don't expect any
> horrible outcome.

OK.

>> 3. Implement portable hash functions (Jeff Davis or me, not sure
>> which).  Andres scoffed at this idea, but I still think it might have
>> legs.
>
> I think it reduces the problem, which has value, but it's hard to make
> it rock-solid.

I agree.
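
As a rough illustration of what "portable" could mean here, the sketch below normalizes each value to a canonical byte string before hashing, so the result does not depend on the machine's byte order or the database encoding. It is a sketch under assumptions, not a proposal for the actual implementation: canonical_bytes and portable_hash are hypothetical names, CRC32 is a stand-in hash, and the choice of big-endian 8-byte integers and UTF-8 text is arbitrary.

```python
import struct
import zlib


def canonical_bytes(value) -> bytes:
    """Map a value to a fixed, platform-independent byte representation."""
    if isinstance(value, int):
        # Always big-endian, always 8 bytes, whatever the host uses.
        return struct.pack(">q", value)
    if isinstance(value, str):
        # Always UTF-8, whatever the database encoding is.
        return value.encode("utf-8")
    raise TypeError(f"no canonical form defined for {type(value).__name__}")


def portable_hash(value) -> int:
    """Hash the canonical form; CRC32 is only a stand-in hash function."""
    return zlib.crc32(canonical_bytes(value))


if __name__ == "__main__":
    # The stored bytes of the same text differ between encodings, which is
    # why hashing stored bytes directly is not portable.
    print("é".encode("latin-1") != "é".encode("utf-8"))
    print(portable_hash("é"), portable_hash(42))
```

The hard part is that every hashable type needs such a canonical form, and some (for example text that cannot be represented in the target encoding) have no obvious one.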

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


