Re: Collations and Replication; Next Steps

From	Matthew Kelly
Subject	Re: Collations and Replication; Next Steps
Date	17 September 2014, 16:08:12
Msg-id	76A634FB-0BEC-4FCF-AC9C-B6EA2C50C290@tripadvisor.com
In reply to	Re: Collations and Replication; Next Steps (Martijn van Oosterhout <kleptog@svana.org>)
Responses	Re: Collations and Replication; Next Steps (Robert Haas <robertmhaas@gmail.com>)
	Re: Collations and Replication; Next Steps (Martijn van Oosterhout <kleptog@svana.org>)
	Re: Collations and Replication; Next Steps (Peter Eisentraut <peter_e@gmx.net>)
	Re: Collations and Replication; Next Steps (Bruce Momjian <bruce@momjian.us>)
List	pgsql-hackers

Here is where I think the timezone and PostGIS cases are fundamentally different:
I can pretty easily make sure that all my servers run in the same timezone.  That's just good practice.  I'm also going
to install the same version of PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from the exact
same source files, regardless of when I build the machine.

Timezone is a user level setting; PostGIS is a user level library used by a subset.

glibc is a system level library, and text is a core data type, however.  Changing versions to something that doesn't
match the kernel can lead to system level instability, broken linkers, etc.  (I know because I tried).  Here are some
subtle other problems that fall out:
* Upgrading glibc, the kernel, and linker through the package manager in order to get security updates can cause the
  corruption.
* A basebackup that is taken in production and placed on a backup server might not be valid on that server,
  or your desktop machine, or on the spare you keep to do PITR when someone screws up.
* Unless you keep _all_ of your clusters on the same OS, machines from your database spare pool probably won't be the
  right OS when you add them to the cluster because a member failed.
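
The corruption mechanism in these cases can be sketched in a few lines of Python.  This is a toy model, not PostgreSQL
code: a sorted list stands in for a B-tree index, and the two key functions (`old_collation`, `new_collation`) are
hypothetical stand-ins for the sort rules of two different glibc versions.

```python
def lookup(index, target, collate):
    """Binary search in `index`, comparing entries through the `collate` key."""
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if collate(index[mid]) < collate(target):
            lo = mid + 1
        else:
            hi = mid
    return lo < len(index) and index[lo] == target


def old_collation(s):
    # Hypothetical rule on the machine that built the index: code point order.
    return s


def new_collation(s):
    # Hypothetical rule on the machine that reads the index: case-insensitive.
    return s.lower()


# The "index" is built (sorted) under the old rules.
index = sorted(["a", "B", "c", "D"], key=old_collation)

# Same bytes on disk, same search key, different comparison rules.
found_on_primary = lookup(index, "a", old_collation)
found_on_replica = lookup(index, "a", new_collation)
print(found_on_primary, found_on_replica)
```

The row is still physically present in the list; the second lookup misses it only because the search descends in the
wrong direction once the comparison rules no longer match the order the entries were stored in.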

Keep in mind here, by OS I mean CentOS versions.  (We're running a mix of late 5.x and 6.x, because of our numerous
issues with the 6.x kernel.)

The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24".  It
doesn't seem like the versioning is being actively maintained.
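
A minimal sketch of that check in Python, assuming the `keyword="value"` lines printed by
`locale -k LC_IDENTIFICATION`.  The function names and the two sample captures are hypothetical; the captures simply
reuse the revision and date quoted above.

```python
def parse_locale_keywords(output):
    """Turn `keyword="value"` lines into a dict, dropping the quotes."""
    fields = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            fields[name.strip()] = value.strip().strip('"')
    return fields


def collation_version(output):
    """Return the (revision, date) pair a locale reports about itself."""
    fields = parse_locale_keywords(output)
    return fields.get("revision"), fields.get("date")


# Hypothetical captures from two hosts running different glibc builds.
host_a_output = 'revision="1.0"\ndate="2000-06-24"\n'
host_b_output = 'revision="1.0"\ndate="2000-06-24"\n'

# If this is True, the tag alone cannot tell the two hosts apart.
same_version = collation_version(host_a_output) == collation_version(host_b_output)
print(same_version)
```

A version tag is only useful for detecting drift if it changes whenever the sort order changes, which is exactly what
the identical values call into question.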

I'm with Martijn here, let's go ICU, if only because it moves sorting to a user level library, instead of a system
level one. Martijn, do you have a link to the out of tree patch?  If not I'll find it.  I'd like to apply it to a branch and
start playing with it.

- Matt K

On Sep 17, 2014, at 7:39 AM, Martijn van Oosterhout <kleptog@svana.org> wrote:

> On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
>> On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>>> Clearly, this is worth documenting, but I don't think we can completely
>>> prevent the problem.  There has been talk of a built-in index integrity
>>> checking tool.  That would be quite useful.
>>
>> We could at least use the GNU facility for versioning collations where
>> available, LC_IDENTIFICATION [1]. By not versioning collations, we are
>> going against the express advice of the Unicode consortium (they also
>> advise to do a strcmp() tie-breaker, something that I think we
>> independently discovered in 2005, because of a bug report - this is
>> what I like to call "the Hungarian issue". They know what our
>> constraints are.). I recognize it's a tricky problem, because of our
>> historic dependence on OS collations, but I think we should definitely
>> do something. That said, I'm not volunteering for the task, because I
>> don't have time. While I'm not sure of what the long term solution
>> should be, it *is not* okay that we don't version collations. I think
>> that even the best possible B-Tree check tool is not a solution.
>
> Personally I think we should just support ICU as an option. FreeBSD has
> been maintaining an out of tree patch for 10 years now so we know it
> works.
>
> The FreeBSD patch is not optimal though, these days ICU supports UTF-8
> directly so many of the push-ups FreeBSD does are no longer necessary.
> It is often faster than glibc and the key sizes for strxfrm are more
> compact [1] which is relevant for the recent optimisation patch.
>
> Let's solve this problem for once and for all.
>
> [1] http://site.icu-project.org/charts/collation-icu4c48-glibc
>
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
>> He who writes carelessly confesses thereby at the very outset that he does
>> not attach much importance to his own thoughts.
>   -- Arthur Schopenhauer
