Re: Collations and Replication; Next Steps
From | Matthew Kelly |
---|---|
Subject | Re: Collations and Replication; Next Steps |
Date | |
Msg-id | 76A634FB-0BEC-4FCF-AC9C-B6EA2C50C290@tripadvisor.com |
In reply to | Re: Collations and Replication; Next Steps (Martijn van Oosterhout <kleptog@svana.org>) |
Replies | Re: Collations and Replication; Next Steps (Robert Haas <robertmhaas@gmail.com>), Re: Collations and Replication; Next Steps (Martijn van Oosterhout <kleptog@svana.org>), Re: Collations and Replication; Next Steps (Peter Eisentraut <peter_e@gmx.net>), Re: Collations and Replication; Next Steps (Bruce Momjian <bruce@momjian.us>) |
List | pgsql-hackers |
Here is where I think the timezone and PostGIS cases are fundamentally different:

I can pretty easily make sure that all my servers run in the same timezone. That's just good practice. I'm also going to install the same version of PostGIS everywhere in a cluster. I'll build PostGIS and its dependencies from the exact same source files, regardless of when I build the machine.

Timezone is a user-level setting; PostGIS is a user-level library used by a subset. glibc, however, is a system-level library, and text is a core data type. Changing glibc to a version that doesn't match the kernel can lead to system-level instability, broken linkers, etc. (I know because I tried.)

Here are some other subtle problems that fall out:

* Upgrading glibc, the kernel, and the linker through the package manager in order to get security updates can cause the corruption.
* A base backup that is taken in production and placed on a backup server might not be valid on that server, or on your desktop machine, or on the spare you keep to do PITR when someone screws up.
* Unless you keep _all_ of your clusters on the same OS, machines from your database spare pool probably won't be running the right OS when you add them to a cluster because a member failed.

Keep in mind that by OS I mean CentOS versions here. (We're running a mix of late 5.x and 6.x because of our numerous issues with the 6.x kernel.)

The problem with LC_IDENTIFICATION is that every machine I have seen reports revision "1.0", date "2000-06-24". It doesn't seem like the versioning is being actively maintained.

I'm with Martijn here: let's go ICU, if only because it moves sorting to a user-level library instead of a system-level one.

Martijn, do you have a link to the out-of-tree patch? If not, I'll find it. I'd like to apply it to a branch and start playing with it.

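As a rough illustration of the failure mode described above (not code from this thread), the sketch below simulates an index built under one sort order and searched under another. The two "collation versions" are hypothetical stand-ins, not real glibc behavior, and a sorted Python list stands in for a B-tree.

```python
import bisect

# Two stand-in "collation versions". Both are hypothetical: they only mimic
# the effect of a sort-order change between library releases.
def collate_v1(s):
    # "Old" rule: plain code-point order.
    return s

def collate_v2(s):
    # "New" rule: case-insensitive first, code-point order as a tie-breaker.
    return (s.casefold(), s)

def build_index(values, collate):
    # A sorted list stands in for the leaf order of a B-tree.
    return sorted(values, key=collate)

def index_lookup(index, value, collate):
    # Binary search is only correct if `index` is ordered under `collate`.
    keys = [collate(v) for v in index]
    pos = bisect.bisect_left(keys, collate(value))
    return pos < len(index) and index[pos] == value

names = ["Zoe", "adam", "Bob", "carl", "Dave"]

# Index built on a machine using the old rule.
index = build_index(names, collate_v1)

# Same index files, read on a machine using the new rule.
found_on_primary = [index_lookup(index, n, collate_v1) for n in names]
found_on_replica = [index_lookup(index, n, collate_v2) for n in names]

# Rebuilding the index under the new rule makes lookups consistent again.
rebuilt = build_index(names, collate_v2)
found_after_rebuild = [index_lookup(rebuilt, n, collate_v2) for n in names]
```

In this toy setup every lookup succeeds on the "primary", some rows that are present become unfindable on the "replica", and a rebuild restores them. The data never changes; only the comparison function does.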
- Matt K

On Sep 17, 2014, at 7:39 AM, Martijn van Oosterhout <kleptog@svana.org> wrote:

> On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
>> On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>>> Clearly, this is worth documenting, but I don't think we can completely
>>> prevent the problem. There has been talk of a built-in index integrity
>>> checking tool. That would be quite useful.
>>
>> We could at least use the GNU facility for versioning collations where
>> available, LC_IDENTIFICATION [1]. By not versioning collations, we are
>> going against the express advice of the Unicode consortium (they also
>> advise to do a strcmp() tie-breaker, something that I think we
>> independently discovered in 2005, because of a bug report - this is
>> what I like to call "the Hungarian issue". They know what our
>> constraints are.). I recognize it's a tricky problem, because of our
>> historic dependence on OS collations, but I think we should definitely
>> do something. That said, I'm not volunteering for the task, because I
>> don't have time. While I'm not sure of what the long term solution
>> should be, it *is not* okay that we don't version collations. I think
>> that even the best possible B-Tree check tool is not a solution.
>
> Personally I think we should just support ICU as an option. FreeBSD has
> been maintaining an out of tree patch for 10 years now so we know it
> works.
>
> The FreeBSD patch is not optimal though, these days ICU supports UTF-8
> directly so many of the push-ups FreeBSD does are no longer necessary.
> It is often faster than glibc and the key sizes for strxfrm are more
> compact [1] which is relevant for the recent optimisation patch.
>
> Let's solve this problem for once and for all.
>
> [1] http://site.icu-project.org/charts/collation-icu4c48-glibc
>
> --
> Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
>> He who writes carelessly confesses thereby at the very outset that he does
>> not attach much importance to his own thoughts.
> -- Arthur Schopenhauer
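
The strcmp() tie-breaker mentioned in the quoted message can be sketched roughly as follows. This is an illustration under stated assumptions, not PostgreSQL's actual comparison code: `collation_cmp` is a hypothetical stand-in for a locale-aware comparison that treats some distinct strings as equal, and `bytewise_cmp` approximates strcmp() by comparing UTF-8 bytes.

```python
from functools import cmp_to_key

def collation_cmp(a, b):
    # Hypothetical stand-in for a locale-aware comparison (strcoll-like).
    # It ignores case, so "abc" and "ABC" compare as equal here.
    ka, kb = a.casefold(), b.casefold()
    return (ka > kb) - (ka < kb)

def bytewise_cmp(a, b):
    # Approximation of strcmp(): compare the UTF-8 encoded bytes.
    ba, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ba > bb) - (ba < bb)

def text_cmp(a, b):
    # The collation decides first; bytes break ties, so only
    # byte-identical strings ever compare as equal.
    result = collation_cmp(a, b)
    return result if result != 0 else bytewise_cmp(a, b)

words = ["abc", "ABC", "abd", "Abc"]
ordered = sorted(words, key=cmp_to_key(text_cmp))
```

The point of the tie-breaker is that equality under the combined comparison coincides with byte equality, so the resulting order is total and deterministic even when the collation alone considers two different strings equal.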