On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Clearly, this is worth documenting, but I don't think we can completely
> prevent the problem. There has been talk of a built-in index integrity
> checking tool. That would be quite useful.
We could at least use the GNU facility for versioning collations where
available, LC_IDENTIFICATION [1]. By not versioning collations, we are
going against the express advice of the Unicode consortium. (They also
advise a strcmp() tie-breaker, something that I think we discovered
independently in 2005 because of a bug report; this is what I like to
call "the Hungarian issue". They know what our constraints are.) I
recognize it's a tricky problem, because of our
historic dependence on OS collations, but I think we should definitely
do something. That said, I'm not volunteering for the task, because I
don't have time. While I'm not sure what the long-term solution
should be, it *is not* okay that we don't version collations. I think
that even the best possible B-Tree check tool is not a solution.
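
To make the two ideas concrete, here is a rough sketch in C. It is not
PostgreSQL code, and all three function names are hypothetical. It assumes
glibc, where nl_langinfo() accepts the _NL_IDENTIFICATION_REVISION item;
on other platforms the lookup just returns NULL. Note that the revision
describes the locale selected for LC_IDENTIFICATION as a whole, not
LC_COLLATE specifically, so it is only a proxy for "the collation rules
changed". The last function shows the strcoll()-then-strcmp() tie-breaker.

```c
#define _GNU_SOURCE             /* for glibc's _NL_IDENTIFICATION_* items */
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the "revision" field of the current locale's
 * LC_IDENTIFICATION category, or NULL where the GNU extension is missing.
 * Callers are expected to have called setlocale() already.
 */
static const char *
current_collation_revision(void)
{
#ifdef __GLIBC__
    return nl_langinfo(_NL_IDENTIFICATION_REVISION);
#else
    return NULL;
#endif
}

/*
 * Hypothetical helper: compare the revision recorded when an index was
 * built against the revision seen now.
 * Returns 1 if they differ, 0 if they match, -1 if either is unknown.
 */
static int
collation_revision_changed(const char *recorded, const char *current)
{
    if (recorded == NULL || current == NULL)
        return -1;              /* no version information to compare */
    return strcmp(recorded, current) != 0;
}

/*
 * Locale-aware comparison with a bytewise tie-breaker: strings that the
 * locale considers equal are still given a stable, total order.
 */
static int
compare_with_tiebreak(const char *a, const char *b)
{
    int result;

    assert(a != NULL && b != NULL);

    result = strcoll(a, b);
    if (result == 0)
        result = strcmp(a, b);  /* break ties on the raw bytes */
    return result;
}
```

The idea would be to record the revision string when an index is created
and complain (or demand a REINDEX) when collation_revision_changed()
returns 1 later on. A result of -1 means we simply cannot tell, which is
the situation we are in everywhere today.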
[1] http://www.postgresql.org/message-id/CAEYLb_UTMgM2V_pP7qnuKZYmTYXoym-zNYVbwoU79=TuP8HE3A@mail.gmail.com
--
Peter Geoghegan