Re: Strangeness with UNIQUE indexes and UTF-8

Поиск
Список
Период
Сортировка
От Magnus Hagander
Тема Re: Strangeness with UNIQUE indexes and UTF-8
Дата
Msg-id CABUevEwE5F6H2kkWyFs1Ox3zEdjzikrbuYjEJqgjKGWpBz=0fQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Strangeness with UNIQUE indexes and UTF-8  (Omar Kilani <omar.kilani@gmail.com>)
Список pgsql-hackers
On Sun, Jun 6, 2021 at 11:20 PM Omar Kilani <omar.kilani@gmail.com> wrote:
>
> I mean, maybe it's because I've been awake since... 7am yesterday, but
> it seems to me that if Postgres fails catastrophically silently (and I
> would say "it looks like all your data in this table disappeared
> because of some arcane locale / btree issue that no one except Tom
> Lane even knows exists" -- see the replies about hardware issues and
> ON CONFLICT as an example) -- then maybe that is... not good, and
> Postgres shouldn't do that?

It is most definitely not an "arcane issue no one except Tom lane even
knows exists". I would assume most people who work with consulting or
support around PostgreSQL know it exists, because some of their
customers have hit it :/

I think it's more in the other direction -- those people are more
likely to dismiss that issue as "the person reporting this will
already have checked this, it must be something else"...



> Not only that, it's only indices which have non-ASCII or whatever in
> them that silently fail, so it's like 95% of your indices work just
> fine, but the ones that don't... look fine. They're not corrupt on
> disk, they have their full size, etc.

No it's not. ASCII will also fail in many cases. Did you read the page
that you were linked to? It even includes an example of why ASCII
cases will also fail.

It's only non-text indexes that are "safe".


> How is anyone supposed to know about this issue? I've been using
> Postgres since 1999, built the Postgres website, worked with Neil and
> Gavin on Postgres, submitted patches to Postgres and various
> Postgres-related projects, and this is the first time I've become
> aware of it. I mean, maybe I'm dumb, and... fine. But your average
> user is going to have no idea about this.

This problem has been around before, just usually doesn't affect the
English locale. Surely if you've spent that much time around Postgres
and in the community you must've heard about it before?

And this particular issue has been written about numerous times, which
has been published through the postgres website and blog aggregators.

It is definitely a weakness in how PostgreSQL does things, but it's a
pretty well known weakness by now.


> Why can't some "locale signature" or something be encoded into the
> index so Postgres can at least warn you? Or not use the messed up
> index altogether instead of silently returning no data?

If you use ICU for your text indexes, it does exactly that. The page
at https://www.postgresql.org/docs/13/sql-altercollation.html shows
you examples of what wouldh appen in that case. (This page also
documents that there is no version tracking for the built-in
collections, but I definitely agree that's pretty well hidden-away by
being on the reference page of alter collation..)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Decoding speculative insert with toast leaks memory
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: Decoding speculative insert with toast leaks memory