Re: Strangeness with UNIQUE indexes and UTF-8

Поиск
Список
Период
Сортировка
От Omar Kilani
Тема Re: Strangeness with UNIQUE indexes and UTF-8
Дата
Msg-id CA+8F9hiMLco2N9BG4dnXAjm1f5vR1bCHs5xWH8x7QmQ8=e=FQg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Strangeness with UNIQUE indexes and UTF-8  (Omar Kilani <omar.kilani@gmail.com>)
Ответы Re: Strangeness with UNIQUE indexes and UTF-8  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Strangeness with UNIQUE indexes and UTF-8  (Magnus Hagander <magnus@hagander.net>)
Список pgsql-hackers
I mean, maybe it's because I've been awake since... 7am yesterday, but
it seems to me that if Postgres fails catastrophically silently (and I
would say "it looks like all your data in this table disappeared
because of some arcane locale / btree issue that no one except Tom
Lane even knows exists" -- see the replies about hardware issues and
ON CONFLICT as an example) -- then maybe that is... not good, and
Postgres shouldn't do that?

Not only that, it's only indices which have non-ASCII or whatever in
them that silently fail, so it's like 95% of your indices work just
fine, but the ones that don't... look fine. They're not corrupt on
disk, they have their full size, etc.

How is anyone supposed to know about this issue? I've been using
Postgres since 1999, built the Postgres website, worked with Neil and
Gavin on Postgres, submitted patches to Postgres and various
Postgres-related projects, and this is the first time I've become
aware of it. I mean, maybe I'm dumb, and... fine. But your average
user is going to have no idea about this.

Why can't some "locale signature" or something be encoded into the
index so Postgres can at least warn you? Or not use the messed up
index altogether instead of silently returning no data?

On Sun, Jun 6, 2021 at 2:06 PM Omar Kilani <omar.kilani@gmail.com> wrote:
>
> We do use ON CONFLICT… it doesn’t work because the index is both “good” and “bad” at the same time.
>
> On Sun, Jun 6, 2021 at 2:03 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>>
>> On Sun, Jun 06, 2021 at 03:54:48AM -0700, Omar Kilani wrote:
>> > What I sort of don't get is... before we insert anything into these
>> > tables, we always check to see if a value already exists. And Postgres
>> > must be returning no results for some reason. So it goes to insert a
>> > duplicate value which somehow succeeds despite the unique index, but
>> > then a reindex says it's a duplicate. Pretty weird.
>>
>> In addition to the other issues, this is racy.
>>
>> You 1) check if a key exists, and if not then 2) INSERT (or maybe you UPDATE if
>> it did exist).
>>
>> https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
>>
>> Maybe you'll say that "this process only runs once", but it's not hard to
>> imagine that might be violated.  For example, if you restart a multi-threaded
>> process, does the parent make sure that the child processes die before itself
>> dying?  Do you create a pidfile, and do you make sure the children are dead
>> before removing the pidfile ?
>>
>> The right way to do this since v9.6 is INSERT ON CONFLICT, which is also more
>> efficient in a couple ways.
>>
>> --
>> Justin



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Omar Kilani
Дата:
Сообщение: Re: Strangeness with UNIQUE indexes and UTF-8
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Multiple hosts in connection string failed to failover in non-hot standby mode