Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY

Поиск
Список
Период
Сортировка
От Michail Nikolaev
Тема Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY
Дата
Msg-id CANtu0ogJ4Xmz9kfmniE8SPuomOcN7Jwv54xF7Z_pQ3iHw2xC3w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY
Список pgsql-hackers
Hello, Michael!

> As far as I can see, it depends on what kind of query semantics and
> the amount of transparency you are looking for here in your
> application.  An error in the query itself can also be defined as
> useful so as your application is aware of what happens as an effect of
> the concurrent index build (reindex or CIC/DIC), and it is not really
> clear to me why silently falling back to a re-selection of the arbiter
> indexes would be always better.

From my point of view, INSERT ON CONFLICT UPDATE should never fail with "ERROR:  duplicate key value violates unique constraint" because the main idea of upsert is to avoid such situations.
So, it is expected by majority and, probably, is even documented.

On the other side, REINDEX CONCURRENTLY should not cause any queries to fail accidentally without any clear reason.

Also, as you can see from the topic starter letter, we could see errors like this:

* ERROR:  duplicate key value violates unique constraint "tbl_pkey"
* ERROR:  duplicate key value violates unique constraint "tbl_pkey_ccnew"
* ERROR:  duplicate key value violates unique constraint "tbl_pkey_ccold"

So, the first error message does not provide any clue for the developer to understand what happened.

> - The planner ignores indexes with !indisvalid.
> - The executor ignores indexes with !indislive.

Yes, and it feels like we need one more flag here to distinguish !indisvalid indexes which are going to become valid and which are going to become !indislive.

For example, let name it as indiscorrect (it means it contains all the data). In such case, we may use the following logic:

1) !indisvalid && !indiscorrect - index in validation phase probably, do not use it as arbiter because it does not contain all the data yet
2) !indisvalid && indiscorrect - index will be dropped most likely. Do not plan new queries with it, but it still may be used by other queries (including upserts). So, we still need to include it to the arbiters.

And, during the reindex concurrently:

1) begin; mark new index as indisvalid and indiscorrect; mark old one as !indisvalid but still indiscorrect. invalidate relcache; commit;

Currently, some queries are still using the old one as arbiter, some queries use both.

2) WaitForLockersMultiple

Now all queries use both indexes as arbiter.

3) begin; mark old index as !indiscorrect, additionally to !indisvalid; invalidate cache; commit;

Now, some queries use only the new index, both some still use both.

4)  WaitForLockersMultiple;

Now, all queries use only the new index - we are safe to mark the old one it as !indislive.

> It should, but I'm wondering if that's necessary for two reasons.
In that case, it becomes:

    Assert(indexRelation->rd_index->indiscorrect);
    Assert(indexRelation->rd_index->indislive);

and it is always the valid check.

Best regards,
Mikhail.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Backporting BackgroundPsql
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin