Re: Add parallelism and glibc dependent only options to reindexdb

From	Julien Rouhaud
Subject	Re: Add parallelism and glibc dependent only options to reindexdb
Date	July 1, 2019, 16:14:20
Msg-id	CAOBaU_bg3VheGYkjjvPd6Buw2Uk7yqAAGwXy7BXm6DwytWAdwA@mail.gmail.com
In response to	Re: Add parallelism and glibc dependent only options to reindexdb (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses	Re: Add parallelism and glibc dependent only options to reindexdb
List	pgsql-hackers

On Mon, Jul 1, 2019 at 3:51 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Please don't reuse a file name as generic as "parallel.c" -- it's
> annoying when navigating source.  Maybe conn_parallel.c multiconn.c
> connscripts.c admconnection.c ...?

I could use scripts_parallel.[ch] as I've already used it in the #define part?

> If your server crashes or is stopped midway during the reindex, you
> would have to start again from scratch, and it's tedious (if it's
> possible at all) to determine which indexes were missed.  I think it
> would be useful to have a two-phase mode: in the initial phase reindexdb
> computes the list of indexes to be reindexed and saves them into a work
> table somewhere.  In the second phase, it reads indexes from that table
> and processes them, marking them as done in the work table.  If the
> second phase crashes or is stopped, it can be restarted and consults the
> work table.  I would keep the work table, as it provides a bit of an
> audit trail.  It may be important to be able to run even if unable to
> create such a work table (because of the <ironic>numerous</> users that
> DROP DATABASE postgres).

Or we could create the work table locally in each database, which would
fix this problem and probably make the code simpler?

It also raises some additional concerns about data expiration.  For
instance, someone could launch the tool by mistake, kill reindexdb,
and run it again 2 months later, by which time a lot of new objects
may have been added.
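
To make the two-phase idea and the expiration concern more concrete,
here is a rough sketch, not part of the patch.  It uses Python with an
in-memory SQLite database purely as a stand-in for a per-database work
table.  The table name reindex_work, its columns, the helper names and
the MAX_AGE_SECONDS threshold are all assumptions made up for
illustration.

```python
import sqlite3
import time

# Hypothetical threshold after which a saved plan is considered too old.
MAX_AGE_SECONDS = 7 * 24 * 3600


def plan(conn, index_names, now=None):
    """Phase 1: record the indexes to process in a work table."""
    now = time.time() if now is None else now
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reindex_work ("
        " index_name TEXT PRIMARY KEY,"
        " planned_at REAL NOT NULL,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    # INSERT OR IGNORE keeps rows from an earlier, interrupted run intact.
    conn.executemany(
        "INSERT OR IGNORE INTO reindex_work (index_name, planned_at)"
        " VALUES (?, ?)",
        [(name, now) for name in index_names],
    )
    conn.commit()


def is_stale(conn, now=None):
    """Return True if the oldest planned entry exceeds MAX_AGE_SECONDS."""
    now = time.time() if now is None else now
    row = conn.execute("SELECT MIN(planned_at) FROM reindex_work").fetchone()
    return row[0] is not None and now - row[0] > MAX_AGE_SECONDS


def process(conn, reindex, limit=None):
    """Phase 2: process pending entries, marking each one done.

    `reindex` is a caller-supplied callable standing in for the real
    REINDEX command.  `limit` only exists to simulate an interrupted run.
    """
    pending = [
        row[0]
        for row in conn.execute(
            "SELECT index_name FROM reindex_work"
            " WHERE done = 0 ORDER BY index_name"
        )
    ]
    processed = []
    for name in pending[:limit]:
        reindex(name)
        conn.execute(
            "UPDATE reindex_work SET done = 1 WHERE index_name = ?", (name,)
        )
        conn.commit()  # commit per index so a crash loses at most one entry
        processed.append(name)
    return processed
```

A restarted run would simply call process() again and pick up only the
rows still marked as not done.  A check like is_stale() before resuming
is one possible way to detect a plan that is too old to trust, in which
case the tool could refuse to continue or recompute the list.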

> The "glibc filter" thing (which I take to mean "indexes that depend on
> collations") would apply to the first phase: it just skips adding other
> indexes to the work table.  I suppose ICU collations are not affected,
> so the filter would be for glibc collations only?

Indeed, ICU shouldn't need such a filter.  Indexes based on
xxx_pattern_ops are also excluded.
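
As a rough illustration of that filter (again only a sketch, not the
patch's code), the rule could look like the predicate below.  The
IndexInfo record, its field names and the provider strings are
hypothetical.  The real tool would get this information from the
system catalogs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical summary of one index, for illustration only."""
    name: str
    collation_provider: str  # assumed values: "libc", "icu" or "none"
    opclass: str             # e.g. "text_ops" or "text_pattern_ops"


def depends_on_glibc(index):
    """Return True if the index should be kept by the glibc filter."""
    # ICU collations and non-collatable indexes are not affected.
    if index.collation_provider != "libc":
        return False
    # xxx_pattern_ops indexes are excluded as well.
    if index.opclass.endswith("_pattern_ops"):
        return False
    return True


def first_phase(indexes):
    """Keep only the names of glibc-dependent indexes for the work list."""
    return [index.name for index in indexes if depends_on_glibc(index)]
```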

>  The --glibc-dependent
> switch seems too ad-hoc.  Maybe "--exclude-rule=glibc"?  That way we can
> add other rules later.  (Not "--exclude=foo" because we'll want to add
> the possibility to ignore specific indexes by name.)

That's a good point, I like the --exclude-rule switch.
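
To show the option shape under discussion, here is a small sketch using
Python's argparse.  reindexdb itself is a C program, so this is only an
illustration.  The program name, the KNOWN_RULES set and the separate
--exclude option for index names are assumptions drawn from the
suggestion above, not existing switches.

```python
import argparse

# Hypothetical set of rule names; more could be added later.
KNOWN_RULES = {"glibc"}


def build_parser():
    """Build a parser with a repeatable rule switch and a name switch."""
    parser = argparse.ArgumentParser(prog="reindexdb-sketch")
    parser.add_argument(
        "--exclude-rule",
        action="append",
        default=[],
        choices=sorted(KNOWN_RULES),
        metavar="RULE",
        help="apply a named filtering rule (may be given several times)",
    )
    parser.add_argument(
        "--exclude",
        action="append",
        default=[],
        metavar="INDEX",
        help="ignore a specific index by name (may be given several times)",
    )
    return parser
```

Keeping rules and index names in separate switches means a new rule
name can never collide with the name of an existing index.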
