Re: pg_background (and more parallelism infrastructure patches)

Поиск

Список

Период

Сортировка

От	Robert Haas
Тема	Re: pg_background (and more parallelism infrastructure patches)
Дата	24 октября 2014 г. 20:25:36
Msg-id	CA+TgmoZVFHQMO2MxOLt8RphnJC6ido_iFhLv3M25exiA8tN5XQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: pg_background (and more parallelism infrastructure patches) (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список	pgsql-hackers

Дерево обсуждения

On Fri, Oct 24, 2014 at 3:23 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Here's a review of patch 4.

Thanks!

> Perhaps it would be good to document the serialization format. I at least
> would have found it helpful, especially when reading
> estimate_variable_size(). I can take a stab at that if you'd rather not be
> bothered.

I find the existing code pretty clear: to my eyes,
serialize_variable() really couldn't be much more transparent.  But if
you want to suggest something I'm all ears.

> estimate_variable_size():
> +                               valsize = 5;    /* max(strlen('true'),
> strlen('false')) */
> ...
> +                               if (abs(*conf->variable) < 1000)
> +                                       valsize = 3 + 1;
>
> If we're worried about size, perhaps we should use 1/0 for bool instead of
> true/false?

I guess we could.  I'm not particularly worried about size, myself.
We need all the rigamarole about estimating the amount of space needed
because we can't allocate more space once we begin serializing.  If
the DSM where this data is getting stored turns out not to be big
enough, we can't resize it.  So we need to *know* how big the data is
going to be, but I don't think we need to *care* very much.

> On the serialization structure itself, should we be worried about a mismatch
> between available GUCs on the sender vs the receiver? Presumably if the
> sender outputs a GUC that the receiver doesn't know about we'll get an
> error, but what if the sender didn't include something? Maybe not an issue
> today, but could this cause problems down the road if we wanted to use the
> serialized data some other way? But maybe I'm just being paranoid. :)

I think the principal danger here is that there could be some loadable
module which is present in one backend but not the other, or which (in
its inimitable wisdom) chose to register a different set of GUCs or
valid values for those GUCs in one backend than the other.  That would
indeed be sad.  Given that our rules for registering GUCs resemble
nothing so much as the wild west, I don't believe there's any ironclad
way to defend against it, either.

One thing we could do - not in this patch - is provide a mechanism
that would cause a background worker to load every loadable module
that's present in the launching worker.  I'm a little reluctant to
spend time on that now.  If pg_background doesn't work with certain
combinations of loadable modules and GUC settings... it's not a
catastrophe.  It might be more of an issue for parallel query proper,
because expectations may be higher there, but if so it won't hurt to
have some experience seeing whether things go wrong with this simple
system, and in what manner.  At least IMHO, it doesn't seem like a
good idea to put fiddling with the edges of this ahead of other
parallelism-related projects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: pg_background (and more parallelism infrastructure patches)