Re: pg_background (and more parallelism infrastructure patches)

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: pg_background (and more parallelism infrastructure patches)
Дата
Msg-id 20141030232819.GL17724@awork2.anarazel.de
обсуждение исходный текст
Ответ на Re: pg_background (and more parallelism infrastructure patches)  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: pg_background (and more parallelism infrastructure patches)
Список pgsql-hackers
Hi,

On 2014-10-30 18:03:27 -0400, Robert Haas wrote:
> On Wed, Oct 29, 2014 at 4:58 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> That's true.  I don't know what to do about it.  I'm somewhat inclined
> >> to think that, if this remains in contrib, it's OK to ignore those
> >> issues until such time as people complain about them, because anybody
> >> who dislikes the things that can be done with this extension doesn't
> >> have to install it.  Also, the people complaining might have useful
> >> ideas about what a good fix would look like, which I currently don't.
> >> There's some push to move this into core, which I think is overkill,
> >> but if we do it then we'd better have a good solution to this problem.
> >
> > At the very least it need to be clearly documented. Another solution
> > would be to simply not give out PUBLIC rights, and restrict it to the
> > owner/superuesers lest somebody makes explicit grants. I favor
> > combining those two.
> 
> I don't think it's appropriate to put superuser() checks in the code,
> if that's what you are proposing.

That's not at all what I'm thinking of... The superuser reference was
only that explicit EXECUTE rights aren't required for superusers.

> Forcing this to be super-user only is hitting a mouse with a
> battleship.  If you're saying we should put REVOKE commands into the
> install script as we do for some other modules, like dblink, that
> makes sense to me.

But instead this.

> >> We could try to make connection limits apply to pg_background, and we
> >> could also check CONNECT permission when starting a background worker.
> >> Both of those things feel slightly odd because there's no actual
> >> server connection.  There *might* be a connection to the user backend
> >> that started it, but it's sort of a "virtual" connection through
> >> shared memory, and the background process continues running unimpeded
> >> if it goes away, so there might be no actual connection at all.
> >
> > I think that'd not be bad.
> 
> Looks like those checks happen in InitializeSessionUserId(), which is
> called from InitPostgres(), which is called from
> BackgroundWorkerInitializeConnection().  That makes me think we're
> already applying these checks.

Oh, neat.

> > Hm. I'm unconvinced. It looks almost trivial to fail back to the text
> > based protocol.
> 
> It's hard to argue with "I'm unconvinced".  What specifically about
> that argument do you think isn't valid?

We don't normally just say "this is unsupported" for things that are
perfectly valid. And types without binary send/recv functions imo are
perfectly valid. You've made that kind of argument yourself more than
once, no?
It's really not a academic thing. Popular extensions like postgis don't
provide send/recv for all its types. Same with a couple of our own
contrib types (intarray, chkpass, seg, cube).

I guess we need input from others on this point.

> While I am sure the problem can be worked around, it isn't trivial.
> Right now, execute_sql_string() just requests binary format
> unconditionally.  To do what you're talking about, we'd need to
> iterate through all of the types and figure out which ones have
> typsend/typreceive functions.  If there's a convenience function that
> will do that for us, I don't see it, probably because I believe there
> are exact zero situations where we do that kind of inference today.
> Then, the user backend has to save the format codes from the
> RowDescription message and decide whether to use text or binary.  That
> just seems like a silly waste of code and cycles.

The current client/server protocol definitely can do it. You can specify
per column whether you want binary/textual data. I'm pretty sure that at
least the jdbc driver does that.

It's also what I've decided to do for BDR's output plugin. Some builtin
types will be transferred 'as is' if the architecture/version
matches. Otherwise, if available and the version matches, send/recv will
be used (exluding some corner case, but ..). Only if neither is the case
if falls back to the textual protocol.

> I think this actually matters, too, because the question is what we're
> going to do with full-blown parallelism.

I think for parallelism it'll most definitely not be acceptable to not
support types without send/recv. So building some convenience
infrastructure here imo is going to pay off. It's not like it's that
hard to detect - just check OidIsValid(pg_type->typsend). The pg_type
tuple has to be looked up anyway to find the send/recv functions...

> Best would be to actually
> shuttle entire raw tuples between backends; second best, binary
> format; third best, text format or mixture of text and binary.  I'm
> not sure what it's reasonable to try to get away with there, but I
> certainly think minimizing IPC costs is going to be an important goal.
> I didn't try to get around with shipping raw tuples here because we
> don't lock types while they're in use, and I'm worried that Bad Things
> Could Happen.

Yea, because of type changes and such I went with using the 'as-is'
format only for builtin types (via att->atttypid < FirstNormalObjectId)
- we know those aren't changed. And the majority of the data usually
uses such types...

> > There's some PGC_BACKEND guc's that normally are only allowed to be set
> > at connection start. I'm not sure whether you're just circumventing that
> > check or whether you didn't hit a problematic case.
> >
> > What does e.g. happen if you set PGOPTIONS='-c
> > local_preload_libraries=auto_explain' after linking auto_explain into
> > the plugin directory?
> 
> Eh, I'm embarrassed to admit I don't know exactly how to do this.  My
> install directory doesn't seem to have a subdirectory called "plugin".

Sadly you need to create it. Inside the normal $libdir. I.e. $(pg_config
--libdir)/plugins. And then (sym-)link/copy libraries that you want
every user to use in there.

> BTW, are we in agreement, more or less, on patch #3 now, the support
> for error propagation?  I'd like to go ahead and commit that if there
> are no further review comments there.

You're talking about
0003-Support-frontend-backend-protocol-communication-usin-v3.patch from
http://archives.postgresql.org/message-id/CA%2BTgmoY9B0mL7Ci1fG1cZ4-zwDHOjjQaKyLowB3F-jsbuvVw2g%40mail.gmail.com?

If so, go ahead from my POV. It seems likely that we'll whack it around
some more, but I don't see an advantage in waiting longer.

Andres

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: TAP test breakage on MacOS X
Следующее
От: Tom Lane
Дата:
Сообщение: Re: TAP test breakage on MacOS X