Re: Add on_perl_init and proper destruction to plperl [PATCH]

Поиск
Список
Период
Сортировка
От Tim Bunce
Тема Re: Add on_perl_init and proper destruction to plperl [PATCH]
Дата
Msg-id 20100127143318.GE713@timac.local
обсуждение исходный текст
Ответ на Re: Add on_perl_init and proper destruction to plperl [PATCH]  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Wed, Jan 27, 2010 at 01:14:16AM -0500, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > Tim Bunce wrote:
> >> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
> >> SPI functions are not available when the code is run.
> >> 
> >> - Added normal interpreter destruction behaviour
> >> END blocks, if any, are run then objects are
> >> destroyed, calling their DESTROY methods, if any.
> >> SPI functions will die if called at this time.
> 
> > So, are there still objections to applying this patch?
> 
> Yes.

To focus the discussion I've looked back through all the messages from
you that relate to this issue so I can summarize and try to address your
objections.

Some I've split or presented out of order, most relate to earlier (less
restricted) versions of the patch before it was split out, and naturally
they are lacking some context, so I've included archive URLs.

Please forgive and correct me if I misrepresent you or your intent here.


Regarding the utility of plperl.on_perl_init and END:
   http://archives.postgresql.org/message-id/18338.1260033447@sss.pgh.pa.us   The question is not about whether we
thinkit's useful; the question    is about whether it's safe.
 

I agree.


Regarding visibility of changes to plperl.on_perl_init:
   http://archives.postgresql.org/message-id/28618.1259952660@sss.pgh.pa.us   What is to happen if the admin changes
thevalue when the system is already up?
 

If a GUC could be defined as PGC_BACKEND and only settable by superuser,
perhaps that would be a good fit. [GucContext seems to conflate some things.]
Meanwhile the _init name is meant to convey the fact that it's a
before-first-use GUC, like temp_buffers.

I'm happy to accept whatever you'd recommend by way of PGC_* GUC selection.
Documentation can note any caveats associated with combining
plperl.on_perl_init with shared_preload_libraries.
   http://archives.postgresql.org/message-id/4516.1263168347@sss.pgh.pa.us   However, I think PGC_SIGHUP would be
enoughto address my basic   worry, which is that people shouldn't be depending on the ability to set   these things
withinan individual session.
 

The patch uses PGC_SIGHUP for plperl.on_perl_init.

   http://archives.postgresql.org/message-id/8950.1259994082@sss.pgh.pa.us   > Tom, what's your objection to Shlib load
timebeing user-visible?                                                                                      It's not
reallydesigned to be user-visible.  Let me give you just   two examples:
 
   * We call a plperl function for the first time in a session, causing   plperl.so to be loaded.  Later the
transactionfails and is rolled   back.  If loading plperl.so caused some user-visible things to happen,   should those
berolled back?  If so, how do we get perl to play along?   If not, how do we get postgres to play along?
 

I believe that's addressed by spi functions being disabled when init code runs.
   * We call a plperl function for the first time in a session, causing   plperl.so to be loaded.  This happens in the
contextof a superuser   calling a non-superuser security definer function, or perhaps vice   versa.  Whose permissions
applyto whatever the on_load code tries   to do?  (Hint: every answer is wrong.)
 

I think that related to on_*trusted_init not plperl.on_perl_init, and
is also addressed by spi functions being disabled when init code runs.
   That doesn't even begin to cover the problems with allowing any of   this to happen inside the postmaster.  Recall
thatthe postmaster   does not have any database access.
 

I believe that's addressed by spi functions being disabled when init code runs.
   Furthermore, it is a very long   established reliability principle around here that the postmaster   process should
doas little as possible, because every thing that it   does creates another opportunity to have a nonrecoverable
failure.  The postmaster can recover if a child crashes, but the other way   round, not so much.
 

I understand that concern. Ultimately, though, that comes down to the
judgement of DBAs and the trust placed in them. They can already
load arbitrary code via shared_preload_libraries.

   http://archives.postgresql.org/message-id/18338.1260033447@sss.pgh.pa.us   > I think if we do this the on_perl_init
settingshould probably be   > PGC_POSTMASTER, which would remove any issue about it changing   > underneath us.
 
   Yes, if the main intended usage is in combination with preloading perl   at postmaster start, it would be pointless
toimagine that PGC_SIGHUP   is useful anyway.
 
   http://archives.postgresql.org/message-id/17793.1260031296@sss.pgh.pa.us   Yeah, in the shower this morning I was
thinkingthat not loading   SPI till after the on_init code runs would alleviate the concerns   about transactionality
andpermissions --- that would ensure that   whatever on_init does affects only the Perl world and not the database
world.

That's included in the current patch (and also applies to END blocks).
   However, we're not out of the woods yet.  In a trusted interpreter   (plperl not plperlu), is the on_init code
executedbefore we lock down   the interpreter with Safe?  I would think it has to be since the main   point AFAICS is
tolet you preload code via "use".  But then what is   left of the security guarantees of plperl?  I can hardly imagine
DBAs  wanting to vet a few thousand lines of random Perl code to see if it   contains anything that could be
subverted.

plperl.on_perl_init code, set by the DBA, runs before the Safe
compartment is created. Without explicitextra  steps the Safe
compartment has no access to code loaded by plperl.on_perl_init.

The Safe compartment (plperl) could get access to loaded code in one of
these ways:
1. by using SQL to call a plperlu function that accesses the code.
2. by the DBA 'sharing' a specific subroutine with the compartment.
3. by the DBA loading a module into the compartment.

There's no formal interface for 2. and 3. at the moment, so the only
official option is 1. (The final patch in the series includes some
building blocks towards an interface for 2 & 3, but doesn't add one.)
   If you're willing to also confine the feature to plperlu, then maybe   the risk level could be decreased from insane
tomerely unreasonable.
 

I think you could reasonably describe plperl.on_perl_init as effectively
confined to plperlu (because plperl has no access to any new code).

   http://archives.postgresql.org/message-id/26766.1263149361@sss.pgh.pa.us   For the record, I think it's a bad idea
torun arbitrary   user-defined code in the postmaster, and I think it's a worse idea to   run arbitrary user-defined
codeat backend shutdown (the END-blocks bit).   I do not care in the least what applications you think this might
enable--- the negative consequences for overall system stability seem   to me to outweigh any possible arguments on
thatside.
 
-   What happens when the supplied code has errors,

For on_perl_init it throws an exception that propagates to the user
statement that triggered the initialization of perl. It also ensures
that perl is left in a non-initialized state, so any further uses
also fail.

For END blocks an error triggers an exception that's caught by perl.

(As noted above, there's no access to postgres from init or END code.)

-   takes an unreasonable amount of time to run,

Unreasonable is in the eye of the DBA, of course, and they
have the discretion to set on_perl_init to fit their needs.

For END blocks, I don't see how this issue is any different from
"users might do something dumb", like DO 'while(1){}' language plperl;
(or plpython , pltcl, or plpgsql for that matter).

-   does something unsafe,

Such as? The code can't do anything more unsafe than is already possible.

-   depends on the backend not being in an error state already,

The code has no access to postgress, whatever the state.

-   etc. etc?

I'd welcome more concrete examples of potential issues.


Tim.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hitoshi Harada
Дата:
Сообщение: Re: Improving the accuracy of estimate_num_groups()
Следующее
От: Ivan Sergio Borgonovo
Дата:
Сообщение: Re: C function accepting/returning cstring vs. text