Обсуждение: Re: - Proposal for repreparing prepared statements

Поиск
Список
Период
Сортировка

Re: - Proposal for repreparing prepared statements

От
"Marshall, Steve"
Дата:
Tom,

Thanks for the update on the roadmap.  I have a couple of questions:

1. Is the invalidation of stored plans going to be part of 8.2?  If not,
any idea when it would be available?  I'd be willing to work on this, if
it would help.

2. Is there any plan for the other part of my proposal, i.e. the ability
to force one or all backends to reload their dynamically linked
libraries?  This is needed by backends that use loadable modules with
procedural languages (like pltcl_modules) in cases where a loadable
module gets updated.

Thanks,
Steve

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, September 13, 2006 2:08 PM
To: Marshall, Steve
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] - Proposal for repreparing prepared statements

Stephen Marshall <smarshall@wsi.com> writes:
> The following is a proposal for work I'd like to do to force
> long-running backend processes to reprepare their prepared statements.
> It would be used in cases where the user knows they have made a
> database change that will invalidate an existing prepared statement.

There should be no need for users to concern themselves with this.  The
direction we've been intending to go in is to automatically invalidate
stored plans when any related schema or statistics change occurs,
forcing a re-plan on any subsequent use.  See past discussions (IIRC,
Neil Conway actually did some work on this idea earlier this year, but
didn't get it done).

The appropriate cross-backend communication mechanism already exists:
it's the catcache/relcache invalidation code.  No need to fool with
finding a spare signal; and you can't do any meaningful work in a signal
handler anyway.
        regards, tom lane



Re: - Proposal for repreparing prepared statements

От
Tom Lane
Дата:
"Marshall, Steve" <smarshall@wsi.com> writes:
> 1. Is the invalidation of stored plans going to be part of 8.2?  If not,
> any idea when it would be available?  I'd be willing to work on this, if
> it would help.

No, it did not get done; feel free to work on it for 8.3.

> 2. Is there any plan for the other part of my proposal, i.e. the ability
> to force one or all backends to reload their dynamically linked
> libraries?

I would vote urgently against that, as not all loadable libraries are
necessarily designed to survive being reloaded.  Nor have I heard any
previous requests for it; how often do loadable libraries get updated in
production?  Stuff like pltcl.so would only change at a version update,
for which you have to restart the postmaster anyway to bring in the new
executable image for postgres itself.

The LOAD command is meant for development, not as something that would
be invoked in production ...
        regards, tom lane


Re: - Proposal for repreparing prepared statements

От
"Marshall, Steve"
Дата:

"Marshall, Steve" <smarshall@wsi.com> writes:
> 1. Is the invalidation of stored plans going to be part of 8.2?  If
> not, any idea when it would be available?  I'd be willing to work on
> this, if it would help.

No, it did not get done; feel free to work on it for 8.3.

[steve's reply]:
Could you clue me into the buzz words for searching the lists to find
out more details on the plan, what has been done, and what is left to
do?

> 2. Is there any plan for the other part of my proposal, i.e. the
> ability to force one or all backends to reload their dynamically
> linked libraries?

I would vote urgently against that, as not all loadable libraries are
necessarily designed to survive being reloaded.  Nor have I heard any
previous requests for it; how often do loadable libraries get updated in
production?  Stuff like pltcl.so would only change at a version update,
for which you have to restart the postmaster anyway to bring in the new
executable image for postgres itself.

The LOAD command is meant for development, not as something that would
be invoked in production

[steve's reply]:
I can understand your reservations about reloading libraries, and I'm
really not committed to doing that.  Reloading libraries is just a
mechanism to do what I really want to do.  I guess that did not come
across very clearly in my first email, so let me explain again.

I want to be able to tell a backend to reinitialize some of the cached
data it is holding in static variables.  Generally I want to do this
when the something "cached" in memory has gotten out-of-sync with what
is in the database.  The two examples that affect me are:

1. A database table has changed in its structure, and the prepared
statements related to it either no longer work, or do the wrong thing.

2. A TCL module in pltcl_modules is updated with a new version, but
running processes that have already executed a function from this module
do not pick up the change.

The former is handled by the prepared statement invalidation plan, but
the latter is not.  Effectively, I want something that does the
equivalent to what "CREATE OR REPLACE FUNCTION" does for SQL function,
except for these procedural language modules.  The current way the
pltcl_modules works is like have a "CREATE OR REPLACE FUNCTION"
capability that will only effect newly started connections, but has no
effect on existing connections.

The reason I advocated the library reloading is not because pltcl.so has
changed (which should be rare), but that by reloading it, the static
variables associated with the library would get reinitialized.
Essentially reloading the library pltcl.so has the side effect of
causing the TCL modules in pltcl_modules to get reread from disk.  I'd
love to have a lighter-weight way to do this, but unfortunately, I have
not come up with one.  If someone has a better way to do this, I'm all
ears.

Currently the only way I know of to do a complete reinitialization of
static variables is to force a disconnection on all the backend
processes, and let them all reconnect to new postgres backends that have
clean slates in their static variables.  There does not seem to be a way
to do this via SQL; as far as I know, this requires access to the
database server as the postgres user or superuser and sending a TERM
signal to each backend you want to kill.
-----
Basically, I've come up with two basic approaches for forcing a reload
of pltcl_modules:  have a system admin function to do some kind of
"resetting" in each backend that cleans out the static variables
associated with a procedural language, or kill most or all of the
backends and let the client applications handle the errors and reset
their connections.  Currently, the only "resetting" operation I know to
work is reloaded the pltcl.so library.

If anyone has any better ideas for how to crack this problem, I love to
hear them.

Here's some more details on the approaches I've thought of, for those
who are interested:

1. Add special logic to pltcl_loadmod (the utility that facilitates
inserting and updating TCL modules in the database). However, this would
still require some kind of communication to all the other backend
processes, which would probably require something like the next
suggestion....

2. A system admin command to force the reinitialization of the static
data associated with a procedural language.  This command would need to
communicate either with all the backends, or to a backend with a
particular pid (like pg_cancel_backend does); it is not sufficient for
it only to effect the backend associated with its connection.  This
command could be specific to the procedural language (e.g.
reload_pltcl_modules), or allow a particular dynamically linked library
to be reload (reload_library_if_already_loaded(<libname>).  There are
probably other ways to effect the reinitialization, but all other them
seem to require communication to a backend process other than the one we
are connected to, which made it seem to me like a candidate for using
signals.   [BTW, I never meant that this type of work should be done in
a signal handler, but rather by setting a variable in a signal handler
that indicates that the reinit needs to be done in the next pass through
one of the "main loop" functions.  As far as I can tell this is how
pg_reload_conf works]

3. Create a system admin command to terminate backends by process id.
This would allow one to do the equivalent of the kill command to
processes via SQL, and would be relative simple to implement.  While
this would be most similar to existing system admin function, which also
send signals, it would be more disruptive to operations, killing
processes that do not even use pltcl_modules, just so the ones that do
get the effects of the changes.  It would also be more invasive,
potentially generating failures of client's queries or commands, just
because they happened to occur when a low-level module got updated...

4. Use system level access to kill all the processes.  This would not
require any code changes, and would essentially have the same properties
as #3, except that it would be harder to administer from remote systems
(i.e. you would need system-level access to the system, rather than SQL
access).  You could probably also more easily limit the kills to just
the processes that have loaded the targeted library using lsof or a
similar utility to see which ones are referencing the library file, so
the solution could have less of the "meat-axe" effect of kiling
everything.  But it would be pretty ugly and OS-specific...
----


Re: - Proposal for repreparing prepared statements

От
Tom Lane
Дата:
"Marshall, Steve" <smarshall@wsi.com> writes:
> I want to be able to tell a backend to reinitialize some of the cached
> data it is holding in static variables.  Generally I want to do this
> when the something "cached" in memory has gotten out-of-sync with what
> is in the database.  The two examples that affect me are:

> 1. A database table has changed in its structure, and the prepared
> statements related to it either no longer work, or do the wrong thing.

This we need to fix.

> 2. A TCL module in pltcl_modules is updated with a new version, but
> running processes that have already executed a function from this module
> do not pick up the change.  

To be frank, this is not Postgres' problem, it's pltcl's.  I think the
pltcl_modules facility is poorly designed (precisely because it doesn't
fit into any sane way of handling the schema-update problem) and needs
to be thrown out and redone.  If the units you were reloading were
Postgres functions, or had some other way of being represented in the
system catalogs, then we'd have a reasonable way to attack the problem.
But forcing a reload of pltcl.so is nothing but a kluge --- it leaks
memory like there's no tomorrow, and it's only an accident that it fails
to crash.  I don't want to design a further kluge on top of it.
        regards, tom lane


Re: - Proposal for repreparing prepared statements

От
"Marshall, Steve"
Дата:

But forcing a reload of pltcl.so is nothing but a kluge --- it leaks
memory like there's no tomorrow, and it's only an accident that it fails
to crash.  I don't want to design a further kluge on top of it.

Are you saying that pltcl.so leaks memory in general, or that forcing a
reload of the pltcl.so library leaks memory?  If the former, I haven't
seen evidence of it in my use of pltcl, but I'd be interested to know
ofo problems other have had with it.

Yours,
Steve Marshall



Re: - Proposal for repreparing prepared statements

От
Tom Lane
Дата:
"Marshall, Steve" <smarshall@wsi.com> writes:
>> But forcing a reload of pltcl.so is nothing but a kluge --- it leaks
>> memory like there's no tomorrow, and it's only an accident that it fails
>> to crash.  I don't want to design a further kluge on top of it.

> Are you saying that pltcl.so leaks memory in general, or that forcing a
> reload of the pltcl.so library leaks memory?

The latter.  There's no mechanism to release the previous instance's data
structures (Tcl interpreters, function bodies, etc).
        regards, tom lane