Обсуждение: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

Поиск
Список
Период
Сортировка

What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Moshe Jacobson
Дата:
I am working on an audit logging trigger that gets called for every row inserted, updated or deleted on any table.
For this, I need to store a couple of temporary session variables such as the ID of the user performing the change, which can be set at the start of the session.
Until now I have been using a permanent table to store the session variables, but it has been difficult to wipe the data properly at the end of the session.
So I have decided to try to implement them using temporary tables. 

The problem now is that for every row now, I need to check for the existence of the temporary table before I access it, in order to avoid exceptions.
Either I can do all such accesses within a BEGIN...EXCEPTION block, or I can precede any such accesses with CREATE TEMP TABLE IF NOT EXISTS.
Is one of these much faster than the other? Will I be slowing things down inordinately by doing this for every row?

Thanks.

--
Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com

Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Merlin Moncure
Дата:
On Mon, Oct 1, 2012 at 8:36 AM, Moshe Jacobson <moshe@neadwerx.com> wrote:
> I am working on an audit logging trigger that gets called for every row
> inserted, updated or deleted on any table.
> For this, I need to store a couple of temporary session variables such as
> the ID of the user performing the change, which can be set at the start of
> the session.
> Until now I have been using a permanent table to store the session
> variables, but it has been difficult to wipe the data properly at the end of
> the session.
> So I have decided to try to implement them using temporary tables.
>
> The problem now is that for every row now, I need to check for the existence
> of the temporary table before I access it, in order to avoid exceptions.
> Either I can do all such accesses within a BEGIN...EXCEPTION block, or I can
> precede any such accesses with CREATE TEMP TABLE IF NOT EXISTS.
> Is one of these much faster than the other? Will I be slowing things down
> inordinately by doing this for every row?

Couple points:
*) Functions without exception blocks are faster than those with.
*) Therefore, CREATE/IF NOT EXISTS is probably faster (test to be sure)
*) Carefully consider if you you will ever in the future introduce
connection pooling.  If you do, relying on session scoped objects like
temp tables is probably not a good idea.
*) You can rig permanent tables around pg_backend_pid().  On session
login, clear session private records that have your pid (if any).
Transaction temporary data can be similarly rigged around
txid_current() with an even simpler maintenance process.

merlin


Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Moshe Jacobson
Дата:
Merlin,

On Mon, Oct 1, 2012 at 10:28 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

Couple points:
*) Functions without exception blocks are faster than those with.

Clearly.
 
*) Therefore, CREATE/IF NOT EXISTS is probably faster (test to be sure)

I don't think that can be assumed by your premise above. Essentially we are comparing the price of starting an exception block against checking the catalog for a table.
 
*) Carefully consider if you you will ever in the future introduce
connection pooling.  If you do, relying on session scoped objects like
temp tables is probably not a good idea.

We already use connection pooling with pgbouncer, but upon disconnect, it issues a DISCARD ALL statement, which should take care of this.
 
*) You can rig permanent tables around pg_backend_pid().  On session
login, clear session private records that have your pid (if any).
Transaction temporary data can be similarly rigged around
txid_current() with an even simpler maintenance process.

We currently do use permanent tables using pg_backend_pid(). It's because of the connection pooling specifically that we are having problems with stale data. I have been unable to find a way to automatically clear that data upon start or end of a session, or at least check if it's been set in this session or not.
 
-- 
Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com

Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Merlin Moncure
Дата:
On Mon, Oct 1, 2012 at 10:21 AM, Moshe Jacobson <moshe@neadwerx.com> wrote:
> Merlin,
>
> On Mon, Oct 1, 2012 at 10:28 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>>
>> Couple points:
>> *) Functions without exception blocks are faster than those with.
>
>
> Clearly.
>
>>
>> *) Therefore, CREATE/IF NOT EXISTS is probably faster (test to be sure)
>
>
> I don't think that can be assumed by your premise above. Essentially we are
> comparing the price of starting an exception block against checking the
> catalog for a table.

A vanilla create table has to scan the catalogs also.

>> *) Carefully consider if you you will ever in the future introduce
>> connection pooling.  If you do, relying on session scoped objects like
>> temp tables is probably not a good idea.
>
> We already use connection pooling with pgbouncer, but upon disconnect, it
> issues a DISCARD ALL statement, which should take care of this.

Especially if you're using pgbouncer transaction mode, using temporary
tables is probably not a good idea.  When you DISCARD them, it
invalidates all your function plans which is going to be painful if
you have a lot of pl/pgsql (test any non-trivial pl/pgsql routine and
you'll see it's much slower on the first invocation).  Also, if you
need to share data between transactions, it's not clear how you're
organizing such that different application sessions are going to tromp
over each other's data.

>> *) You can rig permanent tables around pg_backend_pid().  On session
>> login, clear session private records that have your pid (if any).
>> Transaction temporary data can be similarly rigged around
>> txid_current() with an even simpler maintenance process.
>
> We currently do use permanent tables using pg_backend_pid(). It's because of
> the connection pooling specifically that we are having problems with stale
> data. I have been unable to find a way to automatically clear that data upon
> start or end of a session, or at least check if it's been set in this
> session or not.

IMO the right way to do it is to generate a unique application token
(sequence is ok if you're not worried about it being guessed) when
your application session logs in.  That token should be passed into
*all* your session specific backend functions and can be used to
organize session specific temporary data in your permanent tables.

To deal with ungraceful application client exit, you can consider
implementing an on_proc_exit handler to close the session down so that
it can be appropriately cleaned up (there are severe limits to the SQL
you can execute in the handler but you can make dblink calls).  If
some of them still sneak through,  periodic sweep on stale pids
against pg_stat_activity should take care of them.

Note, if your users have some type of unique identifier (like a login
or an email) and if they are only allowed to have one active session
at a time, you can organize your session data around that instead of
generating a token.

merlin


Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Merlin Moncure
Дата:
On Mon, Oct 1, 2012 at 11:22 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> We currently do use permanent tables using pg_backend_pid(). It's because of
>> the connection pooling specifically that we are having problems with stale
>> data. I have been unable to find a way to automatically clear that data upon
>> start or end of a session, or at least check if it's been set in this
>> session or not.
>
> IMO the right way to do it is to generate a unique application token
> (sequence is ok if you're not worried about it being guessed) when
> your application session logs in.  That token should be passed into
> *all* your session specific backend functions and can be used to
> organize session specific temporary data in your permanent tables.
>
> To deal with ungraceful application client exit, you can consider
> implementing an on_proc_exit handler to close the session down so that

actually, you can't do that (on_proc_exit or scan for pids) if you're
using transaction mode connection pooling.   In our case, we modified
pgbouncer to pass async notifications and would have used that to
periodically scan connected clients if we didn't have the luxury of
one client/session only.

merlin


Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Moshe Jacobson
Дата:
On Mon, Oct 1, 2012 at 12:22 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> *) Functions without exception blocks are faster than those with.
>> *) Therefore, CREATE/IF NOT EXISTS is probably faster (test to be sure)
>
> I don't think that can be assumed by your premise above. Essentially we are
> comparing the price of starting an exception block against checking the
> catalog for a table.

A vanilla create table has to scan the catalogs also.

Yes but that is irrelevant to the discussion. I am comparing the speed of repeated table existence checks with the speed of repeated exception blocks that access said table. 

> We already use connection pooling with pgbouncer, but upon disconnect, it
> issues a DISCARD ALL statement  [...]

Especially if you're using pgbouncer transaction mode, using temporary
tables is probably not a good idea.  

We are using it in session mode, so none of that is relevant to my situation.

>> *) You can rig permanent tables around pg_backend_pid(). [...]
>
> We currently do use permanent tables using pg_backend_pid(). It's because of
> the connection pooling specifically that we are having problems with stale
> data. I have been unable to find a way to automatically clear that data upon
> start or end of a session, or at least check if it's been set in this
> session or not.

IMO the right way to do it is to generate a unique application token
[...] when your application session logs in.  That token should be passed into
*all* your session specific backend functions [...]

No, this will not work because the backend functions are trigger functions, so they cannot be passed this data.

Thanks.

--
Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com

Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Merlin Moncure
Дата:
On Mon, Oct 1, 2012 at 3:58 PM, Moshe Jacobson <moshe@neadwerx.com> wrote:
> On Mon, Oct 1, 2012 at 12:22 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>> >> *) Functions without exception blocks are faster than those with.
>> >> *) Therefore, CREATE/IF NOT EXISTS is probably faster (test to be sure)
>> >
>> > I don't think that can be assumed by your premise above. Essentially we
>> > are
>> > comparing the price of starting an exception block against checking the
>> > catalog for a table.
>>
>> A vanilla create table has to scan the catalogs also.
>
>
> Yes but that is irrelevant to the discussion. I am comparing the speed of
> repeated table existence checks with the speed of repeated exception blocks
> that access said table.

Both approaches have to do a catalog scan (even if you've established
an exception block the server still internally has to do a catalog
scan in order to raise an appropriate error).  The exception block has
the additional overhead of a subtransaction.

>> > We already use connection pooling with pgbouncer, but upon disconnect,
>> > it
>> > issues a DISCARD ALL statement  [...]
>>
>> Especially if you're using pgbouncer transaction mode, using temporary
>> tables is probably not a good idea.
>
> We are using it in session mode, so none of that is relevant to my
> situation.

OK. (but I don't like solutions that prevent stateless connection
pooling).  In just about all cases where scalability was a concern and
I used session scoped objects I ended up regretting it somewhere down
the line.  A more stateless approach has a lot of advantages besides
supporting more aggressive connection pooling -- for example you can
restart the server and all your connected clients wont lose local
temporary data.  Anyways, enough lecturing -- I'm sure you've got
things pretty well figured out :-).

>> >> *) You can rig permanent tables around pg_backend_pid(). [...]
>>
>> >
>> > We currently do use permanent tables using pg_backend_pid(). It's
>> > because of
>> > the connection pooling specifically that we are having problems with
>> > stale
>> > data. I have been unable to find a way to automatically clear that data
>> > upon
>> > start or end of a session, or at least check if it's been set in this
>> > session or not.
>>
>> IMO the right way to do it is to generate a unique application token
>> [...] when your application session logs in.  That token should be passed
>> into
>> *all* your session specific backend functions [...]
>
>
> No, this will not work because the backend functions are trigger functions,
> so they cannot be passed this data.

Sure they can...there are a number of ways to do it (although they all
require A. an extra round trip to establish the ID for later
statements in the transaction to read or B. a wrapping function that
handles the work on the server side).  That said, if you're super
duper sure you'll never use transaction mode pooling, Temp tables are
ok to use unless your sessions are quite short (in which case all the
plan invalidation flying around will start to hurt).

merlin


Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Moshe Jacobson
Дата:
On Tue, Oct 2, 2012 at 9:18 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Yes but that is irrelevant to the discussion. I am comparing the speed of
> repeated table existence checks with the speed of repeated exception blocks
> that access said table.

Both approaches have to do a catalog scan (even if you've established
an exception block the server still internally has to do a catalog
scan in order to raise an appropriate error).  The exception block has
the additional overhead of a subtransaction.

OK this makes sense. Thanks :-) I will go for the CREATE TABLE approach.

--
Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com

Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Ivan Voras
Дата:
On 01/10/2012 15:36, Moshe Jacobson wrote:
> I am working on an audit logging trigger that gets called for every row
> inserted, updated or deleted on any table.
> For this, I need to store a couple of temporary session variables such as
> the ID of the user performing the change, which can be set at the start of
> the session.
> Until now I have been using a permanent table to store the session
> variables, but it has been difficult to wipe the data properly at the end
> of the session.

Do you know about session variables? I did something similar to what you
are describing and it ended up much simpler than using tables, temporary
or not.

You need to configure them in postgresql.conf, e.g.:

    custom_variable_classes = 'myapp'

Then in the application code:

    SET myapp.uid = 42;

And in the pl/pgsql function:

CREATE OR REPLACE FUNCTION dblog() RETURNS TRIGGER AS $$
DECLARE
    uid     INTEGER;
BEGIN
    BEGIN
        SELECT current_setting('myapp.uid') INTO uid;
    EXCEPTION
        WHEN undefined_object THEN
            uid = null;
        WHEN data_exception THEN
            uid = null;
    END;
    ...
END;
$$ LANGUAGE plpgsql;

The major benefit here is that it doesn't touch the table engines,
temporary or not.


Вложения

Re: Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Moshe Jacobson
Дата:
On Thu, Oct 4, 2012 at 6:12 AM, Ivan Voras <ivoras@freebsd.org> wrote:
On 01/10/2012 15:36, Moshe Jacobson wrote:
> I am working on an audit logging trigger that gets called for every row
> inserted, updated or deleted on any table.
> For this, I need to store a couple of temporary session variables such as
> the ID of the user performing the change, which can be set at the start of
> the session.

Do you know about session variables? 
The major benefit here is that it doesn't touch the table engines,
temporary or not.

This sounds incredibly useful. Why have I not heard of this until today??
In your example you still had to use a BEGIN...EXCEPTION block. Is that faster than a create temp table?

--
Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com

Re: Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Ivan Voras
Дата:
On 5 October 2012 04:53, Moshe Jacobson <moshe@neadwerx.com> wrote:
> On Thu, Oct 4, 2012 at 6:12 AM, Ivan Voras <ivoras@freebsd.org> wrote:
>>
>> On 01/10/2012 15:36, Moshe Jacobson wrote:
>> > I am working on an audit logging trigger that gets called for every row
>> > inserted, updated or deleted on any table.
>> > For this, I need to store a couple of temporary session variables such
>> > as
>> > the ID of the user performing the change, which can be set at the start
>> > of
>> > the session.
>>
>> Do you know about session variables?
>> The major benefit here is that it doesn't touch the table engines,
>> temporary or not.
>
> This sounds incredibly useful. Why have I not heard of this until today??
> In your example you still had to use a BEGIN...EXCEPTION block. Is that
> faster than a create temp table?

I think I can make a fairly educated guess that catching exceptions
while dealing with session variables should be much, much faster than
creating any kind of a table :)

Besides, from what you said, you will ensure on the app level that the
session variable is set sometime close to when you open a connection
to Pg, so the "catch" part of the exception block will probably not
run at all.


Re: Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Merlin Moncure
Дата:
On Fri, Oct 5, 2012 at 3:09 AM, Ivan Voras <ivoras@freebsd.org> wrote:
> I think I can make a fairly educated guess that catching exceptions
> while dealing with session variables should be much, much faster than
> creating any kind of a table :)

That is true, but it's not clear how using session variables keeps you
from having to create the table.  If the table is already there, a
session variable guarding the table construction shouldn't be any
faster/better than a simple 'create if not exists'.  A catalog scan is
basically boils down to a query.  So I'm not sure that actually helps.

On the other hand, if the temp table can be completely ditched for a
session variable or two, then yeah, that would be much better since
you'd avoid the overhead of creating the table completely.

merlin


Re: Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Ivan Voras
Дата:
On 5 October 2012 15:55, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, Oct 5, 2012 at 3:09 AM, Ivan Voras <ivoras@freebsd.org> wrote:
>> I think I can make a fairly educated guess that catching exceptions
>> while dealing with session variables should be much, much faster than
>> creating any kind of a table :)

> On the other hand, if the temp table can be completely ditched for a
> session variable or two, then yeah, that would be much better since
> you'd avoid the overhead of creating the table completely.

Yes, this is what I was aiming at, based on the OP mentioning he only
has a limited amount of data to manage in this way.


Re: Re: What's faster? BEGIN ... EXCEPTION or CREATE TEMP TABLE IF NOT EXISTS?

От
Moshe Jacobson
Дата:
On Fri, Oct 5, 2012 at 11:13 AM, Ivan Voras <ivoras@freebsd.org> wrote:
> On the other hand, if the temp table can be completely ditched for a
> session variable or two, then yeah, that would be much better since
> you'd avoid the overhead of creating the table completely.

Yes, this is what I was aiming at, based on the OP mentioning he only
has a limited amount of data to manage in this way.

Yup, I'm going to see how this goes using session variables. I think it is exactly what I need. Thanks!

--
Moshe Jacobson
Nead Werx, Inc. | Senior Systems Engineer
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com