Обсуждение: very slow execution of stored procedures

Поиск
Список
Период
Сортировка

very slow execution of stored procedures

От
"Vilson farias"
Дата:
Greetings,

    I found something very weird related with stored procedures execution. I
have this stored procedure to finalize a phone call, writing tha time of
call finalization and some other values to a calls table, called
cham_chamada. Please check this out (very simple) :

------------------------------------------------------------------

CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, CHAR(23), CHAR(23),
CHAR(1),
                                INT4, INT4, INT4, CHAR(23), INT4, INT4,
CHAR(25),
                                INT4, INT4, INT4) RETURNS int4 AS
'
DECLARE
  pbxs           ALIAS FOR $1;
  pchave         ALIAS FOR $2;
  pidentificacao ALIAS FOR $3;
  pdtinicial     ALIAS FOR $4;
  pdtfinal       ALIAS FOR $5;
  pflgliber      ALIAS FOR $6;
  ptempototal    ALIAS FOR $7;
  pcodliber      ALIAS FOR $8;
  pddd           ALIAS FOR $9;
  pdtocup        ALIAS FOR $10;
  pindicadora    ALIAS FOR $11;
  pcategoria     ALIAS FOR $12;
  pidentidadea   ALIAS FOR $13;
  pfds           ALIAS FOR $14;
  presultcham    ALIAS FOR $15;
  pcifraorigem   ALIAS FOR $16;

BEGIN

  UPDATE cham_chamada
  SET dt_final        = TIMESTAMP(pdtfinal),
      flg_liberacao   = pflgliber,
      temp_total      = ptempototal,
      cod_liberjuntor = pcodliber,
      ddd             = pddd,
      indicadora      = pindicadora,
      cod_categoria   = pcategoria,
      identidadea     = pidentidadea,
      cod_fds         = pfds,
      cod_resultcham  = presultcham,
      cifra_origem    = pcifraorigem
  WHERE cod_bxs       = pbxs           AND
        chave         = pchave         AND
        identificacao = pidentificacao AND
        dt_inicial    = TIMESTAMP(pdtinicial);

  IF pdtocup <> '''' THEN
    UPDATE cham_servico
    SET
      dt_ocupacao = TIMESTAMP(pdtocup)
    WHERE
      cod_bxs       = pbxs           AND
      chave         = pchave         AND
      identificacao = pidentificacao AND
      dt_inicial    = TIMESTAMP(pdtinicial)     AND
      dt_finalizacao is null;
  END IF;

  RETURN 0;

END;
'
LANGUAGE 'plpgsql';

------------------------------------------------------------------

Once you know all about this stored procedure, lets see this call :
SELECT finaliza_chamv2(1,65535,49644,'2001-04-17 11:50:55','2001-04-17
12:17:40', '0', 0, 0, 48, '', 0, 10, '2817107', 0, 261, 65535);

If I change all variables to the parameters value inside the stored
procedure and then execute the frist script, then it is very fast, check out
:


bxs=#
bxs=# UPDATE cham_chamada
bxs-#   SET dt_final = TIMESTAMP('2001-04-17 12:12:10'),
bxs-#       flg_liberacao = '0',
bxs-#       temp_total = 0,
bxs-#       cod_liberjuntor = 0,
bxs-#       ddd = 48,
bxs-#       indicadora = 0,
bxs-#       cod_categoria = 10,
bxs-#       identidadea = '2817005',
bxs-#       cod_fds = 0,
bxs-#       cod_resultcham  = 6,
bxs-#       cifra_origem = 65535
bxs-#   WHERE cod_bxs = 1 AND
bxs-#         chave = 65535 AND
bxs-#         identificacao = 49644 AND
bxs-#         dt_inicial = TIMESTAMP('2001-04-17 12:12:00');
UPDATE 1

execution time : <1ms


now its time to do the same thing using the stored procedure :
bxs=#
bxs=# SELECT finaliza_chamv2(1,65535,49644,'2001-04-17 11:50:55','2001-04-17
12:17:40', '0', 0, 0, 48, '', 0, 10, '2817107', 0, 261, 65535);
 finaliza_chamv2
-----------------
               0
(1 row)

execution time : about 5s

Is it supose to execute with different speed? What can I do to fix it?

I'm using postgres RPM 7.0.3-2 in RedHat 6.2.

ps: There are some specific procedures I needed to execute before I got
pl/pgsql working :

CREATE FUNCTION plpgsql_call_handler ()
RETURNS opaque AS '/usr/lib/pgsql/plpgsql.so'
LANGUAGE 'C';

CREATE PROCEDURAL LANGUAGE 'plpgsql'
HANDLER plpgsql_call_handler
LANCOMPILER 'PL/PgSql internal';


Best regards from Brazil,

José Vilson de Mello de Farias
Dígitro Tecnologia Ltda - Brasil


Re: very slow execution of stored procedures

От
"Richard Huxton"
Дата:
From: "Vilson farias" <vilson.farias@digitro.com.br>

> Greetings,
>
>     I found something very weird related with stored procedures execution.
I
> have this stored procedure to finalize a phone call, writing tha time of
> call finalization and some other values to a calls table, called
> cham_chamada. Please check this out (very simple) :

Nobody else seems to have answered this yet, so I'll have a stab.

> ------------------------------------------------------------------
>
> CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, CHAR(23), CHAR(23),
> CHAR(1),
>                                 INT4, INT4, INT4, CHAR(23), INT4, INT4,
> CHAR(25),
>                                 INT4, INT4, INT4) RETURNS int4 AS

[snipped simple update function]

> If I change all variables to the parameters value inside the stored
> procedure and then execute the frist script, then it is very fast, check
out
>
> execution time : <1ms

> now its time to do the same thing using the stored procedure :
> execution time : about 5s
>
> Is it supose to execute with different speed? What can I do to fix it?
>
> I'm using postgres RPM 7.0.3-2 in RedHat 6.2.

Well - there are some differences - if I understand correctly, the parser is
smarter about things when constants are explicitly specified (like in the
quick example).

I'm assuming your table is large and what is happening is that the function
is not using indexes. The most likely reason I can see is the timestamp()
calls in the code.

If you do:

select proname,proiscachable from pg_proc where proname='timestamp';

You'll see that the conversion functions are marked not cachable, so that
would probably discourage the use of the index on the timestamp fields.

Use another variable to hold the converted timestamp value and see if that
helps. If you still don't get an improvement try passing in the values as
timestamps rather than text.

If that still doesn't help try:

SET ENABLE_SEQSCAN = OFF;

before calling the function and see what that does.

If you are still having problems, can you supply the output of EXPLAIN for
the fast version.

> ps: There are some specific procedures I needed to execute before I got
> pl/pgsql working :
>
> CREATE FUNCTION plpgsql_call_handler ()
> RETURNS opaque AS '/usr/lib/pgsql/plpgsql.so'
> LANGUAGE 'C';
>
> CREATE PROCEDURAL LANGUAGE 'plpgsql'
> HANDLER plpgsql_call_handler
> LANCOMPILER 'PL/PgSql internal';

A simpler method is to do:

createlang plpgsql mydatabase

from the command-line.

HTH

- Richard Huxton


Re: very slow execution of stored procedures

От
"Vilson farias"
Дата:
Richard,

  I've tried all your advices and still nothing.
  It's been very hard to me understand why this is happening. The only
solution is to modify the stored procedure.

----------------------------------------------------------------------------
----------------------------------------------------------

bxs=# SET ENABLE_SEQSCAN = OFF;
SET VARIABLE
bxs=# SELECT finaliza_chamv2(1,65535,49644,'2001-04-17 11:50:55','2001-04-17
12:17:40', '0', 0, 0, 48, '', 0, 10, '2817107', 0, 261, 65535);
 finaliza_chamv2
-----------------
               0
(1 row)

Execution time = about 4s.

----------------------------------------------------------------------------
----------------------------------------------------------

This time I'm trying to remove timestamp conversions from where clause:

DROP FUNCTION finaliza_chamv2(INT4, INT4, INT4, CHAR(23), CHAR(23), CHAR(1),
                                INT4, INT4, INT4, CHAR(23), INT4, INT4,
CHAR(25),
                                INT4, INT4, INT4);

CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, CHAR(23), CHAR(23),
CHAR(1),
                                INT4, INT4, INT4, CHAR(23), INT4, INT4,
CHAR(25),
                                INT4, INT4, INT4) RETURNS int4 AS
'
DECLARE
  pbxs           ALIAS FOR $1;
  pchave         ALIAS FOR $2;
  pidentificacao ALIAS FOR $3;
  pdtinicial     ALIAS FOR $4;
  pdtfinal       ALIAS FOR $5;
  pflgliber      ALIAS FOR $6;
  ptempototal    ALIAS FOR $7;
  pcodliber      ALIAS FOR $8;
  pddd           ALIAS FOR $9;
  pdtocup        ALIAS FOR $10;
  pindicadora    ALIAS FOR $11;
  pcategoria     ALIAS FOR $12;
  pidentidadea   ALIAS FOR $13;
  pfds           ALIAS FOR $14;
  presultcham    ALIAS FOR $15;
  pcifraorigem   ALIAS FOR $16;

BEGIN
  UPDATE cham_chamada
  SET dt_final        = TIMESTAMP(pdtfinal);
      flg_liberacao   = pflgliber,
      temp_total      = ptempototal,
      cod_liberjuntor = pcodliber,
      ddd             = pddd,
      indicadora      = pindicadora,
      cod_categoria   = pcategoria,
      identidadea     = pidentidadea,
      cod_fds         = pfds,
      cod_resultcham  = presultcham,
      cifra_origem    = pcifraorigem
  WHERE cod_bxs       = pbxs           AND
        chave         = pchave         AND
        identificacao = pidentificacao AND
        dt_inicial    = pdtinicial;

  IF pdtocup <> '''' THEN
    UPDATE cham_servico
    SET
      dt_ocupacao = TIMESTAMP(pdtocup)
    WHERE
      cod_bxs       = pbxs           AND
      chave         = pchave         AND
      identificacao = pidentificacao AND
      dt_inicial    = tempo          AND
      dt_finalizacao is null;
  END IF;

  RETURN 0;

END;
'

LANGUAGE 'plpgsql';


bxs=# SELECT finaliza_chamv2(1,65535,49644,'2001-04-17 11:50:55','2001-04-17
12:17:40', '0', 0, 0, 48, '', 0,
 10, '2817107', 0, 261, 65535);
Cancel request sent
ERROR:  Query was cancelled.

Execution time : ?? (I was forced to abort after more than 1 minute waiting
and a lot of cpu was sucked while executing)

Note that execution time go back to 4seconds if I put the original stored
procedure back.

----------------------------------------------------------------------------
----------------------------------------------------------

Here some info about my table and fast execution :

bxs=# EXPLAIN UPDATE cham_chamada
bxs-# SET dt_final = TIMESTAMP('2001-04-17 12:12:10'),
bxs-# flg_liberacao = '0',
bxs-# temp_total = 0,
bxs-# cod_liberjuntor = 0,
bxs-# ddd = 48,
bxs-# indicadora = 0,
bxs-# cod_categoria = 10,
bxs-# identidadea = '2817005',
bxs-# cod_fds = 0,
bxs-# cod_resultcham  = 6,
bxs-# cifra_origem = 65535
bxs-# WHERE cod_bxs = 1 AND
bxs-# chave = 65535 AND
bxs-# identificacao = 49644 AND
bxs-# dt_inicial = TIMESTAMP('2001-04-17 12:12:00');
NOTICE:  QUERY PLAN:

Index Scan using xpkcham_chamada on cham_chamada  (cost=0.00..4.23 rows=1
width=58)

EXPLAIN


bxs=# \d cham_chamada
           Table "cham_chamada"
    Attribute    |    Type     | Modifier
-----------------+-------------+----------
 cod_bxs         | integer     | not null
 chave           | integer     | not null
 identificacao   | integer     | not null
 dt_inicial      | timestamp   | not null
 indicadora      | integer     |
 cod_categoria   | integer     |
 identidadea     | varchar(20) |
 dt_final        | timestamp   |
 juntor          | integer     |
 indicadorb      | integer     |
 identidadeb     | varchar(20) |
 flg_chamada     | char(1)     |
 flg_liberacao   | char(1)     |
 temp_total      | integer     |
 ddd             | smallint    |
 cod_liberjuntor | integer     |
 cod_resultcham  | integer     |
 cod_fds         | integer     |
 cifra_origem    | integer     |
Indices: xie1cham_chamada,
         xie2cham_chamada,
         xpkcham_chamada


bxs=# SELECT COUNT(*) FROM cham_chamada;
 count
--------
 145978
(1 row)

----------------------------------------------------------------------------
----------------------------------------------------------
I found a solution that uses index scan. I redone some parameters, check out
:

CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, TIMESTAMP, TIMESTAMP,
CHAR(1),
                                INT4, INT4, INT4, CHAR(23), INT4, INT4,
CHAR(25),
                                INT4, INT4, INT4) RETURNS int4 AS
'
DECLARE
  pbxs           ALIAS FOR $1;
  pchave         ALIAS FOR $2;
  pidentificacao ALIAS FOR $3;
  pdtinicial     ALIAS FOR $4;
  pdtfinal       ALIAS FOR $5;
  pflgliber      ALIAS FOR $6;
  ptempototal    ALIAS FOR $7;
  pcodliber      ALIAS FOR $8;
  pddd           ALIAS FOR $9;
  pdtocup        ALIAS FOR $10;
  pindicadora    ALIAS FOR $11;
  pcategoria     ALIAS FOR $12;
  pidentidadea   ALIAS FOR $13;
  pfds           ALIAS FOR $14;
  presultcham    ALIAS FOR $15;
  pcifraorigem   ALIAS FOR $16;

BEGIN

  UPDATE cham_chamada
  SET dt_final        = pdtfinal,
      flg_liberacao   = pflgliber,
      temp_total      = ptempototal,
      cod_liberjuntor = pcodliber,
      ddd             = pddd,
      indicadora      = pindicadora,
      cod_categoria   = pcategoria,
      identidadea     = pidentidadea,
      cod_fds         = pfds,
      cod_resultcham  = presultcham,
      cifra_origem    = pcifraorigem
  WHERE cod_bxs       = pbxs           AND
        chave         = pchave         AND
        identificacao = pidentificacao AND
        dt_inicial    = pdtinicial;

  IF pdtocup <> '''' THEN
    UPDATE cham_servico
    SET
      dt_ocupacao = pdtocup
    WHERE
      cod_bxs       = pbxs           AND
      chave         = pchave         AND
      identificacao = pidentificacao AND
      dt_inicial    = pdtinicial     AND
      dt_finalizacao is null;
  END IF;

Now execution time is <1s. Ok, but I really would like to know what's
happening to the older version.

Best Regards,
José Vilson de Mello de Farias
Dígitro Tecnologia Ltda. - Brazil


----- Original Message -----
From: Richard Huxton <dev@archonet.com>
To: Vilson farias <vilson.farias@digitro.com.br>
Cc: <pgsql-general@postgresql.org>
Sent: Sexta-feira, 20 de Abril de 2001 06:26
Subject: Re: [GENERAL] very slow execution of stored procedures


: From: "Vilson farias" <vilson.farias@digitro.com.br>
:
: > Greetings,
: >
: >     I found something very weird related with stored procedures
execution.
: I
: > have this stored procedure to finalize a phone call, writing tha time of
: > call finalization and some other values to a calls table, called
: > cham_chamada. Please check this out (very simple) :
:
: Nobody else seems to have answered this yet, so I'll have a stab.
:
: > ------------------------------------------------------------------
: >
: > CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, CHAR(23), CHAR(23),
: > CHAR(1),
: >                                 INT4, INT4, INT4, CHAR(23), INT4, INT4,
: > CHAR(25),
: >                                 INT4, INT4, INT4) RETURNS int4 AS
:
: [snipped simple update function]
:
: > If I change all variables to the parameters value inside the stored
: > procedure and then execute the frist script, then it is very fast, check
: out
: >
: > execution time : <1ms
:
: > now its time to do the same thing using the stored procedure :
: > execution time : about 5s
: >
: > Is it supose to execute with different speed? What can I do to fix it?
: >
: > I'm using postgres RPM 7.0.3-2 in RedHat 6.2.
:
: Well - there are some differences - if I understand correctly, the parser
is
: smarter about things when constants are explicitly specified (like in the
: quick example).
:
: I'm assuming your table is large and what is happening is that the
function
: is not using indexes. The most likely reason I can see is the timestamp()
: calls in the code.
:
: If you do:
:
: select proname,proiscachable from pg_proc where proname='timestamp';
:
: You'll see that the conversion functions are marked not cachable, so that
: would probably discourage the use of the index on the timestamp fields.
:
: Use another variable to hold the converted timestamp value and see if that
: helps. If you still don't get an improvement try passing in the values as
: timestamps rather than text.
:
: If that still doesn't help try:
:
: SET ENABLE_SEQSCAN = OFF;
:
: before calling the function and see what that does.
:
: If you are still having problems, can you supply the output of EXPLAIN for
: the fast version.
:
: > ps: There are some specific procedures I needed to execute before I got
: > pl/pgsql working :
: >
: > CREATE FUNCTION plpgsql_call_handler ()
: > RETURNS opaque AS '/usr/lib/pgsql/plpgsql.so'
: > LANGUAGE 'C';
: >
: > CREATE PROCEDURAL LANGUAGE 'plpgsql'
: > HANDLER plpgsql_call_handler
: > LANCOMPILER 'PL/PgSql internal';
:
: A simpler method is to do:
:
: createlang plpgsql mydatabase
:
: from the command-line.
:
: HTH
:
: - Richard Huxton
:


Re: very slow execution of stored procedures

От
dev@archonet.com
Дата:
On 4/20/01, 2:31:29 PM, Vilson farias <vilson.farias@digitro.com.br> wrote
regarding Re: [GENERAL] very slow execution of stored procedures:


> Richard,

>   I've tried all your advices and still nothing.
>   It's been very hard to me understand why this is happening. The only
> solution is to modify the stored procedure.

Debugging functions is not easy unfortunately. AFAIK in 7.0 the only way
to see what's going on is to turn the logging level up and rummage
through the logs after running the function.

>
--------------------------------------------------------------------------
--
> ----------------------------------------------------------

> bxs=# SET ENABLE_SEQSCAN = OFF;
> SET VARIABLE
> bxs=# SELECT finaliza_chamv2(1,65535,49644,'2001-04-17
11:50:55','2001-04-17
> 12:17:40', '0', 0, 0, 48, '', 0, 10, '2817107', 0, 261, 65535);
>  finaliza_chamv2
> -----------------
>                0
> (1 row)

> Execution time = about 4s.

OK - so we haven't persuaded it to use the index.

>
--------------------------------------------------------------------------
--
> ----------------------------------------------------------

> This time I'm trying to remove timestamp conversions from where clause:

> CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, CHAR(23), CHAR(23),
> CHAR(1),
>                                 INT4, INT4, INT4, CHAR(23), INT4, INT4,
> CHAR(25),
>                                 INT4, INT4, INT4) RETURNS int4 AS
> '
[comparing timestamps to char(23) and letting PG figure it out]
> '
> LANGUAGE 'plpgsql';

> Execution time : ?? (I was forced to abort after more than 1 minute
waiting
> and a lot of cpu was sucked while executing)

Sounds like PG is having trouble figuring out how to convert from
char(23) to timestamp.

> Note that execution time go back to 4seconds if I put the original stored
> procedure back.

> Here some info about my table and fast execution :

[psql query...]
> bxs-# WHERE cod_bxs = 1 AND
> bxs-# chave = 65535 AND
> bxs-# identificacao = 49644 AND
> bxs-# dt_inicial = TIMESTAMP('2001-04-17 12:12:00');
> NOTICE:  QUERY PLAN:

> Index Scan using xpkcham_chamada on cham_chamada  (cost=0.00..4.23 rows=1
> width=58)

OK - so here we are using the query.

> EXPLAIN


> bxs=# \d cham_chamada
>            Table "cham_chamada"
>     Attribute    |    Type     | Modifier
> -----------------+-------------+----------
>  cod_bxs         | integer     | not null
>  chave           | integer     | not null
>  identificacao   | integer     | not null
>  dt_inicial      | timestamp   | not null
>  indicadora      | integer     |
>  cod_categoria   | integer     |
>  identidadea     | varchar(20) |
>  dt_final        | timestamp   |
>  juntor          | integer     |
>  indicadorb      | integer     |
>  identidadeb     | varchar(20) |
>  flg_chamada     | char(1)     |
>  flg_liberacao   | char(1)     |
>  temp_total      | integer     |
>  ddd             | smallint    |
>  cod_liberjuntor | integer     |
>  cod_resultcham  | integer     |
>  cod_fds         | integer     |
>  cifra_origem    | integer     |
> Indices: xie1cham_chamada,
>          xie2cham_chamada,
>          xpkcham_chamada

I take it xpkcham_chamada is a primary-key on the first 4 fields, which
would fit with the explain above.


> bxs=# SELECT COUNT(*) FROM cham_chamada;
>  count
> --------
>  145978
> (1 row)

I guess it could take 4s to scan this many records versus using the
index, so that would be the difference.

>
--------------------------------------------------------------------------
--
> ----------------------------------------------------------
> I found a solution that uses index scan. I redone some parameters, check
out
> :

> CREATE FUNCTION finaliza_chamv2(INT4, INT4, INT4, TIMESTAMP, TIMESTAMP,
> CHAR(1),
>                                 INT4, INT4, INT4, CHAR(23), INT4, INT4,
> CHAR(25),
>                                 INT4, INT4, INT4) RETURNS int4 AS
[same function but passing timestamps in rather than converting them to
text]

> Now execution time is <1s. Ok, but I really would like to know what's
> happening to the older version.

Looks like the parser is getting confused by the timestamp(x)
conversions.

Because timestamp() is marked non-cachable, Postgres doesn't know it is
constant over the duration of the query and so scans every row in the
table re-calculating the timestamp each time.

When you use it in the fast psql query you call timestamp('constant')
which Postgres probably is clever enough to figure out as a constant.

In fact, if you used WHERE ... dt_inicial = timestamp('2001-04-17
12:12:00') in the function Postgres would probably figure it out.

There are two ways you could help Postgres to do the right thing:

1. Use another variable for the conversions:

...
DECLARE
  ...
  ts_pdtinicial timestamp;
BEGIN
  ts_pdtinicial := timestamp(pdtinicial);
  ...
  WHERE
  ...
  dt_inicial = ts_pdtinicial;
...

Hopefully this will simplify things for Postgres and it will decide it
can use the index.

2. Mark the timestamp() conversions as cachable
You need to find the entry in pg_proc which takes text and returns a
timestamp and update the proiscachable flag.
You'll need to be user postgres, I don't know if it will work and it
might break other things, so be careful.

HTH

- Richard Huxton

Re: very slow execution of stored procedures

От
"Vilson farias"
Дата:
: Because timestamp() is marked non-cachable, Postgres doesn't know it is
: constant over the duration of the query and so scans every row in the
: table re-calculating the timestamp each time.
:
: When you use it in the fast psql query you call timestamp('constant')
: which Postgres probably is clever enough to figure out as a constant.
:
: In fact, if you used WHERE ... dt_inicial = timestamp('2001-04-17
: 12:12:00') in the function Postgres would probably figure it out.
:
: There are two ways you could help Postgres to do the right thing:
:
: 1. Use another variable for the conversions:
:
: ...
: DECLARE
:   ...
:   ts_pdtinicial timestamp;
: BEGIN
:   ts_pdtinicial := timestamp(pdtinicial);
:   ...
:   WHERE
:   ...
:   dt_inicial = ts_pdtinicial;
: ...
:
: Hopefully this will simplify things for Postgres and it will decide it
: can use the index.
:

Interesting, but I did it here and there was no performance increase.

: 2. Mark the timestamp() conversions as cachable
: You need to find the entry in pg_proc which takes text and returns a
: timestamp and update the proiscachable flag.
: You'll need to be user postgres, I don't know if it will work and it
: might break other things, so be careful.

No thanks, my boss will kill me if something goes wrong in the database :)

Thanks for all info Richard, I'm really greatfull about it. Now I can figure
out whats the problem and it's finally fixed.

Best regards

José Vilson de Mello de Farias
Dígitro Tecnologia Ltda.


Re: very slow execution of stored procedures

От
Tom Lane
Дата:
dev@archonet.com writes:
> Looks like the parser is getting confused by the timestamp(x)
> conversions.

I did some experimentation and found that if you write
    WHERE timestamp_var = char_var
what you actually end up with is
    WHERE text(timestamp_var) = text(char_var)
which of course is going to be horrendously slow: not only is it not
using the index, but it's doing a timestamp-to-text conversion for
every row.

It's not real clear to me why you get this rather than a complaint that
the '=' operator is ambiguous, but that's what you get.

> Because timestamp() is marked non-cachable, Postgres doesn't know it is
> constant over the duration of the query and so scans every row in the
> table re-calculating the timestamp each time.

Right.  If there's an invocation (whether implicit or explicit) of
text-to-timestamp conversion in the WHERE clause, Postgres won't
consider it indexable.  The best way around this is to force the value
being compared to to be timestamp *before* you get to the query.

> There are two ways you could help Postgres to do the right thing:

> 1. Use another variable for the conversions:

Instead of bothering with another variable, I'd suggest changing the
declared type of the function's parameter to be timestamp in the first
place.

> 2. Mark the timestamp() conversions as cachable

This would be a BAD idea.  Likely consequences include timestamp('now')
being evaluated at first use of a function, and not changing thereafter.
Probably not what you wanted...

            regards, tom lane

Re: very slow execution of stored procedures

От
Tom Lane
Дата:
"Richard Huxton" <dev@archonet.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> said:
> 2. Mark the timestamp() conversions as cachable
>>
>> This would be a BAD idea.  Likely consequences include timestamp('now')
>> being evaluated at first use of a function, and not changing thereafter.
>> Probably not what you wanted...

> I did say it might well break things - actually, does this mean you
> could mark fna() as cachable, use it in fnb(), mark fna() uncachable
> again and fnb() will never notice?

Until the next time fnb gets compiled, anyway --- but sooner or later we
are going to have logic to flush cached compiled functions as soon as
something they depend on changes.

What I have seen people do is make wrapper or alias functions that are
marked cachable just to fool the planner.  If fnx is marked cachable and
is applied to a constant, then the constant-folding logic will evaluate
fnx and replace the call by a constant --- no matter what might lurk
inside fnx's body.

> So - am I right in thinking the rules run:

> cachable_fn(constant)        = cached
> cachable_fn(variable)        = not cached
> cachable_fn(non_cachable_fn) = not cached

> with the "cachable" flag filtering up through the stack?

It's not so much "cachable" as "is it constant?".  If a function is
marked cachable and is applied to constant arguments, replace it by
a constant.  Apply this rule recursively throughout the expression.

There is a slightly different use of the cachability flag that is
actually what was creating the problem in this example.  A WHERE clause
like "indexed_var = expression" can be used as an indexscan bound if
the expression is constant, *or* if the expression contains function
parameters and constants but does not contain any noncachable function
calls.  The idea here is that the expression must be constant for the
duration of any one query --- if it can vary from one row to the next
then we can't use it as an indexscan bound.  Given a function parameter
p, "p" or "p+1" can be an indexscan bound --- but "timestamp(p)" fails
the test.

> Although I can see that something like timestamp('now') is always going to
> cause trouble because as text 'now' is constant and it's not until you
> evaluate it as a timestamp that it can vary.

Exactly --- the 'now' case is the reason why text-to-timestamp can't
be marked cachable.

There is a weaker notion of cachability that timestamp() *does* satisfy:
given the same input, its output won't change within a query (because
'now' actually means the start of the transaction, not realtime).
So in the case of "WHERE timestamp_col = timestamp(p)", it's really OK
to compute timestamp(p) once at the start of the query and use the
result for an indexscan bound.  We don't currently have any such
intermediate notion of cachability, unfortunately.  This omission is the
reason why you can rewrite a function as

        timestamp tp;

        tp := timestamp(p);
        SELECT ... WHERE timestamp_col = tp;

and get a speedup without breaking the logic: essentially, you've forced
the system to evaluate timestamp(p) just once per call, which is really
OK even though the system wouldn't risk it by itself.

            regards, tom lane