Re: Query performance strangeness..

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: Query performance strangeness..
Дата
Msg-id 469F45C2.4010700@archonet.com
обсуждение исходный текст
Ответ на Re: Query performance strangeness..  (Steve Spicklemire <steve@spvi.com>)
Список pgsql-general
Steve Spicklemire wrote:
>
> Here is the function body...  the data is stored in and XML "pickle". I
> had hoped that it would only be called in building the index.
>
> Since the query uses it in the 'filter' step.. I'm not sure if it's
> using the index or not.
>
> Does marking the function immutable help the planner know whether it can
> use the index or not?

Well, since you've got an index using it, you *must* have already marked
it immutable. Presumably it's not genuinely immutable though.

> CCOC=# \df+ get_cem_for_directBurial
>                                                       List of functions
>  Schema |           Name           | Result data type | Argument data
> types |  Owner  | Language | Source code | Description
>
--------+--------------------------+------------------+---------------------+---------+----------+-------------+-------------

>
>  public | get_cem_for_directburial | text             | character
> varying   | webuser | plpgsql  |
>     DECLARE
>         personID ALIAS for $1;
>         qResult RECORD;
>
>     BEGIN
>         SELECT INTO qResult
>
get_xml_value('/params/param/value/struct/member/*[contains(text(),''cemid'')]/parent::*/value/string/text()','People',personID,'')

> as cem;
>         return qResult.cem;
>     END;

That might be stable, but I don't see how it could be immutable unless
the xml is in your "people" table.

> CCOC=# \d people
>                                                      Table "public.people"
>    Column   |            Type
> |                                     Modifiers
>
------------+-----------------------------+------------------------------------------------------------------------------------

>
>  personid   | character varying(40)       | not null default
> ('AUTO'::text || (nextval(('People_seq'::text)::regclass))::text)
>  modified   | timestamp without time zone | default now()
>  created    | timestamp without time zone | default now()
>  enabled    | boolean                     |
>  first      | character varying(40)       |
>  middle     | character varying(15)       |
>  last       | character varying(80)       |
>  gender     | character varying(2)        |
>  sbirthdate | character varying(30)       |
>  sdeathdate | character varying(30)       |
>  status     | character varying(30)       |
> Indexes:
>     "people_pkey" PRIMARY KEY, btree (personid)
>     "idx_people_cemid" btree (get_cem_for_directburial(personid))
>     "idx_people_lower_concat3_last" btree (lower_concat3("last",
> "first", (middle::text || personid::text)::character varying))
>     "idx_people_servicenum" btree
> (get_numeric_servicenumber_for_personid(personid))
>     "idx_people_status" btree (status)
>     "idx_people_take4_personid_" btree (take4(personid))

>>>                      Filter: (('STJ'::text =
>>> get_cem_for_directburial(personid)) AND ((status)::text <> 'R'::text)
>>> AND ((status)::text <> 'F'::text))

>>>                      Filter: (('HCC'::text =
>>> get_cem_for_directburial(personid)) AND ((status)::text <> 'R'::text)
>>> AND ((status)::text <> 'F'::text))

> If the query is actually calling get_cem_for_directburial during the
> query... then I'm sunk. I'll have to move that data to a regular indexed
> field. I can do that... it's just a lot of work. ;-(

Where you see it calling "Filter" it's evaluating the function I'm
afraid. It's possible for the executor to call the function more than
once too.

You've got so much data hidden behind functions, it's unlikely you'll
get the planner to make any informed decisions as to the quickest plan.

You're testing for inequality on status, so unless <> F / R is uncommon
a partial query on that probably won't help much either.

Just to be sure, cheat. Run get_cem_for_directburial() over the whole
people table and dump the results into a cache table. Then, try the same
query with a function that just does a table lookup - see if that makes
things faster.

Then I'd just write a couple of trigger functions to keep the cache
table up-to-date and join against it. That will let the planner see
common values and make better predictions for its plans.


If you (or anyone else) would like to explore functional indexes and
calling, I've attached a small demo script.

--
   Richard Huxton
   Archonet Ltd
BEGIN;

CREATE TABLE fit (
    a SERIAL,
    b int4,
    PRIMARY KEY (a)
);

CREATE SEQUENCE dummy_seq;

CREATE FUNCTION fit_fn(id int4) RETURNS int4 AS $$
BEGIN
    PERFORM nextval('dummy_seq');
    RETURN (id % 100);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE INDEX fit_fn_idx ON fit ( fit_fn(a) );

INSERT INTO fit SELECT generate_series(1,1000), round(random() * 100);

COMMIT;

VACUUM ANALYSE fit;

SELECT nextval('dummy_seq');
EXPLAIN ANALYSE SELECT a,b FROM fit WHERE fit_fn(a) = 7;
SELECT nextval('dummy_seq');
EXPLAIN ANALYSE SELECT a,b FROM fit WHERE fit_fn(a) = b;
SELECT nextval('dummy_seq');


В списке pgsql-general по дате отправления:

Предыдущее
От: Ottavio Campana
Дата:
Сообщение: memory leakage in libpg?
Следующее
От: "Paul Codler"
Дата:
Сообщение: IN clause performance