Re: Question on COUNT performance

Поиск
Список
Период
Сортировка
От Lee Hachadoorian
Тема Re: Question on COUNT performance
Дата
Msg-id AANLkTikWZdVzUQlOLRnPvqI0aJ9pNdUtUfeSLiXGIjfg@mail.gmail.com
обсуждение исходный текст
Ответ на Question on COUNT performance  (Anders Østergaard Jensen <aj@itersys.dk>)
Ответы Re: Question on COUNT performance  (Lee Hachadoorian <lee.hachadoorian@gmail.com>)
Список pgsql-sql
The first statement of the function

>                                             :         select into user *
> from users where id = uid;

appears to be a useless drag, as I don't see the user record referred
to anywhere else in the function. There appears to be other
unnecessary statements. For example :

>                                             :         select into pcount_rel
> COUNT(*) as acl_count FROM project_plan_events ppe WHERE ppe.plan_event_id =
> plan_id;
>
>                                             :         if
> (pcount_rel.acl_count > 0) then
>
>
>
>                                             :                 SELECT INTO
> project * FROM projects WHERE id IN (SELECT project_id FROM
> project_plan_events ppe2 WHERE ppe2.plan_event_id = plan_id) OFFSET 0 LIMIT
> 1;
>
>                                             :                 return
> f_project_acl(uid, project.id);
>
>
>                                             :         end if;
>

This appears to check whether the plan_id exists in a link table, find
an associated project_id, and run some function on project_id.

This could instead be done as:

FOR project_record IN SELECT project_id FROM project_plan_events WHERE
plan_event_id = plan_id LIMIT 1 LOOP   return f_project_acl(uid, project_record.project_id)
END LOOP;

If plan_id doesn't exist in project_plan_events, the LOOP just gets skipped.

The same could be done for the next IF block in they query which
checks to see whether plan_id has a matching customer_id in a link
table.

Note that your LIMIT 1 (which I have retained) strongly implies a
1-to-1 relationship between project_id and plan_id. If not, this
function gets applied to an arbitrary project_id from among all
matching project_ids. (Same goes for customer_id.)

Assuming f_project_acl and f_customer_acl return TRUE if successful,
the whole thing (from the original SELECT COUNT(*) looks like it can
be summarized as:

Call a function with a plan_id
If a matching project_id exists   Do some function on the project_id   count +1
Else If a matching customer_id exists   Do some function on the customer_id   count +1
Else   count +1

Return count, which, since the function gets called once for each row
in plan_events, count should always equal the number of rows in plan
events.

I would be inclined to replace the whole thing with something like this:

SELECT newfunc(uid);

CREATE FUNCTION newfunc(uid int) RETURNS int AS $$
DECLAREplan_record record;i int := 0;
BEGINFOR plan_record IN SELECT DISTINCT plan_id, project_id, customer_id
FROM plan_events LEFT JOIN project_plan_events USING (plan_id) LEFT
JOIN customer_plan_events USING (plan_id)  LOOP    IF plan_record.project_id IS NOT NULL THEN        PERFORM
f_project_acl(uid,plan_record.project_id);    ELSEIF plan_record.customer_id IS NOT NULL THEN        PERFORM
f_customer_acl(uid,plan_record.customer_id);    END IF;    i := i + 1;    END LOOP ;RETURN i;
 
END;
$$ LANGUAGE plpgsql;

If I understand what's going on in your function, I *think* this would
reduce 9000-12,000 SELECT statements to 1 SELECT statement.

Obviously, not tested. Hope this is helpful.

--Lee

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center


В списке pgsql-sql по дате отправления:

Предыдущее
От: Thomas Kellerer
Дата:
Сообщение: Re: subtract two dates to get the number of days
Следующее
От: Lee Hachadoorian
Дата:
Сообщение: Re: Question on COUNT performance