Обсуждение: slow information schema with thausand users, seq.scan pg_authid

Поиск
Список
Период
Сортировка

slow information schema with thausand users, seq.scan pg_authid

От
"Pavel Stehule"
Дата:
Hello,

I know so db 500 000 users isn't normal situation, but I need it. After 
user's generation all selects on system's catalog are slow. For example: 
list of sequences

SELECT n.nspname as "Schema", c.relname as "Name", CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i'
THEN
 
'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' END as "Type", r.rolname as "Owner"
FROM pg_catalog.pg_class c    LEFT JOIN pg_catalog.pg_roles r ON r.oid = c.relowner    LEFT JOIN
pg_catalog.pg_namespacen ON n.oid = c.relnamespace
 
WHERE c.relkind IN ('S','')     AND n.nspname NOT IN ('pg_catalog', 'pg_toast')     AND
pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
                                                              QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=47532.09..47544.59 rows=5000 width=193) (actual 
time=30333.490..30333.504 rows=5 loops=1)  Sort Key: n.nspname, c.relname  ->  Hash Left Join  (cost=1.06..46947.04
rows=5000width=193) (actual 
 
time=45.918..30333.390 rows=5 loops=1)        Hash Cond: ("outer".relnamespace = "inner".oid)        Filter:
("inner".nspname<> ALL ('{pg_catalog,pg_toast}'::name[]))        ->  Nested Loop Left Join  (cost=0.00..46795.97
rows=5000
 
width=133) (actual time=28.648..30316.020 rows=5 loops=1)              Join Filter: ("inner".oid = "outer".relowner)
         ->  Seq Scan on pg_class c  (cost=0.00..9.59 rows=2 width=73) 
 
(actual time=16.212..165.521 rows=5 loops=1)                    Filter: ((relkind = ANY ('{S,""}'::"char"[])) AND 
pg_table_is_visible(oid))              ->  Seq Scan on pg_authid  (cost=0.00..12143.06 rows=500006 
width=118) (actual time=12.702..4306.537 rows=500006 loops=5)        ->  Hash  (cost=1.05..1.05 rows=5 width=68)
(actual
 
time=0.070..0.070 rows=5 loops=1)              ->  Seq Scan on pg_namespace n  (cost=0.00..1.05 rows=5 
width=68) (actual time=0.013..0.035 rows=5 loops=1)
Total runtime: 30376.547 ms

there is any possibility creating index for pg_authid?

best regards
Pavel Stehule

_________________________________________________________________
Chcete sdilet sve obrazky a hudbu s prateli? http://messenger.msn.cz/



Re: slow information schema with thausand users, seq.scan pg_authid

От
Peter Eisentraut
Дата:
Pavel Stehule wrote:
> I know so db 500 000 users isn't normal situation, but I need it.
> After user's generation all selects on system's catalog are slow. For
> example: list of sequences
>
> SELECT n.nspname as "Schema",
>   c.relname as "Name",
>   CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i'
> THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' END as
> "Type", r.rolname as "Owner"
> FROM pg_catalog.pg_class c
>      LEFT JOIN pg_catalog.pg_roles r ON r.oid = c.relowner
>      LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
> WHERE c.relkind IN ('S','')
>       AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
>       AND pg_catalog.pg_table_is_visible(c.oid)
> ORDER BY 1,2;

I suggest that your problem is the join order (unless you have 500000 
tables as well).  Moreover, using left joins instead of inner joins 
seems to be quite useless unless you plan to have tables that are not 
owned by anyone and are not in a schema.

> there is any possibility creating index for pg_authid?

It already has indexes.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: slow information schema with thausand users, seq.scan pg_authid

От
Andrew - Supernews
Дата:
On 2006-02-06, Peter Eisentraut <peter_e@gmx.net> wrote:
> I suggest that your problem is the join order (unless you have 500000 
> tables as well).  Moreover, using left joins instead of inner joins 
> seems to be quite useless unless you plan to have tables that are not 
> owned by anyone and are not in a schema.

Perhaps you missed the fact that the query was not one that he wrote,
but is the query that psql uses for \ds ?

>> there is any possibility creating index for pg_authid?
>
> It already has indexes.

True, but they're not being used where you'd expect. This seems to be
something to do with the fact that it's not pg_authid which is being
accessed, but rather the view pg_roles.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: slow information schema with thausand users, seq.scan pg_authid

От
Peter Eisentraut
Дата:
Andrew - Supernews wrote:
> Perhaps you missed the fact that the query was not one that he wrote,
> but is the query that psql uses for \ds ?

I did miss that.  Perhaps with dependency tracking and all, we don't 
need the left joins anymore?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: slow information schema with thausand users, seq.scan pg_authid

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> Andrew - Supernews wrote:
>> Perhaps you missed the fact that the query was not one that he wrote,
>> but is the query that psql uses for \ds ?

> I did miss that.  Perhaps with dependency tracking and all, we don't 
> need the left joins anymore?

I don't see anything wrong with leaving the left joins as-is, on the
grounds that

1. the planner can simplify the left joins to inner joins, eg the
join to pg_namespace should be simplified on the strength of the
test on nspname.  (This seems to be broken in HEAD, but it does
work in 8.1 --- I think I broke it with the changes to treat IN
as a ScalarArrayOp.  Will fix.)

2. HEAD also knows how to change the order of the left joins at need.

The real question to me is why the planner doesn't want to use the
index on pg_authid.oid.  That's pretty curious ...
        regards, tom lane


Re: slow information schema with thausand users, seq.scan pg_authid

От
Tom Lane
Дата:
Andrew - Supernews <andrew+nonews@supernews.com> writes:
> On 2006-02-06, Peter Eisentraut <peter_e@gmx.net> wrote:
>> It already has indexes.

> True, but they're not being used where you'd expect. This seems to be
> something to do with the fact that it's not pg_authid which is being
> accessed, but rather the view pg_roles.

I looked into this and it seems the problem is that the view doesn't
get flattened into the main query because of the has_nullable_targetlist
limitation in prepjointree.c.  That's triggered because pg_roles has'********'::text AS rolpassword
which isn't nullable, meaning it would produce wrong behavior if
referenced above the outer join.

Ultimately, the reason this is a problem is that the planner deals only
in simple Vars while processing joins; it doesn't want to think about
expressions.  I'm starting to think that it may be time to fix this,
because I've run into several related restrictions lately, but it seems
like a nontrivial project.

In the meantime, reducing the LEFT JOIN to pg_roles to a JOIN as per
Peter's suggestion seems like the best short-term workaround.
        regards, tom lane


Re: slow information schema with thausand users, seq.scan pg_authid

От
"Pavel Stehule"
Дата:
>In the meantime, reducing the LEFT JOIN to pg_roles to a JOIN as per
>Peter's suggestion seems like the best short-term workaround.
>

It's solution

explain analyze SELECT n.nspname as "Schema", c.relname as "Name", CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN
'view'WHEN 'i' THEN 
 
'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN \'special' END as "Type", r.rolname as "Owner"
FROM pg_catalog.pg_class c    JOIN pg_catalog.pg_roles r ON r.oid = c.relowner    LEFT JOIN pg_catalog.pg_namespace n
ONn.oid = c.relnamespace
 
WHERE c.relkind IN ('S','')     AND n.nspname NOT IN ('pg_catalog', 'pg_toast')     AND
pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;

Sort  (cost=22.68..22.68 rows=2 width=193) (actual time=1.047..1.064 rows=5 
loops=1)  Sort Key: n.nspname, c.relname  ->  Nested Loop Left Join  (cost=1.05..22.67 rows=2 width=193) (actual 
time=0.480..0.983 rows=5 loops=1)        Join Filter: ("inner".oid = "outer".relnamespace)        Filter:
("inner".nspname<> ALL ('{pg_catalog,pg_toast}'::name[]))        ->  Nested Loop  (cost=0.00..21.34 rows=2 width=133)
(actual
 
time=0.386..0.642 rows=5 loops=1)              ->  Seq Scan on pg_class c  (cost=0.00..9.29 rows=2 width=73) 
(actual time=0.334..0.431 rows=5 loops=1)                    Filter: ((relkind = ANY ('{S,""}'::"char"[])) AND 
pg_table_is_visible(oid))              ->  Index Scan using pg_authid_oid_index on pg_authid  
(cost=0.00..6.01 rows=1 width=68) (actual time=0.02$                     
Index Cond: (pg_authid.oid = "outer".relowner)        ->  Materialize  (cost=1.05..1.10 rows=5 width=68) (actual 
time=0.007..0.032 rows=5 loops=5)              ->  Seq Scan on pg_namespace n  (cost=0.00..1.05 rows=5 
width=68) (actual time=0.008..0.028 rows=5 loops=1$ Total runtime: 1.294 ms

Regards
Pavel Stehule

_________________________________________________________________
Najdete si svou lasku a nove pratele na Match.com. http://www.msn.cz/