An Analyze question

Поиск
Список
Период
Сортировка
От Nick Fankhauser
Тема An Analyze question
Дата
Msg-id NEBBLAAHGLEEPCGOBHDGIEDKENAA.nickf@ontko.com
обсуждение исходный текст
Ответы Re: An Analyze question
Список pgsql-admin
Hello-

Does analyze consider the entire key when creating statistics for a table,
or a substring composed of the leading chars?

Some background:

I've just modified *all* of the keys on my database by prefixing them with
the same source_id that is 5 chars long. This is in preparation for merging
together data from several different sources where I know that the keys are
unique to the source, but not necessarily between sources.

So on every primary & foreign key, I have executed this update:

update table set key = '18105'||key;

Now, a few queries that used to be swift are very slow, and on further
investigation, I found that the planner is making different decisions-
essentially it looks like the statistics now indicate that each key is much
less selective & hence a poor candidate for an index scan.

I did an analyze on the whole database, and then did an analyze specifically
on the tables involved, and checked against an original copy of the database
to make sure the indexes are identical in  both.

If the keys are taken in their entirety, nothing has changed- they are just
as selective as ever. However, if only the leading chars are considered, or
if the leading chars have a higher weighting, they would certainly appear
much less selective.

Any thoughts on what happened here?

Thanks-

-Nick

PS: The before & after explains are pasted in below:

Before:

monroe=# explain select * from actor_cases where actor_id = '18105A7313 53';
NOTICE:  QUERY PLAN:

Merge Join  (cost=27748.94..27807.92 rows=145 width=192)
  ->  Sort  (cost=27713.16..27713.16 rows=3410 width=144)
        ->  Nested Loop  (cost=0.00..27372.75 rows=3410 width=144)
              ->  Index Scan using actor_case_assignment_both on
actor_case_assignment  (cost=0.00..11766.67 rows=3410 width=24)
              ->  Index Scan using case_data_case_id on case_data
(cost=0.00..4.56 rows=1 width=120)
  ->  Sort  (cost=35.78..35.78 rows=522 width=48)
        ->  Seq Scan on local_case_type  (cost=0.00..12.22 rows=522
width=48)



After:

develop=#  explain select * from actor_cases where actor_id = '18105A7313
53';
NOTICE:  QUERY PLAN:

Hash Join  (cost=27801.99..53031.15 rows=306 width=192)
  ->  Hash Join  (cost=27788.47..51957.43 rows=11377 width=144)
        ->  Seq Scan on case_data  (cost=0.00..6932.35 rows=226535
width=120)
        ->  Hash  (cost=27693.03..27693.03 rows=11377 width=24)
              ->  Seq Scan on actor_case_assignment  (cost=0.00..27693.03
rows=11377 width=24)
  ->  Hash  (cost=12.22..12.22 rows=522 width=48)
        ->  Seq Scan on local_case_type  (cost=0.00..12.22 rows=522
width=48)




--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/


В списке pgsql-admin по дате отправления:

Предыдущее
От: Ferdinand Smit
Дата:
Сообщение: Re: Connection problem
Следующее
От: Tom Lane
Дата:
Сообщение: Re: An Analyze question