Re: Peculiar performance observation....
| From | Net Virtual Mailing Lists |
|---|---|
| Subject | Re: Peculiar performance observation.... |
| Date | |
| Msg-id | 20050315031431.1274@mail.net-virtual.com |
| In response to | Re: Peculiar performance observation.... (Scott Marlowe <smarlowe@g2switchworks.com>) |
| Responses | Re: Peculiar performance observation.... |
| List | pgsql-general |
>On Mon, 2005-03-14 at 06:11, Net Virtual Mailing Lists wrote:
>> Hello,
>>
>>
>> I am sorry to bring this up again.... Does anyone have any idea what
>> might be going on here?... I'm very worried about this situation.. ;-(
>
>It looks to me like either you're not analyzing often enough, or your
>statistics target is too low to get a good sample. Note your estimated
>versus real rows are off by a factor of 70 (28 est. versus 1943 actual
>rows). That's a pretty big difference, and where you should be looking.
>
>> > -> Seq Scan on jobdata (cost=0.00..7567.88 rows=28 width=52) (actual
>> >time=11.498..4800.907 rows=1943 loops=1)
>
>Yes, this is because PostgreSQL is using an index to approximate a
>sequential scan, which is not a good thing since PostgreSQL can't get
>all the information it needs from just an index, but has to visit the
>table to check visibility.
>
All of these were after a vacuum full analyze, which I actually do
nightly on the database.
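Even with nightly analyzes, Scott's point about the statistics target can be tested directly. A minimal sketch, assuming the filtered column is table1.category and using 1000 purely as an example target (whether a larger sample actually improves the rows=28 estimate for an ltree <@ condition is a separate question):

alter table table1 alter column category set statistics 1000;
analyze table1;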
I probably confused the issue with all of my posts; this is the query
which has me concerned. When running it on my system here, the disk
thrashes (and I mean *THRASHES*) for the entire 12-20 seconds it takes to
run... When running on our production servers, I can't hear the disk,
but I see an equally troubling performance loss when using the index.
database=> explain analyze select id from table1 where category <@ 'a.b';
                                QUERY PLAN
---------------------------------------------------------------------------
Index Scan using table1_category_full_gist_idx on jobdata
(cost=0.00..113.48 rows=28 width=4) (actual time=43.814..12201.528
rows=1943 loops=1)
Index Cond: (category <@ 'a.b'::ltree)
Filter: (category <@ 'a.b'::ltree)
Total runtime: 12222.258 ms
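The heavy disk activity during the index scan could also point at a bloated GiST index rather than a planner problem. A sketch worth trying, using the index name from the plan above (note that reindex takes an exclusive lock on the table while it runs):

reindex index table1_category_full_gist_idx;
vacuum analyze table1;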
I can do this to speed things up (this results in very little disk
activity, certainly not the thrashing the original query did):
create table yuck (id integer, category ltree[]);
insert into yuck select id, category from table1;
create index category_idx on yuck using gist(category);
vacuum analyze yuck;
jobs=> explain analyze select id from table1 where id in (select id from
yuck where category <@ 'a.b');
                                QUERY PLAN
---------------------------------------------------------------
Nested Loop (cost=108.64..114.28 rows=1 width=52) (actual
time=654.645..1245.212 rows=1943 loops=1)
-> HashAggregate (cost=108.64..108.64 rows=1 width=4) (actual
time=654.202..690.709 rows=1943 loops=1)
-> Index Scan using category_idx on yuck (cost=0.00..108.57
rows=28 width=4) (actual time=2.046..623.436 rows=1943 loops=1)
Index Cond: (category <@ 'a.b'::ltree)
Filter: (category <@ 'a.b'::ltree)
-> Index Scan using table1_pkey on jobdata (cost=0.00..5.64 rows=1
width=52) (actual time=0.219..0.235 rows=1 loops=1943)
Index Cond: (table1.id = "outer".id)
Total runtime: 1261.551 ms
(8 rows)
If I drop the index "table1_category_full_gist_idx", the query speeds up
dramatically (10-15 times faster on both development and production systems).
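Instead of actually dropping the index, the same comparison can be made per session with a standard planner toggle; a sketch:

set enable_indexscan = off;
explain analyze select id from table1 where category <@ 'a.b';
reset enable_indexscan;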
So my concern, in short: why is the query so much slower when it actually
uses the index, and why does it make mincemeat out of my hard drive?
- Greg