Обсуждение: Serious Performance Loss in 7.0.2??

Поиск
Список
Период
Сортировка

Serious Performance Loss in 7.0.2??

От
Tim Perdue
Дата:
I know you're all sick of hearing from me, but I'm passing this along
anyway. Looks like I need to go back down to 6.5.3 for some reason.

The following query is at the very heart of the site and it takes
upwards of 15-20 seconds to run now. It used to be instantaneous.

explain SELECT mailid, mail_date, mail_is_followup, mail_from,
mail_subject FROM mail_archive WHERE mail_list=35 AND mail_year=2000AND mail_month=1 ORDER BY mail_date DESC LIMIT 26
OFFSET0;
 

NOTICE:  QUERY PLAN:

Sort  (cost=138.41..138.41 rows=34 width=44) ->  Index Scan using idx_mail_archive_list_yr_mo on tbl_mail_archive 
(cost=0.00..137.55 rows=34 width=44)

EXPLAIN

Tim

-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723


Re: Serious Performance Loss in 7.0.2??

От
Andrew McMillan
Дата:
Tim Perdue wrote:
> 
> I know you're all sick of hearing from me.

I hope there was really a <g> at the end of that because it is not true
at all!  When problems are seen and solved they offer opportunities for
others in the future, and it is also how things get better :-)


> The following query is at the very heart of the site and it takes
> upwards of 15-20 seconds to run now. It used to be instantaneous.
> 
> explain SELECT mailid, mail_date, mail_is_followup, mail_from,
> mail_subject
>  FROM mail_archive WHERE mail_list=35 AND mail_year=2000
>  AND mail_month=1 ORDER BY mail_date DESC LIMIT 26 OFFSET 0;
> 
> NOTICE:  QUERY PLAN:
> 
> Sort  (cost=138.41..138.41 rows=34 width=44)
>   ->  Index Scan using idx_mail_archive_list_yr_mo on tbl_mail_archive
> (cost=0.00..137.55 rows=34 width=44)
> 
> EXPLAIN

OK, I'll give it a go :-)

First of all, I find it easiest to optimise these sort of queries in
psql because you can go back and edit things and 'play' quite a bit to
achieve the desired behaviour, then implement it back in the old PHP
code (or wherever :-).

The query optimiser changed quite a bit from 6.5.3 to 7.x and this seems
to be one area that now works harder to do what you say.  From the name
of your index it seems that you have an index on mail_list, mail_year,
mail_month, mail_date?

PostgreSQL seems to not get the index choice right when you have index
matches that are like =, =, =, DESC so you actually need to specify the
ORDER BY clause in full like:ORDER BY mail_list DESC, mail_year DESC, mail_month DESC, mail_date
DESC
and things will hopefully be all OK again.

Personally I consider this to be a 'bug', or at least a 'buglet', but I
guess I'd bow to Tom's opinion on that :-)

Hope this is some help,                Andrew.

-- 
_____________________________________________________________________           Andrew McMillan, e-mail:
Andrew@cat-it.co.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267


Re: Serious Performance Loss in 7.0.2??

От
Tim Perdue
Дата:
Andrew McMillan wrote:
> PostgreSQL seems to not get the index choice right when you have index
> matches that are like =, =, =, DESC so you actually need to specify the
> ORDER BY clause in full like:
>         ORDER BY mail_list DESC, mail_year DESC, mail_month DESC, mail_date
> DESC
> and things will hopefully be all OK again.

Wow - that definitely does *not* work. That took 1:30 to return the
result for the linux-kernel list.

My server has a load of 19.25 right now - at 6 AM.

Tim

-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723


Re: Serious Performance Loss in 7.0.2??

От
The Hermit Hacker
Дата:
On Wed, 12 Jul 2000, Tim Perdue wrote:

> Andrew McMillan wrote:
> > PostgreSQL seems to not get the index choice right when you have index
> > matches that are like =, =, =, DESC so you actually need to specify the
> > ORDER BY clause in full like:
> >         ORDER BY mail_list DESC, mail_year DESC, mail_month DESC, mail_date
> > DESC
> > and things will hopefully be all OK again.
> 
> Wow - that definitely does *not* work. That took 1:30 to return the
> result for the linux-kernel list.
> 
> My server has a load of 19.25 right now - at 6 AM.

Ouch!  because of that query, or is this standard?  what is the server,
anyway?




Re: Serious Performance Loss in 7.0.2??

От
Tom Lane
Дата:
Tim Perdue <tperdue@valinux.com> writes:
> explain SELECT mailid, mail_date, mail_is_followup, mail_from,
> mail_subject 
>  FROM mail_archive WHERE mail_list=35 AND mail_year=2000
>  AND mail_month=1 ORDER BY mail_date DESC LIMIT 26 OFFSET 0;

> NOTICE:  QUERY PLAN:

> Sort  (cost=138.41..138.41 rows=34 width=44)
>   -> Index Scan using idx_mail_archive_list_yr_mo on tbl_mail_archive 
> (cost=0.00..137.55 rows=34 width=44)

Hard to tell with this much info.  How many rows are actually retrieved
by the query (the planner is guessing 34, is that anywhere in the right
ballpark?  How big is the table, anyway?)

Also, what's the definition of the index idx_mail_archive_list_yr_mo?

It might help to see the EXPLAIN VERBOSE output also --- I'm wondering
if all the WHERE clauses are getting used as index keys or not...
        regards, tom lane


Re: Serious Performance Loss in 7.0.2??

От
Tim Perdue
Дата:
Tom Lane wrote:
> Hard to tell with this much info.  How many rows are actually retrieved
> by the query (the planner is guessing 34, is that anywhere in the right
> ballpark?  How big is the table, anyway?)
> 
> Also, what's the definition of the index idx_mail_archive_list_yr_mo?
> 
> It might help to see the EXPLAIN VERBOSE output also --- I'm wondering
> if all the WHERE clauses are getting used as index keys or not...

OK - there are 5851 rows in this query.

idx_mail_archive_list_yr_mo is an index on 
mail_list (int)
mail_year (int)
mail_month(int)

With 5850 rows to sort, I wouldn't expect it to be lightning fast, but
there is a very definite difference from 6.5.3 (or 6.4.x).

As requested by "The Hermit Hacker", I took out the ORDER BY and it was
instantaneous.

I'm sending the explain verbose separately to you as it's very big.

Tim

-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723


Re: Serious Performance Loss in 7.0.2??

От
Tom Lane
Дата:
Tim Perdue <tperdue@valinux.com> writes:
> Tom Lane wrote:
>> Hard to tell with this much info.  How many rows are actually retrieved
>> by the query (the planner is guessing 34, is that anywhere in the right
>> ballpark?  How big is the table, anyway?)
>> 
>> Also, what's the definition of the index idx_mail_archive_list_yr_mo?
>> 
>> It might help to see the EXPLAIN VERBOSE output also --- I'm wondering
>> if all the WHERE clauses are getting used as index keys or not...

> OK - there are 5851 rows in this query.

> idx_mail_archive_list_yr_mo is an index on 
> mail_list (int)
> mail_year (int)
> mail_month(int)

> With 5850 rows to sort, I wouldn't expect it to be lightning fast, but
> there is a very definite difference from 6.5.3 (or 6.4.x).

> As requested by "The Hermit Hacker", I took out the ORDER BY and it was
> instantaneous.

> I'm sending the explain verbose separately to you as it's very big.

The explain verbose looks just like it should: all three WHERE clauses
are being used as indexkeys.  So I'm mystified.  7.0 is not doing
anything obviously wrong here, and I do not understand what 6.5 might
have done differently.  Given the query as posed and the available
index, there is no other alternative but an indexscan followed by sort.

I like Andreas' suggestion of rearranging things so that the indexscan
will produce already-sorted output (since that will allow the LIMIT to
stop the indexscan without reading the whole month's traffic).  But that
doesn't answer the question of why 7.0 is so much slower given the same
query as 6.5.

How long does it take to doSELECT count(*) FROM tbl_mail_archive WHERE fld_mail_list=35 ANDfld_mail_year=2000 AND
fld_mail_month=1
?  That should tell us how much time is being spent in the indexscan.

Also, when you are doing the complete query with sort, does a
pg_sorttemp file appear in the database directory?  If so, how big does
it get?
        regards, tom lane


Re: Serious Performance Loss in 7.0.2??

От
Tim Perdue
Дата:
Tom Lane wrote:
> > As requested by "The Hermit Hacker", I took out the ORDER BY and it was
> > instantaneous.
> 
> How long does it take to do
>         SELECT count(*) FROM tbl_mail_archive WHERE fld_mail_list=35 AND
>         fld_mail_year=2000 AND fld_mail_month=1
> ?  That should tell us how much time is being spent in the indexscan.

It's pretty much instantaneous.

> Also, when you are doing the complete query with sort, does a
> pg_sorttemp file appear in the database directory?  If so, how big does
> it get?

Doesn't look like it. There is a (supposedly) dead pg_sorttemp file in
that directory that is 74MB. Probably unrelated.

My only remaining question is whether an index on 4,000,000 datestamps
is going to be fast or not. Or how Long will my nightly vacuum run?
Forever?

Tim

-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723


Re: Serious Performance Loss in 7.0.2??

От
Tom Lane
Дата:
Tim Perdue <tperdue@valinux.com> writes:
>> Also, when you are doing the complete query with sort, does a
>> pg_sorttemp file appear in the database directory?  If so, how big does
>> it get?

> Doesn't look like it. There is a (supposedly) dead pg_sorttemp file in
> that directory that is 74MB. Probably unrelated.

That's even odder.  If the sort is being done entirely in memory (which
I'd expected given the amount of data and your -S setting, but it's good
to confirm) then it's basically a qsort() call, and there shouldn't be
any measurable difference between 6.5 and 7.0.

All else being equal that is.  Since this is a sort on a char() field,
perhaps all else is not equal.  In particular I'm suddenly wondering
if your 7.0 installation was compiled with LOCALE or MULTIBYTE support
and your 6.5 not.  A few tens of thousands of strcoll() calls to do the
sort comparisons might account for the slowdown...

> My only remaining question is whether an index on 4,000,000 datestamps
> is going to be fast or not.

I'd expect no significant difference from your other indexes on that
table.  If anything, it'll be faster than your existing index because
of the lack of duplicate keys.
        regards, tom lane


Re: Serious Performance Loss in 7.0.2??

От
Tatsuo Ishii
Дата:
> All else being equal that is.  Since this is a sort on a char() field,
> perhaps all else is not equal.  In particular I'm suddenly wondering
> if your 7.0 installation was compiled with LOCALE or MULTIBYTE support
> and your 6.5 not.  A few tens of thousands of strcoll() calls to do the
> sort comparisons might account for the slowdown...

Just to clarify, MULTIBYTE never calls strcoll() as far as I know.
--
Tatsuo Ishii