Discussion: PostgreSQL 7.1 forces sequence scan when there is no reason
Hello,
I have a problem with PostgreSQL 7.1: it insists on a sequential scan that is
more than 10 times slower than the corresponding index scan.
Here is the information about the query, the tables, and the data.
Any advice would be appreciated.
db=> \d listmembers
                             Table "listmembers"
  Column   |  Type   |                       Modifiers
-----------+---------+---------------------------------------------------------
 id        | integer | not null default nextval('"listmembers_id_seq"'::text)
 server_id | integer |
 name      | text    |
 email     | text    |
Indexes: listmembers_sid_key
Unique keys: listmembers_id_key,
             listmembers_sid_email_key

db=> \d listmembers_sid_key
Index "listmembers_sid_key"
  Column   |  Type
-----------+---------
 server_id | integer
btree

db=> \d listmembers_sid_email_key
Index "listmembers_sid_email_key"
  Column   |  Type
-----------+---------
 server_id | integer
 email     | text
unique btree
db=> explain analyze select count(*) from listmembers where server_id = 15182;
NOTICE:  QUERY PLAN:

Aggregate  (cost=31428.34..31428.34 rows=1 width=0) (actual time=38632.84..38632.84 rows=1 loops=1)
  ->  Seq Scan on listmembers  (cost=0.00..31379.16 rows=19671 width=0) (actual time=198.13..38494.36 rows=10011 loops=1)
Total runtime: 38633.01 msec

EXPLAIN
db=> set enable_seqscan to no;
SET VARIABLE
db=> explain analyze select count(*) from listmembers where server_id = 15182;
NOTICE:  QUERY PLAN:

Aggregate  (cost=60210.41..60210.41 rows=1 width=0) (actual time=2117.61..2117.61 rows=1 loops=1)
  ->  Index Scan using listmembers_sid_key on listmembers  (cost=0.00..60161.24 rows=19671 width=0) (actual time=81.41..2096.67 rows=10011 loops=1)
Total runtime: 2117.83 msec

EXPLAIN
db=> select count(*) from listmembers where server_id = 15182;
count
-------
10011
(1 row)
db=> select count(*) from listmembers;
count
---------
1372425
(1 row)
db=> select server_id, count(*) from listmembers group by server_id order by count(*) desc limit 30;
 server_id | count
-----------+-------
34062 | 43154
32715 | 32232
42495 | 31034
38013 | 15905
13746 | 15237
42434 | 12045
31696 | 10539
15413 | 10334
15182 | 10011
28178 | 10000
33330 | 10000
37071 | 9960
46532 | 9860
15392 | 9737
29643 | 9423
36503 | 9329
25378 | 9280
32095 | 9079
28621 | 9047
15400 | 9043
25870 | 8972
38377 | 8724
13046 | 8484
42416 | 8442
14869 | 7973
24131 | 7940
32625 | 7918
46480 | 7783
43172 | 7179
36849 | 6887
(30 rows)
webmailstation=# select * from pg_class where relname='listmembers';
-[ RECORD 1 ]--+------------
relname        | listmembers
reltype        | 6429403
relowner       | 102
relam          | 0
relfilenode    | 6429402
relpages       | 14224
reltuples      | 1.37241e+06
reltoastrelid  | 6429404
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relkind        | r
relnatts       | 4
relchecks      | 0
reltriggers    | 0
relukeys       | 0
relfkeys       | 0
relrefs        | 0
relhasoids     | t
relhaspkey     | f
relhasrules    | f
relhassubclass | f
relacl         |
(1 row)
db=# select * from pg_statistic where starelid=6429402 ;
-[ RECORD 1 ]-----------------------------------------------------------------
starelid    | 6429402
staattnum   | 1
stanullfrac | 0
stawidth    | 4
stadistinct | -1
stakind1    | 2
stakind2    | 3
stakind3    | 0
stakind4    | 0
staop1      | 97
staop2      | 97
staop3      | 0
staop4      | 0
stanumbers1 |
stanumbers2 | {0.805365}
stanumbers3 |
stanumbers4 |
stavalues1  | {590,520800,790589,1001533,1375234,1655946,1926816,2342644,2808910,3347435,3532408}
stavalues2  |
stavalues3  |
stavalues4  |
-[ RECORD 2 ]-----------------------------------------------------------------
starelid    | 6429402
staattnum   | 2
stanullfrac | 0
stawidth    | 4
stadistinct | 1150
stakind1    | 1
stakind2    | 2
stakind3    | 3
stakind4    | 0
staop1      | 96
staop2      | 97
staop3      | 97
staop4      | 0
stanumbers1 | {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.007}
stanumbers2 |
stanumbers3 | {0.428932}
stanumbers4 |
stavalues1  | {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
stavalues2  | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
stavalues3  |
stavalues4  |
-[ RECORD 3 ]-----------------------------------------------------------------
starelid    | 6429402
staattnum   | 3
stanullfrac | 0.0376667
stawidth    | 10
stadistinct | 2581
stakind1    | 1
stakind2    | 2
stakind3    | 3
stakind4    | 0
staop1      | 98
staop2      | 664
staop3      | 664
staop4      | 0
stanumbers1 | {0.466333,0.00333333,0.003,0.002,0.002,0.00166667,0.00133333,0.00133333,0.001,0.001}
stanumbers2 |
stanumbers3 | {0.227739}
stanumbers4 |
stavalues1  | {"",webwizards,"The Ad Builder Safe List","Kane at InternetSeer","Network Commerce",Michael,James,John,Admin,"D.Woodward "}
stavalues2  | {" Success Center","Brent Sommers",Debra,"Great Vacations","Johnny Blaze",Mariani,"Peter Maglione","Silhouettes Catalog",Wally,"johan kotze",жЛжН}
stavalues3  |
stavalues4  |
-[ RECORD 4 ]-----------------------------------------------------------------
starelid    | 6429402
staattnum   | 4
stanullfrac | 0
stawidth    | 25
stadistinct | -0.118184
stakind1    | 1
stakind2    | 2
stakind3    | 3
stakind4    | 0
staop1      | 98
staop2      | 664
staop3      | 664
staop4      | 0
stanumbers1 | {0.00133333,0.001,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}
stanumbers2 |
stanumbers3 | {-0.0167706}
stanumbers4 |
stavalues1  | {servicebox@spedia.net,dougsreplyto3@excite.com,inquire@careerexpansion.com,234freeb@webwizards-add-url.com,cashdueu@hotmail.com,cashonline1@excite.com,cwmailer@yahoo.com,galaxy@mail2.galaxy.com,gmichel@post.com,half.com_by_ebay@half.com}
stavalues2  | {05078475@email.com,bethebest@zwallet.com,cynric7@yahoo.com,ezine@yourhomejob.com,ilkst@beeline.samara.ru,kirk.stensrud@lpl.com,mjm@netset.com,ping13013@yahoo.fr,sandrac@menta.net,tgaeke@worldsubmitter.com,zzzmuffin@aol.com}
stavalues3  |
stavalues4  |
(4 rows)
-------------------------------------------------------
Denis Perchine <dyp@perchine.com> writes:
> db=> explain analyze select count(*) from listmembers where server_id = 15182;
> NOTICE:  QUERY PLAN:
> Aggregate  (cost=31428.34..31428.34 rows=1 width=0) (actual time=38632.84..38632.84 rows=1 loops=1)
>   ->  Seq Scan on listmembers  (cost=0.00..31379.16 rows=19671 width=0) (actual time=198.13..38494.36 rows=10011 loops=1)
> Total runtime: 38633.01 msec
> EXPLAIN
> db=> set enable_seqscan to no;
> SET VARIABLE
> db=> explain analyze select count(*) from listmembers where server_id = 15182;
> NOTICE:  QUERY PLAN:
> Aggregate  (cost=60210.41..60210.41 rows=1 width=0) (actual time=2117.61..2117.61 rows=1 loops=1)
>   ->  Index Scan using listmembers_sid_key on listmembers  (cost=0.00..60161.24 rows=19671 width=0) (actual time=81.41..2096.67 rows=10011 loops=1)
> Total runtime: 2117.83 msec
> EXPLAIN
Hm. Is it possible that the rows with server_id = 15182 are clustered
together? Given that you are fetching 10011 rows from a 14224-page
table, it seems unlikely that an indexscan could be such a big win
unless there was a very strong clustering effect.
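
[A rough back-of-the-envelope check of that reasoning, using the numbers from
the plans above; the factor of 4 is PostgreSQL's default random_page_cost and
is an assumption, not something stated in the thread. Without clustering, the
10011 matching rows would land on close to 10011 distinct pages, each needing
a random read:]

db=> select 10011 * 4 as approx_random_page_cost, 14224 * 1 as approx_seq_page_cost;
 approx_random_page_cost | approx_seq_page_cost
-------------------------+----------------------
                   40044 |                14224
(1 row)
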
> db=# select * from pg_statistic where starelid=6429402 ;
This is pretty unhelpful (not to mention unreadable) since we have no
idea which row is which. Could we see the pg_stats view, instead?
regards, tom lane
On Monday 20 May 2002 21:48, Tom Lane wrote:
> Denis Perchine <dyp@perchine.com> writes:
> > db=> explain analyze select count(*) from listmembers where server_id = 15182;
> > NOTICE:  QUERY PLAN:
> > Aggregate  (cost=31428.34..31428.34 rows=1 width=0) (actual time=38632.84..38632.84 rows=1 loops=1)
> >   ->  Seq Scan on listmembers  (cost=0.00..31379.16 rows=19671 width=0) (actual time=198.13..38494.36 rows=10011 loops=1)
> > Total runtime: 38633.01 msec
> > EXPLAIN
> > db=> set enable_seqscan to no;
> > SET VARIABLE
> > db=> explain analyze select count(*) from listmembers where server_id = 15182;
> > NOTICE:  QUERY PLAN:
> > Aggregate  (cost=60210.41..60210.41 rows=1 width=0) (actual time=2117.61..2117.61 rows=1 loops=1)
> >   ->  Index Scan using listmembers_sid_key on listmembers  (cost=0.00..60161.24 rows=19671 width=0) (actual time=81.41..2096.67 rows=10011 loops=1)
> > Total runtime: 2117.83 msec
> > EXPLAIN
>
> Hm. Is it possible that the rows with server_id = 15182 are clustered
> together? Given that you are fetching 10011 rows from a 14224-page
> table, it seems unlikely that an indexscan could be such a big win
> unless there was a very strong clustering effect.
Possible, but 10,000 records are less than 1% of all records.
How can I figure out whether they are clustered?
> > db=# select * from pg_statistic where starelid=6429402 ;
>
> This is pretty unhelpful (not to mention unreadable) since we have no
> idea which row is which. Could we see the pg_stats view, instead?
db=# select * from pg_stats where tablename='listmembers';
-[ RECORD 1 ]-----+-------------------------------------------------------------
tablename         | listmembers
attname           | id
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {590,520800,790589,1001533,1375234,1655946,1926816,2342644,2808910,3347435,3532408}
correlation       | 0.805365
-[ RECORD 2 ]-----+-------------------------------------------------------------
tablename         | listmembers
attname           | server_id
null_frac         | 0
avg_width         | 4
n_distinct        | 1150
most_common_vals  | {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
most_common_freqs | {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.007}
histogram_bounds  | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
correlation       | 0.428932
-[ RECORD 3 ]-----+-------------------------------------------------------------
tablename         | listmembers
attname           | name
null_frac         | 0.0376667
avg_width         | 10
n_distinct        | 2581
most_common_vals  | {"",webwizards,"The Ad Builder Safe List","Kane at InternetSeer","Network Commerce",Michael,James,John,Admin,"D.Woodward "}
most_common_freqs | {0.466333,0.00333333,0.003,0.002,0.002,0.00166667,0.00133333,0.00133333,0.001,0.001}
histogram_bounds  | {" Success Center","Brent Sommers",Debra,"Great Vacations","Johnny Blaze",Mariani,"Peter Maglione","Silhouettes Catalog",Wally,"johan kotze",жЛжН}
correlation       | 0.227739
-[ RECORD 4 ]-----+-------------------------------------------------------------
tablename         | listmembers
attname           | email
null_frac         | 0
avg_width         | 25
n_distinct        | -0.118184
most_common_vals  | {servicebox@spedia.net,dougsreplyto3@excite.com,inquire@careerexpansion.com,234freeb@webwizards-add-url.com,cashdueu@hotmail.com,cashonline1@excite.com,cwmailer@yahoo.com,galaxy@mail2.galaxy.com,gmichel@post.com,half.com_by_ebay@half.com}
most_common_freqs | {0.00133333,0.001,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}
histogram_bounds  | {05078475@email.com,bethebest@zwallet.com,cynric7@yahoo.com,ezine@yourhomejob.com,ilkst@beeline.samara.ru,kirk.stensrud@lpl.com,mjm@netset.com,ping13013@yahoo.fr,sandrac@menta.net,tgaeke@worldsubmitter.com,zzzmuffin@aol.com}
correlation       | -0.0167706
(4 rows)
--
Denis
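
[A side note the thread itself does not spell out: the 19671-row estimate in
both plans above falls straight out of these statistics, since 15182 appears in
most_common_vals with frequency 0.0143333 and pg_class.reltuples is about
1372425:]

db=# select 0.0143333 * 1372425 as estimated_matching_rows;
 estimated_matching_rows
-------------------------
           19671.3792525
(1 row)
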
Denis Perchine <dyp@perchine.com> writes:
> On Monday 20 May 2002 21:48, Tom Lane wrote:
>> Hm. Is it possible that the rows with server_id = 15182 are clustered
>> together? Given that you are fetching 10011 rows from a 14224-page
>> table, it seems unlikely that an indexscan could be such a big win
>> unless there was a very strong clustering effect.
> Possible, but 10,000 records are less than 1% of all records.
> How can I figure out whether they are clustered?
Look at the ctid column for those records. The range of block numbers
in the ctids would tell the tale. I don't think Postgres itself
provides any operations on type TID, but you could dump the info into
a file and then analyze it.
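
[A minimal sketch of that dump, assuming psql's \o output redirection and a
made-up file name; 7.x has no SQL-level functions to pick the block number out
of a TID, so the (block,offset) pairs have to be inspected or post-processed
outside the database. A narrow range of block numbers would confirm the
clustering:]

db=> \o /tmp/ctids_15182.txt
db=> select ctid from listmembers where server_id = 15182;
db=> \o
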
> tablename         | listmembers
> attname           | server_id
> null_frac         | 0
> avg_width         | 4
> n_distinct        | 1150
> most_common_vals  | {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
> most_common_freqs | {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.007}
> histogram_bounds  | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
> correlation       | 0.428932
Hmm. Correlation 0.43 is high enough to suggest that there's some
clustering effect. If you look in the archives there's been prior
discussion about whether to make the optimizer weight the correlation
factor more strongly.
regards, tom lane
On Monday 20 May 2002 22:54, Tom Lane wrote:
> > tablename         | listmembers
> > attname           | server_id
> > null_frac         | 0
> > avg_width         | 4
> > n_distinct        | 1150
> > most_common_vals  | {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
> > most_common_freqs | {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.007}
> > histogram_bounds  | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
> > correlation       | 0.428932
>
> Hmm. Correlation 0.43 is high enough to suggest that there's some
> clustering effect. If you look in the archives there's been prior
> discussion about whether to make the optimizer weight the correlation
> factor more strongly.
I have increased the sequential scan cost weighting by a factor of five, and it
uses the index scan now, but this is not a real solution. I will have a look in
the archives.
--
Denis
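
[For reference, the kind of cost-parameter adjustment being described would look
roughly like this in 7.1/7.2; the thread does not say which setting was actually
changed, and random_page_cost (default 4) is only one plausible knob. Lowering
it makes scattered index fetches look cheaper, shifting borderline plans like
this one toward the index scan:]

db=> set random_page_cost = 2;
SET VARIABLE
db=> explain analyze select count(*) from listmembers where server_id = 15182;
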
Hello again,

In PG 7.1.3, ISTR I used to be able to submit any length I wished for a
VARCHAR (and maybe a CHAR). Now, I'm at 7.2.1 and get these errors:

ERROR: value too long for type character varying(255)

as JDBC Exceptions. Do you know how I can tell it to "silently truncate" the
data as I believe it used to?

Many thanks,

Doug

PS: This group is great, and some of you (you know who you are) are amazing.
On Tue, 21 May 2002 12:17:38 -0400
"Doug Fields" <dfields-pg-general@pexicom.com> wrote:
> In PG 7.1.3, ISTR I used to be able to submit any length I wished for a
> VARCHAR (and maybe a CHAR).
>
> Now, I'm at 7.2.1 and get these errors:
>
> ERROR: value too long for type character varying(255)
>
> as JDBC Exceptions.
>
> Do you know how I can tell it to "silently truncate" the data as I believe
> it used to?

Is there a reason you can't just use TEXT? It will be stored the same
internally...

But the most obvious way I can think of to do what you're asking is to define
a rule to replace the value inserted with a call to substring().

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
> > Do you know how I can tell it to "silently truncate" the data as I believe
> > it used to?
>
> Is there a reason you can't just use TEXT? It will be stored the same
> internally...

The only reason being that I don't want to waste space storing arbitrarily long
pieces of information. I was not aware that TEXT is just as (in)efficient as
VARCHAR, though. If that is truly the case, I will consider using TEXT in some
places, but only when I am very careful in the INSERT and UPDATE clauses so as
not to store too long things.

> But the most obvious way I can think of to do what you're asking is to define
> a rule to replace the value inserted with a call to substring().

So, basically:

1) There's absolutely no way to get back the 7.1.3 "silent truncation" behavior
   (which is, IMO, stupid; I like the non-standard truncation behavior)
2) You have to truncate on INSERT/UPDATE and continue using VARCHAR(x)
3) Instead of 2, you can use TEXT

Several others have asked this question since 7.2 appeared but none received a
response to #1.

Thanks,

Doug
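
[A sketch of option 2 above, with a hypothetical table and column, keeping
varchar(255) and truncating explicitly on the way in:]

insert into customers (note)
    values (substring('...a possibly very long value...' from 1 for 255));

update customers
    set note = substring(note || ' appended text' from 1 for 255)
    where id = 42;
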
On Tue, 21 May 2002, Doug Fields wrote:
> > > Do you know how I can tell it to "silently truncate" the data as I believe
> > > it used to?
> >
> > Is there a reason you can't just use TEXT? It will be stored the same
> > internally...
>
> The only reason being that I don't want to waste space storing arbitrarily long
> pieces of information. I was not aware that TEXT is just as (in)efficient as
> VARCHAR, though. If that is truly the case, I will consider using TEXT in some
> places, but only when I am very careful in the INSERT and UPDATE clauses so as
> not to store too long things.
>
> > But the most obvious way I can think of to do what you're asking is to define
> > a rule to replace the value inserted with a call to substring().
>
> So, basically:
>
> 1) There's absolutely no way to get back the 7.1.3 "silent truncation" behavior
>    (which is, IMO, stupid; I like the non-standard truncation behavior)

You could do this via a text column and a before trigger, which might work
better than a rule, but I haven't tried either, so...
Neil Conway wrote:
> On Tue, 21 May 2002 12:17:38 -0400
> "Doug Fields" <dfields-pg-general@pexicom.com> wrote:
> > In PG 7.1.3, ISTR I used to be able to submit any length I wished for a
> > VARCHAR (and maybe a CHAR).
> >
> > Now, I'm at 7.2.1 and get these errors:
> >
> > ERROR: value too long for type character varying(255)
> >
> > as JDBC Exceptions.
> >
> > Do you know how I can tell it to "silently truncate" the data as I believe
> > it used to?
>
> Is there a reason you can't just use TEXT? It will be stored the same
> internally...
>
> But the most obvious way I can think of to do what you're asking is
> to define a rule to replace the value inserted with a call to
> substring().
And exactly that will fail because the rewritten query will
be of the same operation type on the same table, triggering
the same rule again ...
The way I see it is to use TEXT or VARCHAR without a size limit,
and to define a custom BEFORE trigger that truncates the NEW
value.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
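
[A minimal sketch of the BEFORE-trigger approach Jan and Stephan describe, with
a hypothetical table and column and an arbitrary 255-character limit, written
against 7.2-era PL/pgSQL conventions (trigger functions return OPAQUE, and the
plpgsql language must already be installed with createlang):]

create function truncate_note() returns opaque as '
begin
    -- silently truncate the incoming value before it is stored
    new.note := substring(new.note from 1 for 255);
    return new;
end;
' language 'plpgsql';

create trigger truncate_note_before
    before insert or update on customers
    for each row execute procedure truncate_note();

With the column declared as TEXT (or an unconstrained VARCHAR), values longer
than 255 characters are then cut down by the trigger instead of raising the
"value too long" error.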