Discussion: 100x slowdown for nearly identical tables

100x slowdown for nearly identical tables

From: Craig James
Date:
I have two tables that are nearly identical, yet the same query runs 100x slower on the newer one.  The two tables have the same number of rows (+/- about 1%), and are roughly the same size:

db=> SELECT relname AS table_name,
db-> pg_size_pretty(pg_relation_size(oid)) AS table_size,
db-> pg_size_pretty(pg_total_relation_size(oid)) AS total_size
db-> FROM pg_class
db-> WHERE relkind in ('r','i')
db-> ORDER BY pg_relation_size(oid) DESC;
               table_name               | table_size | total_size
----------------------------------------+------------+------------
 old_str_conntab                        | 26 GB      | 27 GB
 str_conntab                            | 20 GB      | 20 GB


Both tables have a single index, the primary key.  The new table has several more columns, but they're mostly empty (note that the new table is SMALLER, yet it is 100x slower).

I've already tried "reindex table ..." and "analyze table".  No difference.

This is running on PG 8.4.17 and Ubuntu 10.04.  Data is in a RAID10 (8 disks), and WAL is on a RAID1, both controlled by an LSI 3WARE 9650SE-12ML with BBU.

If I re-run the same query, both the old and new tables drop to about 35 msec.  But the question is, why is the initial query so fast on the old table, and so slow on the new table?  I have three other servers with similar or identical hardware/software, and this happens on all of them, including on a 9.1.2 version of Postgres.

Thanks in advance...
Craig


db=> explain analyze select id, 1 from str_conntab
where (id >= 12009977 and id <= 12509976) order by id;

 Index Scan using new_str_conntab_pkey_3217 on str_conntab
      (cost=0.00..230431.33 rows=87827 width=4)
      (actual time=65.771..51341.899 rows=48613 loops=1)
   Index Cond: ((id >= 12009977) AND (id <= 12509976))
 Total runtime: 51350.556 ms

db=> explain analyze select id, 1 from old_str_conntab
where (id >= 12009977 and id <= 12509976) order by id;

 Index Scan using str_conntab_pkey on old_str_conntab
     (cost=0.00..82262.56 rows=78505 width=4)
     (actual time=38.327..581.235 rows=48725 loops=1)
   Index Cond: ((id >= 12009977) AND (id <= 12509976))
 Total runtime: 586.071 ms

db=> \d str_conntab
      Table "registry.str_conntab"
      Column      |  Type   | Modifiers
------------------+---------+-----------
 id               | integer | not null
 contab_len       | integer |
 contab_data      | text    |
 orig_contab_len  | integer |
 orig_contab_data | text    |
 normalized       | text    |
Indexes:
    "new_str_conntab_pkey_3217" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "parent" CONSTRAINT "fk_parent_str_conntab_parent_id_3217" FOREIGN KEY (parent_id) REFERENCES str_conntab(id)
    TABLE "version" CONSTRAINT "fk_version_str_conntab_version_id_3217" FOREIGN KEY (version_id) REFERENCES str_conntab(id)

db=> \d old_str_conntab
 Table "registry.old_str_conntab"
   Column    |  Type   | Modifiers
-------------+---------+-----------
 id          | integer | not null
 contab_len  | integer |
 contab_data | text    |
Indexes:
    "str_conntab_pkey" PRIMARY KEY, btree (id)


Re: 100x slowdown for nearly identical tables

From: Tom Lane
Date:
Craig James <cjames@emolecules.com> writes:
> I have two tables that are nearly identical, yet the same query runs 100x
> slower on the newer one. ...

> db=> explain analyze select id, 1 from str_conntab
> where (id >= 12009977 and id <= 12509976) order by id;

>  Index Scan using new_str_conntab_pkey_3217 on str_conntab
>       (cost=0.00..230431.33 rows=87827 width=4)
>       (actual time=65.771..51341.899 rows=48613 loops=1)
>    Index Cond: ((id >= 12009977) AND (id <= 12509976))
>  Total runtime: 51350.556 ms

> db=> explain analyze select id, 1 from old_str_conntab
> where (id >= 12009977 and id <= 12509976) order by id;

>  Index Scan using str_conntab_pkey on old_str_conntab
>      (cost=0.00..82262.56 rows=78505 width=4)
>      (actual time=38.327..581.235 rows=48725 loops=1)
>    Index Cond: ((id >= 12009977) AND (id <= 12509976))
>  Total runtime: 586.071 ms

It looks like old_str_conntab is more or less clustered by "id",
and str_conntab not so much.  You could try EXPLAIN (ANALYZE, BUFFERS)
(on newer PG versions) to verify how many distinct pages are getting
touched during the indexscan.
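For example, something like this (the same query, with the BUFFERS option added; needs 9.0 or later) should report how many shared buffers were hit or read:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT id, 1 FROM str_conntab
    WHERE id >= 12009977 AND id <= 12509976
    ORDER BY id;

A poorly clustered table will show close to one heap page read per row returned, while a well clustered one touches far fewer pages.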

            regards, tom lane


Re: 100x slowdown for nearly identical tables

From: Craig James
Date:
On Wed, May 1, 2013 at 5:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig James <cjames@emolecules.com> writes:
>> I have two tables that are nearly identical, yet the same query runs 100x
>> slower on the newer one. ...
>
>> db=> explain analyze select id, 1 from str_conntab
>> where (id >= 12009977 and id <= 12509976) order by id;
>
>>  Index Scan using new_str_conntab_pkey_3217 on str_conntab
>>       (cost=0.00..230431.33 rows=87827 width=4)
>>       (actual time=65.771..51341.899 rows=48613 loops=1)
>>    Index Cond: ((id >= 12009977) AND (id <= 12509976))
>>  Total runtime: 51350.556 ms
>
>> db=> explain analyze select id, 1 from old_str_conntab
>> where (id >= 12009977 and id <= 12509976) order by id;
>
>>  Index Scan using str_conntab_pkey on old_str_conntab
>>      (cost=0.00..82262.56 rows=78505 width=4)
>>      (actual time=38.327..581.235 rows=48725 loops=1)
>>    Index Cond: ((id >= 12009977) AND (id <= 12509976))
>>  Total runtime: 586.071 ms
>
> It looks like old_str_conntab is more or less clustered by "id",
> and str_conntab not so much.  You could try EXPLAIN (ANALYZE, BUFFERS)
> (on newer PG versions) to verify how many distinct pages are getting
> touched during the indexscan.

Yeah, now that you say it, it's obvious.  The original table was built with IDs from a sequence, so it's naturally clustered by ID.  The new table was built by reloading the data in alphabetical order by supplier name, which scattered the IDs all over the place.
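A quick way to confirm that is the planner's correlation statistic for the id column; something like this should show a value near 1 for the old table and near 0 for the new one:

    SELECT tablename, attname, correlation
      FROM pg_stats
     WHERE tablename IN ('str_conntab', 'old_str_conntab')
       AND attname = 'id';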

I guess I could actually cluster the new table, but since that one table holds about 90% of the total data in the database, that would be a chore.  Probably better to find a more efficient way to do the query.
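For reference, the cluster-and-reanalyze route would look roughly like this (CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the whole table, which is why it's a chore at this size):

    CLUSTER str_conntab USING new_str_conntab_pkey_3217;
    ANALYZE str_conntab;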

Thanks,
Craig

Re: 100x slowdown for nearly identical tables

From: Tom Lane
Date:
Craig James <cjames@emolecules.com> writes:
> On Wed, May 1, 2013 at 5:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It looks like old_str_conntab is more or less clustered by "id",
>> and str_conntab not so much.  You could try EXPLAIN (ANALYZE, BUFFERS)
>> (on newer PG versions) to verify how many distinct pages are getting
>> touched during the indexscan.

> Yeah, now that you say it, it's obvious.  The original table was built with
> ID from a sequence, so it's going to be naturally clustered by ID.  The new
> table was built by reloading the data in alphabetical order by supplier
> name, so it would have scattered the IDs all over the place.

> I guess I could actually cluster the new table, but since that one table
> holds about 90% of the total data in the database, that would be a chore.
> Probably better to find a more efficient way to do the query.

Just out of curiosity, you could try forcing a bitmap indexscan to see
how much that helps.  The planner evidently thinks "not at all", but
it's been wrong before ;-)
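For instance, disabling the plain indexscan for the session should push the planner onto a bitmap scan instead:

    SET enable_indexscan = off;
    EXPLAIN ANALYZE
    SELECT id, 1 FROM str_conntab
    WHERE id >= 12009977 AND id <= 12509976
    ORDER BY id;
    RESET enable_indexscan;

A bitmap heap scan reads the matching pages in physical order, so it can win when the heap isn't clustered by the index; the Sort it then needs for the ORDER BY is cheap at ~50k rows.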

            regards, tom lane