inaccurate stats on large tables

Поиск
Список
Период
Сортировка
От Kiran Mukhyala
Тема inaccurate stats on large tables
Дата
Msg-id 1220552512.3853.320.camel@sausalito.gene.com
обсуждение исходный текст
Ответы Re: inaccurate stats on large tables
Список pgsql-performance
Hello,

        I am running a select on a large table with two where
        conditions.
        Explain analyze shows that the estimated number of rows returned
        (190760) is much more than the actual rows returned (58221),
        which is probably the underlying cause for the poor performance
        I am seeing.

        Can someone please tell me how to improve the query planner
        estimate? I did try vacuum analyze. Here are some details:

        Explain plan:
        unison@csb-test=> explain analyze select * from paliasorigin a
        where
        a.origin_id=20 and a.tax_id=9606;


        QUERY PLAN
        --------------------------------------------------------------------------
         Bitmap Heap Scan on paliasorigin a  (cost=4901.38..431029.54
        rows=190760 width=118) (actual time=12.447..112.902 rows=58221
        loops=1)
           Recheck Cond: ((origin_id = 20) AND (tax_id = 9606))
           ->  Bitmap Index Scan on paliasorigin_search3_idx
        (cost=0.00..4853.69 rows=190760 width=0) (actual
        time=11.407..11.407
        rows=58221 loops=1)
                 Index Cond: ((origin_id = 20) AND (tax_id = 9606))

        Schema:
        unison@csb-test=> \d+ paliasorigin
         Column   |           Type           |
        Modifiers                             |
        -----------+--------------------------+------------
         palias_id | integer                  | not null
         origin_id | integer                  | not null
         alias     | text                     | not null
         descr     | text                     |
         tax_id    | integer                  |
         added     | timestamp with time zone | not null default
        timenow()
        Indexes:
            "palias_pkey" PRIMARY KEY, btree (palias_id)
            "paliasorigin_alias_unique_in_origin_idx" UNIQUE, btree
        (origin_id,
        alias)
            "paliasorigin_alias_casefold_idx" btree (upper(alias))
        CLUSTER
            "paliasorigin_alias_idx" btree (alias)
            "paliasorigin_o_idx" btree (origin_id)
            "paliasorigin_search1_idx" btree (palias_id, origin_id)
            "paliasorigin_search3_idx" btree (origin_id, tax_id,
        palias_id)
            "paliasorigin_tax_id_idx" btree (tax_id)
        Foreign-key constraints:
            "origin_id_exists" FOREIGN KEY (origin_id) REFERENCES
        origin(origin_id) ON UPDATE CASCADE ON DELETE CASCADE
        Has OIDs: no


        Number of rows:
        unison@csb-test=> select count(*) from paliasorigin;
          count
        ----------
         37909009
        (1 row)

        Pg version:
        unison@csb-test=> select version();
                                                  version
        --------------------------------------------------------------------------------------------
         PostgreSQL 8.3.3 on x86_64-unknown-linux-gnu, compiled by GCC
        gcc (GCC)
        4.1.0 (SUSE Linux)
        (1 row)


        Info from analyze verbose:
        unison@csb-test=> analyze verbose paliasorigin;
        INFO:  analyzing "unison.paliasorigin"
        INFO:  "paliasorigin": scanned 300000 of 692947 pages,
        containing
        16409041 live rows and 0 dead rows; 300000 rows in sample,
        37901986
        estimated total rows
        ANALYZE
        Time: 21999.506 ms


        Thank you,

        -Kiran Mukhyala




В списке pgsql-performance по дате отправления:

Предыдущее
От: "Scott Carey"
Дата:
Сообщение: Re: limit clause breaks query planner?
Следующее
От: "Matt Smiley"
Дата:
Сообщение: Re: limit clause breaks query planner?