Обсуждение: Need Some Explanation of an EXPLAIN

Поиск

Список

Период

Сортировка

Need Some Explanation of an EXPLAIN

От

Bill Thoen

Дата:

26 сентября 2008 г., 15:12:33

I'm trying to update a field in a table that has about 5 millin records
with a table  that has about 3.5 million records. I've created indexes
for th a joined columns, but PostgreSQL 8.1 doesn't seem to want to use
them. This makes for a very slow update.

Below are descriptions of the two tables followed by the query plan that
my PostgreSQL wants to use. So I have two questions:

1.) Why won't it use the indexes?
2.) How can I make this update faster?

TIA,
Bill Thoen


          Table "public.id2"
  Column   |     Type     | Modifiers
-----------+--------------+-----------
 grower_id | integer      |
 fmid      | character(7) |
 fsa_id    | character(9) |
Indexes:
    "id2_fsa_is_key" UNIQUE, btree (fsa_id)

              Table "public.growers"
    Column    |         Type          | Modifiers
--------------+-----------------------+-----------
 grower_id    | integer               |
 fsa_id       | character(9)          |
 co_name      | character varying(45) |
. . .
Indexes:
    "grower_fsa_id_key" btree (fsa_id)

fsa=# EXPLAIN UPDATE growers
              SET grower_id = id2.grower_id
              FROM id2 WHERE growers.fsa_id = id2.fsa_id;

                                QUERY PLAN
--------------------------------------------------------------------------
 Hash Join  (cost=70375.50..1603795.30 rows=6802720 width=355)
   Hash Cond: ("outer".fsa_id = "inner".fsa_id)
   ->  Seq Scan on growers  (cost=0.00..672373.20 rows=6802720 width=351)
   ->  Hash  (cost=46249.20..46249.20 rows=1966920 width=44)
         ->  Seq Scan on id2  (cost=0.00..46249.20 rows=1966920 width=44)
(5 rows)

Re: Need Some Explanation of an EXPLAIN

От

"Scott Marlowe"

Дата:

26 сентября 2008 г., 15:23:42

On Fri, Sep 26, 2008 at 12:13 PM, Bill Thoen <bthoen@gisnet.com> wrote:
> I'm trying to update a field in a table that has about 5 millin records with
> a table  that has about 3.5 million records. I've created indexes for th a
> joined columns, but PostgreSQL 8.1 doesn't seem to want to use them. This
> makes for a very slow update.

8.3 is a bit smarter on some of these types of queries, updating to
that may (or may not) help in this situation.

> 1.) Why won't it use the indexes?

The pgsql planner chooses whether to use indexes versus sequential
scans based on whether one or the other is "cheaper" to use.

> 2.) How can I make this update faster?

Get a faster server?

> fsa=# EXPLAIN UPDATE growers
>             SET grower_id = id2.grower_id
>             FROM id2 WHERE growers.fsa_id = id2.fsa_id;
>                                   QUERY PLAN
> --------------------------------------------------------------------------
> Hash Join  (cost=70375.50..1603795.30 rows=6802720 width=355)
>  Hash Cond: ("outer".fsa_id = "inner".fsa_id)
>  ->  Seq Scan on growers  (cost=0.00..672373.20 rows=6802720 width=351)
>  ->  Hash  (cost=46249.20..46249.20 rows=1966920 width=44)
>        ->  Seq Scan on id2  (cost=0.00..46249.20 rows=1966920 width=44)

1: Post explain analyze (if you can wait long enough for the update to finish.
2: Is the table bloated from multiple updates?  Have you been
vacuuming between each update you've tested?  Do you have autovacuum
enabled and is it aggresive enough to keep up?
3: If you want to force pgsql to use an index for testing purposes,
try running this before your update query:

set enable_indexscan=off;

and see if it's faster or slower.

Pgsql indexes don't have visibility, and this contributes to the
higher cost of index scans on pgsql than some other dbs.  However,
sequential scans make sense if you're updating the whole table in
almost any database.

Re: Need Some Explanation of an EXPLAIN

От

Tom Lane

Дата:

26 сентября 2008 г., 18:25:12

"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> On Fri, Sep 26, 2008 at 12:13 PM, Bill Thoen <bthoen@gisnet.com> wrote:
>> 2.) How can I make this update faster?

> Get a faster server?

Increasing work_mem might help.  If the hashtable size estimate is
right, you'd need something like 100MB to be sure that the join gets
done in only one batch.

Also, if a lot of the rows needn't actually change, it'd be worthwhile
to add a WHERE clause:

 UPDATE growers
 SET grower_id = id2.grower_id
 FROM id2 WHERE growers.fsa_id = id2.fsa_id
      AND growers.grower_id != id2.grower_id

            regards, tom lane

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Need Some Explanation of an EXPLAIN

Need Some Explanation of an EXPLAIN

Re: Need Some Explanation of an EXPLAIN

Re: Need Some Explanation of an EXPLAIN