Very slow update / hash join

Поиск
Список
Период
Сортировка
От Kurt Roeckx
Тема Very slow update / hash join
Дата
Msg-id 20160504222226.GA24085@roeckx.be
обсуждение исходный текст
Ответы Re: Very slow update / hash join  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-general
Hi,

I have an update query that's been running for 48 hours now.
Since it started it used about 2.5% CPU, and is writing to the
disk at about 3 MB/s, and reading at about 2 MB/s.  It's mostly
waiting for the disks.

The query plan looks like this:
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Update on certificates c  (cost=1224052.45..60710389.31 rows=19950420 width=1371)
   ->  Hash Join  (cost=1224052.45..60710389.31 rows=19950420 width=1371)
         Hash Cond: (c.id = cu.id)
         ->  Seq Scan on certificates c  (cost=0.00..8372137.31 rows=147868231 width=1258)
         ->  Hash  (cost=623981.20..623981.20 rows=19950420 width=117)
               ->  Seq Scan on certificates_update cu  (cost=0.00..623981.20 rows=19950420 width=117)

I've set the work_mem for this query to 6GB, which seem to be
enough to make pgsql_tmp empty, when it was only set to 1 GB it
did have files in it.  The process is using about 4GB of RAM, of
which 0.5 probably comes from the shared_buffers.

It did use 100% CPU at the start, but that was for about 80
seconds.  I'm guessing that's the time it needs to read and hash
the update table.  But from that point on, it gets really slow.

As you can see, the table is quite large and I want to update
about 20M rows of the 133M rows (not sure why the plan say 147M)

The table itself is 53GB, and the table it updates from is only
3.3 GB.  There are some index on some of the fields (like the id),
but none of them are being updated.  I tried removing those that
did get updated but that had little effect.  It does have foreign
keys to other tables, and other tables references it, but none of
the keys should get updated.

Reading or writing the whole table shouldn't take that long, and I
have no idea why it's this slow.  Does anybody have an idea why
it's this slow?

From what I understand, the hash join should be the one I want to
use, I tried to force the others but that doesn't seem to improve
anything.


Kurt



В списке pgsql-general по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: Function PostgreSQL 9.2
Следующее
От: Vincent Veyron
Дата:
Сообщение: Re: Thoughts on "Love Your Database"