update query taking 24+ hours

Поиск
Список
Период
Сортировка
От Ken
Тема update query taking 24+ hours
Дата
Msg-id 45A9CEA3.3020903@kwasnicki.com
обсуждение исходный текст
Ответы Re: update query taking 24+ hours
Список pgsql-sql
Hello,
I have postgres 8.1 on a linux box: 2.6Ghz P4, 1.5GB ram, 320GB hard 
drive.  I'm performing an update between two large tables and so far 
it's been running for 24+ hours.

I have two tables:
Master:
x int4
y int4
val1 int2
val2 int2

Import:
x int4
y int4
val int2

Each table has about 100 million rows.  I want to populate val2 in 
Master with val from Import where the two tables match on x and y.
So, my query looks like:
UPDATE Master SET val2=Import.val WHERE Master.x=Import.x AND 
Master.y=Import.y;

Both tables have indexes on the x and y columns.  Will that help?

Is there a better way to do this?  In each table x,y are unique, does 
that make a difference?  ie: would it be faster to run some kind of 
query, or loop, that just goes through each row in Import and updates 
Master (val2=val) where x=x and y=y?
If this approach would be better how to construct such a SQL statement?

The other weird thing is that when I monitor the system with xload it 
shows two bars of load, and the hard drive is going nuts,  so far my 
database directory has grown by 25GB,  however when I run "top" the 
system shows 98% idle and the postmaster process is usually only between 
1-2% CPU, although it is using 50% (750MB) ram.  Also the process shows 
up with a "D" status in the "S" column. 

Not sure what is going on.  If the size of the tables makes what I'm 
trying to do insane, or if I just have a bad SQL approach, or if 
something is wrong with my postgres configuration.

Really appreciate any help!
Thanks!
Ken





В списке pgsql-sql по дате отправления:

Предыдущее
От: Andrew Sullivan
Дата:
Сообщение: Re: deleting records from a table
Следующее
От: Tom Lane
Дата:
Сообщение: Re: update query taking 24+ hours