Обсуждение: Make UPDATE query quicker?

Поиск
Список
Период
Сортировка

Make UPDATE query quicker?

От
James David Smith
Дата:
Hi all,

Wondered if someone had any tips about how to do this UPDATE query
quicker. I've got two tables:

CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)

CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)

Each table has 172800 rows in it.

I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id.  I do this with this
query:

UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)

However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).

Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?

Thanks

James


Re: Make UPDATE query quicker?

От
Michael Wood
Дата:
On 12 October 2012 12:16, James David Smith <james.david.smith@gmail.com> wrote:
>
> Hi all,
>
> Wondered if someone had any tips about how to do this UPDATE query
> quicker. I've got two tables:
>
> CREATE TABLE import_table
> ( date_time TIMESTAMP
> person_id, INTEGER
> data REAL)
>
> CREATE TABLE master_table
> (date_time TIMESTAMP
> person_id INTEGER
> data REAL)
>
> Each table has 172800 rows in it.
>
> I want to move the 'data' from the import_table to the master_table by
> matching on both the date_time and the person_id.  I do this with this
> query:
>
> UPDATE master_table a
> SET data =
> (SELECT b.date
> FROM import_table b
> WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
> b.data IS NOT NULL)
>
> However I need to do this about 20 times, and the first time is still
> running at the moment (about 15 minutes).
>
> Am I doing something wrong? Should I put some indexes on the columns
> somehow to improve performance?

I don't know if this will help performance-wise, but maybe it's worth a try:

See if doing it with an UPDATE FROM helps at all.  e.g.:

http://stackoverflow.com/a/2766766/495319

--
Michael Wood <esiotrot@gmail.com>


Re: Make UPDATE query quicker?

От
James David Smith
Дата:
Michael,

Thanks, I will give that a try later today.

The first of the 20 updates I need to do took about 25 minutes in the
end by the way.

All other useful suggestions gratefully recevied... ;-)

Thanks

James


On 12 October 2012 12:53, Michael Wood <esiotrot@gmail.com> wrote:
> On 12 October 2012 12:16, James David Smith <james.david.smith@gmail.com> wrote:
>>
>> Hi all,
>>
>> Wondered if someone had any tips about how to do this UPDATE query
>> quicker. I've got two tables:
>>
>> CREATE TABLE import_table
>> ( date_time TIMESTAMP
>> person_id, INTEGER
>> data REAL)
>>
>> CREATE TABLE master_table
>> (date_time TIMESTAMP
>> person_id INTEGER
>> data REAL)
>>
>> Each table has 172800 rows in it.
>>
>> I want to move the 'data' from the import_table to the master_table by
>> matching on both the date_time and the person_id.  I do this with this
>> query:
>>
>> UPDATE master_table a
>> SET data =
>> (SELECT b.date
>> FROM import_table b
>> WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
>> b.data IS NOT NULL)
>>
>> However I need to do this about 20 times, and the first time is still
>> running at the moment (about 15 minutes).
>>
>> Am I doing something wrong? Should I put some indexes on the columns
>> somehow to improve performance?
>
> I don't know if this will help performance-wise, but maybe it's worth a try:
>
> See if doing it with an UPDATE FROM helps at all.  e.g.:
>
> http://stackoverflow.com/a/2766766/495319
>
> --
> Michael Wood <esiotrot@gmail.com>


Re: Make UPDATE query quicker?

От
VB N
Дата:

On Fri, Oct 12, 2012 at 3:46 PM, James David Smith <james.david.smith@gmail.com> wrote:
Hi all,

Wondered if someone had any tips about how to do this UPDATE query
quicker. I've got two tables:

CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)

CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)

Each table has 172800 rows in it.

I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id.  I do this with this
query:

UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)

However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).

Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?

Any idea how long the SELECT is taking ? Indexes on import_table (an Index on person_id) should help. ofcourse it all depends on the uniqueness of the column and size of the table etc. Whats the uniqueness (the n_distinct from pg_stats table) on all the 3 columns in the WHERE clause ?

Regards,
VB

Re: Make UPDATE query quicker?

От
James David Smith
Дата:
Hi VB,

1) No idea how long the select is taking unfortunately. How can I
figure this out?
2) I'll stick an index on person_id in the import_table.
3) Not sure what you mean by this unfortunately: "Whats the uniqueness
(the n_distinct from pg_stats table) on all the 3 columns in the WHERE
clause ?"

Thanks

James




On 12 October 2012 15:00, VB N <vbnpgc@gmail.com> wrote:
>
> On Fri, Oct 12, 2012 at 3:46 PM, James David Smith
> <james.david.smith@gmail.com> wrote:
>>
>> Hi all,
>>
>> Wondered if someone had any tips about how to do this UPDATE query
>> quicker. I've got two tables:
>>
>> CREATE TABLE import_table
>> ( date_time TIMESTAMP
>> person_id, INTEGER
>> data REAL)
>>
>> CREATE TABLE master_table
>> (date_time TIMESTAMP
>> person_id INTEGER
>> data REAL)
>>
>> Each table has 172800 rows in it.
>>
>> I want to move the 'data' from the import_table to the master_table by
>> matching on both the date_time and the person_id.  I do this with this
>> query:
>>
>> UPDATE master_table a
>> SET data =
>> (SELECT b.date
>> FROM import_table b
>> WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
>> b.data IS NOT NULL)
>>
>> However I need to do this about 20 times, and the first time is still
>> running at the moment (about 15 minutes).
>>
>> Am I doing something wrong? Should I put some indexes on the columns
>> somehow to improve performance?
>
>
> Any idea how long the SELECT is taking ? Indexes on import_table (an Index
> on person_id) should help. ofcourse it all depends on the uniqueness of the
> column and size of the table etc. Whats the uniqueness (the n_distinct from
> pg_stats table) on all the 3 columns in the WHERE clause ?
>
> Regards,
> VB


Re: Make UPDATE query quicker?

От
Tim Bowden
Дата:
On Fri, 2012-10-12 at 11:16 +0100, James David Smith wrote:
> Hi all,
>
> Wondered if someone had any tips about how to do this UPDATE query
> quicker. I've got two tables:
>
> CREATE TABLE import_table
> ( date_time TIMESTAMP
> person_id, INTEGER
> data REAL)
>
> CREATE TABLE master_table
> (date_time TIMESTAMP
> person_id INTEGER
> data REAL)
>
> Each table has 172800 rows in it.
>
> I want to move the 'data' from the import_table to the master_table by
> matching on both the date_time and the person_id.  I do this with this
> query:
>
> UPDATE master_table a
> SET data =
> (SELECT b.date
> FROM import_table b
> WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
> b.data IS NOT NULL)
>
> However I need to do this about 20 times, and the first time is still
> running at the moment (about 15 minutes).
>
> Am I doing something wrong? Should I put some indexes on the columns
> somehow to improve performance?
>
> Thanks
>
> James
>
>

For this you definitely want indexes.  Also you have not defined any
primary keys.  I'm going to assume the combination of date_time and
person_id defines a unique record.  A suitable index will automatically
be created if you define these two columns as the primary key.

ALTER TABLE master_table ADD PRIMARY KEY(date_time, person_id);


Do the same for import_table.

Also your query isn't doing what you think it is.  Where b.data is null,
then no record is returned by the select statement but the update is not
conditional so you end up with a.data is null.  You might want to check
any records in master_table that previously had a data value but the
same record in update_table had a null value for data. SELECT * from
import_table WHERE data IS NULL; would be a good place to start.

The query you want is something like this:

UPDATE master_table a SET data = b.data from import_table b where
    a.date_time = b.date_time and a.person_id = a.person_id
    and b.data is not null;

HTH
Tim Bowden