optimization ideas for frequent, large(ish) updates in frequently accessed DB?

Hi,

one of our tables has to be updated frequently, but concurrently running
SELECT-queries must also have low latency times (it's being accessed
through a web interface).

I'm looking for ideas that might improve the interactive performance of
the system, without slowing down the updates too much. Here are the
characteristics of the table and its use:

- approx. 2 million rows

- approx. 4-5 million rows per day are replaced in short bursts of
1-200k rows (average ~3000 rows per update)

- the table needs 6 indexes (not all indexes are used all the time, but
keeping them all the time slows the system down less than re-creating
some of them just before they're needed and dropping them afterwards)

- an "update" means that 1-200k rows with a common value in a particular
field are replaced with an arbitrary number of new rows (with the same
value in that field), i.e.:

begin transaction;
   delete from t where id=5;
   insert into t (id,...) values (5,...);
   ... [1-200k rows]
end;

The problem is that a large update of this kind can delay SELECT
queries running in parallel for several seconds, so the web interface
used by several people becomes unusable for a short while.

Currently, I'm using temporary tables:

create table xyz as select * from t limit 0;
insert into xyz ...
...
begin transaction;
delete from t where id=5;
insert into t select * from xyz;
end;
drop table xyz;

This is slightly faster than inserting directly into t (and probably
faster than using COPY, even though that might reduce the overall
load on the database).
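
In case it's clearer spelled out, the COPY variant I was weighing would
look roughly like this (the staging table is the same xyz as above; the
file path is just a placeholder, and COPY ... FROM STDIN from the client
would do the same job):

create table xyz as select * from t limit 0;
copy xyz from '/path/to/new_rows.dat';  -- bulk-load the replacement rows into the staging table
begin transaction;
delete from t where id=5;
insert into t select * from xyz;
end;
drop table xyz;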

What other possibilities are there, other than splitting up the 15
columns of that table into several smaller tables, which is something
I'd like to avoid? Would replication help? (I doubt it, but I haven't
tried it yet...) Writing to a separate table (without indexes), then
creating the indexes and renaming it to the "read" table periodically,
in a double-buffering fashion, probably wouldn't work either(?), since
the views and triggers would have to be re-created every time as well,
and other problems might arise.
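
For clarity, the double-buffering swap I have in mind would look roughly
like this (the names are made up, and as said, the views and triggers on
t would still have to be re-created after each swap):

create table t_next as select * from t limit 0;  -- no indexes yet, so loading is cheap
-- ... load t_next with the current data plus the pending updates ...
create index t_next_id_idx on t_next (id);       -- then re-create all 6 indexes
-- ... the other 5 indexes ...
begin transaction;
alter table t rename to t_prev;
alter table t_next rename to t;
end;
drop table t_prev;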

The postgresql.conf options are already reasonably tweaked for
performance (IMHO), but perhaps some settings are particularly critical:

shared_buffers=100000
(I tried many values, this seems to work well for us - 12GB RAM)
wal_buffers=500
sort_mem=800000
checkpoint_segments=16
effective_cache_size=1000000
etc.

Any help/suggestions would be greatly appreciated... Even if it's
something like "you need a faster db box, there's no other way" ;-)

Regards,
  Marinos
