Discussion: DB Performance decreases due to often written/accessed table


DB Performance decreases due to often written/accessed table

From:
"Jens Schipkowski"
Date:
Hi,

we've got performance problems due to repeated SELECT, UPDATE, DELETE and
INSERT statements. These statements have to be executed every 10 seconds,
but they run into a timeout.
To rule out problems with our Java software and the JDBC driver, we put
the repeating sequence of statements into a file more than 100k times
(half a million statements) and executed "psql ourDB -f ourFile.sql -o
/dev/null". To accelerate the onset of the performance drop, we started 6
instances of this command.
The performance drop occurred after 10 minutes, pushing the server to 0
percent idle and 85 - 95 percent user CPU.
To trace the statement that raised the load, we are using pg_locks and
pg_stat_activity with current_query enabled. The responsible statement is
the DELETE; it hangs until it is cancelled by the timeout. The first run
on a freshly vacuumed DB took 300 - 600 ms.
In a second test we removed the DELETE statements to see which statements
also take longer as the amount of data increases. After half an hour the
SELECT statements timed out.
An additional VACUUM - every 1 minute - delays the onset of the timeouts
by a factor of 5 - 6.
It may not sound professional, but the database seems to age as the number
of executed statements increases.

The Statements
---------------
// normal insert - does not seem to be the problem - runtime is ok
INSERT INTO tbl_reg(idreg,idtype,attr1,...,attr6,loc1,...,loc3,register)
VALUES(nextval('tbl_reg_idreg_seq'),1,[attr],[loc],1);

// the select finds out which records have no twin
// a twin is defined as a record with the same attr* values
// speed decreases over time until the statement is timed out by PostgreSQL
SELECT *
 FROM tbl_reg reg
WHERE register <> loc1 AND
    idreg NOT IN
        (
        SELECT reg.idreg
        FROM tbl_reg reg, tbl_reg regtwin
        WHERE regtwin.register = 1 AND
            regtwin.type <> 20 AND
            reg.attr1 = regtwin.attr1 AND
            reg.attr2 = regtwin.attr2 AND
            reg.attr3 = regtwin.attr3 AND
            reg.attr4 = regtwin.attr4 AND
            reg.attr5 = regtwin.attr5 AND
            reg.attr6 = regtwin.attr6 AND
            reg.idreg <> regtwin.idreg AND
            reg.register = 2
        );
I tried to optimize the select statement, but the GROUP BY ... HAVING
count(*) > 1 alternative is half as fast as this statement, judging by the
query plans from EXPLAIN ANALYZE.

// delete data that has no twin
// speed decreases drastically over time until the statement is timed out by PostgreSQL
DELETE
 FROM tbl_reg
WHERE idreg IN
    (
    SELECT reg.idreg
    FROM tbl_reg reg, tbl_reg regtwin
    WHERE regtwin.register = 1 AND
        regtwin.type <> 20 AND
        reg.attr1 = regtwin.attr1 AND
        reg.attr2 = regtwin.attr2 AND
        reg.attr3 = regtwin.attr3 AND
        reg.attr4 = regtwin.attr4 AND
        reg.attr5 = regtwin.attr5 AND
        reg.attr6 = regtwin.attr6 AND
        reg.idreg <> regtwin.idreg AND
        reg.register = 2
    ) OR
    (loc1 = '2' AND loc2 = '2');
The runtime of this statement increases until it is cancelled by
PostgreSQL.

// the WHERE clause of this update statement is normally built in Java
UPDATE tbl_reg SET loc1=2 WHERE idreg IN ('...',...,'...');

The Table
---------------
Tested with: 20,000, 80,000 and 500,000 records

CREATE TABLE tbl_reg
(
   idreg bigserial NOT NULL,
   idtype int8 DEFAULT 0,
   attr1 int4,
   attr2 int4,
   attr3 varchar(20),
   attr4 varchar(20),
   attr5 int4,
   attr6 varchar(140) DEFAULT ''::character varying,
   loc1 int2 DEFAULT 0,
   loc2 int2 DEFAULT 0,
   loc3 int2 DEFAULT 0,
   register int2 DEFAULT 1,
   "timestamp" timestamp DEFAULT now(),
   CONSTRAINT tbl_reg_pkey PRIMARY KEY (idreg)
)
WITHOUT OIDS;

The Hardware
----------------
Dual Xeon 3.2GHz Hyperthreading
SCSI hard drives
RAID and non-RAID tested


Our problem is that we cannot see any potential to improve the SQL
statements. Indexing the attr* columns does not seem to be a solution,
because the data must not be unique (twins) and changes very often, so
reindexing would take too long.


thanks,
Jens Schipkowski

Re: DB Performance decreases due to often written/accessed table

From:
Richard Huxton
Date:
Jens Schipkowski wrote:
> Hi,
>
> we've got performance problems due to repeating SELECT, UPDATE, DELETE,
> INSERT statements. This statements have to be executed every 10 seconds,
> but they run into a timeout.
> To obviate problems regarding to our Java Software and JDBC drivers, we
> put the repeating sequence of statements to a file more than 100k times
> (half a million statements) and executed "psql ourDB -f ourFile.sql -o
> /dev/null". To accelerate the occurence of the performance drop, we
> started 6 instances of this command.
> The performance drop occured after 10 minutes shifting the server to 0
> percent idle and 85 - 95 percent user.

After 10 minutes of what? Did the half-million statements complete? If
not, how many got completed? Were they all in separate transactions or
did you batch them? How often were you vacuuming here?

> For tracing the statement which raised the load, we are using pg_locks,
> pg_stat_activity with current_query enabled. The responsible statement
> is the DELETE, it hangs until its canceled by timeout. The first run on
> an vacuumed DB took 300 - 600ms.
> In a second test we removed the DELETE statements to see wich statements
> also needs longer time by increasing the amount of data. After half an
> hour the SELECT statements timed out.
> An additional VACUUM - every 1 minute - does extend the timeout
> occurence by factor 5 - 6.

And running vacuum every 30 seconds does what?

> It does not sound professional, but the database seems to be aging by
> the increase of executed statements.

It sounds very likely if you aren't vacuuming enough or the tables are
growing rapidly.

> The Statements
> ---------------
> // normal insert - does not seem to be the problem - runtime is ok
> INSERT INTO tbl_reg(idreg,idtype,attr1,...,attr6,loc1,...,loc3,register)
> VALUES(nextval('tbl_reg_idreg_seq'),1,[attr],[loc],1);
>
> // select finds out which one has not an twin
> // a twin is defined as record with the same attr* values
> // decreases speed over time until timeout by postgresql
> SELECT *
> FROM tbl_reg reg
> WHERE register <> loc1 AND
>     idreg NOT IN
>         (
>         SELECT reg.idreg
>         FROM tbl_reg reg, tbl_reg regtwin
>         WHERE regtwin.register = 1 AND
>             regtwin.type <> 20 AND
>             reg.attr1 = regtwin.attr1 AND
>             reg.attr2 = regtwin.attr2 AND
>             reg.attr3 = regtwin.attr3 AND
>             reg.attr4 = regtwin.attr4 AND
>             reg.attr5 = regtwin.attr5 AND
>             reg.attr6 = regtwin.attr6 AND
>             reg.idreg <> regtwin.idreg AND
>             reg.register = 2
>         );
> I tried to optimize the seslect statement but the group by having count(*)
>> 1 solution is half as fast as this statement - relating to the query
> plan of EXPLAIN ANALYZE.

And what did EXPLAIN ANALYSE show here? I'm guessing you're getting time
increasing as the square of the number of rows in tbl_reg. So, if 50
rows takes 2500s then 100 rows will take 10000s. Now, if you had enough
RAM (or an index) I'd expect the planner to process the table in a
sorted manner so the query-time would increase linearly.

Oh, and you're doing one join more than you need to here (counting the
NOT IN as a join). You could get by with a LEFT JOIN and a test for
idreg being null on the right-hand table.
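
As a minimal sketch of that rewrite (untested here, and assuming the same
twin conditions as the NOT IN subquery; since idreg is the primary key, the
IS NULL test reliably marks the rows without a match):

SELECT reg.*
  FROM tbl_reg reg
  LEFT JOIN tbl_reg regtwin
    ON  regtwin.register = 1
    AND regtwin.type <> 20
    AND regtwin.attr1 = reg.attr1
    AND regtwin.attr2 = reg.attr2
    AND regtwin.attr3 = reg.attr3
    AND regtwin.attr4 = reg.attr4
    AND regtwin.attr5 = reg.attr5
    AND regtwin.attr6 = reg.attr6
    AND regtwin.idreg <> reg.idreg
    AND reg.register = 2
 WHERE reg.register <> reg.loc1
   AND regtwin.idreg IS NULL;   -- keep only the rows that found no twin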

> // delete data without a twin
> // drastically decreases speed over time until timeout by postgresql
[snip delete doing the same query as above]
> The runtime of this statement increases until it will canceled by
> PostgreSQL.
>
> // the where clause of this update statement is normally build in java
> UPDATE tbl_reg SET loc1=2 WHERE idreg IN ('...',...,'...');
>
> The Table
> ---------------
> Tested with: 20.000, 80.000, 500.000 records
>
> CREATE TABLE tbl_reg
> (
>   idreg bigserial NOT NULL,
>   idtype int8 DEFAULT 0,

You can have more than 4 billion "types"?

>   attr1 int4,
>   attr2 int4,
>   attr3 varchar(20),
>   attr4 varchar(20),
>   attr5 int4,
>   attr6 varchar(140) DEFAULT ''::character varying,
>   loc1 int2 DEFAULT 0,
>   loc2 int2 DEFAULT 0,
>   loc3 int2 DEFAULT 0,
>   register int2 DEFAULT 1,
>   "timestamp" timestamp DEFAULT now(),

You probably want timestamp with time zone.

>   CONSTRAINT tbl_reg_pkey PRIMARY KEY (idreg)
> )
> WITHOUT OIDS;
>
> The Hardware
> ----------------
> Dual Xeon 3.2GHz Hyperthreading
> SCSI harddrives
> RAID and non-RAID tested
>
> We have the problem, that we cannot see any potential to improve SQL
> statements. Indexing the attr* columns seems not to be an solution,
> because the data mustn't be unique (twins) and changes really often so
> reindexing will took too long.

Eh? Why would an index force uniqueness? And are you sure that adding an
index makes updates too slow? What did your testing show as the
slow-down? I'd be tempted to put an index on attr1,attr2,attr5 (or
whichever combination provides the most selectivity) then make sure your
statistics are up to date (ANALYSE) and see if the plans change.
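
For example, a minimal sketch of that (the index name is made up, and the
choice of attr1, attr2, attr5 is only a guess to be checked against the
real data):

CREATE INDEX tbl_reg_attr125_idx ON tbl_reg (attr1, attr2, attr5);
ANALYZE tbl_reg;   -- refresh the planner statistics
-- then re-run EXPLAIN ANALYZE on the SELECT and DELETE and compare plans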

Of course, that's assuming your postgresql.conf has some reasonable
performance-related settings.

Oh, I'd also wonder whether, with "twin-ness" being such an important
concept, it isn't its own thing and thus perhaps deserves its own table.
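
A very rough sketch of what that could look like (purely illustrative -
the table and column names are invented):

CREATE TABLE tbl_twin_group
(
    idgroup bigserial PRIMARY KEY,
    attr1   int4,
    attr2   int4,
    attr3   varchar(20),
    attr4   varchar(20),
    attr5   int4,
    attr6   varchar(140)
);
-- tbl_reg would then carry an idgroup reference instead of repeating the
-- attr* columns, and "has a twin" becomes a lookup of other rows sharing
-- the same idgroup.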

--
   Richard Huxton
   Archonet Ltd

Re: DB Performance decreases due to often written/accessed table

From:
"Jens Schipkowski"
Date:
On Thu, 19 Oct 2006 16:55:34 +0200, Richard Huxton <dev@archonet.com>
wrote:

> Jens Schipkowski wrote:
>> Hi,
>>  we've got performance problems due to repeating SELECT, UPDATE,
>> DELETE, INSERT statements. This statements have to be executed every 10
>> seconds, but they run into a timeout.
>> To obviate problems regarding to our Java Software and JDBC drivers, we
>> put the repeating sequence of statements to a file more than 100k times
>> (half a million statements) and executed "psql ourDB -f ourFile.sql -o
>> /dev/null". To accelerate the occurence of the performance drop, we
>> started 6 instances of this command.
>> The performance drop occured after 10 minutes shifting the server to 0
>> percent idle and 85 - 95 percent user.
>
> After 10 minutes of what?
10 minutes after starting the test using the command above.
> Did the half-million statements complete? If not, how many got
> completed? Were they all in separate transactions or did you batch them?
> How ofternwere you vacuuming here?
We wrote an SQL batch file which simulates the repeating cycle of SELECT,
UPDATE, DELETE, INSERT. The INSERT is fired from another backend.
The half-million statements in this file will probably complete, but only
after all SELECT and DELETE statements have timed out.
We had 6 separate transactions executing the batch file.
>
>> For tracing the statement which raised the load, we are using pg_locks,
>> pg_stat_activity with current_query enabled. The responsible statement
>> is the DELETE, it hangs until its canceled by timeout. The first run on
>> an vacuumed DB took 300 - 600ms.
>> In a second test we removed the DELETE statements to see wich
>> statements also needs longer time by increasing the amount of data.
>> After half an hour the SELECT statements timed out.
>> An additional VACUUM - every 1 minute - does extend the timeout
>> occurence by factor 5 - 6.
>
> And running vacuum every 30 seconds does what?
Not yet fully tested. It seems to reduce the slowdown.
But reducing the rate of the slowdown doesn't solve the problem. The
problem is the drop in execution speed of the DELETE and SELECT statements
at a table row count between 150k - 200k. The table only starts growing
after DELETE statements fail due to the timeout.
>
>> It does not sound professional, but the database seems to be aging by
>> the increase of executed statements.
>
> It sounds very likely if you aren't vacuuming enough or the tables are
> growing rapidly.
Vacuuming once a minute is not enough? We execute about 3k statements per
minute (at the start of the test). 1/4 of them are INSERTs and DELETEs.
After 5 minutes a DELETE takes about 50 seconds - compared to 300 - 600 ms
at the start.
>
>> The Statements
>> ---------------
>> // normal insert - does not seem to be the problem - runtime is ok
>> INSERT INTO
>> tbl_reg(idreg,idtype,attr1,...,attr6,loc1,...,loc3,register)
>> VALUES(nextval('tbl_reg_idreg_seq'),1,[attr],[loc],1);
>>  // select finds out which one has not an twin
>> // a twin is defined as record with the same attr* values
>> // decreases speed over time until timeout by postgresql
>> SELECT *
>> FROM tbl_reg reg
>> WHERE register <> loc1 AND
>>     idreg NOT IN
>>         (
>>         SELECT reg.idreg
>>         FROM tbl_reg reg, tbl_reg regtwin
>>         WHERE regtwin.register = 1 AND
>>             regtwin.type <> 20 AND
>>             reg.attr1 = regtwin.attr1 AND
>>             reg.attr2 = regtwin.attr2 AND
>>             reg.attr3 = regtwin.attr3 AND
>>             reg.attr4 = regtwin.attr4 AND
>>             reg.attr5 = regtwin.attr5 AND
>>             reg.attr6 = regtwin.attr6 AND
>>             reg.idreg <> regtwin.idreg AND
>>             reg.register = 2
>>         );
>> I tried to optimize the seslect statement but the group by having
>> count(*)
>>> 1 solution is half as fast as this statement - relating to the query
>> plan of EXPLAIN ANALYZE.
>
> And what did EXPLAIN ANALYSE show here? I'm guessing you're getting time
> increasing as the square of the number of rows in tbl_reg. So, if 50
> rows takes 2500s then 100 rows will take 10000s. Now, if you had enough
> RAM (or an index) I'd expect the planner to process the table in a
> sorted manner so the query-time would increase linearly.

EXPLAIN ANALYZE shows at startup
    QUERY PLAN
-------------------------------------------------------------------------------------------------
  Seq Scan on tbl_reg  (cost=25841.35..31433.17 rows=72424 width=6) (actual
time=673.877..673.877 rows=0 loops=1)
    Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
2::smallint)))
    SubPlan
      ->  Merge Join  (cost=22186.21..25646.57 rows=77913 width=8) (actual
time=285.624..285.624 rows=0 loops=1)
            Merge Cond: (("outer".attr1 = "inner".attr1) AND
("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
"inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?") AND
("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
"inner".attr6))
            Join Filter: ("outer".idreg <> "inner".idreg)
            ->  Sort  (cost=4967.06..4971.65 rows=1835 width=56) (actual
time=285.622..285.622 rows=0 loops=1)
                  Sort Key: reg.attr1, (reg.attr2)::text,
(reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
                  ->  Seq Scan on tbl_reg reg  (cost=0.00..4867.59
rows=1835 width=56) (actual time=285.551..285.551 rows=0 loops=1)
                        Filter: (register = 2)
            ->  Sort  (cost=17219.15..17569.77 rows=140247 width=56) (never
executed)
                  Sort Key: regtwin.attr1, (regtwin.attr2)::text,
(regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
regtwin.attr6
                  ->  Seq Scan on tbl_reg regtwin  (cost=0.00..5229.70
rows=140247 width=56) (never executed)
                        Filter: ((register = 1) AND ("type" <> 20))
  Total runtime: 604.410 ms
(15 rows)

EXPLAIN ANALYZE shows after 10 minutes load and 1x vacuuming
       QUERY PLAN
----------------------------------------------------------------------------------------------------
  Seq Scan on tbl_reg  (cost=25841.35..31433.17 rows=72424 width=6) (actual
time=43261.910..43261.910 rows=0 loops=1)
    Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
2::smallint)))
    SubPlan
      ->  Merge Join  (cost=22186.21..25646.57 rows=77913 width=8) (actual
time=43132.063..43132.063 rows=0 loops=1)
            Merge Cond: (("outer".attr1 = "inner".attr1) AND
("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
"inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?") AND
("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
"inner".attr6))
            Join Filter: ("outer".idreg <> "inner".idreg)
            ->  Sort  (cost=4967.06..4971.65 rows=1835 width=56) (actual
time=387.071..387.872 rows=1552 loops=1)
                  Sort Key: reg.attr1, (reg.attr2)::text,
(reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
                  ->  Seq Scan on tbl_reg reg  (cost=0.00..4867.59
rows=1835 width=56) (actual time=303.966..325.614 rows=1552 loops=1)
                        Filter: (register = 2)
            ->  Sort  (cost=17219.15..17569.77 rows=140247 width=56)
(actual time=42368.934..42530.986 rows=145324 loops=1)
                  Sort Key: regtwin.attr1, (regtwin.attr2)::text,
(regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
regtwin.attr6
                  ->  Seq Scan on tbl_reg regtwin  (cost=0.00..5229.70
rows=140247 width=56) (actual time=0.015..1159.515 rows=145453 loops=1)
                        Filter: ((register = 1) AND ("type" <> 20))
  Total runtime: 44073.127 ms
(15 rows)

I know that the second query plan executes the sort, because it finds
matching data. So maybe indexing will help.

>
> Oh, and you're doing one join more than you need to here (counting the
> NOT IN as a join). You could get by with a LEFT JOIN and a test for
> idreg being null on the right-hand table.
It sounds good. First tests didn't improve the runtime - it needs more
extensive testing.
>
>> // delete data without a twin
>> // drastically decreases speed over time until timeout by postgresql
> [snip delete doing the same query as above]
>> The runtime of this statement increases until it will canceled by
>> PostgreSQL.
>>  // the where clause of this update statement is normally build in java
>> UPDATE tbl_reg SET loc1=2 WHERE idreg IN ('...',...,'...');
>>  The Table
>> ---------------
>> Tested with: 20.000, 80.000, 500.000 records
>>  CREATE TABLE tbl_reg
>> (
>>   idreg bigserial NOT NULL,
>>   idtype int8 DEFAULT 0,
>
> You can have more than 4 billion "types"?
it seems so, or not?
>
>>   attr1 int4,
>>   attr2 int4,
>>   attr3 varchar(20),
>>   attr4 varchar(20),
>>   attr5 int4,
>>   attr6 varchar(140) DEFAULT ''::character varying,
>>   loc1 int2 DEFAULT 0,
>>   loc2 int2 DEFAULT 0,
>>   loc3 int2 DEFAULT 0,
>>   register int2 DEFAULT 1,
>>   "timestamp" timestamp DEFAULT now(),
>
> You probably want timestamp with time zone.
No, just the server time is important. This is short-lived data.
>
>>   CONSTRAINT tbl_reg_pkey PRIMARY KEY (idreg)
>> )
>> WITHOUT OIDS;
>>  The Hardware
>> ----------------
>> Dual Xeon 3.2GHz Hyperthreading
>> SCSI harddrives
>> RAID and non-RAID tested
>>  We have the problem, that we cannot see any potential to improve SQL
>> statements. Indexing the attr* columns seems not to be an solution,
>> because the data mustn't be unique (twins) and changes really often so
>> reindexing will took too long.
>
> Eh? Why would an index force uniqueness? And are you sure that adding an
> index makes updates too slow? What did your testing show as the
> slow-down? I'd be tempted to put an index on attr1,attr2,attr5 (or
> whichever combination provides the most selectivity) then make sure your
> statistics are up to date (ANALYSE) and see if the plans change.
OK, I misunderstood PostgreSQL indexes. I will test it using a
multicolumn index.
>
> Of course, that's assuming your postgresql.conf has some reasonable
> performance-related settings.
The postgresql.conf settings have been optimized. We searched the web for
useful information and got help from Tom Lane on the mailing list.
>
> Oh, I'd also wonder whether, with "twin-ness" being such an important
> concept it isn't its own thing and thus perhaps deserve its own table.
>
It's important due to the software concept (conferencing groups).

Thank you for your suggestions. I will add indexes to the table and
overhaul the SELECT and DELETE statements. After testing I will post
results.

Re: DB Performance decreases due to often written/accessed table

From:
Richard Huxton
Date:
Jens Schipkowski wrote:
>> And running vacuum every 30 seconds does what?
> Not yet fully tested. It seems to lower the slow down.
> But minimizing the gain of slow down doesn't solve the problem. The
> Problem is the decrease of execution speed of DELETE and SELECT
> statements by a table row count between 150k - 200k. The table starts
> growing first after DELETE statements fails during timeout.
>>
>>> It does not sound professional, but the database seems to be aging by
>>> the increase of executed statements.
>>
>> It sounds very likely if you aren't vacuuming enough or the tables are
>> growing rapidly.
> vacuuming once a minute is not enough? We reach the execution of 3k
> statements per minute (startup time of testing). 1/4 of them are INSERTs
> and DELETEs. After 5 minutes a DELETE will took about 50 seconds -
> compared to startup time about 300 - 600ms.

You want to vacuum enough so that the deletes don't leave
ever-increasing gaps in your table. If you run VACUUM VERBOSE it will
tell you what it did - if the number of (non-)removable rows keeps
increasing, you're not vacuuming often enough. The trick is to vacuum
often but not have to do a lot of work in each run. The autovacuum tool in
recent versions tries to estimate this for you, but might not cope in your
case.
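
As a sketch of that check (the numbers reported will of course vary from
run to run):

VACUUM VERBOSE tbl_reg;
-- compare the removable/nonremovable row counts and page counts between
-- runs; if the dead-row figures keep climbing, vacuum this table more often.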

>>> The Statements
>>> ---------------
>>> // normal insert - does not seem to be the problem - runtime is ok
>>> INSERT INTO
>>> tbl_reg(idreg,idtype,attr1,...,attr6,loc1,...,loc3,register)
>>> VALUES(nextval('tbl_reg_idreg_seq'),1,[attr],[loc],1);
>>>  // select finds out which one has not an twin
>>> // a twin is defined as record with the same attr* values
>>> // decreases speed over time until timeout by postgresql
>>> SELECT *
>>> FROM tbl_reg reg
>>> WHERE register <> loc1 AND
>>>     idreg NOT IN
>>>         (
>>>         SELECT reg.idreg
>>>         FROM tbl_reg reg, tbl_reg regtwin
>>>         WHERE regtwin.register = 1 AND
>>>             regtwin.type <> 20 AND
>>>             reg.attr1 = regtwin.attr1 AND
>>>             reg.attr2 = regtwin.attr2 AND
>>>             reg.attr3 = regtwin.attr3 AND
>>>             reg.attr4 = regtwin.attr4 AND
>>>             reg.attr5 = regtwin.attr5 AND
>>>             reg.attr6 = regtwin.attr6 AND
>>>             reg.idreg <> regtwin.idreg AND
>>>             reg.register = 2
>>>         );
>>> I tried to optimize the seslect statement but the group by having
>>> count(*)
>>>> 1 solution is half as fast as this statement - relating to the query
>>> plan of EXPLAIN ANALYZE.
>>
>> And what did EXPLAIN ANALYSE show here? I'm guessing you're getting
>> time increasing as the square of the number of rows in tbl_reg. So, if
>> 50 rows takes 2500s then 100 rows will take 10000s. Now, if you had
>> enough RAM (or an index) I'd expect the planner to process the table
>> in a sorted manner so the query-time would increase linearly.
>
> EXPLAIN ANALYZE shows at startup
>    QUERY PLAN
> -------------------------------------------------------------------------------------------------
>
>  Seq Scan on tbl_reg  (cost=25841.35..31433.17 rows=72424 width=6)
> (actual time=673.877..673.877 rows=0 loops=1)
>    Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
> 2::smallint)))
>    SubPlan
>      ->  Merge Join  (cost=22186.21..25646.57 rows=77913 width=8)
> (actual time=285.624..285.624 rows=0 loops=1)
>            Merge Cond: (("outer".attr1 = "inner".attr1) AND
> ("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
> "inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?")
> AND ("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
> "inner".attr6))
>            Join Filter: ("outer".idreg <> "inner".idreg)
>            ->  Sort  (cost=4967.06..4971.65 rows=1835 width=56) (actual
> time=285.622..285.622 rows=0 loops=1)
>                  Sort Key: reg.attr1, (reg.attr2)::text,
> (reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
>                  ->  Seq Scan on tbl_reg reg  (cost=0.00..4867.59
> rows=1835 width=56) (actual time=285.551..285.551 rows=0 loops=1)
>                        Filter: (register = 2)
>            ->  Sort  (cost=17219.15..17569.77 rows=140247 width=56)
> (never executed)
>                  Sort Key: regtwin.attr1, (regtwin.attr2)::text,
> (regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
> regtwin.attr6
>                  ->  Seq Scan on tbl_reg regtwin  (cost=0.00..5229.70
> rows=140247 width=56) (never executed)
>                        Filter: ((register = 1) AND ("type" <> 20))
>  Total runtime: 604.410 ms
> (15 rows)
>
> EXPLAIN ANALYZE shows after 10 minutes load and 1x vacuuming
>       QUERY PLAN
> ----------------------------------------------------------------------------------------------------
>
>  Seq Scan on tbl_reg  (cost=25841.35..31433.17 rows=72424 width=6)
> (actual time=43261.910..43261.910 rows=0 loops=1)
>    Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
> 2::smallint)))
>    SubPlan
>      ->  Merge Join  (cost=22186.21..25646.57 rows=77913 width=8)
> (actual time=43132.063..43132.063 rows=0 loops=1)
>            Merge Cond: (("outer".attr1 = "inner".attr1) AND
> ("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
> "inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?")
> AND ("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
> "inner".attr6))
>            Join Filter: ("outer".idreg <> "inner".idreg)
>            ->  Sort  (cost=4967.06..4971.65 rows=1835 width=56) (actual
> time=387.071..387.872 rows=1552 loops=1)
>                  Sort Key: reg.attr1, (reg.attr2)::text,
> (reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
>                  ->  Seq Scan on tbl_reg reg  (cost=0.00..4867.59
> rows=1835 width=56) (actual time=303.966..325.614 rows=1552 loops=1)
>                        Filter: (register = 2)
>            ->  Sort  (cost=17219.15..17569.77 rows=140247 width=56)
> (actual time=42368.934..42530.986 rows=145324 loops=1)
>                  Sort Key: regtwin.attr1, (regtwin.attr2)::text,
> (regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
> regtwin.attr6
>                  ->  Seq Scan on tbl_reg regtwin  (cost=0.00..5229.70
> rows=140247 width=56) (actual time=0.015..1159.515 rows=145453 loops=1)
>                        Filter: ((register = 1) AND ("type" <> 20))
>  Total runtime: 44073.127 ms
> (15 rows)

OK - these plans look about the same, but the time is greatly different.
Both have rows=140247 as the estimated number of rows in tbl_reg. Either
you have many more rows in the second case (in which case you're not
running ANALYSE enough) or you have lots of gaps in the table (you're
not running VACUUM enough).

I'd then try putting an index on (attr1,attr2,attr3...attr6) and see if
that helps reduce time.

--
   Richard Huxton
   Archonet Ltd

Re: DB Performance decreases due to often written/accessed table

From:
"Jim C. Nasby"
Date:
On Thu, Oct 19, 2006 at 06:19:16PM +0100, Richard Huxton wrote:
> OK - these plans look about the same, but the time is greatly different.
> Both have rows=140247 as the estimated number of rows in tbl_reg. Either
>  you have many more rows in the second case (in which case you're not
> running ANALYSE enough) or you have lots of gaps in the table (you're
> not running VACUUM enough).

Look closer... the actual stats show that the sorts in the second case
are returning far more rows. And yes, analyze probably needs to happen.

> I'd then try putting an index on (attr1,attr2,attr3...attr6) and see if
> that helps reduce time.

With bitmap index scans, I think it'd be much better to create 6 indexes
and see which ones actually get used (and then drop the others).
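
A sketch of that approach (the index names are illustrative; the planner
can combine several of these via a bitmap AND where it helps):

CREATE INDEX tbl_reg_attr1_idx ON tbl_reg (attr1);
CREATE INDEX tbl_reg_attr2_idx ON tbl_reg (attr2);
CREATE INDEX tbl_reg_attr3_idx ON tbl_reg (attr3);
CREATE INDEX tbl_reg_attr4_idx ON tbl_reg (attr4);
CREATE INDEX tbl_reg_attr5_idx ON tbl_reg (attr5);
CREATE INDEX tbl_reg_attr6_idx ON tbl_reg (attr6);
ANALYZE tbl_reg;
-- check which indexes show up in EXPLAIN ANALYZE output for the SELECT and
-- DELETE, then DROP INDEX the ones that are never used.
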
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: DB Performance decreases due to often written/accessed table

From:
"Merlin Moncure"
Date:
On 10/19/06, Jens Schipkowski <jens.schipkowski@apus.co.at> wrote:
> // select finds out which one has not an twin
> // a twin is defined as record with the same attr* values
> // decreases speed over time until timeout by postgresql
> SELECT *
>  FROM tbl_reg reg
> WHERE register <> loc1 AND
>         idreg NOT IN
>                 (
>                 SELECT reg.idreg
>                 FROM tbl_reg reg, tbl_reg regtwin
>                 WHERE regtwin.register = 1 AND
>                         regtwin.type <> 20 AND
>                         reg.attr1 = regtwin.attr1 AND
>                         reg.attr2 = regtwin.attr2 AND
>                         reg.attr3 = regtwin.attr3 AND
>                         reg.attr4 = regtwin.attr4 AND
>                         reg.attr5 = regtwin.attr5 AND
>                         reg.attr6 = regtwin.attr6 AND
>                         reg.idreg <> regtwin.idreg AND
>                         reg.register = 2
>                 );

[...]

> We have the problem, that we cannot see any potential to improve SQL
> statements. Indexing the attr* columns seems not to be an solution,
> because the data mustn't be unique (twins) and changes really often so
> reindexing will took too long.

1. Your database design is the real culprit here.  If you want things
to run really quickly, solve the problem there by normalizing your
schema. Denormalization is the root cause of many, many problems
posted here on this list.
2. Barring that, the above query will run fastest by creating
multi-column indexes on the regtwin (attr*) fields and on reg (attr*).  The
real solution to problems like this is often proper indexing,
especially multi-column.  Saying indexes take too long to build is like
saying: 'I have a problem, so I am going to replace it with a much
worse problem'.
3. Try WHERE EXISTS / NOT EXISTS instead of WHERE IN / NOT IN (a sketch of
this rewrite follows below).
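
A minimal sketch of that NOT EXISTS rewrite (untested; it keeps the same
twin conditions as the original NOT IN subquery):

SELECT *
  FROM tbl_reg reg
 WHERE reg.register <> reg.loc1
   AND NOT (reg.register = 2
            AND EXISTS (SELECT 1
                          FROM tbl_reg regtwin
                         WHERE regtwin.register = 1
                           AND regtwin.type <> 20
                           AND regtwin.attr1 = reg.attr1
                           AND regtwin.attr2 = reg.attr2
                           AND regtwin.attr3 = reg.attr3
                           AND regtwin.attr4 = reg.attr4
                           AND regtwin.attr5 = reg.attr5
                           AND regtwin.attr6 = reg.attr6
                           AND regtwin.idreg <> reg.idreg));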

merlin

Re: DB Performance decreases due to often written/accessed table

From:
Richard Huxton
Date:
Jim C. Nasby wrote:
> On Thu, Oct 19, 2006 at 06:19:16PM +0100, Richard Huxton wrote:
>> OK - these plans look about the same, but the time is greatly different.
>> Both have rows=140247 as the estimated number of rows in tbl_reg. Either
>>  you have many more rows in the second case (in which case you're not
>> running ANALYSE enough) or you have lots of gaps in the table (you're
>> not running VACUUM enough).
>
> Look closer... the actual stats show that the sorts in the second case
> are returning far more rows. And yes, analyze probably needs to happen.

The results are different, I agree, but the plans (and estimates) are
the same. Given the deletes and inserts I wasn't sure whether this was
just lots more rows or a shift in values.

>> I'd then try putting an index on (attr1,attr2,attr3...attr6) and see if
>> that helps reduce time.
>
> With bitmap index scans, I think it'd be much better to create 6 indexes
> and see which ones actually get used (and then drop the others).

Good idea.

--
   Richard Huxton
   Archonet Ltd

Re: DB Performance decreases due to often written/accessed table

From:
"Jens Schipkowski"
Date:
On Thu, 19 Oct 2006 19:32:22 +0200, Merlin Moncure <mmoncure@gmail.com>
wrote:

> On 10/19/06, Jens Schipkowski <jens.schipkowski@apus.co.at> wrote:
>> // select finds out which one has not an twin
>> // a twin is defined as record with the same attr* values
>> // decreases speed over time until timeout by postgresql
>> SELECT *
>>  FROM tbl_reg reg
>> WHERE register <> loc1 AND
>>         idreg NOT IN
>>                 (
>>                 SELECT reg.idreg
>>                 FROM tbl_reg reg, tbl_reg regtwin
>>                 WHERE regtwin.register = 1 AND
>>                         regtwin.type <> 20 AND
>>                         reg.attr1 = regtwin.attr1 AND
>>                         reg.attr2 = regtwin.attr2 AND
>>                         reg.attr3 = regtwin.attr3 AND
>>                         reg.attr4 = regtwin.attr4 AND
>>                         reg.attr5 = regtwin.attr5 AND
>>                         reg.attr6 = regtwin.attr6 AND
>>                         reg.idreg <> regtwin.idreg AND
>>                         reg.register = 2
>>                 );
>
> [...]
>
>> We have the problem, that we cannot see any potential to improve SQL
>> statements. Indexing the attr* columns seems not to be an solution,
>> because the data mustn't be unique (twins) and changes really often so
>> reindexing will took too long.
>
> 1. your database design is the real culprit here.  If you want things
> to run really quickly, solve the problem there by normalizing your
> schema. denomalization is the root cause of many, many, problems
> posted here on this list.
Believe me, it is normalized. We also separated configuration and runtime
data, and this is a runtime table.
This table holds short-lived data for devices to be registered by a
registration server. The INSERTs are triggered by external devices. The
master data tables are perfectly normalized too. What you are seeing are
not the real column names; I changed them for readability. The attr*
columns have really different names and meanings. A "twin" (in reality, an
initiator/member of the same conferencing group) is defined by these
attributes. Due to the high flexibility of this system (server-side
configuration / device-side configuration at runtime) there is no other
way to normalize.

> 2. barring that, the above query will run fastest by creating
> multi-column indexes on regtwin (attr*) fields. and reg(attr*).  the
> real solution to problems like this is often proper idnexing,
> especially multi column.  saying indexes take to long to build is like
> saying: 'i have a problem, so i am going to replace it with a much
> worse problem'.
I will index it. Just prepared the test and will run it tomorrow.
> 3. try where exists/not exists instead of where in/not in
I did try it, before I switched to NOT IN. It was 10 times slower.
>
> merlin

Re: DB Performance decreases due to often written/accessed table

From:
"Merlin Moncure"
Date:
On 10/19/06, Jens Schipkowski <jens.schipkowski@apus.co.at> wrote:
> On Thu, 19 Oct 2006 19:32:22 +0200, Merlin Moncure wrote:
> > 1. your database design is the real culprit here.  If you want things
> > to run really quickly, solve the problem there by normalizing your
> > schema. denomalization is the root cause of many, many, problems
> > posted here on this list.
> Believe it is normalized. We also seperated configuration and runtime
> data. And this is a runtime table.
> This table holds short living data for devices to be registered by a
> registration server. The INSERTs are triggered by external devices. The
> master data tables are perfectly normalized too. What you are seeing is
> not the real column names. I changed it due to readability. attr* have
> really different names and meanings. A "twin" (in real, initiator/member
> of the same conferencing group) is defined by these attributes. Due to
> high flexibility of this system (serverside configuration/ deviceside
> configuration for runtime) there is no other way to normalize.

ok, fair enough =)  still, it feels odd that you are relating two
tables on all 6 attributes. istm there is something more elegant
possible, hard to say.

> > 2. barring that, the above query will run fastest by creating
> > multi-column indexes on regtwin (attr*) fields. and reg(attr*).  the
> > real solution to problems like this is often proper idnexing,
> > especially multi column.  saying indexes take to long to build is like
> > saying: 'i have a problem, so i am going to replace it with a much
> > worse problem'.
> I will index it. Just prepared the test and will run it tomorrow.
> > 3. try where exists/not exists instead of where in/not in
> Did try it, before I switched to NOT IN. It was 10 times slower.

double check that when properly indexed.

merlin