Discussion: Question regarding performance (large objects involved)


Question regarding performance (large objects involved)

From: u15074
Date:
I have a small test program (using libpq) that inserts a lot of data into the
database. Each iteration inserts a small large object (about 5k) into the database
and inserts one row into a table that references the large object's OID.
I repeat this 100,000 times. Each insert runs in its own transaction (begin
-> insert large object, insert row -> commit ...). I also measure the time taken
for every 100 inserts.
The performance is OK and stays constant over the whole run. But I see the
following effect: from time to time a short interruption occurs (my test program
stands still for a moment) and then it goes on.
Does anyone have an idea what might cause these pauses? Is it due to caching
mechanisms of the database?
Another question concerns reading the written data. When I had finished
the test, I used psql to check the written data. To do this, I ran some
queries searching for certain large objects in pg_largeobject (... where loid =
XX). These queries took a very long time (about 5 seconds or more). After calling
vacuum on the database, the queries got fast. Can anyone explain this? Is the
index on pg_largeobject built by calling vacuum?

Thanks, Andreas.
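For reference, the loop described above can be sketched with libpq's standard large-object calls (lo_creat, lo_open, lo_write). This is a minimal illustration, not Andreas' actual program: the connection string, the table name `blobs`, and its schema (`CREATE TABLE blobs (id int, loid oid)`) are all assumptions.

```c
/* Sketch of the described test: one transaction per large-object insert.
 * Assumes a reachable server and a table: CREATE TABLE blobs (id int, loid oid);
 * Compile with: cc test.c -lpq */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>
#include <libpq/libpq-fs.h>   /* INV_WRITE */

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");   /* hypothetical connection string */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    char buf[5000];                  /* ~5k payload, as in the test */
    memset(buf, 'x', sizeof buf);

    for (int i = 0; i < 100000; i++) {
        PQclear(PQexec(conn, "BEGIN"));

        Oid loid = lo_creat(conn, INV_WRITE);    /* create the large object */
        int fd = lo_open(conn, loid, INV_WRITE);
        lo_write(conn, fd, buf, sizeof buf);
        lo_close(conn, fd);

        char sql[128];                           /* reference its OID from a row */
        snprintf(sql, sizeof sql,
                 "INSERT INTO blobs VALUES (%d, %u)", i, loid);
        PQclear(PQexec(conn, sql));

        PQclear(PQexec(conn, "COMMIT"));
    }
    PQfinish(conn);
    return 0;
}
```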



Re: Question regarding performance (large objects involved)

From: Peter Childs
Date:
On Thu, 26 Jun 2003, u15074 wrote:

> I have a small test program (using libpq) that inserts a lot of data into the
> database. Each iteration inserts a small large object (about 5k) into the database
> and inserts one row into a table that references the large object's OID.
> I repeat this 100,000 times. Each insert runs in its own transaction (begin
> -> insert large object, insert row -> commit ...). I also measure the time taken
> for every 100 inserts.
> The performance is OK and stays constant over the whole run. But I see the
> following effect: from time to time a short interruption occurs (my test program
> stands still for a moment) and then it goes on.
> Does anyone have an idea what might cause these pauses? Is it due to caching
> mechanisms of the database?
> Another question concerns reading the written data. When I had finished
> the test, I used psql to check the written data. To do this, I ran some
> queries searching for certain large objects in pg_largeobject (... where loid =
> XX). These queries took a very long time (about 5 seconds or more). After calling
> vacuum on the database, the queries got fast. Can anyone explain this? Is the
> index on pg_largeobject built by calling vacuum?
>

    I've never used large objects, but I guess it's like everything else.
Vacuum changes nothing (except removing rubbish; it's a garbage-collection
routine).
    Analyse, however, looks at the table and gathers the statistics used to
decide which indexes are worth using in which queries. There are times, for
instance when there are only 5 records in a table, when reading the entire
table is quicker than reading the index and then reading the right bit of
the table. If the statistics created by analyse are out of date, a fresh
analyse is required.
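The effect described here can be seen directly with EXPLAIN. A hypothetical psql session (the loid value is made up and the plan text is abbreviated):

```sql
-- With no statistics, the planner may fall back to a sequential scan:
EXPLAIN SELECT * FROM pg_largeobject WHERE loid = 12345;
--   Seq Scan on pg_largeobject ...

-- After gathering statistics, the index on (loid, pageno) becomes attractive:
ANALYZE pg_largeobject;
EXPLAIN SELECT * FROM pg_largeobject WHERE loid = 12345;
--   Index Scan using pg_largeobject_loid_pn_index ...
```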

    When to run what.

Analyse

Needs to be run when the data distribution has changed.
It also needs to be run when the indexes change.
If your table is constantly changing but its distribution does not really
change (e.g. status data or logs), a daily analyse may still be advisable in
case the distribution is shifting very slowly...

Vacuum

Needs to be run after deletes and updates (a delete is actually a delete
and an insert). If you run it with VERBOSE on, look at the vac number: if it
is big, you need to vacuum more often; if it is zero, you can probably get
away with doing it less often. It should be done on a table-by-table basis,
if possible during a quiet period. (If no quiet periods are available, do it
more often, as each run will then be quicker.)
Vacuum does not need to be done on static tables; however, a vacuum will
not harm them and may help after the initial insert.
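As a concrete illustration of the per-table approach (table name made up; the INFO lines are approximate, as the exact VERBOSE output varies by version):

```sql
-- Per-table vacuum, ideally during a quiet period; VERBOSE reports what
-- was reclaimed:
VACUUM VERBOSE mytable;
-- INFO:  Removed 1523 tuples ...
-- INFO:  ... 1523 unused item pointers ...
-- Large removed/unused counts suggest vacuuming this table more often;
-- counts near zero suggest you can vacuum it less often.
```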

Vacuum Full

The file-system equivalent is defrag. Needs to be done if Unused (reported
by vacuum) has grown too big; that also means you need to plain-vacuum more
often. Vacuum Full locks the table while it works, so it will stop any
clients using that table. If you find yourself vacuum-fulling the whole
database, you may be better off dumping the database and rebuilding it.

This is the information I have gathered from reading this and the other
postgres newsgroups over the last few months. I would suggest that
something like this be added to the manual.
    If I'm wrong (which I probably am) I'm sure someone will correct
me.

Peter Childs





Re: Question regarding performance (large objects involved)

From: Mark Kirkwood
Date:
Could be that the database is checkpointing then. Try experimenting with:

checkpoint_segments
checkpoint_timeout

It might also be worth playing with:

wal_buffers

regards

Mark
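For orientation, these settings live in postgresql.conf; the values below are illustrative starting points for an insert-heavy load, not recommendations:

```
# postgresql.conf -- illustrative values, tune for your workload
checkpoint_segments = 16    # WAL segments between checkpoints (default 3);
                            # higher = fewer, more widely spaced checkpoints
checkpoint_timeout = 300    # max seconds between checkpoints
wal_buffers = 64            # WAL pages buffered in shared memory (default 8)
```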

u15074 wrote:

>The performance is OK and stays constant over the whole run. But I see the
>following effect: from time to time a short interruption occurs (my test program
>stands still for a moment) and then it goes on.
>Does anyone have an idea what might cause these pauses? Is it due to caching
>mechanisms of the database?
>
>



Re: Question regarding performance (large objects involved)

From: Bruno Wolff III
Date:
On Thu, Jun 26, 2003 at 08:33:10 +0100,
  Peter Childs <blue.dragon@blueyonder.co.uk> wrote:
> On Thu, 26 Jun 2003, u15074 wrote:
>
> Need to be run after deletes and updates (a delete is actually a delete
> and an insert) if you do it with verbose on. Look at the vac number if it

Just to avoid confusion, the above contains a mistake (presumably unintentional).
It should read "an update is actually a delete and an insert".

Re: Question regarding performance (large objects involved)

From: Tom Lane
Date:
Mark Kirkwood <markir@paradise.net.nz> writes:
> Could be that the database is checkpointing then.

Or the system 'syncer' process woke up and wrote a bunch of pages.
If the interval between pauses doesn't vary when you change
checkpoint_timeout and checkpoint_segments, then I'd blame the syncer
(or possibly some other external activity).

            regards, tom lane