Обсуждение: insert

Поиск
Список
Период
Сортировка

insert

От
Ulrich Wisser
Дата:
Hi,

is there anything I can doo to speed up inserts? One of my tables gets
about 100 new rows every five minutes. And somehow the inserts tend to
take more and more time.

Any suggestions welcome.

TIA

Ulrich


Re: insert

От
G u i d o B a r o s i o
Дата:
Tips!
*Delete indexes and recreate them after the insert.
*Disable auto-commit
*Perform a copy will be faster, sure.

Best wishes,
Guido

> Hi,
>
> is there anything I can doo to speed up inserts? One of my tables gets
> about 100 new rows every five minutes. And somehow the inserts tend to
> take more and more time.
>
> Any suggestions welcome.
>
> TIA
>
> Ulrich
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match


Re: insert

От
"gnari"
Дата:
"G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:

[speeding up 100 inserts every 5 minutes]

> Tips!
> *Delete indexes and recreate them after the insert.

sounds a bit extreme, for only 100 inserts

gnari





Re: insert

От
G u i d o B a r o s i o
Дата:
As I see it's 100 inserts every 5 minutes, not only 100 inserts.

Sure it's extreme for only 100 inserts.

Cheers,
Guido

> "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:
>
> [speeding up 100 inserts every 5 minutes]
>
> > Tips!
> > *Delete indexes and recreate them after the insert.
>
> sounds a bit extreme, for only 100 inserts
>
> gnari
>
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match


Re: insert

От
"gnari"
Дата:
From: "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar>:


> As I see it's 100 inserts every 5 minutes, not only 100 inserts.
>
> Sure it's extreme for only 100 inserts.

I am sorry, I do not quite grasp what you are saying.
my understanding was that there are constantly new inserts,
coming in bursts of 100 , every 5 minutes.
I imagined that the indexes were needed in between.

if this is the case, the bunches of 100 inserts should
be done inside a transaction (or by 1 COPY statement)

if, on the other hand, the inserts happen independently,
at a rate of 100 inserts / 5 minutes, then this will not help

gnari



>
> Cheers,
> Guido
>
> > "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:
> >
> > [speeding up 100 inserts every 5 minutes]
> >
> > > Tips!
> > > *Delete indexes and recreate them after the insert.
> >
> > sounds a bit extreme, for only 100 inserts
> >
> > gnari
> >
> >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: the planner will ignore your desire to choose an index scan if
your
> >       joining column's datatypes do not match
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>


Re: insert

От
Ulrich Wisser
Дата:
Hi,

my inserts are done in one transaction, but due to some foreign key
constraints and five indexes sometimes the 100 inserts will take more
than 5 minutes.

/Ulrich


Re: insert

От
Paul Thomas
Дата:
On 13/08/2004 13:10 Ulrich Wisser wrote:
> Hi,
>
> my inserts are done in one transaction, but due to some foreign key
> constraints and five indexes sometimes the 100 inserts will take more
> than 5 minutes.

Two possibilities come to mind:

a) you need an index on the referenced FK field
b) you have an index but a type mis-match (e.g, an int4 field referencing
an int8 field)

Either of these will cause a sequential table scan and poor performance.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for
Business             |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: insert

От
Rod Taylor
Дата:
On Fri, 2004-08-13 at 08:10, Ulrich Wisser wrote:
> Hi,
>
> my inserts are done in one transaction, but due to some foreign key
> constraints and five indexes sometimes the 100 inserts will take more
> than 5 minutes.

It is likely that you are missing an index on one of those foreign key'd
items.

Do an EXPLAIN ANALYZE SELECT * FROM foreign_table WHERE foreign_col =
'<insert value>';

Fix them until they're quick.



Re: insert

От
Bruno Wolff III
Дата:
On Fri, Aug 13, 2004 at 08:57:56 -0400,
  Rod Taylor <pg@rbt.ca> wrote:
> On Fri, 2004-08-13 at 08:10, Ulrich Wisser wrote:
> > Hi,
> >
> > my inserts are done in one transaction, but due to some foreign key
> > constraints and five indexes sometimes the 100 inserts will take more
> > than 5 minutes.
>
> It is likely that you are missing an index on one of those foreign key'd
> items.

I don't think that is too likely as a foreign key reference must be a
unique key which would have an index. I think the type mismatch
suggestion is probably what the problem is.
The current solution is to make the types match. In 8.0.0 it would probably
work efficiently as is, though it isn't normal for foreign keys to have a type
mismatch and he may want to change that anyway.

Re: insert

От
Bruno Wolff III
Дата:
On Fri, Aug 13, 2004 at 17:17:10 +0100,
  Matt Clark <matt@ymogen.net> wrote:
> > > It is likely that you are missing an index on one of those foreign
> > > key'd items.
> >
> > I don't think that is too likely as a foreign key reference
> > must be a unique key which would have an index.
>
> I think you must be thinking of primary keys, not foreign keys.  All
> one-to-many relationships have non-unique foreign keys.

The target of the reference needs to have at least a unique index.
I am not sure if it needs to actually be declared as either a unique
or primary key, though that is the intention.

The records doing the referencing don't need (and normally aren't)
unique.

Re: insert

От
"Matt Clark"
Дата:
> > It is likely that you are missing an index on one of those foreign
> > key'd items.
>
> I don't think that is too likely as a foreign key reference
> must be a unique key which would have an index.

I think you must be thinking of primary keys, not foreign keys.  All
one-to-many relationships have non-unique foreign keys.


Re: insert

От
Tom Lane
Дата:
Bruno Wolff III <bruno@wolff.to> writes:
>   Rod Taylor <pg@rbt.ca> wrote:
>> It is likely that you are missing an index on one of those foreign key'd
>> items.

> I don't think that is too likely as a foreign key reference must be a
> unique key which would have an index. I think the type mismatch
> suggestion is probably what the problem is.

I agree.  It is possible to have a lack-of-index problem on the
referencing column (as opposed to the referenced column), but that
normally only hurts you for deletes from the referenced table.

> The current solution is to make the types match. In 8.0.0 it would probably
> work efficiently as is, though it isn't normal for foreign keys to have a type
> mismatch and he may want to change that anyway.

8.0 will not fix this particular issue, as I did not add any numeric-vs-int
comparison operators.  If we see a lot of complaints we could think
about adding such, but for 8.0 only the more common cases such as
int-vs-bigint are covered.

            regards, tom lane

Re: insert

От
Gaetano Mendola
Дата:
gnari wrote:

> "G u i d o B a r o s i o" <gbarosio@uolsinectis.com.ar> wrote:
>
> [speeding up 100 inserts every 5 minutes]
>
>
>>Tips!
>>*Delete indexes and recreate them after the insert.
>
>
> sounds a bit extreme, for only 100 inserts

which fsync method are you using ?
change it and see what happen

Regards
Gaetano Mendola



Re: insert

От
alan
Дата:
I think I figured it out myself.
If anyone sees issues with this (simple) approach, please let me know.

I changed my table definitions to this:

CREATE SEQUENCE public.product_id_seq
CREATE TABLE products (
    product_id INTEGER DEFAULT nextval('product_id_seq'::regclass) NOT
NULL,
    name VARCHAR(60) NOT NULL,
    category SMALLINT  NOT NULL,
    CONSTRAINT product_id PRIMARY KEY (product_id)
);
CREATE SEQUENCE public.category_id_seq
CREATE TABLE category (
   category_id INTEGER DEFAULT nextval('category_id_seq'::regclass)
NOT NULL,
   name VARCHAR(20) NOT NULL,
   CONSTRAINT category_id PRIMARY KEY (category_id)
);
ALTER TABLE products ADD CONSTRAINT category_products_fk
    FOREIGN KEY (category)
    REFERENCES category (category_id)
    ON DELETE NO ACTION ON UPDATE CASCADE
;

Then created this function:

CREATE OR REPLACE FUNCTION getid(_table text,_pk text,_name text)
RETURNS integer AS $$
DECLARE _id integer;
BEGIN
  EXECUTE 'SELECT '
    || _pk
    || ' FROM '
    || _table::regclass
    || ' WHERE name'
    || ' = '
    || quote_literal(_name)
   INTO _id;

  IF _id > 0 THEN
    return _id;
  ELSE
    EXECUTE 'INSERT INTO '
     || _table
     || ' VALUES (DEFAULT,' || quote_literal(_name) || ')'
     || ' RETURNING ' || _pk
    INTO _id;
    return _id;
  END IF;
END;
$$
 LANGUAGE 'plpgsql' VOLATILE;

Now I can just insert into the products table via:

INSERT INTO products VALUES(DEFAULT,'Postgresql for
Dummies',getid('category','category_id','books'));

For example:

testdb=# select * from products;
 product_id | name | category
------------+------+----------
(0 rows)

iims_test=# select * from category;
 category_id | name
-------------+------
(0 rows)

testdb=# insert into products values(DEFAULT,'Postgresql for
Dummies',getid('category','category_id','books'));
INSERT 0 1

testdb=# select * from
category;
 category_id | name
-------------+-------
           1 | books

testdb=# select * from products;
 product_id |          name          | category
------------+------------------------+----------
          1 | Postgresql for Dummies |        1

Updating the category_id in category table are also cascaded to the
product table.

testdb=# UPDATE category SET category_id = 2 WHERE category_id = 1;
UPDATE 1

testdb=# SELECT * FROM products;
 product_id |          name          | category
------------+------------------------+----------
          1 | Postgresql for Dummies |        2


Alan

Re: insert

От
Vitalii Tymchyshyn
Дата:
Hello.

Please note that in multitasking environment you may have problems with
your code. Two connections may check if "a" is available and if not (and
both got empty "select" result), try to insert. One will succeed,
another will fail if you have a unique constraint on category name (and
you'd better have one).

Please note that select for update won't help you much, since this is
new record you are looking for, and select don't return (and lock) it. I
am using "lock table <tableName> in SHARE ROW EXCLUSIVE mode" in this case.

But then, if you have multiple lookup dictinaries, you need to ensure
strict order of locking or you will be getting deadlocks. As for me, I
did create a special application-side class to retrieve such values. If
I can't find a value in main connection with simple select, I open new
connection, perform table lock, check if value is in there. If it is
not, add the value and commit. This may produce orphaned dictionary
entries (if dictionary entry is committed an main transaction is rolled
back), but this is usually OK for dictionaries. At the same time I don't
introduce hard locks into main transaction and don't have to worry about
deadlocks.

Best regards, Vitalii Tymchyshyn


Re: insert

От
"Kevin Grittner"
Дата:
Vitalii Tymchyshyn <tivv00@gmail.com> wrote:

> Please note that in multitasking environment you may have problems
> with your code. Two connections may check if "a" is available and
> if not (and both got empty "select" result), try to insert. One
> will succeed, another will fail if you have a unique constraint on
> category name (and you'd better have one).
>
> Please note that select for update won't help you much, since this
> is new record you are looking for, and select don't return (and
> lock) it. I am using "lock table <tableName> in SHARE ROW
> EXCLUSIVE mode" in this case.
>
> But then, if you have multiple lookup dictinaries, you need to
> ensure strict order of locking or you will be getting deadlocks.
> As for me, I did create a special application-side class to
> retrieve such values. If I can't find a value in main connection
> with simple select, I open new connection, perform table lock,
> check if value is in there. If it is not, add the value and
> commit. This may produce orphaned dictionary entries (if
> dictionary entry is committed an main transaction is rolled back),
> but this is usually OK for dictionaries. At the same time I don't
> introduce hard locks into main transaction and don't have to worry
> about deadlocks.

It sounds like you might want to check out the new "truly
serializable" transactions in version 9.1.  If you can download the
latest beta version of it and test with
default_transaction_isolation = 'serializable' I would be interested
to hear your results.  Note that you can't have deadlocks, but you
can have other types of serialization failures, so your software
needs to be prepared to start a transaction over from the beginning
when the SQLSTATE of a failure is '40001'.

The Wiki page which was used to document and organize the work is:

http://wiki.postgresql.org/wiki/Serializable

This is in a little bit of a funny state because not all of the
wording that was appropriate while the feature was under development
(e.g., future tense verbs) has been changed to something more
appropriate for a finished feature, but it should cover the
theoretical ground pretty well.  An overlapping document which was
initially based on parts of the Wiki page and has received more
recent attention is the README-SSI file here:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=master

Some examples geared toward programmers and DBAs is at this Wiki
page:

http://wiki.postgresql.org/wiki/SSI

It could use a couple more examples and a bit of language cleanup,
but what is there is fairly sound.  The largest omission is that we
need to show more explicitly that serialization failures can occur
at times other than COMMIT.  (I got a little carried away trying to
show that there was no blocking and that the "first committer
wins".)

-Kevin