Re: performance of insert/delete/update

Поиск
Список
Период
Сортировка
От Rich Scott
Тема Re: performance of insert/delete/update
Дата
Msg-id 20021126004339.46BB12FD1F@server3.fastmail.fm
обсуждение исходный текст
Ответ на performance of insert/delete/update  (Wei Weng <wweng@kencast.com>)
Ответы Re: performance of insert/delete/update
Список pgsql-performance
I have seen a number of real-world situations where bundling inserts into
transactions made a considerable difference - sometimes as much as a 100x
speed-up, and not just in Postgresql databases, but also commercial
systems
(my experience is in Oracle & Sybase). I've often used an idiom of
building
up rows until I hit some high-water mark, and then insert those rows in
one
fell swoop - it's almost always measurably faster than one-at-a-time.

Sidebar: a number of years ago, while contracting at a regional telephone
company,
I became peripherally enmired in a gigantic billing-system-makeover
fiasco.
Upon initial deployment, the system was so slow at processing that it was
taking about 30 hours for each day of billing data. After a week or so,
when it became apparent that fundamental Cash Flow was threatened, there
were
multi-hour conference calls, in which various VPs called for massive h/w
upgrades and/or lawsuits against Oracle. An astute cohort of mine asked
to
see some of the code, and found out that the original developers (at the
telco)
had created a bloated and slow control system in C++, using semaphores or
somesuch,
to *serialize* inserts/updates/deletes, and so they had gigantic
home-built
queues of insert jobs. Not only were they not bundling updates in
transactions,
they were only ever doing one transaction at a time. (Apparently, they
never
learned RDBMS fundamentals.) He told them to rip out all that code, and
let
Oracle (like any other decent RDBMS) handle the update ordering. The
resultant
speed-up factor was several hundred times.

-R


On Mon, 25 Nov 2002 15:59:16 -0700 (MST), "scott.marlowe"
<scott.marlowe@ihs.com> said:
> On Mon, 25 Nov 2002, Josh Berkus wrote:
>
> > Scott,
> >
> > > It's quite easy to test if you have a database with a large table to play
> > > with, use pg_dump to dump a table with the -d switch (makes the dump use
> > > insert statements.)  Then, make two versions of the dump, one which has a
> > > begin;end; pair around all the inserts and one that doesn't, then use psql
> > > -e to restore both dumps.  The difference is HUGE.  Around 10 to 20 times
> > > faster with the begin end pairs.
> > >
> > > I'd think that anyone who's used postgresql for more than a few months
> > > could corroborate my experience.
> >
> > Ouch!
> >
> > No need to get testy about it.
> >
> > Your test works as you said; the way I tried testing it before was different.
> > Good to know.   However, this approach is only useful if you are doing
> > rapidfire updates or inserts coming off a single connection.  But then it is
> > *very* useful.
>
> I didn't mean that in a testy way, it's just that after you've sat
> through
> a fifteen minute wait while a 1000 records are inserted, you pretty
> quickly switch to the method of inserting  them all in one big
> transaction.  That's all.
>
> Note that the opposite is what really gets people in trouble.  I've seen
> folks inserting rather large amounts of data, say into ten or 15 tables,
> and their web servers were crawling under parallel load.  Then, they put
> them into a single transaction and they just flew.
>
> The funny thing it, they've often avoided transactions because they
> figured they'd be slower than just inserting the rows, and you kinda have
> to make them sit down first before you show them the performance increase
> from putting all those inserts into a single transaction.
>
> No offense meant, really.  It's just that you seemed to really doubt that
> putting things into one transaction helped, and putting things into one
> big transaction if like the very first postgresql lesson a lot of
> newcomers learn. :-)
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

В списке pgsql-performance по дате отправления:

Предыдущее
От: "scott.marlowe"
Дата:
Сообщение: Re: performance of insert/delete/update
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: performance of insert/delete/update