Обсуждение: Insertions slower than Updates?

Поиск
Список
Период
Сортировка

Insertions slower than Updates?

От
Ofer Israeli
Дата:

Hi all,

 

In performance testing we’re doing we are currently running two scenarios:

  1. Starting from an empty db, therefore all operations are INSERTs.
  2. Starting from an existing db – thereby UPDATing all records.

 

I should also mention that the tables we are dealing with are heavily indexed.

 

I would expect that the first scenario would give much better results than the second one as:

  1. INSERT should be cheaper than UPDATE due to only dealing with one record instead of two.
  2. We also have SELECT queries before the operation and in the first configuration, the SELECTs will be dealing with much less data for most of the run.

 

To our surprise, we see that the second scenario gives better results with an average processing time of an event at around %70 of the time run in the first scenario.

 

Anyone have any ideas on why the empty db is giving worse results??

 

Many Thanks,

Ofer

Re: Insertions slower than Updates?

От
"Kevin Grittner"
Дата:
Ofer Israeli <oferi@checkpoint.com> wrote:

> INSERT should be cheaper than UPDATE due to only dealing with one
> record instead of two.

... unless the UPDATE is a HOT update, in which case the indexes
don't need to be touched.

> Anyone have any ideas on why the empty db is giving worse
> results??

Besides the HOT updates being fast, there is the issue of having
space already allocated and ready for the database to use, rather
than needing to make calls to the OS to create and extend files as
space is needed.

-Kevin

Re: Insertions slower than Updates?

От
Ofer Israeli
Дата:
Kevin Grittner wrote:
> Ofer Israeli <oferi@checkpoint.com> wrote:
>> Anyone have any ideas on why the empty db is giving worse results??
>
> Besides the HOT updates being fast, there is the issue of having
> space already allocated and ready for the database to use, rather
> than needing to make calls to the OS to create and extend files as
> space is needed.
>

I thought about this direction as well, but on UPDATES, some of them will need to ask the OS for more space anyhow at
leastat the beginning of the run, additional pages will be needed.  Do you expect that the OS level allocations are so
expensiveas to show an ~%40 increase of processing time in average? 


Thanks,
Ofer

Re: Insertions slower than Updates?

От
"Kevin Grittner"
Дата:
Ofer Israeli <oferi@checkpoint.com> wrote:
> Kevin Grittner wrote:
>> Ofer Israeli <oferi@checkpoint.com> wrote:
>>> Anyone have any ideas on why the empty db is giving worse
>>> results??
>>
>> Besides the HOT updates being fast, there is the issue of having
>> space already allocated and ready for the database to use, rather
>> than needing to make calls to the OS to create and extend files
>> as space is needed.
>
> I thought about this direction as well, but on UPDATES, some of
> them will need to ask the OS for more space anyhow at least at the
> beginning of the run, additional pages will be needed.  Do you
> expect that the OS level allocations are so expensive as to show
> an ~%40 increase of processing time in average?

Gut feel, 40% does seem high for just that; but HOT updates could
easily account for that, especially since you said that the tables
are "heavily indexed".  That is, as long as there are enough updates
which don't modify indexed columns.

-Kevin

Re: Insertions slower than Updates?

От
Steve Horn
Дата:
If the updates don't hit indexed columns (so the indexes don't need to be rebuilt), then the update would be very fast.

Inserts would always affect the index causing it to constantly need modifying.

If you're doing a lot of INSERTs in a batch operation, you may want to consider dropping the indexes and recreating at the end.

On Mon, Feb 20, 2012 at 2:29 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Ofer Israeli <oferi@checkpoint.com> wrote:
> Kevin Grittner wrote:
>> Ofer Israeli <oferi@checkpoint.com> wrote:
>>> Anyone have any ideas on why the empty db is giving worse
>>> results??
>>
>> Besides the HOT updates being fast, there is the issue of having
>> space already allocated and ready for the database to use, rather
>> than needing to make calls to the OS to create and extend files
>> as space is needed.
>
> I thought about this direction as well, but on UPDATES, some of
> them will need to ask the OS for more space anyhow at least at the
> beginning of the run, additional pages will be needed.  Do you
> expect that the OS level allocations are so expensive as to show
> an ~%40 increase of processing time in average?

Gut feel, 40% does seem high for just that; but HOT updates could
easily account for that, especially since you said that the tables
are "heavily indexed".  That is, as long as there are enough updates
which don't modify indexed columns.

-Kevin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Steve Horn
http://www.stevehorn.cc
steve@stevehorn.cc
http://twitter.com/stevehorn
740-503-2300

Re: Insertions slower than Updates?

От
Ofer Israeli
Дата:
Kevin Grittner wrote:
> Ofer Israeli <oferi@checkpoint.com> wrote:
>> Kevin Grittner wrote:
>>> Ofer Israeli <oferi@checkpoint.com> wrote:
>>>> Anyone have any ideas on why the empty db is giving worse results??
>>>
>>> Besides the HOT updates being fast, there is the issue of having
>>> space already allocated and ready for the database to use, rather
>>> than needing to make calls to the OS to create and extend files
>>> as space is needed.
>>
>> I thought about this direction as well, but on UPDATES, some of them
>> will need to ask the OS for more space anyhow at least at the
>> beginning of the run, additional pages will be needed.  Do you expect
>> that the OS level allocations are so expensive as to show an ~%40
>> increase of processing time in average?
>
> Gut feel, 40% does seem high for just that; but HOT updates could
> easily account for that, especially since you said that the tables
> are "heavily indexed".  That is, as long as there are enough updates
> which don't modify indexed columns.

Most, if not all of our UPDATEs, involve updating an indexed column, so HOT updates are actually not performed at all
:(

Re: Insertions slower than Updates?

От
Ofer Israeli
Дата:

Ofer Israeli <oferi@checkpoint.com> wrote:

>Hi all,

 

> In performance testing we’re doing we are currently running two scenarios:

>  1. Starting from an empty db, therefore all operations are INSERTs.  
>  2. Starting from an existing db – thereby UPDATing all records.

  

> I should also mention that the tables we are dealing with are heavily indexed.

 

> I would expect that the first scenario would give much better results than the second one as:

> 1. INSERT should be cheaper than UPDATE due to only dealing with one record instead of two.  
> 2.  We also have SELECT queries before the operation and in the first configuration, the SELECTs will be dealing with much less data for most of the run.

 

> To our surprise, we see that the second scenario gives better results with an average processing time of an event at around %70 of the time run in the first scenario.

 

> Anyone have any ideas on why the empty db is giving worse results??

 

A little googleing led me to this thought, will be happy to hear you're input on this.  If the database is initially empty, the analyzer will probably decide to query the tables by full table scan as opposed to index searching.  Now supposedly, during our test run, the analyzer does not run frequently enough and so at some point we are using the wrong method for SELECTs.

 

The version we are using is 8.3.7.

 

My questions are:

1. Does the above seem reasonable to you?

2. How often does the analyzer run?  To my understanding it will always run with autovacuum, is that right?  Is it triggered at other times as well?

3. Does the autoanalyzer work only partially on the db like autovacuum going to sleep after a certain amount of work was done or does it work until finished?  If it is partial work, maybe it does not reach our relevant tables.  What dictates the order in which it will work?

 

 

Many thanks,

Ofer