Re: MVCC performance issue

Поиск
Список
Период
Сортировка
От Kenneth Marshall
Тема Re: MVCC performance issue
Дата
Msg-id 20101112135235.GI14016@aart.is.rice.edu
обсуждение исходный текст
Ответ на MVCC performance issue  ("Kyriacos Kyriacou" <kyriacosk@prime-tel.com>)
Ответы Re: MVCC performance issue
Список pgsql-performance
On Fri, Nov 12, 2010 at 03:47:30PM +0200, Kyriacos Kyriacou wrote:
> This is my first post in this mailing list and I would like to raise an
> issue that in my opinion is causing performance issues of PostgreSQL
> especially in a transaction processing environment. In my company we are
> using PostgreSQL for the last 8 year for our in-house developed billing
> system (telecom). The last few months we started considering moving to
> another RDBMS just because of this issue.
>
> After all these years, I believe that the biggest improvement that could
> be done and will boost overall performance especially for enterprise
> application will be to improve Multiversion Concurrency Control (MVCC)
> mechanism. In theory this seems to be improving performance for SELECT
> queries but on tables with very intensive and frequent updates, even
> that is not fully true because of the fragmentation of data caused by
> MVCC. I saw cases were a SELECT COUNT(*) on an empty (!!!) table (used
> as a buffer) took more than 40min to return a result! VACUUM is not a
> solution in my opinion even though after the introduction of autovacuum
> daemon situation got much better.
>
> PROBLEM DECRIPTION
> ------------------
> By definition of MVCC, when an UPDATE is performed, PostgreSQL creates a
> new copy of the row in a new location. Any SELECT queries within the
> same session are accessing the new version of the raw and all other
> queries from other users are still accessing the old version. When
> transaction is COMMIT PostgreSQL makes the a new version of the row as
> the "active" row and expires the old row that remains "dead" and then is
> up to VACUUM procedure to recover the "dead" rows space and make it
> available to the database engine. In case that transaction is ROLLBACK
> then space reserved for the new version of the row is released. The
> result is to have huge fragmentation on table space, unnecessary updates
> in all affected indexes, unnecessary costly I/O operations, poor
> performance on SELECT that retrieves big record sets (i.e. reports etc)
> and slower updates. As an example, consider updating the "live" balance
> of a customer for each phone call where the entire customer record has
> to be duplicated again and again upon each call just for modifying a
> numeric value!
>
> SUGGESTION
> --------------
> 1) When a raw UPDATE is performed, store all "new raw versions" either
> in separate temporary table space
>    or in a reserved space at the end of each table (can be allocated
> dynamically) etc
> 2) Any SELECT queries within the same session will be again accessing
> the new version of the row
> 3) Any SELECT queries from other users will still be accessing the old
> version
> 4) When UPDATE transaction is ROLLBACK just release the space used in
> new temporary location
> 5) When UPDATE transaction is COMMIT then try to LOCK the old version
> and overwrite it at the same physical location (NO FRAGMENTATION).
> 6) Similar mechanism can be applied on INSERTS and DELETES
> 7) In case that transaction was COMMIT, the temporary location can be
> either released or archived/cleaned on a pre-scheduled basis. This will
> possibly allow the introduction of a TRANSACTION LOG backup mechanism as
> a next step.
> 8) After that VACUUM will have to deal only with deletions!!!
>
>
> I understand that my suggestion seems to be too simplified and also that
> there are many implementation details and difficulties that I am not
> aware.
>
> I strongly believe that the outcome of the discussion regarding this
> issue will be helpful.
>
> Best Regards,
>
> Kyriacos Kyriacou
> Senior Developer/DBA
>

I cannot speak to your suggestion, but it sounds like you are not
vacuuming enough and a lot of the bloat/randomization would be helped
by making use of HOT updates in which the updates are all in the same
page and are reclaimed almost immediately.

Regards,
Ken

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Kyriacos Kyriacou"
Дата:
Сообщение: MVCC performance issue
Следующее
От: Thom Brown
Дата:
Сообщение: Re: MVCC performance issue