Re: MVCC performance issue

Поиск
Список
Период
Сортировка
От Andy Colson
Тема Re: MVCC performance issue
Дата
Msg-id 4CDD69A3.5000804@squeakycode.net
обсуждение исходный текст
Ответ на MVCC performance issue  ("Kyriacos Kyriacou" <kyriacosk@prime-tel.com>)
Список pgsql-performance
On 11/12/2010 7:47 AM, Kyriacos Kyriacou wrote:
>
> SUGGESTION
> --------------
> 1) When a raw UPDATE is performed, store all "new raw versions" either
> in separate temporary table space
>     or in a reserved space at the end of each table (can be allocated
> dynamically) etc

Your use of "raw" is confusing.  I'll just ignore the word.  New row
versions are already stored in a dynamically allocated spot, right along
with the other versions of the table.  You are assuming that getting to
the "correct" version of the row is very slow?  That's only going to be
the case if you have lots and lots of versions.  And your solution will
not actually help if there are lots of versions.  While one person who
is hitting the most recent version might be ok, everyone else will still
have to search for theirs.  Just as they do now.

> 2) Any SELECT queries within the same session will be again accessing
> the new version of the row

I don't see how this is different from what we currently have.  "same
session" could have been dropped from your separate table space, and
then you'd have to go search through previous versions of the row...
exactly like you do now.

And worse, if you dont want to drop your version of the row from the
separate table space until you commit/rollback, then no other user can
start a transaction on that table until your done!  oh no!  You have
reads and writes blocking each other.

> 3) Any SELECT queries from other users will still be accessing the old
> version

Again.. the same.

> 4) When UPDATE transaction is ROLLBACK just release the space used in
> new temporary location

current layout makes rollback very very fast.

> 5) When UPDATE transaction is COMMIT then try to LOCK the old version
> and overwrite it at the same physical location (NO FRAGMENTATION).

Not sure what you mean by lock, but lock requires single user access and
slow's things down.  Right now we just bump the "most active transaction
number", which is very efficient, and requires no locks.  As soon as you
lock anything, somebody, by definition, has to wait.


> 6) Similar mechanism can be applied on INSERTS and DELETES
> 7) In case that transaction was COMMIT, the temporary location can be
> either released or archived/cleaned on a pre-scheduled basis. This will
> possibly allow the introduction of a TRANSACTION LOG backup mechanism as
> a next step.

You are kind of assuming there will only ever be one new transaction,
and one old transaction.  What about a case where 10 people start a
transaction, and there are 10 versions of the row?


It seems to me like you are using very long transactions, which is
causing lots of row versions to show up.  Have you run explain analyze
on your slow querys to find out the problems?

Have you checked to see if you are cpu bound or io bound?  If you are
dealing with lots of row versions, I'd assume you are cpu bound.  If you
check your system though, and see you are io bound, I think that might
invalidate your assumptions above.

MVCC makes multi user access very nice because readers and writers dont
block each other, and there are very few locks.  It does come with some
kinks (gotta vacuum, keep transactions short, you must commit, etc).

select count(*) for example is always going to be slow... just expect
it, lets not destroy what works well about the database just to make it
fast.  Instead, find a better alternative so you dont have to run it.

Just like any database, you have to work within MVCC's good points and
try to avoid the bad spots.

-Andy

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Kyriacos Kyriacou"
Дата:
Сообщение: Re: MVCC performance issue
Следующее
От: Kenneth Marshall
Дата:
Сообщение: Re: MVCC performance issue