Re: MVCC performance issue

Поиск
Список
Период
Сортировка
От Kyriacos Kyriacou
Тема Re: MVCC performance issue
Дата
Msg-id 8BCBF9DB739F034B87FE7C7D30EAE55C2AB8F887@hqex2k.francoudi.com
обсуждение исходный текст
Ответ на MVCC performance issue  ("Kyriacos Kyriacou" <kyriacosk@prime-tel.com>)
Ответы Re: MVCC performance issue  (Scott Carey <scott@richrelevance.com>)
Список pgsql-performance
My suggestion had just a single difference from what currently MVCC is
doing (btw I never said that MVCC is bad).

NOW ===> on COMMIT previous version record is expired and the
         new version record (created in new dynamically allocated
         spot, as you said) is set as "active"

MY  ===> on COMMIT, to update new version data over the same physical
location that initial version was
         and release the space used to keep the new version (that was
dynamically allocated).

The rest are all the same! I do not think that this is breaking anything
and I still believe that this might help.

I will try to plan upgrade the soonest possible to the newest version.
Reading few words about HOT updates
it seems that more or less is similar to what I have described and will
be very helpful.

Kyriacos

> -----Original Message-----
> From: Andy Colson [mailto:andy@squeakycode.net]
> Sent: Friday, November 12, 2010 6:22 PM
> To: Kyriacos Kyriacou
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] MVCC performance issue
>
> On 11/12/2010 7:47 AM, Kyriacos Kyriacou wrote:
> >
> > SUGGESTION
> > --------------
> > 1) When a raw UPDATE is performed, store all "new raw versions"
either
> > in separate temporary table space
> >     or in a reserved space at the end of each table (can be
allocated
> > dynamically) etc
>
> Your use of "raw" is confusing.  I'll just ignore the word.  New row
> versions are already stored in a dynamically allocated spot, right
along
> with the other versions of the table.  You are assuming that getting
to
> the "correct" version of the row is very slow?  That's only going to
be
> the case if you have lots and lots of versions.  And your solution
will
> not actually help if there are lots of versions.  While one person who
> is hitting the most recent version might be ok, everyone else will
still
> have to search for theirs.  Just as they do now.
>
> > 2) Any SELECT queries within the same session will be again
accessing
> > the new version of the row
>
> I don't see how this is different from what we currently have.  "same
> session" could have been dropped from your separate table space, and
> then you'd have to go search through previous versions of the row...
> exactly like you do now.
>
> And worse, if you dont want to drop your version of the row from the
> separate table space until you commit/rollback, then no other user can
> start a transaction on that table until your done!  oh no!  You have
> reads and writes blocking each other.
>
> > 3) Any SELECT queries from other users will still be accessing the
old
> > version
>
> Again.. the same.
>
> > 4) When UPDATE transaction is ROLLBACK just release the space used
in
> > new temporary location
>
> current layout makes rollback very very fast.
>
> > 5) When UPDATE transaction is COMMIT then try to LOCK the old
version
> > and overwrite it at the same physical location (NO FRAGMENTATION).
>
> Not sure what you mean by lock, but lock requires single user access
and
> slow's things down.  Right now we just bump the "most active
transaction
> number", which is very efficient, and requires no locks.  As soon as
you
> lock anything, somebody, by definition, has to wait.
>
>
> > 6) Similar mechanism can be applied on INSERTS and DELETES
> > 7) In case that transaction was COMMIT, the temporary location can
be
> > either released or archived/cleaned on a pre-scheduled basis. This
will
> > possibly allow the introduction of a TRANSACTION LOG backup
mechanism as
> > a next step.
>
> You are kind of assuming there will only ever be one new transaction,
> and one old transaction.  What about a case where 10 people start a
> transaction, and there are 10 versions of the row?
>
>
> It seems to me like you are using very long transactions, which is
> causing lots of row versions to show up.  Have you run explain analyze
> on your slow querys to find out the problems?
>
> Have you checked to see if you are cpu bound or io bound?  If you are
> dealing with lots of row versions, I'd assume you are cpu bound.  If
you
> check your system though, and see you are io bound, I think that might
> invalidate your assumptions above.
>
> MVCC makes multi user access very nice because readers and writers
dont
> block each other, and there are very few locks.  It does come with
some
> kinks (gotta vacuum, keep transactions short, you must commit, etc).
>
> select count(*) for example is always going to be slow... just expect
> it, lets not destroy what works well about the database just to make
it
> fast.  Instead, find a better alternative so you dont have to run it.
>
> Just like any database, you have to work within MVCC's good points and
> try to avoid the bad spots.
>
> -Andy



В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: anti-join chosen even when slower than old plan
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: MVCC performance issue