Re: Scalability in postgres

Поиск
Список
Период
Сортировка
От simon@2ndquadrant.com
Тема Re: Scalability in postgres
Дата
Msg-id 1721003159.17251.1250413828132.JavaMail.open-xchange@oxltgw16.schlund.de
обсуждение исходный текст
Ответ на Re: Scalability in postgres  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance

On 14 August 2009 at 03:18 Jeff Janes <jeff.janes@gmail.com> wrote:

> This my rough understanding.  Oracle never
> "takes" a snapshot, it computes one the fly, if and when it is needed.  It maintains a
> structure of recently committed transactions, with the XID for when they committed.  If a
> process runs into a tuple that is neither from the future nor from the deep past, it
> consults this structure to see if that transaction has committed, and if so whether it did so before or
> after the current query was started.  The structure is partionable so
> it does not have one global lock to serialize on, and the lock is short as it only gets
> the info it needs, not the entire set of global open transactions.

If this is the way Oracle does it then the data structure you describe would need to be updated on both transaction start and transaction commit, as well as being locked while it was read. Transaction commits would need to be serialized so that the commit order was maintained. 

The Oracle structure would be read much less often, yet updated twice as often at snapshot point and at commit. It could be partitionable, but that would increase the conflict associated with reads of the data structure.

Oracle's structure works well for an "ideal workload" such as TPC-C where the data is physically separated and so the reads on this structure are almost nil. It would work very badly on data that continuously conflicted, which may account for the fact that no Oracle benchmark has been published on TPC-E. This bears out the experience of many Oracle DBAs, including myself. I certainly wouldn't assume Oracle have solved every problem.

The Postgres procarray structure is read often, yet only exclusively locked during commit. As Tom said, we optimize away the lock at xid assignment and also optimize away many xid assignments altogether. We don't have any evidence that the size of the procarray reduces the speed of reads, but we do know that the increased queue length you get from having many concurrent sessions increases time to record commit.

We might be able to do something similar to Oracle with Postgres, but it would require significant changes and much complex thought. The reason for doing so would be to reduce the number of reads on the "MVCC structure", making mild partitioning more palatable. The good thing about the current Postgres structure is that it doesn't increase contention when accessing concurrently updated data.

On balance it would appear that Oracle gains a benchmark win by giving up some real world usefulness. That's never been something anybody here has been willing to trade. 

Further thought in this area could prove useful, but it seems a lower priority for development simply because of the code complexity required to make this sort of change.

Best Regards, Simon Riggs

В списке pgsql-performance по дате отправления:

Предыдущее
От: Nickolay
Дата:
Сообщение: Re: Per-database warm standby?
Следующее
От: Jean-Max Reymond
Дата:
Сообщение: Re: Less expensive proprietary or Open source ETL tools