Re: The plan for FDW-based sharding

From: Álvaro Hernández Tortosa
Subject: Re: The plan for FDW-based sharding
Date:
Msg-id: 56D18F82.1020206@8kdata.com
In reply to: Re: The plan for FDW-based sharding  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List: pgsql-hackers

On 27/02/16 09:19, Konstantin Knizhnik wrote:
> On 02/27/2016 06:54 AM, Robert Haas wrote:
>
[...]
>
>> So maybe the goal for the GTM isn't to provide true serializability
>> across the cluster but some lesser degree of transaction isolation.
>> But then exactly which serialization anomalies are we trying to
>> prevent, and why is it OK to prevent those and not others?
>
> Absolutely agree. There are some theoretical discussions regarding CAP
> and different distributed levels of isolation.
> But in practice people want to solve their tasks. Most PostgreSQL
> users are using the default isolation level, read committed, although
> there are a lot of "wonderful" anomalies with it.
> Serializable transactions in Oracle actually violate the fundamental
> serializability rule, and still Oracle is one of the most popular
> databases in the world...
> There was an isolation bug in Postgres-XL, which doesn't prevent
> commercial customers from using it...
    I think this might be a dangerous line of thought. While I agree
PostgreSQL should definitely look at the market and answer questions
that (current and prospective) users may ask, and be more practical than
idealistic, easily ditching isolation guarantees might not be a good thing.
    That Oracle is the leader despite its isolation problems, or that
most people run PostgreSQL under read committed, is not a good argument
to cut corners and settle for the bare minimum (if any) isolation
guarantees. First, because PostgreSQL has always been trusted and
understood as a system with *strong* guarantees (whatever that means).
Second, because what we may perceive as OK from the market might change
soon. From my observations, while I agree with you that most people
"don't care" or, worse, "don't realize", this is rapidly changing. More
and more people are becoming aware of the problems of distributed systems
and the significant consequences they may have on them.
    A lot of them have been illustrated in the famous Jepsen posts. A
good example, given that you have mentioned Galera before, is this one:
https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
which demonstrates how Galera fails to provide Snapshot Isolation, even
in a healthy state -- despite their claims that it does.
    As of today, I would expect any distributed system to clearly state
its guarantees in the documentation, and then adhere to them, for
instance by proving it with tests such as Jepsen.

>
> So I do not say that discussing all these theoretical questions is not
> needed, as is formally proven correctness of the distributed algorithm.
    I would like to see work move forward here, so I really appreciate
all your work. I cannot give an opinion on whether the DTM API is good
or not, but I agree with Robert that a good technical discussion on these
issues is a good, and a needed, starting point. Feedback may also help
you avoid pitfalls that might otherwise go unnoticed until tons of code
have been implemented.
    Academic approaches are sometimes "very academic", but studying
them doesn't hurt either :)

    Álvaro


-- 
Álvaro Hernández Tortosa


-----------
8Kdata



