Обсуждение: PostgreSQL Active-Active Clustering

Поиск
Список
Период
Сортировка

PostgreSQL Active-Active Clustering

От
"Sarkar, Subhadeep"
Дата:

Hi,

 

We are evaluating features of the Community edition of PostgreSQL in relation to a proposal for a prospective client and need help with the queries below:-

 

  • Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectives of scalability, load balancing and high availability without using any extensions or external components or usage of Kubernetes/Dockers.

 

  • In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and provide vertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party transaction routing mechanisms using NATIVE features only (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).

 

  • In the Community edition of PostgreSQL is it possible to setup  a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).

 

Kindly direct us to the right contributor/user group. Thanks in advance!

 

Regards,

Subhadeep

 

Subhadeep Sarkar

Program Director

Government & Public Service - Technology.

 

O +91 33 4403 4000

M +91 98301 69398

 

Email: :subhadeepsarkar@kpmg.com

 

KPMG Advisory Services Private Limited

Unit No. 604, 6th Floor, Tower 1, Plot No. 5,

Block DP, Godrej Waterside, Sector V,

Salt Lake,

Kolkata - 700 091

 


________________________________________
KPMG (in India) allows reasonable personal use of the e-mail system. Views and opinions expressed in these communications do not necessarily represent those of KPMG (in India).

*******************************************************************************************************
DISCLAIMER
The information in this e-mail is confidential and may be legally privileged. It is intended solely for the addressee. Access to this e-mail by anyone else is unauthorized. If you have received this communication in error, please address with the subject heading "Received in error," send to postmaster1@kpmg.com, then delete the e-mail and destroy any copies of it. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. Any opinions or advice contained in this e-mail are subject to the terms and conditions expressed in the governing KPMG client engagement letter. Opinions, conclusions and other information in this e-mail and any attachments that do not relate to the official business of the firm are neither given nor endorsed by it.

If this email is being sent by a KPMG team member deployed with a client in relation to KPMG’s engagement with such client for supporting the client through deployment of a team as per the requirement of the [Work Order/Letter of Award/Purchase order/Contract] and such team member is working under the client’s supervision and control, then the respective client is responsible for the content of this email and/or attachments to the email regardless of whether such team member is using client/project domain ID or that of KPMG.

If this email is received by Government Authority on a public email IDs (like gmail, yahoo etc.), then it should be noted that this was sent on public email ID since KPMG was not provided with an alternate secure email ID by the recipient. It is highlighted to the recipient that Government guidelines have advised not to use public emails and an alternate secure email id in line with Government IT Security policy should be provided to KPMG for further communications. KPMG will not be responsible for any situations which may arise on account of using public email id by such recipient.

KPMG cannot guarantee that e-mail communications are secure or error-free, as information could be intercepted, corrupted, amended, lost, destroyed, arrive late or incomplete, or contain viruses.

KPMG, an Indian partnership and a member firm of KPMG International Cooperative ("KPMG International"), an English entity that serves as a coordinating entity for a network of independent firms operating under the KPMG name. KPMG International Cooperative (“KPMG International”) provides no services to clients. Each member firm of KPMG International Cooperative (“KPMG International”) is a legally distinct and separate entity and each describes itself as such.
*******************************************************************************************************

Re: PostgreSQL Active-Active Clustering

От
"David G. Johnston"
Дата:
On Mon, Jul 15, 2024 at 12:06 PM Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:


We are evaluating features of the Community edition of PostgreSQL in relation to a proposal for a prospective client and need help with the queries below:-

 

  • Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectives of scalability, load balancing and high availability without using any extensions or external components or usage of Kubernetes/Dockers.
 
Basically, no.  See "logical replication" if you'd like to delve into why I qualified this one.
 
  • In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and provide vertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party transaction routing mechanisms using NATIVE features only (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).

No

  • In the Community edition of PostgreSQL is it possible to setup  a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).



No

There is only one server process per underlying data directory in PostgreSQL.

David J.

Re: PostgreSQL Active-Active Clustering

От
Christophe Pettus
Дата:

> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
>
>   • Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with
objectivesof scalability, load balancing and high availability without using any extensions or external components or
usageof Kubernetes/Dockers.  
>
>   • In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and
providevertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party
transactionrouting mechanisms using NATIVE features only (i.e. without using any extensions or external components or
usageof Kubernetes/Dockers). 
>
>   • In the Community edition of PostgreSQL is it possible to setup  a cluster where all the nodes are able to
concurrentlyread-write the underlying database image using NATIVE features (i.e. without using any extensions or
externalcomponents or usage of Kubernetes/Dockers). 

Short answer: No.

The community version of PostgreSQL, without any extensions beyond what is available in the core distribution, supports
noneof these.  I will offer that your client is not being realistic if these are their requirements, and they expect
themto be fulfilled by the core distribution of any open-source database. 

There are commercial extensions to PostgreSQL that provide the first, but PostgreSQL does not do so out of the box.  It
ispossible, now, to build this on top of community PostgreSQL with logical replication, but there is notable
developmentwork involved, and you cannot just drop an existing database into PostgreSQL and expect this to work.  In
anyevent, you will need to make sure the schema is compatible with an active-active model. 

For the second, you can explore open-source projects such as Citus, but some attention to the schema and queries will
berequired.  

No product, either commercial or open-source, provides the last one (read-write shared storage), although there are
commercialproducts that provide for a shared-storage model single-writer, multiple-reader model (for example, Amazon
Aurora).


Re: PostgreSQL Active-Active Clustering

От
Ron Johnson
Дата:
On Mon, Jul 15, 2024 at 3:28 PM Christophe Pettus <xof@thebuild.com> wrote:

> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
>
[snip] 
>   • In the Community edition of PostgreSQL is it possible to setup  a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
[snip] 
No product, either commercial or open-source, provides the last one (read-write shared storage), although there are commercial products that provide for a shared-storage model single-writer, multiple-reader model (for example, Amazon Aurora).

This "lack of products" puzzles me, because DEC was doing this with VAX (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock Manager integrated deep into VMS.  Their Rdb and (CODASYL) DBMS products used those functions extensively.

(In the late 1990s, they sold the DLM code to Oracle, which is where RAC comes from.)

It was shared-disk, multiple-writer, because the DLM allowed for locking at the row level.  Thus, a half dozen cluster nodes could hold write locks on different rows on the same data page.

Re: PostgreSQL Active-Active Clustering

От
Christoph Moench-Tegeder
Дата:
## Ron Johnson (ronljohnsonjr@gmail.com):

> This "lack of products" puzzles me, because DEC was doing this with VAX
> (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock
> Manager integrated deep into VMS.  Their Rdb and (CODASYL) DBMS products

Tech and trade-offs have changed over the last 40 years :)
These days you can so many cores in one package, while "more than one
processor" was quite a feat in the 80ies ("A dual processor VAX 11/780",
1982 https://dl.acm.org/doi/10.5555/800048.801738; also the 11/782 and
11/784), and you get so much RAM and storage (even fast storage, if
you keep it local) with that package. Response Latency really jumps
if you have to communicate with anything outside your box.
While latency matters, the number of problems where you absolutely
need that distributed lock manager has not really grown that much,
I think.

Regards,
Christoph

-- 
Spare Space



Re: PostgreSQL Active-Active Clustering

От
Ron Johnson
Дата:
On Mon, Jul 15, 2024 at 5:54 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote:
## Ron Johnson (ronljohnsonjr@gmail.com):

> This "lack of products" puzzles me, because DEC was doing this with VAX
> (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock
> Manager integrated deep into VMS.  Their Rdb and (CODASYL) DBMS products

Tech and trade-offs have changed over the last 40 years :)
These days you can so many cores in one package, while "more than one
processor" was quite a feat in the 80ies ("A dual processor VAX 11/780",
1982 https://dl.acm.org/doi/10.5555/800048.801738; also the 11/782 and
11/784), and you get so much RAM and storage (even fast storage, if
you keep it local) with that package. Response Latency really jumps
if you have to communicate with anything outside your box.
While latency matters, the number of problems where you absolutely
need that distributed lock manager has not really grown that much,
I think.

 Customers still want High Availability, and VMS Clusters were great for HA. 

Re: PostgreSQL Active-Active Clustering

От
Achilleas Mantzios
Дата:
Στις 15/7/24 22:55, ο/η Ron Johnson έγραψε:
On Mon, Jul 15, 2024 at 3:28 PM Christophe Pettus <xof@thebuild.com> wrote:

> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
>
[snip] 
>   • In the Community edition of PostgreSQL is it possible to setup  a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
[snip] 
No product, either commercial or open-source, provides the last one (read-write shared storage), although there are commercial products that provide for a shared-storage model single-writer, multiple-reader model (for example, Amazon Aurora).

This "lack of products" puzzles me, because DEC was doing this with VAX (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock Manager integrated deep into VMS.  Their Rdb and (CODASYL) DBMS products used those functions extensively.
IMHO IBM did something similar with their shared DASDi back in 70s+. There was serialization mechanism enabling concurrent writes to data sets (meaning files). Not to mention IBM had great VM technology back in the day.

(In the late 1990s, they sold the DLM code to Oracle, which is where RAC comes from.)

It was shared-disk, multiple-writer, because the DLM allowed for locking at the row level.  Thus, a half dozen cluster nodes could hold write locks on different rows on the same data page.

-- 
Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt (as agents only)

Re: PostgreSQL Active-Active Clustering

От
Christophe Pettus
Дата:

> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
>
>  • Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectives
ofscalability, load balancing and high availability without using any extensions or external components or usage of
Kubernetes/Dockers. 
>
>  • In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and
providevertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party
transactionrouting mechanisms using NATIVE features only (i.e. without using any extensions or external components or
usageof Kubernetes/Dockers). 
>
>  • In the Community edition of PostgreSQL is it possible to setup  a cluster where all the nodes are able to
concurrentlyread-write the underlying database image using NATIVE features (i.e. without using any extensions or
externalcomponents or usage of Kubernetes/Dockers). 

I do have to add that this list of requirements sounds very much like a set supplied by the sales organization of a
commercialdatabase vendor (in fact, a particular commercial database vendor) in order to exclude open-source software.
Open-sourcesoftware relies very much on a whole ecosystem, rather than attempting to deliver every imaginable feature
inthe core distribution.  This is especially true of projects like PostgreSQL which are not backed by a single company
anddo not have an official commercial distribution.  Can your client articulate why they need these specific features,
andwhy they must be in the core distribution?