Обсуждение: PostgreSQL Active-Active Clustering
Hi,
We are evaluating features of the Community edition of PostgreSQL in relation to a proposal for a prospective client and need help with the queries below:-
- Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectives of scalability, load balancing and high availability without using any extensions or external components or usage of Kubernetes/Dockers.
- In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and provide vertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party transaction routing mechanisms using NATIVE features only (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
- In the Community edition of PostgreSQL is it possible to setup a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
Kindly direct us to the right contributor/user group. Thanks in advance!
Regards,
Subhadeep
Subhadeep Sarkar
Program Director
Government & Public Service - Technology.
O +91 33 4403 4000
M +91 98301 69398
Email: :subhadeepsarkar@kpmg.com
KPMG Advisory Services Private Limited
Unit No. 604, 6th Floor, Tower 1, Plot No. 5,
Block DP, Godrej Waterside, Sector V,
Salt Lake,
Kolkata - 700 091
________________________________________
KPMG (in India) allows reasonable personal use of the e-mail system. Views and opinions expressed in these communications do not necessarily represent those of KPMG (in India).
*******************************************************************************************************
DISCLAIMER
The information in this e-mail is confidential and may be legally privileged. It is intended solely for the addressee. Access to this e-mail by anyone else is unauthorized. If you have received this communication in error, please address with the subject heading "Received in error," send to postmaster1@kpmg.com, then delete the e-mail and destroy any copies of it. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. Any opinions or advice contained in this e-mail are subject to the terms and conditions expressed in the governing KPMG client engagement letter. Opinions, conclusions and other information in this e-mail and any attachments that do not relate to the official business of the firm are neither given nor endorsed by it.
If this email is being sent by a KPMG team member deployed with a client in relation to KPMG’s engagement with such client for supporting the client through deployment of a team as per the requirement of the [Work Order/Letter of Award/Purchase order/Contract] and such team member is working under the client’s supervision and control, then the respective client is responsible for the content of this email and/or attachments to the email regardless of whether such team member is using client/project domain ID or that of KPMG.
If this email is received by Government Authority on a public email IDs (like gmail, yahoo etc.), then it should be noted that this was sent on public email ID since KPMG was not provided with an alternate secure email ID by the recipient. It is highlighted to the recipient that Government guidelines have advised not to use public emails and an alternate secure email id in line with Government IT Security policy should be provided to KPMG for further communications. KPMG will not be responsible for any situations which may arise on account of using public email id by such recipient.
KPMG cannot guarantee that e-mail communications are secure or error-free, as information could be intercepted, corrupted, amended, lost, destroyed, arrive late or incomplete, or contain viruses.
KPMG, an Indian partnership and a member firm of KPMG International Cooperative ("KPMG International"), an English entity that serves as a coordinating entity for a network of independent firms operating under the KPMG name. KPMG International Cooperative (“KPMG International”) provides no services to clients. Each member firm of KPMG International Cooperative (“KPMG International”) is a legally distinct and separate entity and each describes itself as such.
*******************************************************************************************************
On Mon, Jul 15, 2024 at 12:06 PM Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
We are evaluating features of the Community edition of PostgreSQL in relation to a proposal for a prospective client and need help with the queries below:-
- Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectives of scalability, load balancing and high availability without using any extensions or external components or usage of Kubernetes/Dockers.
Basically, no. See "logical replication" if you'd like to delve into why I qualified this one.
- In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and provide vertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party transaction routing mechanisms using NATIVE features only (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
No
- In the Community edition of PostgreSQL is it possible to setup a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
No
There is only one server process per underlying data directory in PostgreSQL.
David J.
> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote: > > • Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectivesof scalability, load balancing and high availability without using any extensions or external components or usageof Kubernetes/Dockers. > > • In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and providevertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party transactionrouting mechanisms using NATIVE features only (i.e. without using any extensions or external components or usageof Kubernetes/Dockers). > > • In the Community edition of PostgreSQL is it possible to setup a cluster where all the nodes are able to concurrentlyread-write the underlying database image using NATIVE features (i.e. without using any extensions or externalcomponents or usage of Kubernetes/Dockers). Short answer: No. The community version of PostgreSQL, without any extensions beyond what is available in the core distribution, supports noneof these. I will offer that your client is not being realistic if these are their requirements, and they expect themto be fulfilled by the core distribution of any open-source database. There are commercial extensions to PostgreSQL that provide the first, but PostgreSQL does not do so out of the box. It ispossible, now, to build this on top of community PostgreSQL with logical replication, but there is notable developmentwork involved, and you cannot just drop an existing database into PostgreSQL and expect this to work. In anyevent, you will need to make sure the schema is compatible with an active-active model. For the second, you can explore open-source projects such as Citus, but some attention to the schema and queries will berequired. No product, either commercial or open-source, provides the last one (read-write shared storage), although there are commercialproducts that provide for a shared-storage model single-writer, multiple-reader model (for example, Amazon Aurora).
On Mon, Jul 15, 2024 at 3:28 PM Christophe Pettus <xof@thebuild.com> wrote:
> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
>
[snip]
> • In the Community edition of PostgreSQL is it possible to setup a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).
[snip]
No product, either commercial or open-source, provides the last one (read-write shared storage), although there are commercial products that provide for a shared-storage model single-writer, multiple-reader model (for example, Amazon Aurora).
This "lack of products" puzzles me, because DEC was doing this with VAX (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock Manager integrated deep into VMS. Their Rdb and (CODASYL) DBMS products used those functions extensively.
(In the late 1990s, they sold the DLM code to Oracle, which is where RAC comes from.)
It was shared-disk, multiple-writer, because the DLM allowed for locking at the row level. Thus, a half dozen cluster nodes could hold write locks on different rows on the same data page.
## Ron Johnson (ronljohnsonjr@gmail.com): > This "lack of products" puzzles me, because DEC was doing this with VAX > (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock > Manager integrated deep into VMS. Their Rdb and (CODASYL) DBMS products Tech and trade-offs have changed over the last 40 years :) These days you can so many cores in one package, while "more than one processor" was quite a feat in the 80ies ("A dual processor VAX 11/780", 1982 https://dl.acm.org/doi/10.5555/800048.801738; also the 11/782 and 11/784), and you get so much RAM and storage (even fast storage, if you keep it local) with that package. Response Latency really jumps if you have to communicate with anything outside your box. While latency matters, the number of problems where you absolutely need that distributed lock manager has not really grown that much, I think. Regards, Christoph -- Spare Space
On Mon, Jul 15, 2024 at 5:54 PM Christoph Moench-Tegeder <cmt@burggraben.net> wrote:
## Ron Johnson (ronljohnsonjr@gmail.com):
> This "lack of products" puzzles me, because DEC was doing this with VAX
> (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock
> Manager integrated deep into VMS. Their Rdb and (CODASYL) DBMS products
Tech and trade-offs have changed over the last 40 years :)
These days you can so many cores in one package, while "more than one
processor" was quite a feat in the 80ies ("A dual processor VAX 11/780",
1982 https://dl.acm.org/doi/10.5555/800048.801738; also the 11/782 and
11/784), and you get so much RAM and storage (even fast storage, if
you keep it local) with that package. Response Latency really jumps
if you have to communicate with anything outside your box.
While latency matters, the number of problems where you absolutely
need that distributed lock manager has not really grown that much,
I think.
Customers still want High Availability, and VMS Clusters were great for HA.
Στις 15/7/24 22:55, ο/η Ron Johnson έγραψε:
IMHO IBM did something similar with their shared DASDi back in 70s+. There was serialization mechanism enabling concurrent writes to data sets (meaning files). Not to mention IBM had great VM technology back in the day.On Mon, Jul 15, 2024 at 3:28 PM Christophe Pettus <xof@thebuild.com> wrote:
> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote:
>[snip]> • In the Community edition of PostgreSQL is it possible to setup a cluster where all the nodes are able to concurrently read-write the underlying database image using NATIVE features (i.e. without using any extensions or external components or usage of Kubernetes/Dockers).[snip]No product, either commercial or open-source, provides the last one (read-write shared storage), although there are commercial products that provide for a shared-storage model single-writer, multiple-reader model (for example, Amazon Aurora).This "lack of products" puzzles me, because DEC was doing this with VAX (then Alpha and Itanium) clusters 40 years ago via a Distributed Lock Manager integrated deep into VMS. Their Rdb and (CODASYL) DBMS products used those functions extensively.
(In the late 1990s, they sold the DLM code to Oracle, which is where RAC comes from.)It was shared-disk, multiple-writer, because the DLM allowed for locking at the row level. Thus, a half dozen cluster nodes could hold write locks on different rows on the same data page.
-- Achilleas Mantzios IT DEV - HEAD IT DEPT Dynacom Tankers Mgmt (as agents only)
> On Jul 15, 2024, at 12:06, Sarkar, Subhadeep <subhadeepsarkar@kpmg.com> wrote: > > • Does the Community edition of PostgreSQL provide NATIVE active-active high availability clustering with objectives ofscalability, load balancing and high availability without using any extensions or external components or usage of Kubernetes/Dockers. > > • In the Community edition of PostgreSQL is it possible to setup a database cluster in load balancing mode and providevertical and horizontal scalability without repartitioning or changes to the database objects or 3rd party transactionrouting mechanisms using NATIVE features only (i.e. without using any extensions or external components or usageof Kubernetes/Dockers). > > • In the Community edition of PostgreSQL is it possible to setup a cluster where all the nodes are able to concurrentlyread-write the underlying database image using NATIVE features (i.e. without using any extensions or externalcomponents or usage of Kubernetes/Dockers). I do have to add that this list of requirements sounds very much like a set supplied by the sales organization of a commercialdatabase vendor (in fact, a particular commercial database vendor) in order to exclude open-source software. Open-sourcesoftware relies very much on a whole ecosystem, rather than attempting to deliver every imaginable feature inthe core distribution. This is especially true of projects like PostgreSQL which are not backed by a single company anddo not have an official commercial distribution. Can your client articulate why they need these specific features, andwhy they must be in the core distribution?