Thread: Architecture of walreceiver (Streaming Replication)

Architecture of walreceiver (Streaming Replication)

From: Fujii Masao

Hi,

Recently, the development of SR is not progressing because of
the indecision on whether walreceiver should be a subprocess
of the startup process (i.e., a stand-alone program), or of
postmaster. Since time is running out, I'd like to discuss
about this and advance the project.

The related threads are:
http://archives.postgresql.org/pgsql-hackers/2009-09/msg01101.php
http://archives.postgresql.org/pgsql-hackers/2009-09/msg01291.php

IMO, walreceiver should be a subprocess of postmaster for
the following reasons.

1. It's not easy to give a GUC parameter to a stand-alone
   walreceiver program. A simple approach is giving a
   parameter as a command-line argument. But this wouldn't
   cover a reload of parameter.

2. It's not easy to treat the log messages generated by
   a stand-alone walreceiver as well as the other postgres
   messages. A straightforward approach is that the startup
   process passes along the messages to the logger process.
   But this is not simple.
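
Point 1 can be illustrated with a small sketch. It is not PostgreSQL code: the class names, the `timeout` setting, and the tiny `name = value` parser are all hypothetical, chosen only to contrast a value fixed on the command line with one re-read from a shared configuration file on reload.

```python
import os
import tempfile


def parse_conf(path):
    """Parse a tiny 'name = value' file; '#' starts a comment."""
    settings = {}
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if "=" in line:
                name, value = line.split("=", 1)
                settings[name.strip()] = value.strip()
    return settings


class StandaloneReceiver:
    """Hypothetical stand-alone program: gets its setting once, from argv."""

    def __init__(self, argv):
        # e.g. argv = ["--timeout=30"]; fixed for the life of the process
        self.timeout = int(argv[0].split("=", 1)[1])

    def reload(self):
        # Nothing to re-read: changing the value means restarting the program.
        pass


class IntegratedReceiver:
    """Hypothetical integrated process: re-reads the shared file on reload."""

    def __init__(self, conf_path):
        self.conf_path = conf_path
        self.reload()

    def reload(self):
        # A server child process would do this in response to a reload request.
        self.timeout = int(parse_conf(self.conf_path)["timeout"])


def demo():
    fd, path = tempfile.mkstemp(suffix=".conf")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("timeout = 30\n")
        standalone = StandaloneReceiver(["--timeout=30"])
        integrated = IntegratedReceiver(path)

        # The administrator edits the file and asks for a reload.
        with open(path, "w") as f:
            f.write("timeout = 60  # edited by the administrator\n")
        standalone.reload()
        integrated.reload()
        return standalone.timeout, integrated.timeout
    finally:
        os.remove(path)
```

In this sketch only the integrated variant picks up the edited value; the command-line variant keeps the value it was started with.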
 

I agree that a stand-alone walreceiver is useful for some
cases. But I think that it's sufficient to provide that as
contrib or pgfoundry tool. Not need to provide that in core.
The communication interface to walsender is going to be
provided as libpq, so it's not difficult to implement such
a stand-alone tool.

Thought? Please feel free to comment.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Architecture of walreceiver (Streaming Replication)

From: Robert Haas

On Nov 2, 2009, at 5:06 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

> Hi,
>
> Recently, the development of SR is not progressing because of
> the indecision on whether walreceiver should be a subprocess
> of the startup process (i.e., a stand-alone program), or of
> postmaster. Since time is running out, I'd like to discuss
> about this and advance the project.
>
> The related threads are:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01101.php
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01291.php
>
> IMO, walreceiver should be a subprocess of postmaster for
> the following reasons.
>
> 1. It's not easy to give a GUC parameter to a stand-alone
>   walreceiver program. A simple approach is giving a
>   parameter as a command-line argument. But this wouldn't
>   cover a reload of parameter.
>
> 2. It's not easy to treat the log messages generated by
>   a stand-alone walreceiver as well as the other postgres
>   messages. A straightforward approach is that the startup
>   process passes along the messages to the logger process.
>   But this is not simple.
>
> I agree that a stand-alone walreceiver is useful for some
> cases. But I think that it's sufficient to provide that as
> contrib or pgfoundry tool. Not need to provide that in core.
> The communication interface to walsender is going to be
> provided as libpq, so it's not difficult to implement such
> a stand-alone tool.
>
> Thought? Please feel free to comment.

I agree. A stand-alone tool seems like a good idea (which is why I  
proposed it) but I don't think that should mean that we can't have a  
tightly integrated core facility. We can decide later whether it
is helpful for those things to share code; right now, we should focus  
on getting an initial version of this feature out the door.

Speaking of getting things out the door, what's up with Hot Standby?   
It seemed like the outstanding issues were just about dealt with, and  
then the discussion died off...

...Robert


Re: Architecture of walreceiver (Streaming Replication)

From: Euler Taveira de Oliveira

Fujii Masao escreveu:
> IMO, walreceiver should be a subprocess of postmaster for
> the following reasons.
> 
+1. I agree that the first version should be as close as possible to
postmaster. My points are: (i) it will be easier to install (no need to
install another third-party software), (ii) it will be easier to administrate
(the options will be available in one central point -- postgresql.conf), and
(iii) it will be easier to control (it is a postmaster subprocess).

But I see some value if it would be possible to design it in a way that other
third-party software could replace it completely (even if it couldn't take
advantage of some postmaster features).

Of course, there is no need to develop such a POC external walreceiver tool.
You just need to have in mind that available interfaces should be accessible
by external tools. If someone decides to code a tool to mimic walreceiver but
with some additional features such as WAL filtering then (s)he is free to do it
because we provide entry points in the API.

BTW, are you going to submit another WIP patch for next commitfest?


--  Euler Taveira de Oliveira http://www.timbira.com/


Re: Architecture of walreceiver (Streaming Replication)

From: Robert Haas

On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:
> BTW, are you going to submit another WIP patch for next commitfest?

Well, Heikki was going to keep working on this and Hot Standby between
CommitFests "until it gets committed", but things seem to be stalled
at the moment, possibly because Heikki is tied up with internal
EnterpriseDB projects.  I don't think the hold-up is with Fujii Masao.

...Robert


Re: Architecture of walreceiver (Streaming Replication)

From: Heikki Linnakangas

Robert Haas wrote:
> On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
> <euler@timbira.com> wrote:
>> BTW, are you going to submit another WIP patch for next commitfest?
> 
> Well, Heikki was going to keep working on this and Hot Standby between
> CommitFests "until it gets committed", but things seem to be stalled
> at the moment, possibly because Heikki is tied up with internal
> EnterpriseDB projects.  I don't think the hold-up is with Fujii Masao.

Right. I got dragged away into other stuff for the last week or so.

wrt. synchronous replication, if someone else has the cycles to look at
it, that would be great. I got stuck on the postmaster-process or not
question Fujii raised again now, not being able to decide.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Architecture of walreceiver (Streaming Replication)

From: Heikki Linnakangas

Euler Taveira de Oliveira wrote:
> Fujii Masao escreveu:
>> IMO, walreceiver should be a subprocess of postmaster for
>> the following reasons.
>>
> +1. I agree that the first version should be as close as possible to
> postmaster. My points are: (i) it will be easier to install (no need to
> install another third-party software), (ii) it will be easier to administrate
> (the options will be available in one central point -- postgresql.conf), and
> (iii) it will be easier to control (it is a postmaster subprocess).

None of these points are really for or against either approach. In any
case, we would ship with all the required components, so no need to
install 3rd party software. The recovery related options would come from
recovery.conf in both models, although that could be changed if we
wanted to.

Not sure what easier to control (iii) means, although admittedly it's a
bit tricky to make walreceiver behave correctly as a subprocess of
the startup process, making sure it responds to shutdown requests etc.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Architecture of walreceiver (Streaming Replication)

From: Fujii Masao

Hi,

On Tue, Nov 3, 2009 at 3:23 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> wrt. synchronous replication, if someone else has the cycles to look at
> it, that would be great. I got stuck on the postmaster-process or not
> question Fujii raised again now, not being able to decide.

What is your worry about the postmaster-subprocess walreceiver?

One of those is that the startup process would become stuck because
of failure of launching of walreceiver, and I have addressed that.
http://archives.postgresql.org/pgsql-hackers/2009-09/msg02003.php

If you have another worry, I'll address that.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Architecture of walreceiver (Streaming Replication)

From: Fujii Masao

On Tue, Nov 3, 2009 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
> <euler@timbira.com> wrote:
>> BTW, are you going to submit another WIP patch for next commitfest?
>
> Well, Heikki was going to keep working on this and Hot Standby between
> CommitFests "until it gets committed", but things seem to be stalled
> at the moment, possibly because Heikki is tied up with internal
> EnterpriseDB projects.  I don't think the hold-up is with Fujii Masao.

BTW, my replication patch is on git repository:
   git://git.postgresql.org/git/users/fujii/postgres.git
   branch: replication

The changes against Heikki's repository
(git://git.postgresql.org/git/users/heikki/postgres.git,
branch: replication-orig) are:

- Prevent pq_wait from being called more than once for the connection
  which has already turned out to have data ready to be read.

  Sometimes walsender was calling pq_wait more than once for the
  connection before actually reading data. This is OK in Linux, the
  subsequent pq_wait returns immediately. OTOH, in Windows, this makes
  the subsequent pq_wait get stuck, i.e., the pq_wait doesn't return
  even if there is data ready to be read in the connection. Which seems
  to be derived from the half-baked implementation of pgwin32_select.

  So I changed pq_wait not to call select/poll until data was read from
  the connection, once it turned out to be available.

- Fix the bug that has crossed a logid boundary wrongly. This bug was introduced by sr-paging-rework.patch.
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00384.php

- Apply the sr_rework_1001.patch. http://archives.postgresql.org/pgsql-hackers/2009-09/msg01996.php
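
The pq_wait change in the first item can be sketched roughly as follows. This is an illustration in Python, not the patch itself: the `Connection` class, its `data_ready` flag and the `polls` counter are hypothetical names. The idea shown is only that a "readable" result is remembered, so a repeated wait does not call select() again until the data has actually been read.

```python
import select
import socket


class Connection:
    """Wraps a socket and remembers a pending 'readable' result."""

    def __init__(self, sock):
        self.sock = sock
        self.data_ready = False   # set by wait(), cleared by read()
        self.polls = 0            # number of real select() calls, for illustration

    def wait(self, timeout=None):
        """Return True once the socket is readable.

        If an earlier call already reported readiness and nothing has been
        read since, answer from the cached flag instead of polling again.
        """
        if self.data_ready:
            return True
        self.polls += 1
        readable, _, _ = select.select([self.sock], [], [], timeout)
        self.data_ready = bool(readable)
        return self.data_ready

    def read(self, nbytes):
        """Read from the socket and forget the cached readiness."""
        data = self.sock.recv(nbytes)
        self.data_ready = False
        return data


def demo():
    a, b = socket.socketpair()
    try:
        conn = Connection(a)
        b.sendall(b"WAL")
        first = conn.wait(5.0)
        second = conn.wait(5.0)   # answered from the flag, no second select()
        polls_before_read = conn.polls
        data = conn.read(3)
        return first, second, polls_before_read, data, conn.data_ready
    finally:
        a.close()
        b.close()
```

Calling `wait()` twice before `read()` performs one real poll in this sketch; the second call returns from the flag, which is the behaviour the item describes as the workaround for the Windows case.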

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Architecture of walreceiver (Streaming Replication)

From: Tatsuo Ishii

> Recently, the development of SR is not progressing because of
> the indecision on whether walreceiver should be a subprocess
> of the startup process (i.e., a stand-alone program), or of
> postmaster. Since time is running out, I'd like to discuss
> about this and advance the project.
> 
> The related threads are:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01101.php
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01291.php
> 
> IMO, walreceiver should be a subprocess of postmaster for
> the following reasons.
> 
> 1. It's not easy to give a GUC parameter to a stand-alone
>    walreceiver program. A simple approach is giving a
>    parameter as a command-line argument. But this wouldn't
>    cover a reload of parameter.
> 
> 2. It's not easy to treat the log messages generated by
>    a stand-alone walreceiver as well as the other postgres
>    messages. A straightforward approach is that the startup
>    process passes along the messages to the logger process.
>    But this is not simple.
> 
> I agree that a stand-alone walreceiver is useful for some
> cases. But I think that it's sufficient to provide that as
> contrib or pgfoundry tool. Not need to provide that in core.
> The communication interface to walsender is going to be
> provided as libpq, so it's not difficult to implement such
> a stand-alone tool.

+1. I agree with the idea walreceiver runs as subprocess of
postmaster.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


Re: Architecture of walreceiver (Streaming Replication)

From: Heikki Linnakangas

Fujii Masao wrote:
> On Tue, Nov 3, 2009 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
>> <euler@timbira.com> wrote:
>>> BTW, are you going to submit another WIP patch for next commitfest?
>> Well, Heikki was going to keep working on this and Hot Standby between
>> CommitFests "until it gets committed", but things seem to be stalled
>> at the moment, possibly because Heikki is tied up with internal
>> EnterpriseDB projects.  I don't think the hold-up is with Fujii Masao.
> 
> BTW, my replication patch is on git repository:
> 
>     git://git.postgresql.org/git/users/fujii/postgres.git
>     branch: replication

Thanks, I started to look at this again now. The consensus seems to be
to keep the current architecture where walreceiver is a child of postmaster.

I found the global LogstreamResult variable very confusing. It meant
different things in different processes. So I replaced it with static
globals in walsender.c and walreceiver.c, and renamed the fields to
match the purpose better. I removed some variables from shared memory
that are not necessary, at least not before we have synchronous mode:
Walsender only needs to publish how far it has sent, and walreceiver
only needs to tell startup process how far it has fsync'd.

I changed walreceiver so that it only lets the startup process to apply
WAL that it has fsync'd to disk, per recent discussion on hackers. Maybe
we want to support more esoteric modes in the future, but that's the
least surprising and most useful one.
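
As a rough illustration of that rule (not the code in the repository), the sketch below keeps the written position private to the receiver and publishes a position to the apply side only after fsync. All names here (`WalReceiverState`, `Receiver`, `applicable_bytes`) are hypothetical, and a lock-protected Python object stands in for shared memory.

```python
import os
import tempfile
import threading


class WalReceiverState:
    """Position shared between the receiver and the startup (apply) side."""

    def __init__(self):
        self._lock = threading.Lock()
        self._flushed_upto = 0   # bytes known to be durable on disk

    def publish_flushed(self, pos):
        with self._lock:
            self._flushed_upto = max(self._flushed_upto, pos)

    def flushed_upto(self):
        with self._lock:
            return self._flushed_upto


class Receiver:
    def __init__(self, path, state):
        self.f = open(path, "ab")
        self.state = state
        self.written_upto = 0    # private: written but maybe not yet durable

    def write(self, data):
        self.f.write(data)
        self.written_upto += len(data)

    def flush(self):
        self.f.flush()
        os.fsync(self.f.fileno())
        # Only now is the new position made visible to the apply side.
        self.state.publish_flushed(self.written_upto)

    def close(self):
        self.f.close()


def applicable_bytes(state, applied_upto):
    """How many more bytes the startup side is allowed to replay."""
    return max(0, state.flushed_upto() - applied_upto)


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    state = WalReceiverState()
    rx = Receiver(path, state)
    try:
        rx.write(b"x" * 100)
        before = applicable_bytes(state, 0)   # written, not yet fsync'd
        rx.flush()
        after = applicable_bytes(state, 0)    # fsync'd, now replayable
        return before, after
    finally:
        rx.close()
        os.remove(path)
```

In this sketch the apply side sees nothing to replay until `flush()` has run, even though the bytes were already written.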

Plus some other minor simplifications. My changes are in my git repo at
git://git.postgresql.org/git/users/heikki/postgres.git, branch
"replication".

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Architecture of walreceiver (Streaming Replication)

From: Fujii Masao

Hi,

On Fri, Nov 20, 2009 at 5:54 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Thanks, I started to look at this again now.

Thanks a lot!

> I found the global LogstreamResult variable very confusing. It meant
> different things in different processes. So I replaced it with static
> globals in walsender.c and walreceiver.c, and renamed the fields to
> match the purpose better. I removed some variables from shared memory
> that are not necessary, at least not before we have synchronous mode:
> Walsender only needs to publish how far it has sent, and walreceiver
> only needs to tell startup process how far it has fsync'd.

OK.

> I changed walreceiver so that it only lets the startup process to apply
> WAL that it has fsync'd to disk, per recent discussion on hackers. Maybe
> we want to support more esoteric modes in the future, but that's the
> least surprising and most useful one.

OK. We'll need to go forward in stages.

> Plus some other minor simplifications. My changes are in my git repo at
> git://git.postgresql.org/git/users/heikki/postgres.git, branch
> "replication".

I fixed one bug. I also looked through the code over and over again.
   git://git.postgresql.org/git/users/fujii/postgres.git
   branch: replication

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center