Re: Allow replication roles to use file access functions

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Allow replication roles to use file access functions
Дата
Msg-id 20150903010556.GZ3685@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: Allow replication roles to use file access functions  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: Allow replication roles to use file access functions  (Michael Paquier <michael.paquier@gmail.com>)
Список pgsql-hackers
Micahel,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Thu, Sep 3, 2015 at 4:52 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Michael Paquier wrote:
> >> As of now, file access functions in genfile.c can only be used by
> >> superusers. This proposal is to relax those functions so as
> >> replication users can use them as well. Here are the functions aimed
> >> by this patch:
> >> - pg_stat_file
> >> - pg_read_binary_file
> >> - pg_read_file
> >> - pg_ls_dir
> >> The main argument for this change is that pg_rewind makes use of those
> >> functions, forcing users to use a superuser role when rewinding a
> >> node.
> >
> > Can this piggyback on Stephen Frost's proposed patch for reserved roles et al?
>
> (Adding Stephen in the loop)

Thanks!

> I thought that this was rather independent on Stephen's default role
> patch because this is not based on a new role permission type.
> Stephen, do you think so, or should we merge the effort with your
> things?

It's not directly related to the default roles as it's adding
permissions to an existing role attribute control, but it's certainly
covering related territory and I'd like to make sure we're careful with
any increase in permissions for existing role attributes.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
> > On Thu, Sep 3, 2015 at 4:52 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >> Michael Paquier wrote:
> >>> The main argument for this change is that pg_rewind makes use of those
> >>> functions, forcing users to use a superuser role when rewinding a
> >>> node.
>
> >> Can this piggyback on Stephen Frost's proposed patch for reserved roles et al?
>
> > (Adding Stephen in the loop)
> > I thought that this was rather independent on Stephen's default role
> > patch because this is not based on a new role permission type.
> > Stephen, do you think so, or should we merge the effort with your
> > things?
>
> Just on general principles, this seems like a pretty horrid idea.

Agreed- it's far broader than what is necessary, as I understand it.

> To me replication privilege means the ability to transfer data out of
> the master, not to cause arbitrary state changes on the master.

Agreed, but as Andres points out, this doesn't allow state changes on
the master.

> Therefore, "allow replication users to run pg_rewind" seems less like
> a feature and more like a security bug.  Am I missing something?

I believe Andres clarified this- this is about remastering an old master
to follow a *new* master, and at this point, the old master needs
information from the new master to reset the old master system back to a
point where the old master can turn into a replica of the new master.

> Another problem with it is that access to the filesystem is about halfway
> to being superuser anyway, even if it's only read-only access.  It would
> certainly let you read things that one would not expect a replication
> user to be able to read (like files unrelated to Postgres).

The replication role already has read-only access to everything
(nearly?) in the PGDATA directory.  The specific issue in this case, I
believe, is if the functions mentioned provide access beyond what the
replication role already has access to.

> I don't entirely understand why pg_rewind should be invoking any of these
> functions to begin with.  Isn't it working directly with the disk data,
> with both postmasters shut down?

Certain information is needed from the new master to rewind the old
master.  What I would hope is that we'd provide a way for *just* that
information to be made available to the replication role rather than
completely general access through the functions mentioned.

* Andres Freund (andres@anarazel.de) wrote:
> On 2015-09-02 19:48:15 -0400, Tom Lane wrote:
> > Just on general principles, this seems like a pretty horrid idea.
> > To me replication privilege means the ability to transfer data out of
> > the master, not to cause arbitrary state changes on the master.
>
> It's not about the permission to trigger pg_rewind on the master - it's
> about being able to run pg_rewind (as the necessary OS user) on the
> *standby* when the connection to the primary has only replication rather
> than superuser privs.

Understood.

> > Another problem with it is that access to the filesystem is about halfway
> > to being superuser anyway, even if it's only read-only access.  It would
> > certainly let you read things that one would not expect a replication
> > user to be able to read (like files unrelated to Postgres).
>
> Doesn't pg_read_file et al insist that the path is below the data
> directorY?

I don't believe that's the case, actually..  I might be wrong though,
I'm not looking at the code right now.  This is definitely a big part of
the question, but I'd like to ask- what, exactly, does pg_rewind
actually need access to?  Is it only the WAL, or are heap and WAL files
needed?  How much of that access could be reasonably needed as part of
the replication protocol for other purposes?

Consider the discussion about delta backups, et al, using LSNs.  Perhaps
the replication protocol should be extended to allow access to arbitrary
WAL, querying what WAL is available, and access to the heap files,
perhaps even specific pages in the heap files and relation forks,
instead of giving pg_rewind access to these extremely general
nearly-OS-user-level functions.

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Thu, Sep 3, 2015 at 8:59 AM, Andres Freund <andres@anarazel.de> wrote:
> > On 2015-09-02 19:48:15 -0400, Tom Lane wrote:
> >> Just on general principles, this seems like a pretty horrid idea.
> >> To me replication privilege means the ability to transfer data out of
> >> the master, not to cause arbitrary state changes on the master.
> >
> > It's not about the permission to trigger pg_rewind on the master - it's
> > about being able to run pg_rewind (as the necessary OS user) on the
> > *standby* when the connection to the primary has only replication rather
> > than superuser privs.
>
> Yeah, I got poked by this limitation of pg_rewind some time ago
> internally actually, folks willing to be able to manage their cluster
> only with a replication role, and they were not really willing to have
> a superuser for such operations being used across the network.

What I believe Tom is, rightly, worried about is exactly this case.
That is, users believe that the replication role is less powerful and
less dangerous than the superuser role and we should be careful to
maintain that.  If we're unable or unwilling to do so, we should
eliminate the replication role as being redundant to superuser.

I don't believe that's the case here, but I do view giving access to the
replication role for these functions as going farther than we need to,
or should, go.

Further, for my two cents, I view any access to the system by a
replication role to a non-replication-protocol communication with the
server as being inappropriate.  To that point, I'd actually remove the
ability for the replication roles to call pg_start/stop_backup as normal
SQL calls since that implies the replication user has connected to a
normal backend instead of talking the replication protocol.  I'm not
sure if we can really put that cat back in the bag, but I'd be a lot
happier if we could.  Allowing access to these functions outside of the
replication protocol doubles-down on that bad idea, in my view.
Thanks!
    Stephen

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Horizontal scalability/sharding
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Improving test coverage of extensions with pg_dump