Обсуждение: Assertion failure in base backup code path

Поиск
Список
Период
Сортировка

Assertion failure in base backup code path

От
Antonin Houska
Дата:
In HEAD, pg_basebackup causes WAL sender to fail when the replication
user is not a superuser:


#0  0x00007f34f671dd25 in raise () from /lib64/libc.so.6
#1  0x00007f34f671f1a8 in abort () from /lib64/libc.so.6
#2  0x00000000008989a9 in ExceptionalCondition (conditionName=0xa51ac1
"!(IsTransactionState())", errorType=0xa51734 "FailedAssertion",
fileName=0xa516e0 "catcache.c", lineNumber=1111) at assert.c:54
#3  0x000000000087ea36 in SearchCatCache (cache=0x23c4fb8, v1=16384,
v2=0, v3=0, v4=0) at catcache.c:1111
#4  0x0000000000890cdd in SearchSysCache (cacheId=11, key1=16384,
key2=0, key3=0, key4=0) at syscache.c:909
#5  0x00000000008a9a99 in has_rolreplication (roleid=16384) at
miscinit.c:401
#6  0x00000000005146d1 in do_pg_start_backup (backupidstr=0x239d5b0
"bkp_01", fast=0 '\000', starttli_p=0x7fff78e4f8ec,
labelfile=0x7fff78e4f8e0) at xlog.c:9633
#7  0x0000000000733a24 in perform_base_backup (opt=0x7fff78e4fa30,
tblspcdir=0x242c6a0) at basebackup.c:106
#8  0x0000000000735013 in SendBaseBackup (cmd=0x239dbf8) at basebackup.c:563
#9  0x000000000072f4f2 in exec_replication_command (cmd_string=0x23ea078
"BASE_BACKUP LABEL 'bkp_01'  WAL  NOWAIT") at walsender.c:668
#10 0x000000000077c5c4 in PostgresMain (argc=1, argv=0x2385358,
dbname=0x2385248 "", username=0x2385210 "postgres_replication") at
postgres.c:4009
#11 0x0000000000717c94 in BackendRun (port=0x23a2e90) at postmaster.c:4085
#12 0x000000000071742e in BackendStartup (port=0x23a2e90) at
postmaster.c:3774
#13 0x0000000000713cc9 in ServerLoop () at postmaster.c:1585
#14 0x0000000000713370 in PostmasterMain (argc=3, argv=0x2381f60) at
postmaster.c:1240
#15 0x0000000000677698 in main (argc=3, argv=0x2381f60) at main.c:196

Some additional condition may be needed in the Assert() statement?

// Antonin Houska (Tony)



Re: Assertion failure in base backup code path

От
Andres Freund
Дата:
Hi,

On 2013-12-16 15:08:51 +0100, Antonin Houska wrote:
> In HEAD, pg_basebackup causes WAL sender to fail when the replication
> user is not a superuser:
> 
> 
> #0  0x00007f34f671dd25 in raise () from /lib64/libc.so.6
> #1  0x00007f34f671f1a8 in abort () from /lib64/libc.so.6
> #2  0x00000000008989a9 in ExceptionalCondition (conditionName=0xa51ac1
> "!(IsTransactionState())", errorType=0xa51734 "FailedAssertion",
> fileName=0xa516e0 "catcache.c", lineNumber=1111) at assert.c:54
> #3  0x000000000087ea36 in SearchCatCache (cache=0x23c4fb8, v1=16384,
> v2=0, v3=0, v4=0) at catcache.c:1111
> #4  0x0000000000890cdd in SearchSysCache (cacheId=11, key1=16384,
> key2=0, key3=0, key4=0) at syscache.c:909
> #5  0x00000000008a9a99 in has_rolreplication (roleid=16384) at
> miscinit.c:401
> #6  0x00000000005146d1 in do_pg_start_backup (backupidstr=0x239d5b0
> "bkp_01", fast=0 '\000', starttli_p=0x7fff78e4f8ec,
> labelfile=0x7fff78e4f8e0) at xlog.c:9633
> #7  0x0000000000733a24 in perform_base_backup (opt=0x7fff78e4fa30,
> tblspcdir=0x242c6a0) at basebackup.c:106
> #8  0x0000000000735013 in SendBaseBackup (cmd=0x239dbf8) at basebackup.c:563
> #9  0x000000000072f4f2 in exec_replication_command (cmd_string=0x23ea078
> "BASE_BACKUP LABEL 'bkp_01'  WAL  NOWAIT") at walsender.c:668
> #10 0x000000000077c5c4 in PostgresMain (argc=1, argv=0x2385358,
> dbname=0x2385248 "", username=0x2385210 "postgres_replication") at
> postgres.c:4009
> #11 0x0000000000717c94 in BackendRun (port=0x23a2e90) at postmaster.c:4085
> #12 0x000000000071742e in BackendStartup (port=0x23a2e90) at
> postmaster.c:3774
> #13 0x0000000000713cc9 in ServerLoop () at postmaster.c:1585
> #14 0x0000000000713370 in PostmasterMain (argc=3, argv=0x2381f60) at
> postmaster.c:1240
> #15 0x0000000000677698 in main (argc=3, argv=0x2381f60) at main.c:196
> 
> Some additional condition may be needed in the Assert() statement?

Actually it more looks like a bug around the basebackup facility. It
shouldn't do syscache lookups without having the resource management
stuff around it (the transaction state asserted in SearchCatCache()).

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Assertion failure in base backup code path

От
Andres Freund
Дата:
Hi Magnus,

It looks to me like the path to do_pg_start_backup() outside of a
transaction context comes from your initial commit of the base backup
facility.
The problem is that you're not allowed to do anything leading to a
syscache/catcache lookup in those contexts.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Assertion failure in base backup code path

От
Magnus Hagander
Дата:
<p dir="ltr"><br /> On Dec 19, 2013 12:06 AM, "Andres Freund" <<a
href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br /> ><br /> > Hi Magnus,<br />
><br/> > It looks to me like the path to do_pg_start_backup() outside of a<br /> > transaction context comes
fromyour initial commit of the base backup<br /> > facility.<br /> > The problem is that you're not allowed to do
anythingleading to a<br /> > syscache/catcache lookup in those contexts.<br /><p dir="ltr">I think that may have
comewith the addition of the replication privilege actually but that doesn't change the fact that yes, it appears
broken..<p dir="ltr">/Magnus  

Re: Assertion failure in base backup code path

От
Andres Freund
Дата:
On 2013-12-23 18:28:51 +0100, Magnus Hagander wrote:
> On Dec 19, 2013 12:06 AM, "Andres Freund" <andres@2ndquadrant.com> wrote:
> >
> > Hi Magnus,
> >
> > It looks to me like the path to do_pg_start_backup() outside of a
> > transaction context comes from your initial commit of the base backup
> > facility.
> > The problem is that you're not allowed to do anything leading to a
> > syscache/catcache lookup in those contexts.
> 
> I think that may have come with the addition of the replication privilege
> actually but that doesn't change the fact that yes, it appears broken..

There was a if (!superuser()) check there before as well, that has the
same dangers.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Assertion failure in base backup code path

От
Magnus Hagander
Дата:
On Tue, Dec 24, 2013 at 1:24 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-12-23 18:28:51 +0100, Magnus Hagander wrote:
> On Dec 19, 2013 12:06 AM, "Andres Freund" <andres@2ndquadrant.com> wrote:
> >
> > Hi Magnus,
> >
> > It looks to me like the path to do_pg_start_backup() outside of a
> > transaction context comes from your initial commit of the base backup
> > facility.
> > The problem is that you're not allowed to do anything leading to a
> > syscache/catcache lookup in those contexts.
>
> I think that may have come with the addition of the replication privilege
> actually but that doesn't change the fact that yes, it appears broken..

There was a if (!superuser()) check there before as well, that has the
same dangers.


I think the correct fix is to move the security check outside of do_pg_start_backup() and leave it to the caller. That means pg_start_backup() for a manual backup. And for a streaming base backup the check has already been made - you can't get through the authentication step if you don't have the required permissions.

Does the attached seem right to you? 

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Вложения

Re: Assertion failure in base backup code path

От
Andres Freund
Дата:
On 2014-01-07 17:40:07 +0100, Magnus Hagander wrote:
> On Tue, Dec 24, 2013 at 1:24 PM, Andres Freund <andres@2ndquadrant.com>wrote:
> 
> > On 2013-12-23 18:28:51 +0100, Magnus Hagander wrote:
> > > On Dec 19, 2013 12:06 AM, "Andres Freund" <andres@2ndquadrant.com>
> > wrote:
> > > >
> > > > Hi Magnus,
> > > >
> > > > It looks to me like the path to do_pg_start_backup() outside of a
> > > > transaction context comes from your initial commit of the base backup
> > > > facility.
> > > > The problem is that you're not allowed to do anything leading to a
> > > > syscache/catcache lookup in those contexts.
> > >
> > > I think that may have come with the addition of the replication privilege
> > > actually but that doesn't change the fact that yes, it appears broken..
> >
> > There was a if (!superuser()) check there before as well, that has the
> > same dangers.
> >
> >
> I think the correct fix is to move the security check outside of
> do_pg_start_backup() and leave it to the caller. That means
> pg_start_backup() for a manual backup. And for a streaming base backup the
> check has already been made - you can't get through the authentication step
> if you don't have the required permissions.

Yes, that's what I was thinking and was planning to write you about at
some point.

> Does the attached seem right to you?

I haven't run the code, but it looks right to me.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Assertion failure in base backup code path

От
Magnus Hagander
Дата:
On Tue, Jan 7, 2014 at 5:43 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2014-01-07 17:40:07 +0100, Magnus Hagander wrote:
> On Tue, Dec 24, 2013 at 1:24 PM, Andres Freund <andres@2ndquadrant.com>wrote:
>
> > On 2013-12-23 18:28:51 +0100, Magnus Hagander wrote:
> > > On Dec 19, 2013 12:06 AM, "Andres Freund" <andres@2ndquadrant.com>
> > wrote:
> > > >
> > > > Hi Magnus,
> > > >
> > > > It looks to me like the path to do_pg_start_backup() outside of a
> > > > transaction context comes from your initial commit of the base backup
> > > > facility.
> > > > The problem is that you're not allowed to do anything leading to a
> > > > syscache/catcache lookup in those contexts.
> > >
> > > I think that may have come with the addition of the replication privilege
> > > actually but that doesn't change the fact that yes, it appears broken..
> >
> > There was a if (!superuser()) check there before as well, that has the
> > same dangers.
> >
> >
> I think the correct fix is to move the security check outside of
> do_pg_start_backup() and leave it to the caller. That means
> pg_start_backup() for a manual backup. And for a streaming base backup the
> check has already been made - you can't get through the authentication step
> if you don't have the required permissions.

Yes, that's what I was thinking and was planning to write you about at
some point.

Good, then we think the same :)
 

> Does the attached seem right to you?

I haven't run the code, but it looks right to me.


It worked fine in my testing, so I've pushed that version.

Looks slightly different in both 9.2 and 9.1 (not clean backpatching) due to code reorganization and such, but AFAICT it's just code in different places.

Thanks!

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/