Обсуждение: doPickSplit stack buffer overflow in XLogInsert?

Поиск
Список
Период
Сортировка

doPickSplit stack buffer overflow in XLogInsert?

От
Kevin Grittner
Дата:
I happened to build in a shell that was still set up for the clang
address sanitizer, and got the attached report.  On a rerun it was
repeatable.  XLogInsert() seems to read past the end of a variable
allocated on the stack in doPickSplit(). I haven't tried to analyze
it past that, since this part of the code is unfamiliar to me.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Andres Freund
Дата:
On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
> I happened to build in a shell that was still set up for the clang
> address sanitizer, and got the attached report.  On a rerun it was
> repeatable.  XLogInsert() seems to read past the end of a variable
> allocated on the stack in doPickSplit(). I haven't tried to analyze
> it past that, since this part of the code is unfamiliar to me.

Yea, I've seen that one before as well and planned to report it at some
point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
toDelete.


Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Kevin Grittner
Дата:
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
>
>> I happened to build in a shell that was still set up for the clang
>> address sanitizer, and got the attached report.  On a rerun it was
>> repeatable.  XLogInsert() seems to read past the end of a variable
>> allocated on the stack in doPickSplit(). I haven't tried to analyze
>> it past that, since this part of the code is unfamiliar to me.
>
> Yea, I've seen that one before as well and planned to report it at some
> point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
> up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
> toDelete.

Have you established whether having the CRC calculation read past
the end of the buffer can cause problems on recovery or standby
systems?  Should we try to get this fixed by Monday?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Noah Misch
Дата:
On Wed, Nov 27, 2013 at 06:23:38AM -0800, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-11-26 14:14:38 -0800, Kevin Grittner wrote:
> >
> >> I happened to build in a shell that was still set up for the clang
> >> address sanitizer, and got the attached report.  On a rerun it was

(Kevin, I saw no attachment.)

> >> repeatable.  XLogInsert() seems to read past the end of a variable
> >> allocated on the stack in doPickSplit(). I haven't tried to analyze
> >> it past that, since this part of the code is unfamiliar to me.
> >
> > Yea, I've seen that one before as well and planned to report it at some
> > point. The reason is the MAXALIGN()s in ACCEPT_RDATA_DATA(). That rounds
> > up to 8byte boundaries, while we've e.g. only added 2bytes of slop to
> > toDelete.
> 
> Have you established whether having the CRC calculation read past
> the end of the buffer can cause problems on recovery or standby
> systems?  Should we try to get this fixed by Monday?

The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic.  We should fix
this in due course on code hygiene grounds, but I would not back-patch it.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Kevin Grittner
Дата:
Noah Misch <noah@leadboat.com> wrote:

> (Kevin, I saw no attachment.)

Apologies.  Trying again.

> The threat is that rounding the read size up to the next MAXALIGN
> would cross into an unreadable memory page, resulting in a
> SIGSEGV.  Every palloc chunk has MAXALIGN'd size under the hood,
> so the excess read of "toDelete" cannot cause a SIGSEGV.  For a
> stack variable, it depends on the ABI.  I'm not aware of an ABI
> where the four bytes past the end of this stack variable could be
> unreadable, which is not to claim I'm well-read on the topic.  We
> should fix this in due course on code hygiene grounds, but I
> would not back-patch it.

If you're sure.  I hadn't worked through the code, but had two
concerns (neither of which was about a SEGSEGV):

(1)  That multiple MAXALIGNs of shorter values could push the
structure into overlap with the next thing on the stack, allowing
one or the other to get stepped on.

(2)  That the CRC calculation might picking up uninitialized data
which was not actually going to match what was used during
recovery, leading to "end of recovery" on replay.

If you are confident that neither of these is a real risk, I'll
relax about this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: doPickSplit stack buffer overflow in XLogInsert?

От
Noah Misch
Дата:
On Wed, Nov 27, 2013 at 11:38:23AM -0800, Kevin Grittner wrote:
> Noah Misch <noah@leadboat.com> wrote:
> > The threat is that rounding the read size up to the next MAXALIGN
> > would cross into an unreadable memory page, resulting in a
> > SIGSEGV.  Every palloc chunk has MAXALIGN'd size under the hood,
> > so the excess read of "toDelete" cannot cause a SIGSEGV.  For a
> > stack variable, it depends on the ABI.  I'm not aware of an ABI
> > where the four bytes past the end of this stack variable could be
> > unreadable, which is not to claim I'm well-read on the topic.  We
> > should fix this in due course on code hygiene grounds, but I
> > would not back-patch it.
> 
> If you're sure.  I hadn't worked through the code, but had two
> concerns (neither of which was about a SEGSEGV):
> 
> (1)  That multiple MAXALIGNs of shorter values could push the
> structure into overlap with the next thing on the stack, allowing
> one or the other to get stepped on.

These are out-of-bounds reads only, not writes.  Also, the excess doesn't
accumulate that way; the code reads beyond any particular stack variable or
palloc chunk by no more than 7 bytes.

> (2)  That the CRC calculation might picking up uninitialized data
> which was not actually going to match what was used during
> recovery, leading to "end of recovery" on replay.

The CRC calculation does pick up unspecified bytes, but we copy those same
bytes into the actual WAL record.  The effect is similar to that of
unspecified pad bytes in the middle of xlrec structures.

> If you are confident that neither of these is a real risk, I'll
> relax about this.

If there is a real risk, I'm not seeing it.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Andres Freund
Дата:
On 2013-11-27 15:29:24 -0500, Noah Misch wrote:
> > If you are confident that neither of these is a real risk, I'll
> > relax about this.
> 
> If there is a real risk, I'm not seeing it.

Me neither.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Peter Eisentraut
Дата:
On 11/26/13, 5:14 PM, Kevin Grittner wrote:
> I happened to build in a shell that was still set up for the clang
> address sanitizer, and got the attached report.  On a rerun it was
> repeatable.  XLogInsert() seems to read past the end of a variable
> allocated on the stack in doPickSplit(). I haven't tried to analyze
> it past that, since this part of the code is unfamiliar to me.

I also see that.  It only happens in 64-bit builds.



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Peter Geoghegan
Дата:
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
> The threat is that rounding the read size up to the next MAXALIGN would cross
> into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
> has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
> cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
> of an ABI where the four bytes past the end of this stack variable could be
> unreadable, which is not to claim I'm well-read on the topic.  We should fix
> this in due course on code hygiene grounds, but I would not back-patch it.

Attached patch silences the "Invalid read of size n" complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.

--
Peter Geoghegan

Вложения

Re: doPickSplit stack buffer overflow in XLogInsert?

От
Robert Haas
Дата:
On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
>> The threat is that rounding the read size up to the next MAXALIGN would cross
>> into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
>> has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
>> cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
>> of an ABI where the four bytes past the end of this stack variable could be
>> unreadable, which is not to claim I'm well-read on the topic.  We should fix
>> this in due course on code hygiene grounds, but I would not back-patch it.
>
> Attached patch silences the "Invalid read of size n" complaints of
> Valgrind. I agree with your general thoughts around backpatching. Note
> that the patch addresses a distinct complaint from Kevin's, as
> Valgrind doesn't take issue with the invalid reads past the end of
> spgxlogPickSplit variables on the stack.

Is the needless zeroing this patch introduces apt to cause a
performance problem?

This function is actually pretty wacky.  If we're stuffing bytes with
undefined contents into the WAL record, maybe the answer isn't to
force the contents of those bytes to be defined, but rather to elide
them from the WAL record.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Andres Freund
Дата:
Hi,

We really should fix this one of these days.

On 2014-03-26 18:45:54 -0700, Peter Geoghegan wrote:
> Attached patch silences the "Invalid read of size n" complaints of
> Valgrind. I agree with your general thoughts around backpatching. Note
> that the patch addresses a distinct complaint from Kevin's, as
> Valgrind doesn't take issue with the invalid reads past the end of
> spgxlogPickSplit variables on the stack.

I don't think that's entirely sufficient. The local spgxlogPickSplit
xlrec;/spgxlogMoveLeafs xlrec; variables are also inserted while
MAXLIGNing their size. That's slightly harder to fix :(. I don't have a
better idea than also allocating them dynamically :(

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Heikki Linnakangas
Дата:
On 03/31/2014 09:08 PM, Robert Haas wrote:
> On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
>>> The threat is that rounding the read size up to the next MAXALIGN would cross
>>> into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
>>> has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
>>> cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
>>> of an ABI where the four bytes past the end of this stack variable could be
>>> unreadable, which is not to claim I'm well-read on the topic.  We should fix
>>> this in due course on code hygiene grounds, but I would not back-patch it.
>>
>> Attached patch silences the "Invalid read of size n" complaints of
>> Valgrind. I agree with your general thoughts around backpatching. Note
>> that the patch addresses a distinct complaint from Kevin's, as
>> Valgrind doesn't take issue with the invalid reads past the end of
>> spgxlogPickSplit variables on the stack.
>
> Is the needless zeroing this patch introduces apt to cause a
> performance problem?
>
> This function is actually pretty wacky.  If we're stuffing bytes with
> undefined contents into the WAL record, maybe the answer isn't to
> force the contents of those bytes to be defined, but rather to elide
> them from the WAL record.

Agreed. I propose the attached, which removes the MAXALIGNs. It's not
suitable for backpatching, though, as it changes the format of the WAL
record.

- Heikki

Вложения

Re: doPickSplit stack buffer overflow in XLogInsert?

От
Andres Freund
Дата:
On 2014-05-06 13:33:01 +0300, Heikki Linnakangas wrote:
> On 03/31/2014 09:08 PM, Robert Haas wrote:
> >On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
> >>On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
> >>>The threat is that rounding the read size up to the next MAXALIGN would cross
> >>>into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
> >>>has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
> >>>cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
> >>>of an ABI where the four bytes past the end of this stack variable could be
> >>>unreadable, which is not to claim I'm well-read on the topic.  We should fix
> >>>this in due course on code hygiene grounds, but I would not back-patch it.
> >>
> >>Attached patch silences the "Invalid read of size n" complaints of
> >>Valgrind. I agree with your general thoughts around backpatching. Note
> >>that the patch addresses a distinct complaint from Kevin's, as
> >>Valgrind doesn't take issue with the invalid reads past the end of
> >>spgxlogPickSplit variables on the stack.
> >
> >Is the needless zeroing this patch introduces apt to cause a
> >performance problem?
> >
> >This function is actually pretty wacky.  If we're stuffing bytes with
> >undefined contents into the WAL record, maybe the answer isn't to
> >force the contents of those bytes to be defined, but rather to elide
> >them from the WAL record.
> 
> Agreed. I propose the attached, which removes the MAXALIGNs. It's not
> suitable for backpatching, though, as it changes the format of the WAL
> record.

Not knowing anything about spgist this looks sane to me. Do we need a
backpatchable solution given that we seem to agree that these bugs
aren't likely to cause harm?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: doPickSplit stack buffer overflow in XLogInsert?

От
Heikki Linnakangas
Дата:
On 05/06/2014 07:36 PM, Andres Freund wrote:
> On 2014-05-06 13:33:01 +0300, Heikki Linnakangas wrote:
>> On 03/31/2014 09:08 PM, Robert Haas wrote:
>>> On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>>> On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch <noah@leadboat.com> wrote:
>>>>> The threat is that rounding the read size up to the next MAXALIGN would cross
>>>>> into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
>>>>> has MAXALIGN'd size under the hood, so the excess read of "toDelete" cannot
>>>>> cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
>>>>> of an ABI where the four bytes past the end of this stack variable could be
>>>>> unreadable, which is not to claim I'm well-read on the topic.  We should fix
>>>>> this in due course on code hygiene grounds, but I would not back-patch it.
>>>>
>>>> Attached patch silences the "Invalid read of size n" complaints of
>>>> Valgrind. I agree with your general thoughts around backpatching. Note
>>>> that the patch addresses a distinct complaint from Kevin's, as
>>>> Valgrind doesn't take issue with the invalid reads past the end of
>>>> spgxlogPickSplit variables on the stack.
>>>
>>> Is the needless zeroing this patch introduces apt to cause a
>>> performance problem?
>>>
>>> This function is actually pretty wacky.  If we're stuffing bytes with
>>> undefined contents into the WAL record, maybe the answer isn't to
>>> force the contents of those bytes to be defined, but rather to elide
>>> them from the WAL record.
>>
>> Agreed. I propose the attached, which removes the MAXALIGNs. It's not
>> suitable for backpatching, though, as it changes the format of the WAL
>> record.
> y
> Not knowing anything about spgist this looks sane to me.

Since we did a catversion bump anyway, I revisited this. Committed an 
expanded version of the patch I posted earlier. I went through all the 
SP-GiST WAL record types and removed the alignment requirements of 
tuples from all of them. Most of them didn't have a problem because the 
structs happened to have suitable alignment by accident, but seems best 
to not rely on that, for the sake of consistency and robustness if the 
structs are modified later.

I ran this through my testing tool that compares page image on master 
and standby after every WAL record replay, with full_page_writes on and 
off, and found no errors.

> Do we need a
> backpatchable solution given that we seem to agree that these bugs
> aren't likely to cause harm?

I don't think we do.

- Heikki