Re: Memory Errors

From Sam Nelson
Subject Re: Memory Errors
Date
Msg-id AANLkTim3PnJdoh1K_cReRz3sRJjBgwUZmnR5QDOgZH8h@mail.gmail.com
In response to Re: Memory Errors  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Memory Errors  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-general
It figures I'd have an idea right after posting to the mailing list.

Yeah, running COPY foo TO stdout; returns some rows before erroring out, so I ran copy (select * from foo order by id asc) to stdout; to see whether the failure was tied to a single row or to something else.

I got the id of the last row the copy to command was able to grab normally and tried to figure out the next id.  The following started to make me think along the lines of some kinda bad corruption (even before getting responses that agreed with that):

Assuming that the last id copied was 1500:

1) select * from foo where id = (select min(id) from foo where id > 1500);
Results in 0 rows

2) select min(id) from foo where id > 1500;
Results in, for example, 200000

3) select max(id) from foo where id > 1500;
Results in, for example, 90000 (a much lower number than returned by min)

4) select id from foo where id > 1500 order by id asc limit 10;
Results in (for example):

200000
202000
210273
220980
15005
15102
15104
15110
15111
15113

So ... yes, it seems that those four ids (200000 through 220980 in this example, which sort ahead of smaller values) are somehow part of the problem.
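One way to read results 1 through 4 is sketched below. It assumes, and this is a guess rather than something checked on the affected server, that the queries were answered from a btree index on id whose entries are no longer in sorted order. In that case a forward scan returns whatever comes first in the index, a backward scan returns whatever comes last, and an equality probe that trusts the ordering can miss entries that are physically present. The flat list is only a toy stand-in for a real btree, and the values 1400 and 1500 around the example ids are made up.

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for index entries in their stored order. The four large
# values sit where ids just above 1500 should be, so the list is not sorted.
index = [1400, 1500, 200000, 202000, 210273, 220980,
         15005, 15102, 15104, 15110, 15111, 15113, 90000]

def entries_after(lo):
    """Binary-search for the start position, then read in stored order."""
    return index[bisect_right(index, lo):]

def probe(key):
    """Equality lookup that trusts the (broken) ordering."""
    pos = bisect_left(index, key)
    return pos < len(index) and index[pos] == key

after = entries_after(1500)
print(after[0])         # "min": 200000 (first entry of a forward scan)
print(after[-1])        # "max": 90000 (last entry, as a backward scan sees it)
print(after[:10])       # same ten ids, in the same order, as query 4
print(probe(after[0]))  # False: the "min" id is not found by an equality probe
```

If this reading is right, the damage may be in the index rather than (or as well as) in the table rows, which is worth keeping in mind before deleting anything.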

They're on Amazon EC2 boxes (yeah, we're not too fond of the EC2 boxes either), so memtest isn't available, but no new corruption has cropped up since they stopped killing the waiting queries. I just double checked: they were getting corrupted rows constantly, and we haven't gotten one since that script stopped killing queries.

We're going to have them attempt to delete the rows with those ids (even though the rows don't exist). If that fails, we're going to run copy (select * from foo where id not in (<list>)) to file; then drop table foo; then create table foo; and finally copy foo from file;. I'll try to remember to write back with whether or not any of those things worked.
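Written out in order, the fallback plan looks roughly like the statements this small helper produces. The helper only builds SQL text and runs nothing. Its name, the file path, and the CREATE TABLE text are placeholders for illustration, not details from the affected system, and the id list uses the example values from above. Note that COPY to or from a file path reads and writes on the database server, so it needs the corresponding server-side privileges.

```python
def salvage_statements(table, bad_ids, path, create_sql):
    """Return the dump / drop / recreate / reload statements in order."""
    # int() guards against anything but plain integer ids ending up in the SQL.
    id_list = ", ".join(str(int(i)) for i in bad_ids)
    return [
        f"COPY (SELECT * FROM {table} WHERE id NOT IN ({id_list})) TO '{path}';",
        f"DROP TABLE {table};",
        create_sql,
        f"COPY {table} FROM '{path}';",
    ]

statements = salvage_statements(
    "foo",
    [200000, 202000, 210273, 220980],   # example ids from the post
    "/tmp/foo_salvage.copy",            # placeholder path
    "CREATE TABLE foo (/* original column definitions go here */);",
)
for stmt in statements:
    print(stmt)
```

Checking that the dump file exists and has a plausible row count before the DROP TABLE step is the obvious safeguard.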

On Wed, Sep 8, 2010 at 1:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Sam Nelson <samn@consistentstate.com> writes:
> pg_dump: Error message from server: ERROR:  invalid memory alloc request
> size 18446744073709551613
> pg_dump: The command was: COPY public.foo (<columns>) TO stdout;

> That seems like an incredibly large memory allocation request - it shouldn't
> be possible for the table to really be that large, should it?  Any idea what
> may be wrong if it's actually trying to allocate that much memory for a copy
> command?

What that looks like is data corruption; specifically, a bogus length
word for a variable-length field.

                       regards, tom lane
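The size in the error message fits that explanation: 18446744073709551613 is 2^64 - 3, which is what a small negative number looks like when it is reinterpreted as an unsigned 64-bit size. A quick arithmetic check in Python (illustrative only; it does not identify which row or field holds the bad length word):

```python
# The size reported in the error message.
requested = 18446744073709551613

def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return value - (1 << 64) if value >= (1 << 63) else value

print(as_signed_64(requested))      # -3
print(requested == (1 << 64) - 3)   # True
```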
