Re: ERROR: could not open segment 1 of relation 1663/743352/743420 (target block 6407642): No such file or directory

From: Alex Hunsaker
Subject: Re: ERROR: could not open segment 1 of relation 1663/743352/743420 (target block 6407642): No such file or directory
Msg-id 34d269d41003301328p760d4cci74da558baf1576b9@mail.gmail.com
In reply to: Re: ERROR: could not open segment 1 of relation 1663/743352/743420 (target block 6407642): No such file or directory  (Mike Williams <mike.williams@comodo.com>)
List: pgsql-admin
On Tue, Mar 30, 2010 at 04:16, Mike Williams <mike.williams@comodo.com> wrote:
> Thanks Alex, good to know I've not screwed up the kernel somehow.
>
> I've been using 2.6.32 with grsecurity-2.1.14-2.6.32.9-201002231820 applied.

Looks like the first instance I had of this problem was with
2.6.31.1-rc1-grsec. I know I tried 2.6.32-grsec and various
2.6.32.X-grsecs, but all of those had this issue at some point. Currently
I'm on a mostly stock 2.6.33.1 with no problems. I have not had the
nerve to try a -grsec kernel on it again.

For reference here are the errors I got:

could not open segment 3 of relation base/4440720/8003730

COPY public.page_loads (cgi, content_length, date_created, defunct,
host, ip, page_load_id, protocol, proxy_ip, referrer, request_method,
sessionid, url, user_id, audit_tid, user_agent_id, action, server) TO
'/tmp/blah.sql';
ERROR:  invalid memory alloc request size 18446744073709551613
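
A side note on that alloc size: 18446744073709551613 is 2^64 - 3, which is
what -3 looks like when read as an unsigned 64-bit value. A length field that
came out negative would produce this kind of request, although nothing in the
notes here confirms where the value actually came from. A quick Python sketch
of the arithmetic (the helper name is made up for illustration; the 1 GB - 1
cap is PostgreSQL's MaxAllocSize):

```python
import struct

# PostgreSQL's palloc refuses any request above MaxAllocSize (1 GB - 1).
MAX_ALLOC_SIZE = 0x3FFFFFFF


def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as signed (hypothetical helper)."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


requested = 18446744073709551613  # size copied from the ERROR line

print(requested == 2**64 - 3)      # True
print(as_signed_64(requested))     # -3
print(requested > MAX_ALLOC_SIZE)  # True, hence "invalid memory alloc request"
```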


There were more "could not open segment" errors, but I seem to have lost them.

Normally I would think the above is corrupt data, but it would
sometimes work. It *always* worked on the non-grsec kernel. So
instead it smells like bad RAM. Well, it's got ECC RAM and survived
multiple runs of memtest and various versions of memtest86+. [ Yeah, I
know people, including me, have seen RAM that passes all that and is
still bad. ]

Since you are having similar problems with a -grsec kernel, it sounds like
there might be some kind of memory corruption bug with it. I would
recommend trying a stock kernel and seeing if the problem goes away.
I also think the general attitude here is that if you run crazy security
patches, you get to keep both pieces. :)

Another fact that seemed to point to bad RAM or some kind of kernel
corruption was what happened when I tried to find the bad row COPY
reported above:

SELECT count(*) from (select * from page_loads order by page_load_id
desc limit 937980) as foo;
ERROR:  could not open segment 3 of relation base/4440720/8003730
(target block 4680336): No such file or directory

SELECT count(*) from ( select * from page_loads order by page_load_id
desc limit 937970) as foo;
 count
--------
 937970

<70-79 snipped all worked>

SELECT count(*) from (select * from page_loads order by page_load_id
desc limit 937979) as foo;
 count
--------
 937979

-- Uhh, this was just broken...
SELECT count(*) from (select * from page_loads order by page_load_id
desc limit 937980) as foo;
 count
--------
 937980


I thought I had some stack traces, but they are not in my notes. I
do remember tracing through them and coming to the conclusion that it's
most likely some kind of kernel bug. (That error can only happen if we
try to open a file that does not exist, but we can only get that far
if the file existed, or some such.) Sorry, I'm a bit hazy; this was back
in September.
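
For what it's worth, the numbers in the error message can be sanity-checked.
The sketch below assumes a default build (8 kB blocks and 1 GB segment files,
so 131072 blocks per segment), which this thread does not confirm for either
server. The helper names are made up for illustration.

```python
BLCKSZ = 8192          # default PostgreSQL block size in bytes (assumed)
RELSEG_SIZE = 131072   # default blocks per 1 GB segment file (assumed)


def segment_for_block(block: int) -> int:
    """Segment number that would hold the given block (hypothetical helper)."""
    return block // RELSEG_SIZE


def segment_file(relpath: str, segno: int) -> str:
    """On-disk name of a segment: segment 0 has no suffix, later ones get .N"""
    return relpath if segno == 0 else f"{relpath}.{segno}"


# Error from the SELECT: segment 3, target block 4680336
print(segment_for_block(4680336))                  # 35
print(4680336 * BLCKSZ / 2**30)                    # byte offset in GiB, about 35.7
print(segment_file("base/4440720/8003730", 3))     # base/4440720/8003730.3

# Error from the subject line: segment 1, target block 6407642
print(segment_for_block(6407642))                  # 48
```

With those defaults, block 4680336 would sit in segment 35, yet the failure
was on segment 3; likewise block 6407642 would sit in segment 48 while the
failure was on segment 1. As far as the md.c code of that era goes, segments
are opened in order on the way to the target, so the walk stops at the first
file that does not exist. That would fit a garbage block number rather than a
file that went missing, but it is only a reading of the numbers, not something
the lost stack traces confirm.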
