Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

From Anastasia Lubennikova
Subject Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Date
Msg-id d7649878-8d73-20eb-dc60-c26ac4e495d1@postgrespro.ru
In response to Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
23.12.2016 22:54, Claudio Freire:
> On Fri, Dec 23, 2016 at 1:39 PM, Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
>> I found the reason. I configure postgres with CFLAGS="-O0" and it causes
>> Segfault on initdb.
>> It works fine and passes tests with default configure flags, but I'm pretty
>> sure that we should fix segfault before testing the feature.
>> If you need it, I'll send a core dump.
> I just ran it with CFLAGS="-O0" and it passes all checks too:
>
> CFLAGS='-O0' ./configure --enable-debug --enable-cassert
> make clean && make -j8 && make check-world
>
> A stacktrace and a thorough description of your build environment
> would be helpful to understand why it breaks on your system.

I ran configure using the following set of flags:
 ./configure --enable-tap-tests --enable-cassert --enable-debug --enable-depend CFLAGS="-O0 -g3 -fno-omit-frame-pointer"
And then ran make check. Here is the stacktrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360, vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
1417                tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
(gdb) bt
#0  0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360, vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
#1  0x0000000000693dfe in lazy_scan_heap (onerel=0x1ec2360, options=9, vacrelstats=0x1ef6e00, Irel=0x1ef7168, nindexes=2, aggressive=1 '\001')
    at vacuumlazy.c:1337
#2  0x0000000000691e66 in lazy_vacuum_rel (onerel=0x1ec2360, options=9, params=0x7ffe0f866310, bstrategy=0x1f1c4a8) at vacuumlazy.c:290
#3  0x000000000069191f in vacuum_rel (relid=1247, relation=0x0, options=9, params=0x7ffe0f866310) at vacuum.c:1418
#4  0x0000000000690122 in vacuum (options=9, relation=0x0, relid=0, params=0x7ffe0f866310, va_cols=0x0, bstrategy=0x1f1c4a8,
    isTopLevel=1 '\001') at vacuum.c:320
#5  0x000000000068fd0b in vacuum (options=-1652367447, relation=0x0, relid=3324614038, params=0x1f11bf0, va_cols=0xb59f63,
    bstrategy=0x1f1c620, isTopLevel=0 '\000') at vacuum.c:150
#6  0x0000000000852993 in standard_ProcessUtility (parsetree=0x1f07e60, queryString=0x1f07468 "VACUUM FREEZE;\n",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0xea5cc0 <debugtupDR>, completionTag=0x7ffe0f866750 "") at utility.c:669
#7  0x00000000008520da in standard_ProcessUtility (parsetree=0x401ef6cd8, queryString=0x18 <error: Cannot access memory at address 0x18>,
    context=PROCESS_UTILITY_TOPLEVEL, params=0x68, dest=0x9e5d62 <AllocSetFree+60>, completionTag=0x7ffe0f8663f0 "`~\360\001")
    at utility.c:360
#8  0x0000000000851161 in PortalRunMulti (portal=0x7ffe0f866750, isTopLevel=0 '\000', setHoldSnapshot=-39 '\331',
    dest=0x851161 <PortalRunMulti+19>, altdest=0x7ffe0f8664f0, completionTag=0x1f07e60 "\341\002") at pquery.c:1219
#9  0x0000000000851374 in PortalRunMulti (portal=0x1f0a488, isTopLevel=1 '\001', setHoldSnapshot=0 '\000', dest=0xea5cc0 <debugtupDR>,
    altdest=0xea5cc0 <debugtupDR>, completionTag=0x7ffe0f866750 "") at pquery.c:1345
#10 0x0000000000850889 in PortalRun (portal=0x1f0a488, count=9223372036854775807, isTopLevel=1 '\001', dest=0xea5cc0 <debugtupDR>,
    altdest=0xea5cc0 <debugtupDR>, completionTag=0x7ffe0f866750 "") at pquery.c:824
#11 0x000000000084a4dc in exec_simple_query (query_string=0x1f07468 "VACUUM FREEZE;\n") at postgres.c:1113
#12 0x000000000084e960 in PostgresMain (argc=10, argv=0x1e60a50, dbname=0x1e823b0 "template1", username=0x1e672a0 "anastasia")
    at postgres.c:4091
#13 0x00000000006f967e in init_locale (categoryname=0x100000000000000 <error: Cannot access memory at address 0x100000000000000>,
    category=32766, locale=0xa004692f0 <error: Cannot access memory at address 0xa004692f0>) at main.c:310
#14 0x00007f1e5f463830 in __libc_start_main (main=0x6f93e1 <main+85>, argc=10, argv=0x7ffe0f866a78, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe0f866a68) at ../csu/libc-start.c:291
#15 0x0000000000469319 in _start ()

The core file is quite big, so I didn't attach it to the mail. You can download it here: core dump file.

Here are some notes about the first patch:

1. prefetchBlkno = blkno & ~0x1f;
    prefetchBlkno = (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;

I don't understand why we need these tricks. How does this differ from:
prefetchBlkno = (blkno > 32) ? blkno - 32 : 0;
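
To make the comparison in note 1 concrete, here is a minimal standalone sketch of both expressions. It is not taken from the patch: the BlockNumber typedef is a local stand-in for the PostgreSQL type, and the function names prefetch_start_aligned, prefetch_start_simple and print_comparison are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-in for PostgreSQL's BlockNumber (assumption for this sketch). */
typedef uint32_t BlockNumber;

/*
 * Variant quoted from the patch: clear the low 5 bits (round down to a
 * multiple of 32), then step back one 32-block window, clamping at 0.
 */
BlockNumber
prefetch_start_aligned(BlockNumber blkno)
{
    BlockNumber prefetchBlkno = blkno & ~(BlockNumber) 0x1f;

    prefetchBlkno = (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;

    /* The result is always a multiple of 32. */
    assert((prefetchBlkno & 0x1f) == 0);
    return prefetchBlkno;
}

/* Simpler variant proposed in the review: step back 32 blocks, clamping at 0. */
BlockNumber
prefetch_start_simple(BlockNumber blkno)
{
    return (blkno > 32) ? blkno - 32 : 0;
}

/* Print both results side by side for a few sample block numbers. */
void
print_comparison(void)
{
    const BlockNumber samples[] = {0, 31, 32, 33, 63, 64, 100};
    size_t      i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        printf("blkno=%u aligned=%u simple=%u\n",
               (unsigned) samples[i],
               (unsigned) prefetch_start_aligned(samples[i]),
               (unsigned) prefetch_start_simple(samples[i]));
}
```

For blkno = 100 the first variant yields 64 and the second yields 68; for blkno = 33 they yield 0 and 1. So the two are not equivalent: the masked version always starts on a 32-block boundary, while the simple version starts exactly 32 blocks before blkno. Whether that alignment matters for the prefetch logic is the open question.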

2. Why do we decrease prefetchBlkno twice?

Here:
+    prefetchBlkno = (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;
And here:
if (prefetchBlkno >= 32)
+                prefetchBlkno -= 32;
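
The effect of the two decrements in note 2 can be sketched as follows. This assumes both statements run back to back before the first prefetch is issued, which the quoted fragments alone do not establish; the surrounding control flow in the patch may differ. BlockNumber is again a local stand-in, and initial_step, loop_step and both_steps are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (assumption for this sketch). */
typedef uint32_t BlockNumber;

/* First quoted statement: step back 32 blocks, clamping at 0. */
BlockNumber
initial_step(BlockNumber prefetchBlkno)
{
    return (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;
}

/* Second quoted statement: step back 32 blocks only if there is room. */
BlockNumber
loop_step(BlockNumber prefetchBlkno)
{
    if (prefetchBlkno >= 32)
        prefetchBlkno -= 32;
    return prefetchBlkno;
}

/* Both statements applied in sequence (hypothetical ordering). */
BlockNumber
both_steps(BlockNumber prefetchBlkno)
{
    BlockNumber result = loop_step(initial_step(prefetchBlkno));

    /* Stepping back can never move the start forward. */
    assert(result <= prefetchBlkno);
    return result;
}
```

Under that assumption, a starting value of 96 ends up at 32 (two windows back) rather than 64, and a starting value of 64 ends up at 0. The two statements also use different comparisons (> versus >=), although at exactly 32 both produce 0.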
   

I'll inspect the second patch in a few days and write up my questions about it.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
