Discussion: BUG #15144: *** glibc detected *** postgres: postgres smsconsole[local] SELECT: double free or corruption (!pre


BUG #15144: *** glibc detected *** postgres: postgres smsconsole[local] SELECT: double free or corruption (!pre

From:
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15144
Logged by:          Vitaly Voronov
Email address:      wizard_1024@tut.by
PostgreSQL version: 9.6.8
Operating system:   CentOS 6.9
Description:

Hello, 

We have a problem at our master server (for the second time).
After the first occurrence, we updated CentOS to the latest version (6.9),
but today we hit this bug:
*** glibc detected *** postgres: postgres smsconsole [local] SELECT: double free or corruption (!prev): 0x00000000022529e0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75dee)[0x7f6b19433dee]
/lib64/libc.so.6(+0x78c80)[0x7f6b19436c80]
postgres: postgres smsconsole [local] SELECT(tuplestore_end+0x17)[0x808887]
postgres: postgres smsconsole [local] SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
postgres: postgres smsconsole [local] SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]
postgres: postgres smsconsole [local] SELECT(PortalCleanup+0x9e)[0x593d6e]
postgres: postgres smsconsole [local] SELECT(PortalDrop+0x2a)[0x7fcaca]
postgres: postgres smsconsole [local] SELECT[0x6e0eb2]
postgres: postgres smsconsole [local] SELECT(PostgresMain+0xdcc)[0x6e256c]
postgres: postgres smsconsole [local] SELECT(PostmasterMain+0x1875)[0x6823e5]
postgres: postgres smsconsole [local] SELECT(main+0x7a8)[0x609fe8]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f6b193dcd1d]
postgres: postgres smsconsole [local] SELECT[0x46c039]
======= Memory map: ========
00400000-00a14000 r-xp 00000000 08:03 3938726
/usr/pgsql-9.6/bin/postgres
00c14000-00c21000 rw-p 00614000 08:03 3938726
/usr/pgsql-9.6/bin/postgres
00c21000-00c72000 rw-p 00000000 00:00 0
0211e000-02183000 rw-p 00000000 00:00 0
02183000-0228f000 rw-p 00000000 00:00 0
3799000000-3799016000 r-xp 00000000 08:03 26214446
/lib64/libgcc_s-4.4.7-20120601.so.1
3799016000-3799215000 ---p 00016000 08:03 26214446
/lib64/libgcc_s-4.4.7-20120601.so.1
3799215000-3799216000 rw-p 00015000 08:03 26214446
/lib64/libgcc_s-4.4.7-20120601.so.1
3d56000000-3d56015000 r-xp 00000000 08:03 26214420
/lib64/libz.so.1.2.3 (deleted)
3d56015000-3d56214000 ---p 00015000 08:03 26214420
/lib64/libz.so.1.2.3 (deleted)
3d56214000-3d56215000 r--p 00014000 08:03 26214420
/lib64/libz.so.1.2.3 (deleted)
3d56215000-3d56216000 rw-p 00015000 08:03 26214420
/lib64/libz.so.1.2.3 (deleted)
3d56800000-3d5681d000 r-xp 00000000 08:03 26214488
/lib64/libselinux.so.1.#prelink#.znRAKV (deleted)
3d5681d000-3d56a1c000 ---p 0001d000 08:03 26214488
/lib64/libselinux.so.1.#prelink#.znRAKV (deleted)
3d56a1c000-3d56a1d000 r--p 0001c000 08:03 26214488
/lib64/libselinux.so.1.#prelink#.znRAKV (deleted)
3d56a1d000-3d56a1e000 rw-p 0001d000 08:03 26214488
/lib64/libselinux.so.1.#prelink#.znRAKV (deleted)
3d56a1e000-3d56a1f000 rw-p 00000000 00:00 0
3d57c00000-3d57c02000 r-xp 00000000 08:03 26214456
/lib64/libfreebl3.so (deleted)
3d57c02000-3d57e01000 ---p 00002000 08:03 26214456
/lib64/libfreebl3.so (deleted)
3d57e01000-3d57e02000 r--p 00001000 08:03 26214456
/lib64/libfreebl3.so (deleted)
3d57e02000-3d57e03000 rw-p 00002000 08:03 26214456
/lib64/libfreebl3.so (deleted)
3d5b000000-3d5b002000 r-xp 00000000 08:03 26214486
/lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted)
3d5b002000-3d5b201000 ---p 00002000 08:03 26214486
/lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted)
3d5b201000-3d5b202000 r--p 00001000 08:03 26214486
/lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted)
3d5b202000-3d5b203000 rw-p 00002000 08:03 26214486
/lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted)
3d5e000000-3d5e149000 r-xp 00000000 08:03 3934405
/usr/lib64/libxml2.so.2.7.6.#prelink#.b6SBFV (deleted)
3d5e149000-3d5e348000 ---p 00149000 08:03 3934405
/usr/lib64/libxml2.so.2.7.6.#prelink#.b6SBFV (deleted)
3d5e348000-3d5e352000 rw-p 00148000 08:03 3934405
/usr/lib64/libxml2.so.2.7.6.#prelink#.b6SBFV (deleted)
3d5e352000-3d5e353000 rw-p 00000000 00:00 0
3d5ec00000-3d5ec19000 r-xp 00000000 08:03 3938131
/usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted)
3d5ec19000-3d5ee18000 ---p 00019000 08:03 3938131
/usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted)
3d5ee18000-3d5ee19000 r--p 00018000 08:03 3938131
/usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted)
3d5ee19000-3d5ee1a000 rw-p 00019000 08:03 3938131
/usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted)
7f67f4000000-7f67f4021000 rw-p 00000000 00:00 0
7f67f4021000-7f67f8000000 ---p 00000000 00:00 0
7f67f8c24000-7f67f9599000 rw-p 00000000 00:00 0
7f67f9599000-7f67fc59f000 rw-p 00000000 00:00 0
7f67fc59f000-7f67fdda2000 rw-p 00000000 00:00 0
7f67fe1a3000-7f67fe9a4000 rw-p 00000000 00:00 0
7f67feba5000-7f67ff3a6000 rw-p 00000000 00:00 0
7f67ff4a7000-7f67ff8a8000 rw-p 00000000 00:00 0
7f67ff929000-7f67ffb2a000 rw-p 00000000 00:00 0
7f67ffb6b000-7f67ffc6c000 rw-p 00000000 00:00 0
7f6802c6d000-7f6802c6f000 r-xp 00000000 08:03 3938903
/usr/pgsql-9.6/lib/pg_buffercache.so
7f6802c6f000-7f6802e6e000 ---p 00002000 08:03 3938903
/usr/pgsql-9.6/lib/pg_buffercache.so
7f6802e6e000-7f6802e6f000 rw-p 00001000 08:03 3938903
/usr/pgsql-9.6/lib/pg_buffercache.so
7f6802e6f000-7f6802e7c000 r-xp 00000000 08:03 26214806
/lib64/libnss_files-2.12.so
7f6802e7c000-7f680307b000 ---p 0000d000 08:03 26214806
/lib64/libnss_files-2.12.so
7f680307b000-7f680307c000 r--p 0000c000 08:03 26214806
/lib64/libnss_files-2.12.so
7f680307c000-7f680307d000 rw-p 0000d000 08:03 26214806
/lib64/libnss_files-2.12.so
7f680307d000-7f6b16dc3000 rw-s 00000000 00:04 22567
/dev/zero (deleted)
7f6b16dc3000-7f6b16dcb000 r-xp 00000000 08:03 3938906
/usr/pgsql-9.6/lib/pg_stat_statements.so
7f6b16dcb000-7f6b16fca000 ---p 00008000 08:03 3938906
/usr/pgsql-9.6/lib/pg_stat_statements.so
7f6b16fca000-7f6b16fcb000 rw-p 00007000 08:03 3938906
/usr/pgsql-9.6/lib/pg_stat_statements.so
7f6b16fcb000-7f6b17004000 r-xp 00000000 08:03 26214950
/lib64/libnspr4.so (deleted)
7f6b17004000-7f6b17204000 ---p 00039000 08:03 26214950
/lib64/libnspr4.so (deleted)
7f6b17204000-7f6b17205000 r--p 00039000 08:03 26214950
/lib64/libnspr4.so (deleted)
7f6b17205000-7f6b17207000 rw-p 0003a000 08:03 26214950
/lib64/libnspr4.so (deleted)
7f6b17207000-7f6b17209000 rw-p 00000000 00:00 0
7f6b17209000-7f6b1720d000 r-xp 00000000 08:03 26214951
/lib64/libplc4.so (deleted)
7f6b1720d000-7f6b1740c000 ---p 00004000 08:03 26214951
/lib64/libplc4.so (deleted)
7f6b1740c000-7f6b1740d000 r--p 00003000 08:03 26214951
/lib64/libplc4.so (deleted)
7f6b1740d000-7f6b1740e000 rw-p 00004000 08:03 26214951
/lib64/libplc4.so (deleted)
7f6b1740e000-7f6b17411000 r-xp 00000000 08:03 26214952
/lib64/libplds4.so (deleted)
7f6b17411000-7f6b17610000 ---p 00003000 08:03 26214952
/lib64/libplds4.so (deleted)
7f6b17610000-7f6b17611000 r--p 00002000 08:03 26214952
/lib64/libplds4.so (deleted)
7f6b17611000-7f6b17612000 rw-p 00003000 08:03 26214952
/lib64/libplds4.so (deleted)
7f6b17612000-7f6b17638000 r-xp 00000000 08:03 3933172
/usr/lib64/libnssutil3.so (deleted)
7f6b17638000-7f6b17837000 ---p 00026000 08:03 3933172
/usr/lib64/libnssutil3.so (deleted)
7f6b17837000-7f6b1783e000 r--p 00025000 08:03 3933172
/usr/lib64/libnssutil3.so (deleted)
7f6b1783e000-7f6b1783f000 rw-p 0002c000 08:03 3933172
/usr/lib64/libnssutil3.so (deleted)
7f6b1783f000-7f6b17979000 r-xp 00000000 08:03 3934884
/usr/lib64/libnss3.so (deleted)
7f6b17979000-7f6b17b78000 ---p 0013a000 08:03 3934884
/usr/lib64/libnss3.so (deleted)
7f6b17b78000-7f6b17b7e000 r--p 00139000 08:03 3934884
/usr/lib64/libnss3.so (deleted)
7f6b17b7e000-7f6b17b80000 rw-p 0013f000 08:03 3934884
/usr/lib64/libnss3.so (deleted)
7f6b17b80000-7f6b17b82000 rw-p 00000000 00:00 0
7f6b17b82000-7f6b17baa000 r-xp 00000000 08:03 3939143
/usr/lib64/libsmime3.so (deleted)
7f6b17baa000-7f6b17da9000 ---p 00028000 08:03 3939143
/usr/lib64/libsmime3.so (deleted)
7f6b17da9000-7f6b17dad000 r--p 00027000 08:03 3939143
/usr/lib64/libsmime3.so (deleted)
7f6b17dad000-7f6b17dae000 rw-p 0002b000 08:03 3939143
/usr/lib64/libsmime3.so (deleted)
7f6b17dae000-7f6b17df5000 r-xp 00000000 08:03 3939144
/usr/lib64/libssl3.so (deleted)
7f6b17df5000-7f6b17ff5000 ---p 00047000 08:03 3939144
/usr/lib64/libssl3.so (deleted)
7f6b17ff5000-7f6b17ff9000 r--p 00047000 08:03 3939144
/usr/lib64/libssl3.so (deleted)
7f6b17ff9000-7f6b17ffa000 rw-p 0004b000 08:03 3939144
/usr/lib64/libssl3.so (deleted)
7f6b17ffa000-7f6b17ffb000 rw-p 00000000 00:00 0
7f6b17ffb000-7f6b18009000 r-xp 00000000 08:03 26214830
/lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted)
7f6b18009000-7f6b18208000 ---p 0000e000 08:03 26214830
/lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted)
7f6b18208000-7f6b18209000 r--p 0000d000 08:03 26214830
/lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted)
7f6b18209000-7f6b1820a000 rw-p 0000e000 08:03 26214830
/lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted)
7f6b1820a000-7f6b18221000 r-xp 00000000 08:03 26214436
/lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted)
7f6b18221000-7f6b18421000 ---p 00017000 08:03 26214436
/lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted)
7f6b18421000-7f6b18422000 r--p 00017000 08:03 26214436
/lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted)
7f6b18422000-7f6b18423000 rw-p 00018000 08:03 26214436
/lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted)
7f6b18423000-7f6b18427000 rw-p 00000000 00:00 0
7f6b18427000-7f6b1843d000 r-xp 00000000 08:03 26214808
/lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted)
7f6b1843d000-7f6b1863d000 ---p 00016000 08:03 26214808
/lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted)
7f6b1863d000-7f6b1863e000 r--p 00016000 08:03 26214808
/lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted)
7f6b1863e000-7f6b1863f000 rw-p 00017000 08:03 26214808
/lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted)
7f6b1863f000-7f6b18641000 rw-p 00000000 00:00 0
7f6b18641000-7f6b1864b000 r-xp 00000000 08:03 26214829
/lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted)
7f6b1864b000-7f6b1884a000 ---p 0000a000 08:03 26214829
/lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted)
7f6b1884a000-7f6b1884b000 r--p 00009000 08:03 26214829
/lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted)
7f6b1884b000-7f6b1884c000 rw-p 0000a000 08:03 26214829
/lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted)
7f6b1884c000-7f6b18875000 r-xp 00000000 08:03 26214619
/lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted)
7f6b18875000-7f6b18a75000 ---p 00029000 08:03 26214619
/lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted)
7f6b18a75000-7f6b18a76000 r--p 00029000 08:03 26214619
/lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted)
7f6b18a76000-7f6b18a77000 rw-p 0002a000 08:03 26214619
/lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted)
7f6b18a77000-7f6b18a78000 rw-p 00000000 00:00 0
7f6b18a78000-7f6b18a7b000 r-xp 00000000 08:03 26214466
/lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted)
7f6b18a7b000-7f6b18c7a000 ---p 00003000 08:03 26214466
/lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted)
7f6b18c7a000-7f6b18c7b000 r--p 00002000 08:03 26214466
/lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted)
7f6b18c7b000-7f6b18c7c000 rw-p 00003000 08:03 26214466
/lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted)
7f6b18c7c000-7f6b18d58000 r-xp 00000000 08:03 26214828
/lib64/libkrb5.so.3.3 (deleted)
7f6b18d58000-7f6b18f57000 ---p 000dc000 08:03 26214828
/lib64/libkrb5.so.3.3 (deleted)
7f6b18f57000-7f6b18f61000 r--p 000db000 08:03 26214828
/lib64/libkrb5.so.3.3 (deleted)
7f6b18f61000-7f6b18f63000 rw-p 000e5000 08:03 26214828
/lib64/libkrb5.so.3.3 (deleted)
7f6b18f63000-7f6b18f6a000 r-xp 00000000 08:03 26214416
/lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted)
7f6b18f6a000-7f6b1916a000 ---p 00007000 08:03 26214416
/lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted)
7f6b1916a000-7f6b1916b000 r--p 00007000 08:03 26214416
/lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted)
7f6b1916b000-7f6b1916c000 rw-p 00008000 08:03 26214416
/lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted)
7f6b1916c000-7f6b1919a000 rw-p 00000000 00:00 0
7f6b1919a000-7f6b191b2000 r-xp 00000000 08:03 26214814
/lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted)
7f6b191b2000-7f6b193b1000 ---p 00018000 08:03 26214814
/lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted)
7f6b193b1000-7f6b193b3000 r--p 00017000 08:03 26214814
/lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted)
7f6b193b3000-7f6b193be000 rw-p 00019000 08:03 26214814
/lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted)
7f6b193be000-7f6b19548000 r-xp 00000000 08:03 26214412
/lib64/libc-2.12.so (deleted)
7f6b19548000-7f6b19748000 ---p 0018a000 08:03 26214412
/lib64/libc-2.12.so (deleted)
7f6b19748000-7f6b1974c000 r--p 0018a000 08:03 26214412
/lib64/libc-2.12.so (deleted)
7f6b1974c000-7f6b1974e000 rw-p 0018e000 08:03 26214412
/lib64/libc-2.12.so (deleted)
7f6b1974e000-7f6b19752000 rw-p 00000000 00:00 0
7f6b19752000-7f6b197a0000 r-xp 00000000 08:03 26214831
/lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted)
7f6b197a0000-7f6b1999f000 ---p 0004e000 08:03 26214831
/lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted)
7f6b1999f000-7f6b199a1000 r--p 0004d000 08:03 26214831
/lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted)
7f6b199a1000-7f6b199a3000 rw-p 0004f000 08:03 26214831
/lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted)
7f6b199a3000-7f6b19a26000 r-xp 00000000 08:03 26214803
/lib64/libm-2.12.so (deleted)
7f6b19a26000-7f6b19c25000 ---p 00083000 08:03 26214803
/lib64/libm-2.12.so (deleted)
7f6b19c25000-7f6b19c26000 r--p 00082000 08:03 26214803
/lib64/libm-2.12.so (deleted)
7f6b19c26000-7f6b19c27000 rw-p 00083000 08:03 26214803
/lib64/libm-2.12.so (deleted)
7f6b19c27000-7f6b19c29000 r-xp 00000000 08:03 26214802
/lib64/libdl-2.12.so (deleted)
7f6b19c29000-7f6b19e29000 ---p 00002000 08:03 26214802
/lib64/libdl-2.12.so (deleted)
7f6b19e29000-7f6b19e2a000 r--p 00002000 08:03 26214802
/lib64/libdl-2.12.so (deleted)
7f6b19e2a000-7f6b19e2b000 rw-p 00003000 08:03 26214802
/lib64/libdl-2.12.so (deleted)
7f6b19e2b000-7f6b19e32000 r-xp 00000000 08:03 26214809
/lib64/librt-2.12.so (deleted)
7f6b19e32000-7f6b1a031000 ---p 00007000 08:03 26214809
/lib64/librt-2.12.so (deleted)
7f6b1a031000-7f6b1a032000 r--p 00006000 08:03 26214809
/lib64/librt-2.12.so (deleted)
7f6b1a032000-7f6b1a033000 rw-p 00007000 08:03 26214809
/lib64/librt-2.12.so (deleted)
7f6b1a033000-7f6b1a074000 r-xp 00000000 08:03 26214826
/lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted)
7f6b1a074000-7f6b1a274000 ---p 00041000 08:03 26214826
/lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted)
7f6b1a274000-7f6b1a275000 r--p 00041000 08:03 26214826
/lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted)
7f6b1a275000-7f6b1a277000 rw-p 00042000 08:03 26214826
/lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted)
7f6b1a277000-7f6b1a431000 r-xp 00000000 08:03 3934773
/usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted)
7f6b1a431000-7f6b1a631000 ---p 001ba000 08:03 3934773
/usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted)
7f6b1a631000-7f6b1a64c000 r--p 001ba000 08:03 3934773
/usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted)
7f6b1a64c000-7f6b1a658000 rw-p 001d5000 08:03 3934773
/usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted)
7f6b1a658000-7f6b1a65c000 rw-p 00000000 00:00 0
7f6b1a65c000-7f6b1a6be000 r-xp 00000000 08:03 3938254
/usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted)
7f6b1a6be000-7f6b1a8be000 ---p 00062000 08:03 3938254
/usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted)
7f6b1a8be000-7f6b1a8c2000 r--p 00062000 08:03 3938254
/usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted)
7f6b1a8c2000-7f6b1a8c8000 rw-p 00066000 08:03 3938254
/usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted)
7f6b1a8c8000-7f6b1a8d4000 r-xp 00000000 08:03 26214531
/lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted)
7f6b1a8d4000-7f6b1aad4000 ---p 0000c000 08:03 26214531
/lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted)
7f6b1aad4000-7f6b1aad5000 r--p 0000c000 08:03 26214531
/lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted)
7f6b1aad5000-7f6b1aad6000 rw-p 0000d000 08:03 26214531
/lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted)
7f6b1aad6000-7f6b1aaf6000 r-xp 00000000 08:03 26214468
/lib64/ld-2.12.so (deleted)
7f6b1ab95000-7f6b1acda000 rw-p 00000000 00:00 0
7f6b1acda000-7f6b1aceb000 rw-p 00000000 00:00 0
7f6b1acf1000-7f6b1acf2000 rw-p 00000000 00:00 0
7f6b1acf2000-7f6b1acf4000 rw-s 00000000 00:10 22574
/dev/shm/PostgreSQL.1602759649
7f6b1acf4000-7f6b1acf5000 rw-s 00000000 00:04 32769
/SYSV0052e2c1 (deleted)
7f6b1acf5000-7f6b1acf6000 rw-p 00000000 00:00 0
7f6b1acf6000-7f6b1acf7000 r--p 00020000 08:03 26214468
/lib64/ld-2.12.so (deleted)
7f6b1acf7000-7f6b1acf8000 rw-p 00021000 08:03 26214468
/lib64/ld-2.12.so (deleted)
7f6b1acf8000-7f6b1acf9000 rw-p 00000000 00:00 0
7ffe33948000-7ffe3395d000 rw-p 00000000 00:00 0
[stack]
7ffe339f9000-7ffe339fa000 r-xp 00000000 00:00 0
[vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
[vsyscall]
< 2018-04-06 06:02:23.225 JST > LOG:  server process (PID 9045) was terminated by signal 6: Aborted


On Thu, Apr 5, 2018 at 3:39 PM, PG Bug reporting form
<noreply@postgresql.org> wrote:
> We have a problem at our master server (for the second time).
> After the first occurrence, we updated CentOS to the latest version (6.9),
> but today we hit this bug:
> *** glibc detected *** postgres: postgres smsconsole [local] SELECT: double free or corruption (!prev): 0x00000000022529e0 ***

Can you get a full stack trace from a coredump?

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

It would be particularly helpful if you were able to collect a
coredump, and run "p *debug_query_string" from GDB.

It would also be nice to be able to get the query string from some
other source, such as the server log. Perhaps it can be correlated to
something?

> ======= Backtrace: =========
> /lib64/libc.so.6(+0x75dee)[0x7f6b19433dee]
> /lib64/libc.so.6(+0x78c80)[0x7f6b19436c80]
> postgres: postgres smsconsole [local] SELECT(tuplestore_end+0x17)[0x808887]
> postgres: postgres smsconsole [local] SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
> postgres: postgres smsconsole [local] SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]

Offhand, I suspect that this could be a bug that is analogous to the
one just fixed within tuplesort, by
c2d4eb1b1fa252fd8c407e1519308017a18afed1. There is a fairly long
history of these kinds of bugs, including one or two in tuplestore
that I can recall from memory.

-- 
Peter Geoghegan


Hello,

06.04.2018, 02:18, "Peter Geoghegan" <pg@bowt.ie>:
> On Thu, Apr 5, 2018 at 3:39 PM, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>>  We have a problem at our master server (for the second time).
>>  After the first occurrence, we updated CentOS to the latest version (6.9),
>>  but today we hit this bug:
>>  *** glibc detected *** postgres: postgres smsconsole [local] SELECT: double free or corruption (!prev): 0x00000000022529e0 ***
>
> Can you get a full stack trace from a coredump?
>
> https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
>
> It would be particularly helpful if you were able to collect a
> coredump, and run "p *debug_query_string" from GDB.
>
> It would also be nice to be able to get the query string from some
> other source, such as the server log. Perhaps it can be correlated to
> something?
Sorry, I can't do this, because (1) this is a production server, and (2) it has since been re-initialized as a replica.
We use 2 servers, and the server that crashed is now running as the secondary.
We use pgpool for switching between servers.
I posted the full entry from the Postgres logs.
All other logs are clean.
>
>>  ======= Backtrace: =========
>>  /lib64/libc.so.6(+0x75dee)[0x7f6b19433dee]
>>  /lib64/libc.so.6(+0x78c80)[0x7f6b19436c80]
>>  postgres: postgres smsconsole [local] SELECT(tuplestore_end+0x17)[0x808887]
>>  postgres: postgres smsconsole [local] SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
>>  postgres: postgres smsconsole [local] SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]
>
> Offhand, I suspect that this could be a bug that is analogous to the
> one just fixed within tuplesort, by
> c2d4eb1b1fa252fd8c407e1519308017a18afed1. There is a fairly long
> history of these kinds of bugs, including one or two in tuplestore
> that I can recall from memory.
We have Zabbix monitoring, which actively uses pg_stat_* and pg_buffercache.
Could this cause such a problem?

Also, can you tell us when these fixes will be released?
>
> --
> Peter Geoghegan

Thanks for your answers!


On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> Sorry, I can't do this, because (1) this is a production server, and (2) it has since been re-initialized as a replica.
> We use 2 servers, and the server that crashed is now running as the secondary.
> We use pgpool for switching between servers.
> I posted the full entry from the Postgres logs.
> All other logs are clean.

I won't be able to help you without this information.

-- 
Peter Geoghegan


Hello, Peter.

We caught a crash at our secondary server.
Here is the full stack trace from the crash dump:
(gdb) bt full
#0  0x0000003d22232495 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000003d22233c75 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000003d222703a7 in __libc_message () from /lib64/libc.so.6
No symbol table info available.
#3  0x0000003d22275dee in malloc_printerr () from /lib64/libc.so.6
No symbol table info available.
#4  0x0000003d22278c80 in _int_free () from /lib64/libc.so.6
No symbol table info available.
#5  0x0000000000808887 in tuplestore_end (state=0x1794898) at tuplestore.c:455
        i = <value optimized out>
#6  0x00000000005e94e5 in ExecEndFunctionScan (node=0x178e3d8) at nodeFunctionscan.c:550
        fs = 0x178e388
        i = <value optimized out>
#7  0x00000000005cbaae in ExecEndPlan (queryDesc=0x16c6a38) at execMain.c:1451
        resultRelInfo = <value optimized out>
        i = <value optimized out>
        l = <value optimized out>
#8  standard_ExecutorEnd (queryDesc=0x16c6a38) at execMain.c:468
        estate = 0x178e278
        oldcontext = 0x1677e48
#9  0x0000000000593d6e in PortalCleanup (portal=0x16c3138) at portalcmds.c:280
        save_exception_stack = 0x7fff07538610
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {23867704, 2475436528305129399, 24351312, 24686200, 2, 24686152,
            -2475701272642631753, 2475437018370099127},
            __mask_was_saved = 0, __saved_mask = {__val = {24686152, 15971042801079502775, 2475436856846141367, 0,
            8280655, 0, 24149016, 9856140, 2, 1,
            23867704, 140733316302126, 88, 23867704, 24351312, 9377246}}}}
        saveResourceOwner = 0x1678878
        queryDesc = 0x16c6a38
#10 0x00000000007fcaca in PortalDrop (portal=0x16c3138, isTopCommit=0 '\000') at portalmem.c:510
        __func__ = "PortalDrop"
#11 0x00000000006e0eb2 in exec_simple_query (query_string=0x1738468 "select count(*) from pg_buffercache where isdirty") at postgres.c:1095
        parsetree = 0x1739140
        portal = 0x16c3138
        snapshot_set = <value optimized out>
        commandTag = <value optimized out>
        completionTag = "SELECT 1\000\000\000\000\000\000\000\000h\204s\001\000\000\000\000h\204s\001\000\000\000\000m
pg_buf\220!\002\"=\000\000\000\336\000\000\000\000\000\000\000\205\312~\000\000\000\000"
        querytree_list = <value optimized out>
        plantree_list = 0x178ae48
        receiver = 0x178ae78
        format = 0
        dest = DestRemote
        oldcontext = 0x1677e48
---Type <return> to continue, or q <return> to quit---
        parsetree_list = 0x1739270
        parsetree_item = 0x1739250
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        isTopLevel = 1 '\001'
        msec_str = "\220\205S\a\377\177\000\000\000\207S\a\377\177\000\000h\204s\001", '\000' <repeats 11 times>
        __func__ = "exec_simple_query"
#12 0x00000000006e256c in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x16c8f08
"smsconsole",
    username=<value optimized out>) at postgres.c:4072
        query_string = 0x1738468 "select count(*) from pg_buffercache where isdirty"
        firstchar = 81
        input_message = {data = 0x1738468 "select count(*) from pg_buffercache where isdirty", len = 50, maxlen = 1024, cursor = 50}
        local_sigjmp_buf = {{__jmpbuf = {140733316302560, 2475436528304998327, 1, 1523890764, -9187201950435737471, 0,
            -2475701272437110857, 2475436853937391543}, __mask_was_saved = 1, __saved_mask = {__val = {0, 0, 4294967295, 12667896, 1,
            12667240, 0, 9259542123273814145, 0, 0,
            1024, 23891912, 12732800, 1523890764, 8372124, 140733316302592}}}}
        send_ready_for_query = 0 '\000'
        disable_idle_in_transaction_timeout = 0 '\000'
        __func__ = "PostgresMain"
#13 0x00000000006823e5 in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4342
        ac = 1
        usecs = 446360
        i = 1
        av = 0x16c8fc8
        maxac = <value optimized out>
#14 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4016
        bn = <value optimized out>
        pid = 0
#15 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1721
        rmask = {fds_bits = {32, 0 <repeats 15 times>}}
        selres = <value optimized out>
        now = <value optimized out>
        readmask = {fds_bits = {120, 0 <repeats 15 times>}}
        nSockets = 7
        last_lockfile_recheck_time = 1523890764
        last_touch_time = 1523887759
#16 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1329
        opt = <value optimized out>
        status = <value optimized out>
        userDoption = <value optimized out>
        listen_addr_saved = <value optimized out>
        i = <value optimized out>
        output_config_variable = <value optimized out>
        __func__ = "PostmasterMain"
#17 0x0000000000609fe8 in main (argc=3, argv=0x1676990) at main.c:228
No locals.


10.04.2018, 22:56, "Peter Geoghegan" <pg@bowt.ie>:
> On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>>  Sorry, I can't do this, because (1) this is a production server, and (2) it has since been re-initialized as a replica.
>>  We use 2 servers, and the server that crashed is now running as the secondary.
>>  We use pgpool for switching between servers.
>>  I posted the full entry from the Postgres logs.
>>  All other logs are clean.
>
> I won't be able to help you without this information.
>
> --
> Peter Geoghegan


Hello, Peter.

Also from logs:
*** glibc detected *** postgres: postgres smsconsole [local] SELECT: double free or corruption (!prev): 0x000000000179e370 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d22275dee]
/lib64/libc.so.6[0x3d22278c80]
postgres: postgres smsconsole [local] SELECT(tuplestore_end+0x17)[0x808887]
postgres: postgres smsconsole [local] SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
postgres: postgres smsconsole [local] SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]
postgres: postgres smsconsole [local] SELECT(PortalCleanup+0x9e)[0x593d6e]
postgres: postgres smsconsole [local] SELECT(PortalDrop+0x2a)[0x7fcaca]
postgres: postgres smsconsole [local] SELECT[0x6e0eb2]
postgres: postgres smsconsole [local] SELECT(PostgresMain+0xdcc)[0x6e256c]
postgres: postgres smsconsole [local] SELECT(PostmasterMain+0x1875)[0x6823e5]
postgres: postgres smsconsole [local] SELECT(main+0x7a8)[0x609fe8]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3d2221ed1d]
postgres: postgres smsconsole [local] SELECT[0x46c039]


--
Many thanks for your answers.

10.04.2018, 22:56, "Peter Geoghegan" <pg@bowt.ie>:
> On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>>  Sorry, I can't do this, because (1) this is a production server, and (2) it has since been re-initialized as a replica.
>>  We use 2 servers, and the server that crashed is now running as the secondary.
>>  We use pgpool for switching between servers.
>>  I posted the full entry from the Postgres logs.
>>  All other logs are clean.
>
> I won't be able to help you without this information.
>
> --
> Peter Geoghegan


Hello, Peter.

Sorry for the repeated emails.

> It would be particularly helpful if you were able to collect a 
> coredump, and run "p *debug_query_string" from GDB. 

(gdb) p *debug_query_string
$1 = 115 's'


10.04.2018, 22:56, "Peter Geoghegan" <pg@bowt.ie>:
> On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>>  Sorry, I can't do this, because (1) this is a production server, and (2) it has since been re-initialized as a replica.
>>  We use 2 servers, and the server that crashed is now running as the secondary.
>>  We use pgpool for switching between servers.
>>  I posted the full entry from the Postgres logs.
>>  All other logs are clean.
>
> I won't be able to help you without this information.
>
> --
> Peter Geoghegan


On Mon, Apr 16, 2018 at 8:46 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> Hello, Peter.
>
> Sorry for the repeated emails.
>
>> It would be particularly helpful if you were able to collect a
>> coredump, and run "p *debug_query_string" from GDB.
>
> (gdb) p *debug_query_string
> $1 = 115 's'

Sorry, I meant "p debug_query_string" -- lose the *.

-- 
Peter Geoghegan


On Mon, Apr 16, 2018 at 9:04 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> (gdb) p *debug_query_string
>> $1 = 115 's'
>
> Sorry, I meant "p debug_query_string" -- lose the *.

This now seems unnecessary, since it's already evident from your "bt
full" output that the query involved in the crash was "select count(*)
from pg_buffercache where isdirty".


-- 
Peter Geoghegan


Peter Geoghegan wrote:

> This now seems unnecessary, since it's already evident from your "bt
> full" output that the query involved in the crash was "select count(*)
> from pg_buffercache where isdirty".

Hmm, so is the Zabbix monitoring running that query frequently?
Because, as I recall, pg_buffercache is pretty heavy on the system,
since it needs to acquire all the bufmgr locks simultaneously?

In other words, this seems a terrible query to be running in zabbix.  I
have vague memories of somebody submitting a version of this code that
returned approximate answers, good enough for monitoring ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On 2018-04-16 14:03:40 -0300, Alvaro Herrera wrote:
> Peter Geoghegan wrote:
> 
> > This now seems unnecessary, since it's already evident from your "bt
> > full" output that the query involved in the crash was "select count(*)
> > from pg_buffercache where isdirty".
> 
> Hmm, so is the Zabbix monitoring running that query frequently?
> Because, as I recall, pg_buffercache is pretty heavy on the system,
> since it needs to acquire all the bufmgr locks simultaneously?
> 
> In other words, this seems a terrible query to be running in zabbix.

It can be extremely useful, however, for predicting how much longer your
workload's hot data set will fit into cache. That's worth the cost in a
number of cases...  Either way, a crash is clearly something separate.


> I have vague memories of somebody submitting a version of this code
> that returned approximate answers, good enough for monitoring ...

That might have been me, but I don't recall the details anymore...

Greetings,

Andres Freund


On Mon, Apr 16, 2018 at 10:09 AM, Andres Freund <andres@anarazel.de> wrote:
>> I have vague memories of somebody submitting a version of this code
>> that returned approximate answers, good enough for monitoring ...
>
> That might have been me, but I don't recall the details anymore...

Obviously you're thinking of 6e654546fb61f62cc982d0c8f62241b3b30e7ef8.
I have a hard time imagining how that could be implicated in this hard
crash, though, except perhaps by removing something that masked the
problem in earlier versions.

-- 
Peter Geoghegan


On 2018-04-16 10:13:27 -0700, Peter Geoghegan wrote:
> On Mon, Apr 16, 2018 at 10:09 AM, Andres Freund <andres@anarazel.de> wrote:
> >> I have vague memories of somebody submitting a version of this code
> >> that returned approximate answers, good enough for monitoring ...
> >
> > That might have been me, but I don't recall the details anymore...
> 
> Obviously you're thinking of 6e654546fb61f62cc982d0c8f62241b3b30e7ef8.
> I have a hard time imagining how that could be implicated in this hard
> crash, though, except perhaps by removing something that masked the
> problem in earlier versions.

Can't be involved, because the crashing version is 9.6, which doesn't
include that afaics.

Greetings,

Andres Freund


Hi,

On 2018-04-05 22:39:42 +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      15144
> Logged by:          Vitaly Voronov
> Email address:      wizard_1024@tut.by
> PostgreSQL version: 9.6.8
> Operating system:   CentOS 6.9
> Description:        
> 
> Hello, 
> 
> We have a problem at our master server (for the second time).
> After the first occurrence, we updated CentOS to the latest version (6.9).

What's your shared_buffers setting? Is this a 32bit or 64bit
installation?

Greetings,

Andres Freund


Peter Geoghegan wrote:
> On Mon, Apr 16, 2018 at 10:09 AM, Andres Freund <andres@anarazel.de> wrote:
> >> I have vague memories of somebody submitting a version of this code
> >> that returned approximate answers, good enough for monitoring ...
> >
> > That might have been me, but I don't recall the details anymore...
> 
> Obviously you're thinking of 6e654546fb61f62cc982d0c8f62241b3b30e7ef8.
> I have a hard time imagining how that could be implicated in this hard
> crash, though, except perhaps by removing something that masked the
> problem in earlier versions.

Yeah, I wasn't commenting on the crash itself -- just on how bad it is
to let Zabbix monitor your database in this way.  Maybe it *is* useful
in certain situations, as Andres says, but I bet zabbix doesn't actually
discriminate like that.

Now, looking at the code

    for (i = 0; i < node->nfuncs; i++)
    {
        FunctionScanPerFuncState *fs = &node->funcstates[i];

        if (fs->func_slot)
            ExecClearTuple(fs->func_slot);

        if (fs->tstore != NULL)
        {
            tuplestore_end(node->funcstates[i].tstore);
            fs->tstore = NULL;
        }


and tuplestore_end does this:
    if (state->myfile)
        BufFileClose(state->myfile);
without setting anything in state to NULL; so we're relying on the
caller setting fs->tstore to NULL to avoid repeated tuplestore_end
calls.  I can't see any way for this to misbehave, but maybe the same
funcstate appears more than once in the PerFuncState array, and we
clean it correctly the first time around and then invoke
tuplestore_end() the second time on memory that was previously freed?
I think this makes no sense unless we share FunctionScanPerFuncState
elements -- do we?
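
To spell the contract out, here is a hypothetical sketch (not the actual
source, just the shape of the pattern) of why that NULL assignment is the
only guard against a repeated cleanup:

    #include <stdlib.h>

    typedef struct Store
    {
        void   *myfile;             /* stands in for the BufFile */
    } Store;

    /* Like tuplestore_end: frees everything, but cannot mark itself freed. */
    static void
    store_end(Store *state)
    {
        if (state->myfile)
            free(state->myfile);
        free(state);
    }

    /* Like the ExecEndFunctionScan loop: the NULL assignment is the only
     * thing preventing a second store_end() on already-freed memory. */
    static void
    cleanup(Store **tstore)
    {
        if (*tstore != NULL)
        {
            store_end(*tstore);
            *tstore = NULL;
        }
    }

    int
    main(void)
    {
        Store  *s = malloc(sizeof(Store));

        s->myfile = malloc(8192);
        cleanup(&s);
        cleanup(&s);                /* harmless no-op thanks to the guard */
        return 0;
    }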

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Mon, Apr 16, 2018 at 10:48 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> and tuplestore_end does this:
>         if (state->myfile)
>                 BufFileClose(state->myfile);
> without setting anything in state to NULL; so we're relying on the
> caller setting fs->tstore to NULL to avoid repeated tuplestore_end
> calls.  I can't see any way for this to misbehave, but maybe the same
> funcstate appears more than once in the PerFuncState array, and we
> clean it correctly the first time around and then invoke
> tuplestore_end() the second time on memory that was previously freed?
> I think this makes no sense unless we share FunctionScanPerFuncState
> elements -- do we?

I have no reason to think that we do. Offhand, I find it more likely
that some executor slot that imagines that it owns the tuple frees the
tuple once, which is followed by a call to tuplestore_end() that frees
the same tuple a second time (a double-free). As I mentioned, we've
seen several bugs of that general variety in both tuplestore and
tuplesort in the past. Some of these have been very subtle.
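
To make that concrete, here is a minimal standalone sketch (plain C,
nothing PostgreSQL-specific; the "owners" are hypothetical) of the failure
mode that glibc is reporting:

    #include <stdlib.h>

    int
    main(void)
    {
        char   *tuple = malloc(8192);   /* big enough to be its own chunk */

        free(tuple);    /* the first owner (say, a slot) releases it */
        free(tuple);    /* a second owner (say, tuplestore_end) releases it
                         * again; glibc aborts with "double free or
                         * corruption" */
        return 0;
    }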

Note that pgpool is involved here. I don't know much about pgpool, and
maybe that's totally irrelevant.

-- 
Peter Geoghegan


Hello, Andres.

From postgresql.conf:
shared_buffers = 12GB

This is a 64-bit installation.

16.04.2018, 20:29, "Andres Freund" <andres@anarazel.de>:
> Hi,
>
> On 2018-04-05 22:39:42 +0000, PG Bug reporting form wrote:
>>  The following bug has been logged on the website:
>>
>>  Bug reference: 15144
>>  Logged by: Vitaly Voronov
>>  Email address: wizard_1024@tut.by
>>  PostgreSQL version: 9.6.8
>>  Operating system: CentOS 6.9
>>  Description:
>>
>>  Hello,
>>
>>  We have a problem at our master server (for the second time).
>>  After the first occurrence, we updated CentOS to the latest version (6.9).
>
> What's your shared_buffers setting? Is this a 32bit or 64bit
> installation?
>
> Greetings,
>
> Andres Freund


Hello, Peter.

Pgpool is used only on the application side, on another server.

Zabbix runs its queries directly on the database server.

16.04.2018, 21:03, "Peter Geoghegan" <pg@bowt.ie>:
> On Mon, Apr 16, 2018 at 10:48 AM, Alvaro Herrera
> <alvherre@alvh.no-ip.org> wrote:
>>  and tuplestore_end does this:
>>          if (state->myfile)
>>                  BufFileClose(state->myfile);
>>  without setting anything in state to NULL; so we're relying on the
>>  caller setting fs->tstore to NULL to avoid repeated tuplestore_end
>>  calls. I can't see any way for this to misbehave, but maybe the same
>>  funcstate appears more than once in the PerFuncState array, and we
>>  clean it correctly the first time around and then invoke
>>  tuplestore_end() the second time on memory that was previously freed?
>>  I think this makes no sense unless we share FunctionScanPerFuncState
>>  elements -- do we?
>
> I have no reason to think that we do. Offhand, I find it more likely
> that some executor slot that imagines that it owns the tuple frees the
> tuple once, which is followed by a call to tuplestore_end() that frees
> the same tuple a second time (a double-free). As I mentioned, we've
> seen several bugs of that general variety in both tuplestore and
> tuplesort in the past. Some of these have been very subtle.
>
> Note that pgpool is involved here. I don't know much about pgpool, and
> maybe that's totally irrelevant.
>
> --
> Peter Geoghegan


On Mon, Apr 16, 2018 at 11:05 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> Hello, Peter.
>
> Pgpool used only from App side at another server.
>
> Zabbix running its queries directly at database server.

Can you reliably crash the server by running "select count(*) from
pg_buffercache where isdirty" from psql?

What work_mem setting does Postgres have when Zabbix runs this query?

-- 
Peter Geoghegan


Hello, Peter.

> Can you reliably crash the server by running "select count(*) from
> pg_buffercache where isdirty" from psql?

No.
I ran it from psql and got an answer:
# select count(*) from pg_buffercache where isdirty;
 count
-------
    71
(1 row)

> What work_mem setting does Postgres have when Zabbix runs this query?
work_mem=64MB


16.04.2018, 21:35, "Peter Geoghegan" <pg@bowt.ie>:
> On Mon, Apr 16, 2018 at 11:05 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>>  Hello, Peter.
>>
>>  Pgpool used only from App side at another server.
>>
>>  Zabbix running its queries directly at database server.
>
> Can you reliably crash the server by running "select count(*) from
> pg_buffercache where isdirty" from psql?
>
> What work_mem setting does Postgres have when Zabbix runs this query?
>
> --
> Peter Geoghegan


Peter Geoghegan <pg@bowt.ie> writes:
> Offhand, I find it more likely
> that some executor slot that imagines that it owns the tuple frees the
> tuple once, which is followed by a call to tuplestore_end() that frees
> the same tuple a second time (a double-free). As I mentioned, we've
> seen several bugs of that general variety in both tuplestore and
> tuplesort in the past. Some of these have been very subtle.

I see that in 9.6, nodeFunctionScan thinks it should do ExecClearTuple
on the func_slot that it's received from tuplestore_gettupleslot,
which it calls with copy = false, meaning that ExecClearTuple might be
deleting a tuple returned by tuplestore_gettuple.  I wonder if this
is the same kind of issue we fixed in 90decdba3, only for tuplestore
rather than tuplesort.

tuplestore_gettuple doesn't return should_free = true unless the
tuplestore spilled to disk, so the sort of issue I'm imagining
would only arise for function results large enough to cause a spill.
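
For illustration, a hypothetical sketch (not the real API, and not the
eventual fix; get_tuple and its arguments are invented) of why handing the
caller its own copy removes the ambiguity that copy = false creates:

    #include <stdlib.h>
    #include <string.h>

    /*
     * With copy = false the caller receives a pointer into storage that
     * the tuplestore may free itself, so a caller that also frees it (as
     * ExecClearTuple will when the slot believes it owns the tuple) sets
     * up a double free.  With copy = true each side owns a distinct
     * allocation and the two cleanups can no longer collide.
     */
    static void *
    get_tuple(const void *stored, size_t len, int copy)
    {
        if (copy)
        {
            void   *mine = malloc(len);

            memcpy(mine, stored, len);
            return mine;            /* the caller owns this one outright */
        }
        return (void *) stored;     /* shared pointer: ownership is ambiguous */
    }

    int
    main(void)
    {
        char    stored[] = "tuple bytes";
        void   *mine = get_tuple(stored, sizeof(stored), 1);

        free(mine);                 /* the slot frees its private copy ... */
        /* ... and the store remains free to dispose of "stored" itself. */
        return 0;
    }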

BTW, I notice that in this situation, readtup_heap seems to be
palloc'ing in the caller's context, but it counts the memory as
if it were in the tuplestore's context.  Somebody's confused there.

            regards, tom lane


On Mon, Apr 16, 2018 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
>> Offhand, I find it more likely
>> that some executor slot that imagines that it owns the tuple frees the
>> tuple once, which is followed by a call to tuplestore_end() that frees
>> the same tuple a second time (a double-free). As I mentioned, we've
>> seen several bugs of that general variety in both tuplestore and
>> tuplesort in the past. Some of these have been very subtle.
>
> I see that in 9.6, nodeFunctionScan thinks it should do ExecClearTuple
> on the func_slot that it's received from tuplestore_gettupleslot,
> which it calls with copy = false, meaning that ExecClearTuple might be
> deleting a tuple returned by tuplestore_gettuple.  I wonder if this
> is the same kind of issue we fixed in 90decdba3, only for tuplestore
> rather than tuplesort.

I'm going to spend some time trying to reproduce the bug tomorrow. I
suspect that we can justify bringing tuplestore in line with tuplesort
defensively, though (i.e. doing something like 90decdba3 for
tuplestore, even in the absence of strong evidence that that will
prevent this crash).

> tuplestore_gettuple doesn't return should_free = true unless the
> tuplestore spilled to disk, so the sort of issue I'm imagining
> would only arise for function results large enough to cause a spill.

Sounds familiar.

> BTW, I notice that in this situation, readtup_heap seems to be
> palloc'ing in the caller's context, but it counts the memory as
> if it were in the tuplestore's context.  Somebody's confused there.

I could just kick myself for not going through tuplestore (and its
version of readtup_heap) as part of the 90decdba3 work.

-- 
Peter Geoghegan


Peter Geoghegan <pg@bowt.ie> writes:
> On Mon, Apr 16, 2018 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, I notice that in this situation, readtup_heap seems to be
>> palloc'ing in the caller's context, but it counts the memory as
>> if it were in the tuplestore's context.  Somebody's confused there.

> I could just kick myself for not going through tuplestore (and its
> version of readtup_heap) as part of the 90decdba3 work.

Yeah, I should have thought to question that too.  tuplestore was
originally built by stripping down tuplesort, and at least in the
beginning, I'm pretty sure that all these semantic API details were
the same.  We should likely have made more effort to keep them in
sync.  (Still, until we've proven that there *is* a bug here,
let's not kick ourselves too hard.)

            regards, tom lane


On Mon, Apr 16, 2018 at 3:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, I should have thought to question that too.  tuplestore was
> originally built by stripping down tuplesort, and at least in the
> beginning, I'm pretty sure that all these semantic API details were
> the same.  We should likely have made more effort to keep them in
> sync.  (Still, until we've proven that there *is* a bug here,
> let's not kick ourselves too hard.)

FWIW, I think that tuplesort remains a good example for tuplestore to
follow, since the enhancements that prevented the tuplesort crash on
v10+ make just as much sense for tuplestore (and could even have been
justified purely on robustness grounds). Many small palloc() calls are
certainly something that we should try to avoid.

Actually, I once looked into writing such a patch for tuplestore
myself, but IIRC tuplestore_clear() and interXact support made it more
painful than initially thought.

-- 
Peter Geoghegan


>>>>> "Vitaly" == Vitaly V Voronov <wizard_1024@tut.by> writes:

 Vitaly> #4  0x0000003d22278c80 in _int_free () from /lib64/libc.so.6
 Vitaly> No symbol table info available.
 Vitaly> #5  0x0000000000808887 in tuplestore_end (state=0x1794898) at tuplestore.c:455
 Vitaly>         i = <value optimized out>

So I may be off base here but...

Line 455 isn't anything to do with tuples; it's the BufFileClose() line.

Furthermore, there's no stack frame between the free() and the
tuplestore_end. From looking at optimized builds, this suggests that
free() has been reached via tail calls, and the only way I see that
happening is when pfree() is being called on a large allocation (one
large enough to be its own chunk), which shouldn't happen for tuples in
this example. (It can happen for the memtuples array itself.)

BufFile is a struct with a big buffer in it, though, so it'll be a large
allocation, hence the pfree() at the end of BufFileClose will end up in
free() via tail calls.
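
For illustration, a hypothetical sketch (not aset.c itself; the threshold
and function are invented) of why freeing a large chunk can end in a bare
free() that the compiler emits as a tail call:

    #include <stdlib.h>

    #define CHUNK_LIMIT 8192        /* made-up threshold for illustration */

    static void
    sketch_pfree(void *ptr, size_t size)
    {
        if (size > CHUNK_LIMIT)
            free(ptr);              /* a large chunk has its own malloc'd
                                     * block, so releasing it reduces to a
                                     * direct free() -- a tail call, leaving
                                     * no intermediate stack frame */
        /* else: a real allocator would return the chunk to a freelist */
    }

    int
    main(void)
    {
        void   *big = malloc(65536);

        sketch_pfree(big, 65536);
        return 0;
    }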

Of course the weak point in this theory is that there seems to be no
reason at all why BufFileClose could possibly get called twice ...
the only other theory would be that something has somehow reset the
memory context _before_ we got here.

-- 
Andrew (irc:RhodiumToad)


>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Andrew> Of course the weak point in this theory is that there seems to
 Andrew> be no reason at all why BufFileClose could possibly get called
 Andrew> twice ... the only other theory would be that something has
 Andrew> somehow reset the memory context _before_ we got here.

... and in this context I just noticed that pg_stat_statements is in
play, which could be significant.

-- 
Andrew (irc:RhodiumToad)


On Mon, Apr 16, 2018 at 8:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> We caught a crash at our secondary server.
> Here is the full stack trace from the crash dump:

Did the problem recur?

Sorry for letting this one lapse.

-- 
Peter Geoghegan


Hello, Peter.

After we disabled the monitoring query against pg_buffercache and upgraded to 9.6.9, we have not encountered the problem again.

20.06.2018, 00:35, "Peter Geoghegan" <pg@bowt.ie>:
> On Mon, Apr 16, 2018 at 8:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>>  We caught a crash at our secondary server.
>>  Here is the full stack trace from the crash dump:
>
> Did the problem recur?
>
> Sorry for letting this one lapse.
>
> --
> Peter Geoghegan