Discussion: BUG #15144: *** glibc detected *** postgres: postgres smsconsole[local] SELECT: double free or corruption (!pre
BUG #15144: *** glibc detected *** postgres: postgres smsconsole[local] SELECT: double free or corruption (!pre
From:
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15144
Logged by:          Vitaly Voronov
Email address:      wizard_1024@tut.by
PostgreSQL version: 9.6.8
Operating system:   CentOS 6.9

Description:

Hello,

We have a problem at our master server (for the second time). After the first occurrence we updated CentOS to the latest version (6.9), but today we hit this bug again:

*** glibc detected *** postgres: postgres smsconsole [local] SELECT: double free or corruption (!prev): 0x00000000022529e0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75dee)[0x7f6b19433dee]
/lib64/libc.so.6(+0x78c80)[0x7f6b19436c80]
postgres: postgres smsconsole [local] SELECT(tuplestore_end+0x17)[0x808887]
postgres: postgres smsconsole [local] SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
postgres: postgres smsconsole [local] SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]
postgres: postgres smsconsole [local] SELECT(PortalCleanup+0x9e)[0x593d6e]
postgres: postgres smsconsole [local] SELECT(PortalDrop+0x2a)[0x7fcaca]
postgres: postgres smsconsole [local] SELECT[0x6e0eb2]
postgres: postgres smsconsole [local] SELECT(PostgresMain+0xdcc)[0x6e256c]
postgres: postgres smsconsole [local] SELECT(PostmasterMain+0x1875)[0x6823e5]
postgres: postgres smsconsole [local] SELECT(main+0x7a8)[0x609fe8]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f6b193dcd1d]
postgres: postgres smsconsole [local] SELECT[0x46c039]
======= Memory map: ========
00400000-00a14000 r-xp 00000000 08:03 3938726 /usr/pgsql-9.6/bin/postgres 00c14000-00c21000 rw-p 00614000 08:03 3938726 /usr/pgsql-9.6/bin/postgres 00c21000-00c72000 rw-p 00000000 00:00 0 0211e000-02183000 rw-p 00000000 00:00 0 02183000-0228f000 rw-p 00000000 00:00 0 3799000000-3799016000 r-xp 00000000 08:03 26214446 /lib64/libgcc_s-4.4.7-20120601.so.1 3799016000-3799215000 ---p 00016000 08:03 26214446 /lib64/libgcc_s-4.4.7-20120601.so.1 3799215000-3799216000 rw-p 00015000 08:03 26214446 /lib64/libgcc_s-4.4.7-20120601.so.1 3d56000000-3d56015000 r-xp 00000000 08:03 26214420 /lib64/libz.so.1.2.3 (deleted) 
3d56015000-3d56214000 ---p 00015000 08:03 26214420 /lib64/libz.so.1.2.3 (deleted) 3d56214000-3d56215000 r--p 00014000 08:03 26214420 /lib64/libz.so.1.2.3 (deleted) 3d56215000-3d56216000 rw-p 00015000 08:03 26214420 /lib64/libz.so.1.2.3 (deleted) 3d56800000-3d5681d000 r-xp 00000000 08:03 26214488 /lib64/libselinux.so.1.#prelink#.znRAKV (deleted) 3d5681d000-3d56a1c000 ---p 0001d000 08:03 26214488 /lib64/libselinux.so.1.#prelink#.znRAKV (deleted) 3d56a1c000-3d56a1d000 r--p 0001c000 08:03 26214488 /lib64/libselinux.so.1.#prelink#.znRAKV (deleted) 3d56a1d000-3d56a1e000 rw-p 0001d000 08:03 26214488 /lib64/libselinux.so.1.#prelink#.znRAKV (deleted) 3d56a1e000-3d56a1f000 rw-p 00000000 00:00 0 3d57c00000-3d57c02000 r-xp 00000000 08:03 26214456 /lib64/libfreebl3.so (deleted) 3d57c02000-3d57e01000 ---p 00002000 08:03 26214456 /lib64/libfreebl3.so (deleted) 3d57e01000-3d57e02000 r--p 00001000 08:03 26214456 /lib64/libfreebl3.so (deleted) 3d57e02000-3d57e03000 rw-p 00002000 08:03 26214456 /lib64/libfreebl3.so (deleted) 3d5b000000-3d5b002000 r-xp 00000000 08:03 26214486 /lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted) 3d5b002000-3d5b201000 ---p 00002000 08:03 26214486 /lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted) 3d5b201000-3d5b202000 r--p 00001000 08:03 26214486 /lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted) 3d5b202000-3d5b203000 rw-p 00002000 08:03 26214486 /lib64/libkeyutils.so.1.3.#prelink#.zuP6V2 (deleted) 3d5e000000-3d5e149000 r-xp 00000000 08:03 3934405 /usr/lib64/libxml2.so.2.7.6.#prelink#.b6SBFV (deleted) 3d5e149000-3d5e348000 ---p 00149000 08:03 3934405 /usr/lib64/libxml2.so.2.7.6.#prelink#.b6SBFV (deleted) 3d5e348000-3d5e352000 rw-p 00148000 08:03 3934405 /usr/lib64/libxml2.so.2.7.6.#prelink#.b6SBFV (deleted) 3d5e352000-3d5e353000 rw-p 00000000 00:00 0 3d5ec00000-3d5ec19000 r-xp 00000000 08:03 3938131 /usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted) 3d5ec19000-3d5ee18000 ---p 00019000 08:03 3938131 
/usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted) 3d5ee18000-3d5ee19000 r--p 00018000 08:03 3938131 /usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted) 3d5ee19000-3d5ee1a000 rw-p 00019000 08:03 3938131 /usr/lib64/libsasl2.so.2.0.23.#prelink#.NiWMBM (deleted) 7f67f4000000-7f67f4021000 rw-p 00000000 00:00 0 7f67f4021000-7f67f8000000 ---p 00000000 00:00 0 7f67f8c24000-7f67f9599000 rw-p 00000000 00:00 0 7f67f9599000-7f67fc59f000 rw-p 00000000 00:00 0 7f67fc59f000-7f67fdda2000 rw-p 00000000 00:00 0 7f67fe1a3000-7f67fe9a4000 rw-p 00000000 00:00 0 7f67feba5000-7f67ff3a6000 rw-p 00000000 00:00 0 7f67ff4a7000-7f67ff8a8000 rw-p 00000000 00:00 0 7f67ff929000-7f67ffb2a000 rw-p 00000000 00:00 0 7f67ffb6b000-7f67ffc6c000 rw-p 00000000 00:00 0 7f6802c6d000-7f6802c6f000 r-xp 00000000 08:03 3938903 /usr/pgsql-9.6/lib/pg_buffercache.so 7f6802c6f000-7f6802e6e000 ---p 00002000 08:03 3938903 /usr/pgsql-9.6/lib/pg_buffercache.so 7f6802e6e000-7f6802e6f000 rw-p 00001000 08:03 3938903 /usr/pgsql-9.6/lib/pg_buffercache.so 7f6802e6f000-7f6802e7c000 r-xp 00000000 08:03 26214806 /lib64/libnss_files-2.12.so 7f6802e7c000-7f680307b000 ---p 0000d000 08:03 26214806 /lib64/libnss_files-2.12.so 7f680307b000-7f680307c000 r--p 0000c000 08:03 26214806 /lib64/libnss_files-2.12.so 7f680307c000-7f680307d000 rw-p 0000d000 08:03 26214806 /lib64/libnss_files-2.12.so 7f680307d000-7f6b16dc3000 rw-s 00000000 00:04 22567 /dev/zero (deleted) 7f6b16dc3000-7f6b16dcb000 r-xp 00000000 08:03 3938906 /usr/pgsql-9.6/lib/pg_stat_statements.so 7f6b16dcb000-7f6b16fca000 ---p 00008000 08:03 3938906 /usr/pgsql-9.6/lib/pg_stat_statements.so 7f6b16fca000-7f6b16fcb000 rw-p 00007000 08:03 3938906 /usr/pgsql-9.6/lib/pg_stat_statements.so 7f6b16fcb000-7f6b17004000 r-xp 00000000 08:03 26214950 /lib64/libnspr4.so (deleted) 7f6b17004000-7f6b17204000 ---p 00039000 08:03 26214950 /lib64/libnspr4.so (deleted) 7f6b17204000-7f6b17205000 r--p 00039000 08:03 26214950 /lib64/libnspr4.so (deleted) 7f6b17205000-7f6b17207000 rw-p 
0003a000 08:03 26214950 /lib64/libnspr4.so (deleted) 7f6b17207000-7f6b17209000 rw-p 00000000 00:00 0 7f6b17209000-7f6b1720d000 r-xp 00000000 08:03 26214951 /lib64/libplc4.so (deleted) 7f6b1720d000-7f6b1740c000 ---p 00004000 08:03 26214951 /lib64/libplc4.so (deleted) 7f6b1740c000-7f6b1740d000 r--p 00003000 08:03 26214951 /lib64/libplc4.so (deleted) 7f6b1740d000-7f6b1740e000 rw-p 00004000 08:03 26214951 /lib64/libplc4.so (deleted) 7f6b1740e000-7f6b17411000 r-xp 00000000 08:03 26214952 /lib64/libplds4.so (deleted) 7f6b17411000-7f6b17610000 ---p 00003000 08:03 26214952 /lib64/libplds4.so (deleted) 7f6b17610000-7f6b17611000 r--p 00002000 08:03 26214952 /lib64/libplds4.so (deleted) 7f6b17611000-7f6b17612000 rw-p 00003000 08:03 26214952 /lib64/libplds4.so (deleted) 7f6b17612000-7f6b17638000 r-xp 00000000 08:03 3933172 /usr/lib64/libnssutil3.so (deleted) 7f6b17638000-7f6b17837000 ---p 00026000 08:03 3933172 /usr/lib64/libnssutil3.so (deleted) 7f6b17837000-7f6b1783e000 r--p 00025000 08:03 3933172 /usr/lib64/libnssutil3.so (deleted) 7f6b1783e000-7f6b1783f000 rw-p 0002c000 08:03 3933172 /usr/lib64/libnssutil3.so (deleted) 7f6b1783f000-7f6b17979000 r-xp 00000000 08:03 3934884 /usr/lib64/libnss3.so (deleted) 7f6b17979000-7f6b17b78000 ---p 0013a000 08:03 3934884 /usr/lib64/libnss3.so (deleted) 7f6b17b78000-7f6b17b7e000 r--p 00139000 08:03 3934884 /usr/lib64/libnss3.so (deleted) 7f6b17b7e000-7f6b17b80000 rw-p 0013f000 08:03 3934884 /usr/lib64/libnss3.so (deleted) 7f6b17b80000-7f6b17b82000 rw-p 00000000 00:00 0 7f6b17b82000-7f6b17baa000 r-xp 00000000 08:03 3939143 /usr/lib64/libsmime3.so (deleted) 7f6b17baa000-7f6b17da9000 ---p 00028000 08:03 3939143 /usr/lib64/libsmime3.so (deleted) 7f6b17da9000-7f6b17dad000 r--p 00027000 08:03 3939143 /usr/lib64/libsmime3.so (deleted) 7f6b17dad000-7f6b17dae000 rw-p 0002b000 08:03 3939143 /usr/lib64/libsmime3.so (deleted) 7f6b17dae000-7f6b17df5000 r-xp 00000000 08:03 3939144 /usr/lib64/libssl3.so (deleted) 7f6b17df5000-7f6b17ff5000 ---p 00047000 
08:03 3939144 /usr/lib64/libssl3.so (deleted) 7f6b17ff5000-7f6b17ff9000 r--p 00047000 08:03 3939144 /usr/lib64/libssl3.so (deleted) 7f6b17ff9000-7f6b17ffa000 rw-p 0004b000 08:03 3939144 /usr/lib64/libssl3.so (deleted) 7f6b17ffa000-7f6b17ffb000 rw-p 00000000 00:00 0 7f6b17ffb000-7f6b18009000 r-xp 00000000 08:03 26214830 /lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted) 7f6b18009000-7f6b18208000 ---p 0000e000 08:03 26214830 /lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted) 7f6b18208000-7f6b18209000 r--p 0000d000 08:03 26214830 /lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted) 7f6b18209000-7f6b1820a000 rw-p 0000e000 08:03 26214830 /lib64/liblber-2.4.so.2.10.3.#prelink#.7hI5fW (deleted) 7f6b1820a000-7f6b18221000 r-xp 00000000 08:03 26214436 /lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted) 7f6b18221000-7f6b18421000 ---p 00017000 08:03 26214436 /lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted) 7f6b18421000-7f6b18422000 r--p 00017000 08:03 26214436 /lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted) 7f6b18422000-7f6b18423000 rw-p 00018000 08:03 26214436 /lib64/libpthread-2.12.so.#prelink#.KCRT1K (deleted) 7f6b18423000-7f6b18427000 rw-p 00000000 00:00 0 7f6b18427000-7f6b1843d000 r-xp 00000000 08:03 26214808 /lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted) 7f6b1843d000-7f6b1863d000 ---p 00016000 08:03 26214808 /lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted) 7f6b1863d000-7f6b1863e000 r--p 00016000 08:03 26214808 /lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted) 7f6b1863e000-7f6b1863f000 rw-p 00017000 08:03 26214808 /lib64/libresolv-2.12.so.#prelink#.CZY6kZ (deleted) 7f6b1863f000-7f6b18641000 rw-p 00000000 00:00 0 7f6b18641000-7f6b1864b000 r-xp 00000000 08:03 26214829 /lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted) 7f6b1864b000-7f6b1884a000 ---p 0000a000 08:03 26214829 /lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted) 7f6b1884a000-7f6b1884b000 r--p 00009000 08:03 26214829 /lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted) 
7f6b1884b000-7f6b1884c000 rw-p 0000a000 08:03 26214829 /lib64/libkrb5support.so.0.1.#prelink#.S6UyaS (deleted) 7f6b1884c000-7f6b18875000 r-xp 00000000 08:03 26214619 /lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted) 7f6b18875000-7f6b18a75000 ---p 00029000 08:03 26214619 /lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted) 7f6b18a75000-7f6b18a76000 r--p 00029000 08:03 26214619 /lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted) 7f6b18a76000-7f6b18a77000 rw-p 0002a000 08:03 26214619 /lib64/libk5crypto.so.3.1.#prelink#.V1CVAO (deleted) 7f6b18a77000-7f6b18a78000 rw-p 00000000 00:00 0 7f6b18a78000-7f6b18a7b000 r-xp 00000000 08:03 26214466 /lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted) 7f6b18a7b000-7f6b18c7a000 ---p 00003000 08:03 26214466 /lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted) 7f6b18c7a000-7f6b18c7b000 r--p 00002000 08:03 26214466 /lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted) 7f6b18c7b000-7f6b18c7c000 rw-p 00003000 08:03 26214466 /lib64/libcom_err.so.2.1.#prelink#.pCWhtH (deleted) 7f6b18c7c000-7f6b18d58000 r-xp 00000000 08:03 26214828 /lib64/libkrb5.so.3.3 (deleted) 7f6b18d58000-7f6b18f57000 ---p 000dc000 08:03 26214828 /lib64/libkrb5.so.3.3 (deleted) 7f6b18f57000-7f6b18f61000 r--p 000db000 08:03 26214828 /lib64/libkrb5.so.3.3 (deleted) 7f6b18f61000-7f6b18f63000 rw-p 000e5000 08:03 26214828 /lib64/libkrb5.so.3.3 (deleted) 7f6b18f63000-7f6b18f6a000 r-xp 00000000 08:03 26214416 /lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted) 7f6b18f6a000-7f6b1916a000 ---p 00007000 08:03 26214416 /lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted) 7f6b1916a000-7f6b1916b000 r--p 00007000 08:03 26214416 /lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted) 7f6b1916b000-7f6b1916c000 rw-p 00008000 08:03 26214416 /lib64/libcrypt-2.12.so.#prelink#.cXl2OP (deleted) 7f6b1916c000-7f6b1919a000 rw-p 00000000 00:00 0 7f6b1919a000-7f6b191b2000 r-xp 00000000 08:03 26214814 /lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted) 7f6b191b2000-7f6b193b1000 ---p 00018000 08:03 
26214814 /lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted) 7f6b193b1000-7f6b193b3000 r--p 00017000 08:03 26214814 /lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted) 7f6b193b3000-7f6b193be000 rw-p 00019000 08:03 26214814 /lib64/libaudit.so.1.0.0.#prelink#.jWjLty (deleted) 7f6b193be000-7f6b19548000 r-xp 00000000 08:03 26214412 /lib64/libc-2.12.so (deleted) 7f6b19548000-7f6b19748000 ---p 0018a000 08:03 26214412 /lib64/libc-2.12.so (deleted) 7f6b19748000-7f6b1974c000 r--p 0018a000 08:03 26214412 /lib64/libc-2.12.so (deleted) 7f6b1974c000-7f6b1974e000 rw-p 0018e000 08:03 26214412 /lib64/libc-2.12.so (deleted) 7f6b1974e000-7f6b19752000 rw-p 00000000 00:00 0 7f6b19752000-7f6b197a0000 r-xp 00000000 08:03 26214831 /lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted) 7f6b197a0000-7f6b1999f000 ---p 0004e000 08:03 26214831 /lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted) 7f6b1999f000-7f6b199a1000 r--p 0004d000 08:03 26214831 /lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted) 7f6b199a1000-7f6b199a3000 rw-p 0004f000 08:03 26214831 /lib64/libldap-2.4.so.2.10.3.#prelink#.bg5LwW (deleted) 7f6b199a3000-7f6b19a26000 r-xp 00000000 08:03 26214803 /lib64/libm-2.12.so (deleted) 7f6b19a26000-7f6b19c25000 ---p 00083000 08:03 26214803 /lib64/libm-2.12.so (deleted) 7f6b19c25000-7f6b19c26000 r--p 00082000 08:03 26214803 /lib64/libm-2.12.so (deleted) 7f6b19c26000-7f6b19c27000 rw-p 00083000 08:03 26214803 /lib64/libm-2.12.so (deleted) 7f6b19c27000-7f6b19c29000 r-xp 00000000 08:03 26214802 /lib64/libdl-2.12.so (deleted) 7f6b19c29000-7f6b19e29000 ---p 00002000 08:03 26214802 /lib64/libdl-2.12.so (deleted) 7f6b19e29000-7f6b19e2a000 r--p 00002000 08:03 26214802 /lib64/libdl-2.12.so (deleted) 7f6b19e2a000-7f6b19e2b000 rw-p 00003000 08:03 26214802 /lib64/libdl-2.12.so (deleted) 7f6b19e2b000-7f6b19e32000 r-xp 00000000 08:03 26214809 /lib64/librt-2.12.so (deleted) 7f6b19e32000-7f6b1a031000 ---p 00007000 08:03 26214809 /lib64/librt-2.12.so (deleted) 7f6b1a031000-7f6b1a032000 r--p 
00006000 08:03 26214809 /lib64/librt-2.12.so (deleted) 7f6b1a032000-7f6b1a033000 rw-p 00007000 08:03 26214809 /lib64/librt-2.12.so (deleted) 7f6b1a033000-7f6b1a074000 r-xp 00000000 08:03 26214826 /lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted) 7f6b1a074000-7f6b1a274000 ---p 00041000 08:03 26214826 /lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted) 7f6b1a274000-7f6b1a275000 r--p 00041000 08:03 26214826 /lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted) 7f6b1a275000-7f6b1a277000 rw-p 00042000 08:03 26214826 /lib64/libgssapi_krb5.so.2.2.#prelink#.tB60pA (deleted) 7f6b1a277000-7f6b1a431000 r-xp 00000000 08:03 3934773 /usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted) 7f6b1a431000-7f6b1a631000 ---p 001ba000 08:03 3934773 /usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted) 7f6b1a631000-7f6b1a64c000 r--p 001ba000 08:03 3934773 /usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted) 7f6b1a64c000-7f6b1a658000 rw-p 001d5000 08:03 3934773 /usr/lib64/libcrypto.so.1.0.1e.#prelink#.xwVltt (deleted) 7f6b1a658000-7f6b1a65c000 rw-p 00000000 00:00 0 7f6b1a65c000-7f6b1a6be000 r-xp 00000000 08:03 3938254 /usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted) 7f6b1a6be000-7f6b1a8be000 ---p 00062000 08:03 3938254 /usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted) 7f6b1a8be000-7f6b1a8c2000 r--p 00062000 08:03 3938254 /usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted) 7f6b1a8c2000-7f6b1a8c8000 rw-p 00066000 08:03 3938254 /usr/lib64/libssl.so.1.0.1e.#prelink#.QXZO4p (deleted) 7f6b1a8c8000-7f6b1a8d4000 r-xp 00000000 08:03 26214531 /lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted) 7f6b1a8d4000-7f6b1aad4000 ---p 0000c000 08:03 26214531 /lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted) 7f6b1aad4000-7f6b1aad5000 r--p 0000c000 08:03 26214531 /lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted) 7f6b1aad5000-7f6b1aad6000 rw-p 0000d000 08:03 26214531 /lib64/libpam.so.0.82.2.#prelink#.kKy58y (deleted) 7f6b1aad6000-7f6b1aaf6000 r-xp 00000000 08:03 26214468 
/lib64/ld-2.12.so (deleted) 7f6b1ab95000-7f6b1acda000 rw-p 00000000 00:00 0 7f6b1acda000-7f6b1aceb000 rw-p 00000000 00:00 0 7f6b1acf1000-7f6b1acf2000 rw-p 00000000 00:00 0 7f6b1acf2000-7f6b1acf4000 rw-s 00000000 00:10 22574 /dev/shm/PostgreSQL.1602759649 7f6b1acf4000-7f6b1acf5000 rw-s 00000000 00:04 32769 /SYSV0052e2c1 (deleted) 7f6b1acf5000-7f6b1acf6000 rw-p 00000000 00:00 0 7f6b1acf6000-7f6b1acf7000 r--p 00020000 08:03 26214468 /lib64/ld-2.12.so (deleted) 7f6b1acf7000-7f6b1acf8000 rw-p 00021000 08:03 26214468 /lib64/ld-2.12.so (deleted) 7f6b1acf8000-7f6b1acf9000 rw-p 00000000 00:00 0 7ffe33948000-7ffe3395d000 rw-p 00000000 00:00 0 [stack] 7ffe339f9000-7ffe339fa000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

< 2018-04-06 06:02:23.225 JST > LOG: server process (PID 9045) was terminated by signal 6: Aborted
On Thu, Apr 5, 2018 at 3:39 PM, PG Bug reporting form <noreply@postgresql.org> wrote:
> We have problem at our Master server (second time).
> From first time, we update CentOS to latest version (6.9)
> But today we have such bug:
> *** glibc detected *** postgres: postgres smsconsole [local] SELECT: double
> free or corruption (!prev): 0x00000000022529e0 ***

Can you get a full stack trace from a coredump?

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

It would be particularly helpful if you were able to collect a coredump, and run "p *debug_query_string" from GDB.

It would also be nice to be able to get the query string from some other source, such as the server log. Perhaps it can be correlated to something?

> ======= Backtrace: =========
> /lib64/libc.so.6(+0x75dee)[0x7f6b19433dee]
> /lib64/libc.so.6(+0x78c80)[0x7f6b19436c80]
> postgres: postgres smsconsole [local]
> SELECT(tuplestore_end+0x17)[0x808887]
> postgres: postgres smsconsole [local]
> SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
> postgres: postgres smsconsole [local]
> SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]

Offhand, I suspect that this could be a bug that is analogous to the one just fixed within tuplesort, by c2d4eb1b1fa252fd8c407e1519308017a18afed1. There is a fairly long history of these kinds of bugs, including one or two in tuplestore that I can recall from memory.

-- 
Peter Geoghegan
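[Editor's note: the procedure on the wiki page above boils down to a short gdb session. This is an illustrative sketch only -- the core-file path and PID are assumptions, not details taken from this report:]

    $ ulimit -c unlimited           # in the postmaster's start script, so the
                                    # crashing backend can write a core file
    $ gdb /usr/pgsql-9.6/bin/postgres /var/lib/pgsql/9.6/data/core.<pid>
    (gdb) bt full                   # full stack trace, with local variables
    (gdb) p debug_query_string      # the query the backend was executing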
Hello,

06.04.2018, 02:18, "Peter Geoghegan" <pg@bowt.ie>:
> On Thu, Apr 5, 2018 at 3:39 PM, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> We have problem at our Master server (second time).
>> From first time, we update CentOS to latest version (6.9)
>> But today we have such bug:
>> *** glibc detected *** postgres: postgres smsconsole [local] SELECT: double
>> free or corruption (!prev): 0x00000000022529e0 ***
>
> Can you get a full stack trace from a coredump?
>
> https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
>
> It would be particularly helpful if you were able to collect a
> coredump, and run "p *debug_query_string" from GDB.
>
> It would also be nice to be able to get the query string from some
> other source, such as the server log. Perhaps it can be correlated to
> something?

Sorry, I can't do this: first, this is a production server, and second, it has since been re-initialized as a replica. We use two servers, and the server that crashed is now the secondary; we use pgpool for switching between them. I posted the full output from the PostgreSQL logs; all the other logs are clean.

>> ======= Backtrace: =========
>> /lib64/libc.so.6(+0x75dee)[0x7f6b19433dee]
>> /lib64/libc.so.6(+0x78c80)[0x7f6b19436c80]
>> postgres: postgres smsconsole [local]
>> SELECT(tuplestore_end+0x17)[0x808887]
>> postgres: postgres smsconsole [local]
>> SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
>> postgres: postgres smsconsole [local]
>> SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]
>
> Offhand, I suspect that this could be a bug that is analogous to the
> one just fixed within tuplesort, by
> c2d4eb1b1fa252fd8c407e1519308017a18afed1. There is a fairly long
> history of these kinds of bugs, including one or two in tuplestore
> that I can recall from memory.

We have Zabbix monitoring, which actively uses the pg_stat_* views and pg_buffercache. Could this cause such a problem? And can you say when the fix will be released?

> --
> Peter Geoghegan

Thanks for your answers!
On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> Sorry. I can't do this. Because, 1s - this is production and 2nd - it had been initialized as replica.
> We use 2 server, and this server (with error) now initialized as secondary.
> We use pgpool for switching servers.
> And i posted full log from Postgres logs.
> All other logs is clear.

I won't be able to help you without this information.

-- 
Peter Geoghegan
Hello, Peter.

We have caught another crash, at our secondary server. This is the full stack trace from the crash dump:

(gdb) bt full
#0  0x0000003d22232495 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000003d22233c75 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000003d222703a7 in __libc_message () from /lib64/libc.so.6
No symbol table info available.
#3  0x0000003d22275dee in malloc_printerr () from /lib64/libc.so.6
No symbol table info available.
#4  0x0000003d22278c80 in _int_free () from /lib64/libc.so.6
No symbol table info available.
#5  0x0000000000808887 in tuplestore_end (state=0x1794898) at tuplestore.c:455
        i = <value optimized out>
#6  0x00000000005e94e5 in ExecEndFunctionScan (node=0x178e3d8) at nodeFunctionscan.c:550
        fs = 0x178e388
        i = <value optimized out>
#7  0x00000000005cbaae in ExecEndPlan (queryDesc=0x16c6a38) at execMain.c:1451
        resultRelInfo = <value optimized out>
        i = <value optimized out>
        l = <value optimized out>
#8  standard_ExecutorEnd (queryDesc=0x16c6a38) at execMain.c:468
        estate = 0x178e278
        oldcontext = 0x1677e48
#9  0x0000000000593d6e in PortalCleanup (portal=0x16c3138) at portalcmds.c:280
        save_exception_stack = 0x7fff07538610
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {23867704, 2475436528305129399, 24351312, 24686200, 2, 24686152, -2475701272642631753, 2475437018370099127}, __mask_was_saved = 0, __saved_mask = {__val = {24686152, 15971042801079502775, 2475436856846141367, 0, 8280655, 0, 24149016, 9856140, 2, 1, 23867704, 140733316302126, 88, 23867704, 24351312, 9377246}}}}
        saveResourceOwner = 0x1678878
        queryDesc = 0x16c6a38
#10 0x00000000007fcaca in PortalDrop (portal=0x16c3138, isTopCommit=0 '\000') at portalmem.c:510
        __func__ = "PortalDrop"
#11 0x00000000006e0eb2 in exec_simple_query (query_string=0x1738468 "select count(*) from pg_buffercache where isdirty") at postgres.c:1095
        parsetree = 0x1739140
        portal = 0x16c3138
        snapshot_set = <value optimized out>
        commandTag = <value optimized out>
        completionTag = "SELECT 1\000\000\000\000\000\000\000\000h\204s\001\000\000\000\000h\204s\001\000\000\000\000m pg_buf\220!\002\"=\000\000\000\336\000\000\000\000\000\000\000\205\312~\000\000\000\000"
        querytree_list = <value optimized out>
        plantree_list = 0x178ae48
        receiver = 0x178ae78
        format = 0
        dest = DestRemote
        oldcontext = 0x1677e48
        parsetree_list = 0x1739270
        parsetree_item = 0x1739250
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        isTopLevel = 1 '\001'
        msec_str = "\220\205S\a\377\177\000\000\000\207S\a\377\177\000\000h\204s\001", '\000' <repeats 11 times>
        __func__ = "exec_simple_query"
#12 0x00000000006e256c in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x16c8f08 "smsconsole", username=<value optimized out>) at postgres.c:4072
        query_string = 0x1738468 "select count(*) from pg_buffercache where isdirty"
        firstchar = 81
        input_message = {data = 0x1738468 "select count(*) from pg_buffercache where isdirty", len = 50, maxlen = 1024, cursor = 50}
        local_sigjmp_buf = {{__jmpbuf = {140733316302560, 2475436528304998327, 1, 1523890764, -9187201950435737471, 0, -2475701272437110857, 2475436853937391543}, __mask_was_saved = 1, __saved_mask = {__val = {0, 0, 4294967295, 12667896, 1, 12667240, 0, 9259542123273814145, 0, 0, 1024, 23891912, 12732800, 1523890764, 8372124, 140733316302592}}}}
        send_ready_for_query = 0 '\000'
        disable_idle_in_transaction_timeout = 0 '\000'
        __func__ = "PostgresMain"
#13 0x00000000006823e5 in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4342
        ac = 1
        usecs = 446360
        i = 1
        av = 0x16c8fc8
        maxac = <value optimized out>
#14 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4016
        bn = <value optimized out>
        pid = 0
#15 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1721
        rmask = {fds_bits = {32, 0 <repeats 15 times>}}
        selres = <value optimized out>
        now = <value optimized out>
        readmask = {fds_bits = {120, 0 <repeats 15 times>}}
        nSockets = 7
        last_lockfile_recheck_time = 1523890764
        last_touch_time = 1523887759
#16 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1329
        opt = <value optimized out>
        status = <value optimized out>
        userDoption = <value optimized out>
        listen_addr_saved = <value optimized out>
        i = <value optimized out>
        output_config_variable = <value optimized out>
        __func__ = "PostmasterMain"
#17 0x0000000000609fe8 in main (argc=3, argv=0x1676990) at main.c:228
No locals.

10.04.2018, 22:56, "Peter Geoghegan" <pg@bowt.ie>:
> On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>> Sorry. I can't do this. Because, 1s - this is production and 2nd - it had been initialized as replica.
>> We use 2 server, and this server (with error) now initialized as secondary.
>> We use pgpool for switching servers.
>> And i posted full log from Postgres logs.
>> All other logs is clear.
>
> I won't be able to help you without this information.
>
> --
> Peter Geoghegan
Hello, Peter.

Also from the logs:

*** glibc detected *** postgres: postgres smsconsole [local] SELECT: double free or corruption (!prev): 0x000000000179e370 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d22275dee]
/lib64/libc.so.6[0x3d22278c80]
postgres: postgres smsconsole [local] SELECT(tuplestore_end+0x17)[0x808887]
postgres: postgres smsconsole [local] SELECT(ExecEndFunctionScan+0x75)[0x5e94e5]
postgres: postgres smsconsole [local] SELECT(standard_ExecutorEnd+0x2e)[0x5cbaae]
postgres: postgres smsconsole [local] SELECT(PortalCleanup+0x9e)[0x593d6e]
postgres: postgres smsconsole [local] SELECT(PortalDrop+0x2a)[0x7fcaca]
postgres: postgres smsconsole [local] SELECT[0x6e0eb2]
postgres: postgres smsconsole [local] SELECT(PostgresMain+0xdcc)[0x6e256c]
postgres: postgres smsconsole [local] SELECT(PostmasterMain+0x1875)[0x6823e5]
postgres: postgres smsconsole [local] SELECT(main+0x7a8)[0x609fe8]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3d2221ed1d]
postgres: postgres smsconsole [local] SELECT[0x46c039]

-- 
Great thanks for your answers.

10.04.2018, 22:56, "Peter Geoghegan" <pg@bowt.ie>:
> On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>> Sorry. I can't do this. Because, 1s - this is production and 2nd - it had been initialized as replica.
>> We use 2 server, and this server (with error) now initialized as secondary.
>> We use pgpool for switching servers.
>> And i posted full log from Postgres logs.
>> All other logs is clear.
>
> I won't be able to help you without this information.
>
> --
> Peter Geoghegan
Hello, Peter.

Sorry for the repeated mails.

> It would be particularly helpful if you were able to collect a
> coredump, and run "p *debug_query_string" from GDB.

(gdb) p *debug_query_string
$1 = 115 's'

10.04.2018, 22:56, "Peter Geoghegan" <pg@bowt.ie>:
> On Fri, Apr 6, 2018 at 12:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>> Sorry. I can't do this. Because, 1s - this is production and 2nd - it had been initialized as replica.
>> We use 2 server, and this server (with error) now initialized as secondary.
>> We use pgpool for switching servers.
>> And i posted full log from Postgres logs.
>> All other logs is clear.
>
> I won't be able to help you without this information.
>
> --
> Peter Geoghegan
On Mon, Apr 16, 2018 at 8:46 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> Hello, Peter.
>
> Sorry for such mailing.
>
>> It would be particularly helpful if you were able to collect a
>> coredump, and run "p *debug_query_string" from GDB.
>
> (gdb) p *debug_query_string
> $1 = 115 's'

Sorry, I meant "p debug_query_string" -- lose the *.

-- 
Peter Geoghegan
On Mon, Apr 16, 2018 at 9:04 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> (gdb) p *debug_query_string
>> $1 = 115 's'
>
> Sorry, I meant "p debug_query_string" -- lose the *.

This now seems unnecessary, since it's already evident from your "bt full" output that the query involved in the crash was "select count(*) from pg_buffercache where isdirty".

-- 
Peter Geoghegan
Peter Geoghegan wrote:
> This now seems unnecessary, since it's already evident from your "bt
> full" output that the query involved in the crash was "select count(*)
> from pg_buffercache where isdirty".

Hmm, so is the Zabbix monitoring running that query frequently? Because, as I recall, pg_buffercache is pretty heavy on the system, since it needs to acquire all the bufmgr locks simultaneously?

In other words, this seems a terrible query to be running in zabbix. I have vague memories of somebody submitting a version of this code that returned approximate answers, good enough for monitoring ...

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-04-16 14:03:40 -0300, Alvaro Herrera wrote:
> Peter Geoghegan wrote:
>
>> This now seems unnecessary, since it's already evident from your "bt
>> full" output that the query involved in the crash was "select count(*)
>> from pg_buffercache where isdirty".
>
> Hmm, so is the Zabbix monitoring running that query frequently?
> Because, as I recall, pg_buffercache is pretty heavy on the system,
> since it needs to acquire all the bufmgr locks simultaneously?
>
> In other words, this seems a terrible query to be running in zabbix.

It can be extremely useful, however, for predicting how much longer your workload's hot data set will fit into cache. That's worth the cost in a number of cases... Either way, a crash is clearly something separate.

> I have vague memories of somebody submitting a version of this code
> that returned approximate answers, good enough for monitoring ...

That might have been me, but I don't recall the details anymore...

Greetings,

Andres Freund
On Mon, Apr 16, 2018 at 10:09 AM, Andres Freund <andres@anarazel.de> wrote:
>> I have vague memories of somebody submitting a version of this code
>> that returned approximate answers, good enough for monitoring ...
>
> That might have been me, but I don't recall the details anymore...

Obviously you're thinking of 6e654546fb61f62cc982d0c8f62241b3b30e7ef8. I have a hard time imagining how that could be implicated in this hard crash, though, except perhaps by removing something that masked the problem in earlier versions.

-- 
Peter Geoghegan
On 2018-04-16 10:13:27 -0700, Peter Geoghegan wrote:
> On Mon, Apr 16, 2018 at 10:09 AM, Andres Freund <andres@anarazel.de> wrote:
>>> I have vague memories of somebody submitting a version of this code
>>> that returned approximate answers, good enough for monitoring ...
>>
>> That might have been me, but I don't recall the details anymore...
>
> Obviously you're thinking of 6e654546fb61f62cc982d0c8f62241b3b30e7ef8.
> I have a hard time imagining how that could be implicated in this hard
> crash, though, except perhaps by removing something that masked the
> problem in earlier versions.

Can't be involved, because the crashing version is 9.6, which doesn't include that, afaics.

Greetings,

Andres Freund
Hi,

On 2018-04-05 22:39:42 +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference:      15144
> Logged by:          Vitaly Voronov
> Email address:      wizard_1024@tut.by
> PostgreSQL version: 9.6.8
> Operating system:   CentOS 6.9
> Description:
>
> Hello,
>
> We have problem at our Master server (second time).
> From first time, we update CentOS to latest version (6.9)

What's your shared_buffers setting? Is this a 32bit or 64bit
installation?

Greetings,

Andres Freund
Peter Geoghegan wrote:
> On Mon, Apr 16, 2018 at 10:09 AM, Andres Freund <andres@anarazel.de> wrote:
> >> I have vague memories of somebody submitting a version of this code
> >> that returned approximate answers, good enough for monitoring ...
> >
> > That might have been me, but I don't recall the details anymore...
>
> Obviously you're thinking of 6e654546fb61f62cc982d0c8f62241b3b30e7ef8.
> I have a hard time imagining how that could be implicated in this hard
> crash, though, except perhaps by removing something that masked the
> problem in earlier versions.

Yeah, I wasn't commenting on the crash itself -- just on how bad it is
to let Zabbix monitor your database in this way. Maybe it *is* useful
in certain situations, as Andres says, but I bet zabbix doesn't
actually discriminate like that.

Now, looking at the code

	for (i = 0; i < node->nfuncs; i++)
	{
		FunctionScanPerFuncState *fs = &node->funcstates[i];

		if (fs->func_slot)
			ExecClearTuple(fs->func_slot);

		if (fs->tstore != NULL)
		{
			tuplestore_end(node->funcstates[i].tstore);
			fs->tstore = NULL;
		}

and tuplestore_end does this:

	if (state->myfile)
		BufFileClose(state->myfile);

without setting anything in state to NULL; so we're relying on the
caller setting fs->tstore to NULL to avoid repeated tuplestore_end
calls. I can't see any way for this to misbehave, but maybe the
funcstate appears more than once in the PerFuncState array, and we
clean it correctly the first time around and then invoke
tuplestore_end() the second time on the memory that was previously
freed? I think this makes no sense unless we share
FunctionScanPerFuncState elements -- do we?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Apr 16, 2018 at 10:48 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> and tuplestore_end does this:
>
> 	if (state->myfile)
> 		BufFileClose(state->myfile);
>
> without setting anything in state to NULL; so we're relying on the
> caller setting fs->tstore to NULL to avoid repeated tuplestore_end
> calls. I can't see any way for this to misbehave, but maybe the
> funcstate appears more than once in the PerFuncState array, and we
> clean it correctly the first time around and then invoke
> tuplestore_end() the second time on the memory that was previously
> freed? I think this makes no sense unless we share
> FunctionScanPerFuncState elements -- do we?

I have no reason to think that we do. Offhand, I find it more likely
that some executor slot that imagines that it owns the tuple frees the
tuple once, which is followed by a call to tuplestore_end() that frees
the same tuple a second time (a double-free). As I mentioned, we've
seen several bugs of that general variety in both tuplestore and
tuplesort in the past. Some of these have been very subtle.

Note that pgpool is involved here. I don't know much about pgpool, and
maybe that's totally irrelevant.

--
Peter Geoghegan
Hello, Andres.

From postgresql.conf:

    shared_buffers = 12GB

This is a 64-bit installation.

16.04.2018, 20:29, "Andres Freund" <andres@anarazel.de>:
> Hi,
>
> On 2018-04-05 22:39:42 +0000, PG Bug reporting form wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      15144
>> Logged by:          Vitaly Voronov
>> Email address:      wizard_1024@tut.by
>> PostgreSQL version: 9.6.8
>> Operating system:   CentOS 6.9
>> Description:
>>
>> Hello,
>>
>> We have problem at our Master server (second time).
>> From first time, we update CentOS to latest version (6.9)
>
> What's your shared_buffers setting? Is this a 32bit or 64bit
> installation?
>
> Greetings,
>
> Andres Freund
Hello, Peter.

Pgpool is used only from the app side, at another server.

Zabbix runs its queries directly at the database server.

16.04.2018, 21:03, "Peter Geoghegan" <pg@bowt.ie>:
> On Mon, Apr 16, 2018 at 10:48 AM, Alvaro Herrera
> <alvherre@alvh.no-ip.org> wrote:
>> and tuplestore_end does this:
>>
>> 	if (state->myfile)
>> 		BufFileClose(state->myfile);
>>
>> without setting anything in state to NULL; so we're relying on the
>> caller setting fs->tstore to NULL to avoid repeated tuplestore_end
>> calls. I can't see any way for this to misbehave, but maybe the
>> funcstate appears more than once in the PerFuncState array, and we
>> clean it correctly the first time around and then invoke
>> tuplestore_end() the second time on the memory that was previously
>> freed? I think this makes no sense unless we share
>> FunctionScanPerFuncState elements -- do we?
>
> I have no reason to think that we do. Offhand, I find it more likely
> that some executor slot that imagines that it owns the tuple frees the
> tuple once, which is followed by a call to tuplestore_end() that frees
> the same tuple a second time (a double-free). As I mentioned, we've
> seen several bugs of that general variety in both tuplestore and
> tuplesort in the past. Some of these have been very subtle.
>
> Note that pgpool is involved here. I don't know much about pgpool, and
> maybe that's totally irrelevant.
>
> --
> Peter Geoghegan
On Mon, Apr 16, 2018 at 11:05 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> Hello, Peter.
>
> Pgpool is used only from the app side, at another server.
>
> Zabbix runs its queries directly at the database server.

Can you reliably crash the server by running "select count(*) from
pg_buffercache where isdirty" from psql?

What work_mem setting does Postgres have when Zabbix runs this query?

--
Peter Geoghegan
Hello, Peter.

> Can you reliably crash the server by running "select count(*) from
> pg_buffercache where isdirty" from psql?

No. I ran it from psql and got the answer:

    # select count(*) from pg_buffercache where isdirty;
     count
    -------
        71
    (1 row)

> What work_mem setting does Postgres have when Zabbix runs this query?

work_mem = 64MB

16.04.2018, 21:35, "Peter Geoghegan" <pg@bowt.ie>:
> On Mon, Apr 16, 2018 at 11:05 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>> Hello, Peter.
>>
>> Pgpool is used only from the app side, at another server.
>>
>> Zabbix runs its queries directly at the database server.
>
> Can you reliably crash the server by running "select count(*) from
> pg_buffercache where isdirty" from psql?
>
> What work_mem setting does Postgres have when Zabbix runs this query?
>
> --
> Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
> Offhand, I find it more likely that some executor slot that imagines
> that it owns the tuple frees the tuple once, which is followed by a
> call to tuplestore_end() that frees the same tuple a second time (a
> double-free). As I mentioned, we've seen several bugs of that general
> variety in both tuplestore and tuplesort in the past. Some of these
> have been very subtle.

I see that in 9.6, nodeFunctionScan thinks it should do ExecClearTuple
on the func_slot that it's received from tuplestore_gettupleslot,
which it calls with copy = false, meaning that ExecClearTuple might be
deleting a tuple returned by tuplestore_gettuple. I wonder if this
is the same kind of issue we fixed in 90decdba3, only for tuplestore
rather than tuplesort.

tuplestore_gettuple doesn't return should_free = true unless the
tuplestore spilled to disk, so the sort of issue I'm imagining
would only arise for function results large enough to cause a spill.

BTW, I notice that in this situation, readtup_heap seems to be
palloc'ing in the caller's context, but it counts the memory as
if it were in the tuplestore's context. Somebody's confused there.

			regards, tom lane
On Mon, Apr 16, 2018 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
>> Offhand, I find it more likely that some executor slot that imagines
>> that it owns the tuple frees the tuple once, which is followed by a
>> call to tuplestore_end() that frees the same tuple a second time (a
>> double-free). As I mentioned, we've seen several bugs of that general
>> variety in both tuplestore and tuplesort in the past. Some of these
>> have been very subtle.
>
> I see that in 9.6, nodeFunctionScan thinks it should do ExecClearTuple
> on the func_slot that it's received from tuplestore_gettupleslot,
> which it calls with copy = false, meaning that ExecClearTuple might be
> deleting a tuple returned by tuplestore_gettuple. I wonder if this
> is the same kind of issue we fixed in 90decdba3, only for tuplestore
> rather than tuplesort.

I'm going to spend some time trying to reproduce the bug tomorrow. I
suspect that we can justify bringing tuplestore in line with tuplesort
defensively, though (i.e. doing something like 90decdba3 for
tuplestore, even in the absence of strong evidence that that will
prevent this crash).

> tuplestore_gettuple doesn't return should_free = true unless the
> tuplestore spilled to disk, so the sort of issue I'm imagining
> would only arise for function results large enough to cause a spill.

Sounds familiar.

> BTW, I notice that in this situation, readtup_heap seems to be
> palloc'ing in the caller's context, but it counts the memory as
> if it were in the tuplestore's context. Somebody's confused there.

I could just kick myself for not going through tuplestore (and its
version of readtup_heap) as part of the 90decdba3 work.

--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
> On Mon, Apr 16, 2018 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, I notice that in this situation, readtup_heap seems to be
>> palloc'ing in the caller's context, but it counts the memory as
>> if it were in the tuplestore's context. Somebody's confused there.

> I could just kick myself for not going through tuplestore (and its
> version of readtup_heap) as part of the 90decdba3 work.

Yeah, I should have thought to question that too. tuplestore was
originally built by stripping down tuplesort, and at least in the
beginning, I'm pretty sure that all these semantic API details were
the same. We should likely have made more effort to keep them in
sync. (Still, until we've proven that there *is* a bug here,
let's not kick ourselves too hard.)

			regards, tom lane
On Mon, Apr 16, 2018 at 3:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, I should have thought to question that too. tuplestore was
> originally built by stripping down tuplesort, and at least in the
> beginning, I'm pretty sure that all these semantic API details were
> the same. We should likely have made more effort to keep them in
> sync. (Still, until we've proven that there *is* a bug here,
> let's not kick ourselves too hard.)

FWIW, I think that tuplesort remains a good example for tuplestore to
follow, since the enhancements that prevented the tuplesort crash on
v10+ make just as much sense for tuplestore (and could even have been
justified purely on robustness grounds). Many small palloc() calls are
certainly something that we should try to avoid.

Actually, I once looked into writing such a patch for tuplestore
myself, but IIRC tuplestore_clear() and interXact support made it more
painful than initially thought.

--
Peter Geoghegan
>>>>> "Vitaly" == Vitaly V Voronov <wizard_1024@tut.by> writes:

 Vitaly> #4  0x0000003d22278c80 in _int_free () from /lib64/libc.so.6
 Vitaly> No symbol table info available.
 Vitaly> #5  0x0000000000808887 in tuplestore_end (state=0x1794898) at tuplestore.c:455
 Vitaly>         i = <value optimized out>

So I may be off base here but...

Line 455 isn't anything to do with tuples; it's the BufFileClose()
line. Furthermore, there's no stack frame between the free() and the
tuplestore_end. From looking at optimized builds, this suggests that
free() has been reached via tail calls, and the only way I see that
happening is when pfree() is being called on a large allocation (one
large enough to be its own chunk), which shouldn't happen for tuples
in this example. (It can happen for the memtuples array itself.)

BufFile is a struct with a big buffer in it, though, so it'll be a
large allocation, hence the pfree() at the end of BufFileClose will
end up in free() via tail calls.

Of course the weak point in this theory is that there seems to be no
reason at all why BufFileClose could possibly get called twice ... the
only other theory would be that something has somehow reset the memory
context _before_ we got here.

--
Andrew (irc:RhodiumToad)
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Andrew> Of course the weak point in this theory is that there seems to
 Andrew> be no reason at all why BufFileClose could possibly get called
 Andrew> twice ... the only other theory would be that something has
 Andrew> somehow reset the memory context _before_ we got here.

... and in this context I just noticed that pg_stat_statements is in
play, which could be significant.

--
Andrew (irc:RhodiumToad)
On Mon, Apr 16, 2018 at 8:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
> We have got a crash at our secondary server.
> This is the full stack trace from the crash dump:

Did the problem recur?

Sorry for letting this one lapse.

--
Peter Geoghegan
Hello, Peter.

After we disabled the monitoring request to pg_buffercache and
upgraded to 9.6.9, we haven't encountered the problem again.

20.06.2018, 00:35, "Peter Geoghegan" <pg@bowt.ie>:
> On Mon, Apr 16, 2018 at 8:42 AM, Vitaly V. Voronov <wizard_1024@tut.by> wrote:
>> We have got a crash at our secondary server.
>> This is the full stack trace from the crash dump:
>
> Did the problem recur?
>
> Sorry for letting this one lapse.
>
> --
> Peter Geoghegan