Обсуждение: amcheck assert failure
Hello, everyone! I was running some automated tests with amcheck on PG11 and encountered following assertion failure: TRAP: FailedAssertion("!((( (&(itup)->t_tid)->ip_posid ) & 0xF000) == 0)", File: "nbtutils.c", Line: 2168) 2019-04-20 01:03:37 GMT [17923]: [1-1] LOG: connection received: host=[local] 2019-04-20 01:03:37 GMT [17923]: [2-1] LOG: connection authorized: user=gsmol database=db1 2019-04-20 01:03:37 GMT [17886]: [9-1] LOG: server process (PID 17922) was terminated by signal 6: Aborted 2019-04-20 01:03:37 GMT [17886]: [10-1] DETAIL: Failed process was running: SELECT public.bt_index_check('pgbench_accounts_pkey', false); 1. pgbench setup in question was initialized with scale_factor 5. bt_index_check() was launched right after initialization. 2. pg_config: BINDIR = /home/gsmol/task/11/bin DOCDIR = /home/gsmol/task/11/share/doc/postgresql HTMLDIR = /home/gsmol/task/11/share/doc/postgresql INCLUDEDIR = /home/gsmol/task/11/include PKGINCLUDEDIR = /home/gsmol/task/11/include/postgresql INCLUDEDIR-SERVER = /home/gsmol/task/11/include/postgresql/server LIBDIR = /home/gsmol/task/11/lib PKGLIBDIR = /home/gsmol/task/11/lib/postgresql LOCALEDIR = /home/gsmol/task/11/share/locale MANDIR = /home/gsmol/task/11/share/man SHAREDIR = /home/gsmol/task/11/share/postgresql SYSCONFDIR = /home/gsmol/task/11/etc/postgresql PGXS = /home/gsmol/task/11/lib/postgresql/pgxs/src/makefiles/pgxs.mk CONFIGURE = '--prefix=/home/gsmol/task/11' '--enable-debug' '--enable-cassert' 'CFLAGS=-g3 -O0' '--with-python' CC = gcc CPPFLAGS = -D_GNU_SOURCE CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g3 -O0 CFLAGS_SL = -fPIC LDFLAGS = -Wl,--as-needed -Wl,-rpath,'/home/gsmol/task/11/lib',--enable-new-dtags LDFLAGS_EX = LDFLAGS_SL = LIBS = -lpgcommon -lpgport -lpthread -lz -lreadline -lrt -lcrypt -ldl -lm VERSION = PostgreSQL 11.2 3. PG configuration: listen_addresses = '127.0.0.1' max_worker_processes = 10 port = 21387 fsync = off log_statement = 'all' max_replication_slots = 10 hot_standby = on wal_keep_segments = 20 wal_level = 'replica' max_connections = 100 shared_buffers = 10MB fsync = off wal_level = logical hot_standby = off log_line_prefix = '%t [%p]: [%l-1] ' log_statement = none log_duration = on log_min_duration_statement = 0 log_connections = on log_disconnections = on max_wal_senders = 10 P.S. If some additional information will be needed, I`d be glad to help. -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Fri, Apr 19, 2019 at 6:15 PM Grigory Smolkin <g.smolkin@postgrespro.ru> wrote: > Hello, everyone! > I was running some automated tests with amcheck on PG11 and encountered > following assertion failure: > > TRAP: FailedAssertion("!((( (&(itup)->t_tid)->ip_posid ) & 0xF000) == > 0)", File: "nbtutils.c", Line: 2168) Was the assertion failure on Postgres Pro's fork? I understand that there are differences in this exact area on the forked version, because it has to deal with two representations -- the fork had INCLUDE indexes before the community version, which explicitly represents the number of attributes in pivot tuples. I wouldn't usually ask this, but the failure is very basic. It's hard to believe that it would take this long to notice it. I also note that you have "fsync = off". I suppose that that might be relevant, too. -- Peter Geoghegan
Hello! On 04/20/2019 04:30 AM, Peter Geoghegan wrote: > On Fri, Apr 19, 2019 at 6:15 PM Grigory Smolkin > <g.smolkin@postgrespro.ru> wrote: >> Hello, everyone! >> I was running some automated tests with amcheck on PG11 and encountered >> following assertion failure: >> >> TRAP: FailedAssertion("!((( (&(itup)->t_tid)->ip_posid ) & 0xF000) == >> 0)", File: "nbtutils.c", Line: 2168) > Was the assertion failure on Postgres Pro's fork? I understand that > there are differences in this exact area on the forked version, > because it has to deal with two representations -- the fork had > INCLUDE indexes before the community version, which explicitly > represents the number of attributes in pivot tuples. No, it`s vanilla PG11: db2=# select version(); version -------------------------------------------------------------------------------------------------------- PostgreSQL 11.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), 64-bit (1 row) > > I wouldn't usually ask this, but the failure is very basic. It's hard > to believe that it would take this long to notice it. Maybe it wasn't noticed because binaries, shipped to users, usually compiled without '--cassert-enable'. > I also note that you have "fsync = off". I suppose that that might be > relevant, too. It`s automated test, instance get inited, filled with data, subjected with various tests and deleted in span of several seconds. And no crash has happened during this time. -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Sat, Apr 20, 2019 at 2:08 AM Grigory Smolkin <g.smolkin@postgrespro.ru> wrote: > It`s automated test, instance get inited, filled with data, subjected > with various tests and deleted in span of several seconds. And no crash > has happened during this time. Can you send me a raw binary dump of the page using pageinspect? Something like the following command line recipe will do it: $ psql -XAtc "SELECT encode(get_raw_page('pgbench_pkey', 42),'base64')" | base64 -d > dump_block_42.page Obviously you'll need to figure out which block number the problem page is located in before you do this, but that shouldn't be hard -- get a coredump and inspect the state from amcheck a few frames down. -- Peter Geoghegan
On Sat, Apr 20, 2019 at 2:00 PM Peter Geoghegan <pg@bowt.ie> wrote: > Obviously you'll need to figure out which block number the problem > page is located in before you do this, but that shouldn't be hard -- > get a coredump and inspect the state from amcheck a few frames down. Alternatively, you could dump the page image from within gdb. Go to the frame for _bt_check_natts() (e.g., enter "f 1"), and then dump the contents of the page to file: (gdb) dump binary memory /tmp/dump_block.page page (page + 8192) -- Peter Geoghegan
On Sat, Apr 20, 2019 at 2:37 PM Peter Geoghegan <pg@bowt.ie> wrote: > Alternatively, you could dump the page image from within gdb. Go to > the frame for _bt_check_natts() (e.g., enter "f 1"), and then dump the > contents of the page to file: I was inspired to do a write-up on this for the Wiki: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Dumping_a_page_image_from_within_GDB Seems like a good idea to centralize knowledge of backend instrumentation that's useful for bug reports, even beyond GDB. -- Peter Geoghegan
I`m sorry for a delay, was away from PC during weekend. On 04/21/2019 02:06 AM, Peter Geoghegan wrote: > On Sat, Apr 20, 2019 at 2:37 PM Peter Geoghegan <pg@bowt.ie> wrote: >> Alternatively, you could dump the page image from within gdb. Go to >> the frame for _bt_check_natts() (e.g., enter "f 1"), and then dump the >> contents of the page to file: > I was inspired to do a write-up on this for the Wiki: > > https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Dumping_a_page_image_from_within_GDB > > Seems like a good idea to centralize knowledge of backend > instrumentation that's useful for bug reports, even beyond GDB. > Wow, that` neat, thank you for a dump instruction! Attached. -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
On Mon, Apr 22, 2019 at 9:03 AM Grigory Smolkin <g.smolkin@postgrespro.ru> wrote: > Wow, that` neat, thank you for a dump instruction! > Attached. This looks like it was deliberately corrupted. Line pointer 255 is an LP_REDIRECT, line pointer, while line pointer 256 and 257 are LP_UNUSED, which should never happen. The ascii string "blablahblah" has been overwritten at byte offset 1040 (from the beginning of the page). I would take a closer look at what those tests do. Either the problem is in the tests, or your database is haunted, which is outside my area of expertise. -- Peter Geoghegan
On 04/22/2019 09:05 PM, Peter Geoghegan wrote: > On Mon, Apr 22, 2019 at 9:03 AM Grigory Smolkin > <g.smolkin@postgrespro.ru> wrote: >> Wow, that` neat, thank you for a dump instruction! >> Attached. > This looks like it was deliberately corrupted. Line pointer 255 is an > LP_REDIRECT, line pointer, while line pointer 256 and 257 are > LP_UNUSED, which should never happen. > > The ascii string "blablahblah" has been overwritten at byte offset > 1040 (from the beginning of the page). I would take a closer look at > what those tests do. Either the problem is in the tests, or your > database is haunted, which is outside my area of expertise. > I`m so stupid >.< It`s data corruption test, but I`ve forgot to turn checksums on. I`m sorry for wasting your time. -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Mon, Apr 22, 2019 at 12:48 PM Grigory Smolkin <g.smolkin@postgrespro.ru> wrote: > I`m so stupid >.< > It`s data corruption test, but I`ve forgot to turn checksums on. > I`m sorry for wasting your time. Don't worry about it - everyone makes stupid mistakes. I got a laugh out of it, and it caused me to document the process for dumping out page images within GDB, so there was some benefit. -- Peter Geoghegan
Grigory Smolkin <g.smolkin@postgrespro.ru> writes: > On 04/22/2019 09:05 PM, Peter Geoghegan wrote: >> The ascii string "blablahblah" has been overwritten at byte offset >> 1040 (from the beginning of the page). I would take a closer look at >> what those tests do. Either the problem is in the tests, or your >> database is haunted, which is outside my area of expertise. > It`s data corruption test, but I`ve forgot to turn checksums on. > I`m sorry for wasting your time. Nonetheless ... do we really want an assertion failure rather than some more-mundane error report? Seems like the point of amcheck is to detect data corruption, so I'd rather get an ERROR than an assertion (which would not fire in production builds anyway). regards, tom lane
On Mon, Apr 22, 2019 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Nonetheless ... do we really want an assertion failure rather than > some more-mundane error report? Seems like the point of amcheck > is to detect data corruption, so I'd rather get an ERROR than an > assertion (which would not fire in production builds anyway). (Apologies for sending this a second time, Tom -- somehow failed to CC the list the first time) I think that the size cross-check must have failed to catch the problem: /* * lp_len should match the IndexTuple reported length exactly, since * lp_len is completely redundant in indexes, and both sources of * tuple length are MAXALIGN()'d. nbtree does not use lp_len all that * frequently, and is surprisingly tolerant of corrupt lp_len fields. */ if (tupsize != ItemIdGetLength(itemid)) ereport(ERROR, (errcode(ERRCODE_INDEX_CORRUPTED), errmsg("index tuple size does not equal lp_len in index \"%s\"", RelationGetRelationName(state->rel)), errdetail_internal("Index tid=(%u,%u) tuple size=%zu lp_len=%u page lsn=%X/%X.", state->targetblock, offset, tupsize, ItemIdGetLength(itemid), (uint32) (state->targetlsn >> 32), (uint32) state->targetlsn), errhint("This could be a torn page problem."))); It seems like the length from tupsize (IndexTupleSize()) happened to match the redundant ItemIdGetLength()-reported size from the line pointer. No idea how that could actually be possible, since the value returned by ItemIdGetLength() would be 12592. What are the chances of that accidentally matching what the wild itup pointer reported as its IndexTupleSize()? The only explanation I can think of is that the assertion failure was from the core code -- _bt_check_natts() is itself called within core code assertions, so it's just about possible that that was how this happened, before amcheck was even called. Doesn't seem like a very good explanation. Do you think that that's possible, Grigory? In any case, you're clearly right that amcheck could be more defensive here. My pg_hexedit tool detected that there was an LP_UNUSED item pointer with storage (i.e. whose lp_len > 0), which is a check that amcheck can and should perform, but doesn't. Especially because it would prevent a deference of a likely-wild pointer (IndexTuple). We could: * Throw an error when any line item reports lp_len == 0. * Throw an error when there is a line item that's either LP_UNUSED, or LP_REDIRECT. The corrupt page sent by Grigory had both. What do you think of the idea of committing some simple checks to do this with contrib/amcheck on v12? -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Mon, Apr 22, 2019 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Nonetheless ... do we really want an assertion failure rather than >> some more-mundane error report? Seems like the point of amcheck >> is to detect data corruption, so I'd rather get an ERROR than an >> assertion (which would not fire in production builds anyway). > The only explanation I can think of is that the assertion failure was > from the core code That's what I guessed. It might be impractical for amcheck to prevent it, but we should try to find out. > We could: > * Throw an error when any line item reports lp_len == 0. > * Throw an error when there is a line item that's either LP_UNUSED, or > LP_REDIRECT. The corrupt page sent by Grigory had both. > What do you think of the idea of committing some simple checks to do > this with contrib/amcheck on v12? No objection here, as long as it doesn't add an unreasonable amount of complexity. (If it does, we might have to think harder about what to do.) regards, tom lane
On Mon, Apr 22, 2019 at 1:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > The only explanation I can think of is that the assertion failure was > > from the core code > > That's what I guessed. It might be impractical for amcheck to prevent it, > but we should try to find out. On second thought, amcheck is very unlikely to have accessed a corrupt system catalog index or something at the same time the pgbench_pkey index was also corrupt. The page image did appear to be from pgbench_pkey, so it would take two corrupt indexes at the same time for the assertion failure to be from the core code, outside of amcheck. A slightly less insane theory is that this has something to do with undefined behavior. Hard to imagine how that could make amcheck's test go away, but I've heard horror stories that were at least superficially similar to what I have in mind. In any case, it will be possible to make amcheck avoid undefined behavior when a page image is corrupt. That seems like a worthy goal, regardless of the specifics of this case, because amcheck's charter is to absolutely distrust on-disk state. > No objection here, as long as it doesn't add an unreasonable amount > of complexity. (If it does, we might have to think harder about > what to do.) I want to do some basic sanity checking of the line pointer against the page header, to ensure that any index tuple pointer actually points to memory that it's safe to do a load from (i.e. from the temp BLCKSZ buffer). I'll add another check to that list: * Make sure that each line pointer is "lp_off < pd_special". Note that pg_special is already verified-sane for each page. These checks are simple and low risk in my opinion. I will come up with a patch in the next few days. -- Peter Geoghegan
On 04/22/2019 11:31 PM, Peter Geoghegan wrote: > On Mon, Apr 22, 2019 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Nonetheless ... do we really want an assertion failure rather than >> some more-mundane error report? Seems like the point of amcheck >> is to detect data corruption, so I'd rather get an ERROR than an >> assertion (which would not fire in production builds anyway). > (Apologies for sending this a second time, Tom -- somehow failed to CC > the list the first time) > > I think that the size cross-check must have failed to catch the problem: > > /* > * lp_len should match the IndexTuple reported length exactly, since > * lp_len is completely redundant in indexes, and both sources of > * tuple length are MAXALIGN()'d. nbtree does not use lp_len all that > * frequently, and is surprisingly tolerant of corrupt lp_len fields. > */ > if (tupsize != ItemIdGetLength(itemid)) > ereport(ERROR, > (errcode(ERRCODE_INDEX_CORRUPTED), > errmsg("index tuple size does not equal lp_len in > index \"%s\"", > RelationGetRelationName(state->rel)), > errdetail_internal("Index tid=(%u,%u) tuple > size=%zu lp_len=%u page lsn=%X/%X.", > state->targetblock, offset, > tupsize, ItemIdGetLength(itemid), > (uint32) (state->targetlsn >> 32), > (uint32) state->targetlsn), > errhint("This could be a torn page problem."))); > > It seems like the length from tupsize (IndexTupleSize()) happened to > match the redundant ItemIdGetLength()-reported size from the line > pointer. No idea how that could actually be possible, since the value > returned by ItemIdGetLength() would be 12592. What are the chances of > that accidentally matching what the wild itup pointer reported as its > IndexTupleSize()? > > The only explanation I can think of is that the assertion failure was > from the core code -- _bt_check_natts() is itself called within core > code assertions, so it's just about possible that that was how this > happened, before amcheck was even called. Doesn't seem like a very > good explanation. Do you think that that's possible, Grigory? No, it was definitely amcheck: https://pastebin.postgrespro.ru/?2c8ddf5254b2c625#PTbr+jbxFPu2wfy5SFDWdiqaqPN/N5WX8t6mo5rDpmo= > > In any case, you're clearly right that amcheck could be more defensive > here. My pg_hexedit tool detected that there was an LP_UNUSED item > pointer with storage (i.e. whose lp_len > 0), which is a check that > amcheck can and should perform, but doesn't. Especially because it > would prevent a deference of a likely-wild pointer (IndexTuple). > > We could: > > * Throw an error when any line item reports lp_len == 0. > > * Throw an error when there is a line item that's either LP_UNUSED, or > LP_REDIRECT. The corrupt page sent by Grigory had both. > > What do you think of the idea of committing some simple checks to do > this with contrib/amcheck on v12? > -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Wed, Apr 24, 2019 at 1:50 AM Grigory Smolkin <g.smolkin@postgrespro.ru> wrote: > No, it was definitely amcheck: > https://pastebin.postgrespro.ru/?2c8ddf5254b2c625#PTbr+jbxFPu2wfy5SFDWdiqaqPN/N5WX8t6mo5rDpmo= Please just post the stack trace to the list in future. We value having that directly available from the archives. It looks like the assertion failure happens at the point where the next tuple is being compared to the current one, in respect of the next tuple. The assertion failure occurs in the core code, technically, though it is pretty much under the direct control of amcheck at the time. This means that my exotic explanation was wrong, and the actual explanation is quite simple. (I told you that everyone makes stupid mistakes.) -- Peter Geoghegan
On Mon, Apr 22, 2019 at 1:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > No objection here, as long as it doesn't add an unreasonable amount > of complexity. (If it does, we might have to think harder about > what to do.) Attached is a draft patch that adds a PageGetItemId() wrapper called PageGetItemIdCareful(), and has amcheck use it consistently. This does two basic sanity checks that seem very likely to avoid the kind of misbehaviors that result in a crash (including assertion failures) when amcheck is used. This is a bit more complicated than I'd hoped, because it is necessary to call PageGetItemIdCareful() before _bt_compare() and other core code in a way that wouldn't quite look right with raw PageGetItemId() calls. _bt_compare() will do its own PageGetItemId(), and will itself assert !bt_check_natts() (which was the issue in Grigory's problem report), so we need to make sure that that will be safe ahead of time. Even still, it doesn't seem too complicated to me. -- Peter Geoghegan