Thread: VACUUM FULL results in ERROR: integer out of range
Hi everyone,

Consider the example below:

-- thread 1:
\c db1;
CREATE TABLE t1(c0 int);
INSERT INTO t1(c0) VALUES(2147483647);
UPDATE t1 SET c0 = 0;
CREATE INDEX i0 ON t1((1 + t1.c0));
VACUUM FULL; -- unexpected: ERROR: integer out of range

-- thread 2:
DROP DATABASE db2;
CREATE DATABASE db2;

I would expect the VACUUM FULL to execute without error. However, it fails with "integer out of range", which would be expected for the CREATE INDEX had the UPDATE not been executed.

I can reliably reproduce this on my machine on the first execution of thread 1's statements when a second thread (repeatedly) drops and creates a database. I've attached a Java program that should reproduce this, throwing an exception "org.postgresql.util.PSQLException: ERROR: integer out of range" immediately after the program starts.

This bug report is similar to a previous case that I reported, where executing VACUUM FULL on distinct databases can result in a deadlock (see https://www.postgresql.org/message-id/CA%2Bu7OA6pL%2B7Xm_NXHLenxffe3tCr3gTamVdr7zPjcWqW0RFM-A%40mail.gmail.com). That one might also be unexpected only from a user perspective, so I am not sure whether this bug report is considered relevant.

Best,
Manuel
Attachments
Manuel Rigger <rigger.manuel@gmail.com> writes:
> Consider the example below:
> -- thread 1:
> \c db1;
> CREATE TABLE t1(c0 int);
> INSERT INTO t1(c0) VALUES(2147483647);
> UPDATE t1 SET c0 = 0;
> CREATE INDEX i0 ON t1((1 + t1.c0));
> VACUUM FULL; -- unexpected: ERROR: integer out of range
> -- thread 2:
> DROP DATABASE db2;
> CREATE DATABASE db2;
> I would expect that the VACUUM FULL executes without error. However,
> it shows an error "integer out of range", which would be expected for
> the CREATE INDEX, had the UPDATE not been executed.

This can be reproduced in a less magical way thus:

Session 1:

regression=# begin transaction isolation level serializable;
BEGIN
regression=# select * from unrelated_table;
...

Leave that sit, and in session 2 do:

regression=# CREATE TABLE t1(c0 int);
CREATE TABLE
regression=# INSERT INTO t1(c0) VALUES(2147483647);
INSERT 0 1
regression=# UPDATE t1 SET c0 = 0;
UPDATE 1
regression=# CREATE INDEX i0 ON t1((1 + t1.c0));
CREATE INDEX
regression=# vacuum t1;
VACUUM
regression=# vacuum full t1;
ERROR:  integer out of range

What's evidently happening is that since the row with c0 = 2147483647 is still potentially live to some onlooker transaction, the index rebuild forced by VACUUM FULL is trying to create an index entry for it. I imagine that your original example with a concurrent database drop/create is likewise causing a transaction to be open during the relevant window.

Now, what's curious is that the CREATE INDEX itself didn't fail likewise. Apparently, we have more-careful analysis of live vs. dead rows during the initial index creation than we do during a forced rebuild, because somehow CREATE INDEX is deciding that it needn't make an index entry for that row, even though it was exactly as live-to-somebody at that point as it was during the VACUUM FULL. I haven't dug into the details of what the difference is, nor whether it'd be practical to make the behavior the same for both cases.
It's even possible that VACUUM FULL is doing the right thing and it's a bug that the CREATE INDEX didn't fail (though I doubt this).

The larger point here is that even if we decide to change something about this specific case, there are going to be closely related cases that will fail, and it won't be a bug, because construction of the failing index entry will be semantically required. In general, CREATE INDEX doesn't get to ignore rows just because they're dead to the current transaction.

			regards, tom lane
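One such closely related case can be sketched as follows (a hypothetical variant; the names t2 and i1 are illustrative, not from the thread). If the expression index already exists when the out-of-range row is inserted, the failing index entry is required at INSERT time, so the error there is expected rather than a bug:

```sql
-- With the index created first, evaluating the index expression
-- 1 + 2147483647 overflows int4 during the INSERT itself.
CREATE TABLE t2(c0 int);
CREATE INDEX i1 ON t2((1 + t2.c0));
INSERT INTO t2(c0) VALUES(2147483647);  -- ERROR: integer out of range
```

The original report only avoids this because the index is created after the offending row has already been updated away.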
I wrote:
> What's evidently happening is that since the row with c0 = 2147483647
> is still potentially live to some onlooker transaction, the index
> rebuild forced by VACUUM FULL is trying to create an index entry for
> it. I imagine that your original example with a concurrent database
> drop/create is likewise causing a transaction to be open during the
> relevant window.

> Now, what's curious is that the CREATE INDEX itself didn't fail likewise.
> Apparently, we have more-careful analysis of live vs. dead rows during
> the initial index creation than we do during a forced rebuild, because
> somehow CREATE INDEX is deciding that it needn't make an index entry
> for that row, even though it was exactly as live-to-somebody at that
> point as it was during the VACUUM FULL.

Ah, here's the explanation (in HEAD this is in heapam_index_build_range_scan):

            case HEAPTUPLE_RECENTLY_DEAD:

                /*
                 * If tuple is recently deleted then we must index it
                 * anyway to preserve MVCC semantics.  (Pre-existing
                 * transactions could try to use the index after we finish
                 * building it, and may need to see such tuples.)
                 *
                 * However, if it was HOT-updated then we must only index
                 * the live tuple at the end of the HOT-chain.  Since this
                 * breaks semantics for pre-existing snapshots, mark the
                 * index as unusable for them.
                 *
                 ...

The tuple in question *was* HOT-updated, since there were no indexes in existence at the time of the UPDATE that would prevent that. So we more or less accidentally avoid generating the index-entry-that-would-fail, at the price that the index is not immediately usable by transactions with old snapshots.

VACUUM FULL preserves RECENTLY_DEAD tuples, as it must, but does not preserve HOT-update tuple relationships. So it needs to generate an index entry for this row, and kaboom. The only way to make the two cases behave identically would be for VACUUM FULL to preserve HOT-update tuple relationships.
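For readers who want to see this mechanism directly, a sketch using standard catalog columns (pg_stat_user_tables.n_tup_hot_upd and pg_index.indcheckxmin; run it after session 2's steps above, with session 1's old snapshot still open):

```sql
-- n_tup_hot_upd counts updates that went through the HOT path;
-- it should show the UPDATE on t1 was HOT (statistics may lag briefly).
SELECT n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 't1';

-- indcheckxmin = true is the "unusable for pre-existing snapshots"
-- marker set when CREATE INDEX encounters a broken HOT chain.
SELECT indexrelid::regclass AS index_name, indcheckxmin
FROM pg_index
WHERE indrelid = 't1'::regclass;
```

Without a concurrent old transaction, the updated-away tuple may be plain DEAD rather than RECENTLY_DEAD, in which case indcheckxmin stays false and the whole problem disappears.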
Even if we wished to do that (unlikely, since it'd be a complicated and hard-to-test code path), it would have its own downsides. To name one, we'd be forced into putting all the copies of tuples in a HOT chain into the same output page, resulting in worse packing. Having the output indexes be not-immediately-usable would not be very nice either.

In short, there's nothing I particularly want to change here. The HOT-induced behavior is a little surprising, but it's adjacent to behaviors that are absolutely required by the MVCC semantic model.

			regards, tom lane