Обсуждение: BRIN autosummarization lacking a snapshot
If you have a BRIN index with autosummarize=on, and it's based on a function that requires a snapshot, autovacuum will simply error out when trying to summarize a range. Here's a reproducer: create table journal (d timestamp); create function packdate(d timestamp) returns text as $$ begin return to_char(d, 'yyyymm'); end; $$ language plpgsql returns null on null input immutable; create index on journal using brin (packdate(d)) with (autosummarize = on, pages_per_range = 1); Now insert some data, insert into journal select generate_series(timestamp '2025-01-01', '2025-12-31', '1 day'); and wait for autovacuum to fire. You'll get an error like 2025-11-03 12:17:42.263 CET [536755] ERROR: cannot execute SQL without an outer snapshot or portal This patch fixes it. I haven't given much thought to adding a good way to test this yet ... -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Someone said that it is at least an order of magnitude more work to do production software than a prototype. I think he is wrong by at least an order of magnitude." (Brian Kernighan)
Вложения
On Mon, Nov 03, 2025 at 12:21:50PM +0100, Alvaro Herrera wrote: > This patch fixes it. I haven't given much thought to adding a good way > to test this yet ... Spawning an autovacuum worker can feel artistic as we try to make the tests run fast, but it's not that bad. The trick is to use an "autovacuum_naptime = 1". Then you could either scan the server logs for some 'autovacuum: processing database "blah"', or just a polling query based on pg_stat_all_tables.autovacuum_count. See for example 006_signal_autovacuum.pl. -- Michael
Вложения
On 2025-Nov-04, Michael Paquier wrote: > Spawning an autovacuum worker can feel artistic as we try to make the > tests run fast, but it's not that bad. The trick is to use an > "autovacuum_naptime = 1". Then you could either scan the server logs > for some 'autovacuum: processing database "blah"', or just a polling > query based on pg_stat_all_tables.autovacuum_count. See for example > 006_signal_autovacuum.pl. Ah yes ... and, actually, we already have a file doing a closely related thing, so I added to it. Here's the patch for master. Backbranches are essentially identical, modulo these changes for 13 and 14: -use Test::More tests => 2; +use Test::More tests => 4; I'm glad we got rid of that :-) With my initial try of this test, just counting the number of BRIN tuples, I was _really_ surprised that the index did indeed contain the expected number of tuples, even when the error was being thrown. This turned out to be expected, because the way BRIN summarization works is that we insert a placeholder tuple first, then update it to the correct value, and the error only aborts the second part. That's why I needed to add a WHERE clause to only count non-placeholder tuples. I also added a 'sleep(1)', to avoid looping on the query when we know autovacuum can't possibly have had a chance to run yet. I unleashed CI on branches 15 and master, and will push soon if they both turn green. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "La virtud es el justo medio entre dos defectos" (Aristóteles)