Обсуждение: BRIN autosummarization lacking a snapshot

Поиск
Список
Период
Сортировка

BRIN autosummarization lacking a snapshot

От
Álvaro Herrera
Дата:
If you have a BRIN index with autosummarize=on, and it's based on a
function that requires a snapshot, autovacuum will simply error out when
trying to summarize a range.  Here's a reproducer:

create table journal (d timestamp);
create function packdate(d timestamp) returns text as $$
   begin return to_char(d, 'yyyymm'); end; $$
   language plpgsql returns null on null input immutable;
create index on journal using brin (packdate(d))
  with (autosummarize = on, pages_per_range = 1);

Now insert some data,
  insert into journal select generate_series(timestamp '2025-01-01', '2025-12-31', '1 day');

and wait for autovacuum to fire.  You'll get an error like

  2025-11-03 12:17:42.263 CET [536755] ERROR:  cannot execute SQL without an outer snapshot or portal

This patch fixes it.  I haven't given much thought to adding a good way
to test this yet ...

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude."                              (Brian Kernighan)

Вложения

Re: BRIN autosummarization lacking a snapshot

От
Michael Paquier
Дата:
On Mon, Nov 03, 2025 at 12:21:50PM +0100, Alvaro Herrera wrote:
> This patch fixes it.  I haven't given much thought to adding a good way
> to test this yet ...

Spawning an autovacuum worker can feel artistic as we try to make the
tests run fast, but it's not that bad.  The trick is to use an
"autovacuum_naptime = 1".  Then you could either scan the server logs
for some 'autovacuum: processing database "blah"', or just a polling
query based on pg_stat_all_tables.autovacuum_count.  See for example
006_signal_autovacuum.pl.
--
Michael

Вложения

Re: BRIN autosummarization lacking a snapshot

От
Álvaro Herrera
Дата:
On 2025-Nov-04, Michael Paquier wrote:

> Spawning an autovacuum worker can feel artistic as we try to make the
> tests run fast, but it's not that bad.  The trick is to use an
> "autovacuum_naptime = 1".  Then you could either scan the server logs
> for some 'autovacuum: processing database "blah"', or just a polling
> query based on pg_stat_all_tables.autovacuum_count.  See for example
> 006_signal_autovacuum.pl.

Ah yes ... and, actually, we already have a file doing a closely related
thing, so I added to it.  Here's the patch for master.  Backbranches are
essentially identical, modulo these changes for 13 and 14:

-use Test::More tests => 2;
+use Test::More tests => 4;

I'm glad we got rid of that :-)


With my initial try of this test, just counting the number of BRIN
tuples, I was _really_ surprised that the index did indeed contain the
expected number of tuples, even when the error was being thrown.  This
turned out to be expected, because the way BRIN summarization works is
that we insert a placeholder tuple first, then update it to the correct
value, and the error only aborts the second part.  That's why I needed
to add a WHERE clause to only count non-placeholder tuples.

I also added a 'sleep(1)', to avoid looping on the query when we know
autovacuum can't possibly have had a chance to run yet.

I unleashed CI on branches 15 and master, and will push soon if they
both turn green.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La virtud es el justo medio entre dos defectos" (Aristóteles)

Вложения