Thread: vacuumdb: add --dry-run

vacuumdb: add --dry-run

From
Corey Huinker
Date:
This is a small patch to add a new option to vacuumdb to answer the question "what commands will actually be run by this combination of command-line switches against this database?" without actually running the commands.

Including Nathan because we had previously discussed the utility of just such a thing.
Attachments

Re: vacuumdb: add --dry-run

From
Nathan Bossart
Date:
On Mon, Nov 10, 2025 at 02:44:41PM -0500, Corey Huinker wrote:
> This is a small patch to add a new option to vacuumdb to answer the
> question "what commands will actually be run by this combination of
> command-line switches against this database?" without actually running the
> commands.

My attempts to test this all got stuck in wait_on_slots().  I haven't
looked too closely, but I suspect the issue is that the socket never
becomes readable because we don't send a query.  If I set free_slot->inUse
to false before printing the command, it no longer hangs.  We probably want
to create a function in parallel_slot.c to mark slots that we don't intend
to give a query as idle.
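
For illustration only, and not taken from the posted patch: the hang and
the proposed fix can be modelled without libpq. In the sketch below,
DemoSlot, demo_slot_set_idle(), demo_get_idle_slot(), and
demo_dispatch_dry_run() are hypothetical stand-ins; the real slot struct
and wait logic in parallel_slot.c are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for a parallel slot (assumption: the real struct
 * also carries the connection and a result handler). */
typedef struct DemoSlot
{
    bool inUse;  /* true while the slot is considered busy */
    int  uses;   /* demo-only counter: times this slot was handed out */
} DemoSlot;

/* Hypothetical helper: release a slot that was never given a query. */
static void
demo_slot_set_idle(DemoSlot *slot)
{
    assert(slot != NULL);
    slot->inUse = false;
}

/* Hand out the first idle slot and mark it busy.  Returns NULL when every
 * slot is busy; the real code would wait on the sockets at that point. */
static DemoSlot *
demo_get_idle_slot(DemoSlot *slots, int nslots)
{
    for (int i = 0; i < nslots; i++)
    {
        if (!slots[i].inUse)
        {
            slots[i].inUse = true;
            slots[i].uses++;
            return &slots[i];
        }
    }
    return NULL;
}

/* Dry-run dispatch loop: print each command instead of sending it.
 * Returns the number of commands printed, or -1 where the real program
 * would block forever (all slots busy, but no query was ever sent, so no
 * socket can become readable). */
static int
demo_dispatch_dry_run(DemoSlot *slots, int nslots,
                      const char **cmds, int ncmds, bool release_unused)
{
    for (int i = 0; i < ncmds; i++)
    {
        DemoSlot *slot = demo_get_idle_slot(slots, nslots);

        if (slot == NULL)
            return -1;

        printf("%s\n", cmds[i]);

        if (release_unused)
            demo_slot_set_idle(slot);
    }
    return ncmds;
}
```

With release_unused set, every iteration gets the first slot back, so the
other slots are never handed out. Without it, the loop runs out of slots
after nslots commands, which mirrors the reported hang.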

-- 
nathan



Re: vacuumdb: add --dry-run

From
Corey Huinker
Date:

> My attempts to test this all got stuck in wait_on_slots().  I haven't
> looked too closely, but I suspect the issue is that the socket never
> becomes readable because we don't send a query.  If I set free_slot->inUse
> to false before printing the command, it no longer hangs.  We probably want
> to create a function in parallel_slot.c to mark slots that we don't intend
> to give a query as idle.

Would that be preferable to skipping the creation of extra connections for parallel workers? I can see it both ways. On the one hand we want to give as true a reflection of "what would happen with these options", and on the other hand one could view the creation of extra workers as "real" vs a dry run.

 

Re: vacuumdb: add --dry-run

From
Nathan Bossart
Date:
On Mon, Nov 10, 2025 at 05:33:34PM -0500, Corey Huinker wrote:
>> My attempts to test this all got stuck in wait_on_slots().  I haven't
>> looked too closely, but I suspect the issue is that the socket never
>> becomes readable because we don't send a query.  If I set free_slot->inUse
>> to false before printing the command, it no longer hangs.  We probably want
>> to create a function in parallel_slot.c to mark slots that we don't intend
>> to give a query as idle.
> 
> Would that be preferable to skipping the creation of extra connections for
> parallel workers? I can see it both ways. On the one hand we want to give
> as true a reflection of "what would happen with these options", and on the
> other hand one could view the creation of extra workers as "real" vs a dry
> run.

I think what I'm proposing actually does skip creating extra connections.
If we're immediately marking the first connection as idle, each loop
iteration should reuse the same connection.

BTW it might be better to modify run_vacuum_command() to skip running the
command in dry-run mode.  That would also take care of the
ONLY_DATABASE_STATS stuff.  We should probably do something about the
executeCommand() for --analyze-in-stages, too.
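
As a rough illustration of that suggestion (a sketch under assumptions,
not the real run_vacuum_command()): putting the dry-run check inside the
one function that sends commands means every caller, including the final
ONLY_DATABASE_STATS step, is covered without its own special case. The
names demo_run_command(), demo_send_fn, and demo_count_sender() are
hypothetical, and the sender callback stands in for the libpq call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for "send this SQL to the server"; returns true on success. */
typedef bool (*demo_send_fn) (const char *sql);

/* Demo-only sender that just counts how many commands it was given. */
static int demo_sent_count = 0;

static bool
demo_count_sender(const char *sql)
{
    (void) sql;
    demo_sent_count++;
    return true;
}

/* Print the command when echoing or in dry-run mode, and only send it
 * when not in dry-run mode.  Returns true if the command was sent. */
static bool
demo_run_command(const char *sql, bool echo, bool dry_run,
                 FILE *out, demo_send_fn send)
{
    assert(sql != NULL);

    if (echo || dry_run)
        fprintf(out, "%s\n", sql);

    if (dry_run)
        return false;   /* nothing reaches the server */

    return send(sql);
}
```

A caller would use the same call for table-level commands and for the
database-wide one, for example
demo_run_command("VACUUM (ONLY_DATABASE_STATS);", false, true, stdout,
demo_count_sender), which prints the statement and leaves
demo_sent_count unchanged.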

-- 
nathan