Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized
От | Stepan Neretin |
---|---|
Тема | Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized |
Дата | |
Msg-id | CA+Yyo5RA+KQPYZnsaKRRYpAKUpVKt2_xviKrqNrX_6pD7HkDpg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized (Dimitrios Apostolou <jimis@gmx.net>) |
Ответы |
Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized
|
Список | pgsql-performance |
On Mon, Jul 21, 2025 at 12:24 PM Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> FWIW I implemented a pg_restore --freeze patch, see attached. It needs
> another patch of mine from [1] that implements pg_restore --data-only
> --clean, which for parallel restores encases each COPY in its own transaction
> and prepends it with a TRUNCATE. All feedback is welcome.
>
> [1] https://www.postgresql.org/message-id/c61263f2-7472-5dd8-703d-01e683421f61%40gmx.net
>
> It works really fast for the data, and I see that some, but not all items
> from section=post-data, start parallel plans. For example I see CREATE INDEX
> spawns parallel workers.
I added it to July's commitfest, mostly to trigger discussion around the
issue.
https://commitfest.postgresql.org/patch/5826/
Not sure how to mark on the commitfest page that the patch requires
another patch from another commitfest entry.
Dimitris
Hello Dimitris and fellow hackers,
I'm And Warda(currently unavailable :( ) picking this patch up from the July CommitFest for review. Thanks, Dimitris, for submitting it and for tackling this well-known performance issue with restoring very large databases. The problem of `ALTER TABLE ... ADD FOREIGN KEY` taking days after a restore due to the lack of a visibility map is a significant pain point, and your proposed solution is a logical step.
I've reviewed the patch and the discussion in the thread. I've attached a rebased version of the patch against the current master.
The core idea of adding a `--freeze` option to `pg_restore` to leverage `COPY ... WITH (FREEZE)` is sound. It directly addresses the problem of needing a very long `VACUUM` run after data loading to make subsequent `post-data` steps, like foreign key validation, performant. The benefit isn't just a minor speedup in the `COPY` itself but the major time-saving of avoiding a full-table `VACUUM` on a massive dataset.
I ran a simple performance script based on your patch. On a small test case, I saw a consistent 10-15% speedup during the data-loading phase. While this is encouraging, the main benefit, as noted, is in the steps that follow.
=== Summary ===
freeze average: 1.205 seconds
nofreeze average: 1.324 seconds
~/.pg-vanilla/bin ᐅ ./tst.sh
=== Summary ===
freeze average: 1.362 seconds
nofreeze average: 1.480 seconds
#!/bin/bash
set -euo pipefail
DUMP_FILE="parttest.dump"
DB_PREFIX="parttest"
JOBS=$(nproc)
ROUNDS=3
timings() {
local mode=$1
local total_time=0
for i in $(seq 1 $ROUNDS); do
DB_NAME="${DB_PREFIX}_${mode}_$i"
psql -q -c "DROP DATABASE IF EXISTS $DB_NAME"
psql -q -c "CREATE DATABASE $DB_NAME"
start_time=$(date +%s.%N)
if [ "$mode" == "freeze" ]; then
pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists --freeze "$DUMP_FILE" > /dev/null
else
pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists "$DUMP_FILE" > /dev/null
fi
end_time=$(date +%s.%N)
duration=$(echo "$end_time - $start_time" | bc)
total_time=$(echo "$total_time + $duration" | bc)
done
average=$(echo "scale=3; $total_time / $ROUNDS" | bc)
echo "$mode average: $average seconds"
}
freeze_avg=$(timings freeze)
nofreeze_avg=$(timings nofreeze)
echo ""
echo "=== Summary ==="
echo "$freeze_avg"
echo "$nofreeze_avg"
The main area of concern is the implementation method in `pg_backup_archiver.c`. The patch introduces a `do_copy` function that modifies the `COPY` statement string to inject the `WITH (FREEZE)` option.
```c
if (cp_end - 10 > cp &&
strncmp(cp_end - 10, "FROM stdin", 10) == 0
)
cp_is_well_formed = true;
```
This approach of string parsing is quite fragile. It makes a strong assumption that the `COPY` statement generated by `pg_dump` will always end with `... FROM stdin;` (preceded by optional whitespace). While this is true now, it creates a tight and brittle coupling between `pg_dump`'s output format and `pg_restore`'s parsing logic. Any future changes to `pg_dump` that might add options or slightly alter the `COPY` syntax could break this feature silently or with a non-obvious warning.
A more robust solution would be to avoid string manipulation entirely. `pg_restore` should assemble the `COPY` command from its constituent parts (table name, column list, etc.) and then conditionally append the `WITH (FREEZE)` clause before the final `FROM stdin;`. This would decouple the feature from the exact string representation in the dump archive.
An alternative—and arguably cleaner—approach might be to shift this logic to pg_dump.
One important consideration that needs to be highlighted in the documentation for this feature is the impact on WAL generation. `COPY ... WITH (FREEZE)` is known to generate significantly more WAL traffic because it must log the freezing of tuples, which can be a surprise for users. Maybe we can insert already frozen pages?
Additionally, it should be noted that the freeze option only works correctly when performing a fresh load of data into empty tables.
Thanks again for your work on this.
Regards,
Stepan Neretin
I'm And Warda(currently unavailable :( ) picking this patch up from the July CommitFest for review. Thanks, Dimitris, for submitting it and for tackling this well-known performance issue with restoring very large databases. The problem of `ALTER TABLE ... ADD FOREIGN KEY` taking days after a restore due to the lack of a visibility map is a significant pain point, and your proposed solution is a logical step.
I've reviewed the patch and the discussion in the thread. I've attached a rebased version of the patch against the current master.
The core idea of adding a `--freeze` option to `pg_restore` to leverage `COPY ... WITH (FREEZE)` is sound. It directly addresses the problem of needing a very long `VACUUM` run after data loading to make subsequent `post-data` steps, like foreign key validation, performant. The benefit isn't just a minor speedup in the `COPY` itself but the major time-saving of avoiding a full-table `VACUUM` on a massive dataset.
I ran a simple performance script based on your patch. On a small test case, I saw a consistent 10-15% speedup during the data-loading phase. While this is encouraging, the main benefit, as noted, is in the steps that follow.
=== Summary ===
freeze average: 1.205 seconds
nofreeze average: 1.324 seconds
~/.pg-vanilla/bin ᐅ ./tst.sh
=== Summary ===
freeze average: 1.362 seconds
nofreeze average: 1.480 seconds
#!/bin/bash
set -euo pipefail
DUMP_FILE="parttest.dump"
DB_PREFIX="parttest"
JOBS=$(nproc)
ROUNDS=3
timings() {
local mode=$1
local total_time=0
for i in $(seq 1 $ROUNDS); do
DB_NAME="${DB_PREFIX}_${mode}_$i"
psql -q -c "DROP DATABASE IF EXISTS $DB_NAME"
psql -q -c "CREATE DATABASE $DB_NAME"
start_time=$(date +%s.%N)
if [ "$mode" == "freeze" ]; then
pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists --freeze "$DUMP_FILE" > /dev/null
else
pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists "$DUMP_FILE" > /dev/null
fi
end_time=$(date +%s.%N)
duration=$(echo "$end_time - $start_time" | bc)
total_time=$(echo "$total_time + $duration" | bc)
done
average=$(echo "scale=3; $total_time / $ROUNDS" | bc)
echo "$mode average: $average seconds"
}
freeze_avg=$(timings freeze)
nofreeze_avg=$(timings nofreeze)
echo ""
echo "=== Summary ==="
echo "$freeze_avg"
echo "$nofreeze_avg"
The main area of concern is the implementation method in `pg_backup_archiver.c`. The patch introduces a `do_copy` function that modifies the `COPY` statement string to inject the `WITH (FREEZE)` option.
```c
if (cp_end - 10 > cp &&
strncmp(cp_end - 10, "FROM stdin", 10) == 0
)
cp_is_well_formed = true;
```
This approach of string parsing is quite fragile. It makes a strong assumption that the `COPY` statement generated by `pg_dump` will always end with `... FROM stdin;` (preceded by optional whitespace). While this is true now, it creates a tight and brittle coupling between `pg_dump`'s output format and `pg_restore`'s parsing logic. Any future changes to `pg_dump` that might add options or slightly alter the `COPY` syntax could break this feature silently or with a non-obvious warning.
A more robust solution would be to avoid string manipulation entirely. `pg_restore` should assemble the `COPY` command from its constituent parts (table name, column list, etc.) and then conditionally append the `WITH (FREEZE)` clause before the final `FROM stdin;`. This would decouple the feature from the exact string representation in the dump archive.
An alternative—and arguably cleaner—approach might be to shift this logic to pg_dump.
One important consideration that needs to be highlighted in the documentation for this feature is the impact on WAL generation. `COPY ... WITH (FREEZE)` is known to generate significantly more WAL traffic because it must log the freezing of tuples, which can be a surprise for users. Maybe we can insert already frozen pages?
Additionally, it should be noted that the freeze option only works correctly when performing a fresh load of data into empty tables.
Thanks again for your work on this.
Regards,
Stepan Neretin
Вложения
В списке pgsql-performance по дате отправления: