Thread: table bloat very fast and free space can not be reused


table bloat very fast and free space can not be reused

From:
James Pang
Date:
experts:
     Source database is v14, with the pglogical extension 2.4.5 replicating to a new v17. The source table receives very frequent UPDATEs/DELETEs/INSERTs and has two text columns and a jsonb column. In the source database, where the application updates/deletes/inserts through SQL, the table size did not increase quickly. But in the target PG v17, populated by pglogical apply, the table grew very quickly; its size doubled within days.
    Vacuum analyze did not help; only vacuum full helps. pgstattuple shows that most of the space is free space, much more than on the source. It looks like the replicated DML always asks for new pages instead of reusing existing free space.
    
Thanks,

James

Re: table bloat very fast and free space can not be reused

From:
Christophe Pettus
Date:

> On Apr 18, 2026, at 18:28, James Pang <jamespang886@gmail.com> wrote:
>
> experts:
>      source database v14, pglogical extension 2.4.5 replication to new 17, source table very frequent UPDATEs
> /DELETES/INSERTS and has two text and jsonb. in source database, application update/delete/insert through SQL, table
> size did not increased quickly. but in target pg v17, by pglogical apply, we found table increased very quickly, table
> size got doubled in days.
>     Vacuum analyze did not help, only vacuum full can help. pgstattuple show most of space are free space, that
> much more than source. it looks like these replicate DML always asking new pages instead of reuse existing freespace.
>     Thanks,
>
> James

Hello,

1. First, are you *sure* that the free space is the source of the bloat, and not dead tuples?  Could you share the queries you ran to determine this?
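As a sketch of that check (the table name `mytable` is a placeholder), pgstattuple reports dead-tuple space and free space separately:

```sql
-- Requires: CREATE EXTENSION pgstattuple;
-- A high dead_tuple_percent suggests vacuum is not keeping up;
-- a high free_percent suggests empty space that is not being reused.
SELECT tuple_percent,
       dead_tuple_percent,
       free_percent
FROM pgstattuple('mytable');

-- For tables with wide text/jsonb values, check the TOAST relation too
-- (substitute the table's actual pg_toast relation name):
-- SELECT * FROM pgstattuple('pg_toast.pg_toast_<oid>');
```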

2. Have you set fillfactor to anything besides 100 on either the source or the destination?
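One way to verify this (table name `mytable` is a placeholder) is to inspect the relation's storage options, which are empty when fillfactor is at its default of 100:

```sql
-- NULL reloptions, or an array without 'fillfactor=...',
-- means the default fillfactor of 100 is in effect.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'mytable';

-- To set it explicitly (affects newly written pages only):
-- ALTER TABLE mytable SET (fillfactor = 90);
```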

3. You might consider using in-core logical replication rather than pglogical for this.  By PostgreSQL v14, in-core logical replication is likely a better choice.

Best,
-- Christophe


Re: table bloat very fast and free space can not be reused

From:
James Pang
Date:
1. We use the default fillfactor on both source and target. Sorry, we actually have two environments, v14 --> v17 and v16 --> v17; both see a similar issue.
2. Is there any key difference between in-core logical replication and the pglogical extension in how UPDATEs are applied? We want to use pglogical's conflict resolution; it is a bi-directional replication (when the workload runs on the source v16, it replicates data to v17; when it runs on v17, it replicates back to v16).
3. Free space is high on both source and target, but the amount of free space shows a big difference.

We found this issue and tried VACUUM FULL several times on v17, but after VACUUM FULL the table bloats again very quickly. You can see the total TOAST size is much larger than on the source (the workload is still running). Target v17 (by pglogical):

xxx=> select * from pgstattuple('xxxx');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 191160320 |       81911 | 107823097 |          56.4 |             1439 |        2020471 |               1.06 |   79614944 |        41.65
(1 row)

xxx=>  select * from pgstattuple('pg_toast.pg_toast_xxxxx');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 905928704 |      117924 | 169760027 |         18.74 |             6330 |        9336459 |               1.03 |  722476480 |        79.75
(1 row)


target v17
xxxx=>  select * from pgstattuple('xxxxxx');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 194191360 |       81905 | 106028087 |          54.6 |             4004 |        5801923 |               2.99 |   80468096 |        41.44
(1 row)

xxxx=>  select * from pgstattuple('pg_toast.pg_toast_xxxxx');
 table_len  | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
------------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 1666334720 |      117921 | 169758344 |         10.19 |             9302 |       13732019 |               0.82 | 1475454612 |        88.54
(1 row)

James

Christophe Pettus <xof@thebuild.com> wrote on Sunday, April 19, 2026 at 11:56 AM:



Re: table bloat very fast and free space can not be reused

From:
Kristjan Mustkivi
Date:
On Sun, Apr 19, 2026 at 4:28 AM James Pang <jamespang886@gmail.com> wrote:
>
> experts:
>      source database v14 ,  pglogical extension 2.4.5 replication to new 17, source table very frequent UPDATEs
/DELETES/INSERTS and has two text and jsonb. in source database, application update/delete/insert through SQL, table
sizedid not increased quickly. but in target pg v17 , by pglogical apply, we found table increased very quickly , table
sizegot doubled in days. 
>     Vacuum analyze did not help , only vacuum full can help. pgstattuple show most of space are free space , that
muchmore than source.  it looks like these replicate DML always asking new pages instead of reuse existing freespace. 

Hi,

We have had the exact same problem in the past: a large json column
receiving very frequent updates (hundreds, even thousands, per minute)
on the source, with the table on the replica side bloating up. No amount
of tuning made any difference (converting json to text, table-specific
autovacuum settings, etc.). Eventually the table was taken out of
replication and later made UNLOGGED, which also freed up a giant swathe
of storage from WAL writes. Such a design was then declared an
antipattern and has not been allowed since.
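The two steps described above can be sketched as follows (the replication set and table names are placeholders); note that an UNLOGGED table is not crash-safe and cannot itself be logically replicated:

```sql
-- Stop replicating the table (pglogical syntax; 'default' is the
-- replication set name, 'mytable' a placeholder table name):
SELECT pglogical.replication_set_remove_table('default', 'mytable');

-- Make the table unlogged: no WAL is written for its changes anymore,
-- at the cost of losing its contents after a crash.
ALTER TABLE mytable SET UNLOGGED;
```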

With best regards,

Kristjan



Re: table bloat very fast and free space can not be reused

From:
Rick Otten
Date:


On Sun, Apr 19, 2026 at 7:44 AM Kristjan Mustkivi <sonicmonkey@gmail.com> wrote:

Which design is the antipattern: using json for volatile data sets, or unlogging the table?

Does `pg_repack` help?  I know it probably isn't practical to run it every couple of days, and it can sometimes cause headaches when repacking a table with a lot of logical replication activity, but it might be a tool to consider if you haven't already.  If you can partition the table with the huge volume of json changes, you don't have to repack all the partitions; you might be able to repack just the older ones with the worst bloat.

Re: table bloat very fast and free space can not be reused

From:
Kristjan Mustkivi
Date:
On Mon, Apr 20, 2026 at 5:39 PM Rick Otten <rottenwindfish@gmail.com> wrote:
>
>  Which design is an antipattern?  Using json for volatile data sets or unlogging the table?

For us, a large (i.e spilling over to TOAST) json blob in a table
where this json blob (or text, does not matter) gets very frequent
(hundreds and thousands of) updates per minute.

> Does `pg_repack` help?  I know it probably isn't practical to run it every couple of days.  It also can sometimes
> cause headaches when repacking a table with a ton of logical replication activity, but it might be a tool to consider
> if you haven't already.   If you can partition the table with the crazy amount of json changes, you don't have to repack
> all the partitions, you might be able to repack just the older ones with the worst bloat.

This feels like complexity I personally would like to avoid. Far
more preferable is to get rid of the json: split the large json into a
normalized table design based on the most useful/frequent access patterns.
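As a sketch of that normalization (all table and column names here are hypothetical), the frequently updated fields move out of the jsonb blob into plain columns:

```sql
-- Hypothetical example: 'status' and 'updated_at' are the hot fields
-- that were being rewritten inside the jsonb blob on every update.
CREATE TABLE session_state (
    session_id  bigint PRIMARY KEY,
    status      text,
    updated_at  timestamptz,
    details     jsonb          -- cold remainder, rarely updated
);

-- Frequent updates now touch only small fixed-width columns, so the
-- wide jsonb value is not rewritten (and re-TOASTed) on every change:
-- UPDATE session_state SET status = 'active', updated_at = now()
--  WHERE session_id = 42;
```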

Br,
--
Kristjan Mustkivi

Email: kristjan.mustkivi@gmail.com