Thread: BUG #10155: BUG? Can't remove newly generated tuples after repeatable read transaction start.
BUG #10155: BUG? Can't remove newly generated tuples after repeatable read transaction start.
From: digoal@126.com
The following bug has been logged on the website:
Bug reference: 10155
Logged by: digoal.zhou
Email address: digoal@126.com
PostgreSQL version: 9.3.3
Operating system: CentOS 6.4 x64
Description:
SESSION A :
digoal=# begin isolation level repeatable read;
BEGIN
digoal=# \dt
List of relations
Schema | Name | Type | Owner
--------+-----------------------+-------+----------
public | stat_pg_stat_database | table | postgres
public | tbl_cost_align | table | postgres
public | test | table | postgres
public | ttt | table | postgres
public | tttt | table | postgres
public | ttttt | table | postgres
(6 rows)
SESSION B :
digoal=# create table t as select * from pg_class;
SELECT 306
SESSION A :
There is no table t in the list; session A can't see t's metadata in pg_class, pg_attribute, and so on.
digoal=# \dt
List of relations
Schema | Name | Type | Owner
--------+-----------------------+-------+----------
public | stat_pg_stat_database | table | postgres
public | tbl_cost_align | table | postgres
public | test | table | postgres
public | ttt | table | postgres
public | tttt | table | postgres
public | ttttt | table | postgres
(6 rows)
SESSION B :
But B can't reclaim the rows of table t. Why?
I understand that PostgreSQL can't reclaim tuples that already existed before the repeatable read transaction started, but in this case t's tuples were created after session A's transaction began, so why can't they be reclaimed?
digoal=# delete from t;
DELETE 306
digoal=# vacuum freeze verbose t;
INFO: vacuuming "public.t"
INFO: "t": found 0 removable, 306 nonremovable row versions in 2 out of 2
pages
DETAIL: 306 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_33578"
INFO: index "pg_toast_33578_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_33578": found 0 removable, 0 nonremovable row versions in 0
out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
When session A's transaction ends, session B can reclaim these dead tuples.
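As an aside, the open transaction that is holding back the cleanup can be spotted from any other session. A minimal check against pg_stat_activity (column names as in 9.2+; the pid and timestamp in the output are hypothetical, and session A shows up as "idle in transaction"):

digoal=# select pid, state, xact_start
digoal-#   from pg_stat_activity
digoal-#  where xact_start is not null and pid <> pg_backend_pid()
digoal-#  order by xact_start;
  pid  |        state        |          xact_start
-------+---------------------+-------------------------------
 12345 | idle in transaction | 2014-04-28 11:30:00.000000+08
(1 row)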
Re: BUG #10155: BUG? Can't remove newly generated tuples after repeatable read transaction start.
From: Heikki Linnakangas
On 04/28/2014 11:37 AM, digoal@126.com wrote:
> SESSION B :
> But B can't reclaim the rows of table t. Why?
> I understand that PostgreSQL can't reclaim tuples that already existed
> before the repeatable read transaction started, but in this case t's
> tuples were created after session A's transaction began, so why can't
> they be reclaimed?

I think what you're arguing is that the system should be smarter and be able to reclaim the dead tuples. Because session A began before the table was even created, and there are no other backends that would need to see them either, they could indeed be safely vacuumed. The system just isn't smart enough to distinguish the case.

The short answer is that such an optimization just doesn't exist in PostgreSQL. It's certainly not a bug.

The long answer is that actually, even though the table was created after the transaction in session A began, session A *can* access the table. Schema changes don't follow the normal MVCC rules. If you do "SELECT * FROM t" in session A, it will work. However, the rows still won't be visible to session A, because they were inserted after the snapshot was taken, so they could still be vacuumed if the system tracked the snapshots more carefully and was able to deduce that. But the fact that a new table was created is not relevant.

- Heikki
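To illustrate that last point, here is a hypothetical transcript (not from the original report) of what session A would see while still inside its repeatable read transaction: the table is accessible, but none of its rows are visible.

digoal=# select count(*) from t;
 count
-------
     0
(1 row)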
Hi,
This is painful for pg_dump when a database is big, because pg_dump also uses repeatable read isolation; while pg_dump is backing up the database, the database will bloat.
Can we optimize it?
--
公益是一辈子的事 ("Public welfare is a lifelong commitment"), I'm Digoal, Just Do It.
At 2014-04-29 02:53:33, "Heikki Linnakangas" <hlinnakangas@vmware.com> wrote:
> On 04/28/2014 11:37 AM, digoal@126.com wrote:
>> SESSION B :
>> But B can't reclaim the rows of table t. Why?
>> I understand that PostgreSQL can't reclaim tuples that already existed
>> before the repeatable read transaction started, but in this case t's
>> tuples were created after session A's transaction began, so why can't
>> they be reclaimed?
>
> I think what you're arguing is that the system should be smarter and be
> able to reclaim the dead tuples. Because session A began before the
> table was even created, and there are no other backends that would need
> to see them either, they could indeed be safely vacuumed. The system
> just isn't smart enough to distinguish the case.
>
> The short answer is that such an optimization just doesn't exist in
> PostgreSQL. It's certainly not a bug.
>
> The long answer is that actually, even though the table was created
> after the transaction in session A began, session A *can* access the
> table. Schema changes don't follow the normal MVCC rules. If you do
> "SELECT * FROM t" in session A, it will work. However, the rows still
> won't be visible to session A, because they were inserted after the
> snapshot was taken, so they could still be vacuumed if the system
> tracked the snapshots more carefully and was able to deduce that. But
> the fact that a new table was created is not relevant.
>
> - Heikki
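The claim that pg_dump holds a repeatable read transaction open for its whole run can be observed from another session while a dump is in progress. A rough sketch, assuming a pg_dump recent enough to report its application_name, with hypothetical pid and timestamp in the output:

$ pg_dump digoal > digoal.sql &
$ psql -c "select pid, application_name, xact_start from pg_stat_activity where application_name = 'pg_dump';"
  pid  | application_name |          xact_start
-------+------------------+-------------------------------
 23456 | pg_dump          | 2014-04-29 10:00:00.000000+08
(1 row)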
Re: BUG #10155: BUG? Can't remove newly generated tuples after repeatable read transaction start.
From: Heikki Linnakangas
On 04/29/2014 02:18 AM, 德哥 (digoal) wrote:
> Hi,
> This is painful for pg_dump when a database is big, because pg_dump also
> uses repeatable read isolation; while pg_dump is backing up the database,
> the database will bloat.

Yep. One approach is to take a filesystem-level backup (i.e. with pg_start/stop_backup() or pg_basebackup), start a second server from that backup, and run pg_dump against that.

> Can we optimize it?

Not easily. I'd love to do something about it, but it's going to be a long project.

- Heikki
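A rough sketch of that workaround (directory, port, and database name are illustrative, not from the thread): take a self-contained base backup, start it as a second server on another port, and point pg_dump at the clone, so the long-running dump transaction bloats only the throwaway copy.

$ pg_basebackup -D /tmp/clone -X fetch          # base backup including the WAL needed to start it
$ pg_ctl -D /tmp/clone -o "-p 5433" start       # second server on a non-conflicting port
$ pg_dump -p 5433 digoal > digoal.sql           # dump from the clone, not the production server
$ pg_ctl -D /tmp/clone stop                     # discard the clone afterwards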