Re: PostgreSQL clustering VS MySQL clustering
| From | Simon Riggs |
|---|---|
| Subject | Re: PostgreSQL clustering VS MySQL clustering |
| Date | |
| Msg-id | 1106511352.31592.112.camel@localhost.localdomain |
| In response to | Re: PostgreSQL clustering VS MySQL clustering (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: PostgreSQL clustering VS MySQL clustering |
| List | pgsql-performance |
On Sat, 2005-01-22 at 16:10 -0500, Tom Lane wrote:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Probably VACUUM works well for small to medium size tables, but not
> > for huge ones. I'm considering about to implement "on the spot
> > salvaging dead tuples".
>
> That's impossible on its face, except for the special case where the
> same transaction inserts and deletes a tuple. In all other cases, the
> transaction deleting a tuple cannot know whether it will commit.

Perhaps Tatsuo has an idea...

As Tom says, if you have only a single row version and then you update
that row to create a second version, we must not remove the first
version, since it is effectively the undo copy. However, if there were
already 2+ row versions, then as Tatsuo suggests, it might be possible
to use on-the-spot salvaging of dead tuples. It might be worth checking
the Xid of the earlier row version(s), to see if they are now expired
and could be removed immediately. However, with a high number of
concurrent updaters this extra effort would not help much, since the
other row versions might still be transaction-in-progress versions. So
the idea would often be useful, but not for rows that are being updated
repeatedly by many concurrent transactions.

A slight change to the idea might be better: if a row update would
cause a block split, and there is more than one row version, vacuum the
whole block first, then re-attempt the update. That way we wouldn't do
this extra work on every update, only when space becomes a problem. I'm
suggesting putting a call to vacuum_page() into heap_update(),
immediately before any call to RelationGetBufferForTuple(). We already
know that page splitting is an expensive operation, so doing some work
to try to avoid it could frequently pay off. This would be isolated to
the update path.

This wouldn't remove the need for vacuuming, but it would act to
prevent severe performance degradation caused by frequent re-updating.

What do you think?

--
Best Regards, Simon Riggs
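[Editorial note: the following is a minimal standalone C sketch of the decision being proposed, not PostgreSQL source code. When an updated row version no longer fits on its page, it first tries an on-the-spot salvage of expired versions on that page, and only then falls back to the expensive new-page path. All type and helper names here (place_updated_tuple, page_free_space, prune_expired_versions, get_new_page_for_tuple) are hypothetical stand-ins; in the actual proposal the pruning call would sit inside heap_update(), just before RelationGetBufferForTuple().]

```c
/*
 * Standalone sketch of the proposal above (hypothetical stand-in names,
 * not PostgreSQL internals): before moving an updated tuple to a new
 * block, try to salvage expired row versions on the current page.
 */
#include <stddef.h>

typedef struct Page Page;                        /* stand-in for a heap page */

extern size_t page_free_space(const Page *page);         /* hypothetical helper */
extern void   prune_expired_versions(Page *page);        /* the "vacuum_page()" idea */
extern Page  *get_new_page_for_tuple(size_t tuple_len);  /* models the block-split path */

/*
 * Decide where the new row version goes.  The expensive path (asking for
 * a new page) is taken only if on-the-spot salvaging cannot free enough
 * room on the current page.
 */
Page *
place_updated_tuple(Page *cur_page, size_t new_tuple_len)
{
    if (page_free_space(cur_page) < new_tuple_len)
    {
        /*
         * Opportunistic, on-the-spot salvage: remove earlier row versions
         * whose Xid shows they are expired for every running transaction.
         */
        prune_expired_versions(cur_page);

        if (page_free_space(cur_page) < new_tuple_len)
            return get_new_page_for_tuple(new_tuple_len);   /* still no room */
    }

    return cur_page;    /* the update now fits in place */
}
```

As the mail notes, this only pays off when the earlier versions are actually expired; under heavy concurrent updating the salvage attempt would usually find nothing removable.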