Re: a vacuum thread is not the answer

Поиск
Список
Период
Сортировка
От mlw
Тема Re: a vacuum thread is not the answer
Дата
Msg-id 3C8F7CB4.DD172133@mohawksoft.com
обсуждение исходный текст
Ответ на a vacuum thread is not the answer  (mlw <markw@mohawksoft.com>)
Список pgsql-hackers
Hannu Krosing wrote:
> 
> On Thu, 2002-03-07 at 20:15, mlw wrote:
> > I was just toying around with things, and you know, running vacuum in the
> > background doesn't work. It slows things down too much.
> >
> > The worst case senario is when one does this:
> >
> > update accounts set abalance = abalance + 1 ;
> >
> > This takes forever to run and doubles the size of the table.
> 
> How is this related to running vacuum in background ?
> 
> Does it run fast when vacuum is not running ?

The problem is that it doubles the size of a table. This invariably means that
you have more I/O. If there were a way to reuse old tulples, while they are
still in the buffer cache, then PostgreSQL could handle this query faster.

It was, however, pointed out that (obviously) you can't do reclaimation during
a transaction because if it fails or someone issues "rollback" you have broken
the database. 

So, I guess I'm saying ignore that part.

> 
> > Is there a way that a separate thread managing the freelist can perform a "per
> > row" vacuum concurrently? Maybe I am stating the problem incorrectly, but we
> > need to be able to recover rows already in memory for performance.
> 
> What could be possibly done (and is probably not very useful anyway) is
> reusing the row modified _in_the_same_transaction_ so that
> 
> begin;
> abalance = abalance + 1 ;
> abalance = abalance + 1 ;
> abalance = abalance + 1 ;
> end;
> 
> would consume just 2x the tablespace and not 4x. But this does not
> require a separate thread, just some changes in update logic.
> 
> OTOH, this will probably interfere with some transaction modes that make
> use of command ids.

I haven't looked at the code, so I don't even know if it is doable. Could a
small vacuum thread run in the background and monitor the buffer cache? When it
finds a buffer with an unreferenced tuple, do what vacuum does, but only to
that block?

Here is my problem with vacuum. It scans the whole damn table and it takes a
long time. In many, dare I say most, SQL databases, the rows which are updated
are likely a small percent.

If a small vacuum routine can be run against the blocks that are already in the
buffer, this will eliminate a block read, and focus more on blocks which are
likely to have been modified.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: select max(column) not using index
Следующее
От: Hannu Krosing
Дата:
Сообщение: Re: select max(column) not using index