Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL
Дата
Msg-id 20131127213320.GA3785@momjian.us
обсуждение исходный текст
Ответ на Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL  ("Kevin Grittner" <kgrittn@mail.com>)
Ответы Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Sat, Jan 12, 2013 at 02:14:03PM -0500, Kevin Grittner wrote:
> Amit Kapila wrote:
> > On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
>
> >> Surely VACUUM FULL should rebuild the visibility map, and make
> >> tuples in the new relation all-visible, no?
>
> Certainly it seems odd to me that VACUUM FULL leaves the the table
> in a less-well maintained state in terms of visibility than a
> "normal" vacuum. VACUUM FULL should not need to be followed by
> another VACUUM.
>
> > I think it cannot made all visible.
>
> I don't think all tuples in the relation are necessarily visible to
> all transactions, but the ones which are should probably be flagged
> that way.

I have developed the attached proof-of-concept patch to fix the problem
of having no visibility map after CLUSTER or VACUUM FULL.  I tested with
these queries:

    CREATE TABLE test(x INT PRIMARY KEY);
    INSERT INTO test VALUES (1);
    VACUUM FULL test; -- or CLUSTER
    SELECT relfilenode FROM pg_class WHERE relname = 'test';
     relfilenode
    -------------
           16399

Then 'ls -l data/base/16384/16399*' to see the *_vm file.  I am not sure
how to test that the vm contents are valid.

This patch is fairly tricky because our CLUSTER/VACUUM FULL behavior
does not do writes through the shared buffer cache, as outlined in this
C comment block:

 * We can't use the normal heap_insert function to insert into the new
 * heap, because heap_insert overwrites the visibility information.
 * We use a special-purpose raw_heap_insert function instead, which
 * is optimized for bulk inserting a lot of tuples, knowing that we have
 * exclusive access to the heap.  raw_heap_insert builds new pages in
 * local storage.  When a page is full, or at the end of the process,
 * we insert it to WAL as a single record and then write it to disk
 * directly through smgr.  Note, however, that any data sent to the new
 * heap's TOAST table will go through the normal bufmgr.

I originally tried to do this higher up in the stack but ran into
problems because I couldn't access the new heap page so I had to do it
at the non-shared-buffer page level.  I reused the lazy vacuum routines.

I need to know this is the right approach, and need to know what things
are wrong or missing.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [GENERAL] pg_upgrade ?deficiency