Re: Indirect indexes

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: Indirect indexes
Дата
Msg-id CAPpHfduZ_E5fPnUVQsB11yHMjE9wd4r=_j_Ct4FNcbya4tZU7g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Indirect indexes  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Ответы Re: Indirect indexes  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
On Wed, Oct 19, 2016 at 12:21 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Tue, Oct 18, 2016 at 9:28 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Vacuuming presents an additional challenge: in order to remove index
items from an indirect index, it's critical to scan the PK index first
and collect the PK values that are being removed.  Then scan the
indirect index and remove any items that match the PK items removed.
This is a bit problematic because of the additional memory needed to
store the array of PK values.  I haven't implemented this yet.

Imagine another situation: PK column was not updated, but indirect indexed column was updated.
Thus, for single heap tuple we would have single PK tuple and two indirect index tuples (correct me if I'm wrong).
How are we going to delete old indirect index tuple?

Let me explain it in more details.

There is a table with two columns and indirect index on it.

CREATE TABLE tbl (id integer primary key, val integer);
CREAET INDIRECT INDEX tbl_val_indirect_idx ON tbl (val);

Then do insert and update.

INSERT INTO tbl VALUES (1, 1);
UPDATE tbl SET val = 2 WHERE id = 1;

Then heap would contain two tuples.

 ctid  | id | val
-------+----+-----
 (0;1) |  1 |   1
 (0;2) |  1 |   2

tbl_pk_idx would contain another two tuples

 id | item_pointer
----+--------------
  1 |        (0;1)
  1 |        (0;2)

And tbl_val_indirect_idx would have also two tuples

 val | id
-----+----
   1 |  1
   2 |  1

Then vacuum removes (0;1) from heap, reference to (0;1) from tbl_pk_idx.  But how will it remove (1,1) tuple from tbl_val_indirect_idx?  Thus, before vacuuming tbl_val_indirect_idx we should know not only values of id which are being removed, but actually (id, val) pairs which are being removed.  Should we collect those paris while scanning heap?  But we should also take into account that multiple heap tuples might have same (id, val) pair values (assuming there could be other columns being updated).  Therefore, we should take into account when last pair of particular (id, val) pair value was deleted from heap.  That would be very huge change to vacuum, may be even writing way more complex vacuum algorithm from scratch.  Probably, you see the better solution of this problem.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: Remove vacuum_defer_cleanup_age
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Mention column name in error messages