Обсуждение: Bulk inserts and usage_count

Поиск
Список
Период
Сортировка

Bulk inserts and usage_count

От
Heikki Linnakangas
Дата:
While testing the buffer ring patch, I noticed that bulk inserts with 
both INSERT and COPY pin and unpin the buffer they insert to for every 
tuple. That means that the usage_count of all those buffers are bumped 
up to 5. That's gotta be bad if you try to run a COPY concurrently with 
other activity. It also affects tables like TPC-C order_line where a 
tuples are always inserted and updated in groups.

To demonstrate:

postgres=# truncate foo; TRUNCATE TABLE
postgres=# COPY foo FROM '/tmp/foo.data'; COPY 1000
postgres=# SELECT c.relname, bufferid, relblocknumber, isdirty, 
usagecount FROM pg_buffercache bc, pg_class c WHERE c.relfilenode = 
bc.relfilenode and c.relname='foo'; relname | bufferid | relblocknumber | isdirty | usagecount
---------+----------+----------------+---------+------------ foo     |   105078 |              4 | f       |          5
foo    |   105079 |              3 | f       |          5 foo     |   105080 |              2 | f       |          5
foo    |   105081 |              1 | f       |          5 foo     |   105082 |              0 | f       |          5
 
(5 rows)

A fix for COPY will fall naturally out of the buffer ring patch, but not 
for INSERT.

A more general fix would be to somehow keep the last insertion page 
pinned across calls to heap_insert.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Bulk inserts and usage_count

От
"Jim C. Nasby"
Дата:
On Tue, May 15, 2007 at 04:37:28PM +0100, Heikki Linnakangas wrote:
> While testing the buffer ring patch, I noticed that bulk inserts with 
> both INSERT and COPY pin and unpin the buffer they insert to for every 
> tuple. That means that the usage_count of all those buffers are bumped 
<snip> 
> A fix for COPY will fall naturally out of the buffer ring patch, but not 
> for INSERT.
> 
> A more general fix would be to somehow keep the last insertion page 
> pinned across calls to heap_insert.

ISTR discussion in the past about having things like COPY and INSERT
INTO ... SELECT building entire pages in one shot once they exhaust the
FSM. Not only would it address this issue, but it would probably improve
performance in many ways (less locking and unlocking, ability to
pre-sort before inserting into indexes, fewer calls to FSM, probably a
bunch of other things).
-- 
Jim Nasby                                      decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)