Обсуждение: visibility map - what do i miss?

Поиск
Список
Период
Сортировка

visibility map - what do i miss?

От
hubert depesz lubaczewski
Дата:
hi,
i tried to test new "visibility map" feature.

to do so i:
1. fetched postgresql sources from cvs
2. compiled
3. turned autovacuum off
4. started pg
5. ran this queries:
  - CREATE TABLE test_1 (i INT4);
  - CREATE TABLE test_2 (i INT4);
  - CREATE TABLE test_3 (i INT4);
  - CREATE TABLE test_4 (i INT4);
  - INSERT INTO test_1 SELECT generate_series(1, 100000000);
  - INSERT INTO test_2 SELECT generate_series(1, 100000000);
  - INSERT INTO test_3 SELECT generate_series(1, 100000000);
  - INSERT INTO test_4 SELECT generate_series(1, 100000000);
  - UPDATE test_2 SET i = i + 1 WHERE i < 10000000;
  - UPDATE test_3 SET i = i + 1 WHERE i < 50000000;
  - UPDATE test_4 SET i = i + 1 WHERE i < 90000000;
  - VACUUM test_1;
  - VACUUM test_2;
  - VACUUM test_3;
  - VACUUM test_4;

I did it 2 times, first with sources of pg from 1st of november, and
second - with head from yesterday evening (warsaw, poland time).

results puzzled me.

First run - without visibility maps, timing of vacuums:
Time: 267844.822 ms
Time: 138854.592 ms
Time: 305467.950 ms
Time: 487133.179 ms

Second run - on head:

Time: 252218.609 ms
Time: 234388.763 ms
Time: 334016.413 ms
Time: 575698.750 ms

Now - as I understand the change - visilibity maps should make second run much faster?

Tests were performed on laptop. During first test I used it to browse the web,
read mail. During second test - nobody used the laptop.

Relation forms seem to exist:
# select oid from pg_database where datname = 'depesz';
  oid
-------
 16389
(1 row)
# select relfilenode from pg_class where relname ~ 'test_';
 relfilenode
-------------
       26756
       26759
       26762
       26765
(4 rows)

=> ls -l {26756,26759,26762,26765}*
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:31 26756
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:33 26756.1
-rw------- 1 pgdba pgdba 1065066496 2008-12-06 01:34 26756.2
-rw------- 1 pgdba pgdba     811008 2008-12-06 01:34 26756_fsm
-rw------- 1 pgdba pgdba      57344 2008-12-06 01:34 26756_vm
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:35 26759
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:36 26759.1
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:37 26759.2
-rw------- 1 pgdba pgdba  312582144 2008-12-06 01:39 26759.3
-rw------- 1 pgdba pgdba     892928 2008-12-06 01:39 26759_fsm
-rw------- 1 pgdba pgdba      57344 2008-12-06 01:39 26759_vm
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:39 26762
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:49 26762.1
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:41 26762.2
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:42 26762.3
-rw------- 1 pgdba pgdba  523862016 2008-12-06 01:43 26762.4
-rw------- 1 pgdba pgdba    1204224 2008-12-06 01:43 26762_fsm
-rw------- 1 pgdba pgdba      81920 2008-12-06 01:53 26762_vm
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 02:01 26765
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 02:08 26765.1
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 02:18 26765.2
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:50 26765.3
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:51 26765.4
-rw------- 1 pgdba pgdba  735141888 2008-12-06 02:00 26765.5
-rw------- 1 pgdba pgdba    1523712 2008-12-06 02:00 26765_fsm
-rw------- 1 pgdba pgdba      98304 2008-12-06 02:18 26765_vm

What do I miss?

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: visibility map - what do i miss?

От
"Merlin Moncure"
Дата:
On Sat, Dec 6, 2008 at 6:46 AM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> hi,
> i tried to test new "visibility map" feature.

here's the test again in a more illustrative way:
postgres=# INSERT INTO test_1 SELECT generate_series(1, 100000000);
INSERT 0 100000000
Time: 136229.455 ms
postgres=# VACUUM test_1;
VACUUM
Time: 40643.705 ms  <-- setting hint bits
postgres=# VACUUM test_1;
VACUUM
Time: 6112.946 ms  <-- fast now!
postgres=# VACUUM test_1;
VACUUM
Time: 5906.454 ms <-- just to be sure!!
postgres=# update test_1 set i = i where i = 99999999;
UPDATE 1
Time: 10201.296 ms
postgres=# VACUUM test_1;
VACUUM
Time: 5896.648 ms  <-- still fast
postgres=# UPDATE test_1 SET i = i + 1 WHERE i < 90000000;
UPDATE 89999999
Time: 352955.281 ms  <--uggh!
postgres=# VACUUM test_1;
VACUUM
Time: 200082.556 ms <-- not bad
postgres=# VACUUM test_1;
VACUUM
Time: 17313.576 ms  <-- faster now!
postgres=# UPDATE test_1 SET i = i + 1 WHERE i < 10000000;
UPDATE 9999998
Time: 55188.942 ms
postgres=# VACUUM test_1;
VACUUM
Time: 21353.182 ms < -- fast!


So what do we gather from this?  Well, the feature works as
advertised.  I think that as long as your updates are not uniformly
distributed across pages, vismap is a huge performance win for many
workloads.  I think the benefit will increase as the feature is
tweaked in future versions.  vacuum times are one of the things that
make dealing with large tables difficult, and force us to use
partitioning (which is, frankly, a hack).

Why are new pages initialized dirty?  Do inserts on pages set the dirty bit?

merlin

Re: visibility map - what do i miss?

От
"Merlin Moncure"
Дата:
On Sat, Dec 6, 2008 at 8:38 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> So what do we gather from this?  Well, the feature works as
> advertised.  I think that as long as your updates are not uniformly
> distributed across pages, vismap is a huge performance win for many
> workloads.  I think the benefit will increase as the feature is
> tweaked in future versions.  vacuum times are one of the things that
> make dealing with large tables difficult, and force us to use
> partitioning (which is, frankly, a hack).
>
> Why are new pages initialized dirty?  Do inserts on pages set the dirty bit?

dumb question...there is no guarantee the transaction will be committed.

merlin