Re: Relation extension scalability
From: Dilip Kumar
Subject: Re: Relation extension scalability
Date:
Msg-id: CAFiTN-uGtS0FbjL-URmRoYYMaMz2DWDkU-rcm3ibRRbHD5+yJQ@mail.gmail.com
In reply to: Re: Relation extension scalability (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Relation extension scalability (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Fri, Feb 5, 2016 at 4:50 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Could you also measure how this behaves for an INSERT instead of a COPY
> workload?
I think such a test will be useful.
I have measured the performance with INSERT to see the behavior when no buffer access strategy is used. I tested multiple variants: small tuples, big tuples, data that fits in shared buffers, and data that doesn't.
Observation:
------------------
Apart from these tests, I also used a tool that captures many stack traces with a delay between samples.
1. With the base code (when the data doesn't fit in shared buffers), I observed that almost all stack traces are waiting in "LockRelationForExtension", and many in "FlushBuffer" as well (flushing dirty buffers).
Total stacks captured: 204, FlushBuffer: 13, LockRelationForExtension: 187
(This test ran with 8 clients (shared_buffers = 512MB), capturing a stack every 5 seconds.)
2. If I change shared_buffers to 48GB, FlushBuffer obviously disappears, but LockRelationForExtension still shows up in a very high number of samples.
3. The performance of the base code, whether the data fits in shared buffers or not, remains very low and non-scaling (as the results below show).
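For reference, the sampling "tool" mentioned above can be approximated with a small shell loop; this is only a hypothetical reconstruction (the mail doesn't say which tool was used), assuming gdb is installed and PGPID is set to a live backend PID:

```shell
#!/bin/sh
# Hypothetical stack sampler: every $INTERVAL seconds, dump the stack of
# backend $PGPID with gdb, then count how many samples sit in a function.
OUT=${OUT:-$(mktemp -d)}     # where the per-sample backtraces go
INTERVAL=${INTERVAL:-5}      # 5 seconds between samples, as in the test
SAMPLES=${SAMPLES:-204}      # 204 stacks were captured in the test run

# Attach gdb repeatedly; PGPID must be set to a live backend PID.
sample_stacks() {
    i=0
    while [ "$i" -lt "$SAMPLES" ]; do
        gdb -p "$PGPID" -batch -ex bt > "$OUT/stack.$i.bt" 2>/dev/null
        i=$((i + 1))
        sleep "$INTERVAL"
    done
}

# count_waits SYMBOL: number of samples whose backtrace contains SYMBOL.
count_waits() {
    grep -l "$1" "$OUT"/*.bt 2>/dev/null | wc -l
}

# Usage against a live backend:
#   PGPID=<backend pid> sample_stacks
#   count_waits LockRelationForExtension
#   count_waits FlushBuffer
```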
Test 1 (big record INSERT; data fits in shared buffers)
------------------------------------------------------------
setup
--------
./psql -d postgres -c "create table test_data(a int, b text)"
./psql -d postgres -c "insert into test_data values(generate_series(1,1000),repeat('x', 1024))"
./psql -d postgres -c "create table data (a int) with(extend_by_blocks=$variable)" {create table data (a int) for base code}
echo "insert into data select * from test_data;" >> copy_script
test:
-----
shared_buffers=48GB max_wal_size=20GB checkpoint_timeout=10min
./pgbench -c $ -j $ -f copy_script -T 120 postgres
client   base   extend_by_blocks=50   extend_by_blocks=1000
     1    113                   115                     118
     4     50                   220                     216
     8     43                   202                     302
(values are pgbench TPS)
Test 2 (big record INSERT; data doesn't fit in shared buffers)
------------------------------------------------------------------
setup:
-------
./psql -d postgres -c "create table test_data(a int, b text)"
./psql -d postgres -c "insert into test_data values(generate_series(1,1000),repeat('x', 1024))"
./psql -d postgres -c "create table data (a int) with(extend_by_blocks=1000)"
echo "insert into data select * from test_data;" >> copy_script
test:
------
shared_buffers=512MB max_wal_size=20GB checkpoint_timeout=10min
./pgbench -c $ -j $ -f copy_script -T 120 postgres
client   base   extend_by_blocks=1000
     1    125                     125
     4     49                     236
     8     41                     294
    16     39                     279
(values are pgbench TPS)
Test 3 (small record INSERT; data fits in shared buffers)
------------------------------------------------------------------
setup:
--------
./psql -d postgres -c "create table test_data(a int)"
./psql -d postgres -c "insert into test_data values(generate_series(1,10000))"
./psql -d postgres -c "create table data (a int) with(extend_by_blocks=20)"
echo "insert into data select * from test_data;" >> copy_script
test:
-----
shared_buffers=48GB max_wal_size=20GB checkpoint_timeout=10min
./pgbench -c $ -j $ -f copy_script -T 120 postgres
client   base   patch (extend_by_blocks=20)
     1    137                           143
     2    269                           250
     4    377                           443
     8    170                           690
    16    145                           745
(values are pgbench TPS)
* All tests were done with data on MD and WAL on SSD.
Note: The last patch has a maximum limit of extend_by_blocks=100, so to measure performance with extend_by_blocks=1000 I changed it locally.
I will send the modified patch once we finalize which approach to proceed with.
> I'm doubtful that anything that does the victim buffer search while
> holding the extension lock will actually scale in a wide range of
> scenarios.

> I think the problem for victim buffer could be visible if the blocks
> are dirty and it needs to write the dirty buffer, and especially as
> the patch is written where after acquiring the extension lock, it again
> tries to extend the relation without checking if it can get a page with
> space from FSM. It seems to me that we should re-check the
> availability of a page, because while one backend is waiting on the
> extension lock, another backend might have added pages. To re-check the
> availability we might want to use something similar to
> LWLockAcquireOrWait() semantics as used during WAL writing.
I will work on this in the next version...
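For what it's worth, the re-check idea quoted above could look roughly like the following. This is only C-style pseudocode of the control flow, not the actual patch: the names follow PostgreSQL's freespace/lmgr APIs, but extend_relation() is a hypothetical stand-in for the real extension code, and error handling is omitted.

```c
/* Sketch only: re-check the FSM after waiting for the extension lock,
 * so a backend that slept while another backend added pages can reuse
 * one of those pages instead of extending the relation again. */
Buffer
get_page_for_insert(Relation rel, Size len)
{
    BlockNumber blk = GetPageWithFreeSpace(rel, len);

    if (blk != InvalidBlockNumber)
        return ReadBuffer(rel, blk);        /* fast path: FSM had a page */

    if (!ConditionalLockRelationForExtension(rel, ExclusiveLock))
    {
        /* Someone else is extending; wait for the lock, then re-check
         * the FSM (analogous to LWLockAcquireOrWait() during WAL flush). */
        LockRelationForExtension(rel, ExclusiveLock);
        blk = GetPageWithFreeSpace(rel, len);
        if (blk != InvalidBlockNumber)
        {
            UnlockRelationForExtension(rel, ExclusiveLock);
            return ReadBuffer(rel, blk);    /* reuse a freshly added page */
        }
    }

    /* Still nothing usable: extend (possibly by extend_by_blocks pages). */
    blk = extend_relation(rel);             /* hypothetical helper */
    UnlockRelationForExtension(rel, ExclusiveLock);
    return ReadBuffer(rel, blk);
}
```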