Re: Why could different data in a table be processed with different performance?

On Mon, Sep 24, 2018 at 03:28:15PM -0700, Vladimir Ryabtsev wrote:
> > it is not unusual to have 1GB cache or more...  and do not forget to
> > drop the cache between tests + do a sync
> I also reviewed the import scripts and found the import was done in
> DESCENDING order of IDs.

This seems significant... it means the heap was probably written in reverse
order relative to the IDs, and the OS readahead is ineffective when index
scanning across a range of IDs.  From memory, Linux since (at least) 2.6.32
can optimize this.  You mentioned you're using 4.4.  Does your LVM have
readahead ramped up?  Try lvchange -r 65536 data/postgres (or similar).
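
For example, something like this (untested; the device path and LV name
here are assumptions, adjust to your layout):

  # show current readahead, in 512-byte sectors
  blockdev --getra /dev/mapper/data-postgres
  # raise LVM readahead to 65536 sectors (32MB); stored in LVM metadata
  lvchange -r 65536 data/postgres
  # and, as suggested upthread, drop the page cache between test runs
  sync; echo 3 > /proc/sys/vm/drop_caches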

Also... these might be impractical solutions for several reasons, but did
you try either 1) forcing a bitmap scan (of only one index), to force the
heap reads to be ordered, if not sequential?  SET enable_indexscan=off
(and maybe SET enable_seqscan=off and others as needed).

Or, 2) using a BRIN index (scanning of which always results in a bitmap
heap scan).
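
To experiment with (1) in a session, something like this; the column name
article_id is a guess, substitute your actual pkey:

  SET enable_indexscan = off;
  SET enable_seqscan = off;  -- only if the planner then picks a seq scan
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT * FROM articles WHERE article_id BETWEEN 1000000 AND 1100000;

And a sketch of (2); BRIN indexes are tiny and cheap to build:

  CREATE INDEX articles_id_brin ON articles USING brin (article_id);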

> > - how big is the table?
> pg_table_size('articles') = 427 GB
> pg_table_size('pg_toast.pg_toast_221558') = 359 GB

Ouch... if it were me, I would definitely want to make that a partitioned
table.  Or perhaps two tables unioned together with a view, one each for
the sparse and dense ranges?
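
A rough sketch of the view approach (all names made up, and the boundary
is wherever your sparse range starts):

  CREATE VIEW articles_all AS
  SELECT * FROM articles_dense
  UNION ALL
  SELECT * FROM articles_sparse;

Or, with declarative partitioning (PG10+):

  CREATE TABLE articles_part (LIKE articles) PARTITION BY RANGE (article_id);
  CREATE TABLE articles_dense PARTITION OF articles_part
    FOR VALUES FROM (1) TO (500000000);  -- one partition per range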

> > You can make use of pg_buffercache in order to see what is actually
> > cached.
> It seems that there is no such view in my DB; could it be that the
> module is not installed?

Right, it's in the postgresql-contrib package.
And you have to "CREATE EXTENSION pg_buffercache".
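
Once that's done, something like the example query from the docs shows
what's in shared_buffers:

  CREATE EXTENSION pg_buffercache;
  SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
  WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                              WHERE datname = current_database()))
  GROUP BY c.relname
  ORDER BY 2 DESC
  LIMIT 10;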

> > - As Laurenz suggested (VACUUM FULL), you might want to move data
> > around.  You can also try a dump + restore to narrow the problem down
> > to data or disk.
> I launched VACUUM FULL, but it ran very slowly; according to my
> calculation it might take 17 hours.  I will try to copy the data into
> another table with the same structure, or spin up another server, and
> let you know.

I *suspect* VACUUM FULL won't help, since (AIUI) it copies all "visible"
tuples from the source table into a new table (and updates indices as
necessary).  It can resolve bloat due to historic DELETEs, but since I
think your table was written in reverse order of pkey, I think it'll also
copy it in reverse order.  CLUSTER will fix that.  You can use pg_repack
to do it online... but it's going to be painful for a table + toast ~1TB
in size: it'll take all day, and also require an additional ~1TB while
running (same as VACUUM FULL).
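
For reference, both variants (the index and column names are guesses):

  -- offline: takes ACCESS EXCLUSIVE for the duration
  CLUSTER articles USING articles_pkey;

  -- online: pg_repack still needs roughly 2x the table's space temporarily
  pg_repack -d yourdb --table articles --order-by article_id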

Justin

