Re: Select count(*) on a 2B Rows Tables Takes ~20 Hours

Поиск

Список

Период

Сортировка

От	Schneider, Jeremy
Тема	Re: Select count(*) on a 2B Rows Tables Takes ~20 Hours
Дата	19 сентября 2018 г. 07:25:46
Msg-id	8E4AC460-ED4A-4AB4-B19E-F8CB5E94B04F@amazon.com обсуждение исходный текст
Ответ на	Re: Select count(*) on a 2B Rows Tables Takes ~20 Hours (Laurenz Albe <laurenz.albe@cybertec.at>)
Список	pgsql-performance

Дерево обсуждения

Aside from I/O going to a different kind of storage, I don't think anything Aurora-specific should be at play here.

Would the 118 million buffer accesses (hits+reads) only include the index scan, or would that number also reflect
buffersaccessed for the 500 million heap fetches? 

While Aurora doesn't have a filesystem cache (since it's a different kind of storage), it does default the buffer_cache
to75% to offset this. It appears that as Laurenz has pointed out, this is simply a lot of I/O requests in a serial
process. 

BTW that's 900GB of data that was read (118 million buffers of 8k each) - on a box with only 61GB of memory available
forcaching. 

-Jeremy

Sent from my TI-83

> On Sep 17, 2018, at 12:04 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> Fred Habash wrote:
>> If I'm reading this correctly, it took 57M ms out of an elapsed time of 61M ms to read 45M pages from the
filesystem?
>> If the average service time per sarr is < 5 ms, Is this a case of bloated index where re-indexing is warranted?
>>
>> explain (analyze,buffers,timing,verbose,costs)
>> select count(*) from jim.pitations ;
>>                                                                                 QUERY PLAN
                                                      
>>
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Aggregate  (cost=72893810.73..72893810.74 rows=1 width=8) (actual time=61141110.437..61141110.437 rows=1 loops=1)
>>   Output: count(*)
>>   Buffers: shared hit=72620045 read=45297330
>>   I/O Timings: read=57489958.088
>>   ->  Index Only Scan using pit_indx_fk03 on jim.pitations  (cost=0.58..67227187.37 rows=2266649344 width=0) (actual
time=42.327..60950272.189rows=2269623575 loops=1) 
>>         Output: vsr_number
>>         Heap Fetches: 499950392
>>         Buffers: shared hit=72620045 read=45297330
>>         I/O Timings: read=57489958.088
>> Planning time: 14.014 ms
>> Execution time: 61,141,110.516 ms
>> (11 rows)
>
> 2269623575 / (45297330 + 72620045) ~ 20, so you have an average 20
> items per block.  That is few, and the index seems indeed bloated.
>
> Looking at the read times, you average out at about 1 ms per block
> read from I/O, but with that many blocks that's of course still a long time.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

В списке pgsql-performance по дате отправления:

Предыдущее

От: Tom Lane
Дата: 19 сентября 2018 г., 04:21:14
Сообщение: Re: LEFT JOIN LATERAL optimisation at plan time

Следующее

От: "Sam R."
Дата: 19 сентября 2018 г., 14:30:49
Сообщение: How to see/calculate size of index in memory?

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Select count(*) on a 2B Rows Tables Takes ~20 Hours

Предыдущее

Следующее