Re: [HACKERS] pgsql 10: hash indexes testing

From: Amit Kapila
Subject: Re: [HACKERS] pgsql 10: hash indexes testing
Msg-id: CAA4eK1+H3u6oMCgTeE2AZq7Xyc6WuGjKiqKwG3s9t_=4u5vM=w@mail.gmail.com
In reply to: Re: [HACKERS] pgsql 10: hash indexes testing  (AP <ap@zip.com.au>)
Responses: Re: [HACKERS] pgsql 10: hash indexes testing  (AP <ap@zip.com.au>)
List: pgsql-hackers
On Thu, Jul 6, 2017 at 2:40 AM, AP <ap@zip.com.au> wrote:
> On Wed, Jul 05, 2017 at 05:52:32PM +0530, Amit Kapila wrote:
>> >> >  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items |   free_percent
>> >> > ---------+--------------+----------------+--------------+--------------+------------+------------+------------------
>> >> >        3 |     10485760 |        2131192 |           66 |            0 | 2975444240 |          0 | 1065.19942179026
>> >> > (1 row)
> ...
>> >> > And I do appear to have an odd percentage of free space. :)
>>
>> Are you worried about "unused_pages"? If so, then this is not a major
>
> Nope. "free_percent". Just a bit weird that I have it at over 1000% free. :)
> Shouldn't that number be < 100?
>

Yes, there seems to be some gotcha in the free_percent calculation.  Is
it possible for you to debug it, or to share the test in some way?
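
As a quick sanity check while debugging (just a sketch built on the
pgstathashindex columns; the extension's exact internal formula may
differ), free_percent is meant to be free space as a percentage of the
total space across all index pages, so it should never exceed 100:

select version, bucket_pages, overflow_pages, unused_pages, free_percent,
       free_percent <= 100 as free_percent_sane  -- false points to the bug
from pgstathashindex('link_datum_id_idx');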

>> reason to worry, because these are probably freed overflow pages which
>> can be used in the future.  In a hash index, when we free overflow
>> pages, they are not returned to the OS; rather, they are tracked in
>> the index as unused pages that will be reused when required.
>
>> >> It looks like Vacuum hasn't been triggered.
>>
>> Vacuum won't be triggered on an insert-only load.  I think that is one
>> of the reasons why you might have got the error during your initial
>> copy.  We had some discussion in the past about triggering Vacuum on
>> insert-heavy workloads [1], but that patch still hasn't been committed.
>> I think if that patch, or some other form of it, gets committed, it
>> will help workloads like the one you are running here.
>
> Well, if this is the cause of my little issue, it might be nice. ATM
> my import script bombs out on errors (that I've duplicated! :) It took
> 11 hours but it bombed) and it sounds like I'll need to do a manual
> VACUUM before it can be run again.
>

Yeah, I think after a manual vacuum you should be able to proceed.
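
Something along these lines should do it (a sketch; I'm assuming the
indexed table is called "link", which is only a guess from the index
name link_datum_id_idx):

-- VERBOSE reports what each index cleanup pass did.
vacuum verbose link;

-- Freed overflow pages should now show up as unused_pages and be
-- reusable by subsequent inserts:
select overflow_pages, unused_pages
from pgstathashindex('link_datum_id_idx');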

> The stats you were looking for before are:
>
> # select * from  pgstathashindex('link_datum_id_idx');
>  version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items |   free_percent
> ---------+--------------+----------------+--------------+--------------+------------+------------+------------------
>        3 |      8559258 |        4194176 |          128 |      1926502 | 3591812743 |          0 | 942.873199357466
> (1 row)
>
> # select 4194176.0/128/8;
>        ?column?
> -----------------------
>  4095.8750000000000000
> (1 row)
>

From the above stats, it is clear that you have hit the maximum number
of overflow pages we can support today.  Now, one can argue that we
should raise the limit on overflow pages in hash indexes, which we can
do, but I think you would just hit the same problem again after some
more time.  So at this stage there are two possibilities for you:
(a) run a manual Vacuum in between, or (b) create the index after the
bulk load.  In general, (b) is the better way to do bulk loading, as
sketched below.  Note that, here again, the free_percent seems to be
wrong.
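
To make that ceiling concrete (rough arithmetic, assuming the default
8kB block size and the compiled-in cap of 128 bitmap pages per hash
index, HASH_MAX_BITMAPS as of this writing): each bitmap page holds a
4096-byte bitmap, i.e. 32768 bits, with one bit per overflow page, so:

select 128 * 4096 * 8 as max_overflow_pages;  -- 4194304

Your 4194176 overflow pages sit just under that ceiling, which is also
what the 4194176.0/128/8 = 4095.875 calculation above shows: each
bitmap page's 4096 bytes are essentially fully set.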

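For (b), the usual shape is to load the data first and build the index
afterwards (a sketch only; the table name "link", column "datum_id",
and the file path are illustrative guesses):

drop index if exists link_datum_id_idx;
copy link from '/path/to/data.csv' with (format csv);
-- Building the hash index over the fully loaded table avoids the
-- repeated splits and freed-but-unreclaimed overflow pages that
-- accumulate during incremental inserts:
create index link_datum_id_idx on link using hash (datum_id);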

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


