Re: [HACKERS] pgsql 10: hash indexes testing

Поиск
Список
Период
Сортировка
От AP
Тема Re: [HACKERS] pgsql 10: hash indexes testing
Дата
Msg-id 20170707025218.7jwlg4ajli46g62u@zip.com.au
обсуждение исходный текст
Ответ на Re: [HACKERS] pgsql 10: hash indexes testing  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] pgsql 10: hash indexes testing  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Thu, Jul 06, 2017 at 05:19:59PM +0530, Amit Kapila wrote:
> I think if you are under development, it is always advisable to create
> indexes after initial bulk load.  That way it will be faster and will
> take lesser space atleast in case of hash index.

This is a bit of a pickle, actually:
* if I do have a hash index I'll wind up with a bloated one at some stage that refused to allow more inserts until the
indexis re-created
 
* if I don't have an index then I'll wind up with a table where I cannot create a hash index because it has too many
rowsfor it to handle
 

I'm at a bit of a loss as to how to deal with this. The DB design does come
with a kind of partitioning where a bundle of tables get put off to the side
and searched seperately as needed but too many of those and the impact on
performance can be noticed so I need to minimise them.

> >> As mentioned above REINDEX might be a better option.  I think for such
> >> situation we should have some provision to allow squeeze functionality
> >> of hash exposed to the user, this will be less costly than REINDEX and
> >> might serve the purpose for the user.  Hey, can you try some hack in
> >
> > Assuming it does help, would this be something one would need to guess
> > at? "I did a whole bunch of concurrent INSERT heavy transactions so I
> > guess I should do a squeeze now"?
> >
> > Or could it be figured out programmatically?
> 
> I think one can refer free_percent and number of overflow pages to
> perform such a command.  It won't be 100% correct, but we can make a
> guess.  We can even check free space in overflow pages with page
> inspect to make it more accurate.

Does this take much time? Main reason I am asking is that this looks like
something that the db ought to handle underneath (say as part of an autovac
run) and so if there are stats that the index code can maintain that can
then be used by the autovac (or something) code to trigger a cleanup this
I think would be of benefit.

Unless I am being /so/ unusual that it's not worth it. :)

I'll reply to the rest in a separate stream as I'm still poking other
work related things atm so can't do the debug testing as yet.

AP



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: [HACKERS] Multi column range partition table
Следующее
От: Masahiko Sawada
Дата:
Сообщение: [HACKERS] Fix header comment of streamutil.c