Re: Performance on inserts

From: Bruce Momjian
Subject: Re: Performance on inserts
Msg-id: 200010152144.RAA15769@candle.pha.pa.us
In reply to: Re: Performance on inserts  (Jules Bean <jules@jellybean.co.uk>)
Responses: Re: Performance on inserts  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

> > 98304                        22.07    5545984
> > 196608                        45.60    11141120
> > 393216                        92.53    22290432
> > 
> > I tried probabilities from 0.67 to 0.999 and found that runtimes didn't
> > vary a whole lot (though this is near the minimum), while index size
> > consistently got larger as the probability of moving right decreased.
> > The runtime is nicely linear throughout the range.
> 
> That looks brilliant!! (Bearing in mind that I have over 10 million
> tuples in my table, you can imagine what performance was like for me!)
> Is there any chance you could generate a patch against released 7.0.2
> to add just this functionality... It would be the kiss of life for my
> code!
> 
> (Not in a hurry, I'm not back in work until Wednesday, as it happens)
> 
> And, of course, what would /really/ get my code going speedily would
> be the partial indices mentioned elsewhere in this thread.  If the
> backend could automagically drop keys containing > 10% (tunable) of
> the rows from the index, then my index would be (a) about 70% smaller!
> and (b) only used when it's faster. [This means it would have to
> update some simple histogram data.  However, I can't see that being
> much of an overhead]
> 
> For the short term, if I can get a working version of the above
> randomisation patch, I think I shall 'fake' a partial index by
> manually setting 'enable_seqscan=off' for all but the 4 or 5 most
> common categories. Those two factors combined will speed up my bulk
> inserts a lot.

What would be really nifty is to take the most common value found by
VACUUM ANALYZE, and cause sequential scans if that value represents more
than 50% of the entries in the table.
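
A rough sketch of that rule, just to make the idea concrete. This is not planner code: the function names and the threshold parameter are hypothetical, and the statistics here are computed from a plain Python list rather than read from whatever VACUUM ANALYZE stores.

```python
from collections import Counter


def most_common_stats(values):
    """Return (most common value, fraction of rows holding it).

    Stand-in for the statistic gathered by VACUUM ANALYZE (assumption:
    we only need the single most common value and its frequency).
    """
    if not values:
        raise ValueError("need at least one row to compute statistics")
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


def prefer_seqscan(lookup_value, most_common_value, most_common_fraction,
                   threshold=0.5):
    """True when the lookup targets the most common value and that value
    covers more than `threshold` of the table (hypothetical 50% default)."""
    return (lookup_value == most_common_value
            and most_common_fraction > threshold)


# Example: a column where one category dominates the table.
rows = ["a"] * 6 + ["b"] * 4
mcv, frac = most_common_stats(rows)
use_seqscan_for_a = prefer_seqscan("a", mcv, frac)   # common value
use_seqscan_for_b = prefer_seqscan("b", mcv, frac)   # rare value
```

With these inputs the dominant value would be read by sequential scan while the rare value would still go through the index.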

Added to TODO:

* Prevent index lookups (or index entries using partial index) on most common values; instead use sequential scan 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

