Re: Fast insertion indexes: why no developments

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: Fast insertion indexes: why no developments
Дата
Msg-id CA+U5nMKba+bCem+uRXm4CQShZoVyS26r_fq7WSWnTtfQk5LYjg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fast insertion indexes: why no developments  (Leonardo Francalanci <m_lists@yahoo.it>)
Ответы Re: Fast insertion indexes: why no developments  (Leonardo Francalanci <m_lists@yahoo.it>)
Re: Fast insertion indexes: why no developments  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-hackers
On 13 November 2013 09:31, Leonardo Francalanci <m_lists@yahoo.it> wrote:

> Or, in other words: what are you going to write in the minmax index
> documentation, "try and see if they work better for you"?

That seems like good advice to me. Bacon > Aristotle.

There is very little written about suitability of any index type for
Postgres. I'm sure the docs could be improved there.


> Plus, I'm *very* interested in the minmax index

Good, thanks.

> But, at the same time, I don't see any evidence that they work
> better than btrees (except for the size of the index).

Min max indexes are not designed to be a better btree, they are
designed to be roughly same as automatic partitioning. They offer
considerably improved time to build, significantly reduced index size
and significantly improved insert performance. Min max will be clearly
slower than btrees for small numbers of records, though for large
numbers of records we may expect min max to perform same as btrees,
though that requires better testing to get a more accurate picture.
There may yet be optimisations of the patch also.

Based what we have discussed here, we've come up with two new
optimisations that can be used with MinMax, namely the bulk DELETE
case and the Merge Append case. Thank you for highlighting additional
cases and requirements.

From our discussions here, IMHO there is a strong case for avoiding
btrees completely for larger historical data tables. That isn't
something I had even considered as desirable before this conversation
but ISTM now that taking that approach will be more fruitful than
attempting to implement LSM trees.

Having said that, I am also in favour of declarative partitioning,
just that there is no funding available to work on that at present.

Further work on bitmap indexes is expected. They are already designed
with good insert performance in mind and this discussion re-emphasises
that requirement.

> I would like to see some numbers.

Alvaro has given me some results for his patch. The figures I have are
for a 2GB table.

Index Build Time
MinMax 11 s
Btree   96s

Index Size
MinMax 2 pages + metapage
Btree approx 200,000 pages + metapage

Load time
MinMax no overhead, same as raw COPY
BTree - considerably slower

Index SELECT
Slower for small groups of rows
Broadly same for large requests (more results required on that assessment)

I expect to publish results against TPC-H cases in next few weeks.

Additional tests are welcome for other use cases.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: The number of character limitation of custom script on pgbench
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation