High-Concurrency GiST in postgreSQL

From	C. Mundi
Subject	High-Concurrency GiST in postgreSQL
Date	December 5, 2011, 17:31:23
Msg-id	CAPvS8WZNQ8ysY=hyij5EJscrZZyG9V7uZqAxsQfu8tWDpQNvBg@mail.gmail.com
Replies	Re: High-Concurrency GiST in postgreSQL Re: High-Concurrency GiST in postgreSQL
List	pgsql-general

Hello. This is my first post. As such, feedback on style and choice of venue is especially welcome.

I am a regular, though not especially expert, user of a variety of databases, including PostgreSQL.
I have only modest experience with spatial databases.

I have a new project [1] in which GiST could be very useful, provided I can achieve high concurrency. I have some empirical evidence that R* would be a good starting point, so after reading "High-Concurrency Locking in R-Trees" [2], I went looking for an implementation of R-link trees extended to R*. I was therefore very interested to read what Hellerstein et al. wrote [3]:

High concurrency, recoverability, and degree-3 consistency are critical factors in a full-fledged database system. We are considering extending the results of Kornacker and Banks for R-trees [KB95] to our implementation of GiSTs.

Since this information may be somewhat dated, and GiST has obviously come a long way in PostgreSQL, I am looking for current information and advice on the state of GiST concurrency in PostgreSQL. If someone has already implemented an R*-link tree, that could really help me. (I can wish, no?)

Thanks for reading, and thanks for any advice or pointers.

Carlos

[1] It's not a GIS project, but it has some similarities:
(a) I need to manage anywhere from as few as 1,000 to as many as 10 million three-dimensional "boxes."
(b) The distributions of sizes, aspect ratios and locations in R^3 are all unknown a priori and may change during execution under insert/delete.
(c) Queries may arrive asynchronously and at high rate from hundreds (or more?) of compute nodes.
(d) Successive queries from any node, viewed as a time-sequence, may have very low (or at best sporadic) spatial correlation -- lots of page jumps.
(e) R* will be advantageous over R, but Priority R is probably not especially useful since turnover may be greater than 20% during a "job."
(f) I would like to avoid the complications of distributed databases, again because of the high turnover.
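
As a rough illustration of the workload described in (a) through (d), here is a minimal Python sketch of the axis-aligned overlap test that a box query relies on, with a linear scan standing in for the index. It is not part of any PostgreSQL API: the names `Box3D` and `window_query` are hypothetical, and it assumes closed boxes, so boxes that merely touch count as overlapping.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned box in R^3 (hypothetical type, for illustration only)."""
    lo: Point3  # minimum corner (x, y, z)
    hi: Point3  # maximum corner (x, y, z)

    def overlaps(self, other: "Box3D") -> bool:
        # Two closed axis-aligned boxes overlap iff their intervals
        # overlap on every one of the three axes.
        return all(
            self.lo[i] <= other.hi[i] and other.lo[i] <= self.hi[i]
            for i in range(3)
        )


def window_query(boxes: Iterable[Box3D], window: Box3D) -> List[Box3D]:
    # Linear scan: every box is tested for every query. A spatial index
    # such as an R*-tree exists to prune most of these tests.
    return [b for b in boxes if b.overlaps(window)]
```

With up to 10 million boxes and queries arriving from hundreds of nodes, a scan like this is only a baseline for comparison; the open question in this post is how well an index that replaces it behaves under concurrent inserts and deletes.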

[2] Marcel Kornacker and Douglas Banks. High-Concurrency Locking in R-Trees. (1995)

[3] Hellerstein, Naughton, and Pfeffer. Generalized Search Trees for Database Systems. (1995)
