Re: Graph datatype addition

From Jim Nasby
Subject Re: Graph datatype addition
Date
Msg-id 517ED023.7080001@nasby.net
In response to Re: Graph datatype addition  (Florian Pflug <fgp@phlo.org>)
Responses Re: Graph datatype addition  (Atri Sharma <atri.jiit@gmail.com>)
List pgsql-hackers
On 4/29/13 2:20 PM, Florian Pflug wrote:
> On Apr29, 2013, at 21:00 , Atri Sharma <atri.jiit@gmail.com> wrote:
>> I think we find work arounds or make shifts at the moment if we need
>> to use graphs in our database in postgres. If we have a datatype
>> itself, with support for commonly used operations built inside the
>> type itself, that will greatly simplify user's tasks, and open up a
>> whole new avenue of applications for us, such as recommender systems,
>> social network analysis, or anything that can be done with graphs.
>
> Usually though, you'd be interested in large graphs which include
> information for lots of records (e.g., nodes are individual users,
> or products, or whatever). A graph datatype is not well suited for
> that, because it'd store each graph as a single value, and updating
> the graph would mean rewriting that whole value. If you're e.g. doing
> social network analysis, and each new edge between two users requires
> you to pull the whole graph from disk, update it, and write it back,
> you'll probably hit problems once you reach a few hundred users or
> so… Which really isn't a lot for that kind of application.
>
> I'd love to see more support for those kinds of queries in postgres,
> (although WITH RECURSIVE already was a *huge* improvement in this
> area!). But storing each graph the way a graph type would isn't the
> way forward, IMHO.

My $0.02:

I believe it would be best to largely separate the questions of storage and access. Partly because of Florian's concern
that you'd frequently want only one representation of the whole graph, but also because the actual storage interface
does NOT have to be user friendly if we have a good access layer. In particular, if rows had a low overhead, we'd
probably just store graphs that way. That's obviously not the case in PG, so is there some kind of hybrid approach we
could use? Perhaps sections of a graph could be stored with one piece of MVCC overhead per section?

That's why I think separating access from storage is going to be very important; if we do that up-front, we can change
the storage later as we get real experience with this.
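
To make the separation concrete, here is a minimal sketch in Python. It is only an illustration, not a proposal for how this would look inside PG. All names (GraphStore, EdgePerRowStore, SectionStore, reachable, section_size) are hypothetical. One backend keeps one "row" per edge; the other groups edges into sections, each standing in for a single stored value that would carry one piece of per-row overhead. The traversal is written only against the interface, so the storage could be swapped later without touching it.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections import defaultdict, deque


class GraphStore(ABC):
    """Hypothetical storage interface; it does not need to be user friendly."""

    @abstractmethod
    def add_edge(self, src: int, dst: int) -> None: ...

    @abstractmethod
    def neighbors(self, node: int) -> set[int]: ...


class EdgePerRowStore(GraphStore):
    """One 'row' per edge, as in a plain edge table."""

    def __init__(self) -> None:
        self.rows: list[tuple[int, int]] = []

    def add_edge(self, src: int, dst: int) -> None:
        self.rows.append((src, dst))

    def neighbors(self, node: int) -> set[int]:
        return {d for s, d in self.rows if s == node}


class SectionStore(GraphStore):
    """Edges grouped into sections keyed by src // section_size.

    Each section stands in for one stored value, so adding an edge
    rewrites only the section that holds its source node.
    """

    def __init__(self, section_size: int = 64) -> None:
        self.section_size = section_size
        self.sections: dict[int, dict[int, set[int]]] = defaultdict(dict)

    def add_edge(self, src: int, dst: int) -> None:
        section = self.sections[src // self.section_size]
        section.setdefault(src, set()).add(dst)

    def neighbors(self, node: int) -> set[int]:
        section = self.sections.get(node // self.section_size, {})
        return set(section.get(node, set()))


def reachable(store: GraphStore, start: int) -> set[int]:
    """Access-layer operation: breadth-first search using only the interface."""
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in store.neighbors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Calling reachable() with either backend loaded with the same edges gives the same set of nodes, which is the point: the choice between the two layouts is invisible to the caller.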

Second, we should consider how much the access layer should build on WITH RECURSIVE and the like. Being able to detect
specific use patterns of CTE/WITH RECURSIVE seems like it could add a lot of value; but I also worry that it's way too
magical to be practical.
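
For reference, the kind of WITH RECURSIVE pattern in question looks roughly like the reachability query below. This is a sketch under assumptions: the edges(src, dst) table and the function names are made up for illustration, and SQLite (through Python's sqlite3 module) is used only so the example is self-contained. The query in PostgreSQL would have essentially the same shape. UNION rather than UNION ALL discards rows already seen, which is what lets the recursion stop on a cyclic graph.

```python
import sqlite3

# Transitive-closure query: every node reachable from a starting node.
# Table and column names (edges, src, dst) are hypothetical.
REACHABLE_SQL = """
WITH RECURSIVE reach(node) AS (
    SELECT ?
    UNION
    SELECT e.dst
    FROM edges AS e
    JOIN reach AS r ON e.src = r.node
)
SELECT node FROM reach ORDER BY node
"""


def make_demo_db(edges):
    """Create an in-memory database holding one row per edge."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)
    return conn


def reachable_nodes(conn, start):
    """Return the sorted list of nodes reachable from start, including start."""
    return [row[0] for row in conn.execute(REACHABLE_SQL, (start,))]
```

An access layer could either generate queries of this shape itself or try to recognize them when users write them by hand; the second option is the "magical" one.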
--
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


