Re: Modeling Friendship Relationships

Поиск
Список
Период
Сортировка
От Bill Moran
Тема Re: Modeling Friendship Relationships
Дата
Msg-id 20141111201127.d80b6bc4c2a6970a6b941dec@potentialtech.com
обсуждение исходный текст
Ответ на Modeling Friendship Relationships  (Robert DiFalco <robert.difalco@gmail.com>)
Ответы Re: Modeling Friendship Relationships  (John R Pierce <pierce@hogranch.com>)
Список pgsql-general
On Tue, 11 Nov 2014 14:38:16 -0800
Robert DiFalco <robert.difalco@gmail.com> wrote:

> I have a question about modeling a mutual relationship. It seems basic but
> I can't decide, maybe it is 6 of one a half dozen of the other.
>
> In my system any user might be friends with another user, that means they
> have a reciprocal friend relationship.
>
> It seems I have two choices for modeling it.
>
> 1. I have a table with two columns userOne and userTwo. If John is friends
> with Jane there will be one row for both of them.
> 2. I have a table with two columns owner and friend. If John is friends
> with Jane there will be two rows, one that is {John, Jane} and another
> {Jane, John}.
>
> The first option has the advantage of saving table size. But queries are
> more complex because to get John's friends I have to JOIN friends f ON
>  f.userA = "John" OR f.userB = "John" (not the real query, these would be
> id's but you get the idea).
>
> In the second option the table rows would be 2x but the queries would be
> simpler -- JOIN friends f ON f.owner = "John".
>
> There could be >1M users. Each user would have <200 friends.
>
> Thoughts? Do I just choose one or is there a clear winner? TIA!

I recommend a single row per relationship, because your estimates suggest that
the size might be big enough to be worth optimizing on the basis of size.

As far as optimizing queries and what not, I did this recently, and here's what
worked well for me.

Take this example table definition:

CREATE TABLE friendship (
 person1 INT NOT NULL,
 person2 INT NOT NULL,
 PRIMARY KEY (person1, person2),
 CHECK (person1 < person2)
);
CREATE INDEX friendship_person2 ON friendship(person2);

The check constraint guarantees the data will always be stored in a certain
order, which allows you to optimize many queries (i.e., when figuring out
wheter person 57 and person 86 are friends, the where clause is simplified
becuase you know that person1 can't be 86).

If you'll need to do queries of the "list all person 57's friends" variety,
then the queries are still pretty simple, but you could create a stored
procedure to make them even easier on the part of application developers.
It's basically "WHERE person1 = 57 or person2 = 57" which will be able to
use the indexes to provide quick results.  Or something more like:

SELECT person1 AS friend FROM friendship WHERE person2 = 57
UNION
SELECT person2 AS friend FROM friendship WHERE person1 = 57;

A view should work very well:
CREATE VIEW friendship_view AS
SELECT person1 AS person, person2 AS friend FROM friendship
UNION
SELECT person2 AS person, person1 AS friend FORM friendship;

That should be a very performant view when a WHERE clause on person is
specified.

Those types of queries weren't a requirement in the implementation I did,
as the code only ever asked "is person x a friend of person y" and never
wanted the entire list.

--
Bill Moran
I need your help to succeed:
http://gamesbybill.com


В списке pgsql-general по дате отправления:

Предыдущее
От: David G Johnston
Дата:
Сообщение: Re: Autovacuum on partitioned tables in version 9.1
Следующее
От: John R Pierce
Дата:
Сообщение: Re: Modeling Friendship Relationships