Обсуждение: Selecting records with highest timestamp - for a join

Поиск
Список
Период
Сортировка

Selecting records with highest timestamp - for a join

От
Alexander Farber
Дата:
Good evening,

I have a question please on which kind of statement to use -

In a table I store user info coming from social networks (Facebook, Twitter, ...):

CREATE TABLE words_social (
        sid varchar(255) NOT NULL,

        social integer NOT NULL CHECK (0 <= social AND social <= 6),
        female integer NOT NULL CHECK (female = 0 OR female = 1),
        given  varchar(255) NOT NULL CHECK (given ~ '\S'),
        family varchar(255),
        photo  varchar(255) CHECK (photo ~* '^https?://...'),
        place  varchar(255),
        stamp  integer NOT NULL,  /* HOW TO USE THE LATEST stamp? */

        uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
        PRIMARY KEY(sid, social)
);

I.e. a user can have several records in the above table, but I always use the most recent one (the one with the highest "stamp") to display that user in my game.

Then I use a custom function to retrieve current games info for a particular user:

CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
        RETURNS TABLE (
                out_gid integer, 
                out_created integer,
                out_finished integer,
                out_player1 integer,
                out_player2 integer,
                out_played1 integer,
                out_played2 integer,
                out_score1 integer,
                out_score2 integer,
                out_hand1 text,
                out_hand2 text,
                out_letters varchar[15][15],
                out_values integer[15][15],
                out_bid integer,
                out_last_tiles jsonb,
                out_last_score integer
        ) AS
$func$
        SELECT 
                g.gid,
                EXTRACT(EPOCH FROM g.created)::int,
                EXTRACT(EPOCH FROM g.finished)::int,
                g.player1,
                g.player2, -- can be NULL
                EXTRACT(EPOCH FROM g.played1)::int,
                EXTRACT(EPOCH FROM g.played2)::int,
                g.score1,
                g.score2,
                ARRAY_TO_STRING(g.hand1, ''),
                REGEXP_REPLACE(ARRAY_TO_STRING(g.hand2, ''), '.', '?', 'g'),
                g.letters,
                g.values,
                g.bid,
                m.tiles,
                m.score
        FROM words_games g LEFT JOIN words_moves m USING(mid)
        WHERE g.player1 = in_uid
        AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 day') 
        UNION SELECT
                g.gid,
                EXTRACT(EPOCH FROM g.created)::int,
                EXTRACT(EPOCH FROM g.finished)::int,
                g.player2,
                g.player1, -- can not be NULL
                EXTRACT(EPOCH FROM g.played2)::int,
                EXTRACT(EPOCH FROM g.played1)::int,
                g.score2,
                g.score1,
                ARRAY_TO_STRING(g.hand2, ''),
                REGEXP_REPLACE(ARRAY_TO_STRING(g.hand1, ''), '.', '?', 'g'),
                g.letters,
                g.values,
                g.bid,
                m.tiles,
                m.score
        FROM words_games g LEFT JOIN words_moves m USING(mid)
        WHERE g.player2 = in_uid
        AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 day');

$func$ LANGUAGE sql;

I would like to extend the above custom function, so that user info (given and last names, photo) is returned too.

How to approach this problem please, should I use CTE for this?

Thank you for any hints
Alex
 

Re: Selecting records with highest timestamp - for a join

От
Adrian Klaver
Дата:
On 10/19/2016 11:35 AM, Alexander Farber wrote:
> Good evening,
>
> I have a question please on which kind of statement to use -
>
> In a table I store user info coming from social networks (Facebook,
> Twitter, ...):
>
> CREATE TABLE words_social (
>         sid varchar(255) NOT NULL,
>
>         social integer NOT NULL CHECK (0 <= social AND social <= 6),
>         female integer NOT NULL CHECK (female = 0 OR female = 1),
>         given  varchar(255) NOT NULL CHECK (given ~ '\S'),
>         family varchar(255),
>         photo  varchar(255) CHECK (photo ~* '^https?://...'),
>         place  varchar(255),
>         stamp  integer NOT NULL,  /* HOW TO USE THE LATEST stamp? */
>
>         uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
>         PRIMARY KEY(sid, social)
> );
>
> I.e. a user can have several records in the above table, but I always
> use the most recent one (the one with the highest "stamp") to display
> that user in my game.
>
> Then I use a custom function to retrieve current games info for a
> particular user:
>
> CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
>         RETURNS TABLE (
>                 out_gid integer,
>                 out_created integer,
>                 out_finished integer,
>                 out_player1 integer,
>                 out_player2 integer,
>                 out_played1 integer,
>                 out_played2 integer,
>                 out_score1 integer,
>                 out_score2 integer,
>                 out_hand1 text,
>                 out_hand2 text,
>                 out_letters varchar[15][15],
>                 out_values integer[15][15],
>                 out_bid integer,
>                 out_last_tiles jsonb,
>                 out_last_score integer
>         ) AS
> $func$
>         SELECT
>                 g.gid,
>                 EXTRACT(EPOCH FROM g.created)::int,
>                 EXTRACT(EPOCH FROM g.finished)::int,
>                 g.player1,
>                 g.player2, -- can be NULL
>                 EXTRACT(EPOCH FROM g.played1)::int,
>                 EXTRACT(EPOCH FROM g.played2)::int,
>                 g.score1,
>                 g.score2,
>                 ARRAY_TO_STRING(g.hand1, ''),
>                 REGEXP_REPLACE(ARRAY_TO_STRING(g.hand2, ''), '.', '?', 'g'),
>                 g.letters,
>                 g.values,
>                 g.bid,
>                 m.tiles,
>                 m.score
>         FROM words_games g LEFT JOIN words_moves m USING(mid)
>         WHERE g.player1 = in_uid
>         AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
> INTERVAL '1 day')
>         UNION SELECT
>                 g.gid,
>                 EXTRACT(EPOCH FROM g.created)::int,
>                 EXTRACT(EPOCH FROM g.finished)::int,
>                 g.player2,
>                 g.player1, -- can not be NULL
>                 EXTRACT(EPOCH FROM g.played2)::int,
>                 EXTRACT(EPOCH FROM g.played1)::int,
>                 g.score2,
>                 g.score1,
>                 ARRAY_TO_STRING(g.hand2, ''),
>                 REGEXP_REPLACE(ARRAY_TO_STRING(g.hand1, ''), '.', '?', 'g'),
>                 g.letters,
>                 g.values,
>                 g.bid,
>                 m.tiles,
>                 m.score
>         FROM words_games g LEFT JOIN words_moves m USING(mid)
>         WHERE g.player2 = in_uid
>         AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
> INTERVAL '1 day');
>
> $func$ LANGUAGE sql;
>
> I would like to extend the above custom function, so that user info
> (given and last names, photo) is returned too.
>
> How to approach this problem please, should I use CTE for this?

For player1, player2 or both?

Since you are returning a table from words_get_games() you can
experiment by joining it's output to words_social.

>
> Thank you for any hints
> Alex
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Selecting records with highest timestamp - for a join

От
Alexander Farber
Дата:
Adrian, for both player1 and player2 (because I need to display player photos above the game board).

SQL join with words_social - yes, but how to take the most recent record from that table?

For example there are user infos from Google+, Facebook, Twitter - but the user has used Facebook to login lately and would expect her Facebook-photo to be seen (the record with the highest "stamp" value).

Regards
Alex


On Wed, Oct 19, 2016 at 8:51 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 10/19/2016 11:35 AM, Alexander Farber wrote:
In a table I store user info coming from social networks:
 
CREATE TABLE words_social (
        sid varchar(255) NOT NULL,

        social integer NOT NULL CHECK (0 <= social AND social <= 6),
        female integer NOT NULL CHECK (female = 0 OR female = 1),
        given  varchar(255) NOT NULL CHECK (given ~ '\S'),
        family varchar(255),
        photo  varchar(255) CHECK (photo ~* '^https?://...'),
        place  varchar(255),
        stamp  integer NOT NULL,  /* HOW TO USE THE LATEST stamp? */

        uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
        PRIMARY KEY(sid, social)
);

I.e. a user can have several records in the above table, but I always
use the most recent one (the one with the highest "stamp") to display
that user in my game.

Then I use a custom function to retrieve current games info for a
particular user:

CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
        RETURNS TABLE (
                out_gid integer,
                out_created integer,
                out_finished integer,
                out_player1 integer,
                out_player2 integer,
                out_played1 integer,
                out_played2 integer,
                out_score1 integer,
                out_score2 integer,
                out_hand1 text,
                out_hand2 text,
                out_letters varchar[15][15],
                out_values integer[15][15],
                out_bid integer,
                out_last_tiles jsonb,
                out_last_score integer
        ) AS
$func$
        SELECT
                g.gid,
                EXTRACT(EPOCH FROM g.created)::int,
                EXTRACT(EPOCH FROM g.finished)::int,
                g.player1,
                g.player2, -- can be NULL
                EXTRACT(EPOCH FROM g.played1)::int,
                EXTRACT(EPOCH FROM g.played2)::int,
                g.score1,
                g.score2,
                ARRAY_TO_STRING(g.hand1, ''),
                REGEXP_REPLACE(ARRAY_TO_STRING(g.hand2, ''), '.', '?', 'g'),
                g.letters,
                g.values,
                g.bid,
                m.tiles,
                m.score
        FROM words_games g LEFT JOIN words_moves m USING(mid)
        WHERE g.player1 = in_uid
        AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day')
        UNION SELECT
                g.gid,
                EXTRACT(EPOCH FROM g.created)::int,
                EXTRACT(EPOCH FROM g.finished)::int,
                g.player2,
                g.player1, -- can not be NULL
                EXTRACT(EPOCH FROM g.played2)::int,
                EXTRACT(EPOCH FROM g.played1)::int,
                g.score2,
                g.score1,
                ARRAY_TO_STRING(g.hand2, ''),
                REGEXP_REPLACE(ARRAY_TO_STRING(g.hand1, ''), '.', '?', 'g'),
                g.letters,
                g.values,
                g.bid,
                m.tiles,
                m.score
        FROM words_games g LEFT JOIN words_moves m USING(mid)
        WHERE g.player2 = in_uid
        AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day');

$func$ LANGUAGE sql;

I would like to extend the above custom function, so that user info
(given and last names, photo) is returned too.

How to approach this problem please, should I use CTE for this?

For player1, player2 or both?

Since you are returning a table from words_get_games() you can experiment by joining it's output to words_social.

Re: Selecting records with highest timestamp - for a join

От
Adrian Klaver
Дата:
On 10/19/2016 12:44 PM, Alexander Farber wrote:
> Adrian, for both player1 and player2 (because I need to display player
> photos above the game board).
>
> SQL join with words_social - yes, but how to take the most recent record
> from that table?
>
> For example there are user infos from Google+, Facebook, Twitter - but
> the user has used Facebook to login lately and would expect her
> Facebook-photo to be seen (the record with the highest "stamp" value).

I have not dug into your function deep enough to understand all the relationships
involved so I cannot offer anything specific. A generic method:

test[5432]=# create table ts_stamp_test(id serial PRIMARY KEY, uid integer,  stamp integer NOT NULL);
                                     
CREATE TABLE
                                     
test[5432]=# insert into ts_stamp_test (uid, stamp) values (1, 5), (2, 10), (1, 12), (2, 15), (1, 18), (2, 30);
INSERT 0 6
                                     
test[5432]=# select * from ts_stamp_test;
                                     
 id | uid | stamp
                                     
----+-----+-------
                                     
  1 |   1 |     5
                                     
  2 |   2 |    10
                                     
  3 |   1 |    12
                                     
  4 |   2 |    15
                                     
  5 |   1 |    18
                                     
  6 |   2 |    30
                                     
(6 rows)
                                     

                                     

test[5432]=# select * from ts_stamp_test where uid = 1 order by stamp desc limit 1;
 id | uid | stamp
----+-----+-------
  5 |   1 |    18
(1 row)


>
> Regards
> Alex
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Selecting records with highest timestamp - for a join

От
"David G. Johnston"
Дата:
On Wed, Oct 19, 2016 at 11:35 AM, Alexander Farber <alexander.farber@gmail.com> wrote:
I.e. a user can have several records in the above table, but I always use the most recent one (the one with the highest "stamp") to display that user in my game.

​And if the second most recent has a picture but the most recent one does not?  Do you want to accept the missing value because its on a more recent record or do you want to take the most recent non-missing value?

​Assuming "most recent not missing" and given:

PRIMARY KEY(sid, social)

​You basically want:

SELECT s_id, first_nonnull(photo ORDER BY stamp DESC, social)
FROM ...
GROUP BY s_id

You need to write a custom first_nonnull function that ignores NULL and a custom aggregate to go along with it.  Examples abound on the Internet.

Note that the Window function first_value doesn't quite do this...you want to constrain the result to be non-null unless all candidate values are null (or there are none).

If you have a unique index on (sid, stamp) you could solve the alternative problem with a simple (sid, max(timestamptz) join back against the social table.

David J.

Re: Selecting records with highest timestamp - for a join

От
Alexander Farber
Дата:
Please let me rephrase my question so that it is better understandable -

In PostgreSQL 9.5.3 I keep player infos from various social networks:

# TABLE words_social;
  sid  | social | female |  given  | family | photo | place |   stamp    | uid
-------+--------+--------+---------+--------+-------+-------+------------+-----
 aaaaa |      1 |      0 | Abcde1  |        |       |       | 1470237061 |   1
 aaaaa |      2 |      0 | Abcde2  |        |       |       | 1477053188 |   1
 aaaaa |      3 |      0 | Abcde3  |        |       |       | 1477053330 |   1
 kkkkk |      3 |      0 | Klmnop3 |        |       |       | 1477053810 |   2
 kkkkk |      4 |      0 | Klmnop4 |        |       |       | 1477053857 |   2
 ggggg |      2 |      0 | Ghijk2  |        |       |       | 1477053456 |   3
 ggggg |      3 |      0 | Ghijk3  |        |       |       | 1477053645 |   3
 ggggg |      4 |      0 | Ghijk4  |        |       |       | 1477053670 |   3
 xxxxx |      4 |      0 | Xyzok   |        |       |       | 1470237393 |   4
(9 rows)

The 1,2,3,4 in column "social" means "Facebook", "Twitter", etc.

For a player I can always select her "most recent" info by:

# select * from words_social s1 WHERE stamp = (SELECT max(stamp) FROM words_social s2 WHERE s1.uid = s2.uid);
  sid  | social | female |  given  | family | photo | place |   stamp    | uid
-------+--------+--------+---------+--------+-------+-------+------------+-----
 aaaaa |      3 |      0 | Abcde3  |        |       |       | 1477053330 |   1
 kkkkk |      4 |      0 | Klmnop4 |        |       |       | 1477053857 |   2
 ggggg |      4 |      0 | Ghijk4  |        |       |       | 1477053670 |   3
 xxxxx |      4 |      0 | Xyzok   |        |       |       | 1470237393 |   4
(4 rows)

Then there is another table storing current games (I have omitted some columns with game data below):

# select gid, created, finished, player1, player2 from words_games;
 gid |            created            | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
   1 | 2016-10-21 14:51:12.624507+02 |          |       4 |       1
   2 | 2016-10-21 14:51:22.631507+02 |          |       3 |
(2 rows)

Whenever a user (for example with uid=1) connects to the server, I send her the games she is taking part in:

# select gid, created, finished, player1, player2 from words_games where player1=1
   union select gid, created, finished, player2, player1 from words_games where player2=1;
 gid |            created            | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
   1 | 2016-10-21 14:51:12.624507+02 |          |       4 |       1
(1 row)

My problem: to the above UNION SELECT statement I need to add user infos from words_social table.

(So that I can display user photos and names above the game board)

So I try this with CTE:

# with user_infos AS (select * from words_social s1 WHERE stamp = (SELECT max(stamp) FROM words_social s2 WHERE s1.uid = s2.uid))
 select          g.gid, g.created, g.finished, g.player1, g.player2, i.given from words_games g join user_infos i on (g.player1=i.uid) where g.player1=1
 union select g.gid, g.created, g.finished, g.player2, g.player1, i.given from words_games g join user_infos i on (g.player2=i.uid) where g.player2=1;
 gid |            created            | finished | player1 | player2 | given
-----+-------------------------------+----------+---------+---------+--------
   1 | 2016-10-21 14:51:12.624507+02 |          |       1 |       4 | Abcde3
(1 row)

This works well (I have advanced since me first asking few days ago), but I still have the following problem -

I am worried that the CTE-table user_infos will get very large, once my game has many players.

How to rewrite my query, so that I fetch games and users (player1, player2) for a certain user id (uid) - without making huge intermediate tables?

Thank you
Alex

Re: Selecting records with highest timestamp - for a join

От
Alexander Farber
Дата:
Should I use LATERAL JOIN?

On Fri, Oct 21, 2016 at 3:24 PM, Alexander Farber <alexander.farber@gmail.com> wrote:

In PostgreSQL 9.5.3 I keep player infos from various social networks:

# TABLE words_social;
  sid  | social | female |  given  | family | photo | place |   stamp    | uid
-------+--------+--------+---------+--------+-------+-------+------------+-----
 aaaaa |      1 |      0 | Abcde1  |        |       |       | 1470237061 |   1
 aaaaa |      2 |      0 | Abcde2  |        |       |       | 1477053188 |   1
 aaaaa |      3 |      0 | Abcde3  |        |       |       | 1477053330 |   1
 kkkkk |      3 |      0 | Klmnop3 |        |       |       | 1477053810 |   2
 kkkkk |      4 |      0 | Klmnop4 |        |       |       | 1477053857 |   2
 ggggg |      2 |      0 | Ghijk2  |        |       |       | 1477053456 |   3
 ggggg |      3 |      0 | Ghijk3  |        |       |       | 1477053645 |   3
 ggggg |      4 |      0 | Ghijk4  |        |       |       | 1477053670 |   3
 xxxxx |      4 |      0 | Xyzok   |        |       |       | 1470237393 |   4
(9 rows)

The 1,2,3,4 in column "social" means "Facebook", "Twitter", etc.

For a player I can always select her "most recent" info by:

# select * from words_social s1 WHERE stamp = (SELECT max(stamp) FROM words_social s2 WHERE s1.uid = s2.uid);
  sid  | social | female |  given  | family | photo | place |   stamp    | uid
-------+--------+--------+---------+--------+-------+-------+------------+-----
 aaaaa |      3 |      0 | Abcde3  |        |       |       | 1477053330 |   1
 kkkkk |      4 |      0 | Klmnop4 |        |       |       | 1477053857 |   2
 ggggg |      4 |      0 | Ghijk4  |        |       |       | 1477053670 |   3
 xxxxx |      4 |      0 | Xyzok   |        |       |       | 1470237393 |   4
(4 rows)

Then there is another table storing current games (I have omitted some columns with game data below):

# select gid, created, finished, player1, player2 from words_games;
 gid |            created            | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
   1 | 2016-10-21 14:51:12.624507+02 |          |       4 |       1
   2 | 2016-10-21 14:51:22.631507+02 |          |       3 |
(2 rows)

Whenever a user (for example with uid=1) connects to the server, I send her the games she is taking part in:

# select gid, created, finished, player1, player2 from words_games where player1=1
   union select gid, created, finished, player2, player1 from words_games where player2=1;
 gid |            created            | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
   1 | 2016-10-21 14:51:12.624507+02 |          |       4 |       1
(1 row)

My problem: to the above UNION SELECT statement I need to add user infos from words_social table.

(So that I can display user photos and names above the game board)

So I try this with CTE:

# with user_infos AS (select * from words_social s1 WHERE stamp = (SELECT max(stamp) FROM words_social s2 WHERE s1.uid = s2.uid))
 select          g.gid, g.created, g.finished, g.player1, g.player2, i.given from words_games g join user_infos i on (g.player1=i.uid) where g.player1=1
 union select g.gid, g.created, g.finished, g.player2, g.player1, i.given from words_games g join user_infos i on (g.player2=i.uid) where g.player2=1;
 gid |            created            | finished | player1 | player2 | given
-----+-------------------------------+----------+---------+---------+--------
   1 | 2016-10-21 14:51:12.624507+02 |          |       1 |       4 | Abcde3
(1 row)

This works well (I have advanced since me first asking few days ago), but I still have the following problem -

I am worried that the CTE-table user_infos will get very large, once my game has many players.