Обсуждение: How to optimize SELECT query with multiple CASE statements?

Поиск
Список
Период
Сортировка

How to optimize SELECT query with multiple CASE statements?

От
Alexander Farber
Дата:
Good afternoon,

is it please posible to optimize the following SQL query with numerous CASE statements (on same condition!) without switching to PL/pgSQL?

    SELECT
        g.gid,
        EXTRACT(EPOCH FROM g.created)::int,
        EXTRACT(EPOCH FROM g.finished)::int,
        g.letters,
        g.values,
        g.bid,
        m.tiles,
        m.score,
        /* HOW TO OPTIMIZE THE FOLLOWING CASE STATEMENTS? */
        CASE WHEN g.player1 = in_uid THEN g.player1 ELSE g.player2 END,
        CASE WHEN g.player1 = in_uid THEN g.player2 ELSE g.player1 END,
        EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played1 ELSE g.played2 END)::int,
        EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played2 ELSE g.played1 END)::int,
        CASE WHEN g.player1 = in_uid THEN g.score1 ELSE g.score2 END,
        CASE WHEN g.player1 = in_uid THEN g.score2 ELSE g.score1 END,
        ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand1 ELSE g.hand2 END, ''),
        REGEXP_REPLACE(ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand2 ELSE g.hand1 END, ''), '.', '?', 'g'),
        CASE WHEN g.player1 = in_uid THEN s1.female ELSE s2.female END,
        CASE WHEN g.player1 = in_uid THEN s2.female ELSE s1.female END,
        CASE WHEN g.player1 = in_uid THEN s1.given ELSE s2.given END,
        CASE WHEN g.player1 = in_uid THEN s2.given ELSE s1.given END,
        CASE WHEN g.player1 = in_uid THEN s1.photo ELSE s2.photo END,
        CASE WHEN g.player1 = in_uid THEN s2.photo ELSE s1.photo END,
        CASE WHEN g.player1 = in_uid THEN s1.place ELSE s2.place END,
        CASE WHEN g.player1 = in_uid THEN s2.place ELSE s1.place END
    FROM words_games g
        LEFT JOIN words_moves m ON m.gid = g.gid
                -- find move record with the most recent timestamp
        AND NOT EXISTS (SELECT 1
            FROM words_moves m2 WHERE m2.gid = m.gid
            AND m2.played > m.played)
    LEFT JOIN words_social s1 ON s1.uid = g.player1
                -- find social record with the most recent timestamp
        AND NOT EXISTS (SELECT 1
            FROM words_social s WHERE s1.uid = s.uid
            AND s.stamp > s1.stamp)
    LEFT JOIN words_social s2 ON s2.uid = g.player2
                -- find social record with the most recent timestamp
        AND NOT EXISTS (SELECT 1
            FROM words_social s WHERE s2.uid = s.uid
            AND s.stamp > s2.stamp)
    WHERE in_uid IN (g.player1, g.player2)
    AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 day');

It is a two-player, PostgreSQL-based game and in the statement above I am using the CASE-statements to ensure that always player1, given1, score1 columns are returned for the player in question.


Thank you
Alex

Re: How to optimize SELECT query with multiple CASE statements?

От
Adrian Klaver
Дата:
On 10/31/2016 05:53 AM, Alexander Farber wrote:
> Good afternoon,
>
> is it please posible to optimize the following SQL query with numerous
> CASE statements (on same condition!) without switching to PL/pgSQL?

Offered with the following caveats, one I do not fully understand what
is going on below and two it may not optimize anything but might help
with the first caveat:

If you used plpgsql you could use IF conditions to organize the
assignment of values for player1 and player2 and eliminate some of the
duplication of effort shown in the CASE statements.

>
>     SELECT
>         g.gid,
>         EXTRACT(EPOCH FROM g.created)::int,
>         EXTRACT(EPOCH FROM g.finished)::int,
>         g.letters,
>         g.values,
>         g.bid,
>         m.tiles,
>         m.score,
>         /* HOW TO OPTIMIZE THE FOLLOWING CASE STATEMENTS? */
>         CASE WHEN g.player1 = in_uid THEN g.player1 ELSE g.player2 END,
>         CASE WHEN g.player1 = in_uid THEN g.player2 ELSE g.player1 END,
>         EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played1
> ELSE g.played2 END)::int,
>         EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played2
> ELSE g.played1 END)::int,
>         CASE WHEN g.player1 = in_uid THEN g.score1 ELSE g.score2 END,
>         CASE WHEN g.player1 = in_uid THEN g.score2 ELSE g.score1 END,
>         ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN g.hand1 ELSE
> g.hand2 END, ''),
>         REGEXP_REPLACE(ARRAY_TO_STRING(CASE WHEN g.player1 = in_uid THEN
> g.hand2 ELSE g.hand1 END, ''), '.', '?', 'g'),
>         CASE WHEN g.player1 = in_uid THEN s1.female ELSE s2.female END,
>         CASE WHEN g.player1 = in_uid THEN s2.female ELSE s1.female END,
>         CASE WHEN g.player1 = in_uid THEN s1.given ELSE s2.given END,
>         CASE WHEN g.player1 = in_uid THEN s2.given ELSE s1.given END,
>         CASE WHEN g.player1 = in_uid THEN s1.photo ELSE s2.photo END,
>         CASE WHEN g.player1 = in_uid THEN s2.photo ELSE s1.photo END,
>         CASE WHEN g.player1 = in_uid THEN s1.place ELSE s2.place END,
>         CASE WHEN g.player1 = in_uid THEN s2.place ELSE s1.place END
>     FROM words_games g
>         LEFT JOIN words_moves m ON m.gid = g.gid
>                 -- find move record with the most recent timestamp
>         AND NOT EXISTS (SELECT 1
>             FROM words_moves m2 WHERE m2.gid = m.gid
>             AND m2.played > m.played)
>     LEFT JOIN words_social s1 ON s1.uid = g.player1
>                 -- find social record with the most recent timestamp
>         AND NOT EXISTS (SELECT 1
>             FROM words_social s WHERE s1.uid = s.uid
>             AND s.stamp > s1.stamp)
>     LEFT JOIN words_social s2 ON s2.uid = g.player2
>                 -- find social record with the most recent timestamp
>         AND NOT EXISTS (SELECT 1
>             FROM words_social s WHERE s2.uid = s.uid
>             AND s.stamp > s2.stamp)
>     WHERE in_uid IN (g.player1, g.player2)
>     AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL
> '1 day');
>
> It is a two-player, PostgreSQL-based game and in the statement above I
> am using the CASE-statements to ensure that always player1, given1,
> score1 columns are returned for the player in question.
>
> Here is a bit more context:
> http://stackoverflow.com/questions/40342426/how-to-optimize-select-query-with-multiple-case-statements
>
> Thank you
> Alex


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: How to optimize SELECT query with multiple CASE statements?

От
Geoff Winkless
Дата:
On 31 October 2016 at 12:53, Alexander Farber
<alexander.farber@gmail.com> wrote:
>
> Good afternoon,
>
> is it please posible to optimize the following SQL query with numerous CASE statements (on same condition!) without
switchingto PL/pgSQL? 


You could break the game table apart into game and gameplayer.

That's more "normal" and fits much more nicely, IMO, and you could
then resolve the CASE by using joins between game and (twice)
gameplayer:

SELECT ...
FROM game INNER JOIN gameplayer AS myplayer ON
game.gameid=myplayer.gameid AND myplayer.uid=in_uid
INNER JOIN gameplayer AS otherplayer ON game.gameid=otherplayer.gameid
AND otherplayer.uid!=in_uid
...

Then all the other tables simply join to myplayer and otherplayer.

If you have to stick with the designed schema, you could probably do
something with arrays, but I doubt if that would get any less messy
and certainly no more readable.

FWIW you can resolve half of the CASEs by resolving it in the join to
s1 and s2 - so

LEFT JOIN words_social s1 ON s1.uid = in_uid
LEFT JOIN words_social s2 ON CASE WHEN g.player1 = in_uid THEN
g.player2 ELSE g.player1

etc

Geoff


Re: How to optimize SELECT query with multiple CASE statements?

От
Geoff Winkless
Дата:
On 31 October 2016 at 15:21, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
> LEFT JOIN words_social s1 ON s1.uid = in_uid
> LEFT JOIN words_social s2 ON CASE WHEN g.player1 = in_uid THEN
> g.player2 ELSE g.player1

Ugh.

Of course I meant

LEFT JOIN words_social s1 ON s1.uid = in_uid
LEFT JOIN words_social s2 ON s2.uid = CASE WHEN g.player1 = in_uid
THEN g.player2 ELSE g.player1 END

But I expect you know that...

Geoff


Re: How to optimize SELECT query with multiple CASE statements?

От
Alexander Farber
Дата:
Hi Geoff,

On Mon, Oct 31, 2016 at 4:21 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

You could break the game table apart into game and gameplayer.

That's more "normal" and fits much more nicely, IMO, and you could
then resolve the CASE by using joins between game and (twice)
gameplayer:

SELECT ...
FROM game INNER JOIN gameplayer AS myplayer ON
game.gameid=myplayer.gameid AND myplayer.uid=in_uid
INNER JOIN gameplayer AS otherplayer ON game.gameid=otherplayer.gameid
AND otherplayer.uid!=in_uid
...

Then all the other tables simply join to myplayer and otherplayer.


do you mean, instead of having player1, player2 columns in the words_games table (as in my current schema https://gist.github.com/afarber/c40b9fc5447335db7d24 ) - I should move the player stuff (uid, hand, score) to a separate table and then JOIN them?

Regards
Alex

Re: How to optimize SELECT query with multiple CASE statements?

От
"David G. Johnston"
Дата:
On Mon, Oct 31, 2016 at 5:53 AM, Alexander Farber <alexander.farber@gmail.com> wrote:
Good afternoon,

is it please posible to optimize the following SQL query with numerous CASE statements (on same condition!) without switching to PL/pgSQL?

    SELECT
        g.gid,
        EXTRACT(EPOCH FROM g.created)::int,
        EXTRACT(EPOCH FROM g.finished)::int,
        g.letters,
        g.values,
        g.bid,
        m.tiles,
        m.score,
        /* HOW TO OPTIMIZE THE FOLLOWING CASE STATEMENTS? */
        CASE WHEN g.player1 = in_uid THEN g.player1 ELSE g.player2 END,
        CASE WHEN g.player1 = in_uid THEN g.player2 ELSE g.player1 END,
        EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played1 ELSE g.played2 END)::int,
        EXTRACT(EPOCH FROM CASE WHEN g.player1 = in_uid THEN g.played2 ELSE g.played1 END)::int,
​[...]

When seeing the above repetition I consider implementing a composite type and passing that around in the main portion of the queries and then (composite_type).* at the presentation layer.

As Geoff​
 
​indicated normalization makes this a bit easier​; but you can still normalize "on-the-fly" via standalone composite types.

David J.

Re: How to optimize SELECT query with multiple CASE statements?

От
Geoff Winkless
Дата:
On 31 October 2016 at 15:46, Alexander Farber
<alexander.farber@gmail.com> wrote:
> do you mean, instead of having player1, player2 columns in the words_games
> table (as in my current schema
> https://gist.github.com/afarber/c40b9fc5447335db7d24 ) - I should move the
> player stuff (uid, hand, score) to a separate table and then JOIN them?

I do - it's something that my databases lecturer would almost
certainly have insisted upon, although I'm not sure it's worth it,
especially since (as I said in the last paragraph of my email), you
can remove all of the CASEs except the hand/score ones by just JOINing
the other tables via a CASE anyway.

Geoff


Re: How to optimize SELECT query with multiple CASE statements?

От
Alexander Farber
Дата:
Ah, thanks - I've got that with JOINing via CASE now...

On Mon, Oct 31, 2016 at 5:50 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
especially since (as I said in the last paragraph of my email), you
can remove all of the CASEs except the hand/score ones by just JOINing
the other tables via a CASE anyway.