Re: SQL Property Graph Queries (SQL/PGQ)

Поиск
Список
Период
Сортировка
От Henson Choi
Тема Re: SQL Property Graph Queries (SQL/PGQ)
Дата
Msg-id CAAAe_zAEEAb=piH4n-mZUhqcL=oKbDv4v-_7C_7KyXroem=HUg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: SQL Property Graph Queries (SQL/PGQ)  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Список pgsql-hackers
Hi Ashutosh,

Thank you for reviewing the path pattern list implementation.
I've added comprehensive tests to address your concerns below.

2025년 12월 30일 (화) PM 6:50, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>님이 작성:
Hi Henson,

> 3. Multiple path patterns
>    - Syntax: MATCH (a)->(b), (b)->(c)
>    - Implements SQL/PGQ standard feature (was TODO)
>

This path pattern list feature is discussed in section 10.4 of the
standard. I think supporting this feature would be a good addition to
SQL/PGQ compliance. The current patchset is already large. I (and
AFAIK Peter as well) am focusing on getting that patchset in a
committable shape. Once committed, it will facilitate adding features
like path pattern list, quantified path patterns, label conjunction
etc. as incremental relatively smaller commits.

Agreed. Now is not the time to add this feature - the current patchset
should be stabilized first.
 
Since you could build one of the complex features without changing a
lot of earlier implementation, it validates that the basic
implementation is extensible and yet robust. Thanks for that
validation.

Thank you. That was indeed the intent.
 
The way you have implemented it, it can lead to a path pattern with
two vertex patterns next to each other without an interleaving edge
pattern. That would throw an error or assertion.

This does NOT throw an error. Single vertex patterns produce a
Cartesian product correctly (vl1 has 3 rows, vl3 has 6 rows = 18 rows):

    SELECT a_name, b_name FROM GRAPH_TABLE (g1
        MATCH (a IS vl1), (b IS vl3)
        COLUMNS (a.vname AS a_name, b.vname AS b_name)
    ) ORDER BY 1, 2;

     a_name | b_name
    --------+--------
     v11    | v21
     v11    | v22
     ...
    (18 rows)  -- 3 x 6 = 18, confirming cross join

 
If you try path
patterns without a common vertex, you will get that error.

Disconnected patterns (no shared variables) work correctly and produce
a Cartesian product (vl1->vl2 has 3 paths, vl1->vl3 has 5 paths = 15 rows):

    SELECT a_name, b_name, c_name, d_name FROM GRAPH_TABLE (g1
        MATCH (a IS vl1)-[]->(b IS vl2),
              (c IS vl1)-[]->(d IS vl3)
        COLUMNS (a.vname AS a_name, b.vname AS b_name,
                 c.vname AS c_name, d.vname AS d_name)
    ) ORDER BY 1, 2, 3, 4;

     a_name | b_name | c_name | d_name
    --------+--------+--------+--------
     v11    | v22    | v11    | v22
     v11    | v22    | v11    | v31
     v11    | v22    | v11    | v33
     ...
    (15 rows)  -- 3 x 5 = 15, confirming cross join

 
That is not
covered by your tests.

I have now added comprehensive tests:

Disconnected patterns (cross product):
  1. Two independent paths - 3 x 5 = 15 rows
  2. EXPLAIN verification - confirms Nested Loop (cross join)
  3. Single vertex patterns - 3 x 6 = 18 rows
  4. Three-way cross product - 3 x 3 x 6 = 54 rows
  5. Mixed connected + disconnected - 1 path x 3 vertices = 3 rows
  6. Filtered cross product - 1 x 6 = 6 rows

Connected patterns (join via shared variable):
  7. Star pattern: A->B, A->C, D->A with shared hub 'a' - 2 rows (NOT cross product)

    MATCH (a IS vl1)-[]->(b IS vl2),
          (a)-[]->(c IS vl3),
          (d IS vl2)-[]->(a)

     a_name | b_name | c_name | d_name
    --------+--------+--------+--------
     v12    | v21    | v21    | v21
     v13    | v23    | v23    | v23
    (2 rows)  -- all joined via 'a', NOT cross product


All row counts match expected behavior.
 
In such cases, the paths should end up being
full-joined rather than concatenated. I don't see it being implemented
that way.


The implementation DOES produce cross joins for disconnected patterns.
The EXPLAIN output confirms this:

    EXPLAIN (COSTS OFF) SELECT ... FROM GRAPH_TABLE (g1
        MATCH (a IS vl1)-[]->(b IS vl2),
              (c IS vl1)-[]->(d IS vl3)
        ...
    );
                                  QUERY PLAN
    -----------------------------------------------
     Append
       ->  Merge Join
             ...
             ->  Nested Loop           <-- Cross join here
                   ->  Index Scan ...
                   ->  Materialize
                         ->  Nested Loop
                               ...


The Nested Loop without join condition IS the cross join (Cartesian product).
 
I think more work is needed here.

All test cases pass. This work was intended to validate extensibility
of the current design, not to push for immediate inclusion at this stage.
 
[1] https://www.postgresql.org/message-id/CAExHW5sr+dJPCFw2OqXUfPPqPks8qmsivaAYhRiBdFANcX8RHw@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Best regards,
Henson 
Вложения

В списке pgsql-hackers по дате отправления: