Re: SQL Property Graph Queries (SQL/PGQ)

Поиск
Список
Период
Сортировка
От Huanbing Lu
Тема Re: SQL Property Graph Queries (SQL/PGQ)
Дата
Msg-id SA1P221MB099446DA1937F02EB96C6E29D580A@SA1P221MB0994.NAMP221.PROD.OUTLOOK.COM
обсуждение исходный текст
Ответ на Re: SQL Property Graph Queries (SQL/PGQ)  (Henson Choi <assam258@gmail.com>)
Ответы Re: SQL Property Graph Queries (SQL/PGQ)
Список pgsql-hackers
Hi hackers,

Thank you for sharing this insightful discussion. I'm a new developer 
interested in this topic. The design of VLE is particularly fascinating 
to me because of the potential performance impact it could bring.

Regarding the challenges of the custom executor approach (Option 2) that 
you both highlighted, especially the integration with the PlanState 
stack and the need for push/pop semantics, I have been trying to 
understand it better.

 From my reading of the executor code, ExprContext and the 
TupleTableSlot lifecycle are managed per tuple. For a DFS backtracking 
model, would we need to introduce a new concept like a “TraversalState” 
that persists across the recursive hops, perhaps housed within the 
VlePathScan's PlanState? Or is there an existing mechanism (like the 
SubPlan state for parameterized subqueries) that we could adapt to 
manage the stack of visited edges and vertices?

Looking forward to learning more from the community discussion.

Best regards,

Huanbing


在 2026/1/9 21:08, Henson Choi 写道:
> Hi Junwang,
>
> Thanks for the feedback!
>
> 2026년 1월 9일 (금) PM 7:51, Junwang Zhao <zhjwpku@gmail.com>님이 작성:
>
>> On Mon, Jan 5, 2026 at 9:56 PM Henson Choi <assam258@gmail.com> wrote:
>>> Hi hackers,
>>>
>>> I think it may be a bit early for this, but I wanted to share this design
>>> document now so we can revisit and discuss when the time is right.
>>>
>>> ===================================================================
>>> Variable Length Edge (VLE) Implementation Strategies for PostgreSQL
>>> ===================================================================
>>>
>>> 1. Overview
>>> -----------
>>>
>>> This document describes two implementation strategies for Variable
>>> Length Edge (VLE) traversal in PostgreSQL-based graph databases.
>>>
>>> VLE Query Example:
>>>
>>>      MATCH (a)-[*1..5]->(b) RETURN a, b
>>>      MATCH p = (a)-[:knows*]->(b) RETURN p
>>>
>>> Two implementation approaches:
>>>
>>>      1. Unoptimized: Query rewriting to WITH RECURSIVE (CTE)
>>>      2. Optimized:   Custom executor node with DFS algorithm
>>>
>> Interesting ideas. I was thinking about expanding the edge pattern using
>> the
>> lower and upper but it seems it can not handle the * case.
>>
> Right. With a bounded quantifier like -[*1..3]->, we can unroll it into
> a fixed number of JOINs at compile time. But -[*]-> has no upper bound
> known at compile time, so unrolling isn't possible.
>
> That said, the traversal is still finite due to the edge uniqueness
> constraint (e.id <> ALL(p.edge_ids)) which enforces TRAIL semantics.
> This prevents traversing the same edge twice, not revisiting vertices.
> In a graph with E edges, the maximum path length is E. The challenge
> isn't infinity - it's that we need runtime iteration with dynamic
> termination.
>
>
>> ISTM these two approaches can both work when we need runtime
>> pruning, for example if we are going to support path mode.
>>
> Agreed. Both approaches can support path modes (WALK/TRAIL/ACYCLIC/SIMPLE).
> Edge predicates ([e* WHERE ...]) are applied at each hop, while path-level
> predicates (length, aggregates) require completion. The real difference is
> the execution model: CTE uses BFS storing all intermediate paths, whereas
> a custom executor enables DFS with backtracking and O(depth) memory.
>
>
>> The first option can be costly, since you have to pre calculate the
>> recursive
>> CTE, it can not utilize the predicate from previous hops. Can it work with
>> quantifier * case?
>>
> Yes, CTE can handle * - the edge uniqueness constraint (e.id <> ALL(...))
> ensures termination.
>
> As mentioned above, edge predicates ([e* WHERE ...]) go into the recursion,
> while path-level predicates (length, aggregates) stay outside. Since e* is
> a collection, scalar access like "WHERE e.weight > 50" outside the pattern
> would be invalid - edge filtering belongs inside [e* WHERE ...].
>
>
>> The second option seems more promising, but it needs to touch the planner
>> and executor. I guess you need a special node to represent the VLE in
>> the rewrite phase.
>>
> Right. Whether to introduce new syntax or hijack existing constructs
> (like SEARCH DEPTH FIRST) needs deeper thought.
>
> This won't be easy, but it's an unavoidable challenge. Adding a new plan
> node is one thing, but PlanState stack with push/pop could affect all
> executor nodes. I expect this will spark lively discussion among planner
> hackers.
>
> Best regards,
> Henson
>



В списке pgsql-hackers по дате отправления: