Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: [HACKERS] [PROPOSAL] Temporal query processing with range types
Дата
Msg-id CAKFQuwY3zTTvYySUswWnLL_EL01vevdNfXLn6a0AeGqOS-_Pdw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Peter Moser <pitiz29a@gmail.com>)
Список pgsql-hackers
On Wed, Feb 15, 2017 at 12:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jan 24, 2017 at 4:32 AM, Peter Moser <pitiz29a@gmail.com> wrote:
>> Using common terms such as ALIGN and NORMALIZE for such a specific
>> functionality seems a bit wrong.
>
> Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be better
> options? We are also thankful for any suggestion or comments about the syntax.

So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
except apparently there's no join type and the optimizer can never
reorder these operations with each other or with other joins.  Is that
right?  The optimizer changes in this patch seem fairly minimal, so
I'm guessing it can't be doing anything very complex here.

+ * INPUT:
+ *             (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
+ *             where q can be any join qualifier, and r.ts, r.te, s.ts, and s.t
e
+ *             can be any column name.
+ *
+ * OUTPUT:
+ *             (
+ *             SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
+ *      FROM
+ *      (
+ *             SELECT *, row_id() OVER () rn FROM r
+ *      ) r
+ *      LEFT OUTER JOIN
+ *      s
+ *      ON q AND r.ts < s.te AND r.te > s.ts
+ *      ORDER BY rn, P1, P2
+ *      ) c

It's hard to see what's going on here.  What's ts?  What's te?  If you
used longer names for these things, it might be a bit more
self-documenting.

Just reasoning out loud here...​

ISTM ts and te are "temporal [range] start" and "temporal [range] end"​ (or probably just the common "timestamp start/end")

​From what I can see it is affecting an intersection of the two ranges and, furthermore, splitting the LEFT range into sub-ranges that match up with the sub-ranges found on the right side.  From the example above this seems like it should be acting on self-normalized ranges - but I may be missing something by evaluating this out of context.

r1 [1, 6] [ts, te] [time period start, time period end]
s1 [2, 3]
s2 [3, 4]
s3 [5, 7]

r LEFT JOIN s ON (r.ts < s.te AND r.te > s.ts)

r1[1, 6],s1[2, 3] => [max(r.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[2, 3]
r1[1, 6],s2[3, 4] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[3, 4]
r1[1, 6],s3[5, 7] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[5, 6]

Thus the intersection is [2,6] but since s1 has three ranges that begin between 2 and 6 (i.e., 2, 3, and 5) there are three output records that correspond to those sub-ranges.

The description in the OP basically distinguishes between NORMALIZE and ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two ranges - discarding the non-overlapping data - while NORMALIZE performs the alignment while also retaining the non-overlapping data.

The rest of the syntax seems to deal with selecting subsets of range records based upon attribute data.

David J.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: [HACKERS] operator_precedence_warning vs make installcheck
Следующее
От: "Karl O. Pinc"
Дата:
Сообщение: Re: [HACKERS] Patch to implement pg_current_logfile() function