Re: Proper relational database?

Поиск
Список
Период
Сортировка
От
Тема Re: Proper relational database?
Дата
Msg-id 008301d19e22$48109ce0$d831d6a0$@andl.org
обсуждение исходный текст
Ответ на Re: Proper relational database?  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-general
> owner@postgresql.org] On Behalf Of Thomas Munro

> So what incremental improvements could we steal from "properly relational"
> query languages?

Here is my list of deficiencies in some or all dialects of SQL.
-Relation and tuple as data types
-Globally exclude NULLs and duplicate columns
-Relation with no columns, null key
-Natural antijoin
-Tuple join operations
-Tuple tests for equality, superset, subset
-Tuple test for set membership of relation
-Relation tests for equality, superset, subset
-Relation set operations include symmetric difference
-User-defined functions of arbitrary complexity
-User-defined aggregation functions of arbitrary complexity
-User-defined ordered queries of arbitrary complexity
-Iterative/recursive queries (when)

> Here's one I've thought about, trivial as it may be.  I noticed that Tutorial
> D (and apparently Andl too) includes a [NOT] MATCHING operator (alternatively
> spelled SEMIJOIN and SEMIMINUS) corresponding to the ⋉ (semi-join) and ▷
> (anti-join) operators from relational algebra.  In SQL you write [NOT] EXISTS
> or [NOT] IN in the WHERE clause, rather than something explicit in a <joined
> table> clause, though experienced users often talk explicitly about semi- and
> anti-joins, both because of the theory and because the terms show up in query
> plans.

Yes, there are two joins (join and antijoin). Semijoin is just one of a number of projections following a join, but
antijoinis a quite different algorithm. 

Antijoin is quite hard to write in SQL in such a way that the query planner will do the right thing. There is a lot of
variationbetween dialects. 

> A recent blog post[1] argues that SQL should have a SQL92-style JOIN syntax
> for this and points at a couple of products that have added one[2][3].  I
> guess it might be hard to convince the Postgres community to add support for
> a non-standard syntax that doesn't give you anything you can't already do,
> but as an idea I find it interesting and it seems to be in the spirit of the
> part of the Third Manifesto that says: "support[ing] the usual operators of
> the relational algebra [..].  All such operators shall be expressible without
> excessive circumlocution."

The purpose is simply that explicit syntax allows for explicit query optimisation.
>
> For example, say we want all students who have one or more exam today:
>
>   SELECT s.student_id, s.name
>     FROM student s
>    WHERE EXISTS (SELECT 1
>                    FROM exam e
>                   WHERE e.student_id = s.student_id
>                     AND e.exam_date = CURRENT_DATE)
>
> I don't know Tutorial D, but I think it might express that with something
> like:
>
>   ( student MATCHING exam
>     WHERE exam_date = CURRENT_DATE )
>   { student_id, name }
>
> With 10 minutes of prototype hacking I convinced Postgres to accept SEMI and
> ANTI like this:
>
>   SELECT s.student_id, s.name
>     FROM student s SEMI JOIN exam e USING (student_id)
>    WHERE e.exam_date = CURRENT_DATE
>
> I guess a real version should accept (or require?) LEFT or RIGHT before
> SEMIANTI.

Andl has semijoin, rsemijoin, ajoin and rajoin. They are trivial to add once you have the basic algorithm, but Tutorial
Ddoes not. 

When using this hypothetical syntax I think you should be allowed
> to refer to e.exam_date in the WHERE clause but not in the SELECT list
> (Impala apparently does allow you to see data from exam, and returns values
> from an arbitrary matching row, but that doesn't seem right to me).  But the
> EXISTS syntax is correspondingly strange in that it requires you to provide a
> SELECT list which is entirely discarded, so people often write "*" or "1"
> (erm, OK, I guess you can use an empty select list in recent Postgres).

SQL has an implicit ordering of query evaluation -- you will often need to write a nested subquery or correlated query
forwhat should be very straightforward situations. That's another thing that's easy to fix, if allowed. 

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org







В списке pgsql-general по дате отправления:

Предыдущее
От: Geoff Winkless
Дата:
Сообщение: Re: Proper relational database?
Следующее
От: Geoff Winkless
Дата:
Сообщение: Re: Proper relational database?