Обсуждение: Allowing additional commas between columns, and at the end of the SELECT clause

Поиск
Список
Период
Сортировка
Hello!
I have created a patch to allow additional commas between columns, and 
at the end of the SELECT clause.

Motivation:
Commas of this type are allowed in many programming languages, in some 
it is even recommended to use them at the ends of lists or objects. A 
new generation of programmers expects a more forgiving language just as 
our generation enjoyed LIMIT and the ability to write `select` in lowercase.

Accepted:
     SELECT 1,;
     SELECT 1,,,,,;
     SELECT *, from information_schema.sql_features;
     (...) RETURNING a,,b,c,;

Not accepted:
     SELECT ,;
     SELECT ,1;
     SELECT ,,,;

Advantages:
- simplifies the creation and debugging of queries by reducing the most 
common syntax error,
- eliminates the need to use the popular `1::int as dummy` at the end of 
a SELECT list,
- simplifies query generators,
- the query is still deterministic,

Disadvantages:
- counting of returned columns can be difficult,
- syntax checkers will still report errors,
- probably not SQL standard compliant,
- functionality can be controversial,

I attach the patch along with the tests.

What do you think?

Your opinions are very much welcome!


Вложения

Re: Allowing additional commas between columns, and at the end of the SELECT clause

От
Matthias van de Meent
Дата:
On Mon, 13 May 2024 at 10:42, Artur Formella <artur.formella3@gmail.com> wrote:
> Motivation:
> Commas of this type are allowed in many programming languages, in some
> it is even recommended to use them at the ends of lists or objects.

Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.

> Accepted:
>      SELECT 1,;
>      SELECT 1,,,,,;
>      SELECT *, from information_schema.sql_features;
>      (...) RETURNING a,,b,c,;
>
> Not accepted:
>      SELECT ,;
>      SELECT ,1;
>      SELECT ,,,;
>
> Advantages:
> - simplifies the creation and debugging of queries by reducing the most
> common syntax error,
> - eliminates the need to use the popular `1::int as dummy` at the end of
> a SELECT list,

This is the first time I've heard of this `1 as dummy`.

> - simplifies query generators,
> - the query is still deterministic,

What part of a query would (or would not) be deterministic? I don't
think I understand the potential concern here. Is it about whether the
statement can be parsed deterministically?

> Disadvantages:
> - counting of returned columns can be difficult,
> - syntax checkers will still report errors,
> - probably not SQL standard compliant,

I'd argue you better raise this with the standard committee if this
isn't compliant. I don't see enough added value to break standard
compliance here, especially when the standard may at some point allow
only a single trailing comma (and not arbitrarily many).

> What do you think?

Do you expect `SELECT 1,,,,,,,` to have an equivalent query identifier
to `SELECT 1;` in pg_stat_statements? Why, or why not?

Overall, I don't think unlimited commas is a good feature. A trailing
comma in the select list would be less problematic, but I'd still want
to follow the standard first and foremost.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



Hi,

As a developer, I love this feature.

But as a developer of an universal TDOP SQL parser[1], this can be a
pain. Please request it to the standard.

Regards,
Étienne


[1]: https://gitlab.com/dalibo/transqlate



Re: Allowing additional commas between columns, and at the end of the SELECT clause

От
Dagfinn Ilmari Mannsåker
Дата:
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

> On Mon, 13 May 2024 at 10:42, Artur Formella <artur.formella3@gmail.com> wrote:
>> Motivation:
>> Commas of this type are allowed in many programming languages, in some
>> it is even recommended to use them at the ends of lists or objects.
>
> Single trailing commas are a feature that's more and more common in
> languages, yes, but arbitrary excess commas is new to me. Could you
> provide some examples of popular languages which have that, as I can't
> think of any.

The only one I can think of is Perl, which I'm not sure counts as
popular any more.  JavaScript allows consecutive commas in array
literals, but they're not no-ops, they create empty array slots:

    ❯ js
    Welcome to Node.js v18.19.0.
    Type ".help" for more information.
    > [1,,2,,]
    [ 1, <1 empty item>, 2, <1 empty item> ]

- ilmari



=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
>> Single trailing commas are a feature that's more and more common in
>> languages, yes, but arbitrary excess commas is new to me. Could you
>> provide some examples of popular languages which have that, as I can't
>> think of any.

> The only one I can think of is Perl, which I'm not sure counts as
> popular any more.  JavaScript allows consecutive commas in array
> literals, but they're not no-ops, they create empty array slots:

I'm fairly down on this idea for SQL, because I think it creates
ambiguity for the ROW() constructor syntax.  That is:

    (x,y) is understood to be shorthand for ROW(x,y)

    (x) is not ROW(x), it's just x

    (x,) means what?

I realize the original proposal intended to restrict the legality of
excess commas to only a couple of places, but to me that just flags
it as a kluge.  ROW(...) ought to work pretty much the same as a
SELECT list.

As already mentioned, if you can get some variant of this through the
SQL standards process, we'll probably adopt it.  But I doubt that we
want to get out front of the committee in this area.

            regards, tom lane



Re: Allowing additional commas between columns, and at the end of the SELECT clause

От
Dagfinn Ilmari Mannsåker
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> =?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:
>> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
>>> Single trailing commas are a feature that's more and more common in
>>> languages, yes, but arbitrary excess commas is new to me. Could you
>>> provide some examples of popular languages which have that, as I can't
>>> think of any.
>
>> The only one I can think of is Perl, which I'm not sure counts as
>> popular any more.  JavaScript allows consecutive commas in array
>> literals, but they're not no-ops, they create empty array slots:
>
> I'm fairly down on this idea for SQL, because I think it creates
> ambiguity for the ROW() constructor syntax.  That is:
>
>     (x,y) is understood to be shorthand for ROW(x,y)
>
>     (x) is not ROW(x), it's just x
>
>     (x,) means what?

Python has a similar issue: (x, y) is a tuple, but (x) is just x, and
they use the trailing comma to disambiguate, so (x,) creates a
single-item tuple.  AFAIK it's the only place where the trailing comma
is significant.

> I realize the original proposal intended to restrict the legality of
> excess commas to only a couple of places, but to me that just flags
> it as a kluge.  ROW(...) ought to work pretty much the same as a
> SELECT list.

Yeah, a more principled approach would be to not special-case target
lists, but to allow one (and only one) trailing comma everywhere:
select, order by, group by, array constructors, row constructors,
everything that looks like a function call, etc.

> As already mentioned, if you can get some variant of this through the
> SQL standards process, we'll probably adopt it.  But I doubt that we
> want to get out front of the committee in this area.

Agreed.

>             regards, tom lane

- ilmari



=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I'm fairly down on this idea for SQL, because I think it creates
>> ambiguity for the ROW() constructor syntax.  That is:
>>     (x,y) is understood to be shorthand for ROW(x,y)
>>     (x) is not ROW(x), it's just x
>>     (x,) means what?

> Python has a similar issue: (x, y) is a tuple, but (x) is just x, and
> they use the trailing comma to disambiguate, so (x,) creates a
> single-item tuple.  AFAIK it's the only place where the trailing comma
> is significant.

Ugh :-(.  The semantic principle I'd prefer to have here is "a trailing
comma is ignored", but what they did breaks that.  But then again,
I'm not particularly a fan of anything about Python's syntax.

> Yeah, a more principled approach would be to not special-case target
> lists, but to allow one (and only one) trailing comma everywhere:
> select, order by, group by, array constructors, row constructors,
> everything that looks like a function call, etc.

If it can be made to work everywhere, that would get my vote.
I'm not sure if any other ambiguities arise, though.  SQL has
a lot of weird syntax corners (and the committee keeps adding
more :-().

            regards, tom lane



On 13.05.2024 11:24, Matthias van de Meent wrote:
On Mon, 13 May 2024 at 10:42, Artur Formella <artur.formella3@gmail.com> wrote:
Motivation:
Commas of this type are allowed in many programming languages, in some
it is even recommended to use them at the ends of lists or objects.
Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.
Thank for your comment.
I meant commas are recommended at the end of the list. Sorry for the lack of precision.
Typescript has a popular directive "rules": { "trailing-comma": false } in the tslint.json file, which forces trailing commas. Popular Airbnb coding style require trailing commas by eslint (https://github.com/airbnb/javascript?tab=readme-ov-file#functions--signature-invocation-indentation).

This is the first time I've heard of this `1 as dummy`.

dummy column is a popular way to end SELECT list on R&D phase to avoid the most common syntax error. This way you don't have to pay attention to commas.

SELECT <hacking /> , 1::int AS ignoreme FROM <hacking />

- simplifies query generators,
- the query is still deterministic,
What part of a query would (or would not) be deterministic? I don't
think I understand the potential concern here. Is it about whether the
statement can be parsed deterministically?

Bison doesn't report error or conflict.

I'd argue you better raise this with the standard committee if this
isn't compliant. I don't see enough added value to break standard
compliance here, especially when the standard may at some point allow
only a single trailing comma (and not arbitrarily many).


Do you expect `SELECT 1,,,,,,,` to have an equivalent query identifier
to `SELECT 1;` in pg_stat_statements? Why, or why not?
I don't know, I have a feeling that the queries are equivalent, but I don't know the mechanism.
Overall, I don't think unlimited commas is a good feature. A trailing
comma in the select list would be less problematic, but I'd still want
to follow the standard first and foremost.

I will prepare a patch with trailing comma only tomorrow.

Thank you.

Artur