Обсуждение: Invalid YAML output from EXPLAIN

Поиск
Список
Период
Сортировка

Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
Testing 9.0 beta, I found that EXPLAINing certain queries in YAML
format will produce invalid YAML, for example:

explain (format yaml) select * from foo where str_val = 'a: b';

The problem in this case is that a colon followed by whitespace is not
allowed in an unquoted plain YAML string because a parser would
interpret it as the start of a map.

So the current code in escape_yaml() is inadequate for producing valid
YAML. I think it would have to also consider at least the following
characters as special "-"  ":"  "["  "]"  "{"  "}"  ","  "\""  "'"
"|"  "*"  "&". Technically, it would also need to trap empty strings,
and strings with leading or trailing whitespace.

Making escape_yaml() completely bulletproof with this approach would
be quite difficult, and (IMO) not worth the effort, especially given
that an important requirement is that the output be machine readable,
and in my experience YAML parsers are often far from perfect.

I would therefore argue for simply calling escape_json() to produce
double quoted output for all string values, and only have numeric
values unquoted. This is not really any less human readable, and is
far more machine readable.

Patch attached.

 - Dean

Вложения

Re: Invalid YAML output from EXPLAIN

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Dean Rasheed wrote:
...
> So the current code in escape_yaml() is inadequate for producing valid
> YAML. I think it would have to also consider at least the following
> characters as special "-"  ":"  "["  "]"  "{"  "}"  ","  "\""  "'"
> "|"  "*"  "&". Technically, it would also need to trap empty strings,
> and strings with leading or trailing whitespace.
>
> Making escape_yaml() completely bulletproof with this approach would
> be quite difficult, and (IMO) not worth the effort
...

Doesn't seem like a lot of effort to me. You've already laid out most of 
the exceptions above, although they require a few tweaks.
The rules should be:

Requires quoting only if the first character:&  *  !  |  >  '  "  %  @  ` #

Same as above, but no quoting if the second character is "safe":-  ?  :

Always requires quoting:":<space>"  "<space>#"  aka  ': '  ' #'

Always requires quoting:,  [  ]  {  }

Always require quoting:(leading space) (trailing space) (empty string)

See:
http://yaml.org/spec/1.2/spec.html section 5.3 and 7.3.3


- -- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201006070943
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkwM+wAACgkQvJuQZxSWSsgWZACcCgb0rDvA6ZVhHId/q568gBGo
sjgAoLY7HbkI7sRpO45vi0jSRJ2Fiytk
=v7T/
-----END PGP SIGNATURE-----




Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
"Greg Sabino Mullane" <greg@turnstep.com> writes:
> The rules should be:

> Requires quoting only if the first character:
>  &  *  !  |  >  '  "  %  @  ` #

> Same as above, but no quoting if the second character is "safe":
>  -  ?  :

> Always requires quoting:
>  ":<space>"  "<space>#"  aka  ': '  ' #'

> Always requires quoting:
>  ,  [  ]  {  }

> Always require quoting:
>  (leading space) (trailing space) (empty string)

Egad ... this is supposed to be an easily machine-generatable format?

If it's really as broken as the above suggests, I think we should
rip it out while we still can.
        regards, tom lane


Re: Invalid YAML output from EXPLAIN

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


Tom Lane wrote:
...
> Egad ... this is supposed to be an easily machine-generatable format?
>
> If it's really as broken as the above suggests, I think we should
> rip it out while we still can.

Heh ... not like you to shrink from a challenge. ;)

I don't think the above would be particularly hard to implement myself, 
but if it becomes a really big deal, we can certainly punt by simply 
quoting anything containing an indicator (the special characters above).
It will still be 100% valid YAML, just with some excess quoting for the 
very rare case when a value contains one of the special characters.

- -- 
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201006071035
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkwNA+4ACgkQvJuQZxSWSshSswCg81kd3FdYnQup1eLWGesm+vm+
VO8AoL1Fwil/vXfRdRHx4A4zZUTDbZuT
=oPDv
-----END PGP SIGNATURE-----




Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Mon, Jun 7, 2010 at 10:37 AM, Greg Sabino Mullane <greg@turnstep.com> wrote:
> Tom Lane wrote:
> I don't think the above would be particularly hard to implement myself,
> but if it becomes a really big deal, we can certainly punt by simply
> quoting anything containing an indicator (the special characters above).
> It will still be 100% valid YAML, just with some excess quoting for the
> very rare case when a value contains one of the special characters.

Since you're the main advocate of this feature, I think you should
implement it rather than leaving it to Tom or I.

The reason why I was initially skeptical of adding a YAML output
format is that JSON is a subset of YAML.  Therefore, the JSON output
format ought to be perfectly sufficient for anyone using a YAML
parser.  If it's not, that's because their YAML processor is broken,
and they should get a new one, or because the YAML spec is defective.
The YAML format got voted in by consensus because people thought that
it would also make a nice alternative to the text format for human
readable output.  I don't believe that (it uses way too much vertical
space) but even if you accept the argument, the more we make the YAML
format look like the JSON format, the less water that argument holds.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] Invalid YAML output from EXPLAIN

От
Andrew Dunstan
Дата:

Robert Haas wrote:
> On Mon, Jun 7, 2010 at 10:37 AM, Greg Sabino Mullane <greg@turnstep.com> wrote:
>   
>> Tom Lane wrote:
>> I don't think the above would be particularly hard to implement myself,
>> but if it becomes a really big deal, we can certainly punt by simply
>> quoting anything containing an indicator (the special characters above).
>> It will still be 100% valid YAML, just with some excess quoting for the
>> very rare case when a value contains one of the special characters.
>>     
>
> Since you're the main advocate of this feature, I think you should
> implement it rather than leaving it to Tom or I.
>   

Or anyone else :-)

> The reason why I was initially skeptical of adding a YAML output
> format is that JSON is a subset of YAML.  Therefore, the JSON output
> format ought to be perfectly sufficient for anyone using a YAML
> parser.  
>   

There is some debate on this point, IIRC.

cheers

andrew


Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
"Greg Sabino Mullane" <greg@turnstep.com> writes:
> I don't think the above would be particularly hard to implement myself, 
> but if it becomes a really big deal, we can certainly punt by simply 
> quoting anything containing an indicator (the special characters above).

I would go with that.  The quoting rules you proposed previously seem
way too complicated --- meaning potentially buggy, and even if they're
not buggy, the behavior would seem unpredictable to most users.
        regards, tom lane


Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 7 June 2010 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Greg Sabino Mullane" <greg@turnstep.com> writes:
>> I don't think the above would be particularly hard to implement myself,
>> but if it becomes a really big deal, we can certainly punt by simply
>> quoting anything containing an indicator (the special characters above).
>
> I would go with that.  The quoting rules you proposed previously seem
> way too complicated --- meaning potentially buggy, and even if they're
> not buggy, the behavior would seem unpredictable to most users.
>

Well actually it's not just everything containing a special character,
it's also anything with leading or trailing whitespace, and empty
strings (not sure that can ever happen in practice).

It's because of the potential for bugs in this area, that I'd propose
just quoting everything (except numeric values) as in my original
patch.

Regards,
Dean


Re: [HACKERS] Invalid YAML output from EXPLAIN

От
Florian Weimer
Дата:
* Tom Lane:

> Egad ... this is supposed to be an easily machine-generatable format?

Perhaps you could surround all strings with "" in the generator, and
escape all potentially special characters (which seems to include some
whitespace even in quoted strings, unfortunately)?

It has been claimed before that YAML is a superset of JSON, so why
can't the YAML folks use the existing JSON output instead?

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99


Re: [HACKERS] Invalid YAML output from EXPLAIN

От
Greg Smith
Дата:
Florian Weimer wrote:
> It has been claimed before that YAML is a superset of JSON, so why
> can't the YAML folks use the existing JSON output instead?
>   

Because JSON just crosses the line where it feels like there's so much 
markup that people expect a tool is necessary to read it, which has 
always been the issue with XML too--bad human readability.  I was on the 
fence about YAML until I used it for a client issue over the weekend.  I 
was able to hack together a quick tool to work on the issue that parsed 
enough YAML *without using an external library* well enough for my 
purposes in an hour, one that was still far more robust than a similar 
hack trying to read plain old text format for EXPLAIN.  And the client 
was able to follow what was going on as I passed YAML output back and 
forth with them.  Just having every field labeled clearly cut off all 
the usual "which of these is the startup cost again?" questions I'm used 
to getting.

The complaints about YAML taking up too much vertical space are 
understandable, but completely opposite of what I care about.  I can 
e-mail a customer a YAML plan and it will survive to the other side and 
even in a reply back to me.  Whereas any non-trivial text format one is 
guaranteed to utterly destroyed by line wrapping along the way.

I think this thread could use a fresh example to remind anyone who 
hasn't played with the curent YAML format what it looks like.  Here's 
one from a query against the Dell Store 2 database:

EXPLAIN SELECT * FROM customers WHERE customerid>1000 ORDER BY zip;
QUERY PLAN
----------
Sort  (cost=4449.30..4496.80 rows=19000 width=268)  Sort Key: zip  ->  Seq Scan on customers  (cost=0.00..726.00
rows=19000width=268)        Filter: (customerid > 1000)
 

EXPLAIN (FORMAT YAML) SELECT * FROM customers WHERE customerid>1000 
ORDER BY zip;            QUERY PLAN             
-------------------------------------- Plan:                            +    Node Type: Sort                +
StartupCost: 4449.30          +    Total Cost: 4496.80            +    Plan Rows: 19000               +    Plan Width:
268               +    Sort Key:                      +      - zip                        +    Plans:
     +      - Node Type: Seq Scan        +        Parent Relationship: Outer +        Relation Name: customers   +
 Alias: customers           +        Startup Cost: 0.00         +        Total Cost: 726.00         +        Plan Rows:
19000          +        Plan Width: 268            +        Filter: (customerid > 1000)
 

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us



Re: [HACKERS] Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
Greg Smith <greg@2ndquadrant.com> writes:
> The complaints about YAML taking up too much vertical space are 
> understandable, but completely opposite of what I care about.  I can 
> e-mail a customer a YAML plan and it will survive to the other side and 
> even in a reply back to me.  Whereas any non-trivial text format one is 
> guaranteed to utterly destroyed by line wrapping along the way.

> I think this thread could use a fresh example to remind anyone who 
> hasn't played with the curent YAML format what it looks like.

So?  This doesn't look amazingly unlike the current JSON output,
and to the extent that we have to add more quoting to it, it's
going to look even more like the JSON output.

Given the lack of any field separators other than newlines, I'm also
finding myself extremely doubtful about the claim that it survives
line-wrapping mutilations well.  For instance this bit:
      - Node Type: Seq Scan        Parent Relationship: Outer

doesn't appear to have anything but whitespace to distinguish it from
      - Node Type: Seq Scan Parent        Relationship: Outer
        regards, tom lane


Re: [HACKERS] Invalid YAML output from EXPLAIN

От
Greg Smith
Дата:
Tom Lane wrote:
> This doesn't look amazingly unlike the current JSON output,
> and to the extent that we have to add more quoting to it, it's
> going to look even more like the JSON output.
>   

I don't know about that; here's the JSON one:

EXPLAIN (FORMAT JSON) SELECT * FROM customers WHERE customerid>1000 
ORDER BY zip;               QUERY PLAN                
-------------------------------------------[                                        +  {
     +    "Plan": {                            +      "Node Type": "Sort",               +      "Startup Cost":
4449.30,          +      "Total Cost": 4496.80,             +      "Plan Rows": 19000,                +      "Plan
Width":268,                 +      "Sort Key": ["zip"],               +      "Plans": [                         +
{                                +          "Node Type": "Seq Scan",       +          "Parent Relationship": "Outer",+
       "Relation Name": "customers",  +          "Alias": "customers",          +          "Startup Cost": 0.00,
 +          "Total Cost": 726.00,          +          "Plan Rows": 19000,            +          "Plan Width": 268,
      +          "Filter": "(customerid > 1000)"+        }                                +      ]
           +    }                                    +  }                                      +]
 
From the perspective of how that's less useful as a human form of 
output, it's longer, wider, and has redundant punctuation that gets in 
the way.

I think that YAML quoting will need to respect one of the special cases 
to keep from ruining its readability:  "Requires quoting only if the 
first character" for " will make its current format look terrible if 
that rule is applied to the whole line instead.  That sounds like a 
necessary special case to include:  don't quote any quote characters 
that appear unless they're the first character on the line.  Everything 
else could switch back to really aggressive quoting in every spot and 
that wouldn't hurt the readability of the format very much IMHO.

> Given the lack of any field separators other than newlines, I'm also
> finding myself extremely doubtful about the claim that it survives
> line-wrapping mutilations well.

All I was claiming there is that the output is dramatically less wide 
than the standard text format of the same plan, and therefore far less 
likely to get nailed by a mail client that wraps at normal line widths.  
Agreed that once wrapping does occur, it has serious problems too.

Here are the stats for this plan, leaving off the QUERY PLAN header from 
each:

TEXT:  4 vertical, 69 horizontal
YAML:  18 vertical, 36 horizontal
JSON:  25 vertical, 43 horizontal
XML[1]:  27 vertical, 60 horizontal

Quote the TEXT line with "> " or get a plan with one more line of 
intendation, and you're likely to get wrapped badly at the 72 character 
line limit some clients use.  Quite a bit more headroom before the YAML 
format will wrap like that; JSON is in the middle.

I now see plenty of use for YAML when exchanging plans over e-mail, and 
it's a bonus that should survive that format to be parseable on the 
other side.   JSON and XML are certainly the preferred way to feed plans 
into analysis tools. unambiguously.

[1] Might as well make this a complete example:
<explain xmlns="http://www.postgresql.org/2009/explain">  +  <Query>                                                 +
 <Plan>                                                +      <Node-Type>Sort</Node-Type>                         +
<Startup-Cost>4449.30</Startup-Cost>                +      <Total-Cost>4496.80</Total-Cost>                    +
<Plan-Rows>19000</Plan-Rows>                       +      <Plan-Width>268</Plan-Width>                        +
<Sort-Key>                                         +        <Item>zip</Item>                                  +
</Sort-Key>                                        +      <Plans>                                             +
<Plan>                                           +          <Node-Type>Seq Scan</Node-Type>                 +
<Parent-Relationship>Outer</Parent-Relationship>+         <Relation-Name>customers</Relation-Name>        +
<Alias>customers</Alias>                       +          <Startup-Cost>0.00</Startup-Cost>               +
<Total-Cost>726.00</Total-Cost>                +          <Plan-Rows>19000</Plan-Rows>                    +
<Plan-Width>268</Plan-Width>                   +          <Filter>(customerid > 1000)</Filter>         +
</Plan>                                          +      </Plans>                                            +
</Plan>                                              +  </Query>
+</explain>

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us



Re: [HACKERS] Invalid YAML output from EXPLAIN

От
Florian Weimer
Дата:
* Greg Smith:

> Florian Weimer wrote:
>> It has been claimed before that YAML is a superset of JSON, so why
>> can't the YAML folks use the existing JSON output instead?
>>
>
> Because JSON just crosses the line where it feels like there's so much
> markup that people expect a tool is necessary to read it, which has
> always been the issue with XML too--bad human readability.

But YAML is not human-readable.  There are human-readable subsets of
it, but the general serializers do not produce them, and specific
serializers are difficult to get right (as we've seen).

> EXPLAIN (FORMAT YAML) SELECT * FROM customers WHERE customerid>1000
> ORDER BY zip;
>             QUERY PLAN
> -------------------------------------
> - Plan:                            +
>     Node Type: Sort                +
>     Startup Cost: 4449.30          +
>     Total Cost: 4496.80            +
>     Plan Rows: 19000               +
>     Plan Width: 268                +
>     Sort Key:                      +
>       - zip                        +
>     Plans:                         +
>       - Node Type: Seq Scan        +
>         Parent Relationship: Outer +
>         Relation Name: customers   +
>         Alias: customers           +
>         Startup Cost: 0.00         +
>         Total Cost: 726.00         +
>         Plan Rows: 19000           +
>         Plan Width: 268            +
>         Filter: (customerid > 1000)

What does your parser do with this (equivalent but shorter) YAML
output?

- Plan: !!map   &0 Node Type: Sort   &1 Startup Cost: 4449.30   &2 Total Cost: 4496.80   &3 Plan Rows: &5 19000   &4
PlanWidth: &6 268   Sort Key: ["zip"]   Plans: !!seq     - *0: Seq Scan       Parent Relationship: Outer       Relation
Name:&7 customers       Alias: *7       *1: 0.00       *2: 726.00       *3: *5       *4: *6       Filter: (customerid >
1000)

Looking at the spec, it's rather difficult to come up with a readable
subset which can parsed easily and is general in the sense that it can
express empty strings, strings with embedded newlines, and so on.
YAML's rules for dealing with whitespace are fairly complex, but are
probably needed to get a more compact notation than JSON.

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99


Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Mon, Jun 7, 2010 at 4:14 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wro=
te:
> Testing 9.0 beta, I found that EXPLAINing certain queries in YAML
> format will produce invalid YAML, for example:
>
> explain (format yaml) select * from foo where str_val =3D 'a: b';
>
> The problem in this case is that a colon followed by whitespace is not
> allowed in an unquoted plain YAML string because a parser would
> interpret it as the start of a map.
>
> So the current code in escape_yaml() is inadequate for producing valid
> YAML. I think it would have to also consider at least the following
> characters as special "-" =A0":" =A0"[" =A0"]" =A0"{" =A0"}" =A0"," =A0"\=
"" =A0"'"
> "|" =A0"*" =A0"&". Technically, it would also need to trap empty strings,
> and strings with leading or trailing whitespace.
>
> Making escape_yaml() completely bulletproof with this approach would
> be quite difficult, and (IMO) not worth the effort, especially given
> that an important requirement is that the output be machine readable,
> and in my experience YAML parsers are often far from perfect.
>
> I would therefore argue for simply calling escape_json() to produce
> double quoted output for all string values, and only have numeric
> values unquoted. This is not really any less human readable, and is
> far more machine readable.
>
> Patch attached.

I've committed a patch which I think will address this issue without
uglifying the output quite so much.  Also, I didn't like the idea of
not applying escaping to both the keys and values, even though we
think we'll never have a key that requires escaping.  With this
approach, that change isn't needed.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Tue, Jun 8, 2010 at 10:47 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jun 7, 2010 at 4:14 AM, Dean Rasheed <dean.a.rasheed@gmail.com> w=
rote:
>> Testing 9.0 beta, I found that EXPLAINing certain queries in YAML
>> format will produce invalid YAML, for example:
>>
>> explain (format yaml) select * from foo where str_val =3D 'a: b';
>>
>> The problem in this case is that a colon followed by whitespace is not
>> allowed in an unquoted plain YAML string because a parser would
>> interpret it as the start of a map.
>>
>> So the current code in escape_yaml() is inadequate for producing valid
>> YAML. I think it would have to also consider at least the following
>> characters as special "-" =A0":" =A0"[" =A0"]" =A0"{" =A0"}" =A0"," =A0"=
\"" =A0"'"
>> "|" =A0"*" =A0"&". Technically, it would also need to trap empty strings,
>> and strings with leading or trailing whitespace.
>>
>> Making escape_yaml() completely bulletproof with this approach would
>> be quite difficult, and (IMO) not worth the effort, especially given
>> that an important requirement is that the output be machine readable,
>> and in my experience YAML parsers are often far from perfect.
>>
>> I would therefore argue for simply calling escape_json() to produce
>> double quoted output for all string values, and only have numeric
>> values unquoted. This is not really any less human readable, and is
>> far more machine readable.
>>
>> Patch attached.
>
> I've committed a patch which I think will address this issue without
> uglifying the output quite so much. =A0Also, I didn't like the idea of
> not applying escaping to both the keys and values, even though we
> think we'll never have a key that requires escaping. =A0With this
> approach, that change isn't needed.

Er, I should also say, thanks for the report, and please test.  I am
definitely not an expert on YAML.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
> Er, I should also say, thanks for the report, and please test. =A0I am
> definitely not an expert on YAML.
>

I'm not an expert on YAML either, but I don't think this works (at
least it breaks against the online YAML parser here:
http://yaml-online-parser.appspot.com/). If the string starts with a
".", then it tries to treat it as a floating point number and baulks
if the rest of the string isn't a valid number.

Actually that gives me another argument for quoting *all* string
values: if a string value happens to be a valid number and we output
it unquoted, then parsers will treat is a number, forcing the user to
cast it to a string in their code. This is likely to lead to bugs in
user code, where things initially appear to work, then unexpectedly
start failing in pathalogical cases.

Personally, I'd also not try to escape keys at all. If in the future
we add a new key, then we'd be forced to choose between quoting it, or
more likely choosing a different key that doesn't require quoting,
which is what most people do in my (admittedly limited) experience.

In short, despite everything that's been said on this thread, I still
prefer my patch - but then I suppose I would say that :-)

Cheers,
Dean

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 07:58, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> I prefer my patch - but then I suppose I would say that :-)
>

Seriously though, I can think of a number of good arguments in favour
of the "quote all string values unconditionally" approach:

- It's the simplest, least obtrusive fix to our code.
- It's the least likely to have bugs in our code.
- It's the least likely to lead to bugs in client code.
- It gives clean, consistent output (IMO).
- It's the easiest to machine parse (marginally).
- Strings are always strings.
- It's still perfectly human readable.
- Re arguments about mailers chopping it up, it's easier to piece
together again if you know all long lines end with a double quote.

I love Florian's example, because it illustrates that there are all
sorts of YAML output that we *could* produce that would be technically
equivalent. However, I think this is a case of the simplest solution
being the best.

Regards,
Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wro=
te:
> On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
>> Er, I should also say, thanks for the report, and please test. =A0I am
>> definitely not an expert on YAML.
>>
>
> I'm not an expert on YAML either, but I don't think this works (at
> least it breaks against the online YAML parser here:
> http://yaml-online-parser.appspot.com/). If the string starts with a
> ".", then it tries to treat it as a floating point number and baulks
> if the rest of the string isn't a valid number.

Really?  I enter:

- foo
- bar
- .baz

And it produces this JSON:

[
  "foo",
  "bar",
  ".baz"
]

That looks OK to me.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 12:07, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> w=
rote:
>> On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Er, I should also say, thanks for the report, and please test. =A0I am
>>> definitely not an expert on YAML.
>>>
>>
>> I'm not an expert on YAML either, but I don't think this works (at
>> least it breaks against the online YAML parser here:
>> http://yaml-online-parser.appspot.com/). If the string starts with a
>> ".", then it tries to treat it as a floating point number and baulks
>> if the rest of the string isn't a valid number.
>
> Really? =A0I enter:
>
> - foo
> - bar
> - .baz
>
> And it produces this JSON:
>
> [
> =A0"foo",
> =A0"bar",
> =A0".baz"
> ]
>
> That looks OK to me.
>

Ah, OK I didn't test those cases properly before composing my email.
It's actually only a "." on its own that it can't parse.

- just: write some
- yaml:
  - .

ERROR:

invalid literal for float(): .

I'm not sure if that's valid YAML or not.

My comment about numbers still applies though. The following are
different values:

- just: write some
- yaml:
  - 123
  - "123"

[
  {
    "just": "write some"
  },
  {
    "yaml": [
      123,
      "123"
    ]
  }
]

Regards,
Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 7:23 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wro=
te:
> On 9 June 2010 12:07, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jun 9, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> =
wrote:
>>> On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> Er, I should also say, thanks for the report, and please test. =A0I am
>>>> definitely not an expert on YAML.
>>>>
>>>
>>> I'm not an expert on YAML either, but I don't think this works (at
>>> least it breaks against the online YAML parser here:
>>> http://yaml-online-parser.appspot.com/). If the string starts with a
>>> ".", then it tries to treat it as a floating point number and baulks
>>> if the rest of the string isn't a valid number.
>>
>> Really? =A0I enter:
>>
>> - foo
>> - bar
>> - .baz
>>
>> And it produces this JSON:
>>
>> [
>> =A0"foo",
>> =A0"bar",
>> =A0".baz"
>> ]
>>
>> That looks OK to me.
>>
>
> Ah, OK I didn't test those cases properly before composing my email.
> It's actually only a "." on its own that it can't parse.

Well, at first blush, that looks like it might be a bug in the parser.
 I don't see anything in the spec to indicate that that case should be
treated specially.

> My comment about numbers still applies though. The following are
> different values:
>
> - just: write some
> - yaml:
> =A0- 123
> =A0- "123"

Well, you can't have abc mean the same thing as "abc" but then
complain that 123 isn't equivalent to "123"...

This format is really a pain to work with.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 12:32, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 7:23 AM, Dean Rasheed <dean.a.rasheed@gmail.com> w=
rote:
>> On 9 June 2010 12:07, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Jun 9, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com>=
 wrote:
>>>> On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
>>>>> Er, I should also say, thanks for the report, and please test. =A0I am
>>>>> definitely not an expert on YAML.
>>>>>
>>>>
>>>> I'm not an expert on YAML either, but I don't think this works (at
>>>> least it breaks against the online YAML parser here:
>>>> http://yaml-online-parser.appspot.com/). If the string starts with a
>>>> ".", then it tries to treat it as a floating point number and baulks
>>>> if the rest of the string isn't a valid number.
>>>
>>> Really? =A0I enter:
>>>
>>> - foo
>>> - bar
>>> - .baz
>>>
>>> And it produces this JSON:
>>>
>>> [
>>> =A0"foo",
>>> =A0"bar",
>>> =A0".baz"
>>> ]
>>>
>>> That looks OK to me.
>>>
>>
>> Ah, OK I didn't test those cases properly before composing my email.
>> It's actually only a "." on its own that it can't parse.
>
> Well, at first blush, that looks like it might be a bug in the parser.
> =A0I don't see anything in the spec to indicate that that case should be
> treated specially.
>

Yeah, I think it *is* valid, and JYaml parses it OK.
Some people will no doubt say "if your parser can't handle it, get a
better parser", but I'd rather not make it more difficult than it
needs to be.


>> My comment about numbers still applies though. The following are
>> different values:
>>
>> - just: write some
>> - yaml:
>> =A0- 123
>> =A0- "123"
>
> Well, you can't have abc mean the same thing as "abc" but then
> complain that 123 isn't equivalent to "123"...
>

Yeah. I know that JYaml parses 123 to a java.lang.Integer, and "123"
to a java.lang.String.


> This format is really a pain to work with.
>

Agreed :-(

- Dean

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
> please test.

Well your patch definitely fixes my original bug, and AFAICT always
produces valid YAML output now. I've only found one case where a
particular parser has difficulty parsing the output, and you'd have to
write a pretty perverse query to hit that case.

So that just leaves this sort of thing:

explain (format yaml) select * from foo as "123";
       QUERY PLAN
-------------------------
 - Plan:                +
     Node Type: Seq Scan+
     Relation Name: foo +
     Alias: 123         +
     Startup Cost: 0.00 +
     Total Cost: 23.10  +
     Plan Rows: 1310    +
     Plan Width: 32
(1 row)

Does anyone care that Alias will sometimes be a string, and sometimes a number?

ITSM that, since postgresql knows that it's a string, it ought to
output something that parsers can unambiguously treat as a string too.

But this is also a pretty obscure case that probably only someone
deliberately trying to be awkward would do (which is me, with my
tester hat on :-)).

Regards,
Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 8:46 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wro=
te:
> On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
>> please test.
>
> Well your patch definitely fixes my original bug, and AFAICT always
> produces valid YAML output now. I've only found one case where a
> particular parser has difficulty parsing the output, and you'd have to
> write a pretty perverse query to hit that case.

Excellent.

> So that just leaves this sort of thing:
>
> explain (format yaml) select * from foo as "123";
> =A0 =A0 =A0 QUERY PLAN
> -------------------------
> =A0- Plan: =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0+
> =A0 =A0 Node Type: Seq Scan+
> =A0 =A0 Relation Name: foo +
> =A0 =A0 Alias: 123 =A0 =A0 =A0 =A0 +
> =A0 =A0 Startup Cost: 0.00 +
> =A0 =A0 Total Cost: 23.10 =A0+
> =A0 =A0 Plan Rows: 1310 =A0 =A0+
> =A0 =A0 Plan Width: 32
> (1 row)
>
> Does anyone care that Alias will sometimes be a string, and sometimes a n=
umber?
>
> ITSM that, since postgresql knows that it's a string, it ought to
> output something that parsers can unambiguously treat as a string too.
>
> But this is also a pretty obscure case that probably only someone
> deliberately trying to be awkward would do (which is me, with my
> tester hat on :-)).

I guess we could do this by (a) conditionalizing the YAML case in
ExplainProperty() in the same way that the JSON case is currently
conditionalized, and (b) changing the first if statement in
escape_yaml() to set needs_quoting =3D true unless the first character
is alphabetic or an underscore.

By the way, can I ask why you're not just using the JSON format for
this?  I mean, I'm glad you are, because it exposed a bug that we got
fixed before release, but it seems a little masochistic...!

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 14:14, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 8:46 AM, Dean Rasheed <dean.a.rasheed@gmail.com> w=
rote:
>> On 9 June 2010 03:48, Robert Haas <robertmhaas@gmail.com> wrote:
>>> please test.
>>
>> Well your patch definitely fixes my original bug, and AFAICT always
>> produces valid YAML output now. I've only found one case where a
>> particular parser has difficulty parsing the output, and you'd have to
>> write a pretty perverse query to hit that case.
>
> Excellent.
>
>> So that just leaves this sort of thing:
>>
>> explain (format yaml) select * from foo as "123";
>> =A0 =A0 =A0 QUERY PLAN
>> -------------------------
>> =A0- Plan: =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0+
>> =A0 =A0 Node Type: Seq Scan+
>> =A0 =A0 Relation Name: foo +
>> =A0 =A0 Alias: 123 =A0 =A0 =A0 =A0 +
>> =A0 =A0 Startup Cost: 0.00 +
>> =A0 =A0 Total Cost: 23.10 =A0+
>> =A0 =A0 Plan Rows: 1310 =A0 =A0+
>> =A0 =A0 Plan Width: 32
>> (1 row)
>>
>> Does anyone care that Alias will sometimes be a string, and sometimes a =
number?
>>
>> ITSM that, since postgresql knows that it's a string, it ought to
>> output something that parsers can unambiguously treat as a string too.
>>
>> But this is also a pretty obscure case that probably only someone
>> deliberately trying to be awkward would do (which is me, with my
>> tester hat on :-)).
>
> I guess we could do this by (a) conditionalizing the YAML case in
> ExplainProperty() in the same way that the JSON case is currently
> conditionalized, and (b) changing the first if statement in
> escape_yaml() to set needs_quoting =3D true unless the first character
> is alphabetic or an underscore.
>

Yes, I think that would do it.


> By the way, can I ask why you're not just using the JSON format for
> this? =A0I mean, I'm glad you are, because it exposed a bug that we got
> fixed before release, but it seems a little masochistic...!
>

Actually I doubt that I will use this feature at all! I only use
EXPLAIN from psql, and usually I'm the only one who needs to read it,
so the TEXT format will remain my preferred option.

I was just doing some random beta testing, working through the list of
cool new features.

Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 9:35 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>> Does anyone care that Alias will sometimes be a string, and sometimes a number?
>> I guess we could do this by (a) conditionalizing the YAML case in
>> ExplainProperty() in the same way that the JSON case is currently
>> conditionalized, and (b) changing the first if statement in
>> escape_yaml() to set needs_quoting = true unless the first character
>> is alphabetic or an underscore.
> Yes, I think that would do it.

After further review, it appears to me that this change is pretty much
required, because otherwise a string like 0xa won't be quoted.  I
might think it's OK for "123" to turn into 123, but I'm not going to
be so happy about "0xa" turning into 10.  Please test the attached
patch.

>> By the way, can I ask why you're not just using the JSON format for
>> this?  I mean, I'm glad you are, because it exposed a bug that we got
>> fixed before release, but it seems a little masochistic...!
>
> Actually I doubt that I will use this feature at all! I only use
> EXPLAIN from psql, and usually I'm the only one who needs to read it,
> so the TEXT format will remain my preferred option.
>
> I was just doing some random beta testing, working through the list of
> cool new features.

Quick, somebody give this man a cigar!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Вложения

Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 9, 2010 at 9:35 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>> Does anyone care that Alias will sometimes be a string, and sometimes a number?

> After further review, it appears to me that this change is pretty much
> required, because otherwise a string like 0xa won't be quoted.  I
> might think it's OK for "123" to turn into 123, but I'm not going to
> be so happy about "0xa" turning into 10.  Please test the attached
> patch.

I still agree with Dean's original proposal: always quote the values of
strings.

            regards, tom lane

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 11:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jun 9, 2010 at 9:35 AM, Dean Rasheed <dean.a.rasheed@gmail.com> =
wrote:
>>>> Does anyone care that Alias will sometimes be a string, and sometimes =
a number?
>
>> After further review, it appears to me that this change is pretty much
>> required, because otherwise a string like 0xa won't be quoted. =A0I
>> might think it's OK for "123" to turn into 123, but I'm not going to
>> be so happy about "0xa" turning into 10. =A0Please test the attached
>> patch.
>
> I still agree with Dean's original proposal: always quote the values of
> strings.

I'd still rather rip the format out entirely than do that.  Dean's
proposal was based on the idea that it would be safe to quote only the
values and not the keys, which is not something I care to bank on.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 9, 2010 at 11:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I still agree with Dean's original proposal: always quote the values of
>> strings.

> I'd still rather rip the format out entirely than do that.

I'd be on board with that too ;-)

> Dean's
> proposal was based on the idea that it would be safe to quote only the
> values and not the keys, which is not something I care to bank on.

Why not?  Surely we can restrict EXPLAIN's set of key names to be safe.

            regards, tom lane

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 16:05, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 11:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Wed, Jun 9, 2010 at 9:35 AM, Dean Rasheed <dean.a.rasheed@gmail.com>=
 wrote:
>>>>> Does anyone care that Alias will sometimes be a string, and sometimes=
 a number?
>>
>>> After further review, it appears to me that this change is pretty much
>>> required, because otherwise a string like 0xa won't be quoted. =A0I
>>> might think it's OK for "123" to turn into 123, but I'm not going to
>>> be so happy about "0xa" turning into 10. =A0Please test the attached
>>> patch.
>>
>> I still agree with Dean's original proposal: always quote the values of
>> strings.
>
> I'd still rather rip the format out entirely than do that. =A0Dean's
> proposal was based on the idea that it would be safe to quote only the
> values and not the keys, which is not something I care to bank on.
>

Why? There are only a small number of keys, and we're completely in
control of them. You'd very quickly notice if one of them needed
quoting, which is not the case with values.

Dean

Re: Invalid YAML output from EXPLAIN

От
"Kevin Grittner"
Дата:
Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

> So that just leaves this sort of thing:
>
> explain (format yaml) select * from foo as "123";

What about other quoted identifiers?  If I recall correctly, a
quoted identifier can contain *any* characters (although a quote
must be represented as two adjacent quotes).

test=# create schema "0";
CREATE SCHEMA
test=# create table "0"."123" ("456" int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"123_pkey" for table "123"
CREATE TABLE
test=# explain (format yaml) select * from "0"."123" where "456" =
'789';
            QUERY PLAN
-----------------------------------
 - Plan:                          +
     Node Type: Index Scan        +
     Scan Direction: Forward      +
     Index Name: "\"123_pkey\""   +
     Relation Name: 123           +
     Alias: 123                   +
     Startup Cost: 0.00           +
     Total Cost: 8.27             +
     Plan Rows: 1                 +
     Plan Width: 4                +
     Index Cond: "(\"456\" = 789)"
(1 row)

Also, the trailing spaces in this format (stripped from this email
to avoid odd spacing) are more than a little annoying.

-Kevin

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jun 9, 2010 at 11:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I still agree with Dean's original proposal: always quote the values of
>>> strings.
>
>> I'd still rather rip the format out entirely than do that.
>
> I'd be on board with that too ;-)
>
>> Dean's
>> proposal was based on the idea that it would be safe to quote only the
>> values and not the keys, which is not something I care to bank on.
>
> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be safe.

It seems to me that it would be easy for a future patch to break this
by accident.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Why not?  Surely we can restrict EXPLAIN's set of key names to be safe.

> It seems to me that it would be easy for a future patch to break this
> by accident.

Really?  What likely key names would be in need of quoting?  I can't
imagine accepting a field name that contains punctuation or leading
or trailing whitespace, for example.

            regards, tom lane

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 17:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be saf=
e.
>
>> It seems to me that it would be easy for a future patch to break this
>> by accident.
>
> Really? =A0What likely key names would be in need of quoting? =A0I can't
> imagine accepting a field name that contains punctuation or leading
> or trailing whitespace, for example.
>

And any of those things would break the XML output too.
I can't imagine that going unnoticed.

Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be saf=
e.
>
>> It seems to me that it would be easy for a future patch to break this
>> by accident.
>
> Really? =A0What likely key names would be in need of quoting? =A0I can't
> imagine accepting a field name that contains punctuation or leading
> or trailing whitespace, for example.

It seemed to me, in particular, that someone might use a # symbol,
like "# of Iterations".

Maybe I'm being paranoid.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 17:52, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be sa=
fe.
>>
>>> It seems to me that it would be easy for a future patch to break this
>>> by accident.
>>
>> Really? =A0What likely key names would be in need of quoting? =A0I can't
>> imagine accepting a field name that contains punctuation or leading
>> or trailing whitespace, for example.
>
> It seemed to me, in particular, that someone might use a # symbol,
> like "# of Iterations".
>

Then the resulting XML tagname would be invalid too
I think they would soon realise/be told that it was a bad idea.

Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 12:57 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wr=
ote:
> On 9 June 2010 17:52, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jun 9, 2010 at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be s=
afe.
>>>
>>>> It seems to me that it would be easy for a future patch to break this
>>>> by accident.
>>>
>>> Really? =A0What likely key names would be in need of quoting? =A0I can't
>>> imagine accepting a field name that contains punctuation or leading
>>> or trailing whitespace, for example.
>>
>> It seemed to me, in particular, that someone might use a # symbol,
>> like "# of Iterations".
>>
>
> Then the resulting XML tagname would be invalid too
> I think they would soon realise/be told that it was a bad idea.

Hmm, you're right.  Maybe we should go with your approach, then.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 12:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 12:57 PM, Dean Rasheed <dean.a.rasheed@gmail.com> =
wrote:
>> On 9 June 2010 17:52, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Jun 9, 2010 at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>>> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>>> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be =
safe.
>>>>
>>>>> It seems to me that it would be easy for a future patch to break this
>>>>> by accident.
>>>>
>>>> Really? =A0What likely key names would be in need of quoting? =A0I can=
't
>>>> imagine accepting a field name that contains punctuation or leading
>>>> or trailing whitespace, for example.
>>>
>>> It seemed to me, in particular, that someone might use a # symbol,
>>> like "# of Iterations".
>>>
>>
>> Then the resulting XML tagname would be invalid too
>> I think they would soon realise/be told that it was a bad idea.
>
> Hmm, you're right. =A0Maybe we should go with your approach, then.

After thinking about this further, I think I'd still like to take one
more crack at fixing this without quoting absolutely everything.  I
argued against this feature, but we decided to take it, and it seems
that one of the major arguments that is being put forward is that it
will be more readable than JSON, because it will have less
punctuation.  While the idea of optimizing a machine-readable format
for human-readability doesn't typically carry much water around here,
it's really the only use case for having this particular feature at
all, so, if we're not going to rip it out, ISTM we ought to respect
what it's there for.  I would be more than willing to agree that if
one more attempt isn't sufficient to fix the problem then we'll either
quote everything, or rip the whole thing out.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 12:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 12:57 PM, Dean Rasheed <dean.a.rasheed@gmail.com> =
wrote:
>> On 9 June 2010 17:52, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Jun 9, 2010 at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>>> On Wed, Jun 9, 2010 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>>> Why not? =A0Surely we can restrict EXPLAIN's set of key names to be =
safe.
>>>>
>>>>> It seems to me that it would be easy for a future patch to break this
>>>>> by accident.
>>>>
>>>> Really? =A0What likely key names would be in need of quoting? =A0I can=
't
>>>> imagine accepting a field name that contains punctuation or leading
>>>> or trailing whitespace, for example.
>>>
>>> It seemed to me, in particular, that someone might use a # symbol,
>>> like "# of Iterations".
>>>
>>
>> Then the resulting XML tagname would be invalid too
>> I think they would soon realise/be told that it was a bad idea.
>
> Hmm, you're right. =A0Maybe we should go with your approach, then.

After thinking about this further, I think I'd still like to take one
more crack at fixing this without quoting absolutely everything.  I
argued against this feature, but we decided to take it, and it seems
that one of the major arguments that is being put forward is that it
will be more readable than JSON, because it will have less
punctuation.  While the idea of optimizing a machine-readable format
for human-readability doesn't typically carry much water around here,
it's really the only use case for having this particular feature at
all, so, if we're not going to rip it out, ISTM we ought to respect
what it's there for.  I would be more than willing to agree that if
one more attempt isn't sufficient to fix the problem then we'll either
quote everything, or rip the whole thing out.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 19:50, Robert Haas <robertmhaas@gmail.com> wrote:
> After thinking about this further, I think I'd still like to take one
> more crack at fixing this without quoting absolutely everything. =A0I
> argued against this feature, but we decided to take it, and it seems
> that one of the major arguments that is being put forward is that it
> will be more readable than JSON, because it will have less
> punctuation.

Hmm. Well it's quite subjective, but IMO it's already more readable
than JSON regardless of whether or not values are quoted, simply
because it doesn't have [ ] and { } for lists and maps, which for JSON
adds significantly to the number of lines in longer plans.

Compare

               QUERY PLAN
----------------------------------------
 - Plan:                               +
     Node Type: Nested Loop            +
     Join Type: Inner                  +
     Startup Cost: 0.00                +
     Total Cost: 57313.15              +
     Plan Rows: 4579600                +
     Plan Width: 16                    +
     Plans:                            +
       - Node Type: Seq Scan           +
         Parent Relationship: Outer    +
         Relation Name: foo            +
         Alias: f1                     +
         Startup Cost: 0.00            +
         Total Cost: 31.40             +
         Plan Rows: 2140               +
         Plan Width: 8                 +
       - Node Type: Materialize        +
         Parent Relationship: Inner    +
         Startup Cost: 0.00            +
         Total Cost: 42.10             +
         Plan Rows: 2140               +
         Plan Width: 8                 +
         Plans:                        +
           - Node Type: Seq Scan       +
             Parent Relationship: Outer+
             Relation Name: foo        +
             Alias: f2                 +
             Startup Cost: 0.00        +
             Total Cost: 31.40         +
             Plan Rows: 2140           +
             Plan Width: 8

with
                  QUERY PLAN
-----------------------------------------------
 [                                            +
   {                                          +
     "Plan": {                                +
       "Node Type": "Nested Loop",            +
       "Join Type": "Inner",                  +
       "Startup Cost": 0.00,                  +
       "Total Cost": 57313.15,                +
       "Plan Rows": 4579600,                  +
       "Plan Width": 16,                      +
       "Plans": [                             +
         {                                    +
           "Node Type": "Seq Scan",           +
           "Parent Relationship": "Outer",    +
           "Relation Name": "foo",            +
           "Alias": "f1",                     +
           "Startup Cost": 0.00,              +
           "Total Cost": 31.40,               +
           "Plan Rows": 2140,                 +
           "Plan Width": 8                    +
         },                                   +
         {                                    +
           "Node Type": "Materialize",        +
           "Parent Relationship": "Inner",    +
           "Startup Cost": 0.00,              +
           "Total Cost": 42.10,               +
           "Plan Rows": 2140,                 +
           "Plan Width": 8,                   +
           "Plans": [                         +
             {                                +
               "Node Type": "Seq Scan",       +
               "Parent Relationship": "Outer",+
               "Relation Name": "foo",        +
               "Alias": "f2",                 +
               "Startup Cost": 0.00,          +
               "Total Cost": 31.40,           +
               "Plan Rows": 2140,             +
               "Plan Width": 8                +
             }                                +
           ]                                  +
         }                                    +
       ]                                      +
     }                                        +
   }                                          +
 ]

For me, the presence or absence of quotes around the values would make
little difference to the readability, compared to all those brackets.



>=A0While the idea of optimizing a machine-readable format
> for human-readability doesn't typically carry much water around here,
> it's really the only use case for having this particular feature at
> all, so, if we're not going to rip it out, ISTM we ought to respect
> what it's there for. =A0I would be more than willing to agree that if
> one more attempt isn't sufficient to fix the problem then we'll either
> quote everything, or rip the whole thing out.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> Hmm. Well it's quite subjective, but IMO it's already more readable
>> than JSON regardless of whether or not values are quoted, simply
>> because it doesn't have [ ] and { } for lists and maps, which for JSON
>> adds significantly to the number of lines in longer plans.
>
> Yeah. =A0Also, I think it would be fair to not quote values that are known
> constants (for example, Node Type: Seq Scan) and are chosen to not need
> quoting. =A0It's just the things that are variables that worry me.

Passing down information about which things are known constants seems
more complicated to me than just getting the quoting rules right in
the first place.  If you look at the patch I proposed, you'll see that
it's really quite simple and only a slight tightening of what I
committed already.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> Hmm. Well it's quite subjective, but IMO it's already more readable
> than JSON regardless of whether or not values are quoted, simply
> because it doesn't have [ ] and { } for lists and maps, which for JSON
> adds significantly to the number of lines in longer plans.

Yeah.  Also, I think it would be fair to not quote values that are known
constants (for example, Node Type: Seq Scan) and are chosen to not need
quoting.  It's just the things that are variables that worry me.

            regards, tom lane

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 4:47 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wro=
te:
> On 9 June 2010 20:56, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jun 9, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>>>> Hmm. Well it's quite subjective, but IMO it's already more readable
>>>> than JSON regardless of whether or not values are quoted, simply
>>>> because it doesn't have [ ] and { } for lists and maps, which for JSON
>>>> adds significantly to the number of lines in longer plans.
>>>
>>> Yeah. =A0Also, I think it would be fair to not quote values that are kn=
own
>>> constants (for example, Node Type: Seq Scan) and are chosen to not need
>>> quoting. =A0It's just the things that are variables that worry me.
>>
>> Passing down information about which things are known constants seems
>> more complicated to me than just getting the quoting rules right in
>> the first place. =A0If you look at the patch I proposed, you'll see that
>> it's really quite simple and only a slight tightening of what I
>> committed already.
>>
>
> Reading the YAML spec, I've just spotted yet another case that'll
> break what you're proposing: if you don't quote "true" and "false",
> the parser will think they're booleans rather than strings.
>
> This is really why I'm opposed to this approach. There are just so
> many gotchas that it's impossible to be 100% sure that you've
> accounted for them all.

OK, I give up.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Dean Rasheed
Дата:
On 9 June 2010 20:56, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>>> Hmm. Well it's quite subjective, but IMO it's already more readable
>>> than JSON regardless of whether or not values are quoted, simply
>>> because it doesn't have [ ] and { } for lists and maps, which for JSON
>>> adds significantly to the number of lines in longer plans.
>>
>> Yeah. =A0Also, I think it would be fair to not quote values that are kno=
wn
>> constants (for example, Node Type: Seq Scan) and are chosen to not need
>> quoting. =A0It's just the things that are variables that worry me.
>
> Passing down information about which things are known constants seems
> more complicated to me than just getting the quoting rules right in
> the first place. =A0If you look at the patch I proposed, you'll see that
> it's really quite simple and only a slight tightening of what I
> committed already.
>

Reading the YAML spec, I've just spotted yet another case that'll
break what you're proposing: if you don't quote "true" and "false",
the parser will think they're booleans rather than strings.

This is really why I'm opposed to this approach. There are just so
many gotchas that it's impossible to be 100% sure that you've
accounted for them all.

Regards,
Dean

Re: Invalid YAML output from EXPLAIN

От
Robert Haas
Дата:
On Wed, Jun 9, 2010 at 4:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 9, 2010 at 4:47 PM, Dean Rasheed <dean.a.rasheed@gmail.com> w=
rote:
>> On 9 June 2010 20:56, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Jun 9, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>>>>> Hmm. Well it's quite subjective, but IMO it's already more readable
>>>>> than JSON regardless of whether or not values are quoted, simply
>>>>> because it doesn't have [ ] and { } for lists and maps, which for JSON
>>>>> adds significantly to the number of lines in longer plans.
>>>>
>>>> Yeah. =A0Also, I think it would be fair to not quote values that are k=
nown
>>>> constants (for example, Node Type: Seq Scan) and are chosen to not need
>>>> quoting. =A0It's just the things that are variables that worry me.
>>>
>>> Passing down information about which things are known constants seems
>>> more complicated to me than just getting the quoting rules right in
>>> the first place. =A0If you look at the patch I proposed, you'll see that
>>> it's really quite simple and only a slight tightening of what I
>>> committed already.
>>>
>>
>> Reading the YAML spec, I've just spotted yet another case that'll
>> break what you're proposing: if you don't quote "true" and "false",
>> the parser will think they're booleans rather than strings.
>>
>> This is really why I'm opposed to this approach. There are just so
>> many gotchas that it's impossible to be 100% sure that you've
>> accounted for them all.
>
> OK, I give up.

I have committed your patch, with some changes to the comments.

Thanks for bearing with me.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Invalid YAML output from EXPLAIN

От
Bruce Momjian
Дата:
Robert Haas wrote:
> On Wed, Jun 9, 2010 at 4:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Wed, Jun 9, 2010 at 4:47 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> >> On 9 June 2010 20:56, Robert Haas <robertmhaas@gmail.com> wrote:
> >>> On Wed, Jun 9, 2010 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> >>>>> Hmm. Well it's quite subjective, but IMO it's already more readable
> >>>>> than JSON regardless of whether or not values are quoted, simply
> >>>>> because it doesn't have [ ] and { } for lists and maps, which for JSON
> >>>>> adds significantly to the number of lines in longer plans.
> >>>>
> >>>> Yeah. ?Also, I think it would be fair to not quote values that are known
> >>>> constants (for example, Node Type: Seq Scan) and are chosen to not need
> >>>> quoting. ?It's just the things that are variables that worry me.
> >>>
> >>> Passing down information about which things are known constants seems
> >>> more complicated to me than just getting the quoting rules right in
> >>> the first place. ?If you look at the patch I proposed, you'll see that
> >>> it's really quite simple and only a slight tightening of what I
> >>> committed already.
> >>>
> >>
> >> Reading the YAML spec, I've just spotted yet another case that'll
> >> break what you're proposing: if you don't quote "true" and "false",
> >> the parser will think they're booleans rather than strings.
> >>
> >> This is really why I'm opposed to this approach. There are just so
> >> many gotchas that it's impossible to be 100% sure that you've
> >> accounted for them all.
> >
> > OK, I give up.
>
> I have committed your patch, with some changes to the comments.
>
> Thanks for bearing with me.

So, is there still value to a YAML format vs. JSON?  They look similar
to me in this simple case:

    test=> EXPLAIN (FORMAT JSON) SELECT * FROM pg_class;
                 QUERY PLAN
    ------------------------------------
     [                                 +
       {                               +
         "Plan": {                     +
           "Node Type": "Seq Scan",    +
           "Relation Name": "pg_class",+
           "Alias": "pg_class",        +
           "Startup Cost": 0.00,       +
           "Total Cost": 9.53,         +
           "Plan Rows": 253,           +
           "Plan Width": 190           +
         }                             +
       }                               +
     ]
    (1 row)

    test=> EXPLAIN (FORMAT YAML) SELECT * FROM pg_class;
              QUERY PLAN
    -------------------------------
     - Plan:                      +
         Node Type: "Seq Scan"    +
         Relation Name: "pg_class"+
         Alias: "pg_class"        +
         Startup Cost: 0.00       +
         Total Cost: 9.53         +
         Plan Rows: 253           +
         Plan Width: 190
    (1 row)

Is unquoted identifiers the only value for YAML?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +

Re: Invalid YAML output from EXPLAIN

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> So, is there still value to a YAML format vs. JSON?  They look similar
> to me in this simple case:

Well, removing the various braces and brackets reduces the line count
significantly.  Not convinced it's really worth much though.

            regards, tom lane

Re: Invalid YAML output from EXPLAIN

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > So, is there still value to a YAML format vs. JSON?  They look similar
> > to me in this simple case:
>
> Well, removing the various braces and brackets reduces the line count
> significantly.  Not convinced it's really worth much though.

Ah, I see that now.  Thanks.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +