Обсуждение: Use an enum for RELKIND_*?

Поиск
Список
Период
Сортировка

Use an enum for RELKIND_*?

От
Andres Freund
Дата:
Hi,

Right now there's no easy way to use the compiler to ensure that all
places that need to deal with all kinds of relkinds check a new
relkind.  I think we should make that easier by moving RELKIND_* to an
enum, with the existing letters as the value.

Obviously we cannot really do that for FormData_pg_class.relkind, but
switch() statements can easily cast that to RelationRelkind (or whatever
we name it).

Does anybody see a reason not to do so?

Greetings,

Andres Freund


Re: Use an enum for RELKIND_*?

От
Kyotaro HORIGUCHI
Дата:
Hello.

At Tue, 18 Dec 2018 17:13:08 -0800, Andres Freund <andres@anarazel.de> wrote in
<20181219011308.mopzyvc73nwjzdb6@alap3.anarazel.de>
> Hi,
> 
> Right now there's no easy way to use the compiler to ensure that all
> places that need to deal with all kinds of relkinds check a new
> relkind.  I think we should make that easier by moving RELKIND_* to an
> enum, with the existing letters as the value.

I feel the same pain and I had thought of a kind of that before.

> Obviously we cannot really do that for FormData_pg_class.relkind, but
> switch() statements can easily cast that to RelationRelkind (or whatever
> we name it).
> 
> Does anybody see a reason not to do so?

I think we cannot use enums having base-type, so it will work
unless we forget the cast within switch(). However, I don't think
it is not a reason not to do so.

switch ((RelationRelkind) rel->rd_rel->relkind)
{
  ...
}

char is compatible with integer under our usage there. FWIW I
don't mind explict assignments in the enum definition since we
already do the similar thing there.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Use an enum for RELKIND_*?

От
Tom Lane
Дата:
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> At Tue, 18 Dec 2018 17:13:08 -0800, Andres Freund <andres@anarazel.de> wrote in
<20181219011308.mopzyvc73nwjzdb6@alap3.anarazel.de>
>> Right now there's no easy way to use the compiler to ensure that all
>> places that need to deal with all kinds of relkinds check a new
>> relkind.  I think we should make that easier by moving RELKIND_* to an
>> enum, with the existing letters as the value.

> I think we cannot use enums having base-type, so it will work
> unless we forget the cast within switch(). However, I don't think
> it is not a reason not to do so.
>
> switch ((RelationRelkind) rel->rd_rel->relkind)

Hm.  This example doesn't seem very compelling.  If we put a
"default: elog(ERROR...)" into the switch, compilers will not warn
us about omitted values.  On the other hand, if we remove the default:
then we lose robustness against corrupted relkind values coming from disk,
which I think is going to be a net negative.  Not to mention that we get
no help at all unless we remember to add the cast to every such switch.

So, while I understand Andres' desire to make this more bulletproof,
I'm not quite sure how this proposal helps in practice.

            regards, tom lane


Re: Use an enum for RELKIND_*?

От
Andres Freund
Дата:
Hi,

On 2018-12-18 22:50:57 -0500, Tom Lane wrote:
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > At Tue, 18 Dec 2018 17:13:08 -0800, Andres Freund <andres@anarazel.de> wrote in
<20181219011308.mopzyvc73nwjzdb6@alap3.anarazel.de>
> >> Right now there's no easy way to use the compiler to ensure that all
> >> places that need to deal with all kinds of relkinds check a new
> >> relkind.  I think we should make that easier by moving RELKIND_* to an
> >> enum, with the existing letters as the value.
> 
> > I think we cannot use enums having base-type, so it will work
> > unless we forget the cast within switch(). However, I don't think
> > it is not a reason not to do so.
> >
> > switch ((RelationRelkind) rel->rd_rel->relkind)
> 
> Hm.  This example doesn't seem very compelling.  If we put a
> "default: elog(ERROR...)" into the switch, compilers will not warn
> us about omitted values.  On the other hand, if we remove the default:
> then we lose robustness against corrupted relkind values coming from disk,
> which I think is going to be a net negative.

I don't quite see the from-disk bit being particularly critical, that
seems like a fairly minor safety net. Most of those switches are at
places considerably later than where on-disk corruption normally would
be apparent.  If we really are concerned with that, it'd not be too hard
to add a relkind_as_enum() wrapper function that errored out on an
inpermissible value.


> Not to mention that we get no help at all unless we remember to add
> the cast to every such switch.

That doesn't seem too bad. A manual search and replace would probably
find the majority within an hour or two of effort.


> So, while I understand Andres' desire to make this more bulletproof,
> I'm not quite sure how this proposal helps in practice.

We've repeatedly forgotten to add new relkinds for checks that we
already have. Manually adding the cast won't be bulletproof, but it sure
would be more robust than what we have now.  There's plenty switches
where we do know that we want the exhaustive list, adding the cast there
would already make things more robust, even if we miss other places.

It'd be nice if there were an easy way to write a switch() where the
compiler enforces that all enum values are checked, but still had the
possibility to have a 'default:' block for error checking... I can't
quite come up with a good approach to emulate that though.

Greetings,

Andres Freund


Re: Use an enum for RELKIND_*?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> It'd be nice if there were an easy way to write a switch() where the
> compiler enforces that all enum values are checked, but still had the
> possibility to have a 'default:' block for error checking... I can't
> quite come up with a good approach to emulate that though.

Yeah, that would sure make things better.  I was considering
something like

    switch (enumvalue)
    {
        case A: ...
        case B: ...
        ...

#ifndef USE_ASSERT_CHECKING
        default:
            elog(ERROR, ...);
#endif
    }

so that you get the runtime protection in production builds but not
debug builds ... except that yes, you really want that protection in
debug builds too.  Maybe the #if could be on some other symbol so that
the default: is normally enabled in all builds, but we have some
lonely buildfarm animal that disables it and builds with -Werror to
get our attention for omitted cases?

            regards, tom lane


Re: Use an enum for RELKIND_*?

От
Andres Freund
Дата:
Hi,

On 2018-12-18 23:17:54 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > It'd be nice if there were an easy way to write a switch() where the
> > compiler enforces that all enum values are checked, but still had the
> > possibility to have a 'default:' block for error checking... I can't
> > quite come up with a good approach to emulate that though.
> 
> Yeah, that would sure make things better.  I was considering
> something like
> 
>     switch (enumvalue)
>     {
>         case A: ...
>         case B: ...
>         ...
> 
> #ifndef USE_ASSERT_CHECKING
>         default:
>             elog(ERROR, ...);
> #endif
>     }
> 
> so that you get the runtime protection in production builds but not
> debug builds ... except that yes, you really want that protection in
> debug builds too.  Maybe the #if could be on some other symbol so that
> the default: is normally enabled in all builds, but we have some
> lonely buildfarm animal that disables it and builds with -Werror to
> get our attention for omitted cases?

Yea, that's the best I can come up with too.  I think we'd probably want
to have the warning available during development mainly, so I'm not sure
the "lonely buildfarm animal" approach buys us enough.


There's -Wswitch-enum which causes a warning to emitted for missing enum
cases even if there's default:.  Obviously that's not generally usable,
because there's a lot of cases where intentionally do not want to be
exhaustive. We could just cast to int in those, or locally disable the
warnings, but neither sounds particularly enticing...

Greetings,

Andres Freund


Re: Use an enum for RELKIND_*?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2018-12-18 23:17:54 -0500, Tom Lane wrote:
>> Yeah, that would sure make things better.  I was considering
>> something like
>>
>>     #ifndef USE_ASSERT_CHECKING
>>         default:
>>             elog(ERROR, ...);
>>     #endif

> Yea, that's the best I can come up with too.  I think we'd probably want
> to have the warning available during development mainly, so I'm not sure
> the "lonely buildfarm animal" approach buys us enough.

Well, if it's controlled by some dedicated symbol ("CHECK_ENUM_SWITCHES"
or such) then you could certainly run a test compile with that turned
on, if you felt the need to.  But I don't really buy the complaint
that leaving it to the buildfarm to find such problems wouldn't work.
They're almost always simple, easily-fixed oversights.

Also, if we wrap all this up in a macro then it'd be possible to do
other things by adjusting the macro.  I'm envisioning

    switch ((RelationRelkind) rel->rd_rel->relkind)
    {
        case ...
        ...

        CHECK_ENUM_SWITCH(RelationRelkind, rel->rd_rel->relkind);
    }

where the macro expands to empty when you want compiler warnings,
or to something like

    default: elog(ERROR, "unsupported value %d of enum %s", ...)

in normal builds, or maybe some other way for special purposes.

(I guess pgindent would indent this as though it were part of the last
"case", which is slightly annoying, but if you leave a blank line
before it then that's probably fine.)

            regards, tom lane


Re: Use an enum for RELKIND_*?

От
Greg Stark
Дата:
Out of curiosity I built with -Wswitch-enum to see if it would be
possible to just enable it. It looks like the main culprits are the
node types and if those were switched to #defines it might be feasible
to do so though it would still be a lot of hassle to add case labels
all over the place.

But I had a second look to see if the output pointed out any actual
bugs. I found one though it's pretty minor:

lockfuncs.c:234:3: warning: enumeration value
‘LOCKTAG_SPECULATIVE_TOKEN’ not handled in switch [-Wswitch-enum]
   switch ((LockTagType) instance->locktag.locktag_type)
   ^~~~~~

It just causes speculative locks to be printed wrong in the
pg_lock_status view. And speculative locks are only held transiently
so unless you're do a bulk load using ON CONFLICT (and also cared
about pg_lock_status, perhaps in some monitoring tool) you wouldn't
even notice this:

 postgres***=# select * from pg_lock_status() where locktype =
'speculative token';

┌───────────────────┬──────────┬──────────┬──────┬───────┬────────────┬───────────────┬─────────┬───────┬──────────┬────────────────────┬───────┬───────────────┬─────────┬──────────┐
│     locktype      │ database │ relation │ page │ tuple │ virtualxid
│ transactionid │ classid │ objid │ objsubid │ virtualtransaction │
pid  │     mode      │ granted │ fastpath │

├───────────────────┼──────────┼──────────┼──────┼───────┼────────────┼───────────────┼─────────┼───────┼──────────┼────────────────────┼───────┼───────────────┼─────────┼──────────┤
│ speculative token │      634 │          │      │       │
│               │       1 │     0 │        0 │ 4/32               │
18652 │ ExclusiveLock │ t       │ f        │

└───────────────────┴──────────┴──────────┴──────┴───────┴────────────┴───────────────┴─────────┴───────┴──────────┴────────────────────┴───────┴───────────────┴─────────┴──────────┘

The speculative lock is actually on a transaction ID and "speculative
lock token" so the "database", "objid", and "objsubid" are bogus here.

Re: Use an enum for RELKIND_*?

От
Alvaro Herrera
Дата:
On 2018-Dec-21, Greg Stark wrote:

> But I had a second look to see if the output pointed out any actual
> bugs. I found one though it's pretty minor:
> 
> lockfuncs.c:234:3: warning: enumeration value
> ‘LOCKTAG_SPECULATIVE_TOKEN’ not handled in switch [-Wswitch-enum]
>    switch ((LockTagType) instance->locktag.locktag_type)
>    ^~~~~~
> 
> It just causes speculative locks to be printed wrong in the
> pg_lock_status view.

So what's a good fix?  We can add a new case to the switch.  Reporting
the XID is easy since we have a column for that, but what to do about
the uint32 value of the token?  We could put it in the virtualxid column
(which is just text), or we could put it in some int32 column and let it
show as negative; or we could add a new column.

Or we could do nothing, since there are no complaints about this
problem.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Use an enum for RELKIND_*?

От
Kyotaro HORIGUCHI
Дата:
At Wed, 19 Dec 2018 09:55:22 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <10268.1545231322@sss.pgh.pa.us>
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-12-18 23:17:54 -0500, Tom Lane wrote:
> >> Yeah, that would sure make things better.  I was considering
> >> something like
> >>
> >>     #ifndef USE_ASSERT_CHECKING
> >>         default:
> >>             elog(ERROR, ...);
> >>     #endif
> 
> > Yea, that's the best I can come up with too.  I think we'd probably want
> > to have the warning available during development mainly, so I'm not sure
> > the "lonely buildfarm animal" approach buys us enough.
> 
> Well, if it's controlled by some dedicated symbol ("CHECK_ENUM_SWITCHES"
> or such) then you could certainly run a test compile with that turned
> on, if you felt the need to.  But I don't really buy the complaint
> that leaving it to the buildfarm to find such problems wouldn't work.
> They're almost always simple, easily-fixed oversights.
> 
> Also, if we wrap all this up in a macro then it'd be possible to do
> other things by adjusting the macro.  I'm envisioning
> 
>     switch ((RelationRelkind) rel->rd_rel->relkind)
>     {
>         case ...
>         ...
> 
>         CHECK_ENUM_SWITCH(RelationRelkind, rel->rd_rel->relkind);
>     }

I might misunderstand something, but my compiler (gcc 7.3.1)
won't be quiet about omitted value even with default:.

>   switch ((x) v)
>   {
>      case A: ...
>      case B: ...
>      default: ...
>   }
> 
> t.c:12:3: warning: enumeration value 'C' not handled in switch [-Wswitch-enum]

Isn't it enough that at least one platform correctly warns that?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Use an enum for RELKIND_*?

От
Greg Stark
Дата:


On Wed 16 Jan 2019, 18:10 Alvaro Herrera 

> Or we could do nothing, since there are no complaints about this
problem.

Unless there are objections my patch is to some add it to the xid case. The tag is pretty useless for users and it's incredibly hard to even see these rows anyways 

I'll take care of it if there's no objections

Re: Use an enum for RELKIND_*?

От
Tom Lane
Дата:
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> I might misunderstand something, but my compiler (gcc 7.3.1)
> won't be quiet about omitted value even with default:.
> ...

I would call that a compiler bug, TBH.  The code is 100% correct,
if you intended to allow the default case to handle some enum
values, which is perfectly reasonable coding.

> Isn't it enough that at least one platform correctly warns that?

No, especially not if it's only a warning.  Many developers would
not see it initially, and the buildfarm likely wouldn't complain
either.

            regards, tom lane


Re: Use an enum for RELKIND_*?

От
Kyotaro HORIGUCHI
Дата:
At Thu, 24 Jan 2019 09:37:41 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <15760.1548340661@sss.pgh.pa.us>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > I might misunderstand something, but my compiler (gcc 7.3.1)
> > won't be quiet about omitted value even with default:.
> > ...
> 
> I would call that a compiler bug, TBH.  The code is 100% correct,
> if you intended to allow the default case to handle some enum
> values, which is perfectly reasonable coding.

Yeah, the code is correct. I had switch-enum in my mind.

We can use #pragma (or _Pragma) to apply an option to a specific
region.

====
#pragma GCC diagnostic push
#pragma GCC diagnostic warning "-Wswitch-enum"
    switch ((type) something)
    {
#pragma GCC diagnostic pop
====

but I don't find a usable form of syntax sugar to wrap this.  The
best I can think of is...


#define STRICT_SWITCH(type, value) { \
    _Pragma ("GCC diagnostic push")\
    _Pragma ("GCC diagnostic warning \"-Wswitch-enum\"")\
    switch((type) (value))

#define END_STRICT_SWITCH() \
    _Pragma ("GCC diagnostic pop") }


(The brace causes syntax error when END_ is omitted, but the
error messages is not so developer friendly...)

====
    STRICT_SWITCH(type, var)
    {
    case xxx:
    ...
    default:
      error(ERROR, (errmsg ("unexpected value %d" (int)var)));
    }
    END_STRICT_SWITCH();
====


> > Isn't it enough that at least one platform correctly warns that?
> 
> No, especially not if it's only a warning.  Many developers would
> not see it initially, and the buildfarm likely wouldn't complain
> either.

I agree that it would be bothersome for people who are working on
such platforms.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center