Re: Parser Cruft in gram.y

From: Greg Stark
Subject: Re: Parser Cruft in gram.y
Date:
Msg-id: CAM-w4HMPNOVfp+PV18hsaC+YVpCtOgzfmy-H2yfLV=AMEDQOGg@mail.gmail.com
In reply to: Re: Parser Cruft in gram.y  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Parser Cruft in gram.y
List: pgsql-hackers
On Tue, Dec 18, 2012 at 10:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Yeah, that's why I don't know how to make it work.  It feels like this
> is partly an artifact of the tool, though.  I mean, suppose we haven't
> read anything yet.  Then, the next token can't be an IDENT, so if we
> see an unreserved keyword, we know we're not going to convert it to an
> IDENT.  OTOH, if we've seen CREATE TABLE, the next token cannot be an
> unreserved keyword that is intended as a keyword; it has to be
> something that will reduce to ColId.
>
> I guess the problem situation is where we can shift the keyword and
> then use the following token to decide whether to reduce it to
> ColId/type_function_name/ColLabel or use some other rule instead; the
> CREATE INDEX CONCURRENTLY productions might be such a case.
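
For illustration, here is a toy bison fragment, nothing like the real
gram.y and with all names invented, that reproduces that kind of
lookahead problem:

/* After CREATE INDEX, a CONCURRENTLY token can either be the option
 * keyword (shift) or an unreserved keyword about to reduce to a name.
 * One token of lookahead cannot tell these apart, so bison reports a
 * shift/reduce conflict (declared with %expect) and resolves it by
 * shifting, which means an index actually named "concurrently" cannot
 * be parsed by this fragment. */
%token CREATE INDEX_P CONCURRENTLY ON IDENT
%expect 1

%%

stmt:               CREATE INDEX_P opt_concurrently name ON name ;
opt_concurrently:   CONCURRENTLY | /* empty */ ;
name:               IDENT | unreserved_keyword ;
unreserved_keyword: CONCURRENTLY ;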

It seems to me the avenue for simplifying the transition table would
be keywords that can never be used in the same place. That is, if we
replaced all the elements of such a set with a single token, then the
grammar would be unambiguous and we could insert a check that the
right actual token was present in each place it's used. I'm thinking
of the various "noise words" that the SQL standard introduces, which
are all going to be reduced to IDENT except in a few places, each of
which will only admit one such noise word anyway.
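
A minimal C sketch of that idea (every name here is invented;
PRECISION, VARYING and ZONE are just plausible examples of such
words):

#include <stdbool.h>

/* The lexer returns one generic NOISE_WORD token for any member of
 * the set and records which word it actually saw; a grammar action
 * that admits only one of them then checks explicitly. */
typedef enum
{
    NW_PRECISION,               /* as in DOUBLE PRECISION */
    NW_VARYING,                 /* as in CHARACTER VARYING */
    NW_ZONE                     /* as in TIME ZONE */
} NoiseWord;

static NoiseWord last_noise_word;   /* set by the lexer */

/* Used in an action, e.g.: if (!noise_word_is(NW_ZONE)) yyerror(...) */
static bool
noise_word_is(NoiseWord expected)
{
    return last_noise_word == expected;
}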

I think doing this manually would be unmaintainable, since every time
we modified the grammar it would introduce random, unpredictable
conflicts which would be hard to debug. But I wonder if we could
mechanically analyze the generated transition table, find any such
large sets, and then either rewrite the transition table directly or
regenerate the grammar and rerun bison on it.
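
A rough sketch of that analysis step, assuming the generated tables
have first been expanded into a dense action[state][token] array
(bison's real tables are compressed, so a pass would have to flatten
them first; ACT_ERROR and the function name are made up):

#include <stdbool.h>

#define ACT_ERROR 0             /* assumed encoding for "syntax error" */

/* Two tokens can share one column if no state has a real action for
 * both, i.e. they are never valid in the same place. */
static bool
tokens_never_compete(const int *action, int nstates, int ntokens,
                     int tok_a, int tok_b)
{
    for (int s = 0; s < nstates; s++)
    {
        int a = action[s * ntokens + tok_a];
        int b = action[s * ntokens + tok_b];

        if (a != ACT_ERROR && b != ACT_ERROR)
            return false;       /* some state admits both tokens */
    }
    return true;                /* disjoint columns: candidates to merge */
}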

Alternatively, we could just replace the transition table with a less
wasteful representation, such as a list of perfect hash tables each
just large enough to hold the valid transitions, or even a single
very large perfect hash table where the key is <state, token>.
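
The lookup side might look something like this; perfect_hash() stands
in for a function generated offline over exactly the valid
<state, token> pairs, and every name here is invented:

#include <stdint.h>

typedef struct
{
    uint32_t key;               /* (state << 16) | token */
    int16_t  action;            /* shift/reduce encoding */
} Entry;

extern uint32_t perfect_hash(uint32_t key);     /* generated offline */
extern const Entry hash_table[];
extern const uint32_t hash_table_size;

static int16_t
lookup_action(uint16_t state, uint16_t token)
{
    uint32_t key = ((uint32_t) state << 16) | token;
    const Entry *e = &hash_table[perfect_hash(key) % hash_table_size];

    /* The stored key distinguishes a real transition from the "can
     * never occur" pairs the hash maps onto the same slot. */
    return (e->key == key) ? e->action : -1;    /* -1: syntax error */
}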

But I'm not entirely convinced any of this is actually useful. Just
because the transition table is large doesn't mean it's inefficient.
Most of these bytes are being wasted on transitions which can never
occur, or which can never occur in syntactically valid SQL. The
transitions which can occur will still be present in any condensed
representation we come up with, and the L2 cache might already be
large enough to hold these hot transitions, which may not be a very
large subset at all.

Valgrind comes with a tool called Cachegrind, which can emulate the
cache behaviour of several CPU variants and produce reports. Can it
be made to produce a report for a specific block of memory?
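
For what it's worth, the usual run looks like this (Cachegrind
attributes misses to code locations, per function or per line via
cg_annotate, rather than to data addresses, so the closest
approximation might be filtering the report down to the parser's
table-lookup code; the exact postgres invocation below is just a
guess):

    valgrind --tool=cachegrind ./postgres --single postgres < queries.sql
    cg_annotate cachegrind.out.<pid>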

-- 
greg


