Reorganize collation lookup time and place

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема Reorganize collation lookup time and place
Дата
Msg-id 559d6b57-bf1e-6a34-5e30-9f3f4bf5ba1c@2ndquadrant.com
обсуждение исходный текст
Ответы Re: Reorganize collation lookup time and place  (Andreas Karlsson <andreas@proxel.se>)
Re: Reorganize collation lookup time and place  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Attached is a patch that reorganizes how and where collations are looked up.

Until now, a fmgr C function that makes use of collation information
(for example varstr_cmp(), str_toupper()) gets passed the collation OID,
looks up the collation with pg_newlocale_from_collation(), and then does
something with it.

With this change, the lookup is moved earlier, typically during executor
initialization.  The fmgr functions receive the locale pointer that is
the result of that lookup.

This makes the collation lookup more similar to the function lookup.  In
many cases they now happen right next to each other.
pg_newlocale_from_collation() becomes analogous to fmgr_info() and
pg_locale_t to FmgrInfo *.

We also save repeated lookups in repeated function calls.  I don't
expect massive performance gains, since it's all cached either way, but
it just makes more sense to look up the collation once per executor node
instead of once per function call.

Moreover, this makes the structure and logic of functions like
varstr_cmp() easier, because they don't have to deal with a mix of OIDs
and locale pointers that are sometimes not set depending on the value of
the other.  A locale pointer of zero now has the more natural meaning of
"nothing was provided" instead of a mix of nothing or default.
Conversely, if a collation was assigned for the call, the passed-in
pointer is always non-zero.

I think this would also help other efforts like allowing ICU to be used
for the default collation.  For that, we need to store the ICU collators
somewhere, so having NULL represent the default locale won't do anymore.

Some renaming might be in order.  The original collation implementation
used pg_newlocale_... and pg_locale_t as thin wrappers around the
operating system's newlocale() and locale_t.  Now these have grown into
larger structures with various catalog lookup information.  It might be
worth renaming, for example pg_locale_t to CollInfo *.  (It's also
confusing in this context that pg_locale_t is already a pointer but
FmgrInfo is a struct.)

Also, in order to avoid including pg_locale.h in a lot more places, I
have created an incomplete type fmLocalePtr in fmgr.h.  This isn't
necessarily the final solution, given the discussion in the previous
paragraph.  Maybe a better structure would be to have, say, CollInfo *
and whatever the lookup function is called in fmgr.h and fmgr.c, and
then only have fmgr.c include pg_locale.h and call into pg_locale.c.  Or
something similar.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: docs: outdated reference to recursive expression evaluation
Следующее
От: David Steele
Дата:
Сообщение: Add timeline to partial WAL segments