Faster "SET search_path"

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Faster "SET search_path"
Дата
Msg-id 04c8592dbd694e4114a3ed87139a7a04e4363030.camel@j-davis.com
обсуждение исходный текст
Ответы Re: Faster "SET search_path"  (Isaac Morland <isaac.morland@gmail.com>)
Re: Faster "SET search_path"  (Nathan Bossart <nathandbossart@gmail.com>)
Re: Faster "SET search_path"  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Improve performance of "SET search_path".

Motivation:

Creating functions with a "SET search_path" clause is safer and more
secure because the function behavior doesn't change based on the
caller's search_path setting.

Setting search_path in the function declaration is especially important
for SECURITY DEFINER functions[1], but even SECURITY INVOKER functions
can be executed more like SECURITY DEFINER in some contexts (e.g.
REINDEX executing an index function). Also, it's just error-prone to
depend on the caller's search_path unless there's a specific reason you
want to do that.

Unfortunately, adding a "SET search_path" clause to functions slows
them down. The attached patches close the performance gap
substantially.

Changes:

0001: Transform the settings in proconfig into a List for faster
processing. This is simple and speeds up any proconfig setting.

0002: Introduce CheckIdentifierString(), which is a faster version of
SplitIdentifierString() that only validates, and can be used in
check_search_path().

0003: Cache of previous search_path settings. The key is the raw
namespace_search_path string and the role OID, and it caches the
computed OID list. Changes to the search_path setting or the role can
retrieve the cached OID list as long as nothing else invalidates the
cache (changes to the temp schema or a syscache invalidation of
pg_namespace or pg_role).

One behavior change in 0003 is that retrieving a cached OID list
doesn't call InvokeNamespaceSearchHook(). It would be easy enough to
disable caching when a hook exists, but I didn't see a reason to expect
that "SET search_path" must invoke that hook each time. Invoking it
when computing for the first time, or after a real invalidation, seemed
fine to me. Feedback on that is welcome.

Test:

  CREATE SCHEMA a;
  CREATE SCHEMA b;
  CREATE TABLE big(i) AS SELECT generate_series(1,20000000);
  VACUUM big; CHECKPOINT;
  CREATE FUNCTION inc(int) RETURNS INT
    LANGUAGE plpgsql
    AS $$ begin return $1+1; end; $$;
  CREATE FUNCTION inc_ab(int) RETURNS INT
    LANGUAGE plpgsql SET search_path = a, b
    AS $$ begin return $1+1; end; $$;

  -- baseline
  EXPLAIN ANALYZE SELECT inc(i) FROM big;

  -- test query
  EXPLAIN ANALYZE SELECT inc_ab(i) FROM big;

Results:

  baseline:          4.3s
  test query:
    without patch:  14.7s
    0001:           13.6s
    0001,0002:      10.4s
    0001,0002,0003:  8.6s

Timings were inconsistent for me so I took the middle of three runs.

It's a lot faster than without the patch. It's still 2X worse than not
specifying any search_path (baseline), but I think it brings it into
"usable" territory for more use cases.

Regards,
    Jeff Davis

[1]
https://www.postgresql.org/docs/current/sql-createfunction.html#SQL-CREATEFUNCTION-SECURITY

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Rui Zhao"
Дата:
Сообщение: pg_upgrade fails with in-place tablespace
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: POC, WIP: OR-clause support for indexes