Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

From	Andres Freund
Subject	Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
Date	12 April 2017, 23:34:37
Msg-id	20170412173437.qfqfnl6k3icpfczx@alap3.anarazel.de
In reply to	[HACKERS] Cutting initdb's runtime (Perl question embedded) (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On 2017-04-12 10:12:47 -0400, Tom Lane wrote:
> Andres mentioned, and I've confirmed locally, that a large chunk of
> initdb's runtime goes into regprocin's brute-force lookups of function
> OIDs from function names.  The recent discussion about cutting TAP test
> time prompted me to look into that question again.  We had had some
> grand plans for getting genbki.pl to perform the name-to-OID conversion
> as part of a big rewrite, but since that project is showing few signs
> of life, I'm thinking that a more localized performance fix would be
> a good thing to look into.  There seem to be a couple of plausible
> routes to a fix:
> 
> 1. The best thing would still be to make genbki.pl do the conversion,
> and write numeric OIDs into postgres.bki.  The core stumbling block
> here seems to be that for most catalogs, Catalog.pm and genbki.pl
> never really break down a DATA line into fields --- and we certainly
> have got to do that, if we're going to replace the values of regproc
> fields.  The places that do need to do that approximate it like this:
> 
>     # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>     # of the individual data fields.  Just splitting on whitespace
>     # won't work, because some quoted fields might contain internal
>     # whitespace.  We handle this by folding them all to a simple
>     # "xxx". Fortunately, this script doesn't need to look at any
>     # fields that might need quoting, so this simple hack is
>     # sufficient.
>     $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>     @{$row}{@attnames} = split /\s+/, $row->{bki_values};
> 
> We would need a bullet-proof, non-hack, preferably not too slow way to
> split DATA lines into fields properly.  I'm one of the world's worst
> Perl programmers, but surely there's a way?

I've done something like 1) before:
http://archives.postgresql.org/message-id/20150221230839.GE2037%40awork2.anarazel.de

I don't think the speed matters all that much, because we'll only do it
when generating the .bki file - a couple of ms more or less won't matter
much.
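
For illustration, a quote-aware split of the kind asked about above
could look roughly like the following. This is only a sketch, not the
code from the patch linked above: the name split_bki_values is made up,
and it assumes that a quoted field never contains an embedded double
quote (no escape handling at all).

```perl
use strict;
use warnings;

# Hypothetical helper: split the text between the parentheses of a DATA
# line into fields.  A field is either a double-quoted string (which may
# contain whitespace) or a run of non-whitespace, non-quote characters.
# Assumption: quoted fields contain no embedded double quotes.
sub split_bki_values
{
	my ($bki_values) = @_;
	my @fields;

	# \G anchors each match where the previous one ended; /c keeps that
	# position when the match finally fails, so we can check the rest.
	while ($bki_values =~ /\G\s*(?:"([^"]*)"|([^\s"]+))/gc)
	{
		push @fields, defined $1 ? $1 : $2;
	}

	# Anything left over other than trailing whitespace is malformed,
	# for example an unterminated quote.
	$bki_values =~ /\G\s*\z/gc
	  or die "could not parse DATA values: $bki_values\n";

	return @fields;
}

my @fields = split_bki_values('boolin PGNSP PGUID 12 "a b" _null_');
print join('|', @fields), "\n";    # prints boolin|PGNSP|PGUID|12|a b|_null_
```

Note that this drops the quotes, so anything that writes the values
back out to postgres.bki would have to re-quote fields containing
whitespace.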

IIRC I also spent some more time on loading the data files from a
different format:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/sane-catalog
although that's presumably heavily outdated now.

- Andres
