Re: patch: preload dictionary new version

From Pavel Stehule
Subject Re: patch: preload dictionary new version
Date
Msg-id AANLkTim02dngqQfnSeYASVDu7jhWEbWRAzjGpdj-ABZL@mail.gmail.com
In reply to Re: patch: preload dictionary new version  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: patch: preload dictionary new version  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
2010/7/8 Robert Haas <robertmhaas@gmail.com>:
> On Thu, Jul 8, 2010 at 7:03 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2010/7/8 Robert Haas <robertmhaas@gmail.com>:
>>> On Wed, Jul 7, 2010 at 10:50 PM, Takahiro Itagaki
>>> <itagaki.takahiro@oss.ntt.co.jp> wrote:
>>>> This patch allocates memory with non-file-based mmap() to preload text search
>>>> dictionary files at the server start. Note that dict files are not mmap'ed
>>>> directly in the patch; mmap() is used for reallocatable shared memory.
>>>
>>> I thought someone (Tom?) had previously proposed the idea of writing a
>>> dictionary precompiler that would produce a file which could then be
>>> mmap()'d into the backend.  Has any thought been given to that
>>> approach?
>>
>> The precompiler can only save some of the time spent on parsing, but
>> that isn't the main issue. Without the simple allocator the dictionary
>> data takes about 55 MB; with the simple allocator, about 10 MB. If you
>> have 100 sessions, this data can be repeated 100 times in memory -
>> about 1 GB (for the Czech dictionary).  I think the memory can be used
>> better.
>
> A precompiler can give you all the same memory management benefits.
>
>> Minimally you have to read these 10MB from disc - maybe from file
>> cache - but it takes some time too - but it will be significantly
>> better than now.
>
> If you use mmap(), you don't need to do anything of the sort.  And the
> EXEC_BACKEND case doesn't require as many gymnastics, either.  And the
> variable can be PGC_SIGHUP or even PGC_USERSET instead of
> PGC_POSTMASTER.

I do use mmap(), and with mmap() a precompiler is not necessary.
The dictionary is loaded only once, in the original ispell format. I
think that is much simpler to administer: you just copy the ispell
files. There are no possible problems with binary incompatibility, and
you don't need to solve serialisation and deserialisation. You also
don't need to copy the TSearch ispell parser code into a client
application, and we would probably still want to support non-compiled
ispell dictionaries anyway. Using a precompiler also raises new
questions for upgrades!
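
As a rough illustration of the "load once, share via mmap()" idea (this is not code from the patch), the sketch below maps an anonymous shared region, copies a word into it in the parent, and lets a forked child read it without reloading. The function name preload_and_share() and the extra flag byte are hypothetical and exist only for this example; a real backend would need proper synchronisation and error reporting.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: "preload" a word into non-file-based shared memory,
 * fork, and let the child confirm it sees the same bytes.
 * Returns 1 if the child saw the data, 0 if not, -1 on a system error.
 */
static int preload_and_share(const char *word)
{
    assert(word != NULL);

    size_t len = strlen(word) + 1;   /* string plus terminating NUL */
    size_t size = len + 1;           /* plus one flag byte for the child */

    /* Anonymous MAP_SHARED memory: no file behind it, shared after fork(). */
    char *region = mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    memcpy(region, word, len);       /* the one-time load, in the parent */
    region[len] = 0;                 /* flag: not yet confirmed */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(region, size);
        return -1;
    }
    if (pid == 0) {
        /* Child: reads the preloaded data in place and sets the flag. */
        if (strcmp(region, word) == 0)
            region[len] = 1;
        _exit(0);
    }

    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid)
        result = (region[len] == 1) ? 1 : 0;  /* child's write is visible */

    munmap(region, size);
    return result;
}
```

The child's write to the flag byte being visible in the parent is what distinguishes MAP_SHARED from an ordinary copy-on-write heap after fork(). Calling preload_and_share("slovo") on a POSIX system should return 1.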

The real problem is which API to use on MS Windows, where mmap() doesn't exist.

I think we can divide this problem into three parts:

a) a simple allocator - it can be used for more than just TSearch dictionaries
b) sharing the data - this is important for large dictionaries
c) preloading - it decreases the load time of the first TSearch query
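
To make part (a) more concrete, here is a minimal sketch of what a "simple allocator" could look like, assuming it is a bump allocator over one contiguous region (for example, a region obtained from mmap()). The names SimpleArena, arena_init() and arena_alloc() and the 8-byte alignment are assumptions for illustration, not the patch's actual API. The saving comes from having no per-chunk header and no free list, which matters when a dictionary makes a very large number of tiny allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena: one contiguous buffer handed out front to back. */
typedef struct SimpleArena
{
    unsigned char *base;   /* start of the region (assumed 8-byte aligned) */
    size_t         size;   /* total bytes available */
    size_t         used;   /* bytes handed out so far */
} SimpleArena;

static void arena_init(SimpleArena *a, void *buf, size_t size)
{
    assert(a != NULL && buf != NULL);
    a->base = buf;
    a->size = size;
    a->used = 0;
}

/*
 * Return n bytes aligned to 8, or NULL if the arena is exhausted.
 * There is no per-chunk bookkeeping and no way to free a single chunk;
 * the whole arena is released at once by whoever owns the buffer.
 */
static void *arena_alloc(SimpleArena *a, size_t n)
{
    const size_t align = 8;
    size_t start = (a->used + align - 1) & ~(align - 1);

    if (start > a->size || n > a->size - start)
        return NULL;

    a->used = start + n;
    return a->base + start;
}
```

With a 64-byte buffer, a 3-byte request followed by a 5-byte request lands at offsets 0 and 8, so the only overhead is the alignment padding between chunks.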

Regards

Pavel Stehule



>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

