Re: patch: preload dictionary new version

Поиск
Список
Период
Сортировка
От Takahiro Itagaki
Тема Re: patch: preload dictionary new version
Дата
Msg-id 20100708115051.FFE1.52131E4D@oss.ntt.co.jp
обсуждение исходный текст
Ответ на patch: preload dictionary new version  (Pavel Stehule <pavel.stehule@gmail.com>)
Ответы Re: patch: preload dictionary new version  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: patch: preload dictionary new version  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Pavel Stehule <pavel.stehule@gmail.com> wrote:

> this version has enhanced AllocSet allocator - it can use a  mmap API.

I review your patch and will report some comments. However, I don't have
test cases for the patch because there is no large dictionaries in the
default postgres installation. I'd like to ask you to supply test data
for the patch.

This patch allocates memory with non-file-based mmap() to preload text search
dictionary files at the server start. Note that dist files are not mmap'ed
directly in the patch; mmap() is used for reallocatable shared memory.

The dictinary loader is also modified a bit to use simple_alloc() instead
of palloc() for long-lived cache. It can reduce calls of AllocSetAlloc(),
that have some overheads to support pfree(). Since the cache is never
released, simple_alloc() seems to have better performance than palloc().
Note that the optimization will also work for non-preloaded dicts.

=== Questions ===
- How do backends share the dict cache? You might expect postmaster's catalog is inherited to backends with fork(), but
wedon't use fork() on Windows.
 

- Why are SQL functions dpreloaddict_init() and dpreloaddict_lexize() defined but not used?

=== Design ===
- You added 3 custom parameters (dict_preload.dictfile/afffile/stopwords), but I think text search configuration names
isbetter than file names. However, it requires system catalog access but we cannot access any catalog at the moment of
preloading.If config-name-based setting is difficult, we need to write docs about where we can get the dict names to be
preloadedinstead. (from \dFd+ ?)
 

- Do we need to support multiple preloaded dicts? I think dict_preload.* should accept a list of items to be loaded.
GUC_LIST_INPUTwill be a help.
 

- Server doesn't start when I added dict_preload to shared_preload_libraries and didn't add any custom parameters.
FATAL: missing AffFile parameter But server should start with no effects or print WARNING messages for "no dicts are
preloaded"in such case.
 

- We could replace simple_alloc() to a new MemoryContextMethods that doesn't support pfree() but has better
performance.It doesn't look ideal for me to implement simple_alloc() on the top of palloc().
 

=== Implementation ===
I'm sure that your patch is WIP, but I'll note some issues just in case.

- We need Makefile for contrib/dict_preload.

- mmap() is not always portable. We should check the availability in configure, and also have an alternative
implementationfor Win32.
 


Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: ALTER TABLE SET STATISTICS requires AccessExclusiveLock
Следующее
От: marcin mank
Дата:
Сообщение: Re: Proposal for 9.1: WAL streaming from WAL buffers