Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Поиск

Список

Период

Сортировка

От	Joel Jacobson
Тема	Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Дата	21 марта 2019 г. 22:46:21
Msg-id	CAASwCXff9pO=bqw2xZha-u5WZ9bRoZ-vSt2_AydQ73zm5LnDtg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) (Tom Lane <tgl@sss.pgh.pa.us>)
Список	pgsql-hackers

Дерево обсуждения

On Wed, Mar 20, 2019 at 9:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Joel Jacobson <joel@trustly.com> writes:
> I've seen a performance trick in other hash functions [1]
> to instead read multiple bytes in each iteration,
> and then handle the remaining bytes after the loop.
> [1] https://github.com/wangyi-fudan/wyhash/blob/master/wyhash.h#L29

I can't get very excited about this, seeing that we're only going to
be hashing short strings. I don't really believe your 30% number
for short strings; and even if I did, there's no evidence that the
hash functions are worth any further optimization in terms of our
overall performance.

I went ahead and tested this approach anyway, since I need this algorithm in a completely different project.

The benchmark below shows stats for three different keywords per length, compiled with -O2:

$ c++ -O2 -std=c++14 -o bench_perfect_hash bench_perfect_hash.cc

$ ./bench_perfect_hash

keyword length char-a-time (ns) word-a-time (ns) diff (%)

as 2 3.30 2.62 -0.21

at 2 3.54 2.66 -0.25

by 2 3.30 2.59 -0.22

add 3 4.01 3.15 -0.21

all 3 4.04 3.11 -0.23

and 3 3.84 3.11 -0.19

also 4 4.50 3.17 -0.30

both 4 4.49 3.06 -0.32

call 4 4.95 3.42 -0.31

abort 5 6.09 4.02 -0.34

admin 5 5.26 3.65 -0.31

after 5 5.18 3.76 -0.27

access 6 5.97 3.91 -0.34

action 6 5.86 3.89 -0.34

always 6 6.10 3.77 -0.38

analyse 7 6.67 4.64 -0.30

analyze 7 7.09 4.87 -0.31

between 7 7.02 4.66 -0.34

absolute 8 7.49 3.82 -0.49

backward 8 7.13 3.88 -0.46

cascaded 8 7.23 4.17 -0.42

aggregate 9 8.04 4.49 -0.44

assertion 9 7.98 4.52 -0.43

attribute 9 8.03 4.44 -0.45

assignment 10 8.58 4.67 -0.46

asymmetric 10 9.07 4.57 -0.50

checkpoint 10 9.15 4.53 -0.51

constraints 11 9.58 5.14 -0.46

insensitive 11 9.62 5.30 -0.45

publication 11 10.30 5.60 -0.46

concurrently 12 10.36 4.81 -0.54

current_date 12 11.17 5.48 -0.51

current_role 12 11.15 5.10 -0.54

authorization 13 11.87 5.50 -0.54

configuration 13 11.50 5.51 -0.52

xmlattributes 13 11.72 5.66 -0.52

current_schema 14 12.17 5.58 -0.54

localtimestamp 14 11.78 5.46 -0.54

characteristics 15 12.77 5.97 -0.53

current_catalog 15 12.65 5.87 -0.54

current_timestamp 17 14.19 6.12 -0.57

Also, as best I can tell, the approach you propose would result
in an endianness dependence, meaning we'd have to have separate
lookup tables for BE and LE machines. That's not a dealbreaker
perhaps, but it is certainly another point on the "it's not worth it"
side of the argument.

I can see how the same problem has been worked-around in e.g. pg_crc32.h:

#ifdef WORDS_BIGENDIAN

#define FIN_CRC32C(crc) ((crc) = pg_bswap32(crc) ^ 0xFFFFFFFF)

#else

#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)

#endif

So I used the same trick in PerfectHash.pm:

$f .= sprintf "#ifdef WORDS_BIGENDIAN\n";

$f .= sprintf "\t\tc4 = pg_bswap32(c4);\n";

$f .= sprintf "#endif\n";

I've also tried to measure the overall effect by hacking postgres.c:

+ struct timespec start, stop;

+ clock_gettime( CLOCK_REALTIME, &start);

+ for (int i=0; i<100000; i++)

+ {

+ List *parsetree_list2;

+ MemoryContext oldcontext2;

+ oldcontext2 = MemoryContextSwitchTo(MessageContext);

+ parsetree_list2 = pg_parse_query(query_string);

+ MemoryContextSwitchTo(oldcontext2);

+// MemoryContextReset(MessageContext);

+ CHECK_FOR_INTERRUPTS();

+ }

+ clock_gettime( CLOCK_REALTIME, &stop);

+ printf("Bench: %f\n", ( stop.tv_sec - start.tv_sec ) + (double)( stop.tv_nsec - start.tv_nsec ) / 1000000000L );

I measured the time for a big query found here: https://wiki.postgresql.org/wiki/Index_Maintenance

I might be doing something wrong, but it looks like thee overall effect is a ~3% improvement.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Robert Haas
Дата: 21 марта 2019 г., 22:43:46
Сообщение: Re: propagating replica identity to partitions

Следующее

От: Robert Haas
Дата: 21 марта 2019 г., 22:51:40
Сообщение: Re: [HACKERS] Block level parallel vacuum

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Вложения

Предыдущее

Следующее