full-text search with GiST indexes seems to be getting slower since 9.1

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема full-text search with GiST indexes seems to be getting slower since 9.1
Дата
Msg-id 20200502223649.5nm22aw3n3kqrell@development
обсуждение исходный текст
Ответы Re: full-text search with GiST indexes seems to be getting slower since 9.1  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

over the past couple of weeks I've been running various benchmarks with
the intent to demonstrate how the performance evolved since ~8.3. In
most tests we're doing pretty good, but I've noticed that full-text
search using GiST indexes is an annoying exception - it's getting slower
and slower since ~9.1.

I've been investigating it a bit, but I still don't have any idea what
might be causing this. So perhaps someone else might have an idea ...


The benchmark is not anything special - it simply loads all our mailing
list archives into a table, computes tsvectors from message bodies, and
then an index on the tsvectors. And then it runs ~33k tsqueries that
people were actually searching at our archives (I got this years ago
from someone, I don't recall the details). IMO it's a fairly realistic
benchmark, not something entirely made up.

Attached is a couple of charts illustrating the regression.

The machine is not particularly beefy (4 cores, 8GB of RAM, SSDs) but
it's the same for all the tests. The postgresql.conf was modified a bit,
nothing beyond basic typical tuning.


1) gist-91-vs-13-queries-per-second.png

This shows the throughput since 8.3, where we've been doing ~314 tps,
while now we're doing only about ~200 tps. In 9.1 we did about 270 tps,
and that's the number I'll use for comparisons because queries against
9.0 and older versions are returning fewer results, so and the index on
message body is much lower. So clearly something changed in 9.1, either
in how we compute the tsvector or something. Since 9.1 it seems pretty
stable, though. The main drop seems to happened between 9.2 and 9.3.

Note: All the durations (in all charts) are in milliseconds.


2) gist-large-91-vs-13.png

This plots durations of all 33k queries, comparing duration on 9.1
(x-axis) to 13 (y-axis). The diagonal means same duration on both
versions, anything above it means the query got slower. It's pretty
clear the queries are consistently slower, but the chart is log-scale so
it's not obvious what the slowdown is.


3) gist-large-91-vs-13-slowdown.png

This is a different view on the query durations, plotting 9.1 duration
on x-axis vs. (duration on 13 / duration on 9.1) on y-axis. So for
example 1.5 means the query on 13 takes about 1.5x longer than on 9.1.

The chart seems to say the slowdown is pretty consistent, about 50% for
the shortest queries and then gradually improving for longer ones. This
seems far too consistent to be noise, and the timings are actually
averages of 5 runs for each query (and it seems quite consistent).



I doubt this is merely due to changes in binary layout, or something to
do with compiler. The difference is a bit too high for that (the drop
from 270 to 200 is about 25%), and it seems to affect all versions since
9.3 about the same. Everything was built using the same gcc version
(9.2.0).  Only --enable-debug was used, everything else is the same.
There seem to be some minor variation in final CFLAGS, though.

I've done some basic profiling using perf, but I don't see anything
obvious in the profiles (attached).

I kept the data directories, so I can do additional test if needed.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: Failed test 'pg_recvlogical acknowledged changes, nothingpending on slot'
Следующее
От: Atsushi Torikoshi
Дата:
Сообщение: Re: pg_stat_reset_slru(name) doesn't seem to work as documented