Why is there a tenfold difference between Postgres's alleged query execution time and packet transmission time?

Поиск
Список
Период
Сортировка
От Rollo Konig-Brock
Тема Why is there a tenfold difference between Postgres's alleged query execution time and packet transmission time?
Дата
Msg-id CAOkhyB7Cm=5GuRYr6LJQcxw_XKkHDeV_FVsCk4LDzN-Yig5wrA@mail.gmail.com
обсуждение исходный текст
Ответы Re: Why is there a tenfold difference between Postgres's alleged query execution time and packet transmission time?
Список pgsql-performance

Hey all,

I've been pulling my hair out over this for days now, as I'm trying to build a low latency application. Databases should be fast, but I can not work out why so much latency is added between the actual database process and the application code. For simple queries, that should take less than a millisecond, this mystery latency is by far the biggest performance hit.

For example, I have a very simple table on a local Postgres database, which takes 324us to query. This is absolutely fine.

postgres=# EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) select count("ID") from example t;
-[ RECORD 1 ]------------------------------------------------------
QUERY PLAN | Aggregate (actual rows=1 loops=1)
-[ RECORD 2 ]------------------------------------------------------
QUERY PLAN |   ->  Seq Scan on example t (actual rows=4119 loops=1)
-[ RECORD 3 ]------------------------------------------------------
QUERY PLAN | Planning Time: 0.051 ms
-[ RECORD 4 ]------------------------------------------------------
QUERY PLAN | Execution Time: 0.324 ms

But then if I want to connect to the database in Python, getting the actual data takes over 2500us! The loopback interface is not the problem here, I sanity checked that (it adds 100us at a stretch).

In [1]: %time cursor.execute("select count(\"ID\") from example;");r = cursor.fetchmany()
CPU times: user 316 µs, sys: 1.12 ms, total: 1.44 ms
Wall time: 2.73 ms

Investigating further, I opened up WireShark to see how long the packets themselves take.

Wireshark screenshot

From this I can tell that Postgres took 2594us to send the data. This kind of overhead for an loopback IPC call is way beyond what I would expect. Why is this? And how can I optimise this away?

This is a problem for the software ecosystem, because people start thinking databases are too slow to do simple lookups, and they start caching queries in Redis that take 300us to execute, but 3ms to actually fetch back to the client.  It adds up to an awful lot of latency.


As an aside, I've been really miffed by how bad database IPC performance has been in general.  Oracle has a problem where each ~1500 bytes makes receiving the packet 800us slower (on a gigabit network).


I'd really love to know what's happening here.


Rollo


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: LWLocks by LockManager slowing large DB
Следующее
От: "Daniel Westermann (DWE)"
Дата:
Сообщение: Strange behavior once statistics are there