Re: [PERFORM] Question about network bandwidth usage between PostgreSQL’s client and server

From: Heikki Linnakangas
Subject: Re: [PERFORM] Question about network bandwidth usage between PostgreSQL’s client and server
Msg-id: 517912C8.2010801@vmware.com
In reply to: Question about network bandwidth usage between PostgreSQL’s client and server (Kelphet Xiong <kelphet@gmail.com>)
List: pgsql-performance

On 25.04.2013 02:56, Kelphet Xiong wrote:
> In all the experiments, the lineitem and partsupp tables reside in memory
> because there is no io activities observed from iotop.
> Since there is enough network bandwidth (1Gb/s or 128MB/s) between
> client and server,
> I would like to know what determines the data transferring rate or the
> network bandwidth usage
> between a client and a server when network bandwidth is enough.

Since there's enough network bandwidth available, the bottleneck is
elsewhere. I don't know what it is in your example - maybe it's the I/O
capacity, or CPU required to process the result in the server before
it's sent over the network. It could also be in the client, in how fast
it can process the results coming from the server.

I'd suggest running 'top' on the server while the query is executed, and
keeping an eye on the CPU usage. If it's pegged at 100%, the bottleneck
is the server's CPU.
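
As a rough illustration of what "pegged" means in numbers, here is a
minimal sketch that computes how busy one CPU was between two samples.
It assumes a Linux server and the usual /proc/stat field order (user,
nice, system, idle, iowait, irq, softirq, steal); the function name is
made up for this example, and it is not what top itself does
internally. Looking at a single core matters because one backend
process runs the query, so one core can be saturated while the machine
as a whole looks mostly idle.

```python
def busy_fraction(before: str, after: str) -> float:
    """Fraction of time one CPU was busy between two /proc/stat lines.

    Both arguments are lines for the same CPU, e.g. the "cpu0 ..." line,
    sampled a few seconds apart. Assumes the Linux field order
    user, nice, system, idle, iowait, irq, softirq, steal.
    """
    def totals(line: str):
        # Skip the "cpuN" label; ignore the guest fields, which are
        # already counted inside user and nice.
        fields = [int(x) for x in line.split()[1:9]]
        idle = sum(fields[3:5])  # idle + iowait
        return sum(fields), idle

    total0, idle0 = totals(before)
    total1, idle1 = totals(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0  # no ticks elapsed between the samples
    return 1.0 - (idle1 - idle0) / elapsed
```

A value close to 1.0 for the core running the backend would point at
the server's CPU as the bottleneck.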

> For example, given that the size of each tuple of lineitem table is
> 88% of that of partsupp,
> why the average network usage for sequential scan of lineitem table is only 50%
> that of partsupp table? And why the average network usage of their
> join is higher
> than that of sequential scan of lineitem but lower than that of
> sequential scan of partsupp table?

Here's a wild guess: the query on lineitem is bottlenecked by CPU usage
in the server. A lot of CPU could be spent on converting the date fields
from on-disk format to the text representation that's sent over the
network; I've seen that conversion use up a lot of CPU time on some test
workloads. Try leaving out the date columns from the query to test that
theory.
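
To show why the conversion is not free, here is a small sketch of the
two representations of a date. PostgreSQL stores a date as a 32-bit
count of days since 2000-01-01; the text form needs calendar arithmetic
and formatting for every value, while the binary form is just that
integer in network byte order. The function names are invented for this
example, and the code only mimics the idea, it is not the server's
actual conversion routine.

```python
import struct
from datetime import date, timedelta

PG_EPOCH = date(2000, 1, 1)  # day zero of PostgreSQL's date type


def date_as_text(days: int) -> bytes:
    """Text form: calendar arithmetic plus formatting for every value."""
    return (PG_EPOCH + timedelta(days=days)).isoformat().encode("ascii")


def date_as_binary(days: int) -> bytes:
    """Binary form: the stored 32-bit integer in network byte order."""
    return struct.pack("!i", days)
```

For a table like lineitem with several date columns per row, the text
path repeats that calendar arithmetic for every column of every row.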

If that's the bottleneck, you could try fetching the result in binary
format; that should consume less CPU in the server. You didn't mention
what client library you're using, but with libpq, for example, see the
manual on PQexecParams for how to set the result format.
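
Keep in mind that binary results move the decoding work to the client.
With libpq, passing 1 as the resultFormat argument of PQexecParams asks
for binary results, and each field then arrives as raw bytes. A minimal
sketch of what the client would have to do for a date field, written in
Python only for brevity (the function name is hypothetical, and the
special values infinity and -infinity are not handled):

```python
import struct
from datetime import date, timedelta

PG_EPOCH = date(2000, 1, 1)  # day zero of PostgreSQL's date type


def decode_binary_date(raw: bytes) -> date:
    """Decode one date field received in binary result format.

    The field is a signed 32-bit integer in network byte order holding
    the number of days since 2000-01-01. The special values 'infinity'
    and '-infinity' are not handled in this sketch.
    """
    if len(raw) != 4:
        raise ValueError("a binary date field is exactly 4 bytes")
    (days,) = struct.unpack("!i", raw)
    return PG_EPOCH + timedelta(days=days)
```

Other column types have their own binary layouts, so the client needs a
decoder like this for each type it fetches.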

- Heikki

