Re: pgbench could not send data to client: Broken pipe

From: Greg Smith
Subject: Re: pgbench could not send data to client: Broken pipe
Msg-id: 4C8905C5.1060709@2ndquadrant.com
In reply to: Re: pgbench could not send data to client: Broken pipe  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses: Re: pgbench could not send data to client: Broken pipe
List: pgsql-performance
Kevin Grittner wrote:
> Of course, the only way to really know some of these numbers is to
> test your actual application on the real hardware under realistic
> load; but sometimes you can get a reasonable approximation from
> early tests or "gut feel" based on experience with similar
> applications.

And that latter part only works if your gut is as accurate as Kevin's.
For most people, even a rough direct measurement is much more useful
than any estimate.

Anyway, Kevin's point--that ultimately you cannot really be executing
more things at once than you have CPUs--is an accurate one to remember
here.  One reason to put connection pooling in front of your database is
that the server cannot handle thousands of active connections at once
without switching between them very frequently, which wastes CPU and
other resources on contention that could be avoided.
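As a concrete sketch, a pooler such as pgbouncer can enforce that kind of
cap.  A minimal configuration that funnels up to 1000 client connections
into 100 server connections might look like this (the database name,
host, and auth settings here are illustrative assumptions, not from the
original post):

```ini
[databases]
; illustrative target database; adjust host/dbname for your setup
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling hands the server connection back between transactions
pool_mode = transaction
; accept up to 1000 clients, but keep only 100 backend connections busy
max_client_conn = 1000
default_pool_size = 100
```

Clients then connect to port 6432 instead of 5432, and the pooler queues
any requests beyond the 100 active backend slots.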

If you expect, say, 1000 simultaneous users, and you have 48 CPUs, there
is only 48ms worth of CPU time available to each user per second on
average.  If you drop that to 100 users using a pooler, they'll each get
480ms worth of it.  But no matter how you set up the ratios, when the
CPUs are busy enough to always have a queued backlog, they can clear at
most 48 CPUs * 1 second = 48,000 ms of work from that queue each second.
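The per-user CPU budget above is just this arithmetic (numbers taken
from the example):

```python
# 48 CPUs shared evenly among some number of simultaneous users.
CPUS = 48

def cpu_ms_per_user_per_second(users):
    """Milliseconds of CPU time each user gets per wall-clock second."""
    return CPUS * 1000 / users

print(cpu_ms_per_user_per_second(1000))  # 48.0 ms without a pooler
print(cpu_ms_per_user_per_second(100))   # 480.0 ms behind a pooler
```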

Now, imagine that the average query takes 24ms.  The two scenarios work
out like this:

Without pooler:  the query needs 24ms of CPU and gets 48ms of CPU per
second, so it takes 24 / 48 = 0.5 seconds to execute in parallel with
999 other processes

With pooler:  Worst-case, the pooler queue is filled and there are 900
users ahead of this one, representing 21600 ms worth of work to clear
before this request will become active.  The query waits 21600 / 48000 =
0.45 seconds to get runtime on the CPU.  Once it starts, though, it's
only contending with 99 other processes, so it gets 1/100 of the
available resources.  480 ms of CPU time executes per second for this
query; it runs in 0.05 seconds at that rate.  Total runtime:  0.45 +
0.05 = 0.5 seconds!
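Under the stated assumptions (48 CPUs, a 24ms query, 1000 clients versus
a 100-connection pool), the arithmetic above can be re-run in a few
lines:

```python
# Numbers taken from the example above; this just checks the arithmetic.
CPUS = 48
QUERY_MS = 24.0

# Without a pooler: each of 1000 clients gets 48/1000 of a CPU, so the
# 24 ms query stretches to 24 * (1000 / 48) ms of wall-clock time.
no_pool_s = QUERY_MS * 1000 / CPUS / 1000

# With a pooler, worst case: 900 queued queries (900 * 24 ms of work)
# must clear first, at 48,000 ms of CPU work per wall-clock second.
wait_s = 900 * QUERY_MS / (CPUS * 1000)
# Once running, the query contends with only 99 others: 24 * (100/48) ms.
run_s = QUERY_MS * 100 / CPUS / 1000
pool_s = wait_s + run_s

print(no_pool_s, wait_s, run_s, pool_s)  # → 0.5 0.45 0.05 0.5
```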

So the incoming query in this not completely contrived case (I just
picked the numbers to make the math even) takes the same amount of time
to deliver a result either way.  It's just a matter of whether it spends
that time waiting for a clear slice of CPU time, or fighting with a lot
of other processes the whole way.  Once the number of incoming
connections exceeds the number of CPUs by enough of a margin that a
pooler can expect to keep all the CPUs busy, it delivers results at the
same speed as using a larger number of connections.  And since the
"without pooler" case assumes perfect slicing of time into units, it's
the unrealistic one; contention among the 1000 processes will actually
make it slower than the pooled version in the real world.  You won't see
anywhere close to 48,000 ms worth of work delivered per second anymore
if the server is constantly losing its CPU cache, swapping among an
average of 21 connections per CPU.  Whereas at only slightly more than 2
connections per CPU, each CPU can alternate between its two processes
easily enough.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us

